Claude And ChatGPT For Ad-Hoc Sidequests
- Simon Willison tl;dr: The author demonstrates a quick “sidequest” task in which he converted the shapefile of the largest park in NY to a GeoJSON polygon in just 6 minutes. “One of the greatest misconceptions concerning LLMs is that they’re easy to use. They aren’t: getting great results requires a great deal of experience and hard-fought intuition, combined with deep domain knowledge of the problem you are applying them to.” featured in #501
Meta's New LLM-Based Test Generator Is A Sneak Peek To The Future Of Development
- Leonardo Creed tl;dr: “Meta claims that ‘this is the first paper to report on LLM-generated code that has been developed independent of human intervention (other than final review sign off), and landed into large scale industrial production systems with guaranteed assurances for improvement over the existing code base.’ Furthermore, there are solid principles that developers can take away in order to use AI effectively themselves.” featured in #492
Engineering Practices For LLM Application Development
- David Tan, Jesse Wang tl;dr: “LLM engineering involves much more than just prompt design or prompt engineering. In this article, we share a set of engineering practices that helped us deliver a prototype LLM application rapidly and reliably in a recent project. We'll share techniques for automated testing and adversarial testing of LLM applications, refactoring, as well as considerations for architecting LLM applications and responsible AI.” featured in #489
The Pain Points Of Building A Copilot
- Austin Henley tl;dr: What are the pain points, and what are the opportunities for tools? “We conducted semi-structured interviews with 26 developers from a variety of companies that are working on copilots. We analyzed their responses to identify themes. Then we conducted two focus group sessions with tool builders that involved reviewing our interview findings and brainstorming possible solutions.” Austin shares the results here. featured in #486
featured in #478
Navigating The Chaos: Why You Don’t Need Another MLOps Tool
tl;dr: AI/ML development lacks systematic processes, leading to errors and biases in deployed models. The MLOps landscape is fragmented, and teams need to glue together a ton of bespoke and third-party tools to meet basic needs. We don’t think you should, so we're building Openlayer to condense and simplify AI evaluation. featured in #469
Effortless Engineering: Quick Tips for Crafting Prompts
- Michael Sickles tl;dr: "This blog will walk you through building out different prompts, exploring the outputs, and optimizing them for better results. Even though we can't guarantee outputs, we can still measure how the prompt is doing in various ways." featured in #461
Embeddings: What They Are And Why They Matter
- Simon Willison tl;dr: “Embeddings are based around one trick: take a piece of content, in this case a blog entry, and turn that piece of content into an array of floating point numbers.” Simon shows us what this looks like and argues that we can learn interesting things about the content this way: “it might capture colors, shapes, concepts or all sorts of other characteristics of the content that has been embedded.” Simon also walks through practical use cases for embeddings. featured in #459
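The "one trick" in the summary above can be sketched in a few lines. This is only an illustration under loud assumptions: `toy_embed` is a hypothetical word-hashing stand-in, not a real embedding model and not the approach from Simon's article, so its vectors carry no semantic meaning. It only shows the shape of the idea: content goes in, a fixed-length array of floats comes out, and two arrays can be compared with cosine similarity.

```python
import math


def toy_embed(text: str, dims: int = 16) -> list[float]:
    """Hypothetical stand-in for an embedding model: hash words into buckets.

    A real model would produce vectors that capture meaning; this one only
    counts words per bucket so that the output has the right shape.
    """
    vec = [0.0] * dims
    for word in text.lower().split():
        # Deterministic bucket: sum of character codes modulo the vector size.
        bucket = sum(ord(ch) for ch in word) % dims
        vec[bucket] += 1.0
    return vec


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    entry = toy_embed("a blog entry about embeddings")
    other = toy_embed("another blog entry about language models")
    print(len(entry), cosine_similarity(entry, other))
```

With a real model, the same comparison step is what makes "related content" and similarity search possible; only the function producing the vectors changes.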
So We Shipped An AI Product. Did it Work?
- Phillip Carter tl;dr: “Like many companies, earlier this year we saw an opportunity with LLMs and quickly but thoughtfully started building a capability. About a month later, we released Query Assistant to all customers as an experimental feature. We then iterated on it, using data from production to inform a multitude of additional enhancements, and ultimately took Query Assistant out of experimentation and turned it into a core product offering. However, getting Query Assistant from concept to feature diverted R&D and marketing resources, forcing the question: did investing in LLMs do what we wanted it to do?” featured in #454
LLMs Demand Observability-Driven Development
- Charity Majors tl;dr: “Many software engineers are encountering LLMs for the very first time, while many ML engineers are being exposed directly to production systems for the very first time. Both types of engineers are finding themselves plunged into a disorienting new world, one where a particular flavor of production problem they may have encountered occasionally in their careers is now front and center. Namely, that LLMs are black boxes that produce nondeterministic outputs and cannot be debugged or tested using traditional software engineering techniques. Hooking these black boxes up to production introduces reliability and predictability problems that can be terrifying.” Charity believes that integrating LLMs will require a shift in development practices, particularly towards Observability-Driven Development, to handle the nondeterministic nature of these models. featured in #450