Engineering Practices For LLM Application Development
- David Tan, Jesse Wang tl;dr: “LLM engineering involves much more than just prompt design or prompt engineering. In this article, we share a set of engineering practices that helped us deliver a prototype LLM application rapidly and reliably in a recent project. We'll share techniques for automated testing and adversarial testing of LLM applications, refactoring, as well as considerations for architecting LLM applications and responsible AI.” featured in #489
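As a flavor of what automated testing of nondeterministic LLM output can look like, here is a minimal Python sketch. The generate_summary wrapper and the sample text are illustrative assumptions, not taken from the article; the point is to assert on properties of the response rather than on exact strings.

    import re

    def generate_summary(text: str) -> str:
        """Hypothetical wrapper around your LLM client; replace with a real call."""
        raise NotImplementedError

    def test_summary_properties():
        source = "The 2019 report covers revenue, churn, and headcount across EMEA."
        summary = generate_summary(source)

        # Exact-match assertions are brittle against nondeterministic output,
        # so check properties of the response instead.
        assert len(summary) < len(source)              # it should actually condense the input
        assert "2019" in summary                       # key facts should survive
        assert not re.search(r"(?i)as an ai language model", summary)  # no boilerplate leakage

Adversarial tests can follow the same shape, swapping the source text for prompt-injection attempts and asserting the response ignores or refuses them.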
The Pain Points Of Building A Copilot
- Austin Henley tl;dr: What are the pain points of building copilots, and where are the opportunities for tooling? “We conducted semi-structured interviews with 26 developers from a variety of companies that are working on copilots. We analyzed their responses to identify themes. Then we conducted two focus group sessions with tool builders that involved reviewing our interview findings and brainstorming possible solutions.” Austin shares the results here. featured in #486
Navigating The Chaos: Why You Don’t Need Another MLOps Tool
tl;dr: AI/ML development lacks systematic processes, leading to errors and biases in deployed models. The MLOps landscape is fragmented, and teams need to glue together a ton of bespoke and third-party tools to meet basic needs. We don’t think you should have to, so we're building Openlayer to condense and simplify AI evaluation. featured in #469
Effortless Engineering: Quick Tips for Crafting Prompts
- Michael Sickles tl;dr: "This blog will walk you through building out different prompts, exploring the outputs, and optimizing them for better results. Even though we can't guarantee outputs, we can still measure how the prompt is doing in various ways." featured in #461
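One simple way to measure how a prompt is doing, sketched here under assumptions of my own (the rubric and sample text are illustrative, not from the post), is to score each response against a small checklist and compare prompt variants by score.

    def score_response(response: str, required_terms: list[str], max_words: int) -> float:
        """Toy rubric: fraction of required terms mentioned, zeroed if the answer runs long."""
        if len(response.split()) > max_words:
            return 0.0
        hits = sum(term.lower() in response.lower() for term in required_terms)
        return hits / len(required_terms)

    # Score a model response against the rubric; rerun this across prompt variants
    # and keep whichever prompt scores highest most consistently.
    response = "Refunds are available within 30 days of purchase with proof of receipt."
    print(score_response(response, ["refund", "30 days", "receipt"], max_words=120))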
Embeddings: What They Are And Why They Matter
- Simon Willison tl;dr: “Embeddings are based around one trick: take a piece of content — in this case a blog entry — and turn that piece of content into an array of floating point numbers.” Simon shows us what this looks like and argues that we can learn interesting things about the content this way: “it might capture colors, shapes, concepts or all sorts of other characteristics of the content that has been embedded.” Simon also walks through practical use cases of where embeddings show up. featured in #459
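To make the “array of floating point numbers” idea concrete, here is a small sketch of comparing content by cosine similarity. The toy_embed function is a deliberately naive letter-frequency stand-in so the snippet runs on its own; a real embedding comes from a model and captures meaning rather than spelling.

    import math
    from collections import Counter

    def toy_embed(text: str) -> list[float]:
        # Stand-in vectorizer: letter frequencies, purely so we have floats to compare.
        counts = Counter(c for c in text.lower() if c.isalpha())
        return [counts.get(chr(i), 0) / max(len(text), 1) for i in range(ord("a"), ord("z") + 1)]

    def cosine_similarity(a: list[float], b: list[float]) -> float:
        # Content that is "close" in embedding space scores near 1.0.
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    entries = ["Shipping a static blog with Python", "Baking sourdough at home"]
    query = toy_embed("python site generators")
    print(sorted(entries, key=lambda e: cosine_similarity(query, toy_embed(e)), reverse=True))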
So We Shipped An AI Product. Did it Work?
- Phillip Carter tl;dr: “Like many companies, earlier this year we saw an opportunity with LLMs and quickly but thoughtfully started building a capability. About a month later, we released Query Assistant to all customers as an experimental feature. We then iterated on it, using data from production to inform a multitude of additional enhancements, and ultimately took Query Assistant out of experimentation and turned it into a core product offering. However, getting Query Assistant from concept to feature diverted R&D and marketing resources, forcing the question: did investing in LLMs do what we wanted it to do?” featured in #454
LLMs Demand Observability-Driven Development
- Charity Majors tl;dr: “Many software engineers are encountering LLMs for the very first time, while many ML engineers are being exposed directly to production systems for the very first time. Both types of engineers are finding themselves plunged into a disorienting new world—one where a particular flavor of production problem they may have encountered occasionally in their careers is now front and center. Namely, that LLMs are black boxes that produce nondeterministic outputs and cannot be debugged or tested using traditional software engineering techniques. Hooking these black boxes up to production introduces reliability and predictability problems that can be terrifying.” Charity believes that the integration of LLMs will necessitate a shift in development practices, particularly towards Observability-Driven Development, to handle the nondeterministic nature of these models. featured in #450
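As one concrete flavor of observability-driven development around an LLM call, here is a minimal sketch using OpenTelemetry tracing. The span name, attribute names, and the call_model wrapper are assumptions of mine, not taken from the article or any vendor's instrumentation; the idea is that every production request records its inputs, settings, and output so nondeterministic behavior can be queried after the fact.

    from opentelemetry import trace

    tracer = trace.get_tracer("llm-app")
    MODEL_NAME = "example-model"  # illustrative placeholder

    def call_model(prompt: str) -> str:
        """Hypothetical wrapper around your LLM client; replace with a real call."""
        raise NotImplementedError

    def answer_question(question: str) -> str:
        # Wrap the nondeterministic call in a span so we can later ask
        # "what did the model see, and what did it say?" for any production request.
        with tracer.start_as_current_span("llm.completion") as span:
            span.set_attribute("llm.model", MODEL_NAME)
            span.set_attribute("llm.prompt", question)
            completion = call_model(question)
            span.set_attribute("llm.completion", completion)
            return completion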
Build And Keep Your Context Window
- Vicki Boykis tl;dr: “When humans have no external context as guardrails, we end up recreating what’s already been done or, on the other hand, throwing away things that work and glomming onto hype without substance. This is a real problem in production data systems. In order to do this, we need to understand how to build one.” Vicki believes that we must understand the historical context of our engineering decisions if we are to be successful in this brave new LLM world. featured in #448
Lessons From Building A Domain-Specific AI Assistant
- Eric Liu tl;dr: Eric Liu, Engineer at Airplane, discusses how the Airplane team built a domain-specific AI assistant, the lessons they learned along the way, and what's next for the future of AI assistants. featured in #447