Essential Reading For Engineering Leaders

Claude And ChatGPT For Ad-Hoc Sidequests

- Simon Willison

tl;dr: The author demonstrates a quick “sidequest” task where he converted the shapefile of the largest park in NY to a GeoJSON polygon in just 6 minutes. “One of the greatest misconceptions concerning LLMs is that they’re easy to use. They aren’t: getting great results requires a great deal of experience and hard-fought intuition, combined with deep domain knowledge of the problem you are applying them to.”

featured in #501

Meta's New LLM-Based Test Generator Is A Sneak Peek To The Future Of Development

- Leonardo Creed

tl;dr: “Meta claims that ‘this is the first paper to report on LLM-generated code that has been developed independent of human intervention (other than final review sign off), and landed into large scale industrial production systems with guaranteed assurances for improvement over the existing code base.’ Furthermore, there are solid principles that developers can take away in order to use AI effectively themselves.”

featured in #492

Engineering Practices For LLM Application Development

- David Tan, Jesse Wang

LLM

tl;dr: “LLM engineering involves much more than just prompt design or prompt engineering. In this article, we share a set of engineering practices that helped us deliver a prototype LLM application rapidly and reliably in a recent project. We'll share techniques for automated testing and adversarial testing of LLM applications, refactoring, as well as considerations for architecting LLM applications and responsible AI.”

featured in #489

The Pain Points Of Building A Copilot

- Austin Henley

LLM

tl;dr: What are the pain points, and what are the opportunities for tools? “We conducted semi-structured interviews with 26 developers from a variety of companies that are working on copilots. We analyzed their responses to identify themes. Then we conducted two focus group sessions with tool builders that involved reviewing our interview findings and brainstorming possible solutions.” Austin shares the results here.

featured in #486

GPT In 500 Lines Of SQL

SQL
LLM

tl;dr: "Before a text can be fed to a neural network, it needs to be converted into a list of numbers. GPT2 uses a variation of the algorithm called Byte pair encoding to do precisely that. Its tokenizer uses a dictionary of 50257 code points - in AI parlance, 'tokens' - that correspond to different byte sequences in UTF-8, plus the 'end of text' as a separate token. This dictionary was built by statistical analysis performed like this: Start with a simple encoding of 256 tokens: one token per byte. Perform the collapse 50000 times over."

featured in #478
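
The byte pair encoding steps quoted in the summary above (start with one token per byte, then repeatedly collapse a pair of tokens into a new one) can be sketched roughly as follows. This is a simplified illustration, not GPT-2's actual tokenizer and not the article's SQL: it assumes the pair to collapse is the most frequent adjacent pair, and the names `train_bpe` and `decode` are hypothetical.

```python
from collections import Counter


def train_bpe(corpus: bytes, num_merges: int):
    """Learn up to num_merges merges over a byte string.

    Returns (merges, tokens): merges[i] is the pair of token ids that was
    collapsed into new token id 256 + i, and tokens is the encoded corpus.
    """
    tokens = list(corpus)  # start with one token per byte (ids 0..255)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent pair of tokens in the current encoding.
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:
            break  # nothing repeats, so further merges would not help
        new_id = 256 + len(merges)
        merges.append((left, right))
        # Collapse each occurrence of the chosen pair into the new token.
        collapsed = []
        i = 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                collapsed.append(new_id)
                i += 2
            else:
                collapsed.append(tokens[i])
                i += 1
        tokens = collapsed
    return merges, tokens


def decode(tokens, merges) -> bytes:
    """Expand token ids back into the original bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    for offset, (left, right) in enumerate(merges):
        vocab[256 + offset] = vocab[left] + vocab[right]
    return b"".join(vocab[t] for t in tokens)


merges, tokens = train_bpe(b"abababab", num_merges=2)
print(merges, tokens, decode(tokens, merges))
```

With 50000 merges on top of the 256 byte tokens, plus the separate 'end of text' token, the vocabulary reaches the 50257 entries mentioned in the quote.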

Navigating The Chaos: Why You Don’t Need Another MLOps Tool

ML
LLM
Tools

tl;dr: AI/ML development lacks systematic processes, leading to errors and biases in deployed models. The MLOps landscape is fragmented, and teams need to glue together a ton of bespoke and third-party tools to meet basic needs. We don’t think you should, so we're building Openlayer to condense and simplify AI evaluation.

featured in #469

Effortless Engineering: Quick Tips for Crafting Prompts

- Michael Sickles

Tips
LLM

tl;dr: "This blog will walk you through building out different prompts, exploring the outputs, and optimizing them for better results. Even though we can't guarantee outputs, we can still measure how the prompt is doing in various ways."

featured in #461

Embeddings: What They Are And Why They Matter

- Simon Willison

LLM
AI

tl;dr: “Embeddings are based around one trick: take a piece of content—in this case a blog entry — and turn that piece of content into an array of floating point numbers.” Simon shows us what this looks like and argues that we can learn interesting things about the content this way - “it might capture colors, shapes, concepts or all sorts of other characteristics of the content that has been embedded.” Simon also shows us practical use cases of how this may show up.

featured in #459
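
To make the "array of floating point numbers" idea from the summary above concrete, here is a minimal sketch of comparing such arrays. It assumes cosine similarity as the comparison measure (a common choice, not something the summary states), and the three-dimensional vectors and entry names are made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings, top_n=3):
    """Rank stored embeddings by similarity to the query vector."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in embeddings.items()]
    return sorted(scored, reverse=True)[:top_n]


# Hypothetical toy embeddings for three blog entries (values are invented).
entries = {
    "post-a": [0.9, 0.1, 0.0],
    "post-b": [0.0, 1.0, 0.0],
    "post-c": [0.8, 0.2, 0.1],
}

for score, name in most_similar([1.0, 0.0, 0.0], entries):
    print(f"{name}: {score:.3f}")
```

Entries whose vectors point in a similar direction score close to 1, which is the basis for use cases such as finding related content.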

So We Shipped An AI Product. Did It Work?

- Phillip Carter

tl;dr: “Like many companies, earlier this year we saw an opportunity with LLMs and quickly but thoughtfully started building a capability. About a month later, we released Query Assistant to all customers as an experimental feature. We then iterated on it, using data from production to inform a multitude of additional enhancements, and ultimately took Query Assistant out of experimentation and turned it into a core product offering. However, getting Query Assistant from concept to feature diverted R&D and marketing resources, forcing the question: did investing in LLMs do what we wanted it to do?”

featured in #454

LLMs Demand Observability-Driven Development

- Charity Majors

tl;dr: “Many software engineers are encountering LLMs for the very first time, while many ML engineers are being exposed directly to production systems for the very first time. Both types of engineers are finding themselves plunged into a disorienting new world - one where a particular flavor of production problem they may have encountered occasionally in their careers is now front and center. Namely, that LLMs are black boxes that produce nondeterministic outputs and cannot be debugged or tested using traditional software engineering techniques. Hooking these black boxes up to production introduces reliability and predictability problems that can be terrifying.” Charity believes that the integration of LLMs will necessitate a shift in development practices, particularly towards Observability-Driven Development, to handle the nondeterministic nature of these models.

featured in #450
