Context Engineering #

The quality of an LLM response does not just depend on the suitability of our instruction. What’s even more important is whether the LLM has access to all the information it needs to perform our task. Bringing together all this information is what we call context engineering. In this chapter we’ll explore where the LLM’s context comes from, and how we can manipulate it.

Model Context #

As we saw earlier, the context length of a model is the total number of tokens it can process. These tokens come from the system prompt, the user prompts and LLM responses in the conversation, all the documents you’ve uploaded, the results of tools like web search, etc. GPT-5 accepts a maximum of 272,000 input tokens and generate another 128,000 tokens. This may sound enormous, but it’s a limit you can reach fast when you are working with large collections of documents: you can’t just upload all the personal documents on your computer and start asking questions about them. What’s even worse, the performance of even the best models is known to degrade significantly with longer context — a phenomenon we call context rot. For all these reasons, we need to think carefully about the context we give an LLM access to. It needs to contain all required information to fulfil a task, while being as small as possible.

Context engineering is the task of assembling the optimal context for an LLM to fulfil a task. (From Effective context engineering for AI agents, by Anthropic) — Context engineering is the task of assembling the optimal context for an LLM to fulfil a task.
(From Effective context engineering for AI agents, by Anthropic)

Few-Shot Prompting #

Tip: Give examples of similar tasks and optimal responses.

The most direct way of providing an LLM with additional context is by adding examples to your prompt. This is what we call few-shot prompting. Whereas the zero-shot prompting techniques of the previous chapter simply present the LLM with a question or an instruction, few-shot prompts include one or more examples that illustrate both the task and its solution. Such examples often help the model produce more accurate and consistent responses, and they can also enable it to handle more complex tasks that it might not interpret correctly from instructions alone.

Of course, not all examples are created equally. As the prompting guide by Anthropic, the developers of Claude, points out, the best examples are:

relevant: they must reflect actual use cases.
diverse: they cover a variety of cases, including edge cases.
clear: they are set apart form the rest of the prompt, for example by wrapping them in <example> tags.

In a detailed study of few-shot prompting, Min et al. 2022 found that even incorrect (but relevant) examples can give the model helpful information about the range and format of possible answers!

Since the advent of LLMs, social media have been alive with screenshots of seemingly simple instructions that LLMs struggle with. One by now classic example is counting the number of r’s in a word like strawberry or cranberry. Indeed, if we ask GPT-4o how many r’s there are in cranberry, it often (but not always) answers two instead of three. This is one case where few-shot prompting helps: if you give a few examples of correct answers, GPT-4o is far more likely to answer correctly. Note also that its response follows the structure of the examples in the prompt.

Few-shot prompts help LLMs solve tasks they might struggle with otherwise.

Manipulating Context in ChatGPT and Claude #

ChatGPT #

ChatGPT offers various ways of customizing its output. These range from custom instructions that are automatically added to the beginning of your conversation to a built-in memory that stores potentially relevant information about you, and custom GPTs that allow you to set up separate GPTs for particular tasks.

Custom Instructions #

Custom Instructions offer the simplest way of customizing ChatGPT’s output. In the browser app, you’ll find them behind the first letters of your name in the corner of the screen, under Customize ChatGPT. In the pop-up that appears, you can tell ChatGPT what ChatGPT should call you — your first name or your highness, say — and what your job is. Next you can tweak the style of its answers by giving it some traits. You can ask it be talkative, witty, poetic, or any characteristic you can think of. You can ask it to “talk like a member of Gen Z” or more like your grandfather, to “take a forward-thinking view”, or to “have a traditional outlook, valuing the past and how things have always been done”. Finally, you can add anything else ChatGPT should know about you — any interests or values you may have.

If you’re often frustrated by ChatGPT’s long-winded answers, this is where you can tell it to answer short and to the point. If you always want to receive answers in a particular language, this is where you can do it. When you’re done, don’t forget to make sure Enable for new chats at the bottom of the window is toggled on, and to start a new chat for the changes to take effect. The custom instructions will then be added to the system prompt of every new chat, so that the underlying LLM can take them into account.

ChatGPT has only one set of custom instructions, so you can only build one personal profile. However, when you create a new project, you can also add instructions that are restricted to that project context. This feature is a bit more basic, since ChatGPT offers just one text field for your instructions, but it serves the same purpose and works in the same way. Additionally, you can also add files for the LLM in your project to access.

Memory #

Another source of personal information is ChatGPT’s memory feature. This too, can be accessed by clicking your name, then Settings and Personalization. In this menu you can toggle two memory settings: the chatbot can Reference saved memories or Reference chat history.

Saved memories are personal details that ChatGPT stores during your conversations. These can contain your age, occupation and hobbies, preferences and interests, etc. Whenever the chatbot is writing something to its memory, you’ll see the message Updating memory. It can do this without you asking, but you can also trigger this action by explicitly prompting it to remember something: “Remember that I only read books in Dutch.”

A big advantage of saved memories is that you can control them. If you click Manage in the Personalization menu, you can review all memories and delete the ones you don’t want the chatbot to remember. It’s best to go through these memories now and then, as ChatGPT is not very consistent in its choices: some items will indeed contain useful information about you, but others will only be relevant to a task you performed months ago and not be useful anymore.

Chat history is more of a black box. We don’t know a lot about how OpenAI has implemented this feature, but presumably ChatGPT keeps a database of all your previous prompts and responses. Whenever you enter a new prompt, it will search this database for the most relevant past messages and add those to the conversation. In this way the LLM can reference that information to make its new response and useful to you as possible.

Custom GPTs #

The final and most flexible way to customize ChatGPT is by creating a custom GPT. Clicking GPTs in the sidebar will take you to the GPT store, where many organizations and developers offer their own GPTs. These are optimized for particular tasks like image generation, creative writing, academic research, writing a CV, etc. This generally means that they work with a specialized system prompt, have access to custom uploaded files, or use online sources (like for example research databases or weather forecasts) that the standard ChatGPT does not have access to.

You can create your own custom GPT by clicking + Create in the upper right corner of the screen. The software will then take you through a conversation in which you can give your GPT a name, specify the task it should excel at, the conversational tone it should adopt, etc. You can also equip the GPT with tools like web search or image generation, select the LLM it should use (or leave this choice up to the user), tweak its system prompt and upload any files that the LLM should have access to.

When you use ChatGPT in many different contexts, custom GPTs offer the most compelling way to tailor your experience. You can set up a custom GPT for your different professonal projects (each with their own style and relevant documents), your hobbies, etc. Moreover, you can share your GPTs with other users as well. This is an easy way to let people ask questions about a report you compiled, a manual you wrote, and so on.

Claude #

Claude allows you to control its answers with so-called styles. These are similar to ChatGPT’s custom instructions. When you give the model an instruction, you can choose the style it will answer in. There are four standard styles — normal, concise, explanatory and formal — but you can also choose to create your own.

You can create a custom style by clicking Create and Edit Styles below the list of styles. Claude offers two options: you can either define a new style by uploading a writing example, or you can describe the style. If you click Describe style instead you’ll see a menu reminiscent of ChatGPT’s Custom Instructions. Here you can define the objective of your style, its intended audience, its voice and tone, and other general information. When you finally click Generate Style, you can give your style a name and you can select it whenever you want Claude to use it.

You can create a custom style for Claude by defining its objective, audience, and voice & tone.