
Context Engineering

Context Engineering is crucial for an Agent to produce correct results. When a model doesn't answer well, it is often not due to insufficient capability, but because it did not receive enough contextual information to infer the correct result. Context engineering enhances the Agent's ability to acquire and manage that context.

LangGraph divides context into three types:

  • Model Context

  • Tool Context

  • Life-cycle Context

Regardless of the type of Context, its Schema needs to be defined. Here LangGraph is quite flexible: you can create your Context Schema with dataclasses, Pydantic models, or TypedDict.
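For instance, the same schema can be written as a dataclass or a TypedDict (a Pydantic BaseModel works analogously); this is an illustrative sketch, not code from the examples below:

```python
from dataclasses import dataclass
from typing import TypedDict

@dataclass
class DataclassContext:
    user_id: str
    user_role: str = "viewer"  # dataclasses support default values

class TypedDictContext(TypedDict):
    user_id: str
    user_role: str

# Dataclass fields are attributes; TypedDict entries are plain dict keys
ctx = DataclassContext(user_id="user_1")
td: TypedDictContext = {"user_id": "user_1", "user_role": "admin"}
```

Which one to pick mostly depends on whether you want runtime validation (Pydantic), defaults and methods (dataclass), or zero overhead (TypedDict).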

# !pip install ipynbname
import os
import uuid
import sqlite3

from typing import Callable
from dotenv import load_dotenv
from dataclasses import dataclass
from langchain_openai import ChatOpenAI
from langchain.tools import tool, ToolRuntime
from langchain.agents import create_agent
from langchain.agents.middleware import dynamic_prompt, wrap_model_call, ModelRequest, ModelResponse, SummarizationMiddleware
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.store.memory import InMemoryStore
from langgraph.store.sqlite import SqliteStore

# Load model configuration
_ = load_dotenv()

# Load model
llm = ChatOpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url=os.getenv("DASHSCOPE_BASE_URL"),
    model="qwen3-coder-plus",
    temperature=0.7,
)

1. Dynamically Modifying System Prompts

Context engineering is closely related to the middleware and memory from previous chapters. The specific implementation of context depends on middleware, while context storage relies on the memory system. Specifically, LangGraph provides a pre-built @dynamic_prompt middleware for dynamically modifying system prompts.

Since the modification is dynamic, certain conditions must trigger it. Besides writing the trigger logic itself, we also need to obtain the up-to-date values that this logic depends on from the Agent. These values are usually held in one of three storage media:

  • Runtime - All nodes share one Runtime. At the same moment, all nodes get the same Runtime value. Generally used to store information with high timeliness requirements.

  • Short-term Memory (State) - Passed sequentially between nodes, each node receives the State processed by the previous node. Mainly used to store Prompts and AI Messages.

  • Long-term Memory (Store) - Responsible for persistent storage, can save information across Workflows/Agents. Can be used to store user preferences, previously calculated statistics, etc.

The following three examples demonstrate how to use context from Runtime, State, and Store to write trigger conditions.

1.1 Using State to Manage Context

Use information contained in State to manipulate the system prompt.

@dynamic_prompt
def state_aware_prompt(request: ModelRequest) -> str:
    # request.messages is a shortcut for request.state["messages"]
    message_count = len(request.messages)

    base = "You are a helpful assistant."

    if message_count > 6:
        base += "\nThis is a long conversation - be extra concise."

    # Temporarily print base to see the effect
    print(base)

    return base

agent = create_agent(
    model=llm,
    middleware=[state_aware_prompt]
)

result = agent.invoke(
    {"messages": [
        {"role": "user", "content": "How is the weather in Guangzhou today?"},
        {"role": "assistant", "content": "The weather in Guangzhou is great"},
        {"role": "user", "content": "What should I eat?"},
        {"role": "assistant", "content": "How about trying lemongrass eel casserole?"},
        {"role": "user", "content": "What is lemongrass?"},
        {"role": "assistant", "content": "Lemongrass, also known as lemon grass, is commonly found in Thai Tom Yum soup and Vietnamese grilled meat dishes"},
        {"role": "user", "content": "Aww, what are we waiting for? Let's go eat!"},
    ]},
)

for message in result['messages']:
    message.pretty_print()

Change the 6 in message_count > 6 to 7 and see what happens.
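To see why, here is the trigger logic in isolation (a plain-Python sketch, not the middleware itself): the conversation above contains 7 messages, so 7 > 6 appends the hint, while 7 > 7 does not.

```python
def build_prompt(message_count: int, threshold: int = 6) -> str:
    # Same comparison as in state_aware_prompt, without the middleware plumbing
    base = "You are a helpful assistant."
    if message_count > threshold:
        base += "\nThis is a long conversation - be extra concise."
    return base

print(build_prompt(7))               # threshold 6: 7 > 6, hint appended
print(build_prompt(7, threshold=7))  # threshold 7: 7 > 7 is False, hint omitted
```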

1.2 Using Store to Manage Context

@dataclass
class Context:
    user_id: str

@dynamic_prompt
def store_aware_prompt(request: ModelRequest) -> str:
    user_id = request.runtime.context.user_id

    # Read from Store: get user preferences
    store = request.runtime.store
    user_prefs = store.get(("preferences",), user_id)

    base = "You are a helpful assistant."

    if user_prefs:
        style = user_prefs.value.get("communication_style", "balanced")
        base += f"\nUser prefers {style} responses."

    return base

store = InMemoryStore()

agent = create_agent(
    model=llm,
    middleware=[store_aware_prompt],
    context_schema=Context,
    store=store,
)

# Pre-set two preference records
store.put(("preferences",), "user_1", {"communication_style": "Chinese"})
store.put(("preferences",), "user_2", {"communication_style": "Korean"})
# Query as user_1, who prefers responses in Chinese
result = agent.invoke(
    {"messages": [
        {"role": "system", "content": "You are a helpful assistant. Please be extra concise."},
        {"role": "user", "content": 'What is a "hold short line"?'}
    ]},
    context=Context(user_id="user_1"),
)

for message in result['messages']:
    message.pretty_print()
# Query as user_2, who prefers responses in Korean
result = agent.invoke(
    {"messages": [
        {"role": "system", "content": "You are a helpful assistant. Please be extra concise."},
        {"role": "user", "content": 'What is a "hold short line"?'}
    ]},
    context=Context(user_id="user_2"),
)

for message in result['messages']:
    message.pretty_print()
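The Store's addressing model used above, a namespace tuple plus a key mapping to a value dict, can be mimicked with a plain dictionary. This MiniStore only illustrates the (namespace, key) lookup; LangGraph's real Store additionally wraps the payload in an Item object, which is why the middleware reads `user_prefs.value`:

```python
class MiniStore:
    """Dict-backed stand-in for the (namespace, key) -> value lookup."""
    def __init__(self):
        self._data = {}

    def put(self, namespace: tuple, key: str, value: dict) -> None:
        self._data[(namespace, key)] = value

    def get(self, namespace: tuple, key: str):
        # Returns None for a missing key, like the real Store
        return self._data.get((namespace, key))

mini = MiniStore()
mini.put(("preferences",), "user_1", {"communication_style": "Chinese"})
prefs = mini.get(("preferences",), "user_1")
style = prefs.get("communication_style", "balanced") if prefs else "balanced"
```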

1.3 Using Runtime to Manage Context

@dataclass
class Context:
    user_role: str
    deployment_env: str

@dynamic_prompt
def context_aware_prompt(request: ModelRequest) -> str:
    # Read from Runtime Context: user role and environment
    user_role = request.runtime.context.user_role
    env = request.runtime.context.deployment_env

    base = "You are a helpful assistant."

    if user_role == "admin":
        base += "\nYou can use the get_weather tool."
    else:
        base += "\nYou are prohibited from using the get_weather tool."

    if env == "production":
        base += "\nBe extra careful with any data modifications."

    return base

@tool
def get_weather(city: str) -> str:
    """Get weather for a given city."""
    return f"It's always sunny in {city}!"

agent = create_agent(
    model=llm,
    tools=[get_weather],
    middleware=[context_aware_prompt],
    context_schema=Context,
    checkpointer=InMemorySaver(),
)
# Use two variables from Runtime to dynamically control the System prompt
# Set user_role to admin to allow using the weather query tool
config = {'configurable': {'thread_id': str(uuid.uuid4())}}
result = agent.invoke(
    {"messages": [{"role": "user", "content": "How is the weather in Guangzhou today?"}]},
    context=Context(user_role="admin", deployment_env="production"),
    config=config,
)

for message in result['messages']:
    message.pretty_print()
# If user_role is changed to viewer, the weather query tool cannot be used
config = {'configurable': {'thread_id': str(uuid.uuid4())}}
result = agent.invoke(
    {"messages": [{"role": "user", "content": "How is the weather in Guangzhou today?"}]},
    context=Context(user_role="viewer", deployment_env="production"),
    config=config,
)

for message in result['messages']:
    message.pretty_print()
result['messages']
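Note that the prompt above only asks the model not to call get_weather; it does not enforce anything. A hard restriction would filter the tool list by role before the model ever sees it. A minimal sketch, reusing the role names from the example:

```python
def tools_for_role(user_role: str, tools: list) -> list:
    # Only admins keep the weather tool; everyone else gets the rest
    if user_role == "admin":
        return tools
    return [t for t in tools if getattr(t, "name", t) != "get_weather"]

available = tools_for_role("viewer", ["get_weather", "get_time"])
```

In a real Agent this filtering would live in middleware that overrides the request's tool list, rather than a free-standing function.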

2. Dynamically Modifying Message Lists

LangGraph provides a pre-built middleware @wrap_model_call for dynamically modifying message lists. The previous section demonstrated how to obtain context from State, Store, and Runtime. This section will not repeat these demonstrations. In the following example, we mainly demonstrate how to use Runtime to inject content from local files into the message list.

@dataclass
class FileContext:
    uploaded_files: list[dict]

@wrap_model_call
def inject_file_context(
    request: ModelRequest,
    handler: Callable[[ModelRequest], ModelResponse]
) -> ModelResponse:
    """Inject context about files user has uploaded this session."""
    uploaded_files = request.runtime.context.uploaded_files

    try:
        # In a script, __file__ resolves to this file's location
        base_dir = os.path.dirname(os.path.abspath(__file__))
    except NameError:
        # In a notebook, __file__ is undefined; fall back to ipynbname
        import ipynbname
        base_dir = os.path.dirname(ipynbname.path())

    file_sections = []
    for file in uploaded_files:
        path = file.get("path")
        if not path:
            continue

        base_filename = os.path.basename(path)
        stem, ext = os.path.splitext(base_filename)
        name = stem or base_filename
        ftype = ext.lstrip(".")

        # Build file description content
        content_list = [f"Name: {name}"]
        if ftype:
            content_list.append(f"Type: {ftype}")

        # Resolve relative path to absolute path
        abs_path = path if os.path.isabs(path) else os.path.join(base_dir, path)

        # Read file content
        if os.path.exists(abs_path):
            try:
                with open(abs_path, "r", encoding="utf-8") as f:
                    content_block = f.read()
            except Exception as e:
                content_block = f"[Error reading file '{abs_path}': {e}]"
        else:
            content_block = "[File not found]"

        section = (
            f"---\n"
            f"{chr(10).join(content_list)}\n\n"
            f"{content_block}\n"
            f"---"
        )
        file_sections.append(section)

    # Inject the file context once, after all files have been collected
    if file_sections:
        file_context = (
            "Loaded session files:\n"
            f"{chr(10).join(file_sections)}"
            "\nPlease refer to these files when answering questions."
        )

        # Append the file context after the existing messages
        messages = [
            *request.messages,
            {"role": "user", "content": file_context},
        ]
        request = request.override(messages=messages)

    return handler(request)

agent = create_agent(
    model=llm,
    middleware=[inject_file_context],
    context_schema=FileContext,
)
result = agent.invoke(
    {
        "messages": [{
            "role": "user",
            "content": "What should be noted about the faceless passengers in Shanghai Metro?",
        }],
    },
    context=FileContext(uploaded_files=[{"path": "./docs/rule_horror.md"}]),
)

for message in result['messages']:
    message.pretty_print()

3. Using Context in Tools

Below, we try to use context information stored in SqliteStore in tools.

# Delete SQLite database
if os.path.exists("user-info.db"):
    os.remove("user-info.db")

# Create SQLite storage
conn = sqlite3.connect("user-info.db", check_same_thread=False, isolation_level=None)
conn.execute("PRAGMA journal_mode=WAL;")
conn.execute("PRAGMA busy_timeout = 30000;")

store = SqliteStore(conn)

# Pre-set two user records
store.put(("user_info",), "Liu Ruyan", {"description": "A cold and talented beauty with extraordinary skills, embarking on a journey into the martial world to uncover the mystery of her origins.", "birthplace": "Wuxing County"})
store.put(("user_info",), "Su Mubai", {"description": "A proud swordsman with superb swordsmanship, bearing the blood feud of his family, hiding in the marketplace seeking the truth.", "birthplace": "Hang County"})

3.1 Basic Example

Using ToolRuntime

@tool
def fetch_user_data(
    user_id: str,
    runtime: ToolRuntime
) -> str:
    """
Fetch user information from the configured store.

    :param user_id: The unique identifier of the user.
    :param runtime: The tool runtime context injected by the framework.
    :return: The user's description string if found; an empty string otherwise.
    """
    store = runtime.store
    user_info = store.get(("user_info",), user_id)

    user_desc = ""
    if user_info:
        user_desc = user_info.value.get("description", "")

    return user_desc

agent = create_agent(
    model=llm,
    tools=[fetch_user_data],
    store=store,
)
result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Within five minutes, I want all information about Liu Ruyan"
    }]
})

for message in result['messages']:
    message.pretty_print()

3.2 More Complex Example

Using ToolRuntime[Context]

@dataclass
class Context:
    key: str

@tool
def fetch_user_data(
    user_id: str,
    runtime: ToolRuntime[Context]
) -> str:
    """
Fetch user information from the configured store.

    :param user_id: The unique identifier of the user.
    :param runtime: The tool runtime context injected by the framework.
    :return: The user's description string if found; an empty string otherwise.
    """
    key = runtime.context.key

    store = runtime.store
    user_info = store.get(("user_info",), user_id)

    user_desc = ""
    if user_info:
        user_desc = user_info.value.get(key, "")

    return f"{key}: {user_desc}"

agent = create_agent(
    model=llm,
    tools=[fetch_user_data],
    store=store,
)
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Within five minutes, I want all information about Liu Ruyan"}]},
    context=Context(key="birthplace"),
)

for message in result['messages']:
    message.pretty_print()

4. Compressing Context

LangChain provides a built-in middleware SummarizationMiddleware for compressing context. This middleware maintains a typical life-cycle context: unlike the transient updates of model context and tool context, it is updated continuously over the run, replacing old messages with summaries.

Unless the context grows long enough to degrade model performance, there is no need to use SummarizationMiddleware, and the threshold for triggering summarization can generally be set quite high. For example:

  • max_tokens_before_summary: 3000

  • messages_to_keep: 20

If you want to learn more about Context Rot, the Chroma team published Context Rot: How Increasing Input Tokens Impacts LLM Performance on July 14, 2025, which systematically reveals the phenomenon of model performance degradation caused by long contexts.
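The replacement behavior can be sketched in plain Python, assuming a crude character-based token estimate (the real middleware uses a proper token counter and asks the model to write the summary):

```python
def compress(messages: list[dict], max_tokens: int = 3000, keep: int = 20) -> list[dict]:
    # Crude estimate: 1 token is roughly 4 characters of content
    tokens = sum(len(m["content"]) for m in messages) // 4
    if tokens <= max_tokens or len(messages) <= keep:
        return messages
    old, recent = messages[:-keep], messages[-keep:]
    # Placeholder summary; the middleware would generate one with the LLM
    summary = {"role": "system", "content": f"[summary of {len(old)} earlier messages]"}
    return [summary, *recent]

msgs = [{"role": "user", "content": "x" * 100} for _ in range(30)]
short = compress(msgs, max_tokens=100, keep=5)  # 30 messages -> 1 summary + 5 recent
```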

# Create short-term memory
checkpointer = InMemorySaver()

# Create Agent with built-in summarization middleware
# The trigger value is set very low to make the configuration work in our example
agent = create_agent(
    model=llm,
    middleware=[
        SummarizationMiddleware(
            model=llm,
            trigger=('tokens', 40),  # Trigger summarization at 40 tokens
            keep=('messages', 1),  # Keep only the last message after summarization
        ),
    ],
    checkpointer=checkpointer,
)
# The checkpointer is attached at creation time, so each invocation needs a thread_id
config = {'configurable': {'thread_id': str(uuid.uuid4())}}
result = agent.invoke(
    {"messages": [
        {"role": "user", "content": "How is the weather in Guangzhou today?"},
        {"role": "assistant", "content": "The weather in Guangzhou is great"},
        {"role": "user", "content": "What should I eat?"},
        {"role": "assistant", "content": "How about trying lemongrass eel casserole?"},
        {"role": "user", "content": "What is lemongrass?"},
        {"role": "assistant", "content": "Lemongrass, also known as lemon grass, is commonly found in Thai Tom Yum soup and Vietnamese grilled meat dishes"},
        {"role": "user", "content": "Aww, what are we waiting for? Let's go eat!"},
    ]},
    config=config,
)

for message in result['messages']:
    message.pretty_print()