Commit 38c6cb9 (1 parent 0d7d536)
in-house framework readme (#70)

1 file changed: +216 −0
instrumentation-genai/opentelemetry-instrumentation-langchain/examples/multi_agent_travel_planner/in_house_version

# In-House Instrumentation Example: Multi-Agent Travel Planner

This directory shows how to manually instrument an in-house (LangGraph / LangChain-based) multi-agent workflow using the structured GenAI types provided by `opentelemetry.util.genai.types`.

The core types:

* `Workflow` – high-level orchestration span (end-to-end request lifecycle).
* `AgentInvocation` – one logical agent or tool reasoning step.
* `LLMInvocation` – a single model call (chat / completion / embeddings).
* `InputMessage` / `OutputMessage` – structured messages (a role plus a list of parts). Each part can be a `Text`, image, etc.

Benefits of using these types instead of ad-hoc span attributes:

1. Consistency – every model call captures inputs, outputs, and tokens the same way.
2. Extensibility – evaluation / replay / redaction layers can rely on stable data shapes.
3. Safety – avoids leaking PII by keeping messages as typed parts you can filter before export (a sketch follows this list).
4. Metrics – token counts populate standard semantic convention fields without manual key guessing.
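
For example, a pre-export redaction pass (benefit 3) can walk the typed parts and mask anything sensitive. A minimal sketch, where `REDACT_PATTERN` and `redact_messages` are hypothetical names of our own, not part of the library:

```python
import re

from opentelemetry.util.genai.types import InputMessage, Text

REDACT_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")  # naive email matcher


def redact_messages(messages: list[InputMessage]) -> list[InputMessage]:
    # Mask email addresses inside Text parts (assumes parts are mutable).
    for message in messages:
        for part in message.parts:
            if isinstance(part, Text):
                part.content = REDACT_PATTERN.sub("[REDACTED]", part.content)
    return messages


# e.g. llm_invocation.input_messages = redact_messages(llm_invocation.input_messages)
```
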
---
## Minimal LLMInvocation Example (Single OpenAI Chat Call – Direct OpenAI Client)

```python
from opentelemetry.util.genai.handler import get_telemetry_handler
from opentelemetry.util.genai.types import (
    Workflow,
    LLMInvocation,
    InputMessage,
    OutputMessage,
    Text,
)
from openai import OpenAI

# Requires: pip install openai ; environment variable OPENAI_API_KEY set.

handler = get_telemetry_handler()

workflow = Workflow(
    name="demo_workflow",
    workflow_type="single_call",
    description="One-off chat completion",
    initial_input="Hello, can you summarise OpenTelemetry?",
)
handler.start_workflow(workflow)

llm_invocation = LLMInvocation(
    request_model="gpt-4o-mini",  # model identifier
    operation="chat",
    input_messages=[
        InputMessage(role="system", parts=[Text(content="You are a concise assistant.")]),
        InputMessage(role="user", parts=[Text(content=workflow.initial_input or "")]),
    ],
)
llm_invocation.provider = "openai"
llm_invocation.framework = "native-client"
handler.start_llm(llm_invocation)

# Convert InputMessages to OpenAI API format (list of {role, content} dicts)
openai_messages = [
    {"role": m.role, "content": "".join(part.content for part in m.parts if hasattr(part, "content"))}
    for m in llm_invocation.input_messages
]

client = OpenAI()
response = client.chat.completions.create(
    model=llm_invocation.request_model,
    messages=openai_messages,
    temperature=0.2,
)

# Extract the assistant answer; content can be None (e.g. for tool calls)
choice = response.choices[0]
assistant_text = choice.message.content or ""

llm_invocation.output_messages = [
    OutputMessage(
        role="assistant",
        parts=[Text(content=assistant_text)],
        finish_reason=choice.finish_reason or "stop",
    )
]

# Token usage (OpenAI returns usage.prompt_tokens / usage.completion_tokens / usage.total_tokens)
if response.usage:
    llm_invocation.input_tokens = response.usage.prompt_tokens
    llm_invocation.output_tokens = response.usage.completion_tokens

handler.stop_llm(llm_invocation)

workflow.final_output = assistant_text
handler.stop_workflow(workflow)
```

Key points:

* All user/system inputs are captured up front (`input_messages`).
* The model response becomes `output_messages` (a list, to support multi-turn or streaming tool scenarios).
* Token counts live on the invocation object – downstream metrics aggregators don't need to parse raw attributes.

---

## AgentInvocation + LLMInvocation (Typical Pattern – Direct OpenAI Client)

When an agent first reasons about a task (planning, tool selection), you can represent that step with `AgentInvocation`. Inside the agent you usually trigger one or more `LLMInvocation`s.

```python
from opentelemetry.util.genai.types import (
    Workflow,
    AgentInvocation,
    LLMInvocation,
    InputMessage,
    OutputMessage,
    Text,
)
from opentelemetry.util.genai.handler import get_telemetry_handler
from openai import OpenAI

handler = get_telemetry_handler()
workflow = Workflow(name="agent_demo", workflow_type="planner", initial_input="Plan a 2-day trip to Rome")
handler.start_workflow(workflow)

agent = AgentInvocation(
    name="trip_planner",
    agent_type="planner",
    model="gpt-4o-mini",
    system_instructions="You plan concise city itineraries",
    input_context=workflow.initial_input,
)
handler.start_agent(agent)

llm_call = LLMInvocation(
    request_model="gpt-4o-mini",
    operation="chat",
    input_messages=[
        InputMessage(role="system", parts=[Text(content="You provide day-by-day plans.")]),
        InputMessage(role="user", parts=[Text(content="Plan a 2-day trip to Rome focusing on food and history.")]),
    ],
)
llm_call.provider = "openai"
llm_call.framework = "native-client"
handler.start_llm(llm_call)

# Map typed InputMessages to the OpenAI chat format
client = OpenAI()
openai_messages = [
    {"role": m.role, "content": "".join(p.content for p in m.parts if hasattr(p, "content"))}
    for m in llm_call.input_messages
]
response = client.chat.completions.create(
    model=llm_call.request_model,
    messages=openai_messages,
    temperature=0.3,
)

choice = response.choices[0]
assistant_text = choice.message.content or ""  # content can be None
llm_call.output_messages = [
    OutputMessage(
        role="assistant",
        parts=[Text(content=assistant_text)],
        finish_reason=choice.finish_reason or "stop",
    )
]
if response.usage:
    llm_call.input_tokens = response.usage.prompt_tokens
    llm_call.output_tokens = response.usage.completion_tokens

agent.output_result = assistant_text
handler.stop_llm(llm_call)
handler.stop_agent(agent)
workflow.final_output = assistant_text
handler.stop_workflow(workflow)
```

Why this structure helps:

* Multiple `LLMInvocation`s inside one agent (tool lookups, reasoning, synthesis) stay grouped beneath the agent span.
* You can decorate the agent span with evaluation signals later (e.g. a quality score) without touching the core LLM spans.
* Redaction / filtering can operate at message-part granularity before export.

---

## Helper Strategy (Token + Output Auto-Population)

In the travel planner example we use a helper to:

1. Create `output_messages` if the node hasn't set them yet.
2. Extract token usage from LangChain's `usage_metadata` or `response_metadata.token_usage`.

Pattern:

```python
_apply_llm_response_metadata(response_message, llm_invocation)
```

Call this immediately after each model invocation, passing the LangChain response message, then stop the LLM span.
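
The helper itself is not reproduced here; a hedged sketch of what it might look like, assuming a LangChain `AIMessage`-style response object (adapt attribute names to your client):

```python
from opentelemetry.util.genai.types import LLMInvocation, OutputMessage, Text


def _apply_llm_response_metadata(response_message, llm_invocation: LLMInvocation) -> None:
    # 1. Populate output_messages if the calling node has not set them yet.
    if not llm_invocation.output_messages:
        finish = (response_message.response_metadata or {}).get("finish_reason", "stop")
        llm_invocation.output_messages = [
            OutputMessage(
                role="assistant",
                parts=[Text(content=response_message.content or "")],
                finish_reason=finish,
            )
        ]
    # 2. Extract token usage: usage_metadata first, then response_metadata["token_usage"].
    usage = getattr(response_message, "usage_metadata", None) or (
        response_message.response_metadata or {}
    ).get("token_usage", {})
    if usage:
        llm_invocation.input_tokens = usage.get("input_tokens") or usage.get("prompt_tokens")
        llm_invocation.output_tokens = usage.get("output_tokens") or usage.get("completion_tokens")
```
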
---
## Adding Evaluations Later

Because inputs and outputs are normalized:

* You can iterate over finished `LLMInvocation`s and feed them to an evaluation agent (latency, toxicity, factuality).
* Link evaluation spans as children or siblings referencing the `llm_invocation_id` (a sketch follows this list).
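
A minimal sketch, assuming you collect finished invocations yourself and that `evaluate_toxicity` and `record_evaluation` are hypothetical helpers of your own:

```python
from opentelemetry.util.genai.types import LLMInvocation

finished_invocations: list[LLMInvocation] = []  # append after each handler.stop_llm(...)


def text_of(messages) -> str:
    # Flatten all Text parts into one string.
    return " ".join(p.content for m in messages for p in m.parts if hasattr(p, "content"))


for inv in finished_invocations:
    score = evaluate_toxicity(text_of(inv.input_messages), text_of(inv.output_messages))  # hypothetical
    record_evaluation(invocation_id=inv.llm_invocation_id, metric="toxicity", value=score)  # hypothetical
```
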
---
## Minimal Lifecycle Checklist

1. Start `Workflow` (once per external request).
2. For each logical reasoning component: start an `AgentInvocation`.
3. Inside the agent: start one or more `LLMInvocation` spans.
4. Populate `input_messages` before the call; populate `output_messages` + tokens right after.
5. Stop spans in reverse order (LLM → Agent → Workflow); the skeleton after this list shows one way to guarantee that.
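
A compact skeleton of that lifecycle, using the handler API from the examples above; nested `try`/`finally` keeps the reverse stop order even when a step raises:

```python
from opentelemetry.util.genai.handler import get_telemetry_handler
from opentelemetry.util.genai.types import Workflow, AgentInvocation, LLMInvocation

handler = get_telemetry_handler()
workflow = Workflow(name="request", workflow_type="demo", initial_input="...")
handler.start_workflow(workflow)
try:
    agent = AgentInvocation(name="worker", agent_type="demo")
    handler.start_agent(agent)
    try:
        llm = LLMInvocation(request_model="gpt-4o-mini", operation="chat", input_messages=[])
        handler.start_llm(llm)
        try:
            pass  # call the model, then set output_messages and token counts
        finally:
            handler.stop_llm(llm)        # 1. LLM
    finally:
        handler.stop_agent(agent)        # 2. Agent
finally:
    handler.stop_workflow(workflow)      # 3. Workflow
```
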
---
## Troubleshooting

* Missing tokens? Ensure your client/library actually returns usage metadata; not all providers do.
* Dropped messages? Confirm you set both `input_messages` and `output_messages` *before* stopping the LLM span.
* Need streaming? Append incremental `OutputMessage` parts as they arrive; finalise with a `finish_reason` of `stop` or `length` (see the sketch below).
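
A hedged sketch of the streaming case, reusing `handler` and `llm_invocation` from the first example; the chunk loop is the standard OpenAI streaming shape:

```python
from openai import OpenAI
from opentelemetry.util.genai.types import OutputMessage, Text

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarise OpenTelemetry."}],
    stream=True,
)

parts = []
finish_reason = "stop"
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        parts.append(Text(content=delta))  # one part per received increment
    if chunk.choices[0].finish_reason:     # "stop", "length", ...
        finish_reason = chunk.choices[0].finish_reason

llm_invocation.output_messages = [
    OutputMessage(role="assistant", parts=parts, finish_reason=finish_reason)
]
handler.stop_llm(llm_invocation)
```
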
---
