Commit b810099

add prompt tracking to hallucination (#32998)
* add prompt tracking to hallucination

* add version notice

* Update content/en/llm_observability/evaluations/managed_evaluations/_index.md

---------

Co-authored-by: Heston Hoffman <[email protected]>
1 parent 3178478 commit b810099

File tree

1 file changed (+11 −8 lines)

  • content/en/llm_observability/evaluations/managed_evaluations


content/en/llm_observability/evaluations/managed_evaluations/_index.md

Lines changed: 11 additions & 8 deletions
@@ -171,16 +171,17 @@ This check identifies instances where the LLM makes a claim that disagrees with
 | Evaluated on Output | Evaluated using LLM | Hallucination flags any output that disagrees with the context provided to the LLM. |
 
 ##### Instrumentation
-
-In order to take advantage of Hallucination detection, you will need to annotate LLM spans with the user query and context:
+You can use [Prompt Tracking][6] annotations to track your prompts and set them up for hallucination configuration. Annotate your LLM spans with the user query and context so hallucination detection can evaluate model outputs against the retrieved data.
 
 {{< code-block lang="python" >}}
 from ddtrace.llmobs import LLMObs
-from ddtrace.llmobs.utils import Prompt
+from ddtrace.llmobs.types import Prompt
 
 # if your llm call is auto-instrumented...
 with LLMObs.annotation_context(
     prompt=Prompt(
+        id="generate_answer_prompt",
+        template="Generate an answer to this question: {user_question}. Only answer based on the information from this article: {article}",
         variables={"user_question": user_question, "article": article},
         rag_query_variables=["user_question"],
         rag_context_variables=["article"]
@@ -195,18 +196,20 @@ def generate_answer():
     ...
     LLMObs.annotate(
         prompt=Prompt(
+            id="generate_answer_prompt",
+            template="Generate an answer to this question: {user_question}. Only answer based on the information from this article: {article}",
             variables={"user_question": user_question, "article": article},
             rag_query_variables=["user_question"],
             rag_context_variables=["article"]
         ),
     )
 {{< /code-block >}}
-
-The variables dictionary should contain the key-value pairs your app uses to construct the LLM input prompt (for example, the messages for an OpenAI chat completion request). Set `rag_query_variables` and `rag_context_variables` to indicate which variables constitute the query and the context, respectively. A list of variables is allowed to account for cases where multiple variables make up the context (for example, multiple articles retrieved from a knowledge base).
+The `variables` dictionary should contain the key-value pairs your app uses to construct the LLM input prompt (for example, the messages for an OpenAI chat completion request). Use `rag_query_variables` and `rag_context_variables` to specify which variables represent the user query and which represent the retrieval context. A list of variables is allowed to account for cases where multiple variables make up the context (for example, multiple articles retrieved from a knowledge base).
 
 Hallucination detection does not run if either the RAG query, the RAG context, or the span output is empty.
 
-You can find more examples of instrumentation in the [SDK documentation][6].
+Prompt Tracking is available in the Python SDK starting with version 3.15. It also requires a prompt ID and a template, which are used to monitor and track your prompt versions.
+You can find more examples of prompt tracking and instrumentation in the [SDK documentation][6].
 
 ##### Hallucination configuration
 <div class="alert alert-info">Hallucination detection is only available for OpenAI.</div>
@@ -336,8 +339,8 @@ This check ensures that sensitive information is handled appropriately and secur
 [2]: https://app.datadoghq.com/llm/evaluations
 [3]: https://app.datadoghq.com/llm/applications
 [4]: /security/sensitive_data_scanner/
-[5]: https://docs.datadoghq.com/api/latest/ip-ranges/
-[6]: https://docs.datadoghq.com/llm_observability/setup/sdk/
+[5]: /api/latest/ip-ranges/
+[6]: /llm_observability/instrumentation/sdk?tab=python#prompt-tracking
 [7]: https://app.datadoghq.com/dash/integration/llm_evaluations_token_usage
 [9]: https://learnprompting.org/docs/prompt_hacking/offensive_measures/simple-instruction-attack
 [10]: https://owasp.org/www-community/attacks/Code_Injection
