
Conversation

@zwanzigg zwanzigg commented Nov 9, 2025

Fixes #3428


Important

Fixes unhashable attributes crash in OpenAI streaming by sanitizing attributes for metric recording.

  • Behavior:
    • Adds _sanitize_attributes_for_metrics() in chat_wrappers.py to convert unhashable attributes to hashable types for metric recording.
    • Applies sanitization in _set_chat_metrics(), _shared_attributes(), _build_from_streaming_response(), and _abuild_from_streaming_response().
  • Tests:
    • New test file test_unhashable_attributes_fix.py added to verify the fix.
    • Tests include handling of lists, dicts, None values, and preserving hashable values.
    • Simulates original issue conditions to ensure the fix prevents crashes.

This description was created by Ellipsis for a1c608d.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed an issue where OpenTelemetry metrics collection could fail when encountering certain attribute value types in OpenAI instrumentation.
  • Tests

    • Added comprehensive test coverage for metrics attribute handling with various data types and edge cases.

Fix unhashable attributes crash in OpenAI streaming instrumentation (traceloop#3428)

This commit resolves a critical bug where OpenAI streaming instrumentation
crashes with "TypeError: unhashable type: 'list'" when tool definitions
contain lists or other unhashable types.

**Problem:**
- OpenAI tool definitions contain lists (e.g., "required": ["param1", "param2"])
- These unhashable values propagate to metric attributes via Config.get_common_metrics_attributes()
- OpenTelemetry metrics require hashable attributes for aggregation keys
- When creating frozenset(attributes.items()), TypeError occurs and crashes user code
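As a minimal illustration of that last point (standard library only; the attribute key here is made up):

    >>> attributes = {"required_params": ["param1", "param2"]}
    >>> frozenset(attributes.items())
    Traceback (most recent call last):
      ...
    TypeError: unhashable type: 'list'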

**Solution:**
- Added _sanitize_attributes_for_metrics() function to convert unhashable values to JSON strings
- Applied sanitization in all metric recording paths:
  - ChatStream._shared_attributes() (main streaming path)
  - _set_chat_metrics() (non-streaming path)
  - _build_from_streaming_response() (legacy OpenAI v0 compatibility)
  - _abuild_from_streaming_response() (async legacy compatibility)
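
A rough sketch of what such a sanitizer can look like (illustrative only, not the exact code merged into chat_wrappers.py; the json.dumps-then-str fallback order follows the description above):

    import json

    def _sanitize_attributes_for_metrics(attributes):
        # Keep hashable values as-is; serialize lists/dicts to JSON strings so
        # OpenTelemetry can build hashable aggregation keys from the attributes.
        sanitized = {}
        for key, value in (attributes or {}).items():
            try:
                hash(value)
                sanitized[key] = value
            except TypeError:
                try:
                    sanitized[key] = json.dumps(value)
                except (TypeError, ValueError):
                    sanitized[key] = str(value)  # last-resort string conversion
        return sanitized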

**Testing:**
- Added comprehensive test suite covering edge cases
- Verified fix preserves critical attributes while converting problematic values
- All hashable values remain unchanged, unhashable values become JSON strings
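
For flavor, a hypothetical test in the same spirit (the test name and attribute keys are illustrative, not the actual contents of test_unhashable_attributes_fix.py):

    from opentelemetry.instrumentation.openai.shared.chat_wrappers import (
        _sanitize_attributes_for_metrics,
    )

    def test_unhashable_values_become_json_strings():
        attrs = {"functions": ["param1", "param2"], "model": "gpt-4"}
        sanitized = _sanitize_attributes_for_metrics(attrs)
        frozenset(sanitized.items())  # must no longer raise TypeError
        assert sanitized["functions"] == '["param1", "param2"]'
        assert sanitized["model"] == "gpt-4"  # hashable values unchanged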

Fixes traceloop#3428
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai coderabbitai bot commented Nov 9, 2025

Walkthrough

This PR introduces a _sanitize_attributes_for_metrics() helper function to convert non-hashable metric attribute values (lists, dicts) to JSON strings or string representations, ensuring OpenTelemetry metric compatibility. The function is integrated into five metric recording locations across the ChatStream class and related functions.

Changes

  • Metric attribute sanitization: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
    Added _sanitize_attributes_for_metrics() helper to convert non-hashable types (lists, dicts) to JSON strings. Applied sanitization in _set_chat_metrics(), ChatStream._shared_attributes(), ChatStream._process_complete_response(), and streaming response builders before metric recording.
  • Test suite for sanitization fix: packages/opentelemetry-instrumentation-openai/tests/test_unhashable_attributes_fix.py
    New test module validating sanitization behavior across edge cases: lists, dicts, None/empty values, and pre-existing hashable values. Includes reproduction of the original TypeError crash and verification that sanitized attributes remain hashable.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Sanitization logic correctness: Verify JSON serialization handles edge cases and fallback string conversion is appropriate
  • Integration points: Confirm all five metric recording locations are covered and sanitization is applied before OpenTelemetry record() calls
  • Test coverage: Validate test scenarios comprehensively cover unhashable types (lists, dicts) and ensure mocking of Config.get_common_metrics_attributes() accurately reproduces the production crash
  • Streaming path: Pay special attention to _process_item() and streaming builders where the crash originally occurred

Possibly related PRs

  • PR #3155: Modifies chat_wrappers.py and ChatStream methods including _process_complete_response() and streaming metric emission—direct overlap in implementation areas.

Poem

🐰 Lists and dicts caused quite a fright,
Making metrics crash in streaming's night,
But JSON strings save the day,
Now hashable hops keep crashes at bay! ✨

Pre-merge checks

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 42.86%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
  • Description Check ✅ Passed: Check skipped - CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: The title 'Fix unhashable attributes crash in OpenAI streaming instrumentation' directly and accurately summarizes the main change: fixing a critical bug where unhashable attribute values (lists/dicts) crash metric recording in OpenAI streaming instrumentation.
  • Linked Issues check ✅ Passed: The PR addresses all key coding requirements from issue #3428: it adds the _sanitize_attributes_for_metrics helper to convert unhashable values to hashable forms, integrates sanitization into the metric recording paths (_set_chat_metrics, ChatStream._shared_attributes, _process_complete_response, streaming builders), and includes comprehensive tests validating the fix with various attribute types.
  • Out of Scope Changes check ✅ Passed: All changes are scoped to fixing the unhashable attributes issue: the new sanitization function, its integration points for metric recording, and the corresponding test module. No unrelated modifications to other functionality or files are present.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Caution

Changes requested ❌

Reviewed everything up to a1c608d in 1 minute and 48 seconds.
  • Reviewed 271 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py:772
  • Draft comment:
    Good use of _sanitize_attributes_for_metrics in _shared_attributes. If nested unhashable structures can occur inside attribute values, consider whether a recursive sanitization might be needed.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50%
2. packages/opentelemetry-instrumentation-openai/tests/test_unhashable_attributes_fix.py:17
  • Draft comment:
    The tests comprehensively cover various unhashable scenarios. Consider adding an extra test for nested unhashable types if such cases are expected in practice.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50%

for key, value in attributes.items():
    try:
        # Test if the value is hashable by creating a frozenset
        frozenset({key: value}.items())
Contributor

Consider replacing the frozenset({key: value}.items()) check with a direct call to hash(value) to test hashability. This is more idiomatic and may be clearer.

Suggested change
-        frozenset({key: value}.items())
+        hash(value)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (1)

234-244: Fix async exception metrics sanitization gap.
achat_wrapper still merges Config.get_common_metrics_attributes() straight into the metric attributes during exception handling. If those common attributes carry lists/dicts (the original failure mode), the subsequent duration_histogram.record/exception_counter.add calls will raise the same TypeError and crash user code. Please sanitize here as well so the fix covers error paths.

-            common_attributes = Config.get_common_metrics_attributes()
-            attributes = {
-                **common_attributes,
+            common_attributes = _sanitize_attributes_for_metrics(
+                Config.get_common_metrics_attributes()
+            )
+            attributes = {
+                **common_attributes,
                 "error.type": e.__class__.__name__,
             }
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 3468985 and a1c608d.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (5 hunks)
  • packages/opentelemetry-instrumentation-openai/tests/test_unhashable_attributes_fix.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
  • packages/opentelemetry-instrumentation-openai/tests/test_unhashable_attributes_fix.py
🧬 Code graph analysis (2)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (2)
  • metric_shared_attributes (364-377)
  • _get_openai_base_url (275-281)
packages/opentelemetry-instrumentation-openai/tests/test_unhashable_attributes_fix.py (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (3)
  • _sanitize_attributes_for_metrics (67-95)
  • ChatStream (634-888)
  • _shared_attributes (762-772)
🪛 Flake8 (7.3.0)
packages/opentelemetry-instrumentation-openai/tests/test_unhashable_attributes_fix.py

[error] 8-8: 'unittest.mock.patch' imported but unused

(F401)

cause TypeError when streaming chat completions are used.
"""
import pytest
from unittest.mock import MagicMock, patch

⚠️ Potential issue | 🟠 Major

Drop unused patch import.
Flake8 flags this (F401), so the test module won’t pass lint until patch is removed.

-from unittest.mock import MagicMock, patch
+from unittest.mock import MagicMock
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-from unittest.mock import MagicMock, patch
+from unittest.mock import MagicMock

🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/tests/test_unhashable_attributes_fix.py
around line 8, the import statement includes an unused `patch` (from
unittest.mock) which triggers flake8 F401; remove `patch` from the import so
only `MagicMock` is imported (or delete the entire unused import if not needed
elsewhere) to satisfy linting.

Member

@nirga nirga left a comment

Thanks @zwanzigg - can you remove the redundant AI generated comments please? They're hurting the readability.

Also, you should test for hashability using isinstance(..., Hashable)
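
For reference, a minimal sketch of that check (hypothetical helper name; the comment notes a known limitation of the Hashable ABC):

    from collections.abc import Hashable

    def _is_hashable(value):
        # The Hashable ABC only verifies that __hash__ is defined, so
        # isinstance((1, [2]), Hashable) is True even though hashing that
        # tuple raises TypeError; try/except around hash(value) is stricter.
        return isinstance(value, Hashable)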

Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: [Critical Bug] OpenAI streaming instrumentation crashes production with "unhashable type: 'list'" when using tool calls
