
Conversation

@nirga (Member) commented Nov 20, 2025

Summary

Adds support for logging the gen_ai.request.structured_output_schema attribute for Anthropic Claude and Google Gemini APIs, completing coverage across all major LLM providers.

Changes

Anthropic Claude

  • Added logging of the output_format parameter with json_schema type (see the sketch after this list)
  • Supports Claude's new Structured Outputs feature (launched November 14, 2025)
  • Works with Sonnet 4.5 and Opus 4.1 models
  • Requires beta header: anthropic-beta: structured-outputs-2025-11-13
  • Implementation: packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
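
As a rough illustration, the kind of request this instrumentation now inspects might look like the sketch below. The model id, schema contents, and exact nesting of output_format are assumptions based on this PR's description and Anthropic's structured-outputs beta, not code copied from the repository, and it needs an anthropic SDK version that accepts output_format (>= 0.50.0 per the testing notes further down).

    import anthropic

    client = anthropic.Anthropic()  # API key read from ANTHROPIC_API_KEY
    response = client.messages.create(
        model="claude-sonnet-4-5",  # model id assumed for illustration
        max_tokens=1024,
        extra_headers={"anthropic-beta": "structured-outputs-2025-11-13"},
        output_format={
            "type": "json_schema",
            "json_schema": {  # nesting assumed; span_utils.py logs the schema it finds here
                "schema": {
                    "type": "object",
                    "properties": {
                        "joke": {"type": "string"},
                        "rating": {"type": "integer"},
                    },
                    "required": ["joke", "rating"],
                },
            },
        },
        messages=[
            {"role": "user", "content": "Tell me a joke about OpenTelemetry and rate it from 1 to 10"}
        ],
    )
    print(response.content)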

Google Gemini

  • Added logging of response_schema from the generation_config parameter (see the sketch after this list)
  • Also checks for direct response_schema kwargs
  • Implementation: packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
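
A minimal sketch of a Gemini call that exercises this path, using the google-generativeai SDK; the model name and schema are illustrative, the accepted schema shape varies by SDK version, and the schema could equally be passed as a direct response_schema kwarg, which the instrumentation also checks.

    import os
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])

    joke_schema = {
        "type": "object",
        "properties": {
            "joke": {"type": "string"},
            "rating": {"type": "integer"},
        },
        "required": ["joke", "rating"],
    }

    model = genai.GenerativeModel("gemini-1.5-flash")  # model name assumed
    response = model.generate_content(
        "Tell me a joke about OpenTelemetry and rate it from 1 to 10",
        generation_config=genai.GenerationConfig(
            response_mime_type="application/json",
            response_schema=joke_schema,  # this is what gets JSON-dumped onto the span
        ),
    )
    print(response.text)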

OpenAI

  • Already supported (no changes needed)
  • Uses existing implementation

Sample Apps

Added demonstration apps for all three providers:

  • packages/sample-app/sample_app/openai_structured_outputs_demo.py (tested ✅)
  • packages/sample-app/sample_app/anthropic_structured_outputs_demo.py
  • packages/sample-app/sample_app/gemini_structured_outputs_demo.py

Testing

The OpenAI sample app was tested successfully and shows the gen_ai.request.structured_output_schema attribute being logged correctly.
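
For reference, a minimal sketch of the kind of call the OpenAI demo makes; the app name, model, prompt, and Pydantic model shown here are illustrative rather than copied from openai_structured_outputs_demo.py.

    from openai import OpenAI
    from pydantic import BaseModel
    from traceloop.sdk import Traceloop

    Traceloop.init(app_name="openai_structured_outputs_demo")  # app name assumed

    class JokeRating(BaseModel):
        joke: str
        rating: int

    client = OpenAI()  # API key read from OPENAI_API_KEY
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[{"role": "user", "content": "Tell me a joke about OpenTelemetry and rate it from 1 to 10"}],
        response_format=JokeRating,  # schema derived from the Pydantic model ends up on the span
    )
    print(completion.choices[0].message.parsed)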

Related Documentation

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Instrumentation now captures and records structured (JSON schema) output definitions for Anthropic.
    • Google Generative AI instrumentation updated to capture structured output schemas from multiple sources.
    • Added sample apps demonstrating structured-output workflows for Anthropic, Gemini, and OpenAI models.



Important

Adds structured output schema logging for Anthropic and Google Gemini APIs, with sample apps and tests.

  • Behavior:
    • Adds logging of gen_ai.request.structured_output_schema for Anthropic and Google Gemini APIs.
    • Anthropic: Logs output_format with json_schema type in span_utils.py.
    • Google Gemini: Logs response_schema from generation_config or kwargs in span_utils.py.
  • Testing:
    • Adds test_structured_outputs.py for Anthropic, currently skipped due to SDK version.
  • Sample Apps:
    • Adds anthropic_structured_outputs_demo.py, gemini_structured_outputs_demo.py, and openai_structured_outputs_demo.py for demonstration.

This description was created by Ellipsis for ca5f423. It will automatically update as commits are pushed.

Add support for logging gen_ai.request.structured_output_schema attribute
for Anthropic Claude and Google Gemini APIs, completing coverage across all
major LLM providers.

Changes:
- Anthropic: Log output_format parameter with json_schema type
  Supports Claude's new structured outputs feature (launched Nov 2025)
  for Sonnet 4.5 and Opus 4.1 models

- Gemini: Log response_schema from generation_config parameter
  Supports both generation_config.response_schema and direct response_schema kwargs

- OpenAI: Already supported (no changes needed)

Sample apps added to demonstrate structured outputs for all three providers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@coderabbitai coderabbitai bot commented Nov 20, 2025

Walkthrough

Adds detection and capture of structured output JSON schemas into instrumentation spans for Anthropic and Google Generative AI; introduces three sample apps demonstrating structured outputs for Anthropic, Google Gemini, and OpenAI.

Changes

Cohort / File(s) Summary
Anthropic Instrumentation
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
In aset_input_attributes, when output_format is a dict with type == "json_schema" and a schema is present, sets LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA to the JSON-dumped schema (no change to other branches).
Google Generative AI Instrumentation
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
In set_model_request_attributes, attempts to set LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA from generation_config.response_schema or response_schema in kwargs; both accesses are guarded with try/except to ignore errors.
Structured Outputs Demos
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py, packages/sample-app/sample_app/gemini_structured_outputs_demo.py, packages/sample-app/sample_app/openai_structured_outputs_demo.py
Adds three demo scripts that load env, initialize Traceloop, define a JSON schema or Pydantic model for a joke+rating, call the respective model with structured-output settings, print parsed results, and run via if __name__ == "__main__": main().
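
In both cases the end result is the same span attribute. The snippet below is illustrative only, showing roughly what gets recorded; the real code goes through the packages' set_span_attribute / _set_span_attribute helpers and the SpanAttributes constant, and the schema value shown is a made-up example.

    # Illustrative only: the attribute both code paths aim to produce on the request span.
    span.set_attribute(
        "gen_ai.request.structured_output_schema",
        '{"type": "object", "properties": {"joke": {"type": "string"}, "rating": {"type": "integer"}}}',
    )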

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant App as Demo App
  participant SDK as Model SDK (Anthropic/Gemini/OpenAI)
  participant Instr as Instrumentation/span_utils
  participant Tracer as Tracing Backend

  App->>SDK: send request (includes output_format / response_schema)
  SDK->>Instr: instrumentation hook / set model attributes
  alt output schema present (dict/json_schema or response_schema)
    Instr-->>Instr: extract schema, json-dump
    Instr->>Tracer: set LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA attribute
  else no schema
    Instr->>Tracer: set existing model attributes (unchanged)
  end
  SDK->>App: model response
  Note right of Tracer: Span now contains structured output schema attribute when available

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Reviewers should spot-check:
    • Correct conditional checks for output_format shape in Anthropic utility.
    • Robustness of try/except usage and that exceptions are not overly broad in Google instrumentation.
    • Demo scripts for correct initialization and example schema serialization/parsing.

Suggested reviewers

  • doronkopit5

Poem

🐇 I found a schema in the breeze,

tucked in spans with gentle ease.
Claude, Gemini, OpenAI sing,
JSON dreams the traces bring.
Hop, print, and trace — telemetry glees.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 0.00%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title Check — ✅ Passed: The title accurately describes the main change: adding structured output schema logging for the Anthropic and Gemini APIs.
  • Description Check — ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.

@nirga changed the title from "feat: add structured outputs schema logging for Anthropic and Gemini" to "fix: add structured outputs schema logging for Anthropic and Gemini" on Nov 20, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (1)

170-177: Consider logging structured_output_schema even when prompt capture is disabled

output_format handling sits under should_send_prompts(), so SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA won’t be set when prompt/content capture is turned off, even though this schema is typically configuration rather than user content. Consider moving this block outside the should_send_prompts() guard so the attribute is always populated when output_format is present, aligning with how other providers log this attribute.
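
A rough sketch of the suggested restructure; the surrounding code, the exact shape of output_format, and the attribute nesting are assumptions rather than a copy of span_utils.py.

    # Hypothetical sketch: schema capture hoisted out of the prompt-capture guard.
    output_format = kwargs.get("output_format")
    if isinstance(output_format, dict) and output_format.get("type") == "json_schema":
        schema = (output_format.get("json_schema") or {}).get("schema")  # nesting assumed
        if schema:
            set_span_attribute(
                span,
                SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
                json.dumps(schema),
            )

    if should_send_prompts():
        ...  # prompt/content attributes stay behind the existing guard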

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)

395-414: Avoid silent try/except/pass when serializing response_schema

Both blocks swallow all exceptions when calling json.dumps(...), which makes schema/serialization issues hard to debug and triggers Ruff warnings (S110, BLE001). Consider narrowing the exception type and logging instead of passing silently, e.g.:

-    if generation_config and hasattr(generation_config, "response_schema"):
-        try:
-            _set_span_attribute(
-                span,
-                SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
-                json.dumps(generation_config.response_schema),
-            )
-        except Exception:
-            pass
+    if generation_config and hasattr(generation_config, "response_schema"):
+        try:
+            _set_span_attribute(
+                span,
+                SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
+                json.dumps(generation_config.response_schema),
+            )
+        except (TypeError, ValueError) as exc:
+            logger.debug(
+                "Failed to serialize generation_config.response_schema for span: %s",
+                exc,
+            )
@@
-    if "response_schema" in kwargs:
-        try:
-            _set_span_attribute(
-                span,
-                SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
-                json.dumps(kwargs.get("response_schema")),
-            )
-        except Exception:
-            pass
+    if "response_schema" in kwargs:
+        try:
+            _set_span_attribute(
+                span,
+                SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
+                json.dumps(kwargs.get("response_schema")),
+            )
+        except (TypeError, ValueError) as exc:
+            logger.debug(
+                "Failed to serialize kwargs['response_schema'] for span: %s",
+                exc,
+            )

This keeps failures non-fatal while giving observability into bad schemas.

Please verify with your supported generation_config.response_schema / response_schema types that json.dumps(...) (or any custom encoder you choose) behaves as expected across the Google Generative AI SDK versions you intend to support.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between da7ec49 and 1de9ffa.

📒 Files selected for processing (5)
  • packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (1 hunks)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1 hunks)
  • packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1 hunks)
  • packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1 hunks)
  • packages/sample-app/sample_app/openai_structured_outputs_demo.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/sample-app/sample_app/gemini_structured_outputs_demo.py
  • packages/sample-app/sample_app/anthropic_structured_outputs_demo.py
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
  • packages/sample-app/sample_app/openai_structured_outputs_demo.py
  • packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
🧬 Code graph analysis (5)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (3)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
  • Traceloop (37-275)
  • init (49-206)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1)
  • main (15-52)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (1)
  • main (22-35)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (3)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
  • Traceloop (37-275)
  • init (49-206)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1)
  • main (15-45)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (1)
  • main (22-35)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (1)
  • _set_span_attribute (18-22)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
  • SpanAttributes (64-245)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (3)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
  • Traceloop (37-275)
  • init (49-206)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1)
  • main (15-52)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1)
  • main (15-45)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (1)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (1)
  • set_span_attribute (21-25)
🪛 Flake8 (7.3.0)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py

[error] 1-1: 'os' imported but unused

(F401)

packages/sample-app/sample_app/openai_structured_outputs_demo.py

[error] 4-4: 'opentelemetry.sdk.trace.export.ConsoleSpanExporter' imported but unused

(F401)

🪛 Ruff (0.14.5)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py

403-404: try-except-pass detected, consider logging the exception

(S110)


403-403: Do not catch blind exception: Exception

(BLE001)


413-414: try-except-pass detected, consider logging the exception

(S110)


413-413: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build Packages (3.11)
  • GitHub Check: Test Packages (3.12)
  • GitHub Check: Test Packages (3.11)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Lint
🔇 Additional comments (1)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1)

1-49: Gemini structured outputs demo looks good

The demo cleanly configures the client from environment, defines a simple JSON schema, and uses GenerationConfig.response_schema consistently with the other providers. No changes needed from my side.

Remove unused imports to fix flake8 lint errors:
- Remove unused 'os' import from anthropic_structured_outputs_demo.py
- Remove unused 'ConsoleSpanExporter' import from openai_structured_outputs_demo.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (1)

25-25: Consider aligning the prompt with other demos.

The prompt in this demo doesn't explicitly request a rating, while the Anthropic and Gemini demos both ask to "rate it." Although structured outputs will enforce the schema regardless, explicitly requesting the rating improves output quality and consistency across demos.

-        messages=[{"role": "user", "content": "Tell me a joke about OpenTelemetry"}],
+        messages=[{"role": "user", "content": "Tell me a joke about OpenTelemetry and rate it from 1 to 10"}],
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1de9ffa and d6360b2.

📒 Files selected for processing (2)
  • packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1 hunks)
  • packages/sample-app/sample_app/openai_structured_outputs_demo.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/sample-app/sample_app/anthropic_structured_outputs_demo.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/sample-app/sample_app/openai_structured_outputs_demo.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Instrumentation packages must leverage the semantic conventions package and emit OTel-compliant spans
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console

Applied to files:

  • packages/sample-app/sample_app/openai_structured_outputs_demo.py
🧬 Code graph analysis (1)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (3)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
  • Traceloop (37-275)
  • init (49-206)
packages/sample-app/sample_app/anthropic_structured_outputs_demo.py (1)
  • main (14-51)
packages/sample-app/sample_app/gemini_structured_outputs_demo.py (1)
  • main (15-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Test Packages (3.12)
  • GitHub Check: Test Packages (3.11)
  • GitHub Check: Build Packages (3.11)
  • GitHub Check: Lint
🔇 Additional comments (6)
packages/sample-app/sample_app/openai_structured_outputs_demo.py (6)

1-5: LGTM! Unused import issue resolved.

The imports are clean and all used in the code. The previously flagged ConsoleSpanExporter import has been removed.


7-9: LGTM! Proper API key handling.

Environment variables are loaded correctly, and the API key is retrieved from the environment as per coding guidelines.


11-13: LGTM!

Traceloop initialization is correct with an appropriate app name for this demo.


16-18: LGTM!

The Pydantic model is well-defined for structured output validation.


37-38: LGTM!

Standard entry point implementation is correct.


23-27: Model and beta API endpoint verified as available; note known SDK parsing issues.

Verification confirms that gpt-4o-2024-08-06 is still available and actively supported by OpenAI (including for fine-tuning), and the client.beta.chat.completions.parse beta endpoint is available. However, the openai-python SDK has known integration bugs with parse() related to JSON validation and edge cases in parsed responses. Test your structured output handling thoroughly and monitor the openai-python repository for bug fixes.
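
Given those caveats, a hedged sketch of defensive handling around parse(), assuming the client and the JokeRating Pydantic model from the demo (field names assumed):

    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[{"role": "user", "content": "Tell me a joke about OpenTelemetry and rate it from 1 to 10"}],
        response_format=JokeRating,
    )
    message = completion.choices[0].message
    if message.refusal:
        print("Model refused:", message.refusal)
    elif message.parsed is None:
        # parse() leaves .parsed unset when the returned JSON does not validate against the model
        print("No parsed payload; raw content:", message.content)
    else:
        print(message.parsed.joke, message.parsed.rating)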

@ellipsis-dev ellipsis-dev bot (Contributor) left a comment

Important

Looks good to me! 👍

Reviewed d6360b2 in 13 minutes and 44 seconds.
  • Reviewed 21 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. packages/sample-app/sample_app/anthropic_structured_outputs_demo.py:1
  • Draft comment:
    Good removal of unused 'os' import to keep the code clean.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/sample-app/sample_app/openai_structured_outputs_demo.py:4
  • Draft comment:
    Removed unused 'ConsoleSpanExporter' import; this is a good cleanup.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_IqIYoUKp7bNNE3SH


Add comprehensive test coverage for Anthropic structured outputs feature:
- Three test scenarios: legacy attributes, with content events, without content
- Tests verify gen_ai.request.structured_output_schema attribute is logged
- Enhanced span_utils.py to handle both json_schema and json output formats

Note: Tests are currently skipped as they require anthropic SDK >= 0.50.0
which supports the output_format parameter. The feature was announced in
November 2025 but the SDK version (0.49.0) doesn't yet support it.
Tests will be enabled once the SDK is updated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
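
Picking up the note above about the skipped Anthropic tests: a minimal sketch of what one such test could look like. The fixture names, cassette setup, model id, and output_format shape are assumed from common patterns in this repo rather than taken from test_structured_outputs.py.

    import json
    import pytest

    @pytest.mark.skip(reason="requires anthropic SDK >= 0.50.0 with output_format support")
    @pytest.mark.vcr  # cassette-based recording assumed
    def test_structured_output_schema_is_logged(exporter, anthropic_client):
        schema = {"type": "object", "properties": {"joke": {"type": "string"}}}
        anthropic_client.messages.create(
            model="claude-sonnet-4-5",  # model id assumed
            max_tokens=256,
            output_format={"type": "json_schema", "json_schema": {"schema": schema}},
            messages=[{"role": "user", "content": "Tell me a joke about OpenTelemetry"}],
        )
        span = exporter.get_finished_spans()[-1]
        assert span.attributes["gen_ai.request.structured_output_schema"] == json.dumps(schema)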
@ellipsis-dev ellipsis-dev bot (Contributor) left a comment

Important

Looks good to me! 👍

Reviewed everything up to 1de9ffa in 89 minutes and 38 seconds.
  • Reviewed 214 lines of code in 5 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:170
  • Draft comment:
    Consider handling cases where the provided schema might not be JSON serializable. Logging or error handling would help diagnose issues.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment is suggesting defensive programming for json.dumps(), but this appears to be speculative rather than identifying a real issue. The function is already wrapped with @dont_throw decorator which should handle exceptions. Additionally, the same pattern of calling json.dumps() without explicit try-catch is used throughout the file (lines 154, 167, 243, 311), so this would be an inconsistent suggestion unless applied everywhere. The comment doesn't point to a specific bug introduced by this change - it's more of a general code quality suggestion that could apply to many places in the codebase. According to the rules, speculative comments should be removed, and comments should only be kept if there's strong evidence of an issue. Could the schema contain non-serializable objects that would cause json.dumps() to fail? Perhaps the @dont_throw decorator doesn't provide adequate error visibility, and explicit logging would be better for debugging. Maybe this specific case is more prone to serialization issues than the other json.dumps() calls in the file. While it's theoretically possible for the schema to be non-serializable, the comment is speculative and doesn't provide evidence that this is a real issue. The @dont_throw decorator already provides error handling at the function level, and the same pattern is used consistently throughout the file. If this were a real concern, it would apply to all json.dumps() calls, not just this one. The comment doesn't identify a specific problem with the change. This comment should be deleted. It's a speculative suggestion about potential error handling that doesn't identify a specific issue with the code change. The function is already protected by the @dont_throw decorator, and the same json.dumps() pattern is used consistently throughout the file without additional error handling.
2. packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py:395
  • Draft comment:
    If both generation_config.response_schema and kwargs['response_schema'] are provided, the latter overwrites the former. Verify if this override behavior is intended.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the author to verify if the override behavior is intended, which is against the rules. It does not provide a specific suggestion or ask for a test to be written. Therefore, it should be removed.
3. packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py:396
  • Draft comment:
    Consider logging exceptions in the try/except blocks when setting the structured output schema to aid future debugging.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50% This comment is about code that was added in the diff (lines 395-414). It's suggesting a code quality improvement - adding logging to exception handlers. The file already has logging infrastructure in place and uses it elsewhere (lines 70, 118 show similar patterns with logger.warning). The suggestion is actionable and clear. However, I need to consider the rules: "Comments that suggest code quality refactors are good! But only if they are actionable and clear." This seems to fit that criteria. But I also need to consider if this is "obvious or unimportant" - the code currently silently swallows exceptions, which could make debugging harder. The comment is making a reasonable suggestion for better observability. However, the pattern of silent exception handling might be intentional here (similar to the @dont_throw decorator used elsewhere). Looking more carefully, there are other places in the code with similar bare exception handlers that don't log (lines 43-44 in _is_image_part). This suggests the silent exception handling might be a deliberate pattern in this codebase. The silent exception handling might be intentional and consistent with other patterns in the codebase (e.g., _is_image_part function). The comment is suggesting a style preference rather than fixing a clear bug. Without knowing if there's a specific debugging issue or if this pattern is deliberately chosen for performance or other reasons, this might be speculative or stylistic rather than clearly necessary. While there are some instances of silent exception handling in the codebase, there are also examples where exceptions ARE logged (lines 70, 118). The new code is handling JSON serialization which could fail in various ways, and logging would help understand why. However, this is still a "nice to have" suggestion rather than a clear bug fix. The rule states comments should be about clear code changes required, not suggestions for improvements unless they're clearly actionable quality refactors. This is a reasonable code quality suggestion, but it's not a required change - it's more of a "nice to have" improvement. The codebase shows mixed patterns (some exceptions logged, some not), and the silent exception handling might be intentional. Given the rule to only keep comments with STRONG EVIDENCE they are correct and necessary, and that by default we should assume comments are not useful, this should be deleted.

Workflow ID: wflow_QU6hLC1XyLCmcwJo


@ellipsis-dev ellipsis-dev bot (Contributor) left a comment

Important

Looks good to me! 👍

Reviewed ca5f423 in 34 minutes and 3 seconds.
  • Reviewed 222 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:178
  • Draft comment:
    Consider caching the result of output_format.get('json_schema') in a variable for clarity before accessing the 'schema' key.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:172
  • Draft comment:
    Add an inline comment explaining the difference between 'json_schema' and 'json' types in output_format to aid future maintenance.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py:43
  • Draft comment:
    Remove the duplicate pytest.mark.skip decorator to avoid redundancy.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_ZrOmwwx7Az5swzKf


@galkleinman (Contributor) left a comment

LGTM

nit: consider moving magic strings to consts
