-
Notifications
You must be signed in to change notification settings - Fork 2.7k
[clickhouse] Add handling for complex attributes to ClickHouse storage #7627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clickhouse] Add handling for complex attributes to ClickHouse storage #7627
Conversation
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
| if err != nil { | ||
| break | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro How should we handle this? It's on the write path. Should we add a warning for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I would still add the warning, it's better than silently ignoring the error. another alternative is to log it, but I prefer storing it int the span.
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (86.95%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #7627 +/- ##
==========================================
- Coverage 96.59% 96.53% -0.07%
==========================================
Files 384 384
Lines 19404 19464 +60
==========================================
+ Hits 18744 18790 +46
- Misses 477 489 +12
- Partials 183 185 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Metrics Comparison SummaryTotal changes across all snapshots: 53 Detailed changes per snapshotsummary_metrics_snapshot_cassandra📊 Metrics Diff SummaryTotal Changes: 53
🆕 Added Metrics
View diff sample+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="+Inf",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="0",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="10",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="100",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="1000",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="10000",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="25",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
...View diff sample+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="+Inf",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.005",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.01",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.025",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.05",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.075",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.1",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
...View diff sample+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="+Inf",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="0",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="10",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="100",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="1000",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="10000",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="25",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
... |
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
| case pcommon.ValueTypeSlice: | ||
| key := "@slice@" + k | ||
| m := &xpdata.JSONMarshaler{} | ||
| b, err := m.MarshalValue(v) | ||
| if err != nil { | ||
| break | ||
| } | ||
| encoded := base64.StdEncoding.EncodeToString(b) | ||
| out.ComplexKeys = append(out.ComplexKeys, key) | ||
| out.ComplexValues = append(out.ComplexValues, encoded) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent data loss on marshal failure. When m.MarshalValue(v) returns an error, the code uses break which only exits the switch statement but continues the Range loop. This causes the slice attribute to be silently dropped with no error logging or warning added to the span.
Impact: Attributes will be lost in production without any indication to users or operators.
Fix: Add error handling similar to the from.go pattern:
case pcommon.ValueTypeSlice:
key := "@slice@" + k
m := &xpdata.JSONMarshaler{}
b, err := m.MarshalValue(v)
if err != nil {
// Log error or add warning to span
// For now, skip this attribute
return true
}
encoded := base64.StdEncoding.EncodeToString(b)
out.ComplexKeys = append(out.ComplexKeys, key)
out.ComplexValues = append(out.ComplexValues, encoded)Spotted by Graphite Agent
Is this helpful? React 👍 or 👎 to let us know.
| case pcommon.ValueTypeMap: | ||
| key := "@map@" + k | ||
| m := &xpdata.JSONMarshaler{} | ||
| b, err := m.MarshalValue(v) | ||
| if err != nil { | ||
| break | ||
| } | ||
| encoded := base64.StdEncoding.EncodeToString(b) | ||
| out.ComplexKeys = append(out.ComplexKeys, key) | ||
| out.ComplexValues = append(out.ComplexValues, encoded) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent data loss on marshal failure. When m.MarshalValue(v) returns an error for map attributes, the code uses break which only exits the switch statement but continues the Range loop. This causes the map attribute to be silently dropped with no error logging or warning.
Impact: Map attributes will be lost in production without any indication to users or operators.
Fix: Add error handling similar to the from.go pattern:
case pcommon.ValueTypeMap:
key := "@map@" + k
m := &xpdata.JSONMarshaler{}
b, err := m.MarshalValue(v)
if err != nil {
// Log error or add warning to span
// For now, skip this attribute
return true
}
encoded := base64.StdEncoding.EncodeToString(b)
out.ComplexKeys = append(out.ComplexKeys, key)
out.ComplexValues = append(out.ComplexValues, encoded)| case pcommon.ValueTypeMap: | |
| key := "@map@" + k | |
| m := &xpdata.JSONMarshaler{} | |
| b, err := m.MarshalValue(v) | |
| if err != nil { | |
| break | |
| } | |
| encoded := base64.StdEncoding.EncodeToString(b) | |
| out.ComplexKeys = append(out.ComplexKeys, key) | |
| out.ComplexValues = append(out.ComplexValues, encoded) | |
| case pcommon.ValueTypeMap: | |
| key := "@map@" + k | |
| m := &xpdata.JSONMarshaler{} | |
| b, err := m.MarshalValue(v) | |
| if err != nil { | |
| // Log error or add warning to span | |
| // For now, skip this attribute | |
| return true | |
| } | |
| encoded := base64.StdEncoding.EncodeToString(b) | |
| out.ComplexKeys = append(out.ComplexKeys, key) | |
| out.ComplexValues = append(out.ComplexValues, encoded) |
Spotted by Graphite Agent
Is this helpful? React 👍 or 👎 to let us know.
Signed-off-by: Mahad Zaryab <[email protected]>
| vm := pcommon.NewValueMap() | ||
| vm.Map().PutStr("key", "value") | ||
| m := &xpdata.JSONMarshaler{} | ||
| buf, err := m.MarshalValue(vm) | ||
| require.NoError(t, err) | ||
| encodedMap := base64.StdEncoding.EncodeToString(buf) | ||
|
|
||
| vs := pcommon.NewValueSlice() | ||
| vs.Slice().AppendEmpty().SetInt(1) | ||
| vs.Slice().AppendEmpty().SetInt(2) | ||
| vs.Slice().AppendEmpty().SetInt(3) | ||
| buf, err = m.MarshalValue(vs) | ||
| require.NoError(t, err) | ||
| encodedSlice := base64.StdEncoding.EncodeToString(buf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro would it better to test using the raw json string here?
| } | ||
| k := strings.TrimPrefix(storedAttrs.ComplexKeys[i], "@bytes@") | ||
| attrs.PutEmptyBytes(k).FromRaw(decoded) | ||
| case strings.HasPrefix(storedAttrs.ComplexKeys[i], "@slice@"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from prefix the code is identical to next case:, can move to shared helper
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro The slice and map are added differently
attrs.PutEmptySlice(k).FromRaw(val.Slice().AsRaw())vs.
attrs.PutEmptyMap(k).FromRaw(val.Map().AsRaw())The warning messages are slightly different as well.
| // | ||
| // - pcommon.ValueTypeSlice: | ||
| // Represents an OTLP slice (array). The value is first serialized to JSON, then | ||
| // Base64-encoded before storage. Keys for this type are prefixed with `@slice@`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to encode json as base64? I understand we need it for bytes as they may be unprintable, but json can be stored directly as string.
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Which problem is this PR solving?
Description of the changes
How was this change tested?
Checklist
jaeger:make lint testjaeger-ui:npm run lintandnpm run test