@MichaelFerence commented May 19, 2025

Current behavior: (link existing issues here: https://help.github.com/articles/basic-writing-and-formatting-syntax/#referencing-issues-and-pull-requests)

New behavior:

This PR enhances monitoring by adding quota-aligned metrics to track ingestion, query usage, and retained data:

New Metrics

tsdb_metering_samples_ingested_per_min: Estimated samples/min ingested (based on active timeseries + ingest-resolution-millis).

tsdb_metering_query_samples_scanned_per_min: Estimated bytes/min scanned during queries (using active timeseries + max-data-per-shard-query).

tsdb_metering_retained_timeseries: Active timeseries count (mirrors retained data).

Configuration

Added ingest-resolution-millis and max-data-per-shard-query to compute metrics.
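For illustration only, here is a minimal sketch of how a samples/min estimate could be derived from the active timeseries count and ingest-resolution-millis. The formula is an assumption (one sample per active series per resolution interval); this description does not spell out the exact computation, and `IngestEstimate` and `samplesPerMin` are hypothetical names.

```scala
object IngestEstimate {
  private val MillisPerMinute = 60000.0

  /** Hypothetical helper: estimated samples ingested per minute.
    * Assumes every active time series emits exactly one sample per
    * ingest-resolution interval, which may not hold for sparse series.
    */
  def samplesPerMin(activeTimeseries: Long, ingestResolutionMillis: Long): Double = {
    require(ingestResolutionMillis > 0, "ingest-resolution-millis must be positive")
    // Number of samples a single series produces in one minute
    val samplesPerSeriesPerMin = MillisPerMinute / ingestResolutionMillis
    activeTimeseries * samplesPerSeriesPerMin
  }
}
```

Under this assumption, 1,000 active series at a 10-second resolution would yield 6 samples per series per minute.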

Improvements

  • Metrics include _ws_ and _ns_ tags for better filtering.

val tags = Map(
"metric_ws" -> ws,
"metric_ns" -> ns,
"dataset" -> dsRef.dataset,

When dsRef.dataset is null, can you use "unknown" or "null" as the value?
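One way this could look, as a sketch rather than the change that was actually made: wrap the possibly-null value in `Option` and fall back to a placeholder. `MeteringTags`, `orUnknown`, and `build` are hypothetical names, and the plain `String` parameters stand in for the real `ws`, `ns`, and `dsRef.dataset` values.

```scala
object MeteringTags {
  // Hypothetical helper: Option(null) is None, so a null input
  // falls back to the literal "unknown".
  def orUnknown(value: String): String = Option(value).getOrElse("unknown")

  // Builds the tag map with a null-safe dataset value.
  def build(ws: String, ns: String, dataset: String): Map[String, String] =
    Map(
      "metric_ws" -> ws,
      "metric_ns" -> ns,
      "dataset"   -> orUnknown(dataset)
    )
}
```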


// Update query bytes scanned per minute based on active timeseries and configured max data per shard
val maxDataPerShardQuery = settings.config.getBytes("max-data-per-shard-query").longValue()
val avgBytesPerTs = maxDataPerShardQuery / 1000000 // Convert to MB for better readability

Not sure about this. The name says bytes, but the value is in MB?
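A possible way to avoid the mismatch, sketched under the assumption that the MB conversion is intentional: keep the unit in the name of every value so bytes and megabytes cannot be confused. `QueryScanUnits` and its members are hypothetical names, not part of this PR.

```scala
object QueryScanUnits {
  // Decimal megabyte, matching the divisor used in the snippet above.
  val BytesPerMB: Long = 1000000L

  /** Hypothetical helper: converts a byte count to whole megabytes.
    * Integer division truncates, so values under 1 MB become 0.
    */
  def bytesToMB(bytes: Long): Long = bytes / BytesPerMB
}
```

With a helper like this, the call site could read `val maxDataPerShardQueryMB = QueryScanUnits.bytesToMB(maxDataPerShardQueryBytes)`, which makes the unit of each value visible.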

@alextheimer left a comment

Left some comments; will discuss more offline as well.

private val METRIC_LONGTERM = "tsdb_metering_longterm_timeseries"
// Quota-aligned metrics
private val METRIC_SAMPLES_INGESTED = "tsdb_metering_samples_ingested_per_min"
private val METRIC_QUERY_BYTES_SCANNED = "tsdb_metering_query_samples_scanned_per_min"

Should this constant be named SAMPLES_SCANNED?

// Quota-aligned metrics
private val METRIC_SAMPLES_INGESTED = "tsdb_metering_samples_ingested_per_min"
private val METRIC_QUERY_BYTES_SCANNED = "tsdb_metering_query_samples_scanned_per_min"
private val METRIC_RETAINED_TIMESERIES = "tsdb_metering_retained_timeseries"

What does "retained" mean?

Comment on lines +90 to +93
// Update retained timeseries count - directly maps to active timeseries
Kamon.gauge(METRIC_RETAINED_TIMESERIES)
.withTags(TagSet.from(tags))
.update(data.counts.active.toDouble)

Isn't this the same thing as METRIC_ACTIVE? Why are both needed?

I think these names are from the design doc. Need Alex Goodwin to confirm.
