feat(om2): add native histograms to OpenMetrics2.0 #2634

krajorama · 2025-04-24T15:04:47Z

Background

Based on https://github.com/prometheus/proposals/blob/main/proposals/2024-01-29_native_histograms_text_format.md
And OpenMetrics 2.0 WG discussions.

Changes

Allow structured complex types marked by "{" and "}" in the specification.
Allow multiple exemplars per complex type value.
~~Require that exemplars for complex type values have the timestamp.~~ (1)
Be permissive about observing NaN , +Inf, -Inf. Discourage observing NaN.
Split histogram into ones with classic and native buckets.
For classic buckets, define behavior when observing NaN.
Define the native buckets and also how NaN is handled.
Define the text format of native histograms and also their exemplars.
~~Be permissive about the existence of _sum and _count series for classic histograms - for backwards compatibility.~~

(1) Not doing this on account of being an implementation limitation and also contradicts having backwards compatibility when exposing classic histograms as complex type.

Out of scope

Use case for exposing results of arithmetic operations (e.g. recording rules) that can have float and negative values for the count and bucket values.
Whitespace use. Probably feat(om2): classic histogram and summary as complex types #2679 will inform that decision.

Open questions / decisions

See OpenMetrics2.0 WG meeting notes tab

Signed-off-by: György Krajcsovits <[email protected]>

krajorama · 2025-04-30T12:39:25Z

content/docs/specs/om/open_metrics_spec_2_0.md

+
+Numbers MUST be either floating points or integers. Note that ingestors of the format MAY only support float64. The non-real values NaN, +Inf and -Inf MUST be supported. NaN MUST NOT be considered a missing value, but it MAY be used to signal a division by zero.
+
+Complex data types MUST contain all information necessary to recreate a Metric Type, with the exception of Created time and Exemplars.


This assume we'll have the created timestamp separate from the JSON like data. prometheus/OpenMetrics#285

Signed-off-by: György Krajcsovits <[email protected]>

krajorama · 2025-05-06T14:58:40Z

Note self: add details how summaries and classic histograms one liners (so not NHCB spans/deltas) fit into it.

Signed-off-by: György Krajcsovits <[email protected]>

Add semantic conventions about where complex types may occure. Allow empty spans and deltas. Be more precise. Signed-off-by: György Krajcsovits <[email protected]>

Signed-off-by: György Krajcsovits <[email protected]>

ArthurSens

Had some minutes to read the PR. Couldn't read everything though, I'll do a more complete review another day!

Reviewed from mobile

content/docs/specs/om/open_metrics_spec_2_0.md

Signed-off-by: György Krajcsovits <[email protected]>

docs/specs/om/open_metrics_spec_2_0.md

I did not want to clutter the examples with adding "# EOF" to all, so I made a new marker to explicitly add it on request. There was only one faulty example where we had an extra space, see the Exemplars section. Signed-off-by: György Krajcsovits <[email protected]>

Allow multiple exemplars for complex types, i.e. native histograms. But require that the timestamp is present. Signed-off-by: György Krajcsovits <[email protected]>

beorn7

Thanks for doing all of this.

Mostly commented about the histogram aspect (but I couldn't resist and referred to OM specific things now and then).

docs/specs/om/open_metrics_spec_2_0.md

Signed-off-by: György Krajcsovits <[email protected]>

beorn7

Some more comments.

Sorry for adding some comments outside of a review.

I also assume that non-outdated non-resolved comments are still on your radar to deal with later.

docs/specs/om/open_metrics_spec_2_0.md

Signed-off-by: György Krajcsovits <[email protected]>

…stograms

Sync with the spec https://prometheus.io/docs/specs/native_histograms/#zero-bucket Signed-off-by: György Krajcsovits <[email protected]>

Signed-off-by: György Krajcsovits <[email protected]>

Allow missing _sum and _count and set rules for interpretation if needed. Signed-off-by: György Krajcsovits <[email protected]>

To reduce scope. Signed-off-by: György Krajcsovits <[email protected]>

beorn7 · 2025-09-03T12:37:21Z

Just one more note from my side. (But as said, I'm not authorized to approve OMv2 related things.)

bwplotka · 2025-09-03T13:14:32Z

I'm planning to go through this, but looks generally good!

Signed-off-by: György Krajcsovits <[email protected]>

beorn7 · 2025-09-03T15:13:41Z

My comments have all been addressed. (I just don't approve this for the reasons stated previously.)

Signed-off-by: György Krajcsovits <[email protected]>

…stograms # Conflicts: # docs/specs/om/open_metrics_spec_2_0.md

krajorama · 2025-11-05T08:07:52Z

docs/specs/om/open_metrics_spec_2_0.md

+##### Histogram with Classic Buckets

-The MetricPoint's Bucket Values Sample MetricNames MUST have the suffix `_bucket`. If present, the MetricPoint's Sum Value Sample MetricName MUST have the suffix `_sum`.
+The MetricPoint's Sum Value Sample MetricName MUST have the suffix `_sum`. The MetricPoint's Count Value Sample MetricName MUST have the suffix `_count`. The MetricPoint's Classic Bucket values Sample MetricNames MUST have the suffix `_bucket`.


note to reviewers: this will go away with #2679

krajorama · 2025-11-05T08:08:35Z

docs/specs/om/open_metrics_spec_2_0.md

+
+Histograms with Native Buckets MUST use the integer native histogram data type.
+
+The integer native histogram data type is represented as structured data with fields. There MUST NOT be any whitespace around fields.


note to reviewers: whitespace is under discussion for readability in complex types in general

Discussion is concluded. Awaiting reviews from approvers.

Signed-off-by: György Krajcsovits <[email protected]>

krajorama · 2025-11-05T14:23:25Z

docs/specs/om/open_metrics_spec_2_0.md

+
+With the exception of the `sum` and `zero_threshold` field, all numbers MUST be integers and MUST NOT include dot '.' or exponent 'e'.
+
+Native Bucket values MUST be ordered by their index, and their values MUST be placed in the `negative_buckets` (and/or `positive_buckets`) fields.


note to reviewers: we don't use delta encoding here, negative_buckets and positive_buckets hold the actual absolute non negative integer value.

dashpole · 2025-11-05T14:33:14Z

docs/specs/om/open_metrics_spec_2_0.md

+
+Numbers MUST be either floating points or integers. Note that ingestors of the format MAY only support float64. The non-real values NaN, +Inf and -Inf MUST be supported. NaN MUST NOT be considered a missing value, but it MAY be used to signal a division by zero.
+
+Complex data types MUST contain all information necessary to recreate a sample of a Metric Type, with the exception of Created Timestamp and Exemplars.


Suggested change

Complex data types MUST contain all information necessary to recreate a sample of a Metric Type, with the exception of Created Timestamp and Exemplars.

Complex data types MUST contain all information necessary to recreate a sample of a Metric Type, with the exception of Start Timestamp and Exemplars.

Assuming this merges after the created -> start PR

I wanted to leave this to #2770 and see which is merged first. But I'll just update it already.

It will be easier to do it after the other PR is merged actually, deferring.

docs/specs/om/open_metrics_spec_2_0.md

dashpole · 2025-11-05T14:40:20Z

docs/specs/om/open_metrics_spec_2_0.md

 Histograms measure distributions of discrete events. Common examples are the latency of HTTP requests, function runtimes, or I/O request sizes.

-A Histogram MetricPoint MUST contain at least one bucket, and SHOULD contain Sum, and Created Timestamp values. Every bucket MUST have a threshold and a value.
+A Histogram MetricPoint MUST contain Count, Sum values.


Does this mean a histogram with negative observations will no longer omit the sum? This differs from OTel's semantics for the histogram's sum: https://github.com/open-telemetry/opentelemetry-proto/blob/a8951735f7801e8adfaec5c0ace9262771cfec6e/opentelemetry/proto/metrics/v1/metrics.proto#L470.

That's the idea, yes. See also #2627

As stated in #2627 , requiring the opposite would be breaking for many people. Also native histograms don't have optional Sum field, so best you could do is set the Sum to 0.0. Which also means that when we convert classic histograms to NHCB, we get 0.0 if Sum is missing.

Now when it comes to conversion, you can easily convert the Prometheus native histogram to OTEL exponential and leave out the Sum if there's negative observations (negative buckets in use) or the zero bucket is in use (since the zero bucket doesn't tell you the sign of the observations).

I don't think this is a problem in reality. Most histograms that I've seen measure time or size which are naturally positive so you don't actually run into problems (unless there's a bug). On the other hand, what if you do want to measure negative values? This requirement would force you to add one more metric just for the Sum or some other trick.

dashpole · 2025-11-05T14:44:08Z

docs/specs/om/open_metrics_spec_2_0.md

+A Histogram MUST measure values that are not NaN in either [Classic Buckets](#classic-buckets) or [Native Buckets](#native-buckets) or both. Measuring NaN is different for Classic and Native Buckets, see in their respective sections.

-Semantically, buckets values are counters so MUST NOT be NaN or negative.
+If a Histogram stops measuring values in either Classic or Native Buckets and keeps measuring values in the other, it MUST clear and not expose the buckets it stopped measuring into. This avoids exposing different distribution from the two kind of buckets at the same time.


Can we phrase this in terms of the requirements on the protocol, rather than requirements on metrics libraries? You can still describe how a metrics library can achieve the desired result, if thats helpful.

Maybe something like: "If both Classic and Native Buckets are in use, a Histogram MUST NOT contain measurements in Classic (or Native) Buckets that was taken later than all measurements in the other kind of Native (or Classic) Buckets. This avoids the Histogram having a recent distribution in one kind of buckets and something outdated in the other kind."

In reality this might be hard to achieve 100% if you get the data async , so a couple of measurements might be taken between exposing the native and classic buckets.

So we can also make this a SHOULD, after all we get independent time series out of the different buckets.
Or in fact we can drop it altogether if we think this is too much.

dashpole · 2025-11-05T14:50:09Z

docs/specs/om/open_metrics_spec_2_0.md

+Every Bucket MUST have well defined boundaries and a value. Boundaries of a Bucket MUST NOT be NaN. Bucket values MUST be integers. Semantically, bucket values are counters so MUST NOT be NaN or negative.

-Negative threshold buckets MAY be used. Bucket thresholds MUST NOT equal NaN. Count and bucket values MUST be integers.
+A Histogram SHOULD refuse to measure NaN value as adding NaN to the Sum will make the Sum equal to NaN and mask the sum of the real measurements until the next reset of the counters. If a Histogram does allow NaN, then NaN MUST be counted in the Count and MUST be added to the Sum, resulting in the Sum becoming NaN.


Suggested change

A Histogram SHOULD refuse to measure NaN value as adding NaN to the Sum will make the Sum equal to NaN and mask the sum of the real measurements until the next reset of the counters. If a Histogram does allow NaN, then NaN MUST be counted in the Count and MUST be added to the Sum, resulting in the Sum becoming NaN.

A Histogram SHOULD NOT include NaN measurements as including NaN in the Sum will make the Sum equal to NaN and mask the sum of the real measurements for the lifetime of the time series. If a Histogram includes NaN measurements, then NaN measurements MUST be counted in the Count and the Sum MUST be NaN.

Same suggestion about phrasing based on the protocol, rather than based on how the metric is aggregated.

Are you also suggesting to move the text elsewhere or just rephrase?

Native histograms or any metric for that matter can be reset within the lifetime of the series, so I would stick to reset of the counters or more accurately reset of the Histogram.

dashpole · 2025-11-05T14:52:20Z

docs/specs/om/open_metrics_spec_2_0.md

-Negative threshold buckets MAY be used. Bucket thresholds MUST NOT equal NaN. Count and bucket values MUST be integers.
+A Histogram SHOULD refuse to measure NaN value as adding NaN to the Sum will make the Sum equal to NaN and mask the sum of the real measurements until the next reset of the counters. If a Histogram does allow NaN, then NaN MUST be counted in the Count and MUST be added to the Sum, resulting in the Sum becoming NaN.
+
+A Histogram MAY refuse to measure +Inf and -Inf values as adding these to the Sum will mask the sum of the real measurements until the next reset of the counters. If a Histogram measures +Inf or -Inf, then +Inf or -Inf MUST be counted in the Count and MUST be added to the Sum, potentially resulting in +Inf, -Inf or NaN in the Sum, the later for example in case of adding +Inf to -Inf.


Do we have guidance for whether they should or not? Or are we actually ambivalent about whether +/- Inf are included?

Not really. I can reword to say : if +-Inf is included, then what to do and add an explanation what it means. So basically flip the two sentences around and not use "MAY" . Something like:
"If a Histogram includes +Inf or -Inf measurement, then +Inf or -Inf MUST be counted in Count and MUST be added to the Sum, potentially resulting in +Inf, -Inf or NaN in the Sum, the later for example in case of adding +Inf to -Inf. Note that in this case the Sum of finite measurements is masked until the next reset of the Histogram."

WDYT?

docs/specs/om/open_metrics_spec_2_0.md

dashpole · 2025-11-05T15:05:28Z

docs/specs/om/open_metrics_spec_2_0.md

+
+A Histogram MetricPoint with Native Buckets MAY contain exemplars.
+
+Exemplars associated with a Histogram MetricPoint with Native Buckets SHOULD have a timestamp. Note: storage implementations may drop exemplars without timestamps if keeping track of exemplars without timestamps is too resource intensive.


Can we move this to the exemplar section (assuming these requirements apply across the board)?

Good question. We should discuss in the WG.

In general it would be great if we required timestamp all the time, but unfortunately getting the current time is usually pretty expensive, so I think saying MUST is out of the question. Prometheus solves this for single series by always comparing the received exemplar value to the last recorded value. If it matches, we drop it (no timestamp) needed.

So at scrape , if the exemplar is coming from a float series and doesn't have a timestamp , we assign it the scrape time as timestamp. The exemplar will not be stored in the exemplar store if it's a duplicate, so no problem. (Although it can lead to weird things as the exemplar storage is finite, so the exemplar might be rotated out of the storage and replaced with the same value , but later timestamp. Exemplars are experimental for a reason https://prometheus.io/docs/prometheus/latest/feature_flags/#exemplars-storage ).

For Histograms, the situation is tricky. If you store the Histogram as a native histogram, then you have a single series and you get multiple exemplars (N) at once . If they don't have a timestamp, then currently we have no way of comparing them to the last (M) exemplars in storage as the storage has no backwards references, no interface to add all exemplars at once to at least get them due to order in the buffer and doesn't store how many exemplars (M) belong together. We could solve this. Maybe help implementing prometheus/prometheus#17469 ?

The thing that we cannot solve at the moment is that Remote-Write may split exemplars between messages, so you might not get all exemplars at the same time. Which means you don't get (N) exemplars, you get (n) . Maybe even this is ok?

We've tried to solve the issue in prometheus/prometheus#13021 once , but didn't manage.

Signed-off-by: György Krajcsovits <[email protected]>

krajorama added 4 commits April 24, 2025 17:04

feat(om2): add native histograms to OpenMetrics2.0

ddbace4

Signed-off-by: György Krajcsovits <[email protected]>

define the counter histogram model

2f4113f

Signed-off-by: György Krajcsovits <[email protected]>

small fixes to histogram model

a47ce0c

Signed-off-by: György Krajcsovits <[email protected]>

update gauge histogram model

10dc43d

Signed-off-by: György Krajcsovits <[email protected]>

krajorama commented Apr 30, 2025

View reviewed changes

wip: wip

d12dfeb

Signed-off-by: György Krajcsovits <[email protected]>

krajorama mentioned this pull request May 8, 2025

histograms: Implement final OM text format prometheus/prometheus#11265

Open

krajorama added 4 commits May 22, 2025 10:05

adjust wording

471a7dd

Signed-off-by: György Krajcsovits <[email protected]>

add abnf and presentation

ec991a4

Signed-off-by: György Krajcsovits <[email protected]>

updates

dc308e2

Add semantic conventions about where complex types may occure. Allow empty spans and deltas. Be more precise. Signed-off-by: György Krajcsovits <[email protected]>

Add gauge histogram syntax

57c6ab8

Signed-off-by: György Krajcsovits <[email protected]>

ArthurSens reviewed May 24, 2025

View reviewed changes

beorn7 self-requested a review May 27, 2025 11:26

krajorama added 2 commits May 27, 2025 15:53

clarify NaN and Inf and be permissive

32a65cf

Signed-off-by: György Krajcsovits <[email protected]>

fix from Arthur's comments

6968e00

Signed-off-by: György Krajcsovits <[email protected]>

krajorama marked this pull request as ready for review May 27, 2025 14:01

Merge branch 'main' into krajo/om2.0-native-histograms

d95eed7

This comment was marked as outdated.

Sign in to view

krajorama commented Jun 3, 2025

View reviewed changes

docs/specs/om/open_metrics_spec_2_0.md Outdated Show resolved Hide resolved

krajorama mentioned this pull request Jun 4, 2025

OM text exposition for NH prometheus/client_python#1087

Merged

krajorama added 2 commits June 11, 2025 09:03

Update with exemplars

7965787

Allow multiple exemplars for complex types, i.e. native histograms. But require that the timestamp is present. Signed-off-by: György Krajcsovits <[email protected]>

krajorama force-pushed the krajo/om2.0-native-histograms branch from c16426b to 7965787 Compare June 11, 2025 07:27

Merge branch 'main' into krajo/om2.0-native-histograms

0a944c4

krajorama mentioned this pull request Jun 12, 2025

OM 2.0: Native Histogram Support in Text format. prometheus/OpenMetrics#279

Open

beorn7 requested changes Jun 12, 2025

View reviewed changes

krajorama added 3 commits June 13, 2025 14:37

fix schema number allocation

52bd9e9

Signed-off-by: György Krajcsovits <[email protected]>

Only prohibit "le" label if there are classic buckets

2b0fdcc

Signed-off-by: György Krajcsovits <[email protected]>

Make the requirement around NaN stronger in classic case.

1dea11f

Signed-off-by: György Krajcsovits <[email protected]>

beorn7 previously requested changes Jun 24, 2025

View reviewed changes

krajorama added 12 commits June 26, 2025 09:37

sync gauge histogram to histogram in metric types

9d02b5b

Signed-off-by: György Krajcsovits <[email protected]>

Fix missing Gsum to Sum mapping

e972b87

Signed-off-by: György Krajcsovits <[email protected]>

Merge branch 'main' into krajo/om2.0-native-histograms

ae5c995

Merge main

abdeb9d

Signed-off-by: György Krajcsovits <[email protected]>

follow up rename of created value

08f0e67

Signed-off-by: György Krajcsovits <[email protected]>

Merge remote-tracking branch 'origin/main' into krajo/om2.0-native-hi…

34e5a83

…stograms

Merge remote-tracking branch 'origin/main' into krajo/om2.0-native-hi…

13fe8d2

…stograms

Merge remote-tracking branch 'origin/main' into krajo/om2.0-native-hi…

2b4e403

…stograms

the zero bucket is not optional

5c4f7a3

Sync with the spec https://prometheus.io/docs/specs/native_histograms/#zero-bucket Signed-off-by: György Krajcsovits <[email protected]>

add comment to spell out what the decimals mean

8f6e05e

Signed-off-by: György Krajcsovits <[email protected]>

Relax rules on what can be accepted as classic histogram for b compat

5fabcff

Allow missing _sum and _count and set rules for interpretation if needed. Signed-off-by: György Krajcsovits <[email protected]>

Remove float native histograms for now

d4c452b

To reduce scope. Signed-off-by: György Krajcsovits <[email protected]>

bwplotka moved this to Done in Open Metrics 2.0 Sep 3, 2025

bwplotka added this to Open Metrics 2.0 Sep 3, 2025

bwplotka moved this from Done to In Progress in Open Metrics 2.0 Sep 3, 2025

krajorama added 2 commits September 3, 2025 16:18

Discourage changing classic bucket thresholds in the model.

03be949

Signed-off-by: György Krajcsovits <[email protected]>

Remove ambiguous statement on sum and count. require them.

3f5f6dc

Signed-off-by: György Krajcsovits <[email protected]>

krajorama added 2 commits September 3, 2025 18:00

Clarify what to do with buckets that have the value 0

e808703

Signed-off-by: György Krajcsovits <[email protected]>

Merge remote-tracking branch 'origin/main' into krajo/om2.0-native-hi…

93104ae

…stograms # Conflicts: # docs/specs/om/open_metrics_spec_2_0.md

krajorama commented Nov 5, 2025

View reviewed changes

Minor wording change

a5ce3aa

Signed-off-by: György Krajcsovits <[email protected]>

krajorama commented Nov 5, 2025

View reviewed changes

dashpole reviewed Nov 5, 2025

View reviewed changes

small corrections from review comments

a8b8ebd

Signed-off-by: György Krajcsovits <[email protected]>


		Numbers MUST be either floating points or integers. Note that ingestors of the format MAY only support float64. The non-real values NaN, +Inf and -Inf MUST be supported. NaN MUST NOT be considered a missing value, but it MAY be used to signal a division by zero.

		Complex data types MUST contain all information necessary to recreate a Metric Type, with the exception of Created time and Exemplars.


		Histograms with Native Buckets MUST use the integer native histogram data type.

		The integer native histogram data type is represented as structured data with fields. There MUST NOT be any whitespace around fields.


		With the exception of the `sum` and `zero_threshold` field, all numbers MUST be integers and MUST NOT include dot '.' or exponent 'e'.

		Native Bucket values MUST be ordered by their index, and their values MUST be placed in the `negative_buckets` (and/or `positive_buckets`) fields.

	A Histogram SHOULD refuse to measure NaN value as adding NaN to the Sum will make the Sum equal to NaN and mask the sum of the real measurements until the next reset of the counters. If a Histogram does allow NaN, then NaN MUST be counted in the Count and MUST be added to the Sum, resulting in the Sum becoming NaN.
	A Histogram SHOULD NOT include NaN measurements as including NaN in the Sum will make the Sum equal to NaN and mask the sum of the real measurements for the lifetime of the time series. If a Histogram includes NaN measurements, then NaN measurements MUST be counted in the Count and the Sum MUST be NaN.


		A Histogram MetricPoint with Native Buckets MAY contain exemplars.

		Exemplars associated with a Histogram MetricPoint with Native Buckets SHOULD have a timestamp. Note: storage implementations may drop exemplars without timestamps if keeping track of exemplars without timestamps is too resource intensive.

feat(om2): add native histograms to OpenMetrics2.0 #2634

Are you sure you want to change the base?

feat(om2): add native histograms to OpenMetrics2.0 #2634

Conversation

krajorama commented Apr 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Background

Changes

Out of scope

Open questions / decisions

Uh oh!

Choose a reason for hiding this comment

Uh oh!

krajorama commented May 6, 2025

Uh oh!

ArthurSens left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as outdated.

Uh oh!

beorn7 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

beorn7 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

beorn7 commented Sep 3, 2025

Uh oh!

bwplotka commented Sep 3, 2025

Uh oh!

beorn7 commented Sep 3, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

krajorama commented Apr 24, 2025 •

edited

Loading