Commit 2530b2c

bwplotka and krajorama authored
RW 2.0-rc.4 (breaking): Move CT from TimeSeries to Sample; Rename to ST (Start Timestamp) (#2762)
* prw2(breaking): Move 2.0 CT to Sample; Rename to ST (Start Timestamp)

Given the recent movement for Prometheus native support of ST ([PROM-60](prometheus/proposals#60)) and plans for delta temporality ([PROM-48](prometheus/proposals#48)), it might be beneficial to make the (hopefully) last change to Remote Write 2.0 before stabilizing, so:

* Rename Created Timestamp to Start Timestamp
* Move CT/ST from TimeSeries to Sample and Histogram messages.
* Clarified optionality (0 value meaning unset)

See the implementation change that will follow: prometheus/prometheus#17411.

Notice that only the receiver part was implemented for CT/ST. Given that no sending part was done, we expect this feature (ST/CT) is not being used, thus breakage impact is minimal. This has been confirmed with early adopters like Mimir (Grafana), Chronosphere, Thanos, Cortex and Google.

See previous discussions and 3 explicit approvals: prometheus/prometheus#17036

Additionally:

* I updated link to proto
* Updated links to new compliance tests
* Update native histogram spec link

Signed-off-by: bwplotka <[email protected]>

* Update docs/specs/prw/remote_write_spec_2_0.md

Co-authored-by: George Krajcsovits <[email protected]>
Signed-off-by: Bartlomiej Plotka <[email protected]>

* Update docs/specs/prw/remote_write_spec_2_0.md

Signed-off-by: Bartlomiej Plotka <[email protected]>

---------

Signed-off-by: bwplotka <[email protected]>
Signed-off-by: Bartlomiej Plotka <[email protected]>
Co-authored-by: George Krajcsovits <[email protected]>
1 parent 044fd99 commit 2530b2c

1 file changed

+49
-40
lines changed

docs/specs/prw/remote_write_spec_2_0.md

Lines changed: 49 additions & 40 deletions
@@ -4,7 +4,7 @@ nav_title: "2.0"
44
sort_rank: 2
55
---
66

7-
* Version: 2.0-rc.3
7+
* Version: 2.0-rc.4
88
* Status: **Experimental**
99
* Date: May 2024
1010

@@ -29,7 +29,10 @@ The Remote-Write protocol is designed to be stateless; there is strictly no inte
2929

3030
The Remote-Write protocol contains opportunities for batching, e.g. sending multiple samples for different series in a single request. It is not expected that multiple samples for the same series will be commonly sent in the same request, although there is support for this in the Protobuf Message.
3131

32-
A test suite can be found at https://github.com/prometheus/compliance/tree/main/remote_write_sender. The compliance tests for remote write 2.0 compatibility are still [in progress](https://github.com/prometheus/compliance/issues/101).
32+
Compliance tests can be found at:
33+
34+
* sender: https://github.com/prometheus/compliance/tree/main/remotewrite/sender
35+
* receiver: https://github.com/prometheus/compliance/tree/main/remotewrite/receiver
3336

3437
### Glossary
3538

@@ -42,10 +45,10 @@ In this document, the following definitions are followed:
4245
* a `Sender` is something that sends Remote-Write data.
4346
* a `Receiver` is something that receives (writes) Remote-Write data. The meaning of `Written` is up to the Receiver e.g. usually it means storing received data in a database, but also just validating, splitting or enhancing it.
4447
* `Written` refers to data the `Receiver` has received and is accepting. Whether or not it has ingested this data to persistent storage, written it to a WAL, etc. is up to the `Receiver`. The only distinction is that the `Receiver` has accepted this data rather than explicitly rejecting it with an error response.
45-
* a `Sample` is a pair of (timestamp, value).
46-
* a `Histogram` is a pair of (timestamp, [histogram value](https://github.com/prometheus/docs/blob/b9657b5f5b264b81add39f6db2f1df36faf03efe/content/docs/concepts/native_histograms.md)).
48+
* a `Sample` is a triplet of (start timestamp, timestamp, value).
49+
* a `Histogram` is a triplet of (start timestamp, timestamp, [histogram value](https://github.com/prometheus/docs/blob/b9657b5f5b264b81add39f6db2f1df36faf03efe/content/docs/concepts/native_histograms.md)).
4750
* a `Label` is a pair of (key, value).
48-
* a `Series` is a list of samples, identified by a unique set of labels.
51+
* a `Series` is a list of samples (or histograms), identified by a unique set of labels.
4952

5053
## Definitions
5154

@@ -178,7 +181,7 @@ Senders SHOULD expect [400 HTTP Bad Request](https://www.rfc-editor.org/rfc/rfc9
178181

179182
#### Invalid Samples
180183

181-
Receivers MAY NOT support certain metric types or samples (e.g. a Receiver might reject sample without metadata type specified or without created timestamp, while another Receiver might accept such sample.). It’s up to the Receiver what sample is invalid. Receivers MUST return a [400 HTTP Bad Request](https://www.rfc-editor.org/rfc/rfc9110.html#name-400-bad-request) status code for write requests that contain any invalid samples unless the [partial retriable write](#retries-on-partial-writes) occurs.
184+
Receivers MAY NOT support certain metric types or samples (e.g. a Receiver might reject sample without metadata type specified or without start timestamp, while another Receiver might accept such sample.). It’s up to the Receiver what sample is invalid. Receivers MUST return a [400 HTTP Bad Request](https://www.rfc-editor.org/rfc/rfc9110.html#name-400-bad-request) status code for write requests that contain any invalid samples unless the [partial retriable write](#retries-on-partial-writes) occurs.
182185

183186
Senders MUST NOT retry on a 4xx HTTP status codes (other than [429](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429)), which MUST be used by Receivers to indicate that the write operation will never be able to succeed and should not be retried. Senders MAY retry on the 415 HTTP status code with a different content type or encoding to see if the Receiver supports it.
184187
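
The 4xx retry rules quoted in this hunk can be read as a small decision function. The sketch below is one possible Sender-side interpretation in Go; `RetryAction`, its constants and `classify4xx` are hypothetical names for illustration and are not part of the specification or of any Prometheus package. Status codes outside the 4xx range are deliberately left unclassified, since the text above does not cover them.

```go
package main

import (
	"fmt"
	"net/http"
)

// RetryAction is a hypothetical classification of what a Sender may do next.
type RetryAction int

const (
	NoRetry                    RetryAction = iota // the write will never succeed as sent
	RetryAllowed                                  // 429 is exempt from the "MUST NOT retry" rule
	RetryWithDifferentEncoding                    // 415: MAY retry with another content type or encoding
	NotCovered                                    // not a 4xx code; outside the rules sketched here
)

// classify4xx applies the quoted rules to a single HTTP status code.
func classify4xx(status int) RetryAction {
	switch {
	case status < 400 || status > 499:
		return NotCovered
	case status == http.StatusTooManyRequests:
		return RetryAllowed
	case status == http.StatusUnsupportedMediaType:
		return RetryWithDifferentEncoding
	default:
		return NoRetry
	}
}

func main() {
	for _, status := range []int{400, 415, 429, 204} {
		fmt.Println(status, classify4xx(status))
	}
}
```

A real Sender would combine this with whatever backoff policy it already applies to retriable responses.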

@@ -220,18 +223,12 @@ The `io.prometheus.write.v2.Request` references the new Protobuf Message that's
220223
<!---
221224
TODO(bwplotka): Move link to the one on Prometheus main or even buf.
222225
-->
223-
The full schema and source of the truth is in Prometheus repository in [`prompb/io/prometheus/write/v2/types.proto`](https://github.com/prometheus/prometheus/blob/remote-write-2.0/prompb/io/prometheus/write/v2/types.proto#L32). The `gogo` dependency and options CAN be ignored ([will be removed eventually](https://github.com/prometheus/prometheus/issues/11908)). They are not part of the specification as they don't impact the serialized format.
226+
The full schema and source of the truth is in Prometheus repository in [`prompb/io/prometheus/write/v2/types.proto`](https://github.com/prometheus/prometheus/blob/main/prompb/io/prometheus/write/v2/types.proto). The `gogo` dependency and options CAN be ignored ([will be removed eventually](https://github.com/prometheus/prometheus/issues/11908)). They are not part of the specification as they don't impact the serialized format.
224227

225228
The simplified version of the new `io.prometheus.write.v2.Request` is presented below.
226229

227230
```
228-
// Request represents a request to write the given timeseries to a remote destination.
229231
message Request {
230-
// Since Request supersedes 1.0 spec's prometheus.WriteRequest, we reserve the top-down message
231-
// for the deterministic interop between those two.
232-
// Generally it's not needed, because Receivers must use the Content-Type header, but we want to
233-
// be sympathetic to adopters with mistaken implementations and have deterministic error (empty
234-
// message if you use the wrong proto schema).
235232
reserved 1 to 3;
236233
237234
// symbols contains a de-duplicated array of string elements used for various
@@ -249,13 +246,15 @@ message Request {
249246
250247
// TimeSeries represents a single series.
251248
message TimeSeries {
249+
reserved 6;
250+
252251
// labels_refs is a list of label name-value pair references, encoded
253252
// as indices to the Request.symbols array. This list's length is always
254253
// a multiple of two, and the underlying labels should be sorted lexicographically.
255254
//
256255
// Note that there might be multiple TimeSeries objects in the same
257256
// Requests with the same labels e.g. for different exemplars, metadata
258-
// or created timestamp.
257+
// or start timestamp.
259258
repeated uint32 labels_refs = 1;
260259
261260
// Timeseries messages can either specify samples or (native) histogram samples
@@ -271,27 +270,11 @@ message TimeSeries {
271270
272271
// metadata represents the metadata associated with the given series' samples.
273272
Metadata metadata = 5;
274-
275-
// created_timestamp represents an optional created timestamp associated with
276-
// this series' samples in ms format, typically for counter or histogram type
277-
// metrics. Created timestamp represents the time when the counter started
278-
// counting (sometimes referred to as start timestamp), which can increase
279-
// the accuracy of query results.
280-
//
281-
// Note that some receivers might require this and in return fail to
282-
// write such samples within the Request.
283-
//
284-
// For Go, see github.com/prometheus/prometheus/model/timestamp/timestamp.go
285-
// for conversion from/to time.Time to Prometheus timestamp.
286-
//
287-
// Note that the "optional" keyword is omitted due to
288-
// https://cloud.google.com/apis/design/design_patterns.md#optional_primitive_fields
289-
// Zero value means value not set. If you need to use exactly zero value for
290-
// the timestamp, use 1 millisecond before or after.
291-
int64 created_timestamp = 6;
292273
}
293274
294-
// Exemplar represents additional information attached to some series' samples.
275+
// Exemplar is an additional information attached to some series' samples.
276+
// It is typically used to attach an example trace or request ID associated with
277+
// the metric changes.
295278
message Exemplar {
296279
// labels_refs is an optional list of label name-value pair references, encoded
297280
// as indices to the Request.symbols array. This list's len is always
@@ -302,6 +285,7 @@ message Exemplar {
302285
// is attached to a histogram, which only gives an estimated value through buckets.
303286
double value = 2;
304287
// timestamp represents the timestamp of the exemplar in ms.
288+
//
305289
// For Go, see github.com/prometheus/prometheus/model/timestamp/timestamp.go
306290
// for conversion from/to time.Time to Prometheus timestamp.
307291
int64 timestamp = 3;
@@ -312,7 +296,30 @@ message Sample {
312296
// value of the sample.
313297
double value = 1;
314298
// timestamp represents timestamp of the sample in ms.
299+
//
300+
// For Go, see github.com/prometheus/prometheus/model/timestamp/timestamp.go
301+
// for conversion from/to time.Time to Prometheus timestamp.
315302
int64 timestamp = 2;
303+
// start_timestamp represents an optional start timestamp for the sample,
304+
// in ms format. This information is typically used for counter, histogram (cumulative)
305+
// or delta type metrics.
306+
//
307+
// For cumulative metrics, the start timestamp represents the time when the
308+
// counter started counting (sometimes referred to as created timestamp), which
309+
// can increase the accuracy of certain processing and query semantics (e.g. rates).
310+
//
311+
// Note:
312+
// * That some receivers might require start timestamps for certain metric
313+
// types; rejecting such samples within the Request as a result.
314+
// * start timestamp is the same as "created timestamp" name Prometheus used in the past.
315+
//
316+
// For Go, see github.com/prometheus/prometheus/model/timestamp/timestamp.go
317+
// for conversion from/to time.Time to Prometheus timestamp.
318+
//
319+
// Note that the "optional" keyword is omitted due to efficiency and consistency.
320+
// Zero value means value not set. If you need to use exactly zero value for
321+
// the timestamp, use 1 millisecond before or after.
322+
int64 start_timestamp = 3;
316323
}
317324
318325
// Metadata represents the metadata associated with the given series' samples.
@@ -338,10 +345,11 @@ message Metadata {
338345
uint32 unit_ref = 4;
339346
}
340347
341-
// A native histogram, also known as a sparse histogram.
342-
// See https://github.com/prometheus/prometheus/blob/remote-write-2.0/prompb/io/prometheus/write/v2/types.proto#L142
343-
// for a full message that follows the native histogram spec for both sparse
344-
// and exponential, as well as, custom bucketing.
348+
// A native histogram message, supporting
349+
// * sparse exponential bucketing, custom bucketing.
350+
// * float or integer histograms.
351+
//
352+
// See the full spec: https://prometheus.io/docs/specs/native_histograms/
345353
message Histogram { ... }
346354
```
347355
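
To illustrate the schema change in this hunk, the sketch below shows one plausible way an implementation holding rc.3-shaped data could map it to the rc.4 shape: the series-level `created_timestamp` is copied onto every sample as `start_timestamp`. The Go types are hand-written stand-ins for the simplified messages above, not the generated Protobuf code, and `legacySeries` and `migrateSeries` are hypothetical names. Whether copying one value to all samples is appropriate depends on the data source; this is an assumption, not something the commit prescribes.

```go
package main

import "fmt"

// Sample is a hand-written stand-in for the rc.4 Sample message.
type Sample struct {
	Value          float64
	Timestamp      int64 // milliseconds
	StartTimestamp int64 // milliseconds; 0 means unset
}

// legacySeries is a hypothetical stand-in for the rc.3 TimeSeries shape,
// where the created timestamp lived on the series (field 6, now reserved).
type legacySeries struct {
	Samples          []Sample
	CreatedTimestamp int64 // milliseconds; 0 means unset
}

// migrateSeries returns copies of the samples with the series-level created
// timestamp attached to each one as its start timestamp.
func migrateSeries(in legacySeries) []Sample {
	out := make([]Sample, len(in.Samples))
	for i, s := range in.Samples {
		s.StartTimestamp = in.CreatedTimestamp
		out[i] = s
	}
	return out
}

func main() {
	old := legacySeries{
		Samples: []Sample{
			{Value: 10, Timestamp: 2000},
			{Value: 12, Timestamp: 3000},
		},
		CreatedTimestamp: 1000,
	}
	for _, s := range migrateSeries(old) {
		fmt.Printf("value=%v ts=%d start=%d\n", s.Value, s.Timestamp, s.StartTimestamp)
	}
}
```

Because an unset created timestamp and an unset start timestamp are both encoded as 0, the copy preserves the "unset" state without special handling.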

@@ -361,7 +369,6 @@ Rationales: https://github.com/prometheus/proposals/blob/alexg/remote-write-20-p
361369
-->
362370
* `metadata` sub-fields SHOULD be provided. Receivers MAY reject series with unspecified `Metadata.type`.
363371
* Exemplars SHOULD be provided if they exist for a series.
364-
* `created_timestamp` SHOULD be provided for metrics that follow counter semantics (e.g. counters and histograms). Receivers MAY reject those series without `created_timestamp` being set.
365372

366373
The following subsections define some schema elements in detail.
367374

@@ -372,7 +379,7 @@ Rationales: https://github.com/prometheus/proposals/blob/alexg/remote-write-20-p
372379
-->
373380
The `io.prometheus.write.v2.Request` Protobuf Message is designed to [intern all strings](https://en.wikipedia.org/wiki/String_interning) for the proven additional compression and memory efficiency gains on top of the standard compressions.
374381

375-
The `symbols` table MUST be provided and it MUST contain deduplicated strings used in series, exemplar labels, and metadata strings. The first element of the `symbols` table MUST be an empty string, which is used to represent empty or unspecified values such as when `Metadata.unit_ref` or `Metadata.help_ref` are not provided. References MUST point to the existing index in the `symbols` string array.
382+
The `symbols` table MUST be provided, and it MUST contain deduplicated strings used in series, exemplar labels, and metadata strings. The first element of the `symbols` table MUST be an empty string, which is used to represent empty or unspecified values such as when `Metadata.unit_ref` or `Metadata.help_ref` are not provided. References MUST point to the existing index in the `symbols` string array.
376383
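
A minimal sketch of how a Sender might build the `symbols` table described above, assuming a simple in-memory map for deduplication. `symbolTable`, `ref`, `labelsRefs` and `label` are hypothetical names; the sketch also assumes the caller passes labels already sorted lexicographically, as the schema comments require.

```go
package main

import "fmt"

// symbolTable interns strings and hands out indices into the symbols slice.
type symbolTable struct {
	symbols []string
	index   map[string]uint32
}

// newSymbolTable reserves index 0 for the empty string, which the spec uses
// to represent empty or unspecified values.
func newSymbolTable() *symbolTable {
	return &symbolTable{
		symbols: []string{""},
		index:   map[string]uint32{"": 0},
	}
}

// ref returns the index of s, appending it only if it was not seen before.
func (t *symbolTable) ref(s string) uint32 {
	if i, ok := t.index[s]; ok {
		return i
	}
	i := uint32(len(t.symbols))
	t.symbols = append(t.symbols, s)
	t.index[s] = i
	return i
}

// label is a (name, value) pair.
type label struct{ name, value string }

// labelsRefs encodes labels as alternating name and value references, so the
// result always has an even length.
func (t *symbolTable) labelsRefs(labels []label) []uint32 {
	refs := make([]uint32, 0, 2*len(labels))
	for _, l := range labels {
		refs = append(refs, t.ref(l.name), t.ref(l.value))
	}
	return refs
}

func main() {
	table := newSymbolTable()
	a := table.labelsRefs([]label{{"__name__", "http_requests_total"}, {"job", "api"}})
	b := table.labelsRefs([]label{{"__name__", "http_errors_total"}, {"job", "api"}})
	fmt.Println(a, b)
	fmt.Printf("%q\n", table.symbols)
}
```

Strings shared between series, such as `__name__`, `job` and `api` here, are stored once and referenced by index from every series that uses them.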

377384
#### Series Labels
378385

@@ -405,6 +412,8 @@ Rationales: https://github.com/prometheus/proposals/blob/alexg/remote-write-20-p
405412
-->
406413
Senders MUST send `samples` (or `histograms`) for any given `TimeSeries` in timestamp order. Senders MAY send multiple requests for different series in parallel.
407414

415+
Sample's or Histogram's `start_timestamp` SHOULD be provided for types that follow counter semantics (e.g. counters and counter histograms). Receivers MAY reject those series without `start_timestamp` being set. Given optionality, the 0 value MUST be treated by receivers as unset value. To represent the unlikely 0 unix timestamp in milliseconds, "1" or "-1" value MUST be used.
416+
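
The optionality rule for `start_timestamp` added here (0 means unset; a genuine 0 ms Unix timestamp is sent as 1 or -1) can be sketched as a pair of helpers. `encodeStartTimestamp` and `decodeStartTimestamp` are hypothetical names, and choosing 1 rather than -1 for the epoch case is an arbitrary assumption; the text allows either.

```go
package main

import "fmt"

// encodeStartTimestamp returns the wire value for a start timestamp given in
// Unix milliseconds. known=false encodes "unset" as 0. A known timestamp of
// exactly 0 is shifted to 1, because 0 is reserved for "unset".
func encodeStartTimestamp(startMs int64, known bool) int64 {
	if !known {
		return 0
	}
	if startMs == 0 {
		return 1 // assumption: pick +1 ms; the spec equally allows -1
	}
	return startMs
}

// decodeStartTimestamp interprets a wire value on the Receiver side. The
// second return value reports whether a start timestamp was set at all.
func decodeStartTimestamp(wire int64) (startMs int64, known bool) {
	if wire == 0 {
		return 0, false
	}
	return wire, true
}

func main() {
	fmt.Println(encodeStartTimestamp(0, false))             // unset
	fmt.Println(encodeStartTimestamp(0, true))              // epoch, shifted by 1 ms
	fmt.Println(encodeStartTimestamp(1700000000000, true))  // ordinary timestamp
	fmt.Println(decodeStartTimestamp(0))
	fmt.Println(decodeStartTimestamp(1700000000000))
}
```

The round trip is lossy by one millisecond only for the exact epoch value, which is the trade-off the spec accepts in exchange for omitting the `optional` keyword.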
408417
<!---
409418
Rationales: https://github.com/prometheus/proposals/blob/alexg/remote-write-20-proposal/proposals/2024-04-09_remote-write-20.md#partial-writes#being-pull-vs-push-agnostic
410419
-->
@@ -479,4 +488,4 @@ Samples must be in-order _for a given series_. However, even if a Receiver does
479488
**What are the differences between Remote-Write 2.0 and OpenTelemetry's OTLP protocol?**
480489
[OpenTelemetry OTLP](https://github.com/open-telemetry/opentelemetry-proto/blob/a05597bff803d3d9405fcdd1e1fb1f42bed4eb7a/docs/specification.md) is a protocol for transporting of telemetry data (such as metrics, logs, traces and profiles) between telemetry sources, intermediate nodes and telemetry backends. The recommended transport involves gRPC with protobuf, but HTTP with protobuf or JSON are also described. It was designed from scratch with the intent to support a variety of different observability signals, data types and extra information. For [metrics](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/metrics/v1/metrics.proto) that means additional non-identifying labels, flags, temporal aggregations types, resource or scoped metrics, schema URLs and more. OTLP also requires [the semantic convention](https://opentelemetry.io/docs/concepts/semantic-conventions/) to be used.
481490

482-
Remote-Write was designed for simplicity, efficiency and organic growth. The first version was officially released in 2023, when already [dozens of battle-tested adopters in the CNCF ecosystem](./remote_write_spec.md#compatible-senders-and-receivers) had been using this protocol for years. Remote-Write 2.0 iterates on the previous protocol by adding a few new elements (metadata, exemplars, created timestamp and native histograms) and string interning. Remote-Write 2.0 is always stateless, focuses only on metrics and is opinionated; as such it is scoped down to elements that the Prometheus community considers enough to have a robust metric solution. The intention is to ensure the Remote-Write is a stable protocol that is cheaper and simpler to adopt and use than the alternatives in the observability ecosystem.
491+
Remote-Write was designed for simplicity, efficiency and organic growth. The first version was officially released in 2023, when already [dozens of battle-tested adopters in the CNCF ecosystem](./remote_write_spec.md#compatible-senders-and-receivers) had been using this protocol for years. Remote-Write 2.0 iterates on the previous protocol by adding a few new elements (metadata, exemplars, start timestamp and native histograms) and string interning. Remote-Write 2.0 is always stateless, focuses only on metrics and is opinionated; as such it is scoped down to elements that the Prometheus community considers enough to have a robust metric solution. The intention is to ensure the Remote-Write is a stable protocol that is cheaper and simpler to adopt and use than the alternatives in the observability ecosystem.
