Commit 6361fd2

Add observability alerts for chargeback integration (#16205)
* Add observability alerts for chargeback integration
  - Add two ES|QL alerting rules: detect new chargeback groups and detect deployments missing usage data
  - Add comprehensive documentation for alert setup and configuration
  - Update Elasticsearch version requirement to 9.2.0+ for smart lookup join support
  - Add transform startup and monitoring instructions to README
* Update changelog with PR #16205
* Remove wrong information
* Update chargeback README documentation
* Improve observability alert action message formatting
* Clarify configuration update vs add new period documentation
1 parent d694959 commit 6361fd2

9 files changed, +398 -37 lines changed


packages/chargeback/_dev/build/docs/README.md

Lines changed: 212 additions & 11 deletions
@@ -2,7 +2,9 @@
22

33
_Technical preview: This integration is being developed by Elastic's Customer Engineering team. Please report any issues to the Elastician who shared this integration with you._
44

5-
The Chargeback integration provides FinOps visibility into Elastic usage across tenants. By integrating data from the [**Elasticsearch Service Billing**](https://www.elastic.co/docs/reference/integrations/ess_billing/) and [**Elasticsearch**](https://www.elastic.co/docs/reference/integrations/elasticsearch/) integrations, it enables the determination of value provided by each deployment, data stream, and tier accross the organisation. This allows Centre of Excellence (CoE) teams to accurately allocate costs back to the appropriate tenant.
5+
The Chargeback integration provides FinOps visibility into Elastic usage across tenants. By integrating data from the [**Elasticsearch Service Billing**](https://www.elastic.co/docs/reference/integrations/ess_billing/) and [**Elasticsearch**](https://www.elastic.co/docs/reference/integrations/elasticsearch/) integrations, it enables the determination of value provided by each deployment, data stream, and tier across the organisation. This allows Centre of Excellence (CoE) teams to accurately allocate costs back to the appropriate tenant.
6+
7+
The integration creates several transforms that aggregate billing and usage data into lookup indices optimized for cost analysis and chargeback reporting.
68

79
## What is FinOps?
810

@@ -28,34 +30,233 @@ Currently, Chargeback calculations consider only Elasticsearch data nodes. Contr
2830

2931
This default weighting means storage contributes most to the blended cost calculation, with indexing considered only on the hot tier. Adjust these weights based on your organisation's needs and best judgment.
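As a rough illustration of how such weights could turn into relative shares, the sketch below normalizes the default weights listed in this README (indexing 20, query 20, storage 40) per tier. The function name `weight_shares` and the normalization itself are assumptions made for illustration; the integration's own ES|QL queries may combine the weights differently.

```python
def weight_shares(indexing, query, storage, tier="hot"):
    """Normalize the three weights into fractional shares for one tier.

    Illustrative only: this is not the integration's actual formula.
    """
    # Assumption taken from the text above: indexing counts only on the hot tier.
    if tier != "hot":
        indexing = 0
    total = indexing + query + storage
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return {
        "indexing": indexing / total,
        "query": query / total,
        "storage": storage / total,
    }


# Default weights from this README.
hot = weight_shares(20, 20, 40, tier="hot")
warm = weight_shares(20, 20, 40, tier="warm")
print(hot)
print(warm)
```

Under this reading, storage carries half of the blend on the hot tier and two thirds on every other tier, which matches the statement that storage contributes most.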
3032

31-
Chargeback is also present based on a configured rate and unit. These are used to display cost in the local currency, for instance `EUR`, with a rate of `0.85`.
33+
Chargeback costs are presented based on a configured rate and unit. These are used to display cost in your local currency, for instance `EUR`, with a rate of `0.85` per ECU.
34+
35+
## Configuration
36+
37+
Configuration values are stored in the `chargeback_conf_lookup` index. The dashboard automatically applies the correct configuration based on the billing date falling within the `conf_start_date` and `conf_end_date` range.
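The date-range matching described above can be sketched in a few lines of Python. This is a minimal illustration, not the dashboard's implementation (which uses ES|QL): the helper names `parse_ts` and `select_config`, the sample periods, and the 100 ECU figure are all hypothetical.

```python
from datetime import datetime, timezone


def parse_ts(value):
    # Convert the trailing "Z" so datetime.fromisoformat accepts the timestamp.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_config(configs, billing_date):
    """Return the first config whose start/end range contains billing_date."""
    for conf in configs:
        start = parse_ts(conf["conf_start_date"])
        end = parse_ts(conf["conf_end_date"])
        if start <= billing_date <= end:
            return conf
    return None


# Hypothetical configuration periods, shaped like chargeback_conf_lookup documents.
configs = [
    {"conf_ecu_rate": 0.85, "conf_ecu_rate_unit": "EUR",
     "conf_start_date": "2024-01-01T00:00:00.000Z",
     "conf_end_date": "2024-12-31T23:59:59.999Z"},
    {"conf_ecu_rate": 0.95, "conf_ecu_rate_unit": "EUR",
     "conf_start_date": "2025-01-01T00:00:00.000Z",
     "conf_end_date": "2025-12-31T23:59:59.999Z"},
]

billing_date = datetime(2025, 3, 15, tzinfo=timezone.utc)
conf = select_config(configs, billing_date)

# Convert an illustrative 100 ECU into the configured currency.
cost = round(100 * conf["conf_ecu_rate"], 2)
print(cost, conf["conf_ecu_rate_unit"])
```

A billing date outside every configured period returns `None` in this sketch, so configuration periods should cover the full range of billing data without gaps.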
38+
39+
### Update the default configuration:
3240

33-
All configuration values can be updated, as follows:
41+
Using `_update/config` updates the document with ID `config`:
3442

3543
```
3644
POST chargeback_conf_lookup/_update/config
3745
{
3846
"doc": {
3947
"conf_ecu_rate": 0.85,
4048
"conf_ecu_rate_unit": "EUR",
41-
"conf_indexing_weight": 50,
49+
"conf_indexing_weight": 20,
4250
"conf_query_weight": 20,
43-
"conf_storage_weight": 40
51+
"conf_storage_weight": 40,
52+
"conf_start_date": "2024-01-01T00:00:00.000Z",
53+
"conf_end_date": "2024-12-31T23:59:59.999Z"
4454
}
4555
}
4656
```
4757

48-
Chargeback data can be viewed in the `[Chargeback] Cost and Consumption breakdown` dashboard.
58+
### Add a new configuration period (for time-based rate changes):
59+
60+
Using `_doc` creates a new document with an auto-generated ID:
61+
62+
```
63+
POST chargeback_conf_lookup/_doc
64+
{
65+
"conf_ecu_rate": 0.95,
66+
"conf_ecu_rate_unit": "EUR",
67+
"conf_indexing_weight": 20,
68+
"conf_query_weight": 20,
69+
"conf_storage_weight": 40,
70+
"conf_start_date": "2025-01-01T00:00:00.000Z",
71+
"conf_end_date": "2025-12-31T23:59:59.999Z"
72+
}
73+
```
74+
75+
This allows you to have different rates for different time periods (e.g., quarterly or annual rate changes).
76+
77+
**Configuration Options:**
78+
- `conf_ecu_rate`: The monetary value per ECU (e.g., 0.85)
79+
- `conf_ecu_rate_unit`: The currency code (e.g., "EUR", "USD", "GBP")
80+
- `conf_indexing_weight`: Weight for indexing operations (default: 20, only applies to hot tier)
81+
- `conf_query_weight`: Weight for query operations (default: 20)
82+
- `conf_storage_weight`: Weight for storage (default: 40)
83+
- `conf_start_date`: Start date/time for the configuration period (ISO 8601 format)
84+
- `conf_end_date`: End date/time for the configuration period (ISO 8601 format)
85+
86+
## Data and Transforms
87+
88+
The integration creates the following transforms to aggregate cost and usage data:
89+
90+
1. **billing_cluster_cost** - Aggregates daily ECU usage per deployment from ESS Billing data, with support for deployment groups via `chargeback_group` tags
91+
2. **cluster_deployment_contribution** - Calculates per-deployment usage metrics (indexing time, query time, storage) from Elasticsearch monitoring data
92+
3. **cluster_datastream_contribution** - Aggregates usage per data stream for detailed cost attribution
93+
4. **cluster_tier_contribution** - Aggregates usage per data tier (hot, warm, cold, frozen)
94+
5. **cluster_tier_and_ds_contribution** - Combined view of usage by both tier and data stream
95+
96+
These transforms produce lookup indices that are queried by the dashboard using ES|QL LOOKUP JOINs to correlate billing costs with actual usage patterns.
97+
98+
### Starting the Transforms
99+
100+
After installing the integration, you need to manually start the four usage-related transforms:
101+
102+
1. Navigate to **Stack Management → Transforms**
103+
2. Filter for `chargeback` to see all Chargeback transforms
104+
3. Start the following transforms:
105+
- `cluster_deployment_contribution`
106+
- `cluster_datastream_contribution`
107+
- `cluster_tier_contribution`
108+
- `cluster_tier_and_ds_contribution`
109+
110+
The `billing_cluster_cost` transform starts automatically and does not require manual intervention.
111+
112+
### Transform Health Monitoring
113+
114+
To set up alerts that notify you when transforms are not working:
115+
116+
1. Navigate to **Stack Management → Transforms**
117+
2. Filter for `chargeback` to see all Chargeback transforms
118+
3. Select a transform and click the **Actions** menu
119+
4. Select **Create alert rule**
120+
5. Configure the alert rule to notify when the transform health status changes
121+
122+
This will create a transform health rule that monitors the selected transform and sends notifications when issues are detected.
123+
124+
## Dashboard
125+
126+
Chargeback data can be viewed in the `[Chargeback] Cost and Consumption breakdown` dashboard, which provides:
127+
128+
- Cost breakdown by deployment, data tier, and data stream
129+
- Time-series cost trends
130+
- Deployment group filtering for team/project-based analysis
131+
- Blended cost metrics combining indexing, querying, and storage usage
132+
- ECU consumption vs. monetary cost comparison
49133

50134
![Cost and Consumption breakdown](../img/chargeback.png)
51135

136+
## Deployment Groups
137+
138+
The integration supports organizing deployments into logical groups using the `chargeback_group` tag on ESS Billing deployments. This enables cost allocation and filtering by teams, projects, or any organizational structure.
139+
140+
To assign a deployment to a chargeback group, add a tag to your deployment in the Elastic Cloud console in the format:
141+
```
142+
chargeback_group:<group-name>
143+
```
144+
145+
For example: `chargeback_group:team-search` or `chargeback_group:project-analytics`
146+
147+
The `billing_cluster_cost` transform automatically extracts these tags from the `deployment_tags` field in ESS Billing data using runtime mappings. The dashboard includes a deployment group filter to view costs by specific groups, making it easy to track expenses per team or project.
148+
149+
**Note:** Each deployment should have only one `chargeback_group` tag. Having multiple tags can cause issues and lead to unpredictable cost allocation.
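To make the tag convention concrete, here is a small Python sketch of extracting the group name from a list of deployment tags and rejecting the multiple-tag case the note warns about. It is not the transform's runtime mapping script; the function name `extract_group`, the sample tags, and the choice to return an empty string when no tag is present are assumptions for illustration.

```python
PREFIX = "chargeback_group:"


def extract_group(tags):
    """Return the chargeback group from a list of tags, or "" if none is set."""
    groups = [tag[len(PREFIX):] for tag in tags if tag.startswith(PREFIX)]
    if len(groups) > 1:
        # More than one chargeback_group tag makes cost allocation ambiguous.
        raise ValueError(f"multiple chargeback_group tags found: {groups}")
    return groups[0] if groups else ""


# Hypothetical tag lists for three deployments.
print(extract_group(["env:prod", "chargeback_group:team-search"]))
print(repr(extract_group(["env:dev"])))

try:
    extract_group(["chargeback_group:team-search", "chargeback_group:project-analytics"])
except ValueError as err:
    print(err)
```

Running a check like this over your deployment tags before relying on the dashboard's group filter is one way to catch deployments with duplicate or missing `chargeback_group` tags.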
150+
151+
## Observability Rules
152+
153+
The following are sample observability rules that can help ensure data validity by notifying you when events occur that could compromise the accuracy of your chargeback data:
154+
155+
### Rule 1: New Chargeback Group Detected
156+
157+
Detects when a new `chargeback_group` tag is added to a deployment, allowing teams to be notified when new cost allocation groups are created.
158+
159+
**To create this alert**, navigate to **Dev Tools** in Kibana and run:
160+
```json
161+
POST kbn:/api/alerting/rule/chargeback_new_group_detected
162+
{
163+
"name": "[Chargeback] New chargeback group detected",
164+
"tags": ["Chargeback"],
165+
"consumer": "alerts",
166+
"rule_type_id": ".es-query",
167+
"schedule": {
168+
"interval": "1h"
169+
},
170+
"params": {
171+
"size": 100,
172+
"esqlQuery": {
173+
"esql": "FROM billing_cluster_cost_lookup | STATS count = COUNT(*) BY deployment_group | SORT deployment_group | KEEP deployment_group"
174+
},
175+
"threshold": [0],
176+
"timeField": "@timestamp",
177+
"searchType": "esqlQuery",
178+
"timeWindowSize": 3,
179+
"timeWindowUnit": "d",
180+
"thresholdComparator": ">",
181+
"excludeHitsFromPreviousRun": true
182+
},
183+
"actions": []
184+
}
185+
```
186+
187+
### Rule 2: Deployment with Chargeback Group Missing Usage Data
188+
189+
Detects when a deployment has a chargeback group assigned but is not sending usage/consumption data. This indicates a potential configuration issue or data collection problem.
190+
191+
**To create this alert**, navigate to **Dev Tools** in Kibana and run:
192+
```json
193+
POST kbn:/api/alerting/rule/chargeback_deployment_missing_usage_data
194+
{
195+
"name": "[Chargeback] Deployment with chargeback group missing usage data",
196+
"tags": ["Chargeback"],
197+
"consumer": "alerts",
198+
"rule_type_id": ".es-query",
199+
"schedule": {
200+
"interval": "1h"
201+
},
202+
"params": {
203+
"size": 100,
204+
"esqlQuery": {
205+
"esql": """FROM billing_cluster_cost_lookup
206+
| WHERE deployment_group != ""
207+
| LOOKUP JOIN cluster_deployment_contribution_lookup ON composite_key
208+
| WHERE cluster_name IS NULL
209+
| INLINE STATS count = COUNT(*) BY deployment_id, deployment_name, deployment_group
210+
| EVAL result = CONCAT("Deployment `", deployment_name,"` (`", deployment_id,"`) in deployment group `", deployment_group, "` did not have usage data since ", left(composite_key,10),".")
211+
| STATS result = VALUES(result)
212+
| MV_EXPAND result"""
213+
},
214+
"threshold": [0],
215+
"timeField": "@timestamp",
216+
"searchType": "esqlQuery",
217+
"timeWindowSize": 3,
218+
"timeWindowUnit": "d",
219+
"thresholdComparator": ">",
220+
"excludeHitsFromPreviousRun": true
221+
},
222+
"actions": []
223+
}
224+
```
225+
226+
### Alert actions
227+
228+
**Configure an action** with the following message template appended to the default content (keep the new lines, as it helps with legibility):
229+
230+
```
231+
Details:
232+
233+
{{#context.hits}}
234+
• {{_source}}
235+
236+
{{/context.hits}}
237+
238+
Total: {{context.hits.length}}
239+
```
240+
52241
## Requirements
53242

54243
To use this integration, the following prerequisites must be met:
55244

56-
- The monitoring cluster, where this integration is installed, must be on version 8.18.0+ due to its use of [ES|QL LOOKUP JOIN](https://www.elastic.co/docs/reference/query-languages/esql/esql-lookup-join).
57-
- The [**Elasticsearch Service Billing**](https://www.elastic.co/docs/reference/integrations/ess_billing/) integration (v1.4.1+) must be installed and running.
58-
- The [**Elasticsearch**](https://www.elastic.co/docs/reference/integrations/elasticsearch/) integration (v1.16.0+) must be installed and collecting [usage data](https://www.elastic.co/docs/reference/integrations/elasticsearch/#indices-and-data-streams-usage-analysis) from all relevant deployments.
59-
- The Transform named `logs-elasticsearch.index_pivot-default-{VERSION}` must be running, which is an asset of the **Elasticsearch** integration.
245+
**Monitoring Cluster:**
246+
- Must be on Elasticsearch version **9.2.0+** due to the use of smart [ES|QL LOOKUP JOIN](https://www.elastic.co/docs/reference/query-languages/esql/esql-lookup-join) (conditional joins) in transforms and dashboard queries
247+
- This is where the Chargeback integration should be installed
248+
249+
**Required Integrations:**
250+
- [**Elasticsearch Service Billing**](https://www.elastic.co/docs/reference/integrations/ess_billing/) integration (v1.4.1+) must be installed and collecting billing data from your Elastic Cloud organization
251+
- [**Elasticsearch**](https://www.elastic.co/docs/reference/integrations/elasticsearch/) integration (v1.16.0+) must be installed and collecting [usage data](https://www.elastic.co/docs/reference/integrations/elasticsearch/#indices-and-data-streams-usage-analysis) from all deployments you want to include in chargeback calculations
252+
253+
**Required Transforms:**
254+
- The transform `logs-elasticsearch.index_pivot-default-{VERSION}` (from the Elasticsearch integration) must be running to aggregate usage metrics per index
255+
256+
**Data Flow:**
257+
1. ESS Billing data is collected into `metrics-ess_billing.billing-*`
258+
2. Elasticsearch usage data is collected into `metrics-elasticsearch.stack_monitoring.*` (or `monitoring-indices` for Stack Monitoring)
259+
3. Chargeback transforms process and correlate this data
260+
4. Dashboard queries the resulting lookup indices using ES|QL
60261

61-
This integration must be installed on the **Monitoring cluster** where the above mentioned relevant usage and billing data is collected.
262+
**Note:** This integration must be installed on a centralized monitoring cluster that has visibility to both billing and usage data from your deployments.

packages/chargeback/changelog.yml

Lines changed: 15 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1,24 +1,29 @@
11
# newer versions go on top
2+
- version: 0.2.7
3+
changes:
4+
- description: "Add observability alerting rule templates and documentation for monitoring new chargeback groups and missing usage data. Update Elasticsearch version requirement to 9.2.0+ for smart lookup join support."
5+
type: enhancement
6+
link: https://github.com/elastic/integrations/pull/16205
27
- version: 0.2.6
38
changes:
49
- description: "Fixing bug around sku based cost allocation"
5-
type: enhancement
6-
link: https://github.com/elastic/integrations/pull/14545
10+
type: bugfix
11+
link: https://github.com/elastic/integrations/pull/16192
712
- version: 0.2.5
813
changes:
914
- description: "Add deployment_group field extracted from ESS Billing deployment tags using runtime mappings to enable tag-based cost allocation and filtering. Fix transforms to use correct field type for elasticsearch.cluster.name."
1015
type: enhancement
11-
link: https://github.com/elastic/integrations/pull/14545
16+
link: https://github.com/elastic/integrations/pull/16185
1217
- version: 0.2.4
1318
changes:
1419
- description: "Adding sku and cost_type to the billing_cluster_cost_lookup for future utilization"
1520
type: enhancement
16-
link: https://github.com/elastic/integrations/pull/14545
21+
link: https://github.com/elastic/integrations/pull/16182
1722
- version: 0.2.3
1823
changes:
1924
- description: "Adding deployment filter, dataview and moving config portion to bottom of dashboard for better usability."
2025
type: enhancement
21-
link: https://github.com/elastic/integrations/pull/14545
26+
link: https://github.com/elastic/integrations/pull/16153
2227
- version: 0.2.2
2328
changes:
2429
- description: "Allow setting the Conversion Rate per time window in the configuration lookup index and adding collapsable sections in the dashboard for better usability."
@@ -47,27 +52,27 @@
4752
- version: 0.1.5
4853
changes:
4954
- description: "Fixing the control error in the dashboard by adding a data view."
50-
type: bugfix
55+
type: enhancement
5156
link: https://github.com/elastic/integrations/pull/14545
5257
- version: 0.1.4
5358
changes:
5459
- description: "Consistent naming of `datastream`. Add `| LIMIT 5000` to ESQL top query to cater for large organisations."
55-
type: bugfix
60+
type: enhancement
5661
link: https://github.com/elastic/integrations/pull/14545
5762
- version: 0.1.3
5863
changes:
5964
- description: "Made sure the colour palette is predictable by using the eui_amsterdam_color_blind palate. Add ECU rate to the dashboard."
60-
type: bugfix
65+
type: enhancement
6166
link: https://github.com/elastic/integrations/pull/14545
6267
- version: 0.1.2
6368
changes:
6469
- description: "Added the necessary fields to the billing_cluster_cost_lookup in the Elasticsearch transform to allow for correlation with the ES integration."
65-
type: bugfix
70+
type: enhancement
6671
link: https://github.com/elastic/integrations/pull/14545
6772
- version: 0.1.1
6873
changes:
6974
- description: "Fixed the dashboard chargeback timeframe calculation for cost and ECU utilisation"
70-
type: bugfix
75+
type: enhancement
7176
link: https://github.com/elastic/integrations/pull/14545
7277
- version: 0.1.0
7378
changes:
