Commit fa9f093

ti_opencti: support filtering of indicators and deduplication mechanism (#15876)
This PR contains changes listed below for the OpenCTI integration:
- Added support for the following filters:
  - Pattern Types
  - Indicator Types
  - Revoked Status
  - Valid From
  - Valid Until
  - Label IDs (UUIDs)
  - Minimum Confidence Level
  - Author IDs (UUIDs)
  - Creator User IDs (UUIDs)
  - Created After
  - Modified After
  - Marking Definition IDs (UUIDs)
- Added tracking of the last modified timestamp in state to prevent re-fetching already processed indicators.
- Added fingerprint processor to prevent duplicate indicators.
- Added useful fields to events for the creation of detection rules.
- Updated OpenCTI logo.
1 parent 843356f commit fa9f093

13 files changed

+586
-153
lines changed

packages/ti_opencti/_dev/build/docs/README.md

Lines changed: 93 additions & 1 deletion
@@ -15,7 +15,7 @@ Each event in the Indicator data stream collected by the OpenCTI integration is
1515

1616
This integration requires Filebeat version 8.16.0, or later.
1717

18-
It has been updated for OpenCTI version 5.12.24 and requires that version or later.
18+
It has been updated for OpenCTI version 6.1.0 and requires that version or later.
1919

2020
## Setup
2121

@@ -25,6 +25,98 @@ When adding the OpenCTI integration, you will need to provide a base URL for the
2525

2626
The simplest authentication method to use is an API key (bearer token). You can find a value for the API key on your profile page in the OpenCTI user interface. Advanced integration settings can be used to configure various OAuth2-based authentication arrangements, and to enter SSL settings for mTLS authentication and for other purposes. For information on setting up the OpenCTI side of an authentication strategy, please refer to [OpenCTI's authentication documentation](https://docs.opencti.io/latest/deployment/authentication/).
2727

28+
### Filtering
29+
30+
The OpenCTI integration supports filters that control which indicators are ingested, so you can focus on the indicator types, confidence levels, authors, or time ranges that are most relevant to your security operations.
31+
32+
#### Available Filters
33+
34+
The following filters can be configured when setting up the integration (Note: The integration automatically filters for entity type 'Indicator' only):
35+
36+
- **Pattern Types**: Filter indicators by pattern type (e.g., 'stix'). The values are customizable in OpenCTI, and any custom pattern types defined in your OpenCTI instance are supported (if an observable is associated).
37+
38+
- **Indicator Types**: Filter indicators by type. Values are customizable in OpenCTI. Common defaults include: 'malicious-activity', 'attribution', 'benign', 'anomalous-activity', 'compromised', 'unknown'. Custom types defined in your OpenCTI instance are also supported.
39+
40+
- **Revoked Status**: Filter by revoked status. Set to 'true' to get only revoked indicators, 'false' for only active (non-revoked) indicators, or leave empty to get all indicators regardless of revoked status.
41+
42+
- **Valid From (Start Date)**: Filter indicators whose `valid_from` date is after the given date. Use ISO 8601 format (e.g., '2024-01-01T00:00:00Z') or relative date expressions (e.g., 'now-30d', 'now-7d').
43+
44+
- **Valid Until (End Date)**: Filter indicators whose `valid_until` date is before the given date. Use ISO 8601 format (e.g., '2024-12-31T23:59:59Z') or relative date expressions (e.g., 'now+30d', 'now+7d').
45+
46+
- **Label IDs**: Filter by label IDs. Enter the UUIDs of the labels to filter indicators that have these labels applied. **Important: You must use label IDs (UUIDs), not label names.** You can find label IDs in the OpenCTI interface by navigating to Settings > Taxonomies > Labels, or via the API.
47+
48+
- **Minimum Confidence Level**: Filter indicators with confidence level greater than or equal to a specified value (0-100).
49+
50+
- **Author IDs**: Filter by author IDs (createdBy relationship). Enter the UUIDs of the authors to filter indicators created by them. **Important: You must use author IDs (UUIDs), not author names.** You can find author IDs in the OpenCTI interface by clicking on an entity and checking its details, or via the API.
51+
52+
- **Creator IDs**: Filter by technical creator IDs. Enter the UUIDs of the internal users who created the indicators in OpenCTI.
53+
54+
- **Created After**: Filter indicators created after a specific date. Use ISO 8601 format (e.g., '2024-01-01T00:00:00Z') or relative date expressions (e.g., 'now-30d', 'now-7d', 'now-24h').
55+
56+
- **Modified After**: Filter indicators modified after a specific date. Use ISO 8601 format (e.g., '2024-01-01T00:00:00Z') or relative date expressions (e.g., 'now-30d', 'now-7d', 'now-24h').
57+
58+
- **Marking Definition IDs**: Filter by marking definitions (e.g., TLP levels). Enter the UUIDs of the marking definitions. **Important: You must use marking definition IDs (UUIDs), not names.** Common TLP marking IDs:
59+
- TLP:CLEAR (TLP:WHITE): `marking-definition--34098fce-860f-48ae-8e50-ebd3cc5e41da`
60+
- TLP:GREEN: `marking-definition--bab4a63c-aed9-4cf5-a766-dfca5abac2bb`
61+
- TLP:AMBER: `marking-definition--55d920b0-5e8b-4f79-9ee9-91f868d9b421`
62+
- TLP:RED: `marking-definition--e828b379-4e03-4974-9ac4-e53a884c97c1`
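
The relative date expressions accepted by the date filters above (such as 'now-30d' or 'now+7d') can be read as offsets from the current time. The following is a minimal Python sketch of that interpretation, not the integration's actual parser: the function name `resolve_date` is hypothetical, and only the day (`d`) and hour (`h`) units that appear in this README are assumed to exist.

```python
import re
from datetime import datetime, timedelta, timezone

# Only the units shown in the README examples are handled (assumption).
_RELATIVE = re.compile(r"^now(?:([+-])(\d+)([dh]))?$")
_UNITS = {"d": "days", "h": "hours"}


def resolve_date(expr: str, now: datetime) -> str:
    """Resolve 'now', 'now-30d', 'now+7d' or 'now-24h' to an ISO 8601 UTC string.

    Anything that does not look like a relative expression is assumed to be
    an absolute ISO 8601 timestamp already and is returned unchanged.
    `now` must be a timezone-aware datetime.
    """
    match = _RELATIVE.match(expr.strip())
    if match is None:
        return expr
    sign, amount, unit = match.groups()
    if sign is not None:
        delta = timedelta(**{_UNITS[unit]: int(amount)})
        now = now + delta if sign == "+" else now - delta
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    reference = datetime(2024, 1, 31, 12, 0, 0, tzinfo=timezone.utc)
    for expression in ("now-30d", "now+7d", "now-24h", "2024-01-01T00:00:00Z"):
        print(expression, "->", resolve_date(expression, reference))
```

Passing the reference time explicitly keeps the sketch deterministic; a real caller would pass `datetime.now(timezone.utc)`.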
63+
64+
#### Filter Examples
65+
66+
Here are some practical examples of filter configurations:
67+
68+
1. **High-confidence indicators only**: Set `Minimum Confidence Level` to 75 to ingest only indicators with high confidence.
69+
70+
2. **Active threat indicators**: Set `Indicator Types` to ['malicious-activity', 'compromised'] and `Revoked Status` to 'false' to focus on active, non-revoked threats.
71+
72+
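3. **Indicators in a validity window**: Set `Valid From (Start Date)` to 'now-365d' and `Valid Until (End Date)` to 'now+30d' to get indicators whose `valid_from` is within the last 365 days and whose `valid_until` is earlier than 30 days from now. Because `Valid Until` matches dates before the given value, this can also include indicators that have already expired.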
73+
74+
4. **Recent indicators**: Set `Created After` to 'now-7d' to collect only indicators created in the last 7 days.
75+
76+
5. **Specific pattern types**: Set `Pattern Types` to ['stix'] to collect only STIX pattern indicators, or include your custom pattern types defined in OpenCTI.
77+
78+
6. **Specific campaign tracking**: Use `Label IDs` filter with specific campaign label UUIDs (e.g., ['550e8400-e29b-41d4-a716-446655440000']) to track indicators related to particular threat campaigns.
79+
80+
7. **Indicators from specific sources**: Use `Author IDs` with the UUIDs of specific threat intelligence sources (e.g., ['123e4567-e89b-12d3-a456-426655440000']) to filter indicators from trusted sources.
81+
82+
8. **Recently modified high-value indicators**: Combine `Modified After` set to 'now-24h', `Minimum Confidence Level` to 80, and `Revoked Status` to 'false' to get recently updated, high-confidence active indicators.
83+
84+
9. **TLP-restricted indicators**: Use `Marking Definition IDs` with TLP:CLEAR and TLP:GREEN UUIDs to only ingest indicators that are safe to share broadly: ['marking-definition--34098fce-860f-48ae-8e50-ebd3cc5e41da', 'marking-definition--bab4a63c-aed9-4cf5-a766-dfca5abac2bb'].
85+
86+
All filters work together using AND logic at the top level. Within each multi-value filter (like pattern types or label IDs), OR logic is applied between values.
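
The AND/OR combination described above can be pictured as a single filter group whose entries each carry a list of values. The Python sketch below builds such a group in the shape that appears in this commit's mock `request_body` (`mode`, `filterGroups`, `filters`, and per-filter `key`, `operator`, `values`). The function name `build_filter_group` is hypothetical, and apart from `entity_type`, `revoked` and `updated_at`, which appear in the mock requests, any filter key you pass in is an assumption about the OpenCTI schema.

```python
import json


def build_filter_group(selected: dict[str, tuple[str, list[str]]]) -> dict:
    """Combine the configured filters into one top-level AND group.

    `selected` maps a filter key to (operator, values). Several values in
    one entry are the OR alternatives for that filter; entries with no
    values are skipped, mirroring a filter that was left empty.
    """
    # The integration always restricts results to indicators.
    filters = [{"key": "entity_type", "operator": "eq", "values": ["Indicator"]}]
    for key, (operator, values) in selected.items():
        if values:
            filters.append({"key": key, "operator": operator, "values": list(values)})
    return {"mode": "and", "filterGroups": [], "filters": filters}


if __name__ == "__main__":
    group = build_filter_group({"revoked": ("eq", ["false"])})
    print(json.dumps(group, sort_keys=True, separators=(",", ":")))
```

With only `revoked` set to 'false', the serialized group has the same shape as the `filters` variable matched by the first mock rule in `config.yml`.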
87+
88+
#### High Availability and Deduplication
89+
90+
The OpenCTI integration supports running on multiple Elastic Agents for high availability. When multiple agents fetch the same indicators:
91+
92+
- **Automatic Deduplication**: The integration uses a fingerprint-based document ID to prevent duplicates. Each indicator gets a consistent ID based on its `standard_id` and `modified` timestamp.
93+
- **No Manual Configuration Needed**: Deduplication works automatically - just deploy the integration to multiple agents.
94+
- **Update Handling**: When an indicator is updated in OpenCTI, the new version replaces the old one in Elasticsearch.
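
To illustrate why agents that fetch the same indicator produce the same document ID, here is a minimal Python sketch of a fingerprint over `standard_id` and `modified`. It is an illustration only: the fingerprint processor's actual hash method and field encoding are not shown in this diff, so SHA-256, the NUL separator, and the function name `document_id` are all assumptions.

```python
import hashlib


def document_id(standard_id: str, modified: str) -> str:
    """Derive a deterministic ID from the two fields named in the README.

    SHA-256 and the separator byte are assumptions made for this sketch,
    not the integration's confirmed configuration.
    """
    digest = hashlib.sha256()
    for field in (standard_id, modified):
        digest.update(field.encode("utf-8"))
        digest.update(b"\x00")  # separator so adjacent fields cannot run together
    return digest.hexdigest()


if __name__ == "__main__":
    # Example values in the style of the mock API responses in this commit.
    sid = "indicator--cde0a6e1-c622-52c4-b857-e9aeac56131b"
    agent_1 = document_id(sid, "2023-01-17T05:53:45.716Z")
    agent_2 = document_id(sid, "2023-01-17T05:53:45.716Z")
    print(agent_1 == agent_2)  # the same indicator version yields the same ID
```

In this sketch, two agents that fetch the same version of an indicator compute the same ID, while a later `modified` timestamp yields a different one.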
95+
96+
#### Best Practices for HA Setup
97+
98+
1. **Stagger Execution Times**: To avoid all agents hitting OpenCTI simultaneously, consider offsetting their schedules slightly (e.g., Agent 1 at :00, Agent 2 at :02).
99+
2. **Use the Same Configuration**: Ensure all agents use identical filter settings to fetch the same dataset.
100+
3. **Monitor Performance**: Check OpenCTI server load when multiple agents are polling.
101+
102+
### Finding IDs in OpenCTI
103+
104+
Since several filters require UUIDs rather than names, here are ways to find these IDs:
105+
106+
1. **Label IDs**:
107+
- In OpenCTI UI: Navigate to Settings → Taxonomies → Labels. Click on a label to see its ID in the URL or details.
108+
- Via API: Query the `labels` endpoint to list all labels with their IDs.
109+
110+
2. **Author IDs**:
111+
- In OpenCTI UI: Click on any entity that has an author, then click on the author name to see its details including the ID.
112+
- Via API: Query the `identities` endpoint to list all identities (organizations, individuals) with their IDs.
113+
114+
3. **Creator IDs**:
115+
- In OpenCTI UI: Navigate to Settings → Security → Users to see user IDs.
116+
- Via API: Query the `users` endpoint (requires appropriate permissions).
117+
118+
For more information about OpenCTI's filtering system, refer to the [OpenCTI filters documentation](https://docs.opencti.io/latest/reference/filters/).
119+
28120
## Logs
29121

30122
### Indicator

packages/ti_opencti/_dev/deploy/docker/files/config.yml

Lines changed: 12 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ rules:
44
request_headers:
55
Authorization: "Bearer test_api_key"
66
Content-Type: application/json
7-
request_body: '/"variables":\{"after":null,"first":3,"orderBy":"modified","orderMode":"asc"\}/'
7+
request_body: '/"variables":\{"after":null,"filters":\{"filterGroups":\[\],"filters":\[\{"key":"entity_type","operator":"eq","values":\["Indicator"\]\},\{"key":"revoked","operator":"eq","values":\["false"\]\}\],"mode":"and"\},"first":3,"orderBy":"modified","orderMode":"asc"\}/'
88
responses:
99
- status_code: 200
1010
body: |-
@@ -17,7 +17,7 @@ rules:
1717
"id": "d019b01c-b637-4eb2-af53-6d527be3193d",
1818
"standard_id": "indicator--cde0a6e1-c622-52c4-b857-e9aeac56131b",
1919
"is_inferred": false,
20-
"revoked": true,
20+
"revoked": false,
2121
"confidence": 15,
2222
"lang": "en",
2323
"created": "2018-02-05T08:04:53.000Z",
@@ -78,7 +78,7 @@ rules:
7878
"id": "82ecce19-82e6-43e4-9565-70488293b7ec",
7979
"standard_id": "indicator--c721b81a-6c3c-55ab-9bf3-038642e8add8",
8080
"is_inferred": false,
81-
"revoked": true,
81+
"revoked": false,
8282
"confidence": 15,
8383
"lang": "en",
8484
"created": "2020-08-03T06:40:58.000Z",
@@ -139,7 +139,7 @@ rules:
139139
"id": "c7253531-0832-4839-accd-b2b3ee950465",
140140
"standard_id": "indicator--bc622a61-52e0-5785-91d3-c7371f07f15d",
141141
"is_inferred": false,
142-
"revoked": true,
142+
"revoked": false,
143143
"confidence": 15,
144144
"lang": "en",
145145
"created": "2018-02-05T08:04:53.000Z",
@@ -209,7 +209,7 @@ rules:
209209
request_headers:
210210
Authorization: "Bearer test_api_key"
211211
Content-Type: application/json
212-
request_body: '/"variables":\{"after":"WzE2NzM5MzQ4MjU3MTYsImluZGljYXRvci0tYmM2MjJhNjEtNTJlMC01Nzg1LTkxZDMtYzczNzFmMDdmMTVkIl0=","first":3,"orderBy":"modified","orderMode":"asc"\}/'
212+
request_body: '/"variables":\{"after":"WzE2NzM5MzQ4MjU3MTYsImluZGljYXRvci0tYmM2MjJhNjEtNTJlMC01Nzg1LTkxZDMtYzczNzFmMDdmMTVkIl0=","filters":\{"filterGroups":\[\],"filters":\[\{"key":"entity_type","operator":"eq","values":\["Indicator"\]\},\{"key":"revoked","operator":"eq","values":\["false"\]\},\{"key":"updated_at","operator":"gt","values":\["2023-01-17T05:53:45.716Z"\]\}\],"mode":"and"\},"first":3,"orderBy":"modified","orderMode":"asc"\}/'
213213
responses:
214214
- status_code: 200
215215
body: |-
@@ -222,7 +222,7 @@ rules:
222222
"id": "e572d399-18e7-4f76-939b-b8b30c2a79b3",
223223
"standard_id": "indicator--65e96880-3652-5aa7-b625-b7b0f7103a37",
224224
"is_inferred": false,
225-
"revoked": true,
225+
"revoked": false,
226226
"confidence": 15,
227227
"lang": "en",
228228
"created": "2018-02-05T08:04:53.000Z",
@@ -283,7 +283,7 @@ rules:
283283
"id": "f123bd58-30c9-43c8-b3e9-28cc0740afca",
284284
"standard_id": "indicator--62337ac9-3e17-50db-8c67-4a499cbca0a4",
285285
"is_inferred": false,
286-
"revoked": true,
286+
"revoked": false,
287287
"confidence": 15,
288288
"lang": "en",
289289
"created": "2018-02-05T08:04:53.000Z",
@@ -344,7 +344,7 @@ rules:
344344
"id": "225c9e43-8b63-482b-a6af-897ab7f0d289",
345345
"standard_id": "indicator--5f95a26a-f9f0-5f8d-9d37-c912539462e3",
346346
"is_inferred": false,
347-
"revoked": true,
347+
"revoked": false,
348348
"confidence": 15,
349349
"lang": "en",
350350
"created": "2018-02-05T08:04:53.000Z",
@@ -414,7 +414,7 @@ rules:
414414
request_headers:
415415
Authorization: "Bearer test_api_key"
416416
Content-Type: application/json
417-
request_body: '/"variables":\{"after":"WzE2NzM5MzQ4NDA4MTAsImluZGljYXRvci0tNWY5NWEyNmEtZjlmMC01ZjhkLTlkMzctYzkxMjUzOTQ2MmUzIl0=","first":3,"orderBy":"modified","orderMode":"asc"\}/'
417+
request_body: '/"variables":\{"after":"WzE2NzM5MzQ4NDA4MTAsImluZGljYXRvci0tNWY5NWEyNmEtZjlmMC01ZjhkLTlkMzctYzkxMjUzOTQ2MmUzIl0=","filters":\{"filterGroups":\[\],"filters":\[\{"key":"entity_type","operator":"eq","values":\["Indicator"\]\},\{"key":"revoked","operator":"eq","values":\["false"\]\},\{"key":"updated_at","operator":"gt","values":\["2023-01-17T05:54:00.81Z"\]\}\],"mode":"and"\},"first":3,"orderBy":"modified","orderMode":"asc"\}/'
418418
responses:
419419
- status_code: 200
420420
body: |-
@@ -427,7 +427,7 @@ rules:
427427
"id": "2b3fe8d7-29d9-403b-9454-2be96af85d7b",
428428
"standard_id": "indicator--52fd9e7a-2ea9-5e9a-9aa0-9bea534c5158",
429429
"is_inferred": false,
430-
"revoked": true,
430+
"revoked": false,
431431
"confidence": 15,
432432
"lang": "en",
433433
"created": "2020-08-03T06:40:55.000Z",
@@ -488,7 +488,7 @@ rules:
488488
"id": "f18778b9-f692-490d-9971-72012ea86520",
489489
"standard_id": "indicator--363accc6-ab9b-5776-bb80-21abe4a049ce",
490490
"is_inferred": false,
491-
"revoked": true,
491+
"revoked": false,
492492
"confidence": 15,
493493
"lang": "en",
494494
"created": "2018-02-05T08:04:53.000Z",
@@ -552,7 +552,7 @@ rules:
552552
"id": "32b60cc0-4b03-4b28-8436-0979ae34712e",
553553
"standard_id": "indicator--35e958f3-978e-56a9-9801-121c92a7142e",
554554
"is_inferred": false,
555-
"revoked": true,
555+
"revoked": false,
556556
"confidence": 15,
557557
"lang": "en",
558558
"created": "2018-02-05T08:04:53.000Z",

packages/ti_opencti/changelog.yml

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,18 @@
11
# newer versions go on top
2+
- version: "2.10.0"
3+
changes:
4+
- description: Add comprehensive filtering support for indicators including pattern types, confidence levels, labels, dates, authors, creators, and marking definitions.
5+
type: enhancement
6+
link: https://github.com/elastic/integrations/pull/15876
7+
- description: Implement deduplication mechanism using fingerprint processor to prevent duplicate indicators.
8+
type: enhancement
9+
link: https://github.com/elastic/integrations/pull/15876
10+
- description: Add state management to track last modified timestamp and prevent re-fetching already processed indicators.
11+
type: enhancement
12+
link: https://github.com/elastic/integrations/pull/15876
13+
- description: Update OpenCTI logo for better visual consistency.
14+
type: enhancement
15+
link: https://github.com/elastic/integrations/pull/15876
216
- version: "2.9.0"
317
changes:
418
- description: Avoid adding documents with errors to the transform destination index.

packages/ti_opencti/data_stream/indicator/_dev/test/system/test-default-config.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,5 +9,6 @@ data_stream:
99
page_size: 3
1010
preserve_original_event: true
1111
enable_request_tracer: true
12+
revoked: "false"
1213
assert:
1314
hit_count: 9
