Creating Retention Policy
The life cycle of an event in FortiSIEM begins in the Online event database, after which the event moves to the Archive data store. Online event data resides on faster but more expensive storage. Archive data resides on slower, cheaper, and higher capacity storage. You can set up retention policies to specify which events are retained, and for how long, in the online and archive event databases.
ClickHouse Event Retention
This section covers how event retention is managed for ClickHouse based deployments. The deployment options are listed in the following table.
FortiSIEM Deployment | Online Storage | Archive Storage |
---|---|---|
Non-AWS | Hot, Warm, and Cold tiers | Real-time archive on NFS. Note that a Cold tier with large disks may suffice for Archive. |
AWS | Hot, Warm, and Cold tiers | AWS S3 |
How ClickHouse Event Retention Works
Case 1: Regular non-AWS Deployments
An example is an on-premises ClickHouse deployment, where online data is stored in ClickHouse Hot/Warm/Cold tiers, with multiple disks in each tier. In many cases, the Cold tier can serve for archiving old events. If this is not sufficient, then you can add Archive storage on NFS, where events are stored in EventDB format.
Online Storage Management
Online storage includes events stored in ClickHouse Hot/Warm/Cold tiers. For Online storage, event retention is managed using two mechanisms: Space based Retention and Time based Retention.
- Space based Retention:
- If free Hot tier disk utilization is less than 10%:
- If Warm tier is defined, then events are moved from Hot tier to the Warm tier until free Hot tier disk utilization is more than 20%.
- If Warm tier is not defined, then those events are purged.
- If free Warm tier disk utilization is less than 10%:
- If Cold tier is defined, then events are moved from the Warm tier to the Cold tier until free Warm tier disk utilization is more than 20%.
- If Cold tier is not defined, then those events are purged.
- If Cold tier is defined and its free disk utilization is less than 10%:
- Events are purged until free Cold tier disk utilization is more than 20%.
When events are moved or purged, FortiSIEM goes through each ClickHouse event retention bucket (90 days, 180 days, ...) and moves or deletes the oldest events within each bucket in a round robin manner. All retention buckets are treated uniformly. An example of event movement/purging is as follows.
Suppose the Hot tier has two retention buckets containing events for the following days, where Day D1 is the oldest and Day D6 is the latest.
90 day bucket - days D1, D2, D3, D4, D5, D6
180 day bucket - days D3, D4, D5, D6
If the free Hot tier disk utilization goes below 10% on Day D6, then the events (to move or purge) are chosen in the following order, until free Hot tier disk utilization reaches 20%:
- 90 day bucket day D1
- 180 day bucket day D3
- 90 day bucket day D2
- 180 day bucket day D4
- 90 day bucket day D3
- 180 day bucket day D5 ...
- Time based Retention: You can specify Online event retention policies to specify the duration for which certain events need to be retained. The policies can take event attributes such as Organization, Reporting Device and Event Type as input. See Creating ClickHouse Event Retention Policy.
During the retention period, the events can be in Hot, Warm, or Cold storage depending on space based retention; for example, if the Hot tier becomes full, the events may move to the Warm tier. After the retention period expires, these events are purged from Online storage. If you do not have sufficient disk space for the event retention policies, then space based retention takes over and may purge the data early to make room for FortiSIEM to store new events.
Note: When adding a new retention policy, ensure that there is sufficient disk space to meet the retention policy requirements, or else data may be purged before retention time.
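The round-robin selection order from the space based retention example above can be modeled with a short Python sketch. This is an illustrative model only, not FortiSIEM's implementation; the function name `eviction_order` and the dictionary layout are assumptions made for the example.

```python
from itertools import zip_longest


def eviction_order(buckets):
    """Return (bucket, day) pairs in round-robin order, oldest day first.

    `buckets` maps a retention bucket name to its days, listed oldest first.
    Hypothetical helper for illustration; not a FortiSIEM API.
    """
    order = []
    # zip_longest walks all buckets in lockstep: the oldest day of each
    # bucket first, then the second oldest of each, and so on.
    for round_ in zip_longest(*buckets.values()):
        for name, day in zip(buckets, round_):
            if day is not None:  # None means this bucket has no days left
                order.append((name, day))
    return order


hot_tier = {
    "90 day": ["D1", "D2", "D3", "D4", "D5", "D6"],
    "180 day": ["D3", "D4", "D5", "D6"],
}
order = eviction_order(hot_tier)
print(order[:6])
# [('90 day', 'D1'), ('180 day', 'D3'), ('90 day', 'D2'),
#  ('180 day', 'D4'), ('90 day', 'D3'), ('180 day', 'D5')]
```

In the behavior described above, the walk would stop as soon as free space in the tier rises above 20%, so only a prefix of this list is normally moved or purged.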
Archive Storage Management
If you define Archive storage, then events are copied in real-time to Archive storage and stored in FortiSIEM EventDB format. This storage is not maintained by ClickHouse. Events can stay in both Online and Archive storage, and their retention is managed independently.
- Space based Retention: If free Archive disk utilization is less than 10GB, then oldest events are purged until free Archive disk utilization is more than 20GB. These parameters are defined in the phoenix_config.txt file.
- Organization based Retention: You can write Archive retention policies to specify the duration for which events for each Organization need to be retained. See Creating Archive Event Retention Policy.
Case 2: AWS Deployments
For AWS deployments, you can define S3 as the Archive storage. ClickHouse manages both the Online Hot/Warm/Cold disks and Archive S3 storage.
- Space based Retention:
- If free Hot tier disk utilization is less than 10%:
- If Warm tier is defined, then events are moved from Hot tier to the Warm tier until free Hot tier disk utilization is more than 20%.
- If Warm tier is not defined, but S3 Archive is defined, then those events are moved from Hot tier to S3 Archive until free Hot tier disk utilization is more than 20%.
- If neither Warm tier nor S3 Archive is defined, then those events are purged.
- If free Warm tier disk utilization is less than 10%:
- If Cold tier is defined, then events are moved from Warm tier to Cold tier until free Warm tier disk utilization is more than 20%.
- If Cold tier is not defined, but S3 Archive is defined, then those events are moved from Warm tier to S3 Archive until free Warm tier disk utilization is more than 20%.
- If neither Cold tier nor S3 Archive is defined, then those events are purged.
- If free Cold tier disk utilization is less than 10%:
- If S3 Archive is defined, then events are moved to S3 Archive until free Cold tier disk utilization is more than 20%.
- If S3 Archive is not defined, then events are purged until free Cold tier disk utilization is more than 20%.
- S3 Archive disk space is considered unlimited. Events are never purged from S3 Archive.
When events are moved or purged, FortiSIEM goes through each retention bucket (90 days, 180 days, ...) and moves/removes the oldest events within each bucket in a round robin manner. All retention buckets are treated uniformly. An example of event movement/purging is as follows.
Suppose the Hot tier has two retention buckets containing events for the following days, where Day D1 is the oldest and Day D6 is the latest.
90 day bucket - days D1, D2, D3, D4, D5, D6
180 day bucket - days D3, D4, D5, D6
If the free Hot tier disk utilization is less than 10%, then the events to move or purge are chosen in the following order, until free Hot tier disk utilization reaches 20%:
- 90 day bucket day D1
- 180 day bucket day D3
- 90 day bucket day D2
- 180 day bucket day D4
- 90 day bucket day D3
- 180 day bucket day D5 ...
- Time based Retention: You can specify Online event retention policies to specify the duration for which certain events need to be retained. The policies can take event attributes such as Organization, Reporting Device and Event Type as input. See Creating ClickHouse Event Retention Policy.
During the retention period, the events can be in Hot, Warm, or Cold storage depending on space based retention; for example, if the Hot tier becomes full, the events may move to the Warm tier. After the retention period expires, these events are purged from ClickHouse Online storage and S3 Archive.
Note: When adding a new retention policy, ensure that there is sufficient disk space to meet the retention policy requirements, or else data may be purged before retention time.
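The space based fallback rules for AWS deployments above reduce to a small decision function. The following Python sketch encodes those rules literally; the function name `next_destination` and its boolean parameters are hypothetical and exist only for this illustration.

```python
def next_destination(tier, warm=False, cold=False, s3=False):
    """Return where events leaving `tier` go, or "purge" if nowhere.

    `warm`, `cold` and `s3` say whether that tier or the S3 Archive is
    defined. Hypothetical helper; it mirrors the documented rules only.
    """
    if tier == "hot":
        candidates = [("warm", warm), ("s3", s3)]
    elif tier == "warm":
        candidates = [("cold", cold), ("s3", s3)]
    elif tier == "cold":
        candidates = [("s3", s3)]
    else:
        raise ValueError(f"unknown tier: {tier!r}")
    # The first defined candidate wins; with none defined, events are purged.
    for name, defined in candidates:
        if defined:
            return name
    return "purge"


print(next_destination("hot", warm=True, s3=True))  # warm
print(next_destination("warm", s3=True))            # s3
print(next_destination("cold"))                     # purge
```

Because S3 Archive space is treated as unlimited, there is no corresponding rule for events leaving S3 under space pressure; they leave only when a time based retention period expires.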
Creating ClickHouse Event Retention Policy
Online event retention policies specify which events are retained, and for how long, in the online event database. Take the following steps to create an Online Event retention policy for ClickHouse.
- Go to ADMIN > Settings > Database > Retention Policy.
- Under Online Retention Policy, click New.
- Select Enabled if the policy has to be enforced immediately.
- Choose the Organizations for which the policy must be applied (for service provider installations). Select All if it should apply to all organizations.
- Choose the Reporting Devices to apply this policy using the edit icon and click Save.
- Choose the Event Type or event type groups to apply this policy and click Save.
- Select the Retention Period from the drop-down list (3 Months, 6 Months, 1 Year, 3 Years, 5 Years, 10 Years, or Forever (50 Years)). Each month is counted as 30 days.
- Enter any Description related to the policy.
- Click Save.
- When done, click Apply.
Implementation Notes:
- Any time the retention policy on ClickHouse environment is changed, you must click Apply to push the retention policy.
- Retention policies are evaluated based on Rank. A lower rank policy is evaluated first and first match is applied.
- All events matching a retention policy are retained for the Retention Period specified in that policy.
- For events that do not match any existing retention policy, the default value for Retention Days is 18,250 (50 years).
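The rank-ordered, first-match evaluation described in the notes above can be sketched as follows. This is a simplified model under stated assumptions: the `Policy` class and its fields are hypothetical, Reporting Device matching is omitted for brevity, and skipping disabled policies is an assumption rather than documented behavior.

```python
from dataclasses import dataclass

DEFAULT_RETENTION_DAYS = 18_250  # 50 years, used when no policy matches
DAYS_PER_MONTH = 30              # each month is counted as 30 days


@dataclass
class Policy:
    rank: int
    retention_days: int
    org: str = "All"
    event_type: str = "All"
    enabled: bool = True  # assumption: disabled policies are ignored

    def matches(self, event):
        return (self.org in ("All", event["org"])
                and self.event_type in ("All", event["event_type"]))


def retention_days(event, policies):
    """Return the retention of the lowest-ranked matching policy."""
    for policy in sorted(policies, key=lambda p: p.rank):
        if policy.enabled and policy.matches(event):
            return policy.retention_days
    return DEFAULT_RETENTION_DAYS


policies = [
    Policy(rank=2, retention_days=6 * DAYS_PER_MONTH),  # everything else
    Policy(rank=1, retention_days=3 * DAYS_PER_MONTH,
           org="OrgA", event_type="ExampleLogon"),      # hypothetical names
]
print(retention_days({"org": "OrgA", "event_type": "ExampleLogon"}, policies))  # 90
print(retention_days({"org": "OrgB", "event_type": "ExampleLogon"}, policies))  # 180
print(retention_days({"org": "OrgB", "event_type": "ExampleLogon"}, []))        # 18250
```

The first event matches both policies, but the rank 1 policy is evaluated first, so its 90 day period applies.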
Creating ClickHouse Archive Event Retention Policy for EventDB on NFS
These policies specify which events are retained, and for how long, when EventDB on NFS is used to archive.
- Go to ADMIN > Settings > Database > Retention Policy.
- Under Offline Retention Policy, click New to create a new policy.
- Select the Organization this policy applies to.
- Enter the Time Period in days for archive retention.
- Click Save.
Implementation Notes:
- Policies are enforced only at the end of the day.
- FortiSIEM will attempt to retain the events in the archive according to the policies. However, if the low storage threshold is hit (10GB, by default), then the oldest events which occurred in the day are purged.
FortiSIEM EventDB Event Retention
This section covers how event retention is managed for EventDB based deployments.
How EventDB Event Retention Works
For Online storage, event retention is managed using two mechanisms:
- Space based retention: If free online disk utilization is less than 10GB, then oldest events are moved to the Archive until free online disk utilization is more than 20GB. If Archive is not defined, then those events are purged.
- Policy based retention: You can specify Online event retention policies to specify the duration for which certain events need to be retained in online storage. The policies can take event attributes such as Organization, Reporting Device and Event Type as input. See Creating Online Event Retention Policy. If an event has remained in the online EventDB for the time period in the event retention policy, then the event is moved to the Archive at the end of the day.
For Archive storage, event retention is managed using two mechanisms:
- Space based retention: If free archive disk utilization is less than 10GB, then oldest events are purged until free archive disk utilization is more than 20GB.
- Policy based retention: You can specify Archive event retention policies to specify the duration for specific Organizations. See Creating Archive Event Retention Policy. If an event has remained in the archive EventDB for the time period in the event retention policy, then the event is purged at the end of the day.
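The two space thresholds above form a simple hysteresis: nothing happens until free space drops below 10GB, and then the oldest data is released until free space exceeds 20GB. The Python sketch below simulates one such pass for online storage. It assumes data is released one day at a time, oldest first; the function name and the sizes used are illustrative only.

```python
LOW_GB = 10   # start freeing space when free space drops below this
HIGH_GB = 20  # stop once free space exceeds this


def enforce_space_retention(free_gb, day_sizes_gb, archive_defined):
    """Simulate one pass of space based retention on online storage.

    `day_sizes_gb` lists the size of each stored day, oldest first.
    Returns (free_gb, remaining_days, actions), one action per day released.
    """
    remaining = list(day_sizes_gb)
    actions = []
    if free_gb >= LOW_GB:
        return free_gb, remaining, actions  # threshold not hit, do nothing
    # Without an Archive, released events are purged instead of moved.
    action = "archive" if archive_defined else "purge"
    while remaining and free_gb <= HIGH_GB:
        free_gb += remaining.pop(0)  # release the oldest day
        actions.append(action)
    return free_gb, remaining, actions


print(enforce_space_retention(8, [5, 4, 6, 7], archive_defined=True))
# (23, [7], ['archive', 'archive', 'archive'])
```

The gap between the two thresholds keeps the system from moving or purging data on every small fluctuation around a single limit.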
Creating EventDB Online Event Retention Policy
- Go to ADMIN > Settings > Database > Retention Policy.
- Under Online Retention Policy, click New.
- Select Enabled if the policy needs to be applied.
- Choose the Organizations for which the policy must be applied (for service provider installations). Select All if it should apply to all organizations.
- Choose the Reporting Devices to apply this policy using the edit icon and click Save. If all reporting devices should be applied, check the All checkbox.
- Choose the Event Type or event type groups to apply this policy and click Save. If all event types should be applied, check the All checkbox.
- Enter or select the Time Period in days that the event data specified by the conditions (Organizations, Reporting Devices and Event Type) should be held in the online storage before it is moved to archive or purged.
- Enter any Description related to the policy.
- Click Save.
Implementation Notes:
- If an event has remained in the online event database for the time period in the event retention policy, then the event is moved to the archive at the end of the day.
- If an event does not match any online event retention policy, then it remains in the online event database until the low storage threshold (10GB, by default) is reached. The event is then moved to the archive.
- If the archive mount point is defined, then ALL events are moved from online to archive. Nothing is purged.
- If the archive is not reachable after multiple retries, then FortiSIEM is forced to purge the event because there is nowhere to store the event.
- FortiSIEM will attempt to retain the events in the online event database according to the policies. However, if the low storage threshold is hit (10GB, by default), then the events from the oldest day are moved to archive.
- Implementing an online event policy requires selectively deleting specific events from the database and then re-indexing the database for the affected days. This is expensive in terms of time and performance. Therefore, do not define excessively fine-grained retention policies, because this will affect database performance.
- Policies are enforced only at the end of day – this means that events are deleted and re-indexed only at the end of the day. This minimizes the impact on database performance because the database usage should be low at that time.
- Policies are enforced by FortiSIEM only from the date just before the retention period. For example, if the retention period for a policy is 10 days, and today is 12/19/2022, then FortiSIEM will automatically enforce the policy for events with event receive time starting from 12/18/2022. For processing older dates, Fortinet recommends using the EnforceRetentionPolicy tool as follows:
EnforceRetentionPolicy <DATES>
where DATES is a comma-separated list of dates or date ranges on which to enforce the policy. Each date is specified as the number of days since the UNIX epoch began (1970-01-01). A date range is specified as two dates separated by "-" and includes both end dates.
For example, run the command
EnforceRetentionPolicy 16230,16233-16235
to enforce retention policies on these dates: 6/8/2014 and from 6/11/2014 to 6/13/2014.
- Run the tool as the admin user.
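Because the DATES argument uses day counts since 1970-01-01, a small helper can reduce conversion mistakes. The Python sketch below builds the argument string from calendar dates; the helper names are hypothetical and it does not invoke the tool. Note that plain calendar arithmetic maps 16230 to 2014-06-09, one day later than the 6/8/2014 given in the example above, so the tool may count days differently (for example, relative to local time). Verify a known date on your own system before relying on the conversion.

```python
from datetime import date

EPOCH = date(1970, 1, 1)


def to_epoch_day(d):
    """Number of whole days from 1970-01-01 to `d` (calendar arithmetic)."""
    return (d - EPOCH).days


def from_epoch_day(n):
    """Inverse of to_epoch_day."""
    return date.fromordinal(EPOCH.toordinal() + n)


def dates_argument(items):
    """Build a DATES string from dates and inclusive (start, end) tuples."""
    parts = []
    for item in items:
        if isinstance(item, tuple):
            start, end = item
            parts.append(f"{to_epoch_day(start)}-{to_epoch_day(end)}")
        else:
            parts.append(str(to_epoch_day(item)))
    return ",".join(parts)


arg = dates_argument([
    date(2014, 6, 9),
    (date(2014, 6, 12), date(2014, 6, 14)),
])
print(arg)                    # 16230,16233-16235
print(from_epoch_day(16230))  # 2014-06-09
```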
Creating EventDB Archive Event Retention Policy
These policies specify which events are retained, and for how long, in the archive.
- Go to ADMIN > Settings > Database > Retention Policy.
- Under Offline Retention Policy, click New to create a new policy.
- Select the Organization this policy applies to.
- Enter the Time Period in days for archive retention.
- Click Save.
Implementation Notes:
- Policies are enforced only at the end of the day.
- If an event has remained in the archive for the duration specified in the event retention policy, then the event is purged at the end of the day.
Elasticsearch Event Retention
This section covers how event retention is managed for Elasticsearch based deployments. The deployment options are:
FortiSIEM Deployment |
Online Storage |
Archive Storage |
---|---|---|
On-premises Elasticsearch – Option 1 | Hot and Warm tiers | HDFS archive from Elasticsearch |
On-premises Elasticsearch – Option 2 | Hot and Warm tiers | Real-time HDFS archive from FortiSIEM |
On-premises Elasticsearch – Option 3 | Hot and Warm tiers | Real-time Archive to NFS |
Elastic Cloud and AWS Elasticsearch | Hot and Warm tiers | Not available |
How Elasticsearch Event Retention Works
Elasticsearch online events storage is managed by the following thresholds:
- Hot Node
- Free Space Threshold: When the Hot node cluster disk free space falls below Low Threshold, then events are moved to Warm nodes until the Hot node cluster disk free space reaches High Threshold. If Warm node is not defined, then events are Archived. If Archive is not defined or real time archive option is chosen, then events are purged.
- Age Limit: Maximum number of days after which events are moved to Warm nodes. If Warm node is not defined, then events are Archived. If Archive is not defined or real time archive option is chosen, then events are purged.
- Warm Node
- Free Space Threshold: When the Warm node cluster disk free space falls below Low Threshold, then events are Archived. If Archive is not defined or real time archive option is chosen, then events are purged.
- Age Limit: Maximum number of days after which events are moved to Archive. If Archive is not defined or real time archive option is chosen, then events are purged.
These thresholds are defined in Configuring Elasticsearch Retention Threshold.
For archive you can choose either HDFS or EventDB on NFS.
- HDFS archive from Elasticsearch: In this option, the FortiSIEM HDFSMgr process creates Spark jobs to pull events directly from Elasticsearch and store them in HDFS. This option may put extra load on Elasticsearch, because events have to be read and then deleted from Elasticsearch while new events are being inserted. The archive disk is managed by thresholds: when the low threshold is reached, events are purged until the high threshold is reached. See Configuring HDFS Archive Threshold.
- Real-time HDFS archive from FortiSIEM: In this option, the FortiSIEM HDFSMgr process creates Spark jobs to pull events from FortiSIEM Supervisor and Worker nodes. This happens while events are being inserted into Elasticsearch. This approach has no impact on Elasticsearch performance, but events are stored in both Elasticsearch and HDFS and managed independently. Note that HDFS has better event storage compression properties. The archive disk is managed by thresholds: when the low threshold is reached, events are purged until the high threshold is reached. See Configuring HDFS Archive Threshold.
- Real-time archive to NFS: In this option, FortiSIEM Supervisor and Worker nodes store events in NFS managed by FortiSIEM EventDB. This happens while events are being inserted into Elasticsearch. This approach has no impact on Elasticsearch performance, but events are stored in both Elasticsearch and EventDB and managed independently. Note that EventDB has better event storage compression properties. The archive disk is managed by policies. See Creating Archive Event Retention Policy.
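The Hot and Warm node rules in this section combine a trigger (low free space or age limit) with a destination that depends on which tiers and archive option are configured. The Python sketch below models both parts. The function names and the `archive_mode` values are hypothetical labels for this example, and treating the age limit as a strict "older than" comparison is an assumption.

```python
def should_leave(free_pct, low_pct, age_days, age_limit_days):
    """True if events should leave the node: low free space or too old."""
    return free_pct < low_pct or age_days > age_limit_days


def next_stage(node, warm_defined, archive_mode):
    """Destination for events leaving a Hot or Warm node.

    archive_mode is None (no archive), "elasticsearch" (HDFS archive pulled
    from Elasticsearch) or "realtime" (real-time HDFS or NFS archive, where
    events were already copied at insert time).
    """
    if node not in ("hot", "warm"):
        raise ValueError(f"unknown node type: {node!r}")
    if node == "hot" and warm_defined:
        return "warm"
    if archive_mode == "elasticsearch":
        return "archive"
    # No archive, or a real-time archive that already holds a copy.
    return "purge"


if should_leave(free_pct=8, low_pct=10, age_days=3, age_limit_days=30):
    print(next_stage("hot", warm_defined=False, archive_mode="realtime"))  # purge
```

With a real-time archive, purging from Elasticsearch does not lose data, because the archive received its own copy when the event was inserted.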
Configuring Elasticsearch Retention Threshold
Complete these steps to configure Native Elasticsearch free space and age retention threshold:
- Go to ADMIN > Settings > Database > Online Settings.
- Select the low percentage threshold, high percentage threshold, and age under:
- Hot Node - Free Space Threshold - Events are moved to Warm nodes based on the first occurrence of one of the following:
- When the Hot node cluster disk free space falls below Low value, then events are moved to Warm nodes until the Hot node cluster disk free space reaches High value.
- If the time duration limit set under Hot Age (the Warm age phase) is met, all events under this limit are moved to Warm nodes.
- Warm Node - Free Space Threshold - Events are moved to Cold nodes based on the first occurrence of one of the following:
- When the Warm node cluster disk free space falls below Low value, then events are moved to Cold nodes until the Warm node cluster disk free space reaches High value.
- If the time duration limit set under Warm Age (the Cold age phase) is met, all events under this limit are moved to Cold nodes.
Note: In the fsiem_ilm_policy, the cold age phase is reflected as a sum of the warm age phase and cold age phase UI values.
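The note about fsiem_ilm_policy means the two age values entered in the UI are relative, while the policy stores cumulative ages. The arithmetic can be illustrated with the Python sketch below. The function name and the returned dictionary are illustrative only and do not reproduce the actual structure of fsiem_ilm_policy.

```python
def cumulative_phase_ages(hot_age_days, warm_age_days):
    """Convert the two UI age values into cumulative phase ages in days.

    hot_age_days:  UI "Hot Age" value (the warm age phase).
    warm_age_days: UI "Warm Age" value (the cold age phase).
    """
    return {
        "warm": hot_age_days,                  # events leave Hot at this age
        "cold": hot_age_days + warm_age_days,  # sum of both UI values
    }


print(cumulative_phase_ages(hot_age_days=30, warm_age_days=60))
# {'warm': 30, 'cold': 90}
```

With these example values, an event would spend up to 30 days on Hot nodes and then up to 60 days on Warm nodes, which is why the cold phase age appears as 90 days.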
Configuring HDFS Archive Threshold
Complete these steps to configure the HDFS retention threshold:
- Go to ADMIN > Settings > Database > Archive Data.
- Select the low and high percentage thresholds under Archive Threshold. If HDFS disk utilization falls below Low value, then events are purged until disk utilization reaches High value.
Creating Elasticsearch Archive Event Retention Policy
These policies specify which events are retained, and for how long, in the archive.
- Go to ADMIN > Settings > Database > Retention Policy.
- Under Offline Retention Policy, click New to create a new policy.
- Select the Organization this policy applies to.
- Enter the Time Period in days for archive retention.
- Click Save.
Implementation Notes:
- Policies are enforced only at the end of the day.
- If an event has remained in the archive for the duration specified in the event retention policy, then the event is purged at the end of the day.