Collector High Availability Configuration
A FortiSIEM Collector communicates with end devices and cloud services to collect logs, performance monitoring metrics, configurations, and other data. Currently, if a Collector goes down:
- Logs sent to this Collector must be manually resent to another Collector, unless there is a Load Balancer in front of the Collectors.
- Event pulling by that Collector stops until a new Collector is onboarded and discovery is repeated to create an event pulling job for the new Collector.
This release adds the ability to deploy Collectors in High Availability (HA) mode, which allows Collector data collection to continue uninterrupted even when a Collector fails. This feature works differently depending on the environment:
- Case 1: On premise and AWS deployments via VRRP
- Case 2: On Azure and GCP deployments via Load Balancer
Case 1: On premise and AWS deployments via VRRP
If your Collectors are deployed on on-premise hypervisors or on AWS, or if they are hardware appliances, then High Availability (HA) is enabled via Virtual Router Redundancy Protocol (VRRP). You need to create a Collector HA Cluster with one Leader, one or more Followers, and a Virtual IP (VIP) that is always owned by the Leader.
During normal operations:
- Logs sent to the VIP are handled by the Leader Collector (which owns the VIP).
- The FortiSIEM Supervisor node distributes event pulling and performance monitoring jobs among all Collectors in the Cluster.
If the Leader Collector goes down:
- The Follower node with the highest priority becomes the Leader and takes ownership of the VIP. No human intervention is needed.
- Logs previously sent to the (failed) Leader Collector automatically reach the new Leader Collector.
- The FortiSIEM Supervisor node automatically redistributes the event pulling and performance monitoring jobs previously assigned to the failed Leader Collector to the other Collectors in the HA Cluster.
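The Leader takeover described above can be illustrated with a small simulation. This is only a sketch of the selection rule (the highest-priority working Follower takes the VIP), not FortiSIEM's or VRRP's actual implementation. The `Collector` class, the `elect_leader` function, the collector names, and the priority values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Collector:
    name: str
    priority: int          # assumption: a higher value wins the election
    alive: bool = True
    is_leader: bool = False


def elect_leader(cluster):
    """Return the Collector that should own the VIP, promoting a Follower if needed."""
    current = next((c for c in cluster if c.is_leader and c.alive), None)
    if current is not None:
        return current     # a healthy Leader keeps the VIP
    for c in cluster:
        c.is_leader = False
    candidates = [c for c in cluster if c.alive]
    if not candidates:
        return None        # no working Collector is left to own the VIP
    new_leader = max(candidates, key=lambda c: c.priority)
    new_leader.is_leader = True
    return new_leader


# Hypothetical three-node HA Cluster; collector-a starts as the Leader.
cluster = [
    Collector("collector-a", priority=200, is_leader=True),
    Collector("collector-b", priority=150),
    Collector("collector-c", priority=100),
]

cluster[0].alive = False            # the Leader fails
leader = elect_leader(cluster)
print(leader.name)                  # collector-b
```

Because devices keep sending logs to the VIP rather than to an individual Collector address, no change is needed on the sending side once the new Leader owns the VIP.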
If a Follower Collector goes down:
- The App Server redistributes the event pulling and performance monitoring jobs assigned to the failed Collector to the other Collectors in the HA Cluster.
If a failed Collector comes back up, then it will stay a Follower, but the event pulling jobs will be re-distributed among all the working Collectors in the HA Cluster.
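The job handling on failure and recovery can be sketched as follows. The source does not say how jobs are spread across Collectors in the VRRP case, so the round-robin choice here is an assumption made for illustration. The function names, job names, and collector names are hypothetical.

```python
from itertools import cycle


def reassign_failed(assignments, failed, working):
    """Move only the failed Collector's jobs onto the working Collectors."""
    if not working:
        raise ValueError("no working Collector left in the HA Cluster")
    targets = cycle(working)
    return {
        job: (next(targets) if owner == failed else owner)
        for job, owner in assignments.items()
    }


def rebalance(jobs, working):
    """Spread all jobs across the working Collectors (assumed round robin)."""
    if not working:
        raise ValueError("no working Collector left in the HA Cluster")
    targets = cycle(working)
    return {job: next(targets) for job in jobs}


# Hypothetical job-to-Collector assignments before any failure.
assignments = {
    "pull-job-1": "c1",
    "pull-job-2": "c2",
    "perf-job-1": "c3",
    "pull-job-3": "c1",
}

# c1 fails: only its jobs move, split between c2 and c3.
after_failure = reassign_failed(assignments, "c1", ["c2", "c3"])

# c1 comes back as a Follower: jobs are spread over all working Collectors again.
after_recovery = rebalance(list(assignments), ["c1", "c2", "c3"])

print(after_failure)
print(after_recovery)
```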
Case 2: On Azure and GCP deployments via Load Balancer
If your Collectors are deployed on Azure or GCP, then High Availability is achieved via load balancing mechanisms. You need to create a Collector HA Cluster with a Load Balancer in front of the Collectors. The disadvantage of this approach is that the customer needs to deploy a Load Balancer. However,
During normal operations:
- Logs sent to the Load Balancer are distributed among the Collectors in the Cluster.
- The FortiSIEM Supervisor node distributes event pulling and performance monitoring jobs among all Collectors in the Cluster.
- Job distribution is handled via Round Robin.
If a Collector goes down, then:
- The Load Balancer skips the failed Collector and distributes logs among the other Collectors.
- The FortiSIEM Supervisor node automatically redistributes the event pulling and performance monitoring jobs previously assigned to the failed Collector to the other Collectors in the Cluster.
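The Load Balancer behavior above can be sketched with a minimal round-robin balancer that skips Collectors marked as down. The source states Round Robin only for job distribution; using it for log distribution here is an assumption, since the actual algorithm depends on how the Azure or GCP Load Balancer is configured. The `RoundRobinBalancer` class and the collector names are hypothetical.

```python
class RoundRobinBalancer:
    """Toy model of a Load Balancer in front of a Collector HA Cluster."""

    def __init__(self, collectors):
        self.collectors = list(collectors)
        self.healthy = set(self.collectors)
        self._next = 0

    def mark_down(self, name):
        # In a real deployment this state would come from health probes.
        self.healthy.discard(name)

    def mark_up(self, name):
        if name in self.collectors:
            self.healthy.add(name)

    def route(self):
        """Return the next healthy Collector, skipping failed ones."""
        for _ in range(len(self.collectors)):
            candidate = self.collectors[self._next]
            self._next = (self._next + 1) % len(self.collectors)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy Collector available")


lb = RoundRobinBalancer(["c1", "c2", "c3"])
normal = [lb.route() for _ in range(3)]       # ['c1', 'c2', 'c3']

lb.mark_down("c2")                            # c2 fails
degraded = [lb.route() for _ in range(4)]     # ['c1', 'c3', 'c1', 'c3']

print(normal)
print(degraded)
```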
Follow the appropriate configuration for your environment: