About the heartbeat and synchronization
Heartbeat and synchronization traffic consists of TCP packets transmitted between the FortiMail units in the HA group through the primary and secondary heartbeat interfaces.
 
Service monitoring traffic can also, for short periods, be used as a heartbeat. For details, see “Remote services as heartbeat”.
Heartbeat and synchronization traffic has three primary functions:
to monitor the responsiveness of the HA group members
to synchronize configuration changes from the primary unit to the secondary units
For exceptions to synchronized configuration items, see “Configuration settings that are not synchronized”.
to synchronize mail data from the primary unit to the secondary unit (active‑passive only)
Mail data consists of the FortiMail system mail directory, user home directories, and mail queue.
 
FortiGuard Antispam packages and FortiGuard Antivirus engines and definitions are not synchronized between primary and secondary units.
When the primary unit’s configuration changes, it immediately synchronizes the change to the secondary unit (or, in a config-only HA group, to the peer units) through the primary heartbeat interface. If this fails, or if you have inadvertently de-synchronized the secondary unit’s configuration, you can manually initiate synchronization. For details, see “click HERE to start a configuration/data sync”. You can also use the CLI command diagnose system ha sync on either the primary unit or the secondary unit to manually synchronize the configuration. For details, see the FortiMail CLI Reference.
During normal operation, the secondary unit expects to receive heartbeat traffic from the primary unit continuously. Loss of the heartbeat signal interrupts the HA group and, in an active-passive group, generally triggers a failover. For details, see “Failover scenario 1: Temporary failure of the primary unit”.
Exceptions include system restarts and the execute reload CLI command. In case of a system reboot or reload of the primary unit, the primary unit signals the secondary unit to wait for the primary unit to complete the restart or reload. For details, see “Failover scenario 2: System reboot or reload of the primary unit”.
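The two preceding paragraphs can be pictured as a small decision rule on the secondary unit. The following Python sketch is illustrative only and is not FortiMail code: the class name, the timeout value, and the method names are all assumptions made for the example. It shows that a lost heartbeat leads to failover unless the primary unit has announced a restart or reload.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical model of a secondary unit watching the primary's heartbeat."""

    timeout_s: float                  # assumed: seconds of silence tolerated
    last_heartbeat: float = 0.0       # time the last heartbeat arrived
    primary_restarting: bool = False  # set when the primary announces a restart

    def on_heartbeat(self, now: float) -> None:
        # Any heartbeat proves the primary is back, so clear the restart flag.
        self.last_heartbeat = now
        self.primary_restarting = False

    def on_restart_notice(self) -> None:
        # The primary signalled a reboot or reload: wait instead of failing over.
        self.primary_restarting = True

    def should_fail_over(self, now: float) -> bool:
        if self.primary_restarting:
            return False
        return (now - self.last_heartbeat) > self.timeout_s


monitor = HeartbeatMonitor(timeout_s=3.0)
monitor.on_heartbeat(now=10.0)
print(monitor.should_fail_over(now=12.0))  # False: still within the timeout
print(monitor.should_fail_over(now=14.5))  # True: heartbeat lost
monitor.on_restart_notice()
print(monitor.should_fail_over(now=20.0))  # False: primary announced a restart
```

For simplicity, this sketch waits indefinitely after a restart notice. A real unit would presumably also limit how long it waits.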
Periodically, the secondary unit checks whether the configuration on the primary unit has changed. If it has, the secondary unit pulls the changes from the primary unit, generates a new configuration, and reloads it. In this case, both the primary and secondary units send an alert email. For details, see “Failover scenario 3: System reboot or reload of the secondary unit”.
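The periodic check described above can be sketched as a compare-then-pull routine. This Python example is a simplified illustration, not the actual synchronization mechanism: the digest comparison, the function names, and the `local_only` set (standing in for settings that are not synchronized) are assumptions introduced here.

```python
import hashlib
import json


def config_digest(config: dict[str, str]) -> str:
    """Return a stable digest of a configuration mapping."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def synced_view(config: dict[str, str], local_only: frozenset[str]) -> dict[str, str]:
    """Drop the settings that stay local to each unit."""
    return {key: value for key, value in config.items() if key not in local_only}


def pull_if_changed(
    primary: dict[str, str],
    secondary: dict[str, str],
    local_only: frozenset[str] = frozenset(),
) -> bool:
    """Update the secondary from the primary when synchronized settings differ.

    Returns True when a pull happened, meaning the secondary would then
    generate and reload a new configuration.
    """
    wanted = synced_view(primary, local_only)
    if config_digest(wanted) == config_digest(synced_view(secondary, local_only)):
        return False
    kept = {key: value for key, value in secondary.items() if key in local_only}
    secondary.clear()
    secondary.update(wanted)
    secondary.update(kept)  # unsynchronized settings survive the pull
    return True


# Hypothetical setting names, used only for illustration.
primary_cfg = {"relay_policy": "strict", "unit_name": "mail-a"}
secondary_cfg = {"relay_policy": "open", "unit_name": "mail-b"}
changed = pull_if_changed(primary_cfg, secondary_cfg, frozenset({"unit_name"}))
print(changed, secondary_cfg)
```

Comparing digests rather than full configurations is one way to make the periodic check cheap. The source does not say how the units actually detect changes.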
When the heartbeat fails, behavior varies by HA mode:
Active-passive HA
A new primary unit is elected: the secondary unit becomes the new primary unit and takes over email processing. During the failover, no mail data or configuration changes are lost, but some in-progress email deliveries may be interrupted. These interrupted deliveries may need to be restarted, but most email clients and servers handle this gracefully. Additional failover behaviors may be configured. For details, see “On failure”.
 
Maintain the heartbeat connection. If the heartbeat is accidentally interrupted for an active-passive HA group, such as when a network cable is temporarily disconnected, the secondary unit will assume that the primary unit has failed, and become the new primary unit. If no failure has actually occurred, both FortiMail units will be operating as primary units simultaneously. For details on correcting this, see “click HERE to restore configured operating mode”.
Config-only HA
Each secondary unit continues to operate normally. However, with no primary unit, changes to the configuration are no longer synchronized. You must manually configure one of the secondary units to operate as the primary unit, synchronizing its changes to the remaining secondary units.
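The difference between the two modes, and the risk of two simultaneous primary units, can be summarized in a short model. This Python sketch only restates the behavior described above. The enum, the role labels, and the function names are assumptions made for illustration and do not correspond to any FortiMail interface.

```python
from enum import Enum


class HAMode(Enum):
    ACTIVE_PASSIVE = "active-passive"
    CONFIG_ONLY = "config-only"


def roles_after_heartbeat_loss(mode: HAMode, primary_still_running: bool) -> dict[str, str]:
    """Return each unit's role once the secondary stops hearing the primary."""
    former_primary = "primary" if primary_still_running else "failed"
    if mode is HAMode.ACTIVE_PASSIVE:
        # The secondary assumes the primary has failed and promotes itself.
        return {"former_primary": former_primary, "former_secondary": "primary"}
    # Config-only: the secondary keeps operating normally and is not promoted
    # automatically; an administrator must choose a new primary manually.
    return {"former_primary": former_primary, "former_secondary": "secondary"}


def has_two_primaries(roles: dict[str, str]) -> bool:
    """True when more than one unit is acting as the primary."""
    return list(roles.values()).count("primary") > 1


# A disconnected heartbeat cable: the primary is healthy but unreachable.
cable_pulled = roles_after_heartbeat_loss(HAMode.ACTIVE_PASSIVE, primary_still_running=True)
print(cable_pulled, has_two_primaries(cable_pulled))

# A real failure of the primary in a config-only group.
config_only = roles_after_heartbeat_loss(HAMode.CONFIG_ONLY, primary_still_running=False)
print(config_only, has_two_primaries(config_only))
```

The first case corresponds to the accidental interruption described in the note above, where both units end up operating as primary units until the configured operating mode is restored.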
For failover examples and steps required to restore normal operation of the HA group in each case, see “Example: Failover scenarios”.