Failover scenario 1: Temporary failure of the primary unit

In this scenario, the primary unit (P1) fails because of a software failure or a recoverable hardware failure (in this example, the P1 power cable is unplugged). HA logging and alert email are configured for the HA group.

When the secondary unit (S2) detects that P1 has failed, S2 becomes the new primary unit and continues processing email.

5. S2 sends an alert email similar to the following, indicating that S2 has determined that P1 has failed and that S2 is switching its effective HA operating mode to master.

This is the HA machine at 172.16.5.11.

The following event has occurred
‘MASTER heartbeat disappeared’
The state changed from ‘SLAVE’ to ‘MASTER’

6. S2 records event log messages (among others) indicating that S2 has determined that P1 has failed and that S2 is switching its effective HA operating mode to master.

After P1 recovers from the hardware failure, what happens next to the HA group depends on P1’s HA On failure settings under System > High Availability > Configuration.

P1 will not process email or join the HA group until you manually select the effective HA operating mode (see “click HERE to restart the HA system” and “click HERE to restore configured operating mode”).

On recovery, P1’s effective HA operating mode resumes its configured master role. This also means that S2 needs to give back the master role to P1. This behavior may be useful if the cause of failure is temporary and rare, but may cause problems if the cause of failure is permanent or persistent.

In the case, the S2 will send out another alert email similar to the following:

This is the HA machine at 172.16.5.11.

The following event has occurred
‘SLAVE asks us to switch roles (recovery after a restart)
The state changed from ‘MASTER’ to ‘SLAVE’

After recovery, P1 also sends out an alert email similar to the following:

This is the HA machine at 172.16.5.10.

The following critical event was detected
The system was shutdown!

On recovery, P1’s effective HA operating mode becomes slave, and S2 continues to assume the master role. P1 then synchronizes the content of its MTA queue directories with the current master unit, S2. S2 can then deliver email that existed in P1’s MTA queue directory at the time of the failover. For information on manually restoring the FortiMail unit to acting in its configured HA mode of operation, see “click HERE to restore configured operating mode”.