Troubleshooting the initial cluster configuration
This section describes how to check a cluster when it first starts up to make sure that it is configured and operating correctly. This section assumes you have already configured your HA cluster.
To verify that a cluster can process traffic and react to a failure
1. Add a basic security policy configuration and send network traffic through the cluster to confirm connectivity.
For example, if the cluster is installed between the Internet and an internal network, set up a basic internal-to-external security policy that accepts all traffic. Then, from a PC on the internal network, browse to a website on the Internet or ping a server on the Internet to confirm connectivity.
2. From your management PC, start a continuous ping of the cluster, and then start a large download or establish ongoing traffic through the cluster in some other way.
3. While traffic is going through the cluster, disconnect the power from one of the cluster units.
You could also shut down or restart a cluster unit.
Traffic should continue with minimal interruption.
4. Start up the cluster unit that you disconnected.
The unit should re-join the cluster with little or no effect on traffic.
5. Disconnect a cable for one of the HA heartbeat interfaces.
The cluster should keep functioning, using the other HA heartbeat interface.
6. If you have port monitoring enabled, disconnect a network cable from a monitored interface.
Traffic should continue with minimal interruption.
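If you record the result of each ping during these tests, you can measure how long each interruption lasted rather than judging it by eye. The following Python sketch is one way to do that. It is not part of the product; the function names and the input format (a time-ordered list of timestamp and reply/no-reply pairs) are assumptions made for illustration, and how you collect the samples is up to you.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, bool]  # (timestamp in seconds, True if a reply was received)


def find_outages(samples: Iterable[Sample]) -> List[Tuple[float, float]]:
    """Return a (start, end) pair for each run of lost pings.

    start is the timestamp of the first lost ping in the run. end is the
    timestamp of the next ping that received a reply, or the timestamp of
    the last sample if the log ends while pings are still being lost.
    """
    outages = []
    start = None
    last = None
    for timestamp, replied in samples:
        if not replied and start is None:
            start = timestamp            # first lost ping: outage begins
        elif replied and start is not None:
            outages.append((start, timestamp))  # first reply: outage ends
            start = None
        last = timestamp
    if start is not None:
        outages.append((start, last))    # log ended during an outage
    return outages


def longest_outage(samples: Iterable[Sample]) -> float:
    """Return the duration in seconds of the longest outage, or 0.0 if none."""
    return max((end - start for start, end in find_outages(samples)), default=0.0)
```

Run it once for each failure test (power loss, heartbeat cable, monitored interface) and compare the longest outage with what you consider acceptable for your network.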
To verify the cluster configuration - web‑based manager
1. Log in to the cluster web‑based manager.
2. Check the system dashboard to verify that the System Information widget displays all of the cluster units.
3. Check the cluster member graphic to verify that the correct cluster unit interfaces are connected.
4. Go to System > Config > HA and verify that all of the cluster units are displayed on the cluster members list.
5. From the cluster members list, edit the primary unit (master) and verify the cluster configuration is as expected.
To troubleshoot the cluster configuration - web‑based manager
1. Connect to each cluster unit web‑based manager and verify that the HA configurations are the same.
2. If the units have the same IP address, you may need to disconnect some units from the network so that you can connect to the web‑based manager of each unit in turn.
3. If the configurations are the same, try re-entering the cluster Password on each cluster unit in case you made an error typing the password when configuring one of the cluster units.
4. Check that the correct interfaces of each cluster unit are connected.
Check the cables and interface LEDs.
Use the Unit Operation dashboard widget, system network interface list, or cluster members list to verify that each interface that should be connected actually is connected.
If Link is down, re-verify the physical connection. Try replacing network cables or switches as required.
To verify the cluster configuration - CLI
1. Log in to each cluster unit CLI.
You can use the console connection if you need to avoid the problem of units having the same IP address.
2. Enter the command get system status.
Look for the following information in the command output.
Current HA mode: a-a, master
The cluster units are operating as a cluster and you have connected to the primary unit.
Current HA mode: a-a, backup
The cluster units are operating as a cluster and you have connected to a subordinate unit.
Current HA mode: standalone
The cluster unit is not operating in HA mode.
3. Verify that the get system ha status command displays all of the cluster units.
4. Enter the get system ha command to verify that the HA configuration is correct and the same for each cluster unit.
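If you save the command output from each unit as text, you can script both checks. The following Python sketch is an illustration only: the function names are hypothetical, the Current HA mode line is assumed to look like the examples above, and the get system ha output is assumed to consist of name : value lines. Check both assumptions against the output of your own units.

```python
import re
from typing import Dict, Optional, Tuple

# Matches a line such as "Current HA mode: a-a, master".
_HA_MODE_RE = re.compile(r"^\s*Current HA mode:\s*(.+?)\s*$", re.MULTILINE)


def parse_ha_mode(status_output: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (mode, role) from saved 'get system status' output.

    role is None when the line has no role part (for example, standalone).
    Returns None if no 'Current HA mode' line is found.
    """
    match = _HA_MODE_RE.search(status_output)
    if match is None:
        return None
    parts = [part.strip() for part in match.group(1).split(",")]
    return parts[0], (parts[1] if len(parts) > 1 else None)


def describe_ha_mode(status_output: str) -> str:
    """Translate the HA mode line into the meanings listed in step 2."""
    parsed = parse_ha_mode(status_output)
    if parsed is None:
        return "HA mode not reported"
    mode, role = parsed
    if mode == "standalone":
        return "not operating in HA mode"
    if role == "master":
        return "operating in a cluster, primary unit"
    if role == "backup":
        return "operating in a cluster, subordinate unit"
    return "unrecognized HA mode: %s" % mode


def parse_settings(get_output: str) -> Dict[str, str]:
    """Read 'name : value' lines into a dict (assumed output layout)."""
    settings = {}
    for line in get_output.splitlines():
        name, separator, value = line.partition(":")
        if separator:
            settings[name.strip()] = value.strip()
    return settings


def diff_settings(unit_a: str, unit_b: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return the settings whose values differ between two units.

    A setting missing from one unit is reported with None for that unit.
    """
    a, b = parse_settings(unit_a), parse_settings(unit_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

An empty result from diff_settings means the two saved outputs contain the same settings. Treat any differences as a list of settings to review rather than as a verdict.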
To troubleshoot the cluster configuration - CLI
1. Try using the following command to re-enter the cluster password on each cluster unit in case you made an error typing the password when configuring one of the cluster units.
config system ha
set password <password>
end
2. Check that the correct interfaces of each cluster unit are connected.
Check the cables and interface LEDs.
Use the get hardware nic <interface_name> command to confirm that each interface is connected. If the interface is connected, the command output should contain a Link: up entry similar to the following:
get hardware nic port1
.
.
.
Link: up
.
.
.
If Link is down, re-verify the physical connection. Try replacing network cables or switches as required.
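If you need to check many interfaces, you can save the get hardware nic output for each one and look for the Link entry with a script. The following Python sketch is a minimal example; the function name is hypothetical, and it assumes only that the output contains a line beginning with Link: followed by the state, as in the example above.

```python
import re
from typing import Optional

# Matches a line such as "Link: up" and captures the state word.
_LINK_RE = re.compile(r"^\s*Link\s*:\s*(\S+)", re.MULTILINE | re.IGNORECASE)


def link_state(nic_output: str) -> Optional[str]:
    """Return the link state (for example 'up' or 'down') in lower case.

    Returns None if the output contains no Link entry, which usually means
    the saved text is incomplete or is not 'get hardware nic' output.
    """
    match = _LINK_RE.search(nic_output)
    return match.group(1).lower() if match else None
```

Any interface for which link_state does not return up is a candidate for the cable and switch checks described above.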