The PCA can be configured to fail over from a master CX Passive Capture Application to a slave machine. In the event of network outage, system failover, or other interruption on the master machine, the slave machine becomes the active machine and begins capturing traffic.
If you configured failover and are experiencing difficulties, the troubleshooting tips and steps in this section can help you resolve the issue.
If you do not specify a port number when you configure a failover master or slave, then failover uses port 9866.
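When failover does not start, a quick first check is whether the peer's failover port accepts connections at all. The following is a minimal sketch, assuming that the failover service listens on TCP (suggested by the "connection refused" log message later in this section, but not confirmed here). The function name `peer_port_open` is hypothetical; only the default port 9866 comes from this section.

```python
import socket

# Default failover port stated in this section.
DEFAULT_FAILOVER_PORT = 9866


def peer_port_open(host, port=DEFAULT_FAILOVER_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Assumption: the failover service accepts plain TCP connections.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, calling `peer_port_open("10.10.100.1")` from the other PCA tests the default port; pass `port=` explicitly if you configured a different one.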
PCA failover mode requires two PCAs, designated Master and Slave. On the Failover tab in the PCA Web Console, define the Master/Slave PCA IP/Port Addresses.
To identify whether it is the Master or the Slave, the PCA examines the /etc/hosts file to find the IP address for its local hostname. It then matches that IP address against the failover Master/Slave IP entries to assign its role.
The host name that is returned by the Linux™ command hostname must be listed in the /etc/hosts file with its corresponding IP address. For example, if executing hostname returns pca01machine, then an entry similar to the following should appear in the hosts file:
10.10.100.1 pca01machine
If no match is found in the hosts file, then Failover mode fails to start. An error message appears in the capture.log file:
Both MasterAddress and SlaveAddress must be specified in configuration file.
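The hostname check described above can be reproduced outside the PCA. The following is a minimal sketch that mimics the lookup, not the PCA's actual implementation: it parses hosts-file text and returns the IP address mapped to a given host name. The function names `lookup_ip` and `local_failover_ip` are hypothetical, and the match is case-sensitive for simplicity.

```python
import socket


def lookup_ip(hosts_text, hostname):
    """Return the first IP address mapped to hostname in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split the entry on whitespace.
        fields = line.split("#", 1)[0].split()
        # A valid entry is an IP address followed by one or more names.
        if len(fields) >= 2 and hostname in fields[1:]:
            return fields[0]
    return None


def local_failover_ip(hosts_path="/etc/hosts"):
    """Look up this machine's own host name in the hosts file."""
    with open(hosts_path) as hosts_file:
        return lookup_ip(hosts_file.read(), socket.gethostname())
```

If `local_failover_ip()` returns `None`, or returns an address that matches neither the Master nor the Slave entry on the Failover tab, correct the hosts file before you restart failover.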
Determining the failover state
The PCA publishes statistics for the master and slave failover servers in various states. You can review the statistics to examine the failover states and determine the cause of failover issues.
Master PCA statistics
You can review the statistics that are published by the PCA for the master failover server in various failover states. These statistics are published in the failover section of the Statistics tab.
When the Node state statistic is set to active, the server is delivering hits to other Tealeaf servers.
The Node state is active, meaning it is delivering hits:
Value | Statistic |
---|---|
master | Node role |
active | Node state |
running | Capture state |
yes | Failover active |
The Master was forced to fail over to the Slave, and the Master was stopped:
Value | Statistic |
---|---|
master | Node role |
passive | Node state |
stopped | Capture state |
yes | Failover active |
Master PCA failover log messages
Below you can review log messages for various failover conditions for the failover master PCA server:
- These log messages appear in capture.log.
- Any capture.log message that refers to peer is a reference to the other PCA, not the local one.
Failover is disabled:
TLAPI: Failover is disabled. Delivery is always enabled.
Failover Master is in active delivery state:
TLAPI: Failover is enabled. Delivery is currently enabled.
Failover service fails over to the PCA slave machine:
failoverd: Peer node is down (connection refused).
failoverd: Peer node is alive.
failoverd: Capture has stopped. Initiating failover to peer.
failoverd: Delivery stopped.
Failover failed back to the PCA master machine:
failoverd: Requesting failback from peer.
failoverd: Delivery started.
Slave PCA statistics
You can review the statistics that are published by the PCA for the slave failover server in various failover states. These statistics are published in the failover section of the Statistics tab.
When the Node state statistic is set to active, the server is delivering hits to other Tealeaf servers.
Master is running in active Node state. Slave is in passive state (no delivery):
Value | Statistic |
---|---|
slave | Node role |
passive | Node state |
running | Capture state |
yes | Failover active |
Slave PCA is running, and Master PCA is stopped or non-existent (Failed over to Slave PCA):
Value | Statistic |
---|---|
slave | Node role |
active | Node state |
running | Capture state |
yes | Failover active |
When Slave fails back to Master, the Node state returns to passive again.
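When comparing the statistics from both machines, it can help to reduce each node to a one-line summary. The sketch below is illustrative only: it assumes you have copied the values from the Statistics tab into a dictionary keyed by the statistic names shown in the tables above. The function names are hypothetical, and the check that exactly one node is active is an inference from this section, not a documented rule.

```python
def summarize_failover(stats):
    """Summarize one node's failover statistics as a short string."""
    role = stats.get("Node role", "unknown")
    # Per this section, an active Node state means the node delivers hits.
    if stats.get("Node state") == "active":
        delivery = "delivering hits"
    else:
        delivery = "not delivering"
    capture = stats.get("Capture state", "unknown")
    return f"{role}: {delivery}, capture {capture}"


def exactly_one_active(master_stats, slave_stats):
    """Return True if exactly one of the two nodes reports an active Node state."""
    states = [master_stats.get("Node state"), slave_stats.get("Node state")]
    return states.count("active") == 1
```

If `exactly_one_active` returns `False`, either both nodes are delivering or neither is, which is a reasonable point to start reviewing the capture.log messages on each machine.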
Slave PCA failover log messages
Below you can review log messages for various failover conditions for the failover slave PCA server:
- These log messages appear in capture.log.
- Any capture.log message that refers to peer is a reference to the other PCA, not the local one.
Failover is disabled:
TLAPI: Failover is disabled. Delivery is always enabled.
Failover Slave is in passive Node state:
TLAPI: Failover is enabled. Delivery is currently disabled.
Failover Slave took control:
failoverd: Received TakeControl request from peer. Taking control.
failoverd: Delivery started.
Slave failed back to Master server:
failoverd: Received ReleaseControl request from peer. Releasing control.
failoverd: Delivery stopped.
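To trace a failover sequence in a long capture.log, you can filter for the messages listed in the master and slave sections. The following is a minimal sketch that matches on substrings of those messages only, so it makes no assumption about timestamps or other line prefixes. The labels and function names are hypothetical, chosen for illustration.

```python
# (substring of a documented log message, short label); order matters because
# the first matching substring wins.
FAILOVER_MESSAGES = [
    ("Failover is disabled", "failover disabled"),
    ("Delivery is currently enabled", "node is active"),
    ("Delivery is currently disabled", "node is passive"),
    ("Peer node is down", "peer unreachable"),
    ("Initiating failover to peer", "failing over to peer"),
    ("Requesting failback from peer", "failing back from peer"),
    ("Received TakeControl request", "taking control"),
    ("Received ReleaseControl request", "releasing control"),
    ("Delivery started", "delivery started"),
    ("Delivery stopped", "delivery stopped"),
]


def classify(line):
    """Return the label for a failover log line, or None if it is not listed."""
    for needle, label in FAILOVER_MESSAGES:
        if needle in line:
            return label
    return None


def failover_events(lines):
    """Return the labels of all recognized failover messages, in log order."""
    return [label for label in map(classify, lines) if label is not None]
```

Passing the lines of capture.log to `failover_events` yields an ordered list such as peer unreachable, failing over to peer, delivery stopped, which you can compare against the sequences shown above.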
Proper sequence for starting, stopping, and restarting failover
Start the Master-Slave failover PCAs in the following sequence:
- Start the Master PCA first.
- After the Master PCA has started successfully, start the Slave PCA.
Note: If you are experiencing issues with correct operation of failover, you should always use the command line to manually stop and start it.
Restarting Failover through the Web Console
The Failover service can be started and stopped through the Failover tab in the CX PCA Web Console.
To start and stop the Failover service through the Failover tab in the CX PCA Web Console:
- Open the CX PCA Web Console on the slave CX PCA server.
- Click the Failover tab.
- Click Restart failover.
- Click Save changes to apply your changes.
- Repeat this procedure on the master CX PCA server.
After the failover status is reset, the CX PCA Web Console displays the new failover status. For example, the master CX PCA reports the status as Failover is active (master) and the slave CX PCA reports a status of Failover is active (slave).
Stopping and starting the Failover service manually from the command line
Older versions of the PCA Web Console do not correctly accept changes to failover. When in doubt, you must manually stop and start the failover service from the command line.
To stop and start the Failover service from the command line:
- Stop the failoverd service: tealeaf stop failoverd
- Confirm failoverd was stopped and removed by running the following command: tealeaf ps
- Review capture.log messages for more information.
- Make changes as necessary.
- To start the failoverd service: tealeaf start failoverd
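If you repeat this procedure often, the stop and start steps can be wrapped in a small script. The sketch below uses only the three tealeaf commands listed above. The output format of tealeaf ps is not described in this section, so the check simply assumes that a running failoverd appears by name somewhere in that output. The function names are hypothetical.

```python
import subprocess


def run_tealeaf(*args):
    """Run a tealeaf command and return its standard output as text."""
    result = subprocess.run(["tealeaf", *args], capture_output=True, text=True)
    return result.stdout


def stop_failoverd(run=run_tealeaf):
    """Stop failoverd, then report whether 'tealeaf ps' still lists it.

    Assumption: a running failoverd appears by name in the 'tealeaf ps' output.
    """
    run("stop", "failoverd")
    return "failoverd" not in run("ps")


def start_failoverd(run=run_tealeaf):
    """Start failoverd after capture.log has been reviewed and changes made."""
    run("start", "failoverd")
```

The stop and start steps are kept as separate functions so that you can review capture.log and make configuration changes in between, as the procedure requires.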