You can determine whether a Processing Server receives any captured data from a PCA by reviewing the Pipeline Status tab in TMS.
Before you begin troubleshooting PCA, you should review and verify that you completed the following configuration steps successfully:
- After the PCA is installed, more configuration may be required to effectively capture all required data types and to tune the PCA for capture in your environment.
- You should also review and verify that your PCA was properly configured when initially installed.
To determine whether a Processing Server is receiving any captured data from a PCA:
- Log in to the Portal as an administrator.
- From the menu, navigate to TMS.
- Click the Pipeline Status tab.
- Look for a connection that is labeled with the same name or IP address as the PCA and shows nonzero page views and bytes of received data. If you find such a connection, then the Processing Server is receiving data from the PCA.
Otherwise, proceed with the rest of this solution:
- Check the current statistics on the Summary (home) page of the PCA Web UI.
- After the page auto-refreshes to show approximately the past 15 seconds of activity, verify whether it reports a nonzero number of Hits and Packets.
- If there are zero hits, continue to the next part of this solution.
- Run the following TCPDump command from the PCA command line. Assuming your Web servers are operating on the standard port 80, the command is:
tcpdump -n -i <NIC> port 80
- Run the command for a sufficient interval to determine that your Web server IP addresses appear on both the left and right sides of the > character in the output lines. If they do not, tell the network team the side of the > sign on which the Web server IP addresses do not appear.
- If there is bidirectional port 80 traffic from your Web servers being seen on one or more of the PCA network interfaces, then further diagnosis of the traffic is necessary.
- Use TCPDump to write some of the network traffic to a file with the following command, which writes binary format data to output_file_name:
tcpdump -n -i <NIC> -s0 -w output_file_name port 80
- Open output_file_name and analyze it for anomalies such as missing packets using the open source Wireshark protocol analyzer (http://wireshark.org/) or an equivalent tool. If it is not clear what you are seeing, Tealeaf can help with this diagnosis, either through a Live Meeting session or by getting the dump file from you if it does not contain any user's personal information.
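The direction check described above can be automated with a small filter. The following is a minimal sketch, assuming the usual `tcpdump -n` line layout for IPv4 traffic (timestamp, `IP`, source, `>`, destination followed by a colon); the function name `count_directions` and the sample IP address are hypothetical.

```shell
# Count how often a Web server IP appears on each side of ">" in
# `tcpdump -n` output read from stdin.
# $1: Web server IP address (any value shown in examples is a placeholder)
count_directions() {
  awk -v ip="$1" '
    $2 == "IP" && $4 == ">" {
      src = $3; dst = $5
      sub(/:$/, "", dst)           # drop the colon that follows the destination
      sub(/\.[0-9]+$/, "", src)    # drop the port suffix from the source
      sub(/\.[0-9]+$/, "", dst)    # drop the port suffix from the destination
      if (src == ip) from++
      if (dst == ip) to++
    }
    END { printf "from_server=%d to_server=%d\n", from, to }
  '
}
```

To use it, pipe a bounded capture into the function, for example the port 80 command above with `-c 1000` added so that tcpdump exits after 1000 packets, followed by `| count_directions 10.0.0.5`. A zero on either side suggests that the PCA interface is seeing traffic in only one direction.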
Managing PCA message logging
You can manage PCA message logging so that the "messages" file no longer receives passive capture messages on Red Hat Linux™.
The default syslog.conf causes /var/log/messages to receive all log notice messages, including the ones from passive capture.
The following procedure describes how to change syslog.conf so that the "messages" file no longer receives passive capture messages on Red Hat Linux.
To change syslog.conf:
- Edit the /var/log/messages line in the /etc/syslog.conf file:
*.info;mail.none;authpriv.none;cron.none;local0.none /var/log/messages
- Include the local0.none setting to prevent the passive capture log messages from being written to /var/log/messages.
These messages continue to be written to the Tealeaf specific capture.log file.
Ultimately, the presence of many messages from passive capture in either of these log files suggests a problem with the input data coming into the CX Passive Capture Application servers (PCA servers) capture NICs. The above procedure is only a means of eliminating redundant logging of the passive capture messages; corrective action on the input data stream to the PCA is likely still required.
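After editing, you may want to confirm that the /var/log/messages rule really carries the local0.none selector. The following is a minimal sketch of such a check; the function name `messages_excludes_local0` is hypothetical, and the pattern assumes the rule ends with the plain path /var/log/messages as in the line shown in the procedure.

```shell
# Succeeds (exit status 0) if the uncommented /var/log/messages rule in the
# given syslog configuration file contains local0.none.
# $1: path to the configuration file, for example /etc/syslog.conf
messages_excludes_local0() {
  grep -E '^[^#].*[[:space:]]/var/log/messages[[:space:]]*$' "$1" |
    grep -q 'local0\.none'
}
```

For example, `messages_excludes_local0 /etc/syslog.conf && echo "local0 excluded"` prints the message only when the setting is present.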
ReqCancelled=Client hits
If your Tealeaf system is recording a large number of ReqCancelled=Client requests, 5% or more of the hits scattered randomly through a session might be marked ReqCancelled=Client, or a concentrated number of ReqCancelled=Client hits might occur in immediate proximity to each other.
Troubleshooting this issue involves the use of the tcpdump command. Running tcpdump to a file can produce large files quickly. To make the best use of tcpdump and of the Support resources that analyze the data, it is important to be able to reproduce the sessions that have a large number of ReqCancelled=Client hits. Before you start to record dump files for analysis, you should be able to reliably produce the problem behavior. So, a first step is some investigation into how to reliably reproduce ReqCancelled hits.
- If you have the Data Extractor or cxConnect for Data Analysis in place, ask the BI analysis team to run some queries against the Tealeaf data to determine the following:
  - The top 10 URLs on which ReqCancelled is occurring as a percentage of total hits.
  - The top 10 hit numbers within a session, particularly if it is occurring more often in the beginning of a session.
  - The top 3 hours of the day when ReqCancelled is occurring.
  - Whether ReqCancelled occurs more often during a GET or POST operation.
  - Is there a specific server or data farm on which it occurs most often?
  - Is there a specific proxy or load balancer that causes these (analysis of the HTTP VIA REQ field)?
- If there is no Data Extractor, the analysis is more complicated and can involve the following steps:
  - Add the following fields to the RTA.ini rule that specifies more fields to index, if not already present in the list.
    - ReqCancelled
    - HTTP VIA, if it is provided in the REQs captured by Tealeaf. Some networks do not have this field, but networks involving physical load balancers or proxies will.
  - Run data collection as usual for at least one business day.
  - Manually use searches to get the same information specified in case 1 above. After the analysis has identified the specific conditions most likely to cause sessions with high numbers of ReqCancelled=Client, it is time to record a data dump.
You are now ready to record a data dump.
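If the hit data can be exported to a flat file, the first query in the list above (top 10 URLs by percentage of ReqCancelled hits) can be approximated with standard tools. The following is a minimal sketch under a loud assumption: the input is a tab-separated file with the URL in the first column and the ReqCancelled value in the second. That layout and the function name `top_cancelled_urls` are hypothetical and are not a Tealeaf export format.

```shell
# Read tab-separated lines "URL<TAB>ReqCancelled value" from stdin and print
# up to 10 URLs with the highest percentage of ReqCancelled=Client hits.
# The input layout is an assumption made for this sketch.
top_cancelled_urls() {
  LC_ALL=C awk -F '\t' '
    { total[$1]++ }                      # every line counts as one hit
    $2 == "Client" { cancelled[$1]++ }   # hits flagged ReqCancelled=Client
    END {
      for (url in total)
        printf "%.1f\t%s\n", 100 * cancelled[url] / total[url], url
    }
  ' | LC_ALL=C sort -rn | head -n 10
}
```

Each output line holds the percentage followed by the URL, highest percentage first.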
Identifying the root cause of ReqCancelled=Client hits by recording a data dump
You can record a data dump to identify the root cause of ReqCancelled=Client hits.
- As closely as possible, you should replicate conditions that are known to most likely cause ReqCancelled=Client hits. You should verify that you can access any specific pages or send the requests to any specific Web servers or proxies. Complete a few test runs, drive specific test sessions, and use the captured data to verify that your test sessions still have a high number of ReqCancelled=Client pages.
- Prepare a time for making the real tcpdump recording.
  - If possible, schedule it for a quieter time of the day.
  - If the tcpdump data contains only a few sessions, that will make it much easier to spot the problem.
  - You want to run tcpdump for a short period, not more than 5 minutes if possible. By this time, you have created a repeatable test case that is guaranteed to cause the behavior quickly, and you will not need to run tcpdump for a long period.
- Log in to the PCA using SSH and set up the tcpdump command, specifying as many restrictive conditions as you can. For example, if you know the test case goes against a specific Web server, specify to tcpdump to listen only to that IP address.
- Start the tcpdump command.
- Run the test.
- Stop the TCPDump command using Ctrl-C.
- Find the session in the Canister.
  - Verify there are ReqCancelled=Client hits in the sessions.
  - Save the session as a .tls file. Remember to run Get Images first.
- If the tcpdump can be guaranteed not to contain any user's sensitive information, you can FTP the tcpdump files to Support for analysis.
If the tcpdump files contain production data and there is a chance that it may contain sensitive or personal information, contact Support to arrange a time for remote analysis of the data.
- After you have the captured session and the corresponding tcpdump of the raw traffic during the same slice of time that the session was captured, Tealeaf engineering should be able to help identify the root cause of the ReqCancelled=Client hits.
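A restricted capture of the kind described in the steps above can be prepared ahead of the test. The following is a minimal sketch that only prints the command so that it can be reviewed first. The function name `build_capture_cmd` and all argument values are hypothetical, and the use of `timeout` assumes GNU coreutils is installed on the PCA server; where it is not, stop tcpdump with Ctrl-C as described.

```shell
# Print a tcpdump command line restricted to one Web server on port 80 and
# limited to five minutes (300 seconds). Nothing is executed here.
# $1: capture interface, $2: Web server IP, $3: output file (all placeholders)
build_capture_cmd() {
  printf 'timeout 300 tcpdump -n -i %s -s0 -w %s host %s and port 80\n' \
    "$1" "$3" "$2"
}
```

For example, `build_capture_cmd eth1 10.0.0.5 /tmp/reqcancelled.pcap` prints a command that listens only to traffic to or from that host, which keeps the dump file small and easier to analyze.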
Tealeaf queue fails to start and capture is disabled
There are several things you can do when the Tealeaf queue fails to start and capture is disabled.
- If capture fails to be properly initialized, check the PCA capture.log for a line similar to the following:
Sep 3 15:33:51 tealeaf-dev reassd[15921]: TL Queue system failed to create (-10).
- If the above line appears in capture.log, verify the following settings with the listed commands. The expected answer is listed below each command:
  - net.core.rmem_max:
    sysctl -n net.core.rmem_max
    50000000
  - net.core.rmem_default:
    sysctl -n net.core.rmem_default
    50000000
  - kernel.shmmax:
    sysctl -n kernel.shmmax
    209715200
- If the numbers displayed on your screen do not match the expected values, you can reconfigure these settings with the commands listed below:
sysctl -w net.core.rmem_max=50000000
sysctl -w net.core.rmem_default=50000000
sysctl -w kernel.shmmax=209715200
touch /usr/local/ctccap/var/startup
chown ctccap:ctccap /usr/local/ctccap/var/startup
chmod 644 /usr/local/ctccap/var/startup
When you restart the PCA, the TL Queue should initialize, and PCA capture should commence.
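The three checks above can be run in one pass. The following is a minimal sketch that compares each kernel setting with the expected value listed in this section; the function names `check_setting` and `check_pca_kernel_settings` are hypothetical, and the sketch only reads values and changes nothing.

```shell
# Compare one setting with its expected value.
# $1: setting name, $2: expected value, $3: actual value
check_setting() {
  if [ "$3" = "$2" ]; then
    echo "OK $1=$3"
  else
    echo "MISMATCH $1: expected $2, got $3"
    return 1
  fi
}

# Read the three settings with sysctl -n and report any that differ.
# Returns nonzero if at least one setting does not match.
check_pca_kernel_settings() {
  rc=0
  while read -r key expected; do
    actual=$(sysctl -n "$key" 2>/dev/null)
    check_setting "$key" "$expected" "$actual" || rc=1
  done <<EOF
net.core.rmem_max 50000000
net.core.rmem_default 50000000
kernel.shmmax 209715200
EOF
  return $rc
}
```

Run `check_pca_kernel_settings` on the PCA server; any MISMATCH line points to a setting to reconfigure with the sysctl -w commands listed above.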
Replacing expired SSL certificate on the PCA
You can replace an expired SSL certificate on the PCA. If an SSL certificate was added to the PCA to encrypt its Web Console, that certificate eventually expires. At expiration, a new certificate must be added to the PCA.
To add a new SSL certificate to PCA:
- Follow the instructions in the "How to encrypt the PCA Console" solution on the Support website.
- Add the certificate to the certificate store on the PortalStatus server under the security context of the user running PortalStatus as a Windows™ Scheduled Task.
If the certificate was not upgraded or is not added to the certificate store, the PortalStatus report emails have the message "Unable to connect to the Passive Capture server, server may be down or certificate is not installed."
PCA could not create reveal object
In some cases, the private keys generated by the old PCA build cannot be validated by the new build.
After upgrading, you might see an error message similar to the following one in the PCA capture log. The PCA may fail to start.
Oct 12 12:05:03 sh005 reassd[4763]: Couldn't create reveal object: 1
The basic solution is to remove the current PTL files from the appropriate directory, start the PCA to localize the problem to the PTL files, and then regenerate them from their source.
- Log in to the PCA server.
- Navigate to the following directory:
/usr/local/ctccap/etc/capturekeys
- Move the .ptl file or files in the directory to a location outside of the PCA installation.
- Comment out any capture keys that are listed in ctc-conf.xml.
  - Open /usr/local/ctccap/etc/ctc-conf.xml in a text editor and comment out all the <CaptureKey> nodes and their children. These nodes are children of the CaptureKeys node, which should remain enabled in the file.
    - To comment out a section of the ctc-conf.xml file, use HTML style comments (<!-- -->).
  - Before example:
    <CaptureKeys>
      <CaptureKey>
        <Label>mykey</Label>
        <PrivateKeyFile>/usr/local/ctccap/etc/mykey.ptl</PrivateKeyFile>
      </CaptureKey>
    </CaptureKeys>
  - After example:
    <CaptureKeys>
      <!--
      <CaptureKey>
        <Label>mykey</Label>
        <PrivateKeyFile>/usr/local/ctccap/etc/mykey.ptl</PrivateKeyFile>
      </CaptureKey>
      -->
    </CaptureKeys>
- Restart the PCA:
tealeaf restart
- When the PCA has restarted, verify through the Web Console that all PCA processes are working and that data is being passed to the appropriate targets.
- If PCA operations have been verified, then the problem has been localized to the problematic .ptl keys.
  - If the PCA still fails to start, the problem may be elsewhere. You should retain the moved .ptl files until you can troubleshoot the problem completely.
- After you have regenerated the PTL keys, store them in the directory listed above.
  - If the PTL keys are saved in the same location with the same names as the originals, uncomment the CaptureKey nodes in the ctc-conf.xml file.
  - If a new location and/or filename is used, the .ptl files can be added through the PCA Web Console or placed in the capturekeys directory to be automatically loaded.
- Restart the PCA.
- If the PCA is able to decrypt SSL traffic, then the PTL files generated by the old build and moved out of the directory can be deleted.
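The step that moves the .ptl files out of the capturekeys directory can be scripted so that nothing is deleted by accident. The following is a minimal sketch; the function name `move_ptl_files` and the backup directory in the usage note are hypothetical.

```shell
# Move every .ptl file from one directory to another and report each move.
# Other files in the source directory are left untouched.
# $1: source directory, $2: destination directory (created if missing)
move_ptl_files() {
  src=$1
  dest=$2
  mkdir -p "$dest" || return 1
  for f in "$src"/*.ptl; do
    [ -e "$f" ] || continue        # no .ptl files: the pattern stays literal
    mv -- "$f" "$dest"/ && echo "moved $(basename "$f")"
  done
}
```

For example, `move_ptl_files /usr/local/ctccap/etc/capturekeys /root/ptl-backup` keeps the old keys available until the PCA has been verified with the regenerated ones.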
Freeing up PCA disk space
On occasion, the /usr partition can fill up, requiring you to free up PCA disk space.
By default, the Tealeaf CX Passive Capture Application is installed on the /usr partition. Through various messages, you might receive indication that the partition is full.
To verify that the partition is full and take measures to free up disk space:
- Verify that the /usr partition is running out of space. To verify disk space on all available partitions, run the following command on the Linux server hosting the PCA:
df -h
- Verify the disk space available on the /usr partition.
- On the server, navigate to /usr/local/ctccap/bin-debug. Look for files whose name begins with core. These core dump files can grow large and should be deleted or moved to another location for issue resolution.
  - To search for all core dump files, navigate to the ctccap directory and run the following command:
    find /usr/local/ctccap/ -name "core*" -print
  - Where possible, remove these files to free up disk space.
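To decide which core dump files are worth removing first, it can help to see their sizes. The following is a minimal sketch that extends the find command above with du and sort; the function name `list_core_files` is hypothetical. Because the pattern matches any file whose name begins with core, review the list before deleting anything.

```shell
# List regular files whose names begin with "core" under a directory,
# largest first, with sizes in kilobytes.
# $1: directory to search, for example /usr/local/ctccap
list_core_files() {
  find "$1" -type f -name 'core*' -exec du -k {} + | sort -rn
}
```

For example, `list_core_files /usr/local/ctccap` prints one line per file with its size in the first column.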
PCA fails to start after adding a network interface card
After you add a network interface card (NIC), the PCA might fail to start.
If this happens, an error message similar to the following might display.
Apr 13 10:27:09 tealeaf2 deliverd[5757]: Ending main loop with 0.
Apr 13 10:27:09 tealeaf2 deliverd[5757]: main(), Exiting with 0
Apr 13 10:27:09 tealeaf2 captured[5740]: Restarting too rapidly (0 seconds).
Shutting down.
Apr 13 10:28:23 tealeaf2 tealeaf: info: Starting:
/usr/local/ctccap/bin-debug/failoverd -q
Apr 13 10:28:23 tealeaf2 tealeaf: pem2ptl: error: Please specify the name
of one or more PEM files to encrypt.
Apr 13 10:28:23 tealeaf2 tealeaf: info: Starting:
/usr/local/ctccap/bin-debug/captured -P
Apr 13 10:28:23 tealeaf2 captured[6173]: Captured starting:
revision 1277489920
Apr 13 10:28:23 tealeaf2 reassd[6182]: OpenSSL hw engine(0): None
Apr 13 10:28:23 tealeaf2 reassd[6182]: Couldn't create reveal object: 1
Apr 13 10:28:23 tealeaf2 reassd[6182]: Exiting
Apr 13 10:28:23 tealeaf2 captured[6174]: Caught signal (17). Restarting.
Apr 13 10:28:23 tealeaf2 reassd[6176]: OpenSSL hw engine(0): None
Apr 13 10:28:23 tealeaf2 deliverd[6184]: Ending main loop with 0.
Apr 13 10:28:23 tealeaf2 deliverd[6184]: main(), Exiting with 0
Apr 13 10:28:23 tealeaf2 reassd[6178]: OpenSSL hw engine(0): None
Apr 13 10:28:23 tealeaf2 reassd[6176]: Couldn't create reveal object: 1
Apr 13 10:28:23 tealeaf2 reassd[6176]: Exiting
Apr 13 10:28:23 tealeaf2 reassd[6178]: Couldn't create reveal object: 1
Apr 13 10:28:23 tealeaf2 reassd[6178]: Exiting
Apr 13 10:28:24 tealeaf2 captured[6174]: Restarting too rapidly (0 seconds).
Shutting down.
Apr 13 10:34:32 tealeaf2 tealeaf: info: Stopped httpd(5760).
Apr 13 10:34:32 tealeaf2 tealeaf: info: captured is not running.
Apr 13 10:34:32 tealeaf2 tealeaf: info:
Starting: /usr/local/ctccap/bin-debug/failoverd -q
Apr 13 10:34:32 tealeaf2 tealeaf: pem2ptl: error: Please specify the name
of one or more PEM files to encrypt.
Apr 13 10:34:32 tealeaf2 tealeaf: info: Starting:
/usr/local/ctccap/bin-debug/captured -P
Apr 13 10:34:32 tealeaf2 captured[9446]: Captured starting:
revision 1277489920
Apr 13 10:34:32 tealeaf2 tealeaf: info: Starting: /usr/local/ctccap/bin/httpd
Apr 13 10:34:32 tealeaf2 reassd[9449]: OpenSSL hw engine(0): None
Apr 13 10:34:32 tealeaf2 reassd[9449]: Couldn't create reveal object: 1
Apr 13 10:34:32 tealeaf2 reassd[9449]: Exiting
Apr 13 10:34:32 tealeaf2 captured[9447]: Caught signal (17). Restarting.
Apr 13 10:34:32 tealeaf2 deliverd[9458]: Ending main loop with 0.
Apr 13 10:34:32 tealeaf2 deliverd[9458]: main(), Exiting with 0
Apr 13 10:34:32 tealeaf2 captured[9447]: Restarting too rapidly (0 seconds).
Shutting down.
This issue might be caused by the PTL keys that are installed on the PCA. In some cases, these keys are encrypted using aspects of the addresses of the NICs.
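A quick way to confirm that a capture log shows this symptom is to count the reveal object failures in it. The following is a minimal sketch; the function name `count_reveal_errors` is hypothetical, and the log path is passed as an argument because its location is not stated here.

```shell
# Count the lines in a PCA capture log that report the reveal object failure.
# $1: path to the capture log file
count_reveal_errors() {
  grep -c "Couldn't create reveal object" "$1"
}
```

A nonzero count after adding a NIC is consistent with the PTL key cause described above.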