This section answers questions that Support is frequently asked about troubleshooting the Tealeaf system.
I upgraded my operating system and now some Tealeaf services are not starting
When you upgrade your operating system, some Tealeaf services may fail to start. To solve this problem, de-register and re-register the affected services.
I changed my privacy rules to scrape or block data, but they do not seem to be working. What should I check?
What to look for first
- Did you restart the Transport service? This must be completed on each processing server where privacy changes occurred. If you modified privacy using our TMS functionality, it should offer to restart this service while applying the configuration file change. This action can be validated by checking the Windows Event Application Log on the server(s) in question.
- If the service restarted successfully but the rule is not working, you will most likely find an error logged. Each time the Transport Service restarts, the loading of the privacy rules is recorded in a Windows Event Application Log entry. Details of any problem should be available there.
If you file a case with Support
- Report any visible error messages in relation to the privacy rules as they were loaded by the Transport Service restart.
- Attach the following:
- Privacy.cfg
- Example session data (can be in .txt document format) along with an explanation of what you are trying to do
The Transport Service will not start or will not stay running
What to look for first
- Have any changes been made immediately prior? Especially in the case of privacy.cfg changes, try disabling new rules and try again.
- Review the Windows Event Application Log on the server(s) in question.
If you file a case with Support
- Report any visible error messages in relation to the privacy rules as they were loaded by the Transport Service restart.
- Attach the Windows Application Event Log (exported as an EVT file, zipped)
- If there is a TealeafCaptureSocket.exe-(vW.X.YY.ZZZZ)-YYYYMMDD-HHMMSS*.dmp file in the logs folder, include it.
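The dump files mentioned above can be located with a short script. This is a minimal sketch: the function name and the `logs_dir` argument are hypothetical, and the glob pattern only assumes that dump names begin with `TealeafCaptureSocket.exe-` and end in `.dmp`, as in the pattern shown above.

```python
from pathlib import Path


def find_capture_socket_dumps(logs_dir):
    """Return TealeafCaptureSocket dump files found in logs_dir, sorted by name.

    logs_dir is whatever folder holds your Tealeaf logs (an assumption here;
    this sketch does not know where your installation keeps them).
    """
    # The wildcard covers the version and timestamp portions of the file name.
    return sorted(Path(logs_dir).glob("TealeafCaptureSocket.exe-*.dmp"))
```

Any paths returned can be zipped and attached to the case along with the exported event log.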
I received a portal status error: "Sessions waiting to be indexed broken its threshold"
What to look for first
- If you did not immediately respond to this error message, log in to the Portal, browse to the System Status Canister report, and select the canister that is reporting indexing issues. Check the "Sessions waiting for archive to disk" value. If this number is very high and not consistently decreasing, it indicates a problem.
- If your canister/indexes are stored on a SAN as opposed to local storage, have your SAN team confirm disk throughput is as expected. In many cases, the root cause of this issue is inadequate disk performance.
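If you note the "Sessions waiting for archive to disk" value several times over a period, the "very high and not consistently decreasing" test can be expressed as a small check. This is only a sketch: the function name is hypothetical, the readings are values you record yourself, and `high_water_mark` is a number you choose for your own environment, not a Tealeaf setting.

```python
def backlog_indicates_problem(readings, high_water_mark):
    """Return True if the backlog is high and not consistently decreasing.

    readings: values noted from the report, oldest first.
    high_water_mark: what you consider "very high" (an assumption you supply).
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings to judge the trend")
    # Consistently decreasing means every reading is lower than the one before.
    decreasing = all(later < earlier
                     for earlier, later in zip(readings, readings[1:]))
    return readings[-1] > high_water_mark and not decreasing
```

A backlog that is large but shrinking with every reading returns False, which matches the guidance that a steadily draining queue is not by itself a problem.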
If you file a case with Support
Attach the following logs:
- CSS_1966_<CANISTERNAME>CS<DAYSDATE>.txt
- CSS_1966_<CANISTERNAME>DL<DAYSDATE>.txt
where <CANISTERNAME> is the name of your canister server and <DAYSDATE> is the date (yyyymmdd) on which you are having the issue.
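To avoid typing mistakes when looking for these two logs, the file names can be built from the canister name and the date. The helper below is a hypothetical sketch that follows the pattern exactly as written above, with no separator between the canister name, the CS or DL marker, and the yyyymmdd date.

```python
from datetime import date


def canister_log_names(canister_name, day):
    """Build the two log file names for one canister and one day."""
    stamp = day.strftime("%Y%m%d")  # yyyymmdd, as stated in the text
    return [f"CSS_1966_{canister_name}{marker}{stamp}.txt"
            for marker in ("CS", "DL")]
```

For example, `canister_log_names("CAN01", date(2024, 3, 5))` returns `CSS_1966_CAN01CS20240305.txt` and `CSS_1966_CAN01DL20240305.txt`, where `CAN01` is a made-up canister name.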
I have errors indicating that indexing failed, crashed, stopped, or deleted while cleaning up session data.
What to look for first
- Fortunately, indexes are one of the recoverable file types in the Tealeaf system and they can be re-created. To do this:
- Log in to the canister in question, open a browser, and navigate to
http://localhost:19000/CIC
(if prompted for authentication, the user and password are ssadmin/ssadmin).
- If you are uncertain whether sessions are missing from the index, you can compare the two columns to ensure the session count and indexed document count match.
- If the columns match but you appear to be having issues or suspect index corruption, you can perform a "check indexes" from the top menu.
- Finally, if you'd like the system to do a full audit of indexes and repair any problems that it finds, select "check and fix". Depending on the number of sessions and the number of days' data you collect, this could take up to a full day to complete.
If you file a case with Support
- If you performed the steps above and the indexing service does not continue running, gather any relevant errors from the Windows Event Application log and attach them to the support case.
Searching for active sessions returns no results
What to look for first
- If your search returns no results, first change Limit Hits To: <No Limit>. If results return, it means your method of sessionization was broken. Confirm that the expected cookie is not being found by checking that the TLMERGEID is blank in one of the request buffers. The cookie being referenced should be in parentheses. At this point, you must identify another unique cookie that can stitch together pages or determine where the existing cookie has gone.
- If there are still no results, you should open the Pipeline Status utility either locally on the server or in TMS (available in later versions).
- If you confirm that traffic is continuing to flow through the pipeline (usually the canister session agent is last), then sessions should still be making it into the active canister. At this point, there is a possible permission issue. Note your form of authentication (Portal or NT) and contact Support.
If you file a case with Support
- Describe the steps that were previously attempted.
- Attach a TLS file that contains a representative session with the problem session IDs.
Searching completed sessions returns no results
What to look for first
- Check the completed session template for "available dates". The dates available should equal the total number of days you expect to retain production data.
- If you have access to the Canister, log in, browse to
http://localhost:19000/CIC
(user and password are ssadmin/ssadmin), and compare the two columns to ensure the session count and indexed document count match. If the indexes are unavailable, the sessions are not searchable. If the indexes are gone, it is possible to re-create them with the check and fix command. See: I have errors indicating that indexing failed, crashed, stopped, or deleted while cleaning up session data.
- If indexes are available and days appear to be present, confirm that there are no data segmentation filters in place that would exclude content. To check, run
searchconfig
from the Windows - Start - Run menu and select modify under Domain Local Groups. The data segmentation filters are listed at the bottom of the resulting window. If someone assigned a filter to a group and the searching account is a member of that group, the filter is appended to every search and may cause no results to be found.
If you file a case with Support
- Report any visible error messages from the failed search and the results of initial troubleshooting above.
- Examine the Windows Event Application Log for errors.
- Collect and attach the TLSrchSrv.<DaysDate>.log
I am trying to search using "and same page" and am having troubles
What to look for first
- "And same page" does not work with "does not include" logic. If any of your search criteria include this, it returns no results.
- If searching for more than two things on the same page, try reducing the search terms one at a time to see if you get results.
- If you notice that the search results reflect x number of y sessions where y is greater than x, this indicates that Tealeaf found y number of sessions with your search terms in the same session, and then processed those sessions to determine x, where x is the number of sessions reflecting "and same page" logic. This is expected functionality, not a reflection of missing sessions.
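The "x of y sessions" behavior is easier to see as a two-stage filter. The sketch below is an illustration of the logic described above, not Tealeaf's implementation: the function name and the data layout (each session as a list of pages, each page as a set of terms) are assumptions made for the example.

```python
def and_same_page_search(sessions, terms):
    """Return (matching session IDs, number of candidate sessions).

    sessions: dict mapping a session ID to a list of pages,
              where each page is a set of terms found on it.
    """
    terms = set(terms)
    # Stage 1 (y): sessions where every term appears somewhere in the session.
    candidates = {
        sid: pages for sid, pages in sessions.items()
        if terms <= set().union(*pages)
    }
    # Stage 2 (x): candidates where one single page contains every term.
    matches = [sid for sid, pages in candidates.items()
               if any(terms <= page for page in pages)]
    return matches, len(candidates)


sessions = {
    "s1": [{"cart", "error"}, {"home"}],  # both terms on the same page
    "s2": [{"cart"}, {"error"}],          # both terms, but on different pages
    "s3": [{"home"}],                     # neither term
}
matches, total = and_same_page_search(sessions, ["cart", "error"])
print(f"{len(matches)} of {total} sessions")  # prints "1 of 2 sessions"
```

Session s2 counts toward y because both terms occur in it, but it drops out of x because no single page holds both.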
If you file a case with Support
- Report any visible error messages from the failed search and the results of initial troubleshooting above.
- Collect and attach the TLSrchSrv.<DaysDate>.log.
It says I have only three days' data, but I should have 14; what do I check?
What to look for first
- Within the portal, navigate to the System Status Storage report, change to each individual Storage Server, and determine the number of days' worth of data that appears.
- Locally on the canister/processing server, navigate to the Canister.dbs directory, sort by type, and check the .dat files for: LSSN_<daysdate>_<canistername>.dat. There is one .dat file for each canister day. If you confirm the expected dates are available, then they are not being recognized. A canrebuild should address this:
- Windows Start - Run - canrebuild
- Important: Ensure that Preserve session data is CHECKED, and complete the canrebuild.
- Return to the portal and confirm that the dates are restored.
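Before running a canrebuild, it can help to confirm which days actually have a .dat file on disk. The following sketch lists the days that are missing for a given retention period. The function names are hypothetical, and it assumes that <daysdate> in the LSSN file name uses the yyyymmdd format; adjust the pattern if your files differ.

```python
import re
from datetime import timedelta
from pathlib import Path

# Assumption: file names look like LSSN_<yyyymmdd>_<canistername>.dat
LSSN_PATTERN = re.compile(r"^LSSN_(\d{8})_(.+)\.dat$", re.IGNORECASE)


def canister_days_on_disk(canister_dir):
    """Return the set of yyyymmdd strings that have an LSSN .dat file."""
    days = set()
    for path in Path(canister_dir).iterdir():
        match = LSSN_PATTERN.match(path.name)
        if match:
            days.add(match.group(1))
    return days


def missing_days(canister_dir, retention_days, today):
    """Return expected days (yyyymmdd, sorted) with no LSSN file on disk."""
    expected = {(today - timedelta(days=n)).strftime("%Y%m%d")
                for n in range(retention_days)}
    return sorted(expected - canister_days_on_disk(canister_dir))
```

If `missing_days` returns an empty list but the portal still shows fewer days, the files exist and are not being recognized, which is the case that a canrebuild is meant to address.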
If you file a case with Support
- Report the steps that were taken in troubleshooting.
- Check the Windows Event Application Log for errors that are related to Tealeaf.
- Attach the TLTMaint.log file.
Events not appearing in the portal
What to look for first
- If you created an event and it is not displaying anywhere in the portal, make sure to check both the search templates and event charts.
- In the event itself, double-check that "Display event in portal" is checked and that Building block is not checked. Both intentionally hide the event in the portal.
- Confirm that the event is not a session-level event and that you are not looking for it in an active search. Session-level events are evaluated when an active session is closed and written out as a completed session. As a result, session-level events cannot be found in active sessions.
If you file a case with Support
- Attach a screen capture of the configured event
Event data is not current, or is missing entirely
What to look for first
- Event data is collected by the Data Collector service on the portal server. At 5-minute intervals, it communicates with each of the canisters to become aware of new events and to gather statistics about these events for import into SQL. If data is missing for hours, it likely means the data collector service has been unable to communicate with the canister.
- Check the services on the Portal server and ensure that the Data Collector service is running.
- Examine the Windows Event Application Log for errors. If you find an error mentioning a timeout regarding either the SQL server or the canister itself, you can alter either or both timeout settings in the portal to allow more time for this process to complete:
- Under the Portal Management - CX Settings - Data Collector menu, you find two values: Canister connection timeout (seconds) and Database Connection timeout (seconds).
- The former can be increased in an attempt to address timeouts to the canister, while the latter can be raised to accommodate performance issues in relation to the import of statistics into SQL.
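To gauge how far behind the event data is, you can count how many 5-minute collection cycles have passed since the last data point you see in the portal. This is a rough sketch; the function name is hypothetical and the timestamps are values you supply yourself.

```python
from datetime import datetime, timedelta

COLLECTION_INTERVAL = timedelta(minutes=5)  # interval stated in the text above


def missed_collection_cycles(last_data_time, now):
    """Return how many full 5-minute cycles have elapsed without new data."""
    elapsed = now - last_data_time
    if elapsed < timedelta(0):
        raise ValueError("last_data_time is later than now")
    # Dividing one timedelta by another with // yields a whole number.
    return elapsed // COLLECTION_INTERVAL
```

A gap of three hours corresponds to 36 missed cycles, which points to a communication or timeout problem and not a single slow collection run.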
If you file a case with Support
- Attach the following:
- TLDataCollector.log (if data collection stopped in the current day); or
- TLDataCollector_<daysdate>.log (where <daysdate> reflects the date of the last hour of event reporting data).
Alerts not firing and/or emails not arriving
What to look for first
- If you feel certain that alerts should be firing and sending email, there is an alert report in the portal under Active - Alert Monitor. If you find alerts in an alert state, the specified action in the alert should be occurring.
- Confirm that the alert is configured to send to valid email addresses and that Email is checked.
- In the Portal, under TMS, navigate to the reporting server, expand the Alert Service, select Alert Service Configuration, click View/Edit, and ensure an Email From Address is configured. Most SMTP servers do not require that the sender is a real email account, only that it has a correct domain suffix, for example @yourcompany.com.
If you file a case with Support
- Note any troubleshooting steps taken.
- Attach the following:
- TLAlertSrv_<daysdate>.log (where <daysdate> reflects the date you expect to see the alert email).
Cannot log in to the Portal
What to look for first
- Are you using NT or portal authentication? If you normally type a password, it's Portal authentication. If you normally log in without a password, it is NT authentication.
- Confirm that the SQL Server database is operational and accessible over the network to the portal server.
- Try logging in with the master "admin" account.
- Confirm that the Search Service is running on all Windows servers.
If you file a case with Support
- Report any errors displaying in the portal.
- Examine the Windows Event Application Log for error messages and attach them to the case.
No hits in the canister
What to look for first
- Portal Status Report: Are all servers up? Is there a working connection to the Passive Capture Application server?
- Portal Status Report: Has DecoupleEx commenced queuing? If yes:
- Is the canister using too much memory?
- Is the canister running out of disk space?
- Capture Server Web Console: Any warning indicators?
- Is the span port or load balancer oversubscribed?
If you file a case with Support
- Report any visible error messages.
- Determine as precisely as possible the date and time when the problem first appeared.
- Attach the following:
- Portal Status Report
- Capture server maintenance log
- Capture server capture log
- Capture server error log
- Capture server full-day statistics file
Many one-hit sessions
What to look for first
- Are there IP addresses that the capture server should filter out?
- Is the canister session timeout too short?
If you file a case with Support
- Note as precisely as possible the date and time when the problem first appeared.
- Attach the following:
- Portal Status Report
- Capture server capture log
Why are my saved changes ignored by the PCA Web Console?
The PCA Web Console requires cookies to be enabled for it to maintain the session state as you perform various tasks.
Confirm that cookies are enabled. If cookies are disabled, after you click Save Changes, the page you are viewing reverts to the state before you made your changes.
Why can't I stop the Web Console processes?
The default location of the httpd.pid file that is used by the tealeaf script to find the Web Console is in the /usr/local/ctccap/var directory.
If you previously modified the httpd.conf so that it differed from the httpd.conf.default, then your httpd.conf is preserved when the new httpd.conf.default is installed by a later PCA package. This preservation means the newer script in a 3100 or later package cannot find the httpd.pid because the Web Console continues to write it to the old location specified by the httpd.conf.
To resolve this issue, do the following steps:
- Stop all current Web Console processes by using the following command as root:
killall httpd
- Review the changes between httpd.conf and the default file from the package, httpd.conf.default. For example, you can view the changes by using the diff command as follows:
cd /usr/local/ctccap/etc
diff -c httpd.conf.default httpd.conf
- Isolate the changes that were made locally for the Passive Capture Application server (for example, basic authentication, disabling the non-SSL port) from the changes that are introduced by the package.
- Save off the existing httpd.conf, overwrite the httpd.conf with the httpd.conf.default, and merge in the isolated changes from step 3.
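As a complement to the diff command, the comparison can be split into two lists to make the manual review easier. The sketch below uses Python's standard difflib module; the function name is hypothetical, and the script cannot tell local edits from package changes on its own, so each line still needs to be reviewed by hand.

```python
import difflib


def split_httpd_differences(default_text, current_text):
    """Return (lines only in current httpd.conf, lines only in the default).

    Lines only in the current file are either your local edits or settings
    from the older package. Lines only in the default are either new package
    settings or lines you removed locally.
    """
    only_in_current = []
    only_in_default = []
    diff = difflib.ndiff(default_text.splitlines(), current_text.splitlines())
    for line in diff:
        if line.startswith("+ "):
            only_in_current.append(line[2:])
        elif line.startswith("- "):
            only_in_default.append(line[2:])
    return only_in_current, only_in_default
```

Read both files from /usr/local/ctccap/etc and pass their contents in as strings. The first list is the starting point for the changes to merge back in after you overwrite httpd.conf with httpd.conf.default.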
Sources of network traffic quality issues
If you are having network traffic issues, review the following issues to help isolate the problem.
Issue | Source |
---|---|
Dropped network TCP packets | Network TCP packets can be dropped if any of the following conditions occur: |
Unidirectional traffic | A simple misconfiguration error can result in the CX PCA receiving network traffic for one direction only. In these instances, HTTP requests or responses are forwarded to the CX PCA, but not both. For the CX PCA to correctly reassemble HTTP hits, the TCP traffic must be provided for both directions. In most cases, this situation is relatively easy to identify and is usually caused by misconfiguration of the source network device. |
Measuring dropped packets
The PCA provides several metrics to help identify dropped network packet conditions. These metrics are only data points to help to assess likely causes for dropped packets.
Unfortunately, few network switch metrics can indicate when a switch has overrun its internal buffers, causing dropped network packets. Indirect metrics such as port bandwidth and CPU utilization can indicate a possible issue. These metrics sample the state of the network switch at a predetermined time interval. If a peak condition occurs between sampling periods, however, no indication is available at all.
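The sampling limitation can be shown with a toy example. The numbers below are invented for illustration and do not come from any real switch: utilization is steady at 40 percent except for a one-second burst that falls between two samples.

```python
def sampled_peak(utilization, interval):
    """Return the highest value seen when reading every interval-th sample."""
    return max(utilization[::interval])


# One reading per second for a minute (made-up values).
per_second = [40] * 60
per_second[31] = 100  # a one-second burst between the 30 s and 45 s samples

# Sampling every 15 seconds reads seconds 0, 15, 30, and 45 only.
print(sampled_peak(per_second, 15), max(per_second))  # prints "40 100"
```

The sampled metric reports a peak of 40 even though the port was briefly saturated, which is why switch metrics alone cannot rule out dropped packets.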
The best approach is to evaluate captured sessions for missing pages, partial pages, or both. Static validation of test sessions can provide another data point in analyzing the cause of sessions with missing pages. Real-time tracking of sessions with compound events that trigger on missing pages can help to determine whether a solution resolves the issue.