The data preaggregator is used in large-scale environments, such as an environment that includes multiple data centers. The data preaggregator can reduce the overall data flow from canisters to the data collector, which improves the performance of the data collector and reduces the amount of data that is sent over the network.
The data preaggregator operates as a service (TL_DataPreaggregator) on a dedicated server within the data center. When the preaggregator is enabled within a data center, the preaggregator server hosts an extra canister and an agent that collects data from all of the other canisters. The preaggregator agent downloads and aggregates the data, then stores the data in a single canister that is hosted on the data preaggregator server. The preaggregator enables the data collector to retrieve the data from the single preaggregator canister instead of retrieving the data from multiple canisters. Downloading the aggregated data from the preaggregator server reduces the amount of data that is sent over the network to the data collector and improves the performance of the data collector by reducing the amount of processing that must be done by the data collector.
Multiple data preaggregators can be deployed to improve performance in large-scale data centers. Depending on the amount of data that is collected, a typical data preaggregator can process the data of 10 to 15 canisters. A single data collector collects the data from each data preaggregator. Data preaggregators do not transfer data between themselves. If a data preaggregator goes offline, the canisters that are associated with that preaggregator continue to save the data until the data preaggregator comes online again. When a data preaggregator comes online, the preaggregator agent begins downloading the data from each canister.
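The sizing guidance above can be turned into a rough planning calculation. The following is a minimal sketch, assuming only the 10 to 15 canisters per preaggregator range stated in the text; the function name and constants are hypothetical and are not part of any Tealeaf tool.

```python
import math
from typing import Tuple

# Planning range taken from the text: a typical data preaggregator
# handles 10 to 15 canisters, depending on the amount of data collected.
MIN_CANISTERS_PER_PREAGGREGATOR = 10
MAX_CANISTERS_PER_PREAGGREGATOR = 15


def estimate_preaggregators(canister_count: int) -> Tuple[int, int]:
    """Return (fewest, most) preaggregators a data center might need.

    The lower bound assumes light data volume (15 canisters each);
    the upper bound assumes heavy data volume (10 canisters each).
    """
    if canister_count < 0:
        raise ValueError("canister_count must not be negative")
    fewest = math.ceil(canister_count / MAX_CANISTERS_PER_PREAGGREGATOR)
    most = math.ceil(canister_count / MIN_CANISTERS_PER_PREAGGREGATOR)
    return fewest, most
```

For example, a data center with 40 canisters would need between 3 and 4 data preaggregators under these assumptions. Actual capacity depends on the amount of data that each canister collects.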
The following diagram shows a data center that contains four canisters with one data preaggregator. In this scenario, the data preaggregator agent collects data from each of the canisters and downloads and aggregates the data on the data preaggregator server. The data collector, which is hosted outside of the data center, collects the aggregated data from the preaggregator server.
The following diagram shows three data centers with each containing four canisters and one data preaggregator. In this scenario, the data preaggregator agent collects data from each of the canisters in the same data center. The data collector, which is hosted outside of the data centers, collects the aggregated data from each of the preaggregator servers.
The data preaggregator is part of the Tealeaf data collection process. You can use the data collector logs to track the performance of a data preaggregator or for troubleshooting purposes.
Configuration setting | Description |
---|---|
Active | When this setting is selected, the data preaggregator is enabled and collects data from the target canisters. |
Display name | Identifies the data preaggregator when viewing a list of servers in the Portal Management view. |
Host name | Specifies the host name for the data preaggregator server. |
Port | Specifies the port number that is used to access the server. The default port number for a data preaggregator server is 5597. Note: Make sure that the specified port number is open within your network environment. |
User name | The user name that is used to access the data preaggregator. |
Password | The password that is used to access the data preaggregator. |
Target canisters | Lists all of the canisters in your Tealeaf environment. Each canister is displayed with a selection box. |
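The Port setting requires that the port is open within your network. One way to check this from another machine is a plain TCP connection attempt. The following is a minimal sketch that uses only the Python standard library; the function name is hypothetical, and a successful connection only shows that something is listening on the port, not that the data preaggregator service itself is healthy.

```python
import socket

# Default data preaggregator port, as listed in the settings table.
DEFAULT_PREAGGREGATOR_PORT = 5597


def port_is_reachable(host: str,
                      port: int = DEFAULT_PREAGGREGATOR_PORT,
                      timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and connects;
        # the context manager closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False
```

Call `port_is_reachable` with the value from the Host name setting. A result of `False` can indicate a firewall rule, a stopped service, or a wrong host name.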
Data Preaggregator Statistics log (TLDataPreaggregatorStats.log)
The TLDataPreaggregatorStats.log contains one line for every minute, with record counts for the AGG, AGKEY, and PATH tables.
The TLDataPreaggregatorStats.log is written to the Preaggregator Server and contains the following columns:
READ-AGG READ-AGKEY READ-PATH WRITE(O)-AGG WRITE(O)-AGKEY WRITE(O)-PATH WRITE(A)-AGG
WRITE(A)-AGKEY WRITE(A)-PATH TOTAL-AGG TOTAL-AGKEY TOTAL-PATH TO-COLLECT-AGG
TO-COLLECT-AGKEY TO-COLLECT-PATH TO-TRIM-AGG TO-TRIM-AGKEY TO-TRIM-PATH
- WRITE(A) is the number of records actually written to the preaggregator.
- WRITE(O) is the original record count, before preaggregation (the original count that matches READ).
- TOTAL is the current record count on the preaggregator server.
- TO-COLLECT is the number of records waiting to be collected by the data collector.
- TO-TRIM is the number of records waiting to be trimmed.
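To work with these counts, the columns can be mapped to names. The following is a rough sketch, not a documented parser: it assumes that each log line ends with the 18 counts as whitespace-separated integers in the column order listed above, and that any leading fields (for example a timestamp) can be ignored. The exact line layout is not stated here, so verify it against a real log before relying on this. The function names are hypothetical.

```python
from typing import Dict

# Column order as listed for TLDataPreaggregatorStats.log.
COLUMNS = [
    "READ-AGG", "READ-AGKEY", "READ-PATH",
    "WRITE(O)-AGG", "WRITE(O)-AGKEY", "WRITE(O)-PATH",
    "WRITE(A)-AGG", "WRITE(A)-AGKEY", "WRITE(A)-PATH",
    "TOTAL-AGG", "TOTAL-AGKEY", "TOTAL-PATH",
    "TO-COLLECT-AGG", "TO-COLLECT-AGKEY", "TO-COLLECT-PATH",
    "TO-TRIM-AGG", "TO-TRIM-AGKEY", "TO-TRIM-PATH",
]


def parse_stats_line(line: str) -> Dict[str, int]:
    """Map the last 18 whitespace-separated fields of a line to column names.

    ASSUMPTION: the counts are the trailing fields of the line.
    """
    fields = line.split()
    if len(fields) < len(COLUMNS):
        raise ValueError("line has fewer fields than expected columns")
    values = fields[-len(COLUMNS):]
    return {name: int(value) for name, value in zip(COLUMNS, values)}


def reduction_ratio(row: Dict[str, int], table: str) -> float:
    """Fraction of records removed by preaggregation for AGG, AGKEY, or PATH.

    Compares WRITE(O), the original count, with WRITE(A), the count
    that was actually written to the preaggregator.
    """
    original = row[f"WRITE(O)-{table}"]
    actual = row[f"WRITE(A)-{table}"]
    if original == 0:
        return 0.0
    return 1.0 - actual / original
```

A ratio close to 0 for a table suggests that preaggregation is saving little for that table. A steadily growing TO-COLLECT count may indicate that the data collector is falling behind.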
Adding a data preaggregator
You can add a data preaggregator to your Tealeaf environment to reduce the amount of processing that occurs on the data collector.
A dedicated server is required to host the data preaggregator. If you have not already installed the data preaggregator software, review the requirements for the data preaggregator and the instructions on how to use the Tealeaf installation package.
Note: When running the Tealeaf installer, only select the Data Preaggregator component for installation. The data preaggregator cannot be installed with any additional Tealeaf components. Starting with version 10.2, the Data Preaggregator service can also be managed via TMS. If you don’t see it in TMS, stop the TMS service, clean the TMSStore, and start the TMS service again on the Data Preaggregator server.
Note: Once a data preaggregator is added to your Tealeaf environment, it cannot be deleted through the user interface. If you no longer want to use a data preaggregator, edit the configuration and deselect the Active setting. If you want to continue to collect data from the canisters that are assigned to the data preaggregator, you need to assign them to another data preaggregator or deselect them from the data preaggregator configuration settings to have them report directly into a data collector.
To add a new data preaggregator to your Tealeaf environment:
- Log in to the Tealeaf portal.
- Navigate to Portal Management > Manage Servers.
- Select New > Data Preaggregator Server to add a new data preaggregator server.
- Enter the configuration settings for the data preaggregator.
The default port number for a data preaggregator server is 5597. Make sure that this port is open within your network environment.
- Select the canisters that you want the data preaggregator to collect data from.
Canisters which are not already assigned to a data preaggregator are listed in bold text. If you select a canister which is already assigned to a data preaggregator, the canister is automatically removed from the configuration of the original data preaggregator.
- Click Active to make the data preaggregator active.
Once the preaggregator is active, the selected canisters start sending data to the data preaggregator instead of sending data directly to the data collector.
- Click Save to save and apply your changes.
Adding and removing canisters for a data preaggregator
You can add or remove canisters within the data preaggregator configuration.
A canister can be assigned to only one data preaggregator at a time. You can assign a canister to a data preaggregator by editing the configuration for a specific canister or by editing the configuration for the data preaggregator. Editing the data preaggregator configuration lets you see all of the other canisters that are assigned to the data preaggregator, which makes it easier to assign a group of canisters to the data preaggregator.
Warning: If you move a canister from one data preaggregator to another data preaggregator, disable both data preaggregators. There can be a short period of time where a canister is assigned to more than one data preaggregator. Disabling the data preaggregators before adding or removing a canister prevents the new data preaggregator from collecting data from the canister while it is still assigned to the old data preaggregator. To disable the data preaggregator, remove the Active selection from the settings and click Save.
To add or remove a canister within the configuration settings for a data preaggregator:
- Log in to the Tealeaf portal.
- Navigate to Portal Management > Manage Servers.
- Select a data preaggregator from the server list and click Edit.
- To add a canister to the data preaggregator, select a canister from the Target canisters list.
- To remove a canister from the data preaggregator, deselect the canister from the Target canisters list.
After a canister is deselected, the canister reports directly into the data collector. You can also assign the canister to another data preaggregator by editing the configuration for the data preaggregator.
- Click Save to save and apply your changes.
- Make sure that the Active setting is selected for each data preaggregator that you have updated. Once the data preaggregator is active, the data preaggregator starts collecting data from the selected canisters.
Make the data preaggregators active after you have saved your canister assignments in the data preaggregator settings. Activating the data preaggregators after applying your canister changes prevents a canister from sending data to multiple data preaggregators and creating duplicate data.
Canister reassignment
You can reassign canisters to different data preaggregators or back to the data collector.
If you want to assign a canister to a different data preaggregator, temporarily disable both data preaggregators before you reassign the canister. If you are reassigning a canister from a data preaggregator to a data collector, you need to temporarily disable the data preaggregator. Otherwise, when you reassign a canister, there is a short period in which the original data preaggregator continues to collect data from the canister while the new data preaggregator or data collector also collects data from it, which causes data to be duplicated.
Step | Procedure |
---|---|
1 | Disable each data preaggregator that is affected by moving the canister. You can disable the data preaggregator by editing the settings for the data preaggregator, clearing Active, and saving the settings. |
2 | If you are assigning the canister to a different data preaggregator, edit the settings for the new data preaggregator, select the canister from the canister list, and save the settings. |
3 | If you are assigning the canister back to the data collector, edit the settings for the current data preaggregator, clear the canister from the canister list, and save the settings. |
4 | Verify that the canister has been reassigned by viewing the data collector log. |
5 | Enable any data preaggregators that were disabled in this procedure. You can enable the data preaggregator by editing the settings for the data preaggregator, selecting Active, and saving the settings. The data preaggregator can start collecting data as soon as it is activated. |
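The order of the steps in this procedure matters, so it can help to see it as a sequence of state changes. The following is an illustrative in-memory model only. It does not call the Tealeaf portal or any Tealeaf API, and the class, method, and server names are hypothetical. In a real environment, step 4 is done by viewing the data collector log; the model stands in for that with a simple check.

```python
from typing import Dict, List, Optional


class PreaggregatorTopology:
    """Hypothetical model of canister-to-preaggregator assignments."""

    def __init__(self) -> None:
        self.active: Dict[str, bool] = {}               # preaggregator -> Active
        self.assignment: Dict[str, Optional[str]] = {}  # canister -> preaggregator
        self.steps: List[str] = []                      # record of actions taken

    def add_preaggregator(self, name: str, active: bool = True) -> None:
        self.active[name] = active

    def reassign(self, canister: str, target: Optional[str]) -> None:
        """Move a canister; target=None means report to the data collector."""
        if target is not None and target not in self.active:
            raise KeyError(f"unknown data preaggregator: {target}")
        source = self.assignment.get(canister)
        # Each preaggregator touched by the move, without duplicates.
        affected = list(dict.fromkeys(
            p for p in (source, target) if p is not None))
        was_active = [p for p in affected if self.active[p]]

        # Step 1: disable every affected data preaggregator.
        for name in affected:
            self.active[name] = False
            self.steps.append(f"disable {name}")

        # Steps 2 and 3: a canister has at most one assignment, so setting
        # the new target also removes it from the original preaggregator.
        self.assignment[canister] = target
        self.steps.append(f"assign {canister} -> {target or 'data collector'}")

        # Step 4: stand-in for verifying the change in the data collector log.
        if self.assignment[canister] != target:
            raise RuntimeError("reassignment was not applied")
        self.steps.append(f"verify {canister}")

        # Step 5: re-enable only what this procedure disabled.
        for name in was_active:
            self.active[name] = True
            self.steps.append(f"enable {name}")
```

Because both data preaggregators are inactive while the assignment changes, there is no point in the sequence at which two active data preaggregators collect from the same canister.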