This example provides step-by-step instructions for creating the data objects to identify and report on the Top IP addresses and referrers for your web application.
An IP address often serves as a unique identifier for the machine from which a visitor starts a session. By tracking the IP addresses of your visitors, you may be able to identify the most frequent visitors to your application and, if needed, filter unwanted traffic, such as visits by bots, out of your Tealeaf reporting data.
A session's referrer identifies the URL from which a visitor arrived to begin the session. Identifying the top referrers to your web application can be useful in tracking the effectiveness of your marketing campaigns. By capturing the top referrers as dimensional data, you can assess the value of your relationships with partners that drive traffic to your web application.
For example, if your campaigns include identifying information as query parameters in the URL, you can configure Tealeaf to capture and track this information. In the following simple example, the URL value for this referrer contains a query parameter (`CampaignID`) and value (`01`), which would be useful to track:
http://www.example.com?CampaignID=01
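To illustrate what tracking this parameter involves, the following minimal Python sketch extracts the `CampaignID` value from a referrer URL. It is not Tealeaf configuration; the function name `campaign_id` is a hypothetical helper introduced only for this illustration.

```python
from urllib.parse import urlparse, parse_qs


def campaign_id(referrer_url):
    """Return the CampaignID query parameter of a referrer URL, or None."""
    query = urlparse(referrer_url).query
    values = parse_qs(query).get("CampaignID")
    return values[0] if values else None


print(campaign_id("http://www.example.com?CampaignID=01"))  # prints 01
```

A referrer without the parameter yields `None`, so such sessions can be grouped separately from campaign traffic.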
Together, IP addresses and referrers provide context on the most frequent visitors to your web application and the most common points of entry. In this example, the resulting reports are then added to a dashboard for convenient review.
Important notes about dimension data limits
To prevent runaway creation of dimensional data that becomes unwieldy to use in reporting, Tealeaf imposes a configurable limit on the maximum number of unique dimension values to be captured on an hourly basis.
Suppose that you are interested in creating dimensions to capture referrer URLs and IP addresses. Typically, URL and IP address data is captured and stored in Tealeaf as dimensional data. As hits are passed through the pipeline, these values are captured and stored in the Tealeaf database. Depending on the volume of traffic on your site, this dataset could number in the millions of unique values.
The configurable limit:
- Can be configured for individual dimensions.
- Is applied to each Canister.
For example, if you define a limit of 10 unique values and have two Canisters in your environment, a maximum of 10 x 2 = 20 unique values for the dimension can be recorded per hour.
Suppose that for the IP address dimension you configure a limit of 2 unique values per hour in a single-Canister environment. This limit means that in a single hour, a maximum of 2 unique values is written to the database from your lone Canister. Any other value that is subsequently captured in the same hour is written using the limit value, which is set to `[Limit]` by default.
- This example limit is extremely small for demonstration purposes; by default, the limit is set to `1000` unique values per hour.
In the following table, you can see a set of example IP addresses that are detected in the pipeline for each hour:
Address | Hr 1 | Hr 2 | Hr 3 |
---|---|---|---|
Addr 1 | 1.1.1.1 | 1.1.1.4 | 1.1.1.4 |
Addr 2 | 1.1.1.2 | 1.1.1.5 | 1.1.1.5 |
Addr 3 | 1.1.1.3 | 1.1.1.1 | 1.1.1.1 |
Addr 4 | 1.1.1.4 | 1.1.1.2 | 10.255.255.1 |
Addr 5 | 1.1.1.5 | 1.1.1.3 | 10.255.255.2 |
If the limit for this dimension is set to `2` per hour, then for each hour, the following addresses are recorded for the dimension:
Hour | Recorded | Recorded as [Limit] |
---|---|---|
Hr 1 | 1.1.1.1, 1.1.1.2 | 1.1.1.3, 1.1.1.4, 1.1.1.5 |
Hr 2 | 1.1.1.4, 1.1.1.5 | 1.1.1.1, 1.1.1.2, 1.1.1.3 |
Hr 3 | 1.1.1.4, 1.1.1.5 | 1.1.1.1, 10.255.255.1, 10.255.255.2 |
Based upon the configured limit of two unique values recorded per hour for this dimension, all other values for the IP address are recorded as `[Limit]`.
- For Hour 1, the first two values are recorded. The remaining three IP addresses are not recorded individually.
- For Hour 2, two of the values that were assigned the `[Limit]` value in Hour 1 are now recorded. Values that were recorded in Hour 1 are present in Hour 2, but because they appeared after the limit had been reached, they are recorded as `[Limit]`.
- For purposes of recording, Hour 3 looks identical to Hour 2. However, two new, previously undetected values are captured, but they are not recorded because the limit has already been reached.
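The per-hour behavior described above can be reproduced with a short Python sketch. This is a simplified model, not Tealeaf's implementation: it assumes a single Canister, that values arrive in the order Addr 1 through Addr 5, and that the count of unique values resets at the start of each hour. The name `record_hour` is hypothetical.

```python
LIMIT_VALUE = "[Limit]"  # default limit value named in the text


def record_hour(addresses, limit):
    """Return (recorded, overflow) for one hour of values in arrival order."""
    recorded, overflow = [], []
    for addr in addresses:
        if addr in recorded:
            continue  # value already recorded during this hour
        if len(recorded) < limit:
            recorded.append(addr)  # still under the hourly limit
        elif addr not in overflow:
            overflow.append(addr)  # would be written as the limit value
    return recorded, overflow


hours = {
    "Hr 1": ["1.1.1.1", "1.1.1.2", "1.1.1.3", "1.1.1.4", "1.1.1.5"],
    "Hr 2": ["1.1.1.4", "1.1.1.5", "1.1.1.1", "1.1.1.2", "1.1.1.3"],
    "Hr 3": ["1.1.1.4", "1.1.1.5", "1.1.1.1", "10.255.255.1", "10.255.255.2"],
}

for hour, addresses in hours.items():
    recorded, overflow = record_hour(addresses, limit=2)
    print(hour, "recorded:", recorded, "as", LIMIT_VALUE + ":", overflow)
```

Under these assumptions, the sketch yields the same split between recorded values and `[Limit]` values as the table for all three hours.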
The above example demonstrates the potential uncertainties around capturing dimensional values with a high volume of unique values per hour. To manage the data volume:
- You can raise the limit for each configured dimension sufficiently high to account for the volume.
Note: This approach can significantly impact your data storage.
- Use an approximation for the referrer or the IP address.
  - Instead of using the `Referrer` value, you can use the `Referrer Domain for Session` session attribute provided by Tealeaf as the source for the dimension. Instead of capturing the full referral URL (for example, `www.sample.com/mypage.htm`), this session attribute captures only the domain value (for example, `www.sample.com`).
  - Instead of the `Client IP Address` value, you can use the `User Agent of Client` session attribute provided by Tealeaf as the source for the dimension.
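To see why a domain-level value reduces the number of unique dimension values, consider the following minimal Python sketch. It only illustrates the idea of reducing a full referral URL to its domain; it is not how Tealeaf populates the session attribute, and `referrer_domain` is a hypothetical helper name. The example URLs assume an `http://` scheme.

```python
from urllib.parse import urlparse


def referrer_domain(referrer_url):
    """Return only the host part of a referrer URL (empty string if none)."""
    return urlparse(referrer_url).hostname or ""


pages = [
    "http://www.sample.com/mypage.htm",
    "http://www.sample.com/otherpage.htm",
]
# Both full URLs collapse into a single domain value.
print({referrer_domain(url) for url in pages})  # prints {'www.sample.com'}
```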
Create an event called Every Session
Create an event called `Every Session`, which fires on the first hit of the session.
This event uses the `Session GUID` session attribute, which is the internal Tealeaf identifier that the Canister assigns on the first hit of each valid session. Because each session has a Session GUID value, this event fires for every valid session.
You can then use this event to create reports that evaluate all sessions.
- To open the Event Manager, select Configure > Event Manager.
- In the Event Manager, click the Events tab.
- Click New Event.
- For the Name of the event, enter Every Session.
- For the trigger of the event, set Evaluate On to `First hit of session`.
- For organizational purposes, you might find it useful to assign the event to a new event label. Otherwise, the event is listed in the `Default` event label after it is created.
- Select the Condition step.
- Click the Session Attributes category.
- Click `Session GUID`. This session attribute is the session identifier that is created by Tealeaf. Every session is marked with one on the first hit of the session.
- The Session GUID session attribute is added to the main pane. From the drop-down for the attribute, select `Is not empty`. The event is configured to fire on the first hit of the session, whenever the Session GUID is specified. This event fires on every valid session.
- Save your draft.
- Save your changes and commit them.
The `Every Session` event is created.
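The logic of the configured event can be summarized in a small Python sketch. This is only a model of the trigger and condition set in the steps above, not Tealeaf code; the dictionary keys `hit_number` and `session_guid` are hypothetical names chosen for the illustration.

```python
def every_session_fires(hit):
    """Return True when the modeled Every Session event would fire for a hit.

    `hit` is a hypothetical dict with `hit_number` (1 for the first hit of
    the session) and `session_guid` (empty or missing when not assigned).
    """
    is_first_hit = hit.get("hit_number") == 1       # Evaluate On: First hit of session
    guid_not_empty = bool(hit.get("session_guid"))  # Condition: Session GUID is not empty
    return is_first_hit and guid_not_empty


hits = [
    {"hit_number": 1, "session_guid": "A1B2C3"},
    {"hit_number": 2, "session_guid": "A1B2C3"},
]
print([every_session_fires(h) for h in hits])  # prints [True, False]
```

Because the condition holds on the first hit of every valid session and on no later hit, the modeled event fires exactly once per session.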
Search for the event in active sessions
After you create an event, verify that it is being written to sessions. Because this event fires on the first hit of any new session, it should be available in any active session that started after you saved the event to the server.
The following steps help you find a session in which the event occurred. You can reuse such a session for testing after you create the other event objects in this example.
- In the Portal, select Search > Active Sessions. The Active Session Search page is displayed.
- If any search fields are displayed, click the X icon in the corner to remove them.
- From the left navigation panel, click the Basic Search Fields panel. Click Events.
- An Events search term is displayed in the search criteria.
- Click `<Select an event>`.
- Open the event label containing the `Every Session` event.
- Click the `Every Session` event.
- To select it, click Select.
- Leave the dimension value as `<Any Dimension>`.
- Click Search.
When the search results are returned, all active sessions in which the `Every Session` event has occurred are displayed.
You can also use the Event Tester to verify event operations.
Create dimensions
A dimension provides contextual information that can be recorded when an event is recorded. When you create a dimension and associate it with a specific event, whenever the event is recorded, the value for the dimension is also recorded. In this case, the value for the visitor's IP address can be recorded.
Create a dimension called IP address
Tealeaf provides a session attribute, `Client IP Address`, which is created and maintained by the Canister to identify the IP address from which the visitor's client application is connecting. In this step, you create a dimension called `IP Address`, which is populated by the `Client IP Address` session attribute.
- Open the Event Manager by selecting Configure > Event Manager.
- In the Event Manager, click the Dimensions tab.
- To create a dimension, click New Dimension.
- Set the following values:
  - Set the name: `IP Address`.
  - Set the Populated By value: `Client IP Address` session attribute.
  - Set the Populated With value: `First Value in Session`.
  - For Values to Record, set the value to `Whitelist + Observed Values`.
  - For Default Value, select `[Others]`.
  - For Max Values Per Hour, adjust the setting based on the traffic volume for your site.
    Note: If an acceptable maximum for `Max Values Per Hour` is not already known, leave the default value. Depending on the volume of traffic to your site, setting it to a much higher value can increase data storage requirements.
    Note: For this dimension, you might not be able to configure an acceptable limit to generate a high-fidelity data set.
- Click Turn On Logging.
- Save your draft.
- Save your changes and commit them.
The dimension `IP Address` is created and is populated by the session attribute `Client IP Address`. Dimension logging is enabled.
Create a dimension called Referrer by session
In this step, you configure a dimension to capture the referrer for the session. The referrer is the URL from which the visitor arrived to begin the session on your web application.
There is a hit attribute for capturing the referrer, which tracks the previous page visited by the visitor for each hit. For this exercise, use the session referrer.
A session attribute is a variable associated with each session. Tealeaf provides a number of session attributes for your use and enables the configuration of up to 64 user-defined session attributes.
In the following sequence, you create a dimension that is populated by the `Referrer for Session` session attribute. This attribute is a default session attribute provided by Tealeaf. Call this new dimension `Referrer for Session`.
- In the Event Manager, on the Dimensions tab, click New Dimension to create a dimension.
- Set the following values:
  - Set the name: `Referrer for Session`.
  - Set the Populated By value: `Referrer for Session` session attribute.
  - Set the Populated With value: `First Value in Session`.
  - For Values to Record, set the value to `Whitelist + Observed Values`.
  - For Default Value, select `[Others]`.
  - For Max Values Per Hour, adjust the setting based on the traffic volume for your site.
    Note: If an acceptable maximum for `Max Values Per Hour` is not already known, leave the default value. Depending on the volume of traffic to your site, setting it to a much higher value can increase data storage requirements.
    Note: For this dimension, you might not be able to configure an acceptable limit to generate a high-fidelity data set.
- Save your draft.
- Save your changes and commit them.
The dimension `Referrer for Session` is created.
Create a report group called IP addresses
Bring together the objects that you have created by creating a report group. You create a simple report group containing the `IP Address` and `Referrer for Session` dimensions.
A report group is an organizing structure for dimensions. In the Report Builder, you can include multiple dimensions in a report if they belong to the same report group. This structure enables efficient storage of dimensional data while maintaining flexibility in reporting.
- A report group may contain up to 4 dimensions.
- A dimension must belong to at least one report group.
This report group is then associated with the `Every Session` event.
To create the report group:
- In the Event Manager, on the Dimensions tab, create a report group by clicking New Report Group.
- In the Add Report Group dialog, set the Name: `IP Addresses`.
- If it is not already configured, set the Template to `Standard`.
- Click Add Dimensions.
- Select the `IP Address` and `Referrer for Session` dimensions.
- Click Save Draft.
  Report group `IP Addresses` is created.
- In the Event Manager, on the Events tab, edit the `Every Session` event.
- Click the Report Groups step.
- Add the report group `IP Addresses` to the event.
- Save your draft.
- Save your changes and commit them.
The report group `IP Addresses` is created and associated with the event `Every Session`. Because the event fires once in each session, the IP address and referrer values are captured into the dimensions in the report group. This information can be used as context for reporting.
Test active sessions for the new dimensions
You can search for event + dimension combinations in active sessions. This method of search is useful for finding specific sessions that are based on narrowly defined dimensions.
When the `Every Session` event is triggered, the values for the dimensions (the IP address and the referrer) are also written into the request of the first hit of the session. This data is therefore available as soon as the event is recorded, which is on the first page of every session.
To search for event + dimension combinations in active sessions:
- Search active sessions for the `Every Session` event. Leave the dimension as `<Any Dimension>`.
  The list of returned sessions includes all active sessions where the `Every Session` event fired. This event fires on the first page of a session. Because the dimensions are now configured to be recorded when the event fires, you can quickly locate a session that contains the recorded dimensions.
- In the displayed list of sessions, click the Send to Event Tester icon. In the dialog, you might want to change the Description to something easier to locate, such as `Test Session - Every Session`. Click Send to Event Tester. Click OK to go immediately to the Event Tester.
- In the Event Tester, click the icon next to the session you want to test. The session is evaluated.
- In the Select Events tab, click the event label containing the `Every Session` event. Click the event to add it to the list of tested events. Do not select any other events.
- Click the View Results tab.
- Expand the Events node to display the Every Session event.
- Expand the entry for Page 1.
- Expand the entry for the report group to which you assigned the two dimensions, `IP Addresses`.
If the two dimensions are displayed in the tested results, then they are being recorded with the event. If the `Referrer for Session` value is set to a null value (`TLT$NULL`), no value was recorded for it.
Create a report by using the Every Session event
After the event is tested and at least one hour has passed, you can create your first report by using the event.
You must wait at least one hour so that reporting data can be gathered and aggregated into the database. Data for your new event and dimensions is not available until it is aggregated.
In this step, you generate a report in the Report Builder that uses the `Every Session` event with the `IP Address` dimension added to the x-axis.
The Report Builder enables Portal users to create ad hoc reports by using events, dimensions, and ratios of their own choosing. The Report Builder uses the flexibility of data object creation to deliver critical analytical capability to the desktops of the users that are most informed about your web application.
- To open the Report Builder, select Analyze > Report Builder from the Portal menu.
- In the Report Builder, click the Create New button in the toolbar.
- A new report is created.
- To add an event, click Add Event. Add the `Every Session` event to the report.
  If you did not assign an event group to the event, it is in the `Default` group.
  Note: If you created the `Every Session` event within the last hour, you might not see any data in the report.
- Click the Dimensions tab. Click and drag the dimension `IP Address` to the x-axis.
- When you add the IP address dimension, you might be prompted that the report is too large to display. Click OK for the moment.
- Next to the IP address dimension, click the drop-down menu caret. Select Filter.
- In the Dimension Filter dialog, select the `Top N` check box.
  For the Maximum Number of Values to Display, enter the number of top IPs to display. `25` is a good starting value.
- Click Apply. The report is updated to show sessions for only the Top IP addresses.
- To save the report, click the Save button in the toolbar. Enter the name `Top IP Addresses`.
- Click Save. The report is saved.
The `Top IP Addresses` report is created.
This report generates a simple list of the IP addresses visiting your site for the designated time period. The meaningful data in the report is the dimension `IP Address`. If you configured a Top N filter, only the top IP addresses are listed.
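Conceptually, a Top N filter keeps only the most frequent dimension values. The following Python sketch illustrates that idea on a list of per-session IP values; it is not how the Report Builder computes the report, and `top_n_values` is a hypothetical helper name.

```python
from collections import Counter


def top_n_values(session_values, n):
    """Return the n most frequent dimension values with their session counts."""
    return Counter(session_values).most_common(n)


# One entry per session: the IP Address value recorded with Every Session.
ips = ["1.1.1.1", "1.1.1.2", "1.1.1.1", "1.1.1.3", "1.1.1.1", "1.1.1.2"]
print(top_n_values(ips, 2))  # prints [('1.1.1.1', 3), ('1.1.1.2', 2)]
```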
By adding the report group to other events, you can now create reports on those events that are filtered by the top values for `IP Address` and `Referrer for Session`.
You can modify the report to display the `Referrer for Session` dimension to identify top referrers. These steps are described later in this tutorial.
Create a blacklist of IP addresses from observed values
You can add an IP address to a blacklist for the dimension that you created to prevent it from appearing in reports that use the dimension.
Suppose that you identify some of your "top" IP addresses as sources of traffic in which you have no interest. For example, you may have determined through extended user agent parsing that the IP address `76.21.18.170` is a bot crawling your site. Adding this IP address to the blacklist for the dimension removes it from those reports.
To create a blacklist of IP addresses from observed values:
- To open the Event Manager, select Configure > Event Manager.
- In the Event Manager, click the Dimensions tab.
- Right-click the dimension `IP Address` and select Edit Dimension....
- In the Edit Dimension dialog, click Edit Blacklist.
- Click Download Log Values....
- Open the downloaded file in Microsoft™ Excel.
- Edit the list of log values to include only the items that you want to add to your blacklist.
- In the Edit Blacklist window, click Import File....
- Select the file you edited locally. Click Import.
- The values are displayed in your blacklist. Click Done.
- Save your draft.
- Save your changes and commit them.
After completing this step, the IP values in the blacklist do not appear in the report, starting at ten minutes past the next hour.
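The effect of the blacklist on a report can be pictured with the following Python sketch, which removes blacklisted values from a mapping of IP address to session count. It is a simplified model of the visible result only, not Tealeaf's processing; `apply_blacklist` and the sample counts are hypothetical.

```python
def apply_blacklist(session_counts, blacklist):
    """Drop blacklisted dimension values from a mapping of value -> session count."""
    blocked = set(blacklist)
    return {
        value: count
        for value, count in session_counts.items()
        if value not in blocked
    }


# Hypothetical session counts per IP address; the first entry is the bot.
session_counts = {"76.21.18.170": 950, "1.1.1.1": 40, "1.1.1.2": 12}
print(apply_blacklist(session_counts, ["76.21.18.170"]))
# prints {'1.1.1.1': 40, '1.1.1.2': 12}
```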
Add the report to a dashboard
In this step, you add the `Top IP Addresses` report to a dashboard as a table from the Report Builder.
- If not done already, create a dashboard where this report can be posted.
  - To configure dashboards, select Configure > Dashboard in the Portal menu.
  - Click the Dashboards link in the left navigation pane.
  - To add a dashboard, click the `+` icon in the upper-right corner.
  - Enter a name for the dashboard. For this scenario, the name is `E2E_TopIPdash`.
  - Click OK.
  - To save your dashboard, click Save.
  - The dashboard `E2E_TopIPdash` is created. It is empty now.
- To open the Report Builder, select Analyze > Report Builder from the Portal menu.
- Click the Open button in the toolbar.
- Open the `Top IP Addresses` report.
- The `Top IP Addresses` report is displayed.
- In the toolbar, click the Add Report to Dashboard button.
- In the Add Report to Dashboard dialog, set the following values:
  - Set Display: `Table`.
  - Set Target Tab.
    - Select the `E2E_TopIPdash` dashboard.
    - Select the `Default` tab or other tab if you want to put it elsewhere.
    - Click Select.
  - To add the report, click Add.
- A success message indicates that the report is added to the dashboard.
- In the Portal menu, select Dashboard > More.
- In the Dashboard selector, select `E2E_TopIPdash`.
The `Top IP Addresses` report is displayed in table format in the selected tab of the dashboard.
Add a referrer dimension to the report
Optionally, you can reconfigure the `Top IP Addresses` report so that it shows the top IP addresses of your visitors for each referrer. This information is valuable in determining which sources of traffic are most valuable to your site.
To reconfigure the `Top IP Addresses` report:
- To open the Report Builder, select Analyze > Report Builder from the Portal menu.
- Open the existing `Top IP Addresses` report.
- Save the report under a new name: `Top IP Addresses by Referrer`.
- Add the `Referrer for Session` dimension to the x-axis.
- Click the Options button in the toolbar.
- Set Report Title: `Top IP Addresses by Referrer`. Click Apply.
- Save the report.
- Add the new report to the same dashboard.
The report is saved with a new name and the `Referrer for Session` dimension added to it.
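The kind of breakdown that the two-dimension report provides can be sketched in Python as a grouping of sessions by referrer, followed by a ranking of IP addresses within each group. This is an illustration of the concept only, not the Report Builder's implementation; `top_ips_by_referrer` and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict


def top_ips_by_referrer(sessions, n):
    """Map each referrer to its n most frequent IP addresses with session counts."""
    counts = defaultdict(Counter)
    for ip, referrer in sessions:
        counts[referrer][ip] += 1
    return {referrer: ips.most_common(n) for referrer, ips in counts.items()}


# One (IP Address, Referrer for Session) pair per session.
sessions = [
    ("1.1.1.1", "http://www.example.com?CampaignID=01"),
    ("1.1.1.1", "http://www.example.com?CampaignID=01"),
    ("1.1.1.2", "http://www.example.com?CampaignID=01"),
    ("1.1.1.3", "http://www.sample.com/mypage.htm"),
]
for referrer, ips in top_ips_by_referrer(sessions, 1).items():
    print(referrer, ips)
```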
Check the dashboard for the new component and refreshed data
Navigate to the dashboard where the `Top IP Addresses` report was previously added to verify that the new report is displayed.
- In the Portal menu, select Dashboard > More.
- In the Dashboard selector, select `E2E_TopIPdash`.
- Verify that both reports are added to the dashboard and are populated with data.
- If not, click the refresh icon in the toolbar in the dashboard component to force a data refresh.
The report with `Referrer for Session` data is displayed.