To populate whitelists, blacklists, and group lists, you can enable logging for a dimension and then use the logged values as the source values for its lists. When logging is enabled, values that are detected in hit data for the dimension are recorded in a database log.
These log values can then be used as the basis for building your list for the dimension. Dimension values do not appear and therefore cannot be logged until the dimension is included in a report group.
When logging is enabled, dimension values are recorded starting from the date when logging was enabled and are kept for the number of days that log values are configured to be retained.
Note: Changes to the logging setting are applied after the dimension definition is saved and committed to the server.
Managing the logging of values is especially important for high-volume dimensions, such as URL.
Configuration
For system dimensions, logging is enabled by default. However, you must specify the whitelists, blacklists, or group lists for accepted values before data begins to appear in these dimensions.
Default logging behaviors
For user-created dimensions, logging is disabled by default. To begin populating dimensions with values, do the following:
- Verify that collection of values is enabled.
- Enable logging for the dimension. Logging must be enabled manually for each dimension that you create.
- Specify the type of dimension data to capture.
- Add values from logs.
Aggregate data log values
By default, log values are gathered from the Canisters and inserted into the database on an hourly basis. After the Data Collector process runs at ten minutes past the hour, these values become available for use in specifying whitelists, blacklists, and group lists.
- If needed, you can configure the aggregation of these values to be on a daily or weekly basis at a specified, off-peak time.
- Aggregation of dimension log values can be disabled.
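The default hourly timing can be illustrated with a small calculation. This is a minimal sketch, not product code: the function name `next_hourly_availability` is hypothetical, and it assumes that the run at ten minutes past the hour picks up the values detected during the hour that just ended. The source states only that values become available after that run.

```python
from datetime import datetime, timedelta


def next_hourly_availability(detected_at: datetime, run_minute: int = 10) -> datetime:
    """Estimate when a logged value becomes usable for building lists.

    Assumption (not confirmed by the documentation): the run at HH:10
    processes values detected in the hour that ended at HH:00.
    """
    # Round down to the start of the hour, then move to the end of that hour.
    hour_end = detected_at.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    # The hourly run starts `run_minute` minutes after the hour boundary.
    return hour_end + timedelta(minutes=run_minute)


if __name__ == "__main__":
    detected = datetime(2024, 1, 1, 14, 35)
    print(next_hourly_availability(detected))
```

Under this assumption, a value detected early in an hour waits longer than one detected late in the hour, which is worth keeping in mind when you check the logs shortly after enabling logging.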
Monitor limit configuration
After you enable logging of dimensions, you can monitor your limit configuration by checking the counts of the [Limit] value for the dimension in each hour of the reports that are configured to include the dimension.
In the Report Builder, add the dimension to the report and check the counts for the [Limit] row in the detail table.
Note: If the [Limit] value appears consistently in your reports, some values are not being factored into the dimension data for the hours in which the limit is reached.
If you are consistently reaching the [Limit] value for the dimension, you should consider raising the limit. Caveats:
- Raising the limit requires more disk space to store the values. These hourly entries in the database can grow quite large, and the growth may not be detected unless you carefully monitor the counts of [Limit] values.
- If you are consistently hitting the limit even after it was raised, the dimension may not be able to reflect an accurate sample of the real data set.
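If you copy the hourly counts out of a report, the check described above can be automated. The following is a minimal sketch under stated assumptions: the `(hour, value, count)` row shape and the function names are hypothetical and do not correspond to any product export format.

```python
LIMIT_LABEL = "[Limit]"


def limit_share_by_hour(rows):
    """Return {hour: fraction of counts recorded under [Limit]}.

    `rows` is an iterable of (hour, value, count) tuples, a hypothetical
    shape for data copied from the report detail table.
    """
    totals = {}
    limited = {}
    for hour, value, count in rows:
        totals[hour] = totals.get(hour, 0) + count
        if value == LIMIT_LABEL:
            limited[hour] = limited.get(hour, 0) + count
    # Skip hours with no counts to avoid dividing by zero.
    return {h: limited.get(h, 0) / t for h, t in totals.items() if t > 0}


def hours_over(shares, threshold=0.0):
    """List the hours whose [Limit] share exceeds the threshold."""
    return sorted(h for h, s in shares.items() if s > threshold)


if __name__ == "__main__":
    sample = [(0, "/home", 80), (0, LIMIT_LABEL, 20), (1, "/home", 100)]
    shares = limit_share_by_hour(sample)
    print(shares, hours_over(shares))
```

If most hours appear in the result of `hours_over`, the limit is being reached consistently, which is the situation in which raising the limit, with the caveats above, is worth considering.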
Suppose you configured a dimension to capture a maximum of 1000 values per hour, which is the default setting. The table below shows how the values are recorded in successive hours.
Assume that the values are detected and recorded in the sequence that is listed below. For Hour 0, values are detected in the following order: v0000, v0001, v0002, etc.
Hour | Detected Values | Captured Values | Values Overwritten with [Limit] |
---|---|---|---|
0 | v0000 - v1200 | v0000 - v0999 | v1000 - v1200 |
1 | v1000 - v1200, v0000 - v0999 | v1000 - v1200, v0000 - v0799 | v0800 - v0999 |
In Hour 0, the first 1000 values (v0000 - v0999) are captured, and all subsequent values are stored as [Limit]. In Hour 1, the new values (v1000 - v1200) are detected and captured first, and then the sequence of values from Hour 0 starts again. However, since the limit is capped at 1000, [Limit] is assigned to the last two hundred values for the hour, even though those values were already captured and recorded in the previous hour. As a result, contextual information for data that is already known to the system is not captured in Hour 1.
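The per-hour behavior in this example can be modeled in a few lines. This is a toy sketch, not the product's implementation: `record_hour` is a hypothetical name, and the model assumes that a value already captured earlier in the same hour continues to be recorded under its own name. It uses a limit of 3 instead of 1000 to keep the data readable.

```python
LIMIT_LABEL = "[Limit]"


def record_hour(detected, limit):
    """Return what is stored for one hour of detected values.

    The set of captured values starts empty every hour, so values
    captured in a previous hour get no special treatment.
    """
    captured = set()
    stored = []
    for value in detected:
        if value in captured or len(captured) < limit:
            # Either already captured this hour or still under the cap.
            captured.add(value)
            stored.append(value)
        else:
            # Cap reached: the value is recorded as the limit value.
            stored.append(LIMIT_LABEL)
    return stored


if __name__ == "__main__":
    print(record_hour(["a", "b", "c", "d"], limit=3))  # Hour 0
    print(record_hour(["d", "a", "b", "c"], limit=3))  # Hour 1
```

In the second call, `"c"` is stored as `[Limit]` even though it was captured in the first call, which mirrors what happens to the last values of Hour 1 in the table.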
Note: If you enable logging of a dimension that captures a high number of discrete values per hour, it may be difficult to capture a useful selection of them. For example, recording IP addresses for a high-volume site may result in explosive log growth and a varying set of values that are recorded as the dimension limit value.
A potential solution for addressing these issues is to do the following:
- Enable logging for a day or two.
- Export the list as a text file.
- Change the dimension to use a whitelist only.
- Import the export list as your whitelisted values.
Note: This solution does not work for data that varies significantly over time.
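Before you import the exported list, it can help to clean it up. The following is a minimal sketch that assumes the export is a plain text file with one value per line. That format is an assumption, and `clean_export_text` is a hypothetical helper, not part of the product.

```python
LIMIT_LABEL = "[Limit]"


def clean_export_text(text):
    """Deduplicate an exported value list, keeping first-seen order.

    Assumes one value per line. Blank lines and the [Limit]
    placeholder are dropped because neither is a real value.
    """
    seen = set()
    kept = []
    for line in text.splitlines():
        value = line.strip()
        if not value or value == LIMIT_LABEL or value in seen:
            continue
        seen.add(value)
        kept.append(value)
    # Return one value per line, or an empty string if nothing survived.
    return "\n".join(kept) + "\n" if kept else ""


if __name__ == "__main__":
    exported = " /home\n/cart\n\n[Limit]\n/home\n"
    print(clean_export_text(exported), end="")
```

The cleaned text can be saved to a file and reviewed by hand before it is imported as the whitelist.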
Recommended workflow for creating a dimension populated by logged values
If you are creating a whitelist, blacklist, or group list dimension that is populated by source values from logs, the following workflow is recommended for creating the dimension.
- Create the dimension.
- For the Values to Record, select the type of data you want to record.
- Turn on logging.
- Do not specify your list yet.
- Specify other properties.
- Click Save Draft.
- Add the dimension to a report group. Dimension values cannot be logged until the dimension is included in a report group.
- Click Save Changes.
- Allow sufficient time for the database log to be populated by a representative sample of values that are detected in the capture stream. Typically, 24 hours is a sufficient waiting period.
- Review the dimension logs to verify that all values you want to detect were captured.
- Create the list or lists from the logged values.
- After you specify the list for the dimension, you might want to export the list to a file for record keeping.
- Click Save Draft.
- Click Save Changes.
Monitor dimension data growth over time
Periodically, review the Database Table Size report, which contains details on the daily and monthly growth patterns of dimension data.