The value of the Path reference dimension can be set by the Reference session agent or by any pipeline agent before it. The session agent provides the following three methods for setting this variable: Simple, Mapping, and Programming.
The path is defined as all characters in the URL from the first / after the host identifier through the character that precedes the first ?, if it is present. The following elements are not part of the path:
- Protocol identifier: http://
- Host (for example, www.tealeaf.com)
- Port (for example, :80)
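The path definition above can be illustrated with a minimal Python sketch. It is not part of the product; the function name `reference_path` is hypothetical, and the standard library's `urlsplit` is used only to show which characters count as the path.

```python
from urllib.parse import urlsplit


def reference_path(url: str) -> str:
    """Return the characters from the first '/' after the host up to,
    but not including, the first '?'.

    The protocol identifier, host, and port are not part of the result.
    """
    return urlsplit(url).path


# The query string, protocol, host, and port are all excluded.
print(reference_path("http://www.tealeaf.com:80/store/page.asp?catid=700"))
```

For the URL in the example, the sketch yields `/store/page.asp`.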
Simple
In the Simple case, the file name extension and HTTP status code checks done by the Reference session agent are sufficient to limit or determine the virtual path values. This case implies the following conditions:
- The URL does not contain distinct session or hit identifiers. It does not carry any state information.
- Paths end with file name extensions.
- The website software is homogeneous across all servers. Default page names (for example, index.html or default.asp) for URLs that end with / are identical for all web servers.
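As a rough illustration of the Simple case, the following Python sketch applies a file name extension check and an HTTP status code check, and appends a single default page name to URLs that end with /. The extension list, status code list, default page name, and function name are all assumptions chosen for the example; they do not reflect the session agent's actual configuration or implementation.

```python
import posixpath
from typing import Optional

# Hypothetical settings, chosen only for illustration.
ALLOWED_EXTENSIONS = {".html", ".asp", ".jsp"}
ALLOWED_STATUS_CODES = {200, 304}
DEFAULT_PAGE = "index.html"  # assumed identical on every web server


def simple_path(path: str, status_code: int) -> Optional[str]:
    """Return a normalized path, or None if the hit fails the checks."""
    if status_code not in ALLOWED_STATUS_CODES:
        return None
    # A path that ends with "/" is completed with the default page name.
    if path.endswith("/"):
        path += DEFAULT_PAGE
    extension = posixpath.splitext(path)[1].lower()
    return path if extension in ALLOWED_EXTENSIONS else None
```

The sketch works only because of the conditions listed above: one default page name applies everywhere, and every valid path ends with a known extension.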
Mapping
The mapping technique can be used for the following cases:
- The path for multiple pages can be the same (for example, /page.cgi or /ISAPI.dll), but one or two query strings or other request variables can distinguish pages from one another.
  - For example, a code can be embedded in the query string to distinguish pages for the web application's own purposes (as in /page.asp?catid=700, where the catid value 700 signifies the Product View page).
  - The session agent's "default page" algorithm for virtual directories (for example, URL=/ or URL=/somedir/) uses the mapping configuration file to set the TLT_URL value.
  - The mapping file has an initial entry that is created during the installation process. For example:
    # TLT_URL URL ReqVar1 ReqVar2
    /default.asp /
    This example configuration assigns TLT_URL=/default.asp in the [appdata] section when / is the value of the URL variable in the Tealeaf request.
- Virtual directory start patterns determine validity of paths. All paths that conform to a specified start pattern are included (for example, all paths that begin with /server/ or /support/customers/).
This method is more open-ended than the first mapping method and allows more junk path values to be accepted.
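The two mapping cases can be sketched in Python as follows. This is an illustration under stated assumptions, not the session agent's implementation: the whitespace-separated column layout is taken from the example entry, while the interpretation of the ReqVar columns (names of request variables that must be present in the request) and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Rule = Tuple[str, str, List[str]]  # (TLT_URL, URL, [ReqVar names])

MAPPING = """\
# TLT_URL URL ReqVar1 ReqVar2
/default.asp /
"""


def parse_mapping(text: str) -> List[Rule]:
    """Parse mapping entries, skipping blank lines and # comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tlt_url, url, *req_vars = line.split()
        rules.append((tlt_url, url, req_vars))
    return rules


def map_url(rules: List[Rule], url: str,
            request_vars: Dict[str, str]) -> Optional[str]:
    """Return TLT_URL for the first entry whose URL matches.

    Assumption: every ReqVar named in the entry must be present
    in the request for the entry to apply.
    """
    for tlt_url, rule_url, req_vars in rules:
        if url == rule_url and all(v in request_vars for v in req_vars):
            return tlt_url
    return None


def matches_start_pattern(
        path: str,
        patterns: Tuple[str, ...] = ("/server/", "/support/customers/"),
) -> bool:
    """Accept any path that begins with one of the start patterns."""
    return path.startswith(patterns)
```

Note that `matches_start_pattern` accepts anything under a matching directory, which is why the start-pattern method lets through more junk values than explicit mapping entries do.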
Programming
Programming, in the form of RTA rules or a custom pipeline agent (using the TCL or Managed Code session agents), is required for cases that cannot be handled by the Simple or Mapping techniques. For example, if the URL value contains any type of application state or tracking information that must be stripped out to produce a good value for TLT_URL, the TLT_URL value must be set through a custom pipeline agent upstream of the Reference session agent.
Page tagging, as performed by DoubleClick for example, cannot be handled natively by the session agent as a source for the value of TLT_URL. You must create a custom pipeline agent and apply it before the session agent.
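The kind of clean-up that a custom pipeline agent might perform can be sketched as follows. A real agent would be written as RTA rules, TCL, or managed code; Python is used here only to show the logic. The `;jsessionid=` path parameter is a hypothetical example of state embedded in the URL, and the function name is an assumption.

```python
import re

# Hypothetical pattern: a ';jsessionid=...' path parameter that carries
# session state and must not become part of TLT_URL.
_SESSION_PARAM = re.compile(r";jsessionid=[^/?#;]*", re.IGNORECASE)


def strip_session_state(url_path: str) -> str:
    """Remove the embedded session identifier from a URL path."""
    return _SESSION_PARAM.sub("", url_path)


print(strip_session_state("/store/cart.jsp;jsessionid=ABC123"))
```

The cleaned value would then be assigned to TLT_URL before the hit reaches the Reference session agent.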
Order of precedence of path processing methods
The URL processing precedence is as follows:
1. URLReferenceRules
2. URLReferenceVirtualDir
3. NormalizeURLExt and NormalizeURLStatusCode
If URLReferenceRulesMode is set to STOP, then method 3 is not used to validate the URL, and the Reference session agent performs strict interpretation of the URL. If the mode is CONT, then a combination of rules and status code and file name extension tests can be used.
One case for combining methods 1 and 3 is a site with multiple possible virtual directory names, most likely because the web servers are a heterogeneous mixture of IIS and Java™ Platform, Enterprise Edition. Use method 1 to determine the default file name for virtual directories. For all other types of URLs, use the normal file name extension and status code rules.
Example:
#TLT_URL URL ReqVar1 ReqVar2
/default.asp / IISSESSIONID
/page.jhtml / JSESSIONID
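The precedence described in this section can be sketched as a simple fall-through. This is an illustration under assumptions, not the agent's actual logic: the three stand-in functions are hypothetical, the status code check is omitted for brevity, and treating a STOP result as "no value" is a guess at what strict interpretation means.

```python
from typing import Callable, Optional

Method = Callable[[str], Optional[str]]


def resolve_path(url: str, rules: Method, virtual_dir: Method,
                 ext_and_status: Method,
                 rules_mode: str = "CONT") -> Optional[str]:
    """Try methods 1 and 2 in order; use method 3 only in CONT mode."""
    for method in (rules, virtual_dir):
        result = method(url)
        if result is not None:
            return result
    if rules_mode == "STOP":
        return None  # strict interpretation: method 3 is skipped
    return ext_and_status(url)


# Hypothetical stand-ins for the three configured methods.
def by_rules(url: str) -> Optional[str]:
    return "/default.asp" if url == "/" else None


def by_virtual_dir(url: str) -> Optional[str]:
    return url if url.startswith("/server/") else None


def by_extension(url: str) -> Optional[str]:
    return url if url.endswith((".html", ".asp")) else None
```

With these stand-ins, `/` resolves through the rules, while `/a/page.html` resolves through the extension test only when the mode is CONT.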