This example uses the data file integration method of Tealeaf cxConnect for Data Analysis to extract specific URLs, URL fields, and cookies and import this data into the Data Files temporary database.
Create the database
The database is created using SQL files that are in the Tealeaf\DataExtractor\Scripts
directory. The two SQL files are CreateDB.sql
and CreateSchema.sql
.
CreateDB.sql
- creates the databases FileGroups.
- The default database name is
TLWEB
. - The default location is
D:\DB
.Note: To change the database location, editCreateDB.sql
and change all occurrences ofD:\DB
to the preferred location.
Default CreateDB.sql script
USE master;
if exists (select 1 from dbo.sysdatabases where name='TLWEB')
DROP DATABASE [TLWEB];
CREATE DATABASE [TLWEB]
ON PRIMARY( NAME = TLDATA_PRIMARY,
FILENAME = 'D:\DB\TLWEB_primary.mdf', SIZE = 20, FILEGROWTH = 5% ),
FILEGROUP TLDATA_SESN( NAME = TLDATA_SESN,
FILENAME = 'D:\DB\TLWEB_SESN.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_HIT( NAME = TLDATA_HIT,
FILENAME = 'D:\DB\TLWEB_HIT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_APPD( NAME = TLDATA_APPD,
FILENAME = 'D:\DB\TLWEB_APPD.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_FACT( NAME = TLDATA_FACT,
FILENAME = 'D:\DB\TLWEB_FACT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_COOK( NAME = TLDATA_COOK,
FILENAME = 'D:\DB\TLWEB_COOK.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_EVNT( NAME = TLDATA_EVNT,
FILENAME = 'D:\DB\TLWEB_EVNT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_ATTR( NAME = TLDATA_ATTR,
FILENAME = 'C:\DB\TLWEB_ATTR.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_SESN( NAME = TLIDX_SESN,
FILENAME = 'D:\DB\TLIDX_SESN.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_HIT( NAME = TLIDX_HIT,
FILENAME = 'D:\DB\TLIDX_HIT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_APPD( NAME = TLIDX_APPD,
FILENAME = 'D:\DB\TLIDX_APPD.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_FACT( NAME = TLIDX_FACT,
FILENAME = 'D:\DB\TLIDX_FACT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_COOK( NAME = TLIDX_COOK,
FILENAME = 'D:\DB\TLIDX_COOK.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_EVNT( NAME = TLIDX_EVNT,
FILENAME = 'D:\DB\TLIDX_EVNT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_ATTR( NAME = TLIDX_ATTR,
FILENAME = 'C:\DB\TLIDX_ATTR.mdf', SIZE = 50, FILEGROWTH = 20% )
LOG ON( NAME = 'TLDATA_LOG',
FILENAME = 'D:\DB\TLWEB_log.ldf', SIZE = 5, FILEGROWTH = 5 );
ALTER DATABASE [TLWEB] SET RECOVERY SIMPLE;
Change the database name from TLWEB
to TLWEB_TEST
The following is an example of how to change a database name.
CreateDB.sql
:
USE master;
if exists (select 1 from dbo.sysdatabases where name='TLWEB_TEST')
DROP DATABASE [TLWEB_TEST];
CREATE DATABASE [TLWEB_TEST]
ON PRIMARY( NAME = TLDATA_PRIMIRY,
FILENAME = 'D:\DB\TLWEB_TEST_primary.mdf', SIZE = 20, FILEGROWTH = 5% ),
FILEGROUP TLDATA_SESN( NAME = TLDATA_SESN,
FILENAME = 'D:\DB\TLWEB_TEST_SESN.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_HIT( NAME = TLDATA_HIT,
FILENAME = 'D:\DB\TLWEB_TEST_HIT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_APPD( NAME = TLDATA_APPD,
FILENAME = 'D:\DB\TLWEB_TEST_APPD.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_FACT( NAME = TLDATA_FACT,
FILENAME = 'D:\DB\TLWEB_TEST_FACT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_COOK( NAME = TLDATA_COOK,
FILENAME = 'D:\DB\TLWEB_TEST_COOK.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_EVNT( NAME = TLDATA_EVNT,
FILENAME = 'D:\DB\TLWEB_TEST_EVNT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_ATTR( NAME = TLDATA_ATTR,
FILENAME = 'C:\DB\TLWEB_ATTR.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_SESN( NAME = TLIDX_SESN,
FILENAME = 'D:\DB\TLIDX_TEST_SESN.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_HIT( NAME = TLIDX_HIT,
FILENAME = 'D:\DB\TLIDX_TEST_HIT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_APPD( NAME = TLIDX_APPD,
FILENAME = 'D:\DB\TLIDX_TEST_APPD.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_FACT( NAME = TLIDX_FACT,
FILENAME = 'D:\DB\TLIDX_TEST_FACT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_COOK( NAME = TLIDX_COOK,
FILENAME = 'D:\DB\TLIDX_TEST_COOK.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_EVNT( NAME = TLIDX_EVNT,
FILENAME = 'D:\DB\TLIDX_TEST_EVNT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_ATTR( NAME = TLIDX_ATTR,
FILENAME = 'C:\DB\TLIDX_ATTR.mdf', SIZE = 50, FILEGROWTH = 20% )
LOG ON( NAME = 'TLDATA_LOG',
FILENAME = 'D:\DB\TLWEB_TEST_log.ldf', SIZE = 5, FILEGROWTH = 5 );
ALTER DATABASE [TLWEB_TEST] SET RECOVERY SIMPLE;
CreateSchema.sql
:
:
USE [TLWEB_TEST]
...
Modify sample load script
Complete the following steps to modify a sample load script.
- Edit
bcp_load_data_files.bat
to update the following variables as needed.Variable Description DBSERVER
SQL server name DATABASE
Database name PARALLELLOAD
Session extract tasks containing more than 250,000 sessions should have this set to TRUE
.JOBNAME
Typically, this setting does not need to be modified. CLEARSCHEMA
To empty an existing schema before loading it with data, set this value to TRUE
.STOPONERROR
When this value is set to TRUE
, the Data Extractor stops processing the job if an error is encountered by BCP.OVERWRITELOGS
To overwrite existing BCP log files, set this value to TRUE
.BULKDIR
Location of data files SCRIPTDIR
Location of TeaLeaf\DataExtractor\scripts directory - For this example, set the following values:
set DBSERVER=localhost set DATABASE=TLWEB set PARALLELLOAD=FALSE set JOBNAME=* set CLEARSCHEMA=TRUE set STOPONERROR=TRUE set OVERWRITELOGS=FALSE set BULKDIR=C:\TeaLeaf\DataExtractor\DataFiles set SCRIPTDIR=C:\TeaLeaf\DataExtractor\scripts
- Save the file.
Create a cxConnect task to export data files
This task exports three data files: BulkCookie*
, BulkHit*
, and BulkSesn*
.
- From the Portal menu, navigate to cxConnect for Data Analysis.
- Select Configure Task.
- On the General tab, complete the following steps:
- Name:
Single URL
- Post Command:
C:\Tealeaf\DataExtractor\Scripts\LoadData_SingleUrl.bat
Note: This script is a modified copy ofbcp_data_files.bat
with appropriate variable settings.
- Name:
- On the Data Set tab, complete the following steps:
- Enable Custom Search String:
true
- Search String:
url contains alliance-partnership.asp
- Enable Custom Search String:
- On the Data Filters tab, complete the following steps:
- Response Type:
Include All
- URL: Include the actual URL relative path in request
- Status Code:
Include All
- Urlfield:
Include All
- Cookie:
Include All
(This value is not a default value.) - AppData:
Exclude All
- Event ID:
Exclude All
- Var:
Exclude All
(This area applies only to the earlier SQL Server database method of export.)
- Response Type:
- On the Destination tab, complete the following steps:
- Data Files:
Active
- Number of concurrent exports Set this value to
1
for proof of concept.
- Data Files: