This example uses the data file integration method of Tealeaf cxConnect for Data Analysis to extract specific URLs, URL fields, and cookies and import this data into the Data Files temporary database.
Create the database
The database is created using SQL files that are in the Tealeaf\DataExtractor\Scripts directory. The two SQL files are CreateDB.sql and CreateSchema.sql.
CreateDB.sql - creates the databases FileGroups.
- The default database name is
TLWEB. - The default location is
D:\DB.Note: To change the database location, editCreateDB.sqland change all occurrences ofD:\DBto the preferred location.
Default CreateDB.sql script
USE master;
if exists (select 1 from dbo.sysdatabases where name='TLWEB')
DROP DATABASE [TLWEB];
CREATE DATABASE [TLWEB]
ON PRIMARY( NAME = TLDATA_PRIMARY,
FILENAME = 'D:\DB\TLWEB_primary.mdf', SIZE = 20, FILEGROWTH = 5% ),
FILEGROUP TLDATA_SESN( NAME = TLDATA_SESN,
FILENAME = 'D:\DB\TLWEB_SESN.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_HIT( NAME = TLDATA_HIT,
FILENAME = 'D:\DB\TLWEB_HIT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_APPD( NAME = TLDATA_APPD,
FILENAME = 'D:\DB\TLWEB_APPD.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_FACT( NAME = TLDATA_FACT,
FILENAME = 'D:\DB\TLWEB_FACT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_COOK( NAME = TLDATA_COOK,
FILENAME = 'D:\DB\TLWEB_COOK.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_EVNT( NAME = TLDATA_EVNT,
FILENAME = 'D:\DB\TLWEB_EVNT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_ATTR( NAME = TLDATA_ATTR,
FILENAME = 'C:\DB\TLWEB_ATTR.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_SESN( NAME = TLIDX_SESN,
FILENAME = 'D:\DB\TLIDX_SESN.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_HIT( NAME = TLIDX_HIT,
FILENAME = 'D:\DB\TLIDX_HIT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_APPD( NAME = TLIDX_APPD,
FILENAME = 'D:\DB\TLIDX_APPD.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_FACT( NAME = TLIDX_FACT,
FILENAME = 'D:\DB\TLIDX_FACT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_COOK( NAME = TLIDX_COOK,
FILENAME = 'D:\DB\TLIDX_COOK.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_EVNT( NAME = TLIDX_EVNT,
FILENAME = 'D:\DB\TLIDX_EVNT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_ATTR( NAME = TLIDX_ATTR,
FILENAME = 'C:\DB\TLIDX_ATTR.mdf', SIZE = 50, FILEGROWTH = 20% )
LOG ON( NAME = 'TLDATA_LOG',
FILENAME = 'D:\DB\TLWEB_log.ldf', SIZE = 5, FILEGROWTH = 5 );
ALTER DATABASE [TLWEB] SET RECOVERY SIMPLE;
Change the database name from TLWEB to TLWEB_TEST
The following is an example of how to change a database name.
CreateDB.sql:
USE master;
if exists (select 1 from dbo.sysdatabases where name='TLWEB_TEST')
DROP DATABASE [TLWEB_TEST];
CREATE DATABASE [TLWEB_TEST]
ON PRIMARY( NAME = TLDATA_PRIMIRY,
FILENAME = 'D:\DB\TLWEB_TEST_primary.mdf', SIZE = 20, FILEGROWTH = 5% ),
FILEGROUP TLDATA_SESN( NAME = TLDATA_SESN,
FILENAME = 'D:\DB\TLWEB_TEST_SESN.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_HIT( NAME = TLDATA_HIT,
FILENAME = 'D:\DB\TLWEB_TEST_HIT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_APPD( NAME = TLDATA_APPD,
FILENAME = 'D:\DB\TLWEB_TEST_APPD.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_FACT( NAME = TLDATA_FACT,
FILENAME = 'D:\DB\TLWEB_TEST_FACT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_COOK( NAME = TLDATA_COOK,
FILENAME = 'D:\DB\TLWEB_TEST_COOK.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_EVNT( NAME = TLDATA_EVNT,
FILENAME = 'D:\DB\TLWEB_TEST_EVNT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLDATA_ATTR( NAME = TLDATA_ATTR,
FILENAME = 'C:\DB\TLWEB_ATTR.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_SESN( NAME = TLIDX_SESN,
FILENAME = 'D:\DB\TLIDX_TEST_SESN.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_HIT( NAME = TLIDX_HIT,
FILENAME = 'D:\DB\TLIDX_TEST_HIT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_APPD( NAME = TLIDX_APPD,
FILENAME = 'D:\DB\TLIDX_TEST_APPD.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_FACT( NAME = TLIDX_FACT,
FILENAME = 'D:\DB\TLIDX_TEST_FACT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_COOK( NAME = TLIDX_COOK,
FILENAME = 'D:\DB\TLIDX_TEST_COOK.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_EVNT( NAME = TLIDX_EVNT,
FILENAME = 'D:\DB\TLIDX_TEST_EVNT.mdf', SIZE = 50, FILEGROWTH = 20% ),
FILEGROUP TLIDX_ATTR( NAME = TLIDX_ATTR,
FILENAME = 'C:\DB\TLIDX_ATTR.mdf', SIZE = 50, FILEGROWTH = 20% )
LOG ON( NAME = 'TLDATA_LOG',
FILENAME = 'D:\DB\TLWEB_TEST_log.ldf', SIZE = 5, FILEGROWTH = 5 );
ALTER DATABASE [TLWEB_TEST] SET RECOVERY SIMPLE;
CreateSchema.sql:
:
USE [TLWEB_TEST]
...
Modify sample load script
Complete the following steps to modify a sample load script.
- Edit
bcp_load_data_files.batto update the following variables as needed.Variable Description DBSERVERSQL server name DATABASEDatabase name PARALLELLOADSession extract tasks containing more than 250,000 sessions should have this set to TRUE.JOBNAMETypically, this setting does not need to be modified. CLEARSCHEMATo empty an existing schema before loading it with data, set this value to TRUE.STOPONERRORWhen this value is set to TRUE, the Data Extractor stops processing the job if an error is encountered by BCP.OVERWRITELOGSTo overwrite existing BCP log files, set this value to TRUE.BULKDIRLocation of data files SCRIPTDIRLocation of TeaLeaf\DataExtractor\scripts directory - For this example, set the following values:
set DBSERVER=localhost set DATABASE=TLWEB set PARALLELLOAD=FALSE set JOBNAME=* set CLEARSCHEMA=TRUE set STOPONERROR=TRUE set OVERWRITELOGS=FALSE set BULKDIR=C:\TeaLeaf\DataExtractor\DataFiles set SCRIPTDIR=C:\TeaLeaf\DataExtractor\scripts - Save the file.
Create a cxConnect task to export data files
This task exports three data files: BulkCookie*, BulkHit*, and BulkSesn*.
- From the Portal menu, navigate to cxConnect for Data Analysis.
- Select Configure Task.
- On the General tab, complete the following steps:
- Name:
Single URL - Post Command:
C:\Tealeaf\DataExtractor\Scripts\LoadData_SingleUrl.batNote: This script is a modified copy ofbcp_data_files.batwith appropriate variable settings.
- Name:
- On the Data Set tab, complete the following steps:
- Enable Custom Search String:
true - Search String:
url contains alliance-partnership.asp
- Enable Custom Search String:
- On the Data Filters tab, complete the following steps:
- Response Type:
Include All - URL: Include the actual URL relative path in request
- Status Code:
Include All - Urlfield:
Include All - Cookie:
Include All(This value is not a default value.) - AppData:
Exclude All - Event ID:
Exclude All - Var:
Exclude All(This area applies only to the earlier SQL Server database method of export.)
- Response Type:
- On the Destination tab, complete the following steps:
- Data Files:
Active - Number of concurrent exports Set this value to
1for proof of concept.
- Data Files: