Guide | Demo Data & Scripts
This guide describes the location and function of the demo data and scripts installed with CJ Path.
Start your CJ Path Docker container as the "apache" user
- Open a shell
- Start your docker container by entering:
docker start cjpath
docker attach cjpath
- Switch to the "apache" user by entering:
su -s /bin/bash apache
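If the container may already be running, the start step above can be guarded. The following is a minimal sketch, not part of CJ Path: the function name start_if_stopped is hypothetical, and it assumes the container is named cjpath as in the commands above.

```shell
# Start the named container (default: cjpath) unless it is already running.
# start_if_stopped is a hypothetical helper name, not a CJ Path command.
start_if_stopped() {
    name="${1:-cjpath}"
    # docker inspect prints "true" or "false" for the .State.Running field
    running="$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)"
    if [ "$running" = "true" ]; then
        echo "$name is already running"
    else
        docker start "$name" >/dev/null && echo "$name started"
    fi
}

# Usage (interactive):
#   start_if_stopped cjpath && docker attach cjpath
```

After attaching, switch to the "apache" user with the su command shown above.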
Explore the apache_log project directory
CJ Path comes loaded with three existing project directories. We will explore the apache_log directory.
- First view a list of project directories by entering:
bash-4.3$ cd /var/www/html/cjpath/ess
bash-4.3$ ls -l
- Access the apache_log directory and list its contents.
The directories and files listed here serve the following purposes:
- cache – subdirectory to store cache data
- createdb.sh – script to create database
- data – subdirectory to store data files
- demo.sh – script to run demo data
- import.sh – script to import data
- make_cj.sh – script to create customer journey data
- pagename.csv – sample table to transform page names
- profile.sh – script to fill profile vector
- setup.sh – script to create categories
- usrconfig.inc – text file for project configuration
- usrparam.inc – text file for customer parameters
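To confirm that a project directory contains everything in the list above, a small check can help. This is a sketch only: the function name missing_entries is hypothetical, and the entry names are copied from the list rather than from any CJ Path tool.

```shell
# Print each expected entry that is missing from project directory $1.
# missing_entries is a hypothetical helper; the names come from the list above.
missing_entries() {
    dir="$1"
    for entry in cache createdb.sh data demo.sh import.sh make_cj.sh \
                 pagename.csv profile.sh setup.sh usrconfig.inc usrparam.inc; do
        # -e is true for both files and directories
        [ -e "$dir/$entry" ] || echo "$entry"
    done
}

# Usage, run from inside the apache_log directory:
#   missing_entries .
```

No output means every listed file and subdirectory is present.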
Explore Sample Data
The apache_log project data contains three weekly web server log files in typical Apache log format. Let’s examine their contents.
- Access the data subdirectory and view the contents of the apachelog.zip file.
bash-4.3$ cd data
bash-4.3$ ls
bash-4.3$ unzip -l apachelog.zip
- View a few lines of the top log file. Enter the following:
bash-4.3$ unzip -c apachelog.zip 125-access_log-20140330
The data in these logs is used to generate the customer journey paths for this project.
This script creates a virtual schema of the raw data, primarily as a preparatory step before the customer journey tables are processed.
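Because unzip -c prints the whole file, it can help to limit or summarize the output. The sketch below assumes the logs follow the Apache common or combined log format, where the request path is the seventh whitespace-separated field; the guide only calls them "typical apache logs", so check a few lines first. The function name page_counts is hypothetical.

```shell
# Count requests per page path from Apache access-log lines on stdin.
# Assumes common/combined log format, where field 7 is the request path.
# page_counts is a hypothetical helper name, not a CJ Path command.
page_counts() {
    awk '{ count[$7]++ } END { for (p in count) print count[p], p }' | sort -rn
}

# Usage, run from the data subdirectory:
#   unzip -p apachelog.zip 125-access_log-20140330 | head -n 5
#   unzip -p apachelog.zip 125-access_log-20140330 | page_counts | head -n 10
```

The -p option of unzip writes the file content to standard output without extracting it to disk, so the archive stays untouched.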
The setup.sh script for this project looks like the following:
When you execute the script, it performs the following:
The summary shows that a new virtual schema labeled “apachlog” was created from the raw log files.