Oracle Database In-Memory

How To Load The SSB Schema Into An Oracle Database

Steps to load the Star Schema Benchmark (SSB) into an Oracle Database

The Star Schema benchmark (SSB) is designed to measure performance of database products in support of classical data ware-housing applications. It is based on the TPC-H benchmark. The main change from TPC-H to SSB is the merge of the tables LINEITEM and ORDERS into table LINEORDER.

The 5 tables in the SSB are:
LINEORDER
CUSTOMER
SUPPLIER
PART
DATE (I named this DATE_TABLE because DATE is a reserved word in Oracle)

As a first step for us to try Database In-Memory we want to have a set of tables that we can use for our queries. This post was created basically as a support for the people that want to test queries for the Oracle Database In-Memory option and don’t have data for testing.

1. Download StarSchemaBenchmark-master.zip from https://github.com/lemire/StarSchemaBenchmark

Go to https://github.com/lemire/StarSchemaBenchmark

Click on “Download” button

Save file StarSchemaBenchmark-master.zip

2. Unzip StarSchemaBenchmark-master.zip

$ unzip StarSchemaBenchmark-master.zip

This will create StarSchemaBenchmark-master directory

3. Change to StarSchemaBenchmark-master directory and execute ‘make’.

$ cd StarSchemaBenchmark-master
$ make

This will use the makefile file in this directory to create ‘dbgen’ executable.

4. Generate the .tbl ascii text files.

$ dbgen -s 4 -T a

The -s 4 is the scale factor, in this case we are multiplying the number of rows in the tables by 4. If you want to start with samller tables use -s 1
This creates files: customer.tbl, part.tbl, supplier.tbl, date.tbl, lineorder.tbl

5. Create the sql loader control files:

load_customer.ctl:
---------------------- START ---------------
load data
INFILE 'customer.tbl'
INTO TABLE CUSTOMER
APPEND
FIELDS TERMINATED BY '|'
(C_CUSTKEY,
 C_NAME,
 C_ADDRESS,
 C_CITY,
 C_NATION,
 C_REGION,
 C_PHONE,
 C_MKTSEGMENT)
----------------------- END ---------------
load_part.ctl:
---------------------- START ---------------
load data
INFILE 'part.tbl'
INTO TABLE PART
APPEND
FIELDS TERMINATED BY '|'
( P_PARTKEY,
 P_NAME,
 P_MFGR,
 P_CATEGORY,
 P_BRAND1,
 P_COLOR,
 P_TYPE,
 P_SIZE,
 P_CONTAINER)
----------------------- END ---------------
load_supplier.ctl:
---------------------- START ---------------
load data
INFILE 'supplier.tbl'
INTO TABLE SUPPLIER
APPEND
FIELDS TERMINATED BY '|'
( S_SUPPKEY,
 S_NAME,
 S_ADDRESS,
 S_CITY,
 S_NATION,
 S_REGION,
 S_PHONE)
----------------------- END ---------------
load_date_table.ctl:
---------------------- START ---------------
load data
INFILE 'date.tbl'
INTO TABLE DATE_TABLE
APPEND
FIELDS TERMINATED BY '|'
( D_DATEKEY,
 D_DATE,
 D_DAYOFWEEK,
 D_MONTH,
 D_YEAR,
 D_YEARMONTHNUM,
 D_YEARMONTH ,
 D_DAYNUMINWEEK,
 D_DAYNUMINMONTH,
 D_DAYNUMINYEAR,
 D_MONTHNUMINYEAR,
 D_WEEKNUMINYEAR,
 D_SELLINGSEASON,
 D_LASTDAYINWEEKFL,
 D_LASTDAYINMONTHFL,
 D_HOLIDAYFL,
 D_WEEKDAYFL)
----------------------- END ---------------
load_lineorder.ctl:
---------------------- START ---------------
load data
INFILE 'lineorder.tbl'
INTO TABLE LINEORDER
APPEND
FIELDS TERMINATED BY '|'
( LO_ORDERKEY,
 LO_LINENUMBER,
 LO_CUSTKEY,
 LO_PARTKEY,
 LO_SUPPKEY,
 LO_ORDERDATE,
 LO_ORDERPRIORITY,
 LO_SHIPPRIORITY,
 LO_QUANTITY,
 LO_EXTENDEDPRICE,
 LO_ORDTOTALPRICE,
 LO_DISCOUNT,
 LO_REVENUE,
 LO_SUPPLYCOST,
 LO_TAX,
 LO_COMMITDATE,
 LO_SHIPMODE)
----------------------- END ---------------

6. Create the Tablespace, the user SSB and the Tables.

conn / as sysdba
create tablespace SSB;
select file_name, bytes, autoextensible, maxbytes from dba_data_files;
alter database datafile '+DATA/DATAFILE/ssb.507.906051297' autoextend on next 500M;
grant connect, resource to ssb identified by ssb;
alter user ssb default tablespace ssb;
conn ssb/ssb

7. Load the data using sql loader

host sqlldr userid=ssb/ssb control=load_customer.ctl log=loader_customer.log
host sqlldr userid=ssb/ssb control=load_part.ctl log=loader_part.log
host sqlldr userid=ssb/ssb control=load_supplier.ctl log=loader_supplier.log
host sqlldr userid=ssb/ssb control=load_date_table.ctl log=loader_date_table.log
host sqlldr userid=ssb/ssb control=load_lineorder.ctl log=loader_lineorder.log
select count(*) from customer;
COUNT(*)
-----------
 120000
select count(*) from part;
COUNT(*)
-----------
 600000
select count(*) from supplier;
COUNT(*)
-----------
 8000
select count(*) from date_table;
COUNT(*)
-----------
 2556
select count(*) from lineorder;
COUNT(*)
-----------
 23996670

8. To clean up:

conn ssb/ssb
DROP TABLE CUSTOMER;
DROP TABLE PART;
DROP TABLE SUPPLIER;
DROP TABLE DATE_TABLE;
DROP TABLE LINEORDER;
host rm customer.tbl part.tbl supplier.tbl date.tbl lineorder.tbl
conn / as sysdba
DROP TABLESPACE ssb INCLUDING CONTENTS AND DATAFILES;
exit
$ cd ..
$ rm -r StarSchemaBenchmark-master

For information about the Star Schema Benchmark http://www.cs.umb.edu/~poneil/StarSchemaB.PDF

Blogs I Follow:
https://blogs.oracle.com/In-Memory/
https://carlos-sierra.net
http://mauro-pagano.com
https://davidloinaz.wordpress.com
http://kerryosborne.oracle-guy.com
https://martincarstenbach.wordpress.com
https://fritshoogland.wordpress.com
http://blog.oracle-ninja.com
http://karlarao.tiddlyspot.com

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s