Thursday, June 23, 2011

Creating a Database

Earlier this month I posted about the Dodgers traffic patterns data source I am using from the University of California Irvine for my database course.  This data source provides over 50,000 rows of data I can use for SQL exercises in my course.  Now that I have the data files, I need to load this into a relational database.  This is where the fun begins.

In the past, I created database scripts so that we are able "reset" the database to its original state.  This reset works well for when students begin adding, deleting, or modifying records.  I typically write a database script to drop each of the database tables, create new tables, and insert records into the database.  In the past, when inserting records, I created up to two dozen insert statements to populate the tables.  However, with this much larger database I did not want to write the 50,000 insert statements.  I could use a data loading tool but I want to be able to reset the database and allow students to create this same database on their own for practice.  This created a challenge.

I still needed to create the 50,000 insert statements but rather than writing these statements myself, I enlisted the help of Excel.  Opening the individual data files from the data source text file into Excel, I created formulas to concatenate pieces of the SQL insert statement together with the data to create a complete insert statement for the first record in the first row of the spreadsheet.  I then copied this same concatenation formula across all 50,000 rows and I ended up with a SQL insert statement for each row.  I then copied the set of insert statements to my script and ran it in database environment (after testing a single generated insert statement).  No errors, and all records were created.

While most of the time was spent preparing the data for this automated SQL statement generator (ensuring the date field stayed formatted as a date was a challenge), I was able to create the database script within a couple hours.  I now have a database script that will drop the existing tables, create new tables, and insert all 50,000 rows into the tables.  The script is about 4.5MB (a lot of SQL commands) and takes approximately 10 seconds to execute on my laptop.  I'm not ready to begin using this script with my students.

No comments:

Post a Comment

Skills to Look for in Project Managers

Today I read a brief article describing the eight skills to look for when hiring an IT project manager. The headlines caught my attention...