LargeDataSets DBMSTesting 2012 13
Nullius in verba (Motto of the Royal Society, the oldest "learned society for science" in the world. Roughly translated: "Trust no one's word.")
DBMS Performance Testing Group work
Fall 2012 Performance Group
Oct 15 2012
A helpful reference for MySQL is the MySQL 5.6 Reference Manual: http://dev.mysql.com/doc/refman/5.6/en/index.html
Files for generating fake Census data:
GenerateDataBGPJ.java, LastNameBGPJ.java, and ExpDistribBGPJ.java, plus, to support the generation of last names, lastNames1-250.txt, lastNames251-500.txt, lastNames501-750.txt, and lastNames751-1000.txt. The names are copied from http://names.mongabay.com/data/1000.html; the lists are courtesy of Mongabay.com.
New SQL create table statement to match the generated text file:
drop table if exists fakecensus;
create table fakecensus (
  identifier int auto_increment,
  gender char(1),
  income int,
  lastname char(50),
  age int,
  primary key (identifier)
);
load data local infile '/Users/wherever/whatever.txt' into table fakecensus fields terminated by ',';
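Note: Each line of the data file must begin with a comma, that is, have the form ,gender,income,lastname,age. The leading comma leaves the identifier field empty, so MySQL can fill it in with the next auto_increment value.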
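The line format described in the note above can be sketched in Java as follows. This is a hypothetical illustration, not the group's GenerateDataBGPJ.java: the class name FakeCensusRowSketch, the methods formatRow and randomRow, the value ranges, and the three sample last names are all assumptions made for the example. Only the column order (gender, income, lastname, age) and the leading comma come from the table definition and note above.

```java
import java.util.Random;

/**
 * Hypothetical sketch of writing rows for the fakecensus table.
 * Not the course's generator; names and value ranges are assumptions.
 */
final class FakeCensusRowSketch {

    /** Formats one record as a comma-separated line with an empty first field. */
    static String formatRow(char gender, int income, String lastName, int age) {
        // The leading comma leaves the identifier field empty, so the
        // auto_increment column is not supplied by the data file.
        return "," + gender + "," + income + "," + lastName + "," + age;
    }

    /** Builds one row from random values; the ranges are arbitrary placeholders. */
    static String randomRow(Random rng, String[] lastNames) {
        char gender = rng.nextBoolean() ? 'F' : 'M';
        int income = rng.nextInt(200_001);   // 0..200000, assumed range
        String lastName = lastNames[rng.nextInt(lastNames.length)];
        int age = rng.nextInt(100);          // 0..99, assumed range
        return formatRow(gender, income, lastName, age);
    }

    public static void main(String[] args) {
        // Stand-ins for the contents of the lastNames*.txt files.
        String[] lastNames = {"Smith", "Johnson", "Williams"};
        Random rng = new Random(42);
        for (int i = 0; i < 5; i++) {
            System.out.println(randomRow(rng, lastNames));
        }
    }
}
```

Redirecting the output to a text file (for example, java FakeCensusRowSketch > whatever.txt) would give a file in the shape the load data statement above expects, assuming the fields themselves contain no commas.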
Oct 8 2012
Start by reading "The Pathologies of Big Data" (Adam Jacobs, Communications of the ACM, August 2009) by next week. You might need to be on a campus computer to access the full text of the article.
Install MySQL on your own machine. (Set up passwords! Don't leave the root/superuser/administrator account on MySQL unprotected.)
Thomas slides from Aug 2011 DBMS/MySQL quickie workshop - might be useful in figuring out MySQL installation. Or not. Depends on your computer.