LargeDataSets DBMSTesting 2012 13

Revision as of 11:56, 15 October 2012 by MThomas (Talk | contribs) (Fall 2012 Performance Group)


Nullius in verba (Motto of the Royal Society, the oldest "learned society for science" in the world. Roughly translated: "Trust no one's word.")

DBMS Performance Testing Group work

main AHPCRC wiki page


Fall 2012 Performance Group

Oct 15 2012

REFERENCE MATERIAL:

A helpful page for MySQL can be found at http://dev.mysql.com/doc/refman/5.6/en/index.html

Files for generating fake Census data:

GenerateDataBGPJ.java, LastNameBGPJ.java, ExpDistribBGPJ.java, and, to support the generation of last names, lastNames1-250.txt, lastNames251-500.txt, lastNames501-750.txt, lastNames751-1000.txt. The lists of names are copied from http://names.mongabay.com/data/1000.html, courtesy of Mongabay.com.

New SQL create table statement to match the text file generated by the files above:

drop table if exists fakecensus;

create table fakecensus ( identifier int auto_increment, gender char(1), income int, lastname char(50), age int, primary key (identifier) );

load data local infile '/Users/wherever/whatever.txt' into table fakecensus fields terminated by ',';

Note: Each line of the data file must begin with a comma. The comma leaves the first field (identifier) empty, so that the auto_increment value is created properly when the file is loaded.
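
As an illustration of the leading-comma format, here is a minimal Java sketch of how one line of the data file could be written. The class name FakeCensusLine and its format method are hypothetical and are not taken from GenerateDataBGPJ.java; the field order is assumed to follow the column order of the create table statement (identifier, gender, income, lastname, age).

```java
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of GenerateDataBGPJ.java.
class FakeCensusLine {

    // Builds one comma-separated line for the fakecensus table.
    // The line starts with a comma, so the first field (identifier) is empty
    // and MySQL can fill in the auto_increment value itself.
    static String format(char gender, int income, String lastName, int age) {
        // Locale.ROOT keeps the numbers free of locale-specific digit grouping.
        return String.format(Locale.ROOT, ",%c,%d,%s,%d", gender, income, lastName, age);
    }

    public static void main(String[] args) {
        // Example values are made up.
        System.out.println(format('F', 52000, "Smith", 34));
    }
}
```

Each line then has five fields, the first of which is empty. Last names containing a comma would break this simple format, so the sketch assumes the names in the lastNames files contain none.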

A webpage that can probably explain how to work with SQL

Oct 8 2012

Start by reading: "Pathologies of Big Data", Communications of the ACM, August 2009. Read this by next week. You might need to be on a campus computer to access the full text of the article.

Install MySQL on your own machine. (Set up passwords! Don't leave the root/superuser/administrator account on MySQL unprotected.)

Thomas slides from Aug 2011 DBMS/MySQL quickie workshop - might be useful in figuring out MySQL installation. Or not. Depends on your computer.