CS 4250 Project Part 5
Upload to Canvas on or before midnight on
Monday, 2 Dec 2024.
Remember to access the server only from on campus or while logged in to the campus VPN.
Start this assignment with the nicely designed, normalized relations you
created for Project Part 4. Significant changes to the database schema after Part 4
has been satisfactorily completed must be suggested or approved by the professor.
This assignment will consist of implementing your schema in MariaDB and populating the database. You can use your
MariaDB account (see professor for your password) for all aspects of this assignment.
- Write an SQL database schema for your application, using
the CREATE TABLE
commands described in class. If you make any
minor changes to the relational schema from Part 4, please tell me the
rationale for your changes. (Minor changes would be those described in the list
below. A significant change - not allowed without prior approval -- is
deleting a field or table.)
- Pick suitable datatypes for each attribute. Consult the MariaDB documentation to determine
the principal options
that are available for types (your class notes may or may not cover the ones that you
require and your class notes may or may not contain the right declaration).
- Do not forget to specify at least one key for each
relation (using the PRIMARY KEY construct).
- Specify other choices for keys (candidate keys) via the UNIQUE construct.
- If some attribute should not be NULL, use
NOT NULL to specify this aspect.
- Also provide DEFAULT values wherever applicable.
- Indicate any foreign key (referential integrity) constraints that
are expected to hold for that relation. Include the declaration of foreign key
constraints for any appropriate relations.
- (Note: phpMyAdmin requires
specification of attribute / field names in destination of foreign key references.
phpMyAdmin will produce error messages if you use the simplified convention in the book
of simply specifying "references tablename" when you mean
"references tablename(primary key field names)").
- Also mention any domain
specific constraints that you might require of your attributes, using
the CHECK construct. (Note that these may be constraints you
had no way of specifying in your ER diagram!)
- Load your table definitions into the database.
Now, execute some sample
INSERT, DELETE and UPDATE commands on
every one of your relations. Convince yourself that things are working
fine. Required: report any interesting observations that you make.
(Simple observations are fine, like "I learned not to UPDATE
a tuple I had not yet INSERTed.")
- Develop a substantial amount of data for your
application and load it into your relations using the bulk loading
facility provided in MariaDB: you can use either the
LOAD DATA INFILE
syntax or the
mariadb-import
program.
We require data in the order of 40-50 tuples, or more,
for each relation in your application. To create the data, either get it
from some web source or information repository (making sure that it is
public -- legal to use for non-profit purposes -- first, of course) or write a program in any scripting/programming
language (like PHP, Java, or Python) that can create large files of records in a
format acceptable to the bulk loader. Either way, realize that you may
need to transform data from one form to another to be acceptable for use
in MariaDB.
In very rare situations, one of your relations might require only
4--5 tuples. For example, if you are modeling international airplane flights, there
are only seven continents on the planet Earth, in which
case it is okay to have just a few tuples in that particular relation.
In such cases, where you have only a few tuples, state explicitly why
you have only a few tuples. (During most semesters, no student group
encounters a legitimate reason to have a small table in their database.)
Precautions to Take if you "Cook up" your Data: In parts (1) and
(2), you have have specified key(s) (and possibly foreign keys) for each
relation. Make sure that while cooking up data, your fabricated data
indeed confirms to such "key restrictions". If you declare that
"StudentID" is the key, then do not generate data for different students
having the same ID! Notice that relations will definitely share
attributes (if there are no common attributes at all in your schema,
then your project is in grave danger, see the instructor immediately!),
so make sure that they "agree" on the common attributes. For example, if
there are two relations that share an attribute called "name", then make
sure that the names do indeed tally! Do not list the name "Astaire" as
"ASTAIRE" in one and "astAIRE" in another and "Astaire" in yet another because
then they would be three different names! The bottom line is to be
consistent. When two things are meant to be the same, make sure your
data reflects this fact. If you do not ensure so at this stage, you will
spend more time debugging your application than otherwise! (GIGO problems.)
What to turn in: A document that details the following:
- A list of your relation schemas (your CREATE TABLE statements)
reproduced from Part 4, with an explanation of any changes you made.
This list serves as a reminder for me. This is the list of
relations you start work with on Part 5.
- A list of your SQL CREATE TABLE commands (with
PRIMARY KEY, NOT NULL, DEFAULT,
FOREIGN KEY, and CHECK constraints).
This is the list of create table statements your group creates as part of your
work on Part 5; it is not the same as the list required by #1. (See the recommendation below.)
- For each table in your database, a screenshot of the output from the
MariaDB "explain"
command AND, separately, a screenshot of the output from the query "select count(*) from mytablename;" Make certain
the screenshots clearly show the number of tuples for the relation.
Important: "explain tablename" and "select count(*) from mytablename;" are
two different commands. Do not "explain select count(*) ..."
- Samples of INSERT, DELETE, and
UPDATE commands that you tried out on your relations and that worked,
plus your observations.
- The source of your data. Examples of possible answers are "We got it
from this web address (http://where.we.found.data/)", "We got this from this reference book on
apartment homes", "We imported it from this document", or "We cooked it
up". Whatever your answer, write a short paragraph on how you
transformed the data from its native form to the form required by MariaDB.
If you wrote a special purpose program or script (in Perl, PHP, C or some
other language), include that script. Explain how you took special care
to make sure that key restrictions and common attribute constraints are
not violated. Explain why some relations have only a few tuples (if
applicable).
- Sample of tuples for each relation (about 6--10 tuples each), just to give
me an idea of how they look. (Screenshots, or copy and paste of text output
from command line MariaDB interface, strongly preferred.)
- Note: "select * from Sailors" prints all tuples in the Sailors table.
"select * from Sailors limit 10" prints 10 tuples from the Sailors table (the
DBMS picks which 10).
- Do not forget to include all project member names (and email addresses,
and member contributions) with what you turn in.
What NOT to turn in: Pages and pages of raw data or printouts of all the
data in your database. There is no way I can read all that.
Recommendations:
- This might seem obvious but it is surprising the number of people
who miss it. Please SAVE a copy of your SQL statements
(and data) on some disk or flash drive or somewhere.
(Every good database person knows the value of backups!)
- To export from phpMyAdmin: Go to the 'Export' menu at the top of the database
page. Click 'Custom.' UNclick 'Data.' (Export the structure of your
database, not all the tuples.) Click 'Go.'
Useful links:
- MariaDB's auto_increment
-- for setting up a table so it will create unique numerical ID values automatically
- test_student.sql -- sample, plain text script file that creates a table and loads
data from a plain text file into the new table
- USA Gov't data
- Data in
comma-separated-values (CSV) format is ready to load into a database. One row
per tuple, each field separated from neighbors with commas. If you want to
strip out a field, you can write a simple program to count commas and delete
the unwanted field. (Or, if the data file is small enough, load the data
directly into MS Excel, delete the unwanted column, and export back into a new CSV file.)
- Generatedata.com: web site that
provides random data generation services
- Mockaroo: web site that
provides random data generation services