CS 4250 Project Part 5

Upload to Canvas on or before midnight on Monday, 2 Dec 2024.

Remember to access the server only from on campus or while logged in to the campus VPN.

Start this assignment with the nicely designed, normalized relations you created for Project Part 4. Significant changes to the database schema after Part 4 has been satisfactorily completed must be suggested or approved by the professor.

This assignment will consist of implementing your schema in MariaDB and populating the database. You can use your MariaDB account (see professor for your password) for all aspects of this assignment.

Write an SQL database schema for your application, using the CREATE TABLE commands described in class. If you make any minor changes to the relational schema from Part 4, please tell me the rationale for your changes. (Minor changes would be those described in the list below. A significant change - not allowed without prior approval -- is deleting a field or table.)
- Pick suitable datatypes for each attribute. Consult the MariaDB documentation to determine the principal options that are available for types (your class notes may or may not cover the ones that you require and your class notes may or may not contain the right declaration).
- Do not forget to specify at least one key for each relation (using the PRIMARY KEY construct).
- Specify other choices for keys (candidate keys) via the UNIQUE construct.
- If some attribute should not be NULL, use NOT NULL to specify this aspect.
- Also provide DEFAULT values wherever applicable.
- Indicate any foreign key (referential integrity) constraints that are expected to hold for that relation. Include the declaration of foreign key constraints for any appropriate relations.
  - (Note: phpMyAdmin requires specification of attribute / field names in destination of foreign key references. phpMyAdmin will produce error messages if you use the simplified convention in the book of simply specifying "references tablename" when you mean "references tablename(primary key field names)").
- Also mention any domain specific constraints that you might require of your attributes, using the CHECK construct. (Note that these may be constraints you had no way of specifying in your ER diagram!)
Load your table definitions into the database.
Now, execute some sample INSERT, DELETE and UPDATE commands on every one of your relations. Convince yourself that things are working fine. Required: report any interesting observations that you make. (Simple observations are fine, like "I learned not to UPDATE a tuple I had not yet INSERTed.")
Develop a substantial amount of data for your application and load it into your relations using the bulk loading facility provided in MariaDB: you can use either the LOAD DATA INFILE syntax or the mariadb-import program.
We require data in the order of 40-50 tuples, or more, for each relation in your application. To create the data, either get it from some web source or information repository (making sure that it is public -- legal to use for non-profit purposes -- first, of course) or write a program in any scripting/programming language (like PHP, Java, or Python) that can create large files of records in a format acceptable to the bulk loader. Either way, realize that you may need to transform data from one form to another to be acceptable for use in MariaDB.
In very rare situations, one of your relations might require only 4--5 tuples. For example, if you are modeling international airplane flights, there are only seven continents on the planet Earth, in which case it is okay to have just a few tuples in that particular relation. In such cases, where you have only a few tuples, state explicitly why you have only a few tuples. (During most semesters, no student group encounters a legitimate reason to have a small table in their database.)
Precautions to Take if you "Cook up" your Data: In parts (1) and (2), you have have specified key(s) (and possibly foreign keys) for each relation. Make sure that while cooking up data, your fabricated data indeed confirms to such "key restrictions". If you declare that "StudentID" is the key, then do not generate data for different students having the same ID! Notice that relations will definitely share attributes (if there are no common attributes at all in your schema, then your project is in grave danger, see the instructor immediately!), so make sure that they "agree" on the common attributes. For example, if there are two relations that share an attribute called "name", then make sure that the names do indeed tally! Do not list the name "Astaire" as "ASTAIRE" in one and "astAIRE" in another and "Astaire" in yet another because then they would be three different names! The bottom line is to be consistent. When two things are meant to be the same, make sure your data reflects this fact. If you do not ensure so at this stage, you will spend more time debugging your application than otherwise! (GIGO problems.)

What to turn in: A document that details the following:

A list of your relation schemas (your CREATE TABLE statements) reproduced from Part 4, with an explanation of any changes you made. This list serves as a reminder for me. This is the list of relations you start work with on Part 5.

A list of your SQL CREATE TABLE commands (with PRIMARY KEY, NOT NULL, DEFAULT, FOREIGN KEY, and CHECK constraints). This is the list of create table statements your group creates as part of your work on Part 5; it is not the same as the list required by #1. (See the recommendation below.)

For each table in your database, a screenshot of the output from the MariaDB "explain" command AND, separately, a screenshot of the output from the query "select count(*) from mytablename;" Make certain the screenshots clearly show the number of tuples for the relation.
Important: "explain tablename" and "select count(*) from mytablename;" are two different commands. Do not "explain select count(*) ..."

Samples of INSERT, DELETE, and UPDATE commands that you tried out on your relations and that worked, plus your observations.
The source of your data. Examples of possible answers are "We got it from this web address (http://where.we.found.data/)", "We got this from this reference book on apartment homes", "We imported it from this document", or "We cooked it up". Whatever your answer, write a short paragraph on how you transformed the data from its native form to the form required by MariaDB. If you wrote a special purpose program or script (in Perl, PHP, C or some other language), include that script. Explain how you took special care to make sure that key restrictions and common attribute constraints are not violated. Explain why some relations have only a few tuples (if applicable).

Sample of tuples for each relation (about 6--10 tuples each), just to give me an idea of how they look. (Screenshots, or copy and paste of text output from command line MariaDB interface, strongly preferred.)
- Note: "select * from Sailors" prints all tuples in the Sailors table. "select * from Sailors limit 10" prints 10 tuples from the Sailors table (the DBMS picks which 10).

Do not forget to include all project member names (and email addresses, and member contributions) with what you turn in.

What NOT to turn in: Pages and pages of raw data or printouts of all the data in your database. There is no way I can read all that.

Recommendations:

This might seem obvious but it is surprising the number of people who miss it. Please SAVE a copy of your SQL statements (and data) on some disk or flash drive or somewhere.
(Every good database person knows the value of backups!)

To export from phpMyAdmin: Go to the 'Export' menu at the top of the database page. Click 'Custom.' UNclick 'Data.' (Export the structure of your database, not all the tuples.) Click 'Go.'

Useful links:

MariaDB's auto_increment -- for setting up a table so it will create unique numerical ID values automatically
test_student.sql -- sample, plain text script file that creates a table and loads data from a plain text file into the new table
USA Gov't data
Generatedata.com: web site that provides random data generation services
Mockaroo: web site that provides random data generation services