This assignment will consist of implementing your schema in MariaDB and populating the database. You can use your MariaDB account (see professor for your password) for all aspects of this assignment.
- Write an SQL database schema for your application, using
the CREATE TABLE
commands described in class. If you make any
minor changes to the relational schema from Part 4, please tell me the
rationale for your changes. (Minor changes would be those described in the list
below. A significant change - not allowed without prior approval -- is
deleting a field or table.)
- Pick suitable datatypes for each attribute. Consult the MariaDB documentation to determine the principal options that are available for types (your class notes may or may not cover the ones that you require and your class notes may or may not contain the right declaration).
- Do not forget to specify at least one key for each relation (using the PRIMARY KEY construct).
- Specify other choices for keys (candidate keys) via the UNIQUE construct.
- If some attribute should not be NULL, use NOT NULL to specify this aspect.
- Also provide DEFAULT values wherever applicable.
- Indicate any foreign key (referential integrity) constraints that
are expected to hold for that relation. Include the declaration of foreign key
constraints for any appropriate relations.
- (Note: phpMyAdmin requires specification of attribute / field names in destination of foreign key references. phpMyAdmin will produce error messages if you use the simplified convention in the book of simply specifying "references tablename" when you mean "references tablename(primary key field names)").
- Also mention any domain specific constraints that you might require of your attributes, using the CHECK construct. (Note that these may be constraints you had no way of specifying in your ER diagram!)
- Load your table definitions into the database.
Now, execute some sample INSERT, DELETE and UPDATE commands on every one of your relations. Convince yourself that things are working fine. Required: report any interesting observations that you make. (Simple observations are fine, like "I learned not to UPDATE a tuple I had not yet INSERTed.")
- Develop a substantial amount of data for your
application and load it into your relations using the bulk loading
facility provided in MariaDB: you can use either the
LOAD DATA INFILE
syntax or the
mariadb-import
program.
We require data in the order of 40-50 tuples, or more, for each relation in your application. To create the data, either get it from some web source or information repository (making sure that it is public -- legal to use for educational purposes -- first, of course) or write a program in any scripting/programming language (like PHP, Java, or Python) that can create large files of records in a format acceptable to the bulk loader. Either way, realize that you may need to transform data from one form to another to be acceptable for use in MariaDB.
In very rare situations, one of your relations might require only 4--5 tuples. For example, if you are modeling international airplane flights, there are only seven continents on the planet Earth, in which case it is okay to have just a few tuples in that particular relation. In such cases, where you have only a few tuples, state explicitly why you have only a few tuples. (Strongly recommended: Speak to the professor and get permission for fewer than 40 tuples. During most semesters, no student group encounters a legitimate reason to have a small table in their database.) Precautions to Take if you "Cook up" your Data: In parts (1) and (2), you have have specified key(s) (and possibly foreign keys) for each relation. Make sure that while cooking up data, your fabricated data indeed confirms to such "key restrictions". If you declare that "StudentID" is the key, then do not generate data for different students having the same ID! Notice that relations will definitely share attributes (if there are no common attributes at all in your schema, then your project is in grave danger, see the instructor immediately!), so make sure that they "agree" on the common attributes. For example, if there are two relations that share an attribute called "name", then make sure that the names do indeed tally! Do not list the name "Astaire" as "ASTAIRE" in one and "astAIRE" in another and "Astaire" in yet another because then they would be three different names! The bottom line is to be consistent. When two things are meant to be the same, make sure your data reflects this fact. If you do not ensure so at this stage, you will spend more time debugging your application than otherwise! (GIGO problems.)
- A list of your relation schemas (your CREATE TABLE statements) reproduced from Part 4, with an explanation of any changes you made. This list serves as a reminder for me. This is the list of relations you start work with on Part 5.
- A list of your SQL CREATE TABLE commands (with PRIMARY KEY, NOT NULL, DEFAULT, FOREIGN KEY, and CHECK constraints). This is the list of create table statements your group creates as part of your work on Part 5; it is not the same as the list required by #1. Create this list by exporting the database schema directly from MariaDB; see the recommendation below for how to export a schema.
- For each table in your database, a screenshot of the output from the
MariaDB "explain"
command AND, separately, a screenshot of the output from the query "select count(*) from mytablename;" Make certain
the screenshots clearly show the number of tuples for the relation.
Important: "explain tablename" and "select count(*) from mytablename;" are two different commands. Do not "explain select count(*) ..." - Samples of INSERT, DELETE, and UPDATE commands that you tried out on a few of your relations and that worked, plus your observations. (A few samples over your whole database is sufficient; samples for each individual table are not necessary.)
- The source of your data. Examples of possible answers are "We got it from this web address (http://where.we.found.data/)", "We got this from this reference book on apartment homes", "We imported it from this document", or "We cooked it up". Whatever your answer, write a short paragraph on how you transformed the data from its native form to the form required by MariaDB. If you wrote a special purpose program or script (in Python, PHP, or some other language), include that script. Explain how you took special care to make sure that key restrictions and common attribute constraints are not violated. Explain why some relations have only a few tuples (if applicable).
- Sample of tuples for each relation (about 6--10 tuples each), just to give
me an idea of how they look. (Screenshots, or copy and paste of text output
from command line MariaDB interface, strongly preferred.)
- Note: "select * from Sailors" prints all tuples in the Sailors table. "select * from Sailors limit 10" prints 10 tuples from the Sailors table (the DBMS picks which 10).
- Do not forget to include all project member names (and email addresses, and member contributions) with what you turn in.
What NOT to turn in:
- A URL to some cloud service collection of documents. Everything I need to grade Part 5 must be included in what you upload to Canvas.
- Pages and pages of raw data or printouts of all the data in your database. There is no way I can read all that.
Recommendations:
- To export the schema from phpMyAdmin: Go to the 'Export' menu at the top of the database page. Click 'Custom.' Change 'Dump all rows' to 'Dump some row(s)' and leave the number of rows set to 0. (Unclick 'dump all rows'.) Scroll to the bottom and click 'Export.'
- This might seem obvious but ...
Please SAVE a copy of your SQL statements
(and data) on some disk or flash drive or somewhere.
(Every good database person knows the value of backups!)
Useful links:
- MariaDB's auto_increment -- for setting up a table so it will create unique numerical ID values automatically
- test_student.sql -- sample, plain text script file that creates a table and loads data from a plain text file into the new table
- Generatedata.com: web site that provides random data generation services
- Mockaroo: web site that provides random data generation services
- USA Gov't data - Data in comma-separated-values (CSV) format is ready to load into a database. One row per tuple, each field separated from neighbors with commas. If you want to strip out a field, you can write a simple program to count commas and delete the unwanted field. (Or, if the data file is small enough, load the data directly into MS Excel, delete the unwanted column, and export back into a new CSV file.)