Consider a database containing information about supercomputers.
Create entities and relationships as appropriate for each (independent) situation described below.
Information that is true for all of the following ER diagrams:
Each supercomputer has a unique identifying number, a name, a number of cores,
and a peak TFlop/second rate. (TFlop = teraflop. 1,000,000,000,000 floating point
operations per second. Yes, that's 1012. A "core" is a CPU... sort of.
Close enough for this homework.)
For each of the following situations, draw an ER diagram that describes it.
Consider each situation independently. (Multiple situations means multiple, separate ER diagrams in a correct homework solution.)
Information in parentheses is simply information. Information in parentheses does
not affect the ER diagrams, but may help you understand the real world context of the
question.
- Each supercomputer runs on one operating system. An operating system is
required by each supercomputer, and the database stores the name of the operating
system and version number (together, those can identify the operating system).
The database also stores the name of the organization that maintains the operating
system and the URL of the operating system documentation.
-
The database stores information about supercomputers.
(See above for more details.) Each supercomputer is located at one physical
site. A supercomputer site can be identified by its address (street, city, state,
and country). Each supercomputer site also has a name, and a funding agency.
(For example, the Sierra supercomputer
is at Lawrence Livermore National Laboratory,
7000 East Ave, Livermore, CA, USA. LLNL is funded by the DOE, Department of Energy.)
Note that physical sites can have more than one supercomputer. (LLNL also hosts the
Lassen supercomputer.)
-
The database stores information about performance test results for each supercomputer.
A supercomputer must have test results to be in this database. (The performance
tests are how peak supercomputer speed is officially measured.) Test results for a
particular computer can be identified using the name of the test software and the date the
test was run. However, there is no identifier that can be used universally to
identify a test result overall, since many supercomputers run the standard tests
and might run them on the same day.
For each performance test, the database stores the name of the test software,
the date of the test, the peak speed achieved, and the power consumed (kilowatts)
by the supercomputer during the test run.
(Note: I may have "fudged" the truth a bit, on this question, to make a good homework question. Draw the ER diagram for the situation described.)
-
The database stores information about supercomputer rankings on the Top500 List. Each Top500 list can be uniquely identified using the month
and year it was released, and for each list we also store a description of the highlights.
For each pair of supercomputer and Top500 list, the database stores that
computer's ranking in that list. (For example, the Sierra supercomputer was #2
on the June 2019 Top500 list, #3 on the June 2020 list, and #3 on the June 2021 list.)
Each list contains dozens of supercomputers.
- Some of the physical locations of supercomputers are universities, some are
national laboratories (research centers) and some are corporations. No location
can be more than one of those types of location at once.
A supercomputer site can be identified by its address (street, city, state,
and country). Each supercomputer site also has a name.
The database also stores university student populations (number of students)
and logos, and stores national laboratory funding agency names and annual budgets,
and stores corporation URLs and annual net incomes.
Suppose a database has the following three relations.
Museum (mid: integer, name: string, address: string, website: string)
Displays (museumID: integer, artID: integer, startDate: date)
ArtItem (aid: integer, title: string, weightPnd: integer, type: string)
ExhibitsIn (artID: integer, exhibitID: integer, insuranceCost: integer, deliveryDate: date, homeMuseum: integer)
Exhibition (eID: integer, title: string, startdate: date, enddate: date, specialFee: integer)
(The "bit" data type stores 0 or 1. False or true. Booleans.)
"museumID" in Displays is a foreign key referencing "mid" in Museum.
"exhibitID" in ExhibitsIn is a foreign key referencing "eid" in Exhibition.
"artID" in Displays and ExhibitsIn are foreign keys referencing "aid" in ArtItem. The
art item weights are measured in pounds.
You may assume simple boolean comparisons (<, >, ≠) work on dates.
If you are unfamiliar with art museums, they will frequently organize special
exhibitions. Museums will borrow art from other museums for a few months,
art that fits an exhibition theme, and charge a small extra fee for admission
to the exhibition. For examples, here
are some past Asian Art Museum of San Francisco exhibitions.
Art items must, of course, be insured appropriately before they leave their
home museum and are shipped
to the exhibition host museum, in case of loss or damage in transit.
Write the following queries in relational algebra.
- Find the exhibition titles and start dates for exhibitions that start after December 2021.
- Find the art item titles and weights for art items displayed in
a museum starting before January 1980.
- Find the art item identifiers and titles, and exhibition identifiers.
The art items are of type "painting" and must cost more than $5,000 to insure
for their participation in the exhibition.
- Find the names and web sites of museums that display an art item
titled "Rosetta Stone".
- Find the titles and start dates of exhibitions that exhibit an art item titled
"The Great Wave off Kanagawa" or an art item titled "Mask of Tutankhamun".
- Find the titles and start dates of exhibitions that exhibit an art item titled "Mona Lisa" and also an art item titled
"Mask of Tutankhamun".
(Such an exhibition would never occur. The insurance costs for shipping two such
valuable works of art would be astronomical.)