CS 4250: Introduction to Database Management Systems

Project Overview


General Description

The goal of the class project is to implement a database system application. The project consists of several activities spread over the entire semester. The end result should be a functioning application that runs on the Web and uses your database to provide useful functionality.

Each project should be done by a group of two to four students (ideally, three). You are free to choose your own group members; if you would like the instructor to assign you to a group, send an email. Each project activity will be a separate assignment, and you will get detailed instructions with each one. Each group should turn in a single solution to each assignment. Every member of the group will get the same grade.

N.B.: Project grades will include a peer evaluation component. For every component of the project, each student will be required to specify what they and each of the other members of their group contributed to the group work.


Project Ideas

These ideas are just samples; you are free to propose your own. The descriptions below are not complete: you will need to develop your chosen idea more concretely and in more detail. Do not be intimidated by the examples that are linked from this web page; they are meant to give you a feel for the application domain. It is up to you to narrowly define the scope of the application so that it fits within a semester-long project. Do not forget that you are supposed to have fun!

  1. Nobel Awards Database: The goal is to model and populate information about the awards made in the various fields (Physics, Chemistry, Physiology or Medicine, Literature, Peace, and the Economic Sciences), the recipients, their countries, their years of birth, etc. Your system should be able to answer questions such as "When was the first time an Asian won an award for the economic sciences?" (the answer to this particular question is 1998). The Nobel Foundation maintains such an interface. You could also work on variants of this idea, such as the recipients of the ACM awards. Interesting queries then could be "Name people who have won at least two different awards" (the answer would include Knuth, Thompson, Ritchie, Engelbart, etc.) or "Name people who were ACM Fellows before becoming Turing Award winners", and so on.
  2. Books Database: This domain is another popular one. Just look at barnesandnoble.com or amazon.com for excellent examples. You could model entities such as books, their authors, and their topics (which may form a complex hierarchy). You may also model various attributes of the authors and the institutions they belong to. You can support a service for buying and selling used books or books used in specific university courses. Your system can build a personal profile of people (and the books they like), and your database application could form the basis for a "recommender system", such as those supported by the commercial sites. The goal here is to "cluster" similar preferences together so that the system can then make recommendations: "Since you liked Shakespeare's Romeo and Juliet, I recommend that you try Shakespeare's Antony and Cleopatra".
  3. Movies (or Television Shows) Database: There are several excellent movie resources on the web, such as the hollywood.com movies site or the Internet Movie Database. You could model entities such as movies, their actors, directors, genres, playing times, and reviews. There are several sources on the web from which you could get data to populate such a database. You can support various queries such as finding specific playing times, or finding movies playing in Turlock directed by a given director. You can also support updates to the reviews section of the database (e.g., viewers giving their own opinions). Another functionality is to provide personal profiles of people (i.e., the movies they like) and then try to recommend movies to them based on profiles of viewers with similar tastes. You could also create a database of Oscar or Golden Globe nominations and awards and answer queries such as "Find all the sitcoms that have been nominated three years in a row".
  4. Personal Photos Database: With the advent of cheap digital cameras, everybody has piles of digital photos. People need a way to organize, access, and show off their photos. Personal videos might also be included.
  5. Apartment Homes: This domain would require modeling apartments and their attributes, as well as areas of town and their various characteristics (e.g., Modesto Area Express bus lines, crime rates, distance from various landmarks). You would provide an interface for offering apartments for rent and for finding apartments based on various requirements ("gas heating + pets allowed + rent less than $800 + close to campus").
  6. Research Literature: This domain involves modeling research publications. You need to identify the title of the publication, the forum it was published in, the authors, topics, keywords, and related subtopic areas. This is a big business now (under the name of digital libraries). For example, the ACM Digital Library provides a beautiful searchable index (and retrievable repository, but that is beyond our scope) of nearly all of the publications of ACM. If you use this domain, there are a lot of available resources for you to use. The ACM Computing Classification System provides a convenient hierarchical meta-index that you can use to organize your class hierarchy, etc. If you are interested in a smaller domain, the DBLP Bibliography Site provides a searchable facility for publications related to the database and programming communities. At the end of the day, you could identify papers written by a particular person at a particular place, or ones in a narrowly defined area.
  7. Research Literature, the Dark Side, aka Retraction Watch: This domain involves modeling retracted scientific publications -- the mistakes, misunderstandings, and occasional outright fraud that can occur in any competitive publication field. The Retraction Watch web site gathers information on cases where a scientific publication was declared "unpublished" and documents why the publishers retracted the paper. According to a recent news article, the Retraction Watch web site has received a generous grant to help them create a database of scientific paper retractions. You could try to design that database.

    Or you could find another web site that gathers up socially useful information and try designing a database for their worthy cause. Possibilities include a database of adoptable stray animals, of endangered animal sightings in the wild, of homeless shelter locations, of food banks for humans, of locations of weeds in the wild that are food sources for endangered creatures (like Monarch butterflies), of museums that store functioning, old computers, ...

  8. Web Sites: How do you think web search engines such as Google model their domain? You could think of them as a glorified database system where the basic entities modeled are web sites. You could then model the various properties of a web site: topic, URL, domain name, other sites it links to, the background colour, etc. Retrieval could be for sites that have similar characteristics and properties.
  9. 'Healthy App Challenge': "The Office of the National Coordinator for Health Information Technology has partnered with the U.S. Surgeon General to launch the Healthy App Challenge, which invites developers to submit health, wellness, and fitness apps that promote nutrition and interactive health. 'The challenge will highlight a selection of mobile apps in support of the U.S. Department of Health and Human Services efforts to empower individuals to make healthy choices using electronic technology,' according to the Surgeon General's office. The government is considering apps in three categories, including fitness and physical activity apps, nutrition and healthy eating apps, and integrative health apps. The winning apps will be featured on ..." From 'Computerworld' (12/15/11), by Lucas Mearian.

    There have been other app challenges in the news recently that you could use as inspiration for your project.

  10. Others: Of course, there is a whole host of other ideas, such as bank accounts, student records, World Cup data, election results, Senate demographics, car rentals, auto insurance, consumer products, courses at MadeUp University, silly statistics, "match-making services", and so on. Use your imagination.
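
To give a feel for how one of the sample questions above might turn into a query, here is a minimal sketch of "Name people who have won at least two different awards" using Python's built-in sqlite3 module. The table names (person, award_win), the column names, and the placeholder rows are assumptions made up for illustration; they are not a required schema, and your own design will likely differ.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE award_win (
            person_id  INTEGER NOT NULL REFERENCES person(id),
            award_name TEXT NOT NULL,
            year       INTEGER NOT NULL
        );
        """
    )
    # Placeholder rows only; these are not real award data.
    conn.executemany(
        "INSERT INTO person (id, name) VALUES (?, ?)",
        [(1, "Person A"), (2, "Person B"), (3, "Person C")],
    )
    conn.executemany(
        "INSERT INTO award_win (person_id, award_name, year) VALUES (?, ?, ?)",
        [
            (1, "Award X", 2001),
            (1, "Award Y", 2005),  # Person A: two different awards
            (2, "Award X", 2003),  # Person B: one award
            (3, "Award Y", 2004),
            (3, "Award Y", 2010),  # Person C: the same award twice
        ],
    )
    return conn


def winners_of_multiple_awards(conn, min_awards=2):
    """Return names of people with at least `min_awards` distinct awards."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM person AS p
        JOIN award_win AS w ON w.person_id = p.id
        GROUP BY p.id, p.name
        HAVING COUNT(DISTINCT w.award_name) >= ?
        ORDER BY p.name
        """,
        (min_awards,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(winners_of_multiple_awards(build_sample_db()))  # prints ['Person A']
```

Counting DISTINCT award names is what separates "two different awards" from "the same award twice", which is why Person C is not returned in this sketch.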

Credit and thanks are due to Dr. Murali at Virginia Tech for inspiration and documentation.

Last modified: Jan 28, 2016