CS 4250: Database Management Systems

Fall 2017 - Homework 4

Concurrency Control and Recovery Homework

Email due at midnight on Tuesday, December 12, 2017. (Strongly prefer plain text. Will accept PDF or Word.) Email subject should be "cs4250,hwk4".

This is an individual assignment. All work must be your own. You should not look at any other student's work (in whole or in part, on paper or on screen), nor allow anyone else to look at yours, during the course of this assignment.

Note: T<number> identifies a transaction numbered number. R(<letter>) identifies a read operation on database object letter. W(<letter>) identifies a write operation on database object letter.

Questions

  1. Consider the following schedule:

    T2: R(B), T2: W(B), T1: R(A), T3: R(C), T1: R(B), T3: W(C), T3: R(B), T1: R(C), T2: R(A), T3: R(A), T3: W(A)

    Is the schedule serializable? If so, show an equivalent serial transaction order. If not, precisely describe why not.

    If relevant, fill in this table with the equivalent serial transaction order. Time proceeds from left to right, with only one action possible in each time slot.
    Serializable Schedule Time 1 Time 2 Time 3 Time 4 Time 5 Time 6 Time 7 Time 8 Time 9 Time 10 Time 11 Time 12 Time 13 Time 14 Time 15
    T1














    T2














    T3














  2. Consider the following schedule:

    T2: R(B), T2: R(C), T2: W(C), T3: R(C), T1: R(B), T3: R(A), T3: W(A), T2: R(A), T3: R(A), T1: R(A), T1: W(A) , T3: R(B)

    Is the schedule serializable? If so, show an equivalent serial transaction order. If not, precisely describe why not.

    If relevant, fill in this table with the equivalent serial transaction order. Time proceeds from left to right, with only one action possible in each time slot.
    Serializable Schedule Time 1 Time 2 Time 3 Time 4 Time 5 Time 6 Time 7 Time 8 Time 9 Time 10 Time 11 Time 12 Time 13 Time 14 Time 15
    T1














    T2














    T3














  3. Consider the following schedule:

    T3: R(A), T3: R(B), T2: R(B), T1: R(C), T3: R(B), T1: R(C), T3: W(C), T2: R(C), T3: R(A), T3: R(B), T1: R(A), T1: R(B), T2: R(C), T2: W(C)

    Is the schedule serializable? If so, show an equivalent serial transaction order. If not, precisely describe why not.

    If relevant, fill in this table with the equivalent serial transaction order. Time proceeds from left to right, with only one action possible in each time slot.
    Serializable Schedule Time 1 Time 2 Time 3 Time 4 Time 5 Time 6 Time 7 Time 8 Time 9 Time 10 Time 11 Time 12 Time 13 Time 14 Time 15
    T1














    T2














    T3














  4. For questions 4-6, consider the execution of the ARIES recovery algorithm given the following log:

    LSN Log Record
    00 begin_checkpoint
    05 end_checkpoint
    10 Update: T1 writes P3
    20 Update: T2 writes P2
    30 T1 abort
    40 Update: T3 writes P1
    50 Update: T4 writes P2
    60 CLR: Undo T1 LSN 10
    70 T4 commit
    80 Update: T3 writes P1
    90 T1 end
    X - crash, restart

    For the questions below, when you are asked which log records are read, you are to supply the exact list of LSNs from log above. When data pages are asked for, you are to supply the exact list of page identifiers from the log above. And so on. Be specific and concrete in your answers, answering specifically for the provided log.

    Generic statements or quotations from the textbook will earn 0 points. Example: "all the pages" == 0 points. "P1, P2 and P3" == more than 0 points, assuming those are the pages read.

  5. During Analysis: a) What log records are read? b) What are the contents of the Dirty Page Table (DPT) and the transaction table at the end of the analysis stage?

  6. During Redo: a) What log records are read? b) What data pages are read? c) What operations are redone?  (Assume no updates made it out to disk before the crash, except updates written to disk as part of a transaction commit.)

  7. During Undo: a) What log records are read? b) What operations are undone?
  8. Consider these tables (from Homework 2).
    Patient (pid: integer, lname: string, fname: string, primary_did: integer, age: integer)
    Doctor (did: integer, lname: string, fname: string, deptid: integer, age: integer)
    Assume the Patient table has 10,000 tuples in it and the Doctor table has 500 tuples in it.
    Also consider this question:
    Find the patient identifiers and last names, and the doctor identifiers and last names, of patients and their assigned primary care doctors, where the doctors work for the department with identifier 11.
    One possible solution to the question is:
    SELECT P.pid, P.lname, D.did, D.lname FROM Patient P, Doctor D WHERE P.primary_did = D.did AND D.deptid = 11;

    a) Translate this SQL query into a corresponding relational algebra expression (and write down the expression). Then draw the query tree for your relational algebra expression.
    b) Using equivalencies of relational algebra expressions, re-write your relational algebra expression into a more efficient form, if possible. Draw the new, corresponding query tree. Explain why your new expression is more efficient.