CS 4250: Concurrency Control and Recovery Homework (4)

Email due at midnight on Tuesday, May 17, 2016. (Strongly prefer plain text. Will accept PDF or Word.) Email subject should be "cs4250,hwk4".

This is an individual assignment. All work must be your own. You should not look at any other student's work (in whole or in part, on paper or on screen), nor allow anyone else to look at yours, during the course of this assignment.

Note: T<number> identifies a transaction numbered number. R(<letter>) identifies a read operation on database object letter. W(<letter>) identifies a write operation on database object letter.
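The notation above is regular enough to be checked mechanically. The following is a minimal sketch, not a required method for this assignment, of one common test: build a precedence graph and look for a cycle. It assumes that "serializable" is read as conflict serializable (a sufficient but not necessary condition for view serializability). The function names and the two sample schedules are hypothetical and are not taken from the questions below.

```python
import re
from collections import defaultdict

def parse_schedule(text):
    """Parse 'T1: R(A), T2: W(B)' into [(txn, op, obj), ...]."""
    return [(int(t), op, obj)
            for t, op, obj in re.findall(r"T(\d+):\s*([RW])\((\w+)\)", text)]

def precedence_graph(actions):
    """Add edge Ti -> Tj when an action of Ti conflicts with a later action of Tj."""
    edges = defaultdict(set)
    for i, (ti, op_i, obj_i) in enumerate(actions):
        for tj, op_j, obj_j in actions[i + 1:]:
            # Two actions conflict if they come from different transactions,
            # touch the same object, and at least one of them is a write.
            if ti != tj and obj_i == obj_j and "W" in (op_i, op_j):
                edges[ti].add(tj)
    return edges

def serial_order(actions):
    """Return a topological order of the transactions, or None on a cycle."""
    edges = precedence_graph(actions)
    txns = sorted({t for t, _, _ in actions})
    indegree = {t: 0 for t in txns}
    for src in list(edges):
        for dst in edges[src]:
            indegree[dst] += 1
    ready = [t for t in txns if indegree[t] == 0]
    order = []
    while ready:
        t = ready.pop(0)
        order.append(t)
        for dst in sorted(edges[t]):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    return order if len(order) == len(txns) else None

# Hypothetical schedules, for illustration only.
acyclic = parse_schedule("T1: R(X), T1: W(X), T2: R(X), T2: W(Y)")
cyclic = parse_schedule("T1: R(X), T2: W(X), T1: W(X)")
print(serial_order(acyclic))  # [1, 2]
print(serial_order(cyclic))   # None
```

When the graph is acyclic, any topological order of it is an equivalent serial order; when it has a cycle, the edges on that cycle are the precise reason the schedule is not conflict serializable.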

Questions

  1. Consider the following schedule:

    T1: R(A), T1: W(A), T2: R(B), T3: R(A), T1: R(A), T3: W(B), T2: R(A), T2: R(C)

    Is the schedule serializable? If so, show an equivalent serial transaction order. If not, precisely describe why not.

    If relevant, fill in this table with the equivalent serial transaction order. Time proceeds from left to right, with only one action possible in each time slot.
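    | Serializable Schedule | Time 1 | Time 2 | Time 3 | Time 4 | Time 5 | Time 6 | Time 7 | Time 8 | Time 9 | Time 10 | Time 11 | Time 12 | Time 13 | Time 14 | Time 15 |
    | T1                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |
    | T2                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |
    | T3                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |
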
  2. Consider the following schedule:

    T2: R(A), T2: R(B), T2: R(C), T3: W(C), T1: R(B), T1: R(C), T1: W(B), T3: R(A), T3: R(B), T2: W(D)

    Is the schedule serializable? If so, show an equivalent serial transaction order. If not, precisely describe why not.

    If relevant, fill in this table with the equivalent serial transaction order. Time proceeds from left to right, with only one action possible in each time slot.
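    | Serializable Schedule | Time 1 | Time 2 | Time 3 | Time 4 | Time 5 | Time 6 | Time 7 | Time 8 | Time 9 | Time 10 | Time 11 | Time 12 | Time 13 | Time 14 | Time 15 |
    | T1                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |
    | T2                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |
    | T3                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |
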
  3. For questions 4-6, consider the execution of the ARIES recovery algorithm given the following log:

    LSN Log Record
    00 begin_checkpoint
    05 end_checkpoint
    10 Update: T1 writes P1
    20 Update: T1 writes P4
    30 Update: T3 writes P2
    40 Update: T4 writes P4
    50 T3 abort
    60 Update: T2 writes P3
    70 T2 commit
    80 Update: T1 writes P3
    90 T2 end
    X - crash, restart

    For the questions below, when you are asked which log records are read, you are to supply the exact list of LSNs from log above. When data pages are asked for, you are to supply the exact list of page identifiers from the log above. And so on. Be specific and concrete in your answers, answering specifically for the provided log.
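    As a reminder of how the Analysis pass builds its two tables, here is a minimal sketch under stated assumptions: the checkpoint is taken to hold empty tables, each log record is modeled as a hypothetical (lsn, kind, txn, page) tuple, and transaction status is either "U" (to be undone) or "C" (committed). The log in the example is invented for illustration and is not the log above.

```python
def analysis(log):
    """Scan the log forward from the checkpoint; return (transaction table, DPT)."""
    txn_table = {}   # txn -> {"lastLSN": lsn, "status": "U" or "C"}
    dpt = {}         # page -> recLSN (LSN of the first record that dirtied the page)
    for lsn, kind, txn, page in log:
        if kind in ("begin_checkpoint", "end_checkpoint"):
            continue  # assumption: the checkpoint carries empty tables
        if kind == "end":
            txn_table.pop(txn, None)  # finished transactions leave the table
            continue
        entry = txn_table.setdefault(txn, {"lastLSN": lsn, "status": "U"})
        entry["lastLSN"] = lsn
        if kind == "commit":
            entry["status"] = "C"
        # Updates and CLRs are redoable, so they may add a page to the DPT.
        if kind in ("update", "clr") and page not in dpt:
            dpt[page] = lsn
    return txn_table, dpt

# Hypothetical log, for illustration only.
sample_log = [
    (0, "begin_checkpoint", None, None),
    (5, "end_checkpoint", None, None),
    (10, "update", "T7", "P9"),
    (20, "update", "T8", "P9"),
    (30, "commit", "T8", None),
    (40, "end", "T8", None),
    (50, "update", "T7", "P6"),
]
txn_table, dpt = analysis(sample_log)
print(txn_table)  # {'T7': {'lastLSN': 50, 'status': 'U'}}
print(dpt)        # {'P9': 10, 'P6': 50}
```

    In this sketch, Redo would start at the smallest recLSN in the DPT, and Undo would work backward from the lastLSN of every transaction still marked "U".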

  4. During Analysis: a) What log records are read? b) What are the contents of the Dirty Page Table (DPT) and the transaction table at the end of the analysis stage?

  5. During Redo: a) What log records are read? b) What data pages are read? c) What operations are redone? (Assume no updates made it out to disk before the crash, except updates written to disk as part of a transaction commit.)

  6. During Undo: a) What log records are read? b) What operations are undone?
  7. For questions 8-10, consider the execution of the ARIES recovery algorithm given the following log:

    LSN Log Record
    00 begin_checkpoint
    05 end_checkpoint
    10 Update: T2 writes P3
    20 Update: T1 writes P4
    30 Update: T3 writes P2
    40 Update: T4 writes P3
    50 T3 abort
    60 CLR: Undo T3 LSN 30
    70 Update: T1 writes P3
    80 T4 commit
    90 T3 end
    X - crash, restart

    For the questions below, when you are asked which log records are read, you are to supply the exact list of LSNs from log above. When data pages are asked for, you are to supply the exact list of page identifiers from the log above. And so on. Be specific and concrete in your answers, answering specifically for the provided log.

  8. During Analysis: a) What log records are read? b) What are the contents of the Dirty Page Table (DPT) and the transaction table at the end of the analysis stage?

  9. During Redo: a) What log records are read? b) What data pages are read? c) What operations are redone? (Assume no updates made it out to disk before the crash, except updates written to disk as part of a transaction commit.)

  10. During Undo: a) What log records are read? b) What operations are undone?
  11. Added to Homework 4 on May 9

  12. Suppose you are given the relation magazines(mid: integer, mtitle: string, mtelephone: char(10), currentVolume: integer, currentNumber: integer, color: bit). The relation is stored in a file sorted by currentVolume. Each hard disk page on the (very small) hard disk can store up to four tuples. (So tuples 1, 2, and so on are on disk page 1, but tuple 5 is on page 2.) Below is part of an instance of the relation:
    | mid | mtitle                    | mtelephone | currentVolume | currentNumber | color |
    | 314 | IEEE Spectrum             | 8003333333 | 54            | 4             | 1     |
    | 345 | Communications of the ACM | 8001111111 | 58            | 5             | 1     |
    | 101 | National Geographic       | 8002222222 | 227           | 5             | 1     |
    | 159 | Magazine AAA              | 8004444444 | 10            | 12            | 0     |
    | 265 | Magazine BBB              | 8006666666 | 42            | 2             | 0     |
    | 358 | Magazine CCC              | 8006666666 | 39            | 10            | 1     |
    1. Explain what the data entries in each of the following indexes would contain. If such an index can be constructed, provide at least one sample data entry based on the table instance above. If such an index cannot be constructed, say so and explain why.

      1. An unclustered index on currentVolume using Alternative (1).
      2. An unclustered index on currentVolume using Alternative (2).
      3. A clustered index on currentVolume using Alternative (1).
      4. A clustered index on currentVolume using Alternative (3).
      5. An unclustered index on mtelephone using Alternative (2).
      6. An unclustered index on mtelephone using Alternative (3).
      7. A clustered index on mtelephone using Alternative (1).
      8. A clustered index on mtelephone using Alternative (2).
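    For reference, the shape of a data entry under each alternative can be sketched as follows. This assumes the usual textbook numbering: Alternative (1) stores the full data record, Alternative (2) stores a (key, rid) pair, and Alternative (3) stores a (key, rid-list) pair. The rows, the rid layout (page number, slot number), and the function names are hypothetical; only the four-tuples-per-page figure comes from the question.

```python
from collections import defaultdict

# Hypothetical rows (search key, payload) in file order; not the magazines data.
rows = [(7, "a"), (7, "b"), (9, "c"), (12, "d"), (12, "e")]
TUPLES_PER_PAGE = 4  # from the question: each page stores up to four tuples

def rid(position):
    """Record id as (page number, slot number), both 1-based."""
    return (position // TUPLES_PER_PAGE + 1, position % TUPLES_PER_PAGE + 1)

def alternative_1(rows):
    """Data entries are the data records themselves, kept in key order."""
    return sorted(rows)

def alternative_2(rows):
    """One (key, rid) entry per record."""
    return sorted((key, rid(i)) for i, (key, _) in enumerate(rows))

def alternative_3(rows):
    """One (key, rid-list) entry per distinct key value."""
    groups = defaultdict(list)
    for i, (key, _) in enumerate(rows):
        groups[key].append(rid(i))
    return sorted(groups.items())

print(alternative_2(rows))
# [(7, (1, 1)), (7, (1, 2)), (9, (1, 3)), (12, (1, 4)), (12, (2, 1))]
print(alternative_3(rows))
# [(7, [(1, 1), (1, 2)]), (9, [(1, 3)]), (12, [(1, 4), (2, 1)])]
```

    Because Alternative (1) entries are the records, an index of that kind fixes the physical order of the file, which is worth keeping in mind when deciding whether a given combination can be constructed.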
    2. Consider the relation above, but suppose the sample tuples above are simply a small piece of a very large table. currentVolume values range from 0 to 300, currentNumber ranges from 0 to 20, color can assume only two values (0 and 1), and mtelephone numbers range over the full set of possible combinations with few duplicate values in the table. You may assume uniform distributions of values. For each of the following indexes, would it speed up this query? Answer yes or no and explain why.

      SELECT currentVolume, currentNumber FROM magazines where currentNumber = 250 and currentVolume < 20; (original SQL, with typo)

      SELECT currentVolume, currentNumber FROM magazines where currentNumber = 20 and currentVolume < 20; (new SQL, without typo, 5/17)

      1. Clustered hash index on (currentVolume, currentNumber) fields of magazine
      2. Unclustered hash index on (currentVolume) field of magazine
      3. Clustered tree index on (currentVolume) field of magazine
      4. Clustered tree index on (currentVolume, currentNumber) fields of magazine
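    When reasoning about these indexes, it can help to estimate how selective each predicate of the corrected query is. The sketch below assumes integer-valued attributes over the stated ranges, the uniform distributions the question allows, and independence between the two attributes (an added assumption). The helper name is hypothetical.

```python
from fractions import Fraction

def uniform_selectivity(matching_values, low, high):
    """Fraction of tuples matched when values are uniform over low..high inclusive."""
    return Fraction(matching_values, high - low + 1)

# currentVolume < 20 matches the 20 values 0..19 out of 0..300.
volume_sel = uniform_selectivity(20, 0, 300)
# currentNumber = 20 matches exactly one value out of 0..20.
number_sel = uniform_selectivity(1, 0, 20)
# Assumption: the two attributes are independent.
combined_sel = volume_sel * number_sel

print(volume_sel)    # 20/301
print(number_sel)    # 1/21
print(combined_sel)  # 20/6321
```

    Selectivity alone does not settle each answer: whether an index can evaluate a range predicate at all, and whether matching tuples sit on adjacent pages, also matter.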