CS 4250: Concurrency Control and Recovery Homework (4)

Email due at midnight on Friday, May 15, 2015. (Strongly prefer plain text. Will accept PDF or Word.) Email subject should be "cs4250,hwk4".

This assignment is to be completed individually.

Note: T<number> identifies the transaction with that number. R(<letter>) identifies a read operation on the database object named by that letter. W(<letter>) identifies a write operation on the database object named by that letter.
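As an informal illustration of this notation, the sketch below turns a schedule string into (transaction, operation, object) tuples. The helper name parse_schedule and the example string are assumptions made for illustration only; they are not part of the assignment.

```python
import re

# Hypothetical helper: matches actions such as "T1: R(A)" or "T4:W(B)".
ACTION = re.compile(r"T(\d+)\s*:\s*([RW])\((\w+)\)")

def parse_schedule(text):
    """Return the actions of a schedule, in order, as (txn, op, obj) tuples."""
    return [(int(txn), op, obj) for txn, op, obj in ACTION.findall(text)]

# Made-up example string, not one of the homework schedules.
example = "T1: R(A), T2: W(A), T1:W(B)"
print(parse_schedule(example))
# [(1, 'R', 'A'), (2, 'W', 'A'), (1, 'W', 'B')]
```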

Questions

  1. Consider the following schedule:

    T1: R(A), T1: W(A), T2: R(A), T3: R(B), T4: R(C), T3: W(B), T2: W(A), T1: R(D), T4: R(B), T4: R(D), T3: R(C), T3: W(C)

    Is the schedule serializable? If so, show an equivalent serial transaction order. If not, precisely describe why not.

    If relevant, fill in this table with the equivalent serial transaction order. Time proceeds from left to right, with only one action possible in each time slot.
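    Serializable Schedule | Time 1 | Time 2 | Time 3 | Time 4 | Time 5 | Time 6 | Time 7 | Time 8 | Time 9 | Time 10 | Time 11 | Time 12 | Time 13 | Time 14 | Time 15 | Time 16 |
    T1                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |         |
    T2                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |         |
    T3                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |         |
    T4                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |         |
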
  2. Consider the following schedule:

    T3: R(A), T2: R(B), T3: R(B), T3: R(C), T2: W(B), T4: R(B), T3: R(D), T4: R(A), T2: R(C), T4: W(B), T1: R(B), T1: R(D), T4: W(A), T1: W(D)

    Is the schedule serializable? If so, show an equivalent serial transaction order. If not, precisely describe why not.

    If relevant, fill in this table with the equivalent serial transaction order. Time proceeds from left to right, with only one action possible in each time slot.
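    Serializable Schedule | Time 1 | Time 2 | Time 3 | Time 4 | Time 5 | Time 6 | Time 7 | Time 8 | Time 9 | Time 10 | Time 11 | Time 12 | Time 13 | Time 14 | Time 15 | Time 16 |
    T1                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |         |
    T2                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |         |
    T3                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |         |
    T4                    |        |        |        |        |        |        |        |        |        |         |         |         |         |         |         |         |
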
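One common way to approach the two schedule questions above is a precedence-graph test for conflict serializability. The sketch below is a minimal version of that test, assuming "serializable" is read as conflict-serializable (view serializability is not covered). The function names and the two toy schedules are hypothetical and deliberately unrelated to the homework schedules.

```python
from collections import defaultdict

def precedence_edges(schedule):
    """Edges (Ti, Tj): an action of Ti conflicts with a later action of Tj."""
    edges = set()
    for i, (t1, op1, obj1) in enumerate(schedule):
        for t2, op2, obj2 in schedule[i + 1:]:
            # Two actions conflict if they come from different transactions,
            # touch the same object, and at least one of them is a write.
            if t1 != t2 and obj1 == obj2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges

def serial_order(schedule):
    """Return one equivalent serial order, or None if the graph has a cycle."""
    txns = {t for t, _, _ in schedule}
    indegree = {t: 0 for t in txns}
    successors = defaultdict(list)
    for a, b in precedence_edges(schedule):
        successors[a].append(b)
        indegree[b] += 1
    # Topological sort: repeatedly emit a transaction with no pending predecessor.
    ready = sorted(t for t in txns if indegree[t] == 0)
    order = []
    while ready:
        t = ready.pop(0)
        order.append(t)
        for nxt in sorted(successors[t]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order if len(order) == len(txns) else None

# Toy schedules as (txn, op, obj) tuples; both are made up for illustration.
toy_ok = [(1, "R", "X"), (1, "W", "X"), (2, "R", "X"), (2, "W", "Y")]
toy_bad = [(1, "R", "X"), (2, "W", "X"), (1, "W", "X")]
print(serial_order(toy_ok))   # [1, 2]
print(serial_order(toy_bad))  # None
```

When the function returns None, the cycle in the precedence graph is what a written answer would need to describe.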
  3. For questions 4-6, consider the execution of the ARIES recovery algorithm given the following log. Assume a checkpoint is completed just before LSN 10 and that the Dirty Page Table (DPT) and the transaction table for that checkpoint are empty:

    LSN | Log Record
    10  | Update: T1 writes P1
    20  | Update: T2 writes P4
    30  | Update: T3 writes P2
    40  | T3 commit
    50  | Update: T2 writes P3
    60  | T2 abort
    70  | Update: T1 writes P3
    80  | T3 end

    X - crash, restart

  4. During Analysis: a) What log records are read? b) What are the contents of the Dirty Page Table (DPT) and the transaction table at the end of the analysis stage?

  5. During Redo: a) What log records are read? b) What data pages are read? c) What operations are redone?  (Assume no updates made it out to disk before the crash, except updates written to disk as part of a transaction commit.)

  6. During Undo: a) What log records are read? b) What operations are undone?
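For orientation on the Analysis phase, the sketch below shows the bookkeeping as it is usually described for ARIES: a forward scan from the checkpoint that rebuilds the Dirty Page Table and the transaction table. It assumes an empty checkpoint and only four record kinds (update, commit, abort, end). The record layout, the status names, and the toy log are hypothetical, and the toy log is intentionally different from the homework log.

```python
def analysis(log):
    """Forward scan from an empty checkpoint; returns (dirty_page_table, txn_table)."""
    dpt = {}   # page -> recLSN (LSN of the first record that dirtied the page)
    txns = {}  # txn -> {"lastLSN": ..., "status": ...}
    for lsn, kind, txn, page in log:
        if kind == "end":
            # An end record removes the transaction from the table.
            txns.pop(txn, None)
            continue
        entry = txns.setdefault(txn, {"lastLSN": lsn, "status": "running"})
        entry["lastLSN"] = lsn
        if kind == "update":
            # Only the first update to a page sets its recLSN.
            dpt.setdefault(page, lsn)
        elif kind == "commit":
            entry["status"] = "committing"
        elif kind == "abort":
            entry["status"] = "aborting"
    return dpt, txns

# Made-up log of (LSN, kind, transaction, page) records.
toy_log = [
    (10, "update", "Ta", "Px"),
    (20, "update", "Tb", "Py"),
    (30, "commit", "Ta", None),
    (40, "update", "Tb", "Px"),
    (50, "end", "Ta", None),
]
dpt, txns = analysis(toy_log)
print(dpt)   # {'Px': 10, 'Py': 20}
print(txns)  # {'Tb': {'lastLSN': 40, 'status': 'running'}}
```

Redo would then start from the smallest recLSN in the DPT, and Undo would work backward from the lastLSN of each transaction still in the table.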
  7. Suppose you are given the relation magazines(mid: integer, mtitle: string, mtelephone: char(10), currentVolume: integer, currentNumber: integer, color: bit). The relation is stored in a file sorted by currentVolume. Each hard disk page on the (very small) hard disk can store up to four tuples. (So tuples 1, 2, and so on are on disk page 1, but tuple 5 is on page 2.) Below is part of an instance of the relation:
    mid | mtitle                    | mtelephone | currentVolume | currentNumber | color
    159 | Magazine AAA              | 8004444444 | 10            | 12            | 0
    358 | Magazine CCC              | 8006666666 | 39            | 10            | 1
    265 | Magazine BBB              | 8005555555 | 42            | 2             | 0
    314 | IEEE Spectrum             | 8003333333 | 54            | 4             | 1
    345 | Communications of the ACM | 8001111111 | 58            | 5             | 1
    101 | National Geographic       | 8002222222 | 227           | 5             | 1
    1. Explain what the data entries in each of the following indexes would contain. If such an index can be constructed, provide at least one sample data entry based on the table instance above. If such an index cannot be constructed, say so and explain why.

      1. An unclustered index on currentVolume using Alternative (1).
      2. An unclustered index on currentVolume using Alternative (2).
      3. An unclustered index on currentVolume using Alternative (3).
      4. A clustered index on currentVolume using Alternative (1).
      5. A clustered index on currentVolume using Alternative (2).
      6. A clustered index on currentVolume using Alternative (3).
      7. An unclustered index on mtelephone using Alternative (1).
      8. An unclustered index on mtelephone using Alternative (2).
      9. An unclustered index on mtelephone using Alternative (3).
      10. A clustered index on mtelephone using Alternative (1).
      11. A clustered index on mtelephone using Alternative (2).
      12. A clustered index on mtelephone using Alternative (3).
    2. Consider the relation above, but suppose the instance above is only a small fraction of the full table. currentVolume values range from 0 to 300, currentNumber values range from 0 to 20, color can assume only two values (0 and 1), and mtelephone values range over the full set of possible combinations, with few duplicate values in the table. You may assume uniform distributions of values. For each of the following indexes, would it speed up this query? Answer yes or no and explain why.

      SELECT currentVolume, currentNumber FROM magazines WHERE currentVolume > 250 AND currentVolume < 270;

      1. Clustered hash index on the (currentVolume, currentNumber) fields of magazines
      2. Unclustered hash index on the (currentVolume) field of magazines
      3. Clustered tree index on the (currentVolume) field of magazines
      4. Clustered tree index on the (currentVolume, currentNumber) fields of magazines
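When reasoning about the query above, it may help to estimate how much of the table the range predicate selects under the stated uniform-distribution assumption. The sketch below does that arithmetic. It assumes currentVolume takes integer values from 0 to 300 inclusive (301 distinct values); the function name range_selectivity is hypothetical.

```python
def range_selectivity(low, high, domain_min, domain_max):
    """Fraction of integer values v with low < v < high in [domain_min, domain_max]."""
    first = max(low + 1, domain_min)   # smallest value satisfying v > low
    last = min(high - 1, domain_max)   # largest value satisfying v < high
    matching = max(0, last - first + 1)
    return matching / (domain_max - domain_min + 1)

# Predicate from the query: currentVolume > 250 AND currentVolume < 270.
sel = range_selectivity(250, 270, 0, 300)
print(f"{sel:.4f}")  # 0.0631, i.e. 19 of the 301 possible values
```

The estimate says nothing about which index helps; it only gives the fraction of tuples the query is expected to touch.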