-
Notifications
You must be signed in to change notification settings - Fork 7
assignment.1
The aim of the first two projects is to modify functionality in the backend server of PostgreSQL. For this first assignment, we focus on just one core module - the buffer manager. The actual coding for this assignment is minimal, given the highly organized and modular PostgreSQL code (as you'll see, the code for the buffer manager policy is cleanly separated from the rest of the code base). However, you have to figure out what code to modify, which is non-trivial!
There are three major parts to this project:
-
Understanding different buffer management strategies
-
Examining, understanding and modifying existing code
-
Testing your code in a real system
Please skim through this document to the end before getting to work!
The first part of the assignment is to be done individually. The second part should be done in teams of two.
= Deadline =
Both parts of Assignment 1 are due on Friday, February 13th. This is a short period of time -- not so much to do the work, which is minimal, but the understand the algorithms and the codebase. So get started as soon as possible!
- 30% - Written assignment on "Understanding Different Buffer Replacement Strategies"
- 50% - Implementation of alternate buffer management strategies in postgres
- 20% - Performance analysis of different strategies
The textbook describes LRU, MRU and CLOCK strategies in section 9.4.1. However, you may need to read the entire section (9.4) to get a full image of a buffer manager in the DBMS. A fourth (more recent) strategy is called 2Q, and is used in Postgres v8.0.3 (not by any means the most recent version of Postgres, but one that is amenable to our changes). LRU, MRU, and CLOCK strategies work well enough for most systems, so we are going to replace 2Q. But first you'll need to exercise your understanding of the LRU, MRU, and CLOCK buffer replacement strategies.
This part of the assignment requires no code. We will give you the number of frames that the buffer manager must manage and a sequence of data blocks accessed by the different DBMS layers above the buffer manager. You will have to track the behavior of the buffer manager (which pages it has to replace, when, etc.) using only your knowledge of the three aforementioned buffer replacement strategies.
You must record your answers in three plain text files called strategy.lru, strategy.mru, strategy.clock. These files will contain the buffer/frame state and buffer manager replacement decisions in accordance to the LRU, MRU, and CLOCK policies respectively. The format of each file is as follows:
- Three columns of text, separated by spaces
- One row (line of text) each time a page miss occurs
- Each row lists the time of a page miss, a space, the buffer pool frame (P1, P2, P3, P4), a space and, if an eviction occurs, the page evicted from memory. For example, If a miss occurs during time 3, causing us to evict page A and ping frame 3, we would write: "T3 P3 A"
- Nothing will be written when there is a hit.
For example, assume there are four frames your buffer manager must manage: P1, P2, P3, P4. All four frames are initially empty. When the buffer pool has unused frames (e.g., in the beginning when all four frames are empty), it will put the newly read data in the first unused frame. The blocks to be read from disk are labeled A through F. For simplicity here, we assume that after each access the page is pinned, and then immediately unpinned. (You cannot make this same assumption when writing the actual code. A page may be pinned for any length of time in a DBMS.)
Given this information and the following workload:
||Time||T1||T2||T3||T4||T5||T6||T7||T8||T9||T10||T11 ||Block Accessed||A||A||B||C||D||E||F||A||B||C||D||
the files should be as follows:
||strategy.lru|| strategy.mru|| strategy.clock|| ||T1 P1||T1 P1||T1 P1|| ||T3 P2|| T3 P2||T3 P2|| ||T4 P3||T4 P3||T4 P3|| ||T5 P4||T5 P4||T5 P4|| ||T6 P1 A||T6 P4 D||T6 P1 A|| ||T7 P2 B||T7 P4 E||T7 P2 B|| ||T8 P3 C||T11 P3 C||T8 P3 C|| ||T9 P4 D ||||T9 P4 D|| ||T10 P1 E|| ||T10 P1 E|| ||T11 P2 F||||T11 P2 F||
Several points deserve elaboration:
- Certain times (e.g., T2) are missing for each strategy. This is due to the different buffer miss patterns exhibited by the given replacement policy.
- The third column will be blank when the buffer miss does not result in an eviction.
- When there are multiple FREE frames available, the frame with the lowest id is chosen (e.g. P2 before P3 and P4 when all are free at T3). You should conform to this tie-breaking criteria.
In this example, for the given number of frames and sequence of block accesses, MRU emerges the winner (least number of misses). Now we come to the problem that you will have to solve.
Suppose the buffer manager has five frames to manage: P1, P2, P3, P4, P5. All frames are initially empty.
Suppose seven disk blocks (A, B, ..., G) are accessed by the layers above the buffer manager layer in the DBMS.
Clock policy elaboration: we make the following assumptions in the clock policy:
- The pointer doesn't move when filling a free frame in buffer
- The pointer doesn't move when there's a hit in the buffer
- The pointer advances after replacing a buffer frame
To get your access pattern, login to your cs186 account and type genpattern.rb at the prompt.
Remember that you must use:
Your cs186 class account to get your access pattern. Based on this information, you should perform a similar analysis as the one above. We will first have you save your pattern by running the following command
$ /home/ff/cs186/sp09/bbs.rb > pattern
You must now generate the three files strategy.lru, strategy.mru, and strategy.clock for this problem. Stick to the specified format (e.g., no extra line at the end of a file), as we will use file comparison scripts to evaluate your answers.
You will need to use the unix submit program to hand in your assignment.
Save your four files pattern, strategy.lru, strategy.mru, strategy.clock in a directory called "hw1p1" within your cs186 home directory.
cd into hw1p1. Run: submit hw1p1