Assignment 4 Peer Design Review Guide

As with previous assignments, the goal of this activity is to give you experience in reading and understanding someone else's design document as well as to give you feedback on your own design. You should feel free to incorporate things you learn through this process into your own designs, but please be sure to cite, in your own design document, who you worked with and what aspects of your design were influenced by the interaction.

Here is how we recommend you approach the review. You may ignore the code-reading questions. Take about 10-15 minutes to read through the entire design. Each time you encounter something that you don't understand or something that doesn't seem quite right, make a note of it on the document. A perfectly written design document should be sufficiently clear and detailed that you could begin coding to the design.

To be clear: the design review is a collaborative process designed to improve both designs. It is not a competitive process; it is not intended to become an argument over which of two designs is better. In many cases there are multiple ways to do something, all of which are quite reasonable. However, you will do your peer review partners a great service if you pinpoint areas where they are being vague. This is almost always a sign that there is either a lack of understanding or a misunderstanding about an important part of the assignment.

After both teams have completed reviewing the documents, there are two different ways to move forward.

The sequential version: Pick one team to "go first." Let's say you decide that the Cobras will review the Mongooses first. The Cobras should start working their way through the design and the list of issues identified, asking for clarification.
We expect that both teams will be taking notes during this process -- the Cobras should be noting the answers to the questions they jotted down and the Mongooses should be making notes on parts of their design that are unclear or that need to be modified. We also expect that we'll see people drawing things on whiteboards as part of this discussion. Please be cognizant of time -- after the initial document reviews, each team will have approximately 30 to 35 minutes to review and be reviewed. We will make announcements when we expect roles to switch, but you don't want to be caught by surprise.

The parallel version: Another way to do this is to progress through the design by topic, discussing each team's design for that topic. During the discussions, the process will be similar to that outlined above, but there will be more back and forth.

Pick whichever approach is more appealing; if you can't decide, try the sequential approach this time.

Here is a list of topics that we expect you'll cover in your reviews. Please take notes on the list below, so you can turn in two copies of this form with your assignment -- one copy is the one you construct while reviewing someone else's design and the other is the one you construct while having your design reviewed.

1. The Log/Journal
* Will you use an UNDO log, a REDO log, or a REDO/UNDO log?
* How will you enforce write-ahead logging?
* What do you need to do to checkpoint? How often will you take checkpoints? (A checkpoint is a way of ensuring that all the updates corresponding to some set of log records have been propagated to disk; checkpoints let you bound how much of the log you need to use at recovery time. Constructing a proper checkpoint will involve calls on the log container that reclaim space compatible with your checkpoint.)

2. The Recovery Process
* What algorithm must be executed after a failure to restore the file system to a consistent state?
  More specifically, how many times and in what direction(s) do you need to traverse the log?
* What assumptions can your recovery process make?
* What guarantees will your recovery process make? (That is, what can you say about the state of the system after recovery runs?)

3. Specific Log Records
* For each file system operation, what are the lower-level micro operations that comprise it? (Note: these are the operations for which you will write log records and recovery routines.)
* Where can you make calls to logging routines?
* How will you propagate information about the log to the locations at which you need to write log records?
* What recovery actions are needed for each log record? Have you specified enough information in your log records to facilitate what you need during recovery?

4. Transactions
* How are you keeping track of the set of log records that comprise a file system operation?
* What happens if you crash after you've written some, but not all, of these log records?
* How do SFS's synchronization and the log container's synchronization interact with your transaction implementation?

5. Testing
* How will you test your system?
* Can you outline a testing philosophy or strategy that will let you approach testing in a methodical fashion?
* Can you perform unit testing? How?

6. Miscellaneous
* What features can you design into your system to facilitate debugging?
* While you may assume there is no file system activity while you run recovery, there will almost certainly be activity while you write log records, take checkpoints, and reclaim log space. What kind of synchronization will you need for that?
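
If it helps to ground the discussion of topics 1, 2, and 4, here is a small Python sketch of how a checkpoint can bound the part of the log that recovery reads. It is only an illustration under stated assumptions, not a recommended design and not SFS code. It assumes a REDO-only log and a quiescent checkpoint (no transaction is in progress when the checkpoint record is written, and every earlier update is already on disk). The names Record and recover, the record kinds, and the dictionary standing in for the disk are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # All field names are hypothetical, chosen for this sketch only.
    lsn: int          # log sequence number
    kind: str         # "write", "commit", or "checkpoint"
    txn: int = 0      # transaction id (unused for checkpoint records)
    block: int = 0    # block touched by a "write" record
    data: str = ""    # new contents of that block


def recover(log, disk):
    """Redo-only recovery over a list of Records.

    Assumes a quiescent checkpoint: everything logged before the most
    recent checkpoint record is already reflected on disk.
    """
    # Backward scan: locate the most recent checkpoint, if there is one.
    start = 0
    for i in range(len(log) - 1, -1, -1):
        if log[i].kind == "checkpoint":
            start = i + 1
            break
    tail = log[start:]

    # Forward scan 1: find which transactions committed after the checkpoint.
    committed = {r.txn for r in tail if r.kind == "commit"}

    # Forward scan 2: redo writes of committed transactions in log order.
    # Writes from uncommitted transactions are skipped.
    for r in tail:
        if r.kind == "write" and r.txn in committed:
            disk[r.block] = r.data
    return disk


# Example: txn 1 finished before the checkpoint, txn 2 committed after it,
# and txn 3 was cut off by the crash before its commit record was written.
log = [
    Record(1, "write", txn=1, block=7, data="A"),
    Record(2, "commit", txn=1),
    Record(3, "checkpoint"),
    Record(4, "write", txn=2, block=7, data="B"),
    Record(5, "commit", txn=2),
    Record(6, "write", txn=3, block=9, data="C"),
]
result = recover(log, {7: "A"})
print(result)  # {7: 'B'}
```

In this sketch the log is traversed once backward and twice forward, and the partially logged transaction leaves no trace on disk. Your own design may reasonably differ on every one of these points (log type, checkpoint style, number of passes); the review questions above are meant to make those choices explicit.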