Assignment 4 Peer Design Review Guide

As with previous assignments, the goal of this activity is to give you
experience in reading and understanding someone else's design
document as well as to provide feedback on your own design. You
should feel free to incorporate things you learn through this process
into your own designs, but please be sure to cite, in your own
design document, who you worked with and what aspects of your design
were influenced by the interaction.

Here is how we recommend you approach the review.

You may ignore the code-reading questions.

Take about 10-15 minutes to read through the entire design. Each
time you encounter something that you don't understand or something
that doesn't seem quite right, make a note of it on the document.
A perfectly written design document should be sufficiently clear and
detailed that you could begin coding to the design. To be clear:
the design review is a collaborative process designed to improve
both designs.  It is not a competitive process; it is not intended
to become an argument over which of two designs is better. In many
cases there are multiple ways to do something, all of which are quite
reasonable. However, you will do your peer review partners a great service
if you pinpoint areas where they are being vague. This is almost always
a sign that there is either a lack of understanding or a misunderstanding
about an important part of the assignment.

After both teams have completed reviewing the documents, there are
two different ways to move forward.

The sequential version:
Pick one team to "go first."  Let's say you decide that the Cobras will
review the Mongooses first. The Cobras should start working their
way through the design and the list of issues identified, asking
for clarification.  We expect that both teams will be taking notes
during this process -- the Cobras should be recording answers to the
questions they jotted down and the Mongooses should be making notes on parts
of their design that are unclear or that need to be modified. We
also expect that we'll see people drawing things on whiteboards
as part of this discussion.

Please be cognizant of time -- after the initial document reviews,
each team will have approximately 30 to 35 minutes to review and
be reviewed.  We will make announcements when we expect roles to
switch, but you don't want to be caught by surprise.

The parallel version:
Another way to do this is to progress through the design by topic, discussing
each team's design for that topic. During the discussions, the process will be
similar to that outlined above, but there will be more back and forth.

Pick whichever approach is more appealing; if you can't decide, try the
sequential approach this time.

Here is a list of topics that we expect you'll cover in your
reviews. Please take notes on the list below, so you can turn
in two copies of this form with your assignment -- one copy is
the one you construct while reviewing someone else's design and
the other is the one you construct while having your design
reviewed.

1. The Log/Journal

* Will you use an UNDO log, a REDO log, or a REDO/UNDO log?
* How will you enforce write-ahead logging?
* What do you need to do to take a checkpoint?
	How often will you take checkpoints?  (A checkpoint is a way of
	ensuring that all the updates corresponding to some set of
	log records have been propagated to disk; checkpoints let
	you bound how much of the log you need to use at recovery
	time. Constructing a proper checkpoint will involve calls
	on the container that reclaim space compatible with your
	checkpoint.)
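
To make the write-ahead and checkpoint questions concrete, here is a
minimal sketch in C. Every name in it (wal_state, wal_buf, wal_flush,
and so on) is hypothetical and is not part of SFS or the log container
interface; it also assumes each log record receives an increasing log
sequence number (LSN). It shows one way to refuse to write a buffer
before its log records are durable, and one way to compute how much of
the log a checkpoint still needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t lsn_t;

/* Hypothetical log state; these names are illustrative only. */
struct wal_state {
    lsn_t next_lsn;     /* LSN the next appended record will receive */
    lsn_t durable_lsn;  /* every record with LSN <= this is on disk */
};

/* Hypothetical buffer descriptor that remembers which records describe it. */
struct wal_buf {
    bool  dirty;
    lsn_t newest_lsn;   /* most recent record that modified this buffer */
    lsn_t oldest_lsn;   /* first record since the buffer was last clean */
};

void wal_init(struct wal_state *w) {
    w->next_lsn = 1;
    w->durable_lsn = 0;
}

/* Log a change to buffer b and return the LSN assigned to the record. */
lsn_t wal_log_update(struct wal_state *w, struct wal_buf *b) {
    lsn_t lsn = w->next_lsn++;
    if (!b->dirty) {
        b->dirty = true;
        b->oldest_lsn = lsn;
    }
    b->newest_lsn = lsn;
    return lsn;
}

/* Stand-in for forcing the log to disk up to and including lsn. */
void wal_flush(struct wal_state *w, lsn_t lsn) {
    assert(lsn < w->next_lsn);
    if (lsn > w->durable_lsn)
        w->durable_lsn = lsn;
}

/* Write-ahead rule: the log records reach disk before the buffer does. */
void wal_write_buffer(struct wal_state *w, struct wal_buf *b) {
    if (!b->dirty)
        return;
    if (b->newest_lsn > w->durable_lsn)
        wal_flush(w, b->newest_lsn);
    assert(b->newest_lsn <= w->durable_lsn);
    /* The actual buffer write would happen here. */
    b->dirty = false;
}

/* Oldest LSN recovery still needs: the minimum oldest_lsn over dirty
 * buffers, or next_lsn when nothing is dirty. Records below this value
 * are candidates for reclamation. */
lsn_t wal_checkpoint_lsn(const struct wal_state *w,
                         const struct wal_buf *bufs, int n) {
    lsn_t keep = w->next_lsn;
    for (int i = 0; i < n; i++)
        if (bufs[i].dirty && bufs[i].oldest_lsn < keep)
            keep = bufs[i].oldest_lsn;
    return keep;
}
```

A reviewer might ask where the equivalent of wal_write_buffer lives in
the other team's design, and what plays the role of oldest_lsn when
they decide how much log space a checkpoint can reclaim.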

2. The Recovery Process

* What algorithm must be executed after a failure to restore the
	file system to a consistent state?  More specifically, how many times
	and in what direction(s) do you need to traverse the log?
* What assumptions can your recovery process make?
* What guarantees will your recovery process make? (That is, what can you
	say about the state of the system after recovery runs?)
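
As a reference point for the traversal question, the sketch below shows
one possible answer for a REDO/UNDO log: a forward pass that redoes
every update and notes commits, followed by a backward pass that undoes
updates from transactions that never committed. It is only an
illustration. The record layout, the "disk" modeled as an array of
ints, and the MAX_TID limit are all assumptions made for the example;
an UNDO-only or REDO-only design would traverse the log differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rec_type { REC_UPDATE, REC_COMMIT };

/* Hypothetical log record: enough information to redo and to undo. */
struct rec {
    enum rec_type type;
    int tid;      /* transaction that wrote the record */
    int block;    /* index into the disk image; unused for REC_COMMIT */
    int old_val;  /* value before the update (used by undo) */
    int new_val;  /* value after the update (used by redo) */
};

#define MAX_TID 16  /* assumption: transaction ids are in 0..MAX_TID-1 */

void recover(const struct rec *log, size_t n, int *disk) {
    bool committed[MAX_TID] = { false };

    /* Pass 1, forward: repeat history and remember which tids committed. */
    for (size_t i = 0; i < n; i++) {
        assert(log[i].tid >= 0 && log[i].tid < MAX_TID);
        if (log[i].type == REC_COMMIT)
            committed[log[i].tid] = true;
        else
            disk[log[i].block] = log[i].new_val;
    }

    /* Pass 2, backward: roll back updates of uncommitted transactions. */
    for (size_t i = n; i-- > 0; ) {
        if (log[i].type == REC_UPDATE && !committed[log[i].tid])
            disk[log[i].block] = log[i].old_val;
    }
}
```

Because both passes write absolute values, running recover a second
time on its own output gives the same result, which matters if the
system can crash during recovery.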

3. Specific Log Records

* For each file system operation, what are the lower-level
	micro-operations that comprise it? (Note: these are the operations
	for which you will write log records and recovery routines.)

* Where can you make calls to logging routines?
* How will you propagate information about the log to the locations
	at which you need to write log records?
* What recovery actions are needed for each log record? Have you
	specified enough information in your log records to facilitate
	what you need during recovery?
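
One way to check whether each record carries enough information is to
write down its redo and undo routines side by side. The sketch below
does this for two made-up record types; the types, the fs_image
structure, and the field names are hypothetical and are not the records
the assignment requires. Note that the bitmap record needs only the bit
index, while the word update needs both the old and the new value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two hypothetical micro-operations, chosen only for illustration. */
enum lr_type { LR_BITMAP_ALLOC, LR_WORD_UPDATE, LR_NTYPES };

struct lr {
    enum lr_type type;
    uint32_t target;   /* bit index or word index */
    uint32_t old_val;  /* unused for LR_BITMAP_ALLOC */
    uint32_t new_val;  /* unused for LR_BITMAP_ALLOC */
};

/* Toy stand-in for on-disk state. */
struct fs_image {
    uint8_t  bitmap[8];  /* 64 allocation bits */
    uint32_t words[8];
};

typedef void (*lr_action)(const struct lr *, struct fs_image *);

static void bitmap_redo(const struct lr *r, struct fs_image *fs) {
    assert(r->target < 64);
    fs->bitmap[r->target / 8] |= (uint8_t)(1u << (r->target % 8));
}

static void bitmap_undo(const struct lr *r, struct fs_image *fs) {
    assert(r->target < 64);
    fs->bitmap[r->target / 8] &= (uint8_t)~(1u << (r->target % 8));
}

static void word_redo(const struct lr *r, struct fs_image *fs) {
    assert(r->target < 8);
    fs->words[r->target] = r->new_val;
}

static void word_undo(const struct lr *r, struct fs_image *fs) {
    assert(r->target < 8);
    fs->words[r->target] = r->old_val;
}

/* One row per record type; a missing row is a missing recovery action. */
static const struct { lr_action redo, undo; } lr_ops[LR_NTYPES] = {
    [LR_BITMAP_ALLOC] = { bitmap_redo, bitmap_undo },
    [LR_WORD_UPDATE]  = { word_redo,   word_undo   },
};

void lr_redo(const struct lr *r, struct fs_image *fs) {
    assert(r->type < LR_NTYPES);
    lr_ops[r->type].redo(r, fs);
}

void lr_undo(const struct lr *r, struct fs_image *fs) {
    assert(r->type < LR_NTYPES);
    lr_ops[r->type].undo(r, fs);
}

int lr_bit_is_set(const struct fs_image *fs, uint32_t bit) {
    return (fs->bitmap[bit / 8] >> (bit % 8)) & 1;
}
```

Each action here is idempotent, so applying the same record twice is
harmless; a design whose records store deltas rather than absolute
values would need some other way to avoid double application.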

4. Transactions

* How are you keeping track of the set of log records that
	comprise a file system operation?
* What happens if you crash after you've written some, but not
	all, of these log records?
* How do SFS's synchronization and the log container's
	synchronization interact with your transaction implementation?
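
The sketch below illustrates one common way to tie log records to a
file system operation: stamp every record with a transaction id and
treat the operation as complete only once a commit record for that id
is in the log. The names (txlog, tx_begin, tx_log, tx_commit) and the
in-memory array standing in for the log are assumptions made for the
example, and the sketch deliberately ignores synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum txr_type { TXR_BEGIN, TXR_OP, TXR_COMMIT };

/* Hypothetical record: every record names the transaction it belongs to. */
struct txr {
    enum txr_type type;
    unsigned tid;
    int payload;   /* stand-in for the micro-operation's details */
};

#define TXLOG_CAP 64

/* In-memory stand-in for the log. */
struct txlog {
    struct txr recs[TXLOG_CAP];
    size_t len;
    unsigned next_tid;
};

/* Handle carried through one file system operation. */
struct tx {
    struct txlog *log;
    unsigned tid;
    bool open;
};

static void txlog_append(struct txlog *l, enum txr_type type,
                         unsigned tid, int payload) {
    assert(l->len < TXLOG_CAP);
    l->recs[l->len++] = (struct txr){ type, tid, payload };
}

void txlog_init(struct txlog *l) {
    l->len = 0;
    l->next_tid = 1;
}

struct tx tx_begin(struct txlog *l) {
    struct tx t = { l, l->next_tid++, true };
    txlog_append(l, TXR_BEGIN, t.tid, 0);
    return t;
}

void tx_log(struct tx *t, int payload) {
    assert(t->open);
    txlog_append(t->log, TXR_OP, t->tid, payload);
}

void tx_commit(struct tx *t) {
    assert(t->open);
    txlog_append(t->log, TXR_COMMIT, t->tid, 0);
    t->open = false;
}

/* True if the first n records include tid's commit record. A crash
 * that preserves only those n records leaves tid either complete
 * (true) or in need of being rolled back or ignored (false). */
bool tx_committed_in_prefix(const struct txlog *l, size_t n, unsigned tid) {
    for (size_t i = 0; i < n && i < l->len; i++)
        if (l->recs[i].type == TXR_COMMIT && l->recs[i].tid == tid)
            return true;
    return false;
}
```

Passing the handle down the call chain is also one possible answer to
the earlier question about propagating log information to the places
that write records.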


5. Testing

* How will you test your system?

* Can you outline a testing philosophy or strategy that will let you
	approach testing in a methodical fashion?

* Can you perform unit testing?  How?
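
One methodical strategy is to simulate a crash after every prefix of
the log, run recovery on what survived, and check an invariant. The
sketch below does this against a toy REDO-only recovery routine. The
record format, the two-cell "disk", and the invariant are all invented
for the example, and the sketch assumes no update reaches disk before
its transaction commits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum t_type { T_SET, T_COMMIT };

/* Hypothetical record: set one cell to a value, or commit a transaction. */
struct t_rec {
    enum t_type type;
    int tid;
    int cell;
    int val;
};

#define T_CELLS   2
#define T_MAX_TID 8   /* assumption: tids are in 0..T_MAX_TID-1 */

/* Toy REDO-only recovery: apply, in log order, the updates of every
 * transaction whose commit record appears in the surviving records. */
static void t_recover(const struct t_rec *log, size_t n, int cells[T_CELLS]) {
    bool committed[T_MAX_TID] = { false };
    for (size_t i = 0; i < n; i++) {
        assert(log[i].tid >= 0 && log[i].tid < T_MAX_TID);
        if (log[i].type == T_COMMIT)
            committed[log[i].tid] = true;
    }
    for (size_t i = 0; i < n; i++)
        if (log[i].type == T_SET && committed[log[i].tid])
            cells[log[i].cell] = log[i].val;
}

/* Crash after each prefix of the log (0..n records), recover from the
 * initial state, and count the crash points that break the invariant. */
size_t t_count_bad_crash_points(const struct t_rec *log, size_t n,
                                const int initial[T_CELLS],
                                bool (*invariant)(const int cells[T_CELLS])) {
    size_t bad = 0;
    for (size_t cut = 0; cut <= n; cut++) {
        int cells[T_CELLS];
        for (int j = 0; j < T_CELLS; j++)
            cells[j] = initial[j];
        t_recover(log, cut, cells);
        if (!invariant(cells))
            bad++;
    }
    return bad;
}

/* Example invariant: the two cells always sum to ten. */
bool t_sum_is_ten(const int cells[T_CELLS]) {
    return cells[0] + cells[1] == 10;
}
```

For instance, a log that writes the commit record before the
transaction's last update shows up as a bad crash point, which is the
kind of ordering bug this style of test is meant to catch.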


6. Miscellaneous

* What features can you design into your system to facilitate
debugging?

* While you may assume there is no file system activity while you
	run recovery, there will almost certainly be activity while
	you write log records, take checkpoints, and reclaim log
	space.  What kind of synchronization will you need for that?
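
As one example of a debugging feature, a routine that renders a log
record as a line of text makes it possible to dump the log and read it
after a failed test. The sketch below is a minimal version; the record
fields and the output format are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum d_type { D_BEGIN, D_UPDATE, D_COMMIT };

/* Hypothetical record header used only for this example. */
struct d_rec {
    uint64_t lsn;
    unsigned tid;
    enum d_type type;
    uint32_t block;   /* meaningful only for D_UPDATE */
};

static const char *d_type_name(enum d_type t) {
    switch (t) {
    case D_BEGIN:  return "BEGIN";
    case D_UPDATE: return "UPDATE";
    case D_COMMIT: return "COMMIT";
    }
    return "UNKNOWN";
}

/* Format one record into out; returns what snprintf returns, that is,
 * the length the full line needs (excluding the terminating NUL). */
int d_format(const struct d_rec *r, char *out, size_t outlen) {
    assert(r != NULL && out != NULL);
    if (r->type == D_UPDATE)
        return snprintf(out, outlen, "lsn=%llu tid=%u %s block=%lu",
                        (unsigned long long)r->lsn, r->tid,
                        d_type_name(r->type), (unsigned long)r->block);
    return snprintf(out, outlen, "lsn=%llu tid=%u %s",
                    (unsigned long long)r->lsn, r->tid,
                    d_type_name(r->type));
}
```

Calling d_format on each record in log order gives a readable trace
that can be compared against the sequence of operations a test issued.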