DIVISION OF ENGINEERING AND APPLIED SCIENCES
HARVARD UNIVERSITY

CS 161. Operating Systems

Matt Welsh
Spring 2005

Assignment 4: File System

Due: Design document due 11:59pm on Wednesday, April 25, 2007
Final submission due 5pm on Friday, May 4, 2007

Introduction

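The OS/161 file system, emufs, is just a layer on top of the Unix file system. In this assignment you will augment sfs, the native OS/161 file system, in four ways, each covered in its own section below: new system calls, synchronization, hierarchical directories, and a buffer cache.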

As usual, this assignment is broken into two parts. Be aware of the due dates for each part! The design document includes some code-reading questions, the results of a test run, and your design for the modifications outlined above. The implementation consists of everything you need to do to make your design run!

Code Reading (10 points)

The OS/161 sfs file system we provide is very simple. The implementation resides in the fs/sfs directory. The fs/vfs directory provides the infrastructure to support multiple file systems.

kern/include: You should examine the files fs.h, vfs.h, vnode.h, and sfs.h.

Question 1. What is the difference between VOP_ routines and FSOP_ routines?

kern/fs/vfs: The file device.c implements raw device support.

Question 2. What vnode operations are permitted on devices?

devnull.c implements the OS/161 equivalent of /dev/null, called "null:". vfscwd.c implements current working directory support.

Question 3. Why is VOP_INCREF called in vfs_getcurdir()?

vfslist.c implements operations on the entire set of file systems.

Question 4. How do items get added to the vfslist?

vfslookup.c contains name translation operations. vfspath.c supports operations on path names and implements the vfs operations. vnode.c has initialization and reference count management.

kern/fs/sfs/sfs_fs.c has file system routines for sfs.

Question 5. There is no buffer cache currently in sfs. However, the bitmaps and superblock are not written to disk on every modification. How is this possible?

Question 6. What do the statements in Question 5 mean about the integrity of your file system after a crash?

Question 7. Can you unmount a file system on which you have open files?

Question 8. List 3 reasons why a mount might fail.

sfs_io.c has block I/O routines, and sfs_vnode.c has file routines.

Question 9. Why is a routine like sfs_partialio() necessary? Why is this currently a performance problem? What part of this assignment will make it less of one?

sbin/mksfs implements the mksfs utility, which creates an sfs file system on a device. disk.h and disk.c define what the disk looks like.

Question 10. What is the inode number of the root?

Question 11. How do files get removed from the system?

Setting up

Propagate any changes you've made to your previous config files into the ASST4 config file. Then config and build a kernel. Tag your current repository asst4-begin.

Initial Testing

Place the results of this section in a file called initial.txt for your final submission, and also submit them with your initial design document. Once you have everything built, format the disk and run the file system performance test from the kernel menu by specifying fs1.

New System Calls

You will find a utility in sbin called dumpsfs. Having a tool that can dump an entire file system is an invaluable debugging aid. As you modify your file system, be sure to keep this utility up to date, so that it can continue to be useful to you. First, you will need to add support for the system calls listed below.

The general requirements for error codes are the same as in Assignment 2; for details, consult the OS/161 man pages. Specific requirements:

Synchronization

SFS does not currently protect itself from concurrent access by multiple threads. Think back to Assignment 1, where we asked you to solve the cat and mouse problem: if your code was not properly synchronized, mice would get eaten, which leads to an ugly scene. A similar thing can happen here. For example, because there is no synchronization on the free block bitmap, two threads creating files could both decide to use the same free sector.

Your mission is to bullet-proof the filesystem so that multiple threads can use it harmoniously. You must allow multiple threads to have the same file open. When this is the case, your filesystem needs to implement the following (UNIX-like) filesystem semantics.

Be careful to return appropriate error codes from calls to file-related methods in the file system! The syscalls you implemented in Assignment 2 rely on these values to operate correctly!

Synchronization Design

In your design document, write a one to two page description of what you will change. Discuss how you will provide synchronization for the file system. List which pieces need to be protected, how, and which synchronization primitives you will use.

Synchronization Implementation

Synchronize SFS. Ensure that the semantics described above are supported.
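
To make the free block bitmap race mentioned earlier concrete, here is a minimal user-level sketch; it is not OS/161 code. It assumes a toy bitmap (struct freemap, NBLOCKS, and the freemap_* functions are all hypothetical names) and uses a pthread mutex where the kernel would use its own synchronization primitives. The point is only that finding a free block and marking it in use happen under one lock, so two threads can never be handed the same block.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NBLOCKS 64              /* hypothetical size of the toy disk */

/* Toy stand-in for a free block bitmap; not the real SFS structure. */
struct freemap {
    pthread_mutex_t lock;       /* protects bits[] */
    uint8_t bits[NBLOCKS / 8];  /* bit set = block in use */
};

void freemap_init(struct freemap *fm)
{
    pthread_mutex_init(&fm->lock, NULL);
    for (int i = 0; i < NBLOCKS / 8; i++) {
        fm->bits[i] = 0;
    }
}

/*
 * Find a free block and mark it in use as one atomic step.
 * Returns the block number, or -1 if every block is taken.
 */
int freemap_alloc(struct freemap *fm)
{
    int result = -1;

    pthread_mutex_lock(&fm->lock);
    for (int b = 0; b < NBLOCKS; b++) {
        uint8_t mask = (uint8_t)(1u << (b % 8));
        if ((fm->bits[b / 8] & mask) == 0) {
            fm->bits[b / 8] |= mask;    /* claim it before unlocking */
            result = b;
            break;
        }
    }
    pthread_mutex_unlock(&fm->lock);
    return result;
}

/* Mark block b free again. */
void freemap_free(struct freemap *fm, int b)
{
    assert(b >= 0 && b < NBLOCKS);
    pthread_mutex_lock(&fm->lock);
    fm->bits[b / 8] &= (uint8_t)~(1u << (b % 8));
    pthread_mutex_unlock(&fm->lock);
}
```

Which SFS structures need this kind of protection, and at what granularity, is for your design to decide; a single lock around the bitmap is only the simplest possible choice.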

Be sure to test your code. Include test scripts, programs, and/or output with your submission. The f_test and associated programs in testbin are good places to start.

Hierarchical Directories

At this point, the file system should be working well; however, it would be much nicer if it handled hierarchical directories (i.e., pathnames). Not only is this possible, it is fairly straightforward. Currently, each entry in a directory is a regular file. With careful modification, SFS can be extended so that a directory stores both regular files and other directories.

We have provided you with programs that implement the mkdir, rmdir, and ls commands. They are found in bin.

If you've answered the questions above, you'll notice that our pathnames are a superset of typical UNIX pathnames. As in UNIX, we use "/" as a pathname component separator.

Your code must do the following:

When a create is given a pathname whose intermediate directories do not exist, your code must not assume that the user wants the missing directories created automatically. For example, if there is a directory named /bim/ska/la and you mistakenly try to create a file named /bum/ska/la/bim, SFS should not create the directories /bum, /bum/ska, and /bum/ska/la so that it can create bim. It should return an error.
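
The /bum/ska/la/bim case can be sketched as follows. This is a toy, in-memory illustration and not the SFS or VFS interface: struct node, dir_find, and lookup_parent are hypothetical names, and the choice of ENOENT and ENOTDIR as the error codes is an assumption you should check against the OS/161 man pages. The walk resolves every component except the last and fails as soon as one is missing, instead of creating it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAXENTRIES 8
#define MAXNAME    32

/* Toy in-memory node; a hypothetical stand-in for an on-disk inode. */
struct node {
    char name[MAXNAME];
    int isdir;
    int nentries;
    struct node *entries[MAXENTRIES];
};

/* Return the child of dir whose name is the first len bytes of name. */
static struct node *dir_find(struct node *dir, const char *name, size_t len)
{
    for (int i = 0; i < dir->nentries; i++) {
        struct node *c = dir->entries[i];
        if (strlen(c->name) == len && strncmp(c->name, name, len) == 0) {
            return c;
        }
    }
    return NULL;
}

/*
 * Walk every component of path except the last. On success, store the
 * parent directory in *parent and the final component in *leaf.
 * A missing intermediate directory is an error; it is never created.
 */
int lookup_parent(struct node *root, const char *path,
                  struct node **parent, const char **leaf)
{
    struct node *cur = root;

    assert(root != NULL && path != NULL);
    while (*path == '/') {
        path++;
    }
    for (;;) {
        const char *slash = strchr(path, '/');
        if (slash == NULL) {
            break;                  /* path now names the final component */
        }
        struct node *next = dir_find(cur, path, (size_t)(slash - path));
        if (next == NULL) {
            return ENOENT;          /* e.g. /bum does not exist */
        }
        if (!next->isdir) {
            return ENOTDIR;         /* a regular file in mid-path */
        }
        cur = next;
        path = slash + 1;
        while (*path == '/') {
            path++;                 /* tolerate repeated slashes */
        }
    }
    *parent = cur;
    *leaf = path;
    return 0;
}
```

With only /bim/ska/la present, looking up the parent of /bim/ska/la/bim succeeds and yields the la directory, while /bum/ska/la/bim fails at the first component and nothing is created.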

Hierarchical Directory Design

Explain how you will implement hierarchical directories. Discuss any new data structures and synchronization you will need. Identify the parts of the system that will need to change. Where do you expect to have the greatest difficulty?

Explicitly discuss how you will implement cross-directory rename. This is very tricky as you will need to synchronize across multiple directories. While better than race conditions, deadlocks will make your system unusable. If the rename fails in any way, you must ensure that the file system is left in a consistent and correct state. Think very carefully about this!
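
One common way to avoid deadlock when two directories must be locked together is to always acquire the locks in a fixed global order. The sketch below orders by inode number; it is a user-level illustration with hypothetical names (struct dir, dir_order, dir_lock_pair, dir_unlock_pair) and pthread mutexes standing in for kernel locks. It is one ingredient only: it says nothing about checking that a directory is not moved into its own subtree, or about undoing a half-finished rename.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical in-memory directory: an inode number plus its own lock. */
struct dir {
    uint32_t ino;
    pthread_mutex_t lock;
};

/* Arrange the pair so that *first has the lower inode number. */
void dir_order(struct dir **first, struct dir **second)
{
    assert(*first != NULL && *second != NULL);
    if ((*first)->ino > (*second)->ino) {
        struct dir *tmp = *first;
        *first = *second;
        *second = tmp;
    }
}

/*
 * Lock both directories involved in a rename. Because every caller
 * takes the lower-numbered directory first, two concurrent renames can
 * never each hold one lock while waiting for the other.
 */
void dir_lock_pair(struct dir *a, struct dir *b)
{
    if (a == b) {
        pthread_mutex_lock(&a->lock);   /* rename within one directory */
        return;
    }
    dir_order(&a, &b);
    pthread_mutex_lock(&a->lock);
    pthread_mutex_lock(&b->lock);
}

/* Release the locks taken by dir_lock_pair; unlock order does not matter. */
void dir_unlock_pair(struct dir *a, struct dir *b)
{
    pthread_mutex_unlock(&a->lock);
    if (a != b) {
        pthread_mutex_unlock(&b->lock);
    }
}
```

Whatever ordering rule you choose, your design document should state it explicitly and argue that every code path that takes more than one lock follows it.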

When deleting a directory, make sure that it contains no in-use entries.
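
A minimal sketch of that check, under stated assumptions: the directory is modeled as an array of fixed-size slots, and a slot whose inode number is 0 is treated as unused. The names struct dirslot, NOINO, and dir_check_empty are hypothetical, and returning ENOTEMPTY is an assumption to verify against the OS/161 man pages. If your design stores "." and ".." as real entries, the loop would also need to skip them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NOINO 0     /* assumption: inode number 0 marks an unused slot */

/* Toy directory slot; the field sizes are illustrative only. */
struct dirslot {
    uint32_t ino;
    char name[28];
};

/* Return 0 if every slot is unused, ENOTEMPTY if any slot is in use. */
int dir_check_empty(const struct dirslot *slots, size_t nslots)
{
    assert(slots != NULL || nslots == 0);
    for (size_t i = 0; i < nslots; i++) {
        if (slots[i].ino != NOINO) {
            return ENOTEMPTY;
        }
    }
    return 0;
}
```

The check and the removal itself need to happen under the same lock; otherwise another thread could create a file in the directory between the two steps.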

Hierarchical Directory Implementation

Implement hierarchical directories. The mkdir and rmdir programs should work with them, as well as cat, cp, and ls. After you've created a file in the root directory named test, make sure that all of the following commands work when typed at your shell:

  % /bin/ls /
  % /bin/mkdir /foo
  % /bin/cp /test /foo/test
  % /bin/ls /foo
  % /bin/mkdir /foo/bar
  % /bin/cp /test /foo/bar/test
  % /bin/ls /foo
  % /bin/ls /foo/bar
  % /bin/cat /foo/bar/test

Buffer Cache

You will find that the performance of SFS is not great. In particular, there is no buffer cache. You should add a buffer cache to OS/161. You will need to decide what data structures you need to implement this and what interface routines you will use. This is probably the most critical piece of your design. Figure out exactly where the buffer cache will interface with the rest of the system, and how you will maintain the integrity of your file system in the face of caching.

We recommend that you add the buffer cache last. Gather extensive performance measurements before you add the cache and after. You can use those measurements to demonstrate the effectiveness of your cache.

Buffer Cache Design

Include a description of your interface and data structures, your replacement algorithms, and any extra precautions you take to ensure the integrity of the file system. Also, discuss any additional support necessary to make sure that buffered data eventually gets written to disk.
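
To give a feel for the pieces such a description has to cover, here is a deliberately tiny write-back cache with LRU replacement, written as a user-level sketch. Everything in it is an assumption made for illustration: the in-memory disk array, the sizes, and the names buf_get, buf_write_byte, and buf_sync are not part of OS/161. It has no locking at all, which a real design cannot get away with.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCKSIZE 512
#define NDISK     16    /* blocks on the toy disk */
#define NBUFS     2     /* tiny capacity so eviction is easy to see */

static uint8_t disk[NDISK][BLOCKSIZE];      /* stand-in for the device */
static unsigned disk_reads, disk_writes;    /* counters for measurement */

struct buf {
    int valid;
    int dirty;
    uint32_t blockno;
    unsigned long lastuse;      /* logical clock value used for LRU */
    uint8_t data[BLOCKSIZE];
};

static struct buf cache[NBUFS];
static unsigned long clock_ticks;

/* Write one buffer back to the disk if it holds unwritten changes. */
static void buf_writeback(struct buf *b)
{
    if (b->valid && b->dirty) {
        memcpy(disk[b->blockno], b->data, BLOCKSIZE);
        disk_writes++;
        b->dirty = 0;
    }
}

/* Return the buffer holding blockno, reading it from disk on a miss. */
struct buf *buf_get(uint32_t blockno)
{
    struct buf *victim = &cache[0];

    assert(blockno < NDISK);
    for (int i = 0; i < NBUFS; i++) {
        if (cache[i].valid && cache[i].blockno == blockno) {
            cache[i].lastuse = ++clock_ticks;   /* hit */
            return &cache[i];
        }
    }
    /* Miss: prefer an unused buffer, else the least recently used one. */
    for (int i = 0; i < NBUFS; i++) {
        if (!cache[i].valid) {
            victim = &cache[i];
            break;
        }
        if (cache[i].lastuse < victim->lastuse) {
            victim = &cache[i];
        }
    }
    buf_writeback(victim);      /* never drop dirty data on eviction */
    memcpy(victim->data, disk[blockno], BLOCKSIZE);
    disk_reads++;
    victim->valid = 1;
    victim->dirty = 0;
    victim->blockno = blockno;
    victim->lastuse = ++clock_ticks;
    return victim;
}

/* Modify one byte through the cache; the disk is updated lazily. */
void buf_write_byte(uint32_t blockno, size_t off, uint8_t val)
{
    struct buf *b = buf_get(blockno);

    assert(off < BLOCKSIZE);
    b->data[off] = val;
    b->dirty = 1;
}

/* Flush every dirty buffer, as a sync or unmount path would. */
void buf_sync(void)
{
    for (int i = 0; i < NBUFS; i++) {
        buf_writeback(&cache[i]);
    }
}
```

Note that a block modified through buf_write_byte does not reach the disk until it is evicted or buf_sync runs. That window is exactly where the integrity question comes from: your design needs to say what happens to dirty buffers on a crash and when they are flushed. The read and write counters are the kind of measurement that lets you compare performance before and after adding the cache.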

Tools

The following user-level binaries (whose source we give you in bin) must be functional when you submit! You shouldn't have to modify the source code for these tools; if for some reason you think you do (because your implementation of the OS/161 filesystem differs from the one these files assume), contact a TF first and discuss it!

What to Submit