DIVISION OF ENGINEERING AND APPLIED SCIENCES
HARVARD UNIVERSITY

CS 161. Operating Systems

Matt Welsh
Spring 2007

Assignment 0: An Introduction to OS/161

Due: In class on Tuesday, February 13, 2007.

Introduction

This assignment will familiarize you with OS/161, the operating system with which you will be working this semester, and System/161, the machine simulator on which OS/161 runs. We also introduce tools that will make your work this semester easier:

CVS (Concurrent Versions System)
CVS is a source code revision control system. It manages the source files of a software package so that multiple programmers may work simultaneously. Each programmer has a private copy of the source tree and makes modifications independently. CVS attempts to intelligently merge multiple people's modifications, highlighting potential conflicts when it fails.

GDB (GNU Debugger)
GDB allows you to examine what is happening inside a program while it is running. It lets you execute programs in a controlled manner and view and set the values of variables. In the case of OS/161, it allows you to debug the operating system you are building instead of the machine simulator on which that operating system is running.

The first part of this document briefly discusses the code on which you'll be working and the tools you'll be using. You can find more detailed information on CVS and GDB in separate handouts. The following sections provide precise instructions on exactly what you must do for the assignment. Each section with (hand me in) at the beginning indicates a section where there is something that you must do for the assignment.

What are OS/161 and System/161?

The code for this semester is divided into two main parts: the OS/161 operating system distribution and the System/161 machine simulator.

The OS/161 distribution contains a full operating system source tree, including some utility programs and libraries. After you build the operating system, you boot, run, and test it on the simulator.

We use a simulator in CS161 because debugging and testing an operating system on real hardware is extremely difficult. The System/161 machine simulator has been found to be an excellent platform for rapid development of operating system code, while still retaining a high degree of realism. Apart from floating point support and certain issues relating to RAM cache management, it provides an accurate emulation of a MIPS R3000 processor.

There will be an OS/161 programming assignment for each of the following topics:

CS161 assignments are cumulative. Ideally you will build each assignment on top of your previous submission. If, however, at any point you wish to build on top of a solution set, contact your TF for a copy of the appropriate solution set, even if we have not yet released it.

Using the solution sets may seem like an attractive alternative, since they are guaranteed to work. Keep in mind, however, that using the solution set requires that you understand a code base that you did not write. We encourage groups to refrain from using the solution sets except in the most dire circumstances.

About CVS

Most programming you have probably done at Harvard has been in the form of 'one-off' assignments: you get an assignment, you complete it yourself, you turn it in, you get a grade, and then you never look at it again.

The commercial software world uses a very different paradigm: development continues on the same code base, producing releases at regular intervals. This kind of development normally requires multiple people working simultaneously within the same code base, and necessitates a system for tracking and merging changes. Beginning with ASST2 you will work in teams on OS/161, so it is imperative that you start becoming comfortable with CVS, the Concurrent Versions System.

CVS is very powerful, but for CS161 you only need to know a subset of its functionality. The Introduction to CVS handout contains all the information you need to know and should serve as a reference throughout the semester. If you'd like to learn more, the comprehensive CVS documentation is available online.

About GDB

In some ways debugging a kernel is no different from debugging an ordinary program. On real hardware, however, a kernel crash will crash the whole machine, necessitating a time-consuming reboot. The use of a machine simulator like System/161 provides several debugging benefits. First, a kernel crash will only crash the simulator, which only takes a few keystrokes to restart. Second, the simulator can sometimes provide useful information about what the kernel did to cause the crash, information that may or may not be easily available when running directly on top of real hardware.

You must use the CS161 version of GDB to debug OS/161. You can run it on the UNIX systems used for the course as cs161-gdb. This copy of GDB has been configured for MIPS and has been patched to be able to communicate with your kernel through System/161.

An important difference between debugging a regular program and debugging an OS/161 kernel is that you need to make sure that you are debugging the operating system, not the machine simulator. If you type

	% cs161-gdb sys161
you are debugging the simulator rather than your kernel. Detailed instructions on how to debug your operating system and a brief introduction to GDB are contained in the handout Introduction to GDB for CS161.

Setting up your account

We have created some scripts to help you set up your environment so that you can easily access the tools that you will need for this course. (A note to bash users: the equivalent scripts for bash are located in the same directory and are named cs161.bashrc and cs161.bash_login. Some of the commands that follow need to be modified to work in bash. We trust that you will know what to do.)

  1. Add the following line to your .cshrc (or equivalent) file:
      source ~cs161/usr/etc/cs161.cshrc
    
  2. For reasons that will become clear later, you want to be able to identify the machine you are logged into. If you don't already do this, you can edit the set prompt line to be something like:
      % set prompt = "%m:%~ %!> "
    
  3. Add the following line to your .login file:
      source ~cs161/usr/etc/cs161.login
    
  4. Log out and back in.
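
For bash users, the csh steps above might translate roughly as follows. This is only a sketch: the script names cs161.bashrc and cs161.bash_login come from the note above, but the choice of ~/.bashrc and ~/.bash_login as the files to edit, and the readability check around each source command, are assumptions rather than course-provided instructions.

```shell
# Lines one might add to ~/.bashrc (assumed bash counterpart of .cshrc).
# The -r test is an added safeguard so the shell still starts cleanly on a
# machine where the course scripts are not installed.
cs161_rc=~cs161/usr/etc/cs161.bashrc
if [ -r "$cs161_rc" ]; then
    . "$cs161_rc"
fi

# bash counterpart of: set prompt = "%m:%~ %!> "
#   \h = short host name, \w = current directory, \! = history number
PS1='\h:\w \!> '

# Lines one might add to ~/.bash_login (assumed counterpart of .login).
cs161_login=~cs161/usr/etc/cs161.bash_login
if [ -r "$cs161_login" ]; then
    . "$cs161_login"
fi
```

As with the csh version, log out and back in afterwards so that the new settings take effect.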

Getting the Distribution (hand me in)

First, download the OS/161 source here: os161-current.tgz.

In addition to OS/161, the downloads directory contains distributions of System/161 (the machine simulator) and the OS/161 toolchain. If you are developing on a supported FAS environment, such as ice, nice, or one of the Linux workstations in the Science Center basement, you do not need these additional files, as they are already installed. If you wish to develop on your own machine at home, you will need to download, build, and install these packages as well.

  1. Script the following session using the script command.
  2. Make a directory in which you will do all your CS161 work. For the purposes of the remainder of this assignment, we'll assume that it will be called cs161. Also, create a directory called asst0 into which you'll place the files that you will submit.
      % mkdir cs161
      % mkdir cs161/asst0
      % cd cs161
      % mv ../os161-current.tgz .
    
  3. Unpack the OS/161 distribution by typing
      % gunzip -c os161-current.tgz | tar xf -
    
    This will create a directory named os161-version, where version is the release number (for example, os161-1.10).
  4. Rename your OS/161 source tree to just os161.
      % mv os161-1.10 os161
    
  5. End your script session by typing exit or by pressing Ctrl-D. Rename your typescript file to be setup.script.
      % mv typescript ~/cs161/asst0/setup.script
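
If you are unsure which directory name the archive will create, you can list its contents before unpacking. The sketch below uses sh syntax and a tiny stand-in archive so that it runs anywhere; the names demo-1.0, demo.tgz, and demo are made up for illustration and are not part of the distribution.

```shell
# Build a tiny stand-in archive so the commands below can run anywhere.
work=$(mktemp -d)
cd "$work"
mkdir -p demo-1.0/kern
echo "placeholder" > demo-1.0/kern/README
tar cf - demo-1.0 | gzip -c > demo.tgz
rm -rf demo-1.0

# List the archive without unpacking it; the first path component is the
# directory that unpacking will create.
topdir=$(gunzip -c demo.tgz | tar tf - | head -n 1 | cut -d/ -f1)
echo "archive unpacks into: $topdir"

# Unpack with the same pipeline as in step 3, then rename as in step 4.
gunzip -c demo.tgz | tar xf -
mv "$topdir" demo
ls demo
```

With the real distribution you would substitute os161-current.tgz for demo.tgz and os161 for demo.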
    

Setting up your CVS repository (hand me in)

  1. Script the following session using the script command.
  2. Create your CVS repository directory. Throughout the rest of this assignment, we will assume that you created ~/cs161/cvsroot.
  3. Set your CVSROOT environment variable. This will keep you from having to specify the -d argument every time you use CVS.
      % setenv CVSROOT ~/cs161/cvsroot
    
  4. Add this setenv line to your .cshrc file as well. (Do not include that editing session in your script output.)
  5. Echo $CVSROOT and make sure that it is what you expect (~/cs161/cvsroot).
  6. Initialize your repository by typing:
      % cvs init
    
    This tells CVS to create the administrative files it uses to track the contents of your repository. (If it complains about "not known CVSROOT", you probably didn't finish step 3.)
  7. Change directories into the OS/161 distribution that you unpacked in the previous section and import your source tree.
     
      % cd ~/cs161/os161
      % cvs import -m "Import of os161" src os161 os161-1_10    
    
    You can alter the arguments as you like; here's a quick explanation.

    -m "Import of os161" is the log message that CVS records. (If you don't specify it on the command line, CVS will start up a text editor.)

    src is where CVS will put the files within your repository. It will also be the name that you specify when you check out your system.

    os161 is the "branch tag" (CVS calls this argument the vendor tag). You needn't worry about the full implications of this; think of it as giving CVS a name to help you remember from where you got this code.

    os161-1_10 is the name of the version of the code that you are importing. (Always use the actual version, whatever it is. Replace dots with underscores.) You can use this name later with cvs diff and other CVS commands.

  8. Now, remove the source tree that you just imported.
      % cd ..
      % rm -rf os161
    
    Don't worry - now that you have imported the tree into your repository, there is a copy saved away. In the next step, you'll get a copy of the source tree that is yours to work on. You can safely remove the original tree. You must never remove your CVS repository located in $CVSROOT.
  9. Now, check out a source tree in which you will work.
      % cd ~/cs161
      % cvs checkout src
    
  10. End your script session. Rename your script output to cvsinit.script.
      % mv typescript ~/cs161/asst0/cvsinit.script
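
For reference, steps 2, 3, and 5 might look like this in sh or bash, where csh's setenv is replaced by export. This is a sketch under one loudly stated assumption: the variable demo_home is a made-up scratch directory standing in for your home directory, so that the example does not touch a real account. In practice the path would be ~/cs161/cvsroot as described above.

```shell
# Stand-in for $HOME so this sketch does not touch a real home directory.
demo_home=$(mktemp -d)

# Step 2: create the repository directory (-p also creates cs161 if needed).
mkdir -p "$demo_home/cs161/cvsroot"

# Step 3 in sh/bash syntax: csh's "setenv NAME value" becomes
# "export NAME=value".
export CVSROOT="$demo_home/cs161/cvsroot"

# Step 5: confirm the value and check that the directory really exists.
echo "$CVSROOT"
[ -d "$CVSROOT" ] && echo "repository directory exists"
```

Bash users would put the export line in their .bashrc (or equivalent) rather than in .cshrc.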
    

Code Reading (hand me in)

One of the challenges of CS161 is that you are going to be working with a large body of code that was written by someone else. When doing so, it is important that you grasp the overall organization of the entire code base, understand where different pieces of functionality are implemented, and learn how to augment it in a natural and correct fashion. As you and your partner develop code, although you needn't understand every detail of your partner's implementation, you still need to understand its overall structure, how it fits into the greater whole, and how it works.

In order to become familiar with a code base, there is no substitute for actually sitting down and reading the code. Admittedly, most code makes poor bedtime reading (except perhaps as a soporific), but it is essential that you read the code. It is all right if you don't understand most of the assembly code in the codebase; it is not important for this class that you know assembly.

You should use the code reading questions included below to help guide you through reviewing the existing code. While you needn't review every line of code in the system in order to answer all the questions, we strongly recommend that you look over every file in the system.

The key part of this exercise is understanding the base system. Your goal is to understand how it all fits together so that you can make intelligent design decisions when you approach future assignments. This may seem tedious, but if you understand how the system fits together now, you will have much less difficulty completing future assignments. Also, it may not be apparent yet, but you have much more time to do so now than you will at any other point in the semester.

The file system, I/O, and network sections may seem confusing since we have not discussed how these components work. However, it is still useful to review the code now and get a high-level idea of what is happening in each subsystem. If you do not understand the low-level details now, that is OK.

These questions are not meant to be tricky -- most of the answers can be found in comments in the OS/161 source, though you may have to look elsewhere (such as Tanenbaum) for some background information. Place the answers to the following questions in a file called ~/cs161/asst0/code-reading.txt.

Top Level Directory

The top level directory of many software packages is called src or source. The top of the OS/161 source tree is also called src. In this directory, you will find the following files:

Makefile: top-level makefile; builds the OS/161 distribution, including all the provided utilities, but does not build the operating system kernel.

configure: this is an autoconf-like script. It sets up things like how to run the compiler. You needn't understand this file, although we'll ask you to specify certain pathnames and options when you build your own tree.

defs.mk: this file is generated when you run ./configure. You needn't do anything to this file.

defs.mk.sample: this is a sample defs.mk file. Ideally, you won't be needing it either, but if configure fails, use the comments in this file to fix defs.mk.

You will also find the following directories:

bin: this is where the source code lives for all the utilities that are typically found in /bin, e.g., cat, cp, ls, etc. The things in bin are considered "fundamental" utilities that the system needs to run.

include: these are the include files that you would typically find in /usr/include (in our case, a subset of them). These are user-level include files, not kernel include files.

kern: here is where the kernel source code lives.

lib: library code lives here. We have only two libraries: libc, the C standard library, and hostcompat, which is for recompiling OS/161 programs for the host UNIX system. There is also a crt0 directory, which contains the startup code for user programs.

man: the OS/161 manual ("man pages") appears here. The man pages document (or specify) every program, every function in the C library, and every system call. You will use the system call man pages for reference in the course of assignment 2. The man pages are HTML and can be read with any browser.

mk: this directory contains pieces of makefile that are used for building the system. You don't need to worry about these, although in the long run we do recommend that anyone working on large software systems learn to use make effectively.

sbin: this is the source code for the utilities typically found in /sbin on a typical UNIX installation. In our case, there are some utilities that let you halt the machine, power it off and reboot it, among other things.

testbin: these are pieces of test code.

You needn't understand the files in bin, sbin, and testbin now, but you certainly will later on. Eventually, you will want to modify these and/or write your own utilities and these are good models. Similarly, you need not read and understand everything in lib and include, but you should know enough about what's there to be able to get around the source tree easily. The rest of this code walk-through is going to concern itself with the kern subtree.

The Kern Subdirectory

Once again, there is a Makefile. This Makefile installs header files but does not build anything.

In addition, we have more subdirectories for each component of the kernel as well as some utility directories.

kern/arch: This is where architecture-specific code goes. By architecture-specific, we mean the code that differs depending on the hardware platform on which you're running.

For our purposes, you need only concern yourself with the mips subdirectory.

kern/arch/mips/conf:

conf.arch: This tells the kernel config script where to find the machine-specific, low-level functions it needs (see kern/arch/mips/mips).

Makefile.mips: Kernel Makefile; this is copied when you "config a kernel".

kern/arch/mips/include: These files are include files for the machine-specific constants and functions.

Question 1. Which register number is used for the stack pointer (sp) in OS/161?

Question 2. What bus/busses does OS/161 support?

Question 3. What is the difference between splhigh and spl0?

Question 4. What are some of the details which would make a function "machine dependent"? Why might it be important to maintain this separation, instead of just putting all of the code in one function?

kern/arch/mips/mips: These are the source files containing the machine-dependent code that the kernel needs to run. Most of this code is quite low-level.

Question 5. What does splx return?

Question 6. What is the highest interrupt level?

kern/asst1: This is the directory that contains the framework code that you will need to complete assignment 1. You can safely ignore it for now.

kern/compile: This is where you build kernels. In the compile directory, you will find one subdirectory for each kernel you want to build. In a real installation, these will often correspond to things like a debug build, a profiling build, etc. In our world, each build directory will correspond to a programming assignment, e.g., ASST1, ASST2, etc. These directories are created when you configure a kernel (described in the next section). This directory and build organization is typical of UNIX installations and is not universal across all operating systems.

kern/conf: config is the script that takes a config file, like ASST1, and creates the corresponding build directory. So, in order to build a kernel, you should:

  % cd kern/conf
  % ./config ASST0
  % cd ../compile/ASST0
  % make depend
  % make
This will create the ASST0 build directory and then actually build a kernel in it. Note that you should specify the complete pathname ./config when you configure OS/161. If you omit the ./, you may end up running the configuration command for the system on which you are building OS/161, and that is almost guaranteed to produce rather strange results!
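
The difference between config and ./config comes from how the shell finds commands: a bare name is looked up in the directories listed in PATH, while a name containing a slash is run exactly as given. The sh sketch below illustrates this with two made-up scripts, both named config, in scratch directories (sysbin and tree are hypothetical names chosen for this example, not part of OS/161).

```shell
# Two different programs with the same name, in made-up scratch directories.
scratch=$(mktemp -d)
mkdir "$scratch/sysbin" "$scratch/tree"

printf '#!/bin/sh\necho "system config"\n' > "$scratch/sysbin/config"
printf '#!/bin/sh\necho "local config"\n' > "$scratch/tree/config"
chmod +x "$scratch/sysbin/config" "$scratch/tree/config"

# Put the "system" directory first on the search path, then work in the tree.
PATH="$scratch/sysbin:$PATH"
cd "$scratch/tree"

bare=$(config)        # looked up via PATH: finds sysbin/config
explicit=$(./config)  # path given explicitly: runs the file in this directory

echo "config   -> $bare"
echo "./config -> $explicit"
```

Even though the working directory contains its own config, the bare name runs the copy found through PATH, which is why the instructions insist on ./config.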

kern/dev: This is where all the low level device management code is stored. Unless you are really interested, you can safely ignore most of this directory.

kern/include: These are the include files that the kernel needs. The kern subdirectory contains include files that are visible not only to the operating system itself, but also to user-level programs. (Think about why it's named "kern" and where the files end up when installed.)

Question 7. How frequently are hardclock interrupts generated?

Question 8. What is the standard interface to a file system (i.e., what functions must you implement to implement a new file system)?

Question 9. How large are OS/161 pids? How many processes do you think OS/161 could support as you have it now (ASST0)? A sentence or two of justification is fine.

Question 10. A vnode is a kernel abstraction that represents a file. What operations can you do on a vnode? If two different processes open the same file, do we need to create two vnodes?

Question 11. What is the system call number for a reboot? Is this value available to userspace programs? Why or why not?

kern/lib: These are library routines used throughout the kernel, e.g., managing sleep queues, run queues, kernel malloc, etc.

Question 12. What is the purpose of functions like copyin and copyout in copyinout.c? What do they protect against? Where might you want to use these functions?

Question 13. Why are there two separate implementations of 'malloc' (one in kern/lib/kheap.c, one in lib/libc/malloc.c)? Give one major difference between the two implementations.

kern/main: This is where the kernel is initialized and where the kernel main function is implemented.

kern/thread: Threads are the fundamental abstraction on which the kernel is built.

Question 14. Is it OK to initialize the thread system before the scheduler? Why or why not?

Question 15. What are the possible states that a thread can be in? When do "zombie" threads finally get cleaned up?

Question 16. What function puts a thread to sleep? When might you want to use this function?

kern/userprog: This is where you will add code to create and manage user level processes. As it stands now, OS/161 runs only kernel threads; there is no support for user level code. In Assignment 2, you'll implement this support.

kern/vm: This directory is also fairly vacant. In Assignment 3, you'll implement virtual memory and most of your code will go in here.

kern/fs: The file system implementation has two subdirectories. We'll talk about each in turn.

kern/fs/vfs: This is the file-system independent layer (vfs stands for "Virtual File System"). It establishes a framework into which you can add new file systems easily. You will want to go look at vfs.h and vnode.h before looking at this directory.

Question 17. What does a device pathname in OS/161 look like?

Question 18. What does a raw device name in OS/161 look like?

Question 19. What lock protects the vnode reference count?

kern/fs/sfs: This is the simple file system that OS/161 contains by default. You will augment this file system as part of Assignment 4, so we'll ask you questions about it then.

Building a kernel (hand me in)

Now it is time to build a kernel. As described above, you will need to configure a kernel and then build it.

  1. Script the following steps using the script command.
  2. Configure your tree for the machine on which you are working. If you want to work in a directory that's not $HOME/cs161 (which you will be doing when you test your later submissions) you might want to use the --ostree option. ./configure --help explains the other options.
      % cd ~/cs161/src
      % ./configure
    
  3. Configure a kernel named ASST0.
      % cd ~/cs161/src/kern/conf
      % ./config ASST0
    
  4. Build the ASST0 kernel.
      % cd ../compile/ASST0
      % make depend
      % make
    
  5. Install the ASST0 kernel.
      % make install
    
  6. Now also build the user-level utilities.
      % cd ~/cs161/src
      % make
    
  7. End your script session. Rename your script output to build.script.
      % mv typescript ~/cs161/asst0/build.script
    

Running your kernel (hand me in)

  1. Download the file sys161.conf from the course web site and place it in your OS/161 root directory (~/cs161/root).
  2. Script the following session.
  3. Change into your root directory.
      % cd ~/cs161/root
    
  4. Run the machine simulator on your operating system.
      % sys161 kernel
    
  5. At the prompt, type p /sbin/poweroff <return>. This tells the kernel to run the "poweroff" program that shuts the system down.
  6. End your script session. Rename your script output to run.script.
      % mv typescript ~/cs161/asst0/run.script
    

Practice modifying your kernel (hand me in)

  1. Create a file called kern/main/hello.c.
  2. In this file, write a function called hello that uses kprintf() to print "Hello World\n".
  3. Edit kern/main/main.c and add a call (in a suitable place) to hello().
  4. Make your kernel build again. You will need to edit kern/conf/conf.kern, reconfigure, and rebuild.
  5. Make sure that your new kernel runs and displays the new message.
  6. Once your kernel builds, script a session demonstrating a config and build of your modified kernel. Call the output of this script session newbuild.script.
      % mv typescript ~/cs161/asst0/newbuild.script
    

Using GDB (hand me in)

  1. Script the following gdb session (that is, you needn't script the session in the run window, only the session in the debug window). Be sure both your run window and your debug window are on the same machine.
  2. Run the kernel in gdb by first running the kernel and then attaching to it from gdb.
      (In the run window:)
      % cd ~/cs161/root
      % sys161 -w kernel
    
      (In the debug window:)
      % script
      % cd ~/cs161/root
      % cs161-gdb kernel
      (gdb) target remote unix:.sockets/gdb
      (gdb) break menu
      (gdb) c
         [gdb will stop at menu() ...]
      (gdb) where
         [displays a nice back trace...]
      (gdb) detach
      (gdb) quit
    
  3. End your script session. Rename your script output to gdb.script.
      % mv typescript ~/cs161/asst0/gdb.script
    

Practice with CVS (hand me in)

In order to build your kernel above, you already checked out a source tree. Now we'll demonstrate some of the most common features of CVS. Create a script of the following session (the script should contain everything except the editing sessions; do those in a different window). Call this file cvs-use.script.

  1. Edit the file kern/main/main.c. Add a comment with your name in it.
  2. Execute
      % cvs diff -c kern/main/main.c
    
    to display the differences in your version of this file.
  3. Now commit your changes using cvs commit.
  4. Remove the first 100 lines of main.c.
  5. Try to build your kernel (this ought to fail).
  6. Realize the error of your ways and get back a good copy of the file.
      % rm main.c
      % cvs update -d main.c
    
  7. Try to build your tree again.
  8. Now, examine the DEBUG macro in lib.h. Based on your earlier reading of the operating system, add ten useful debugging messages to your operating system.
  9. Now, show us where you inserted these DEBUG statements by doing a diff.
      % cd ~/cs161/src
      % cvs diff -c
    
  10. Finally, you should create a release (refer to the document on Preparing your Assignments for Submission).
      % cd ~/cs161
      % cvs tag asst0-end src
      % cd ~/cs161/asst0
      % cvs export -rasst0-end src
      % tar cf - src | gzip -c > mygroup-asst0.tar.gz
    
    After you tar your release, be sure to remove the src directory (rm -rf src/), as specified in the handout.
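
Before submitting, it can be worth confirming that the archive really contains your tree. The sh sketch below packs a stand-in src directory with the same tar and gzip pipeline as above and then lists the archive without unpacking it. The directory contents here are made up for illustration; only the pipeline and the archive name come from this handout.

```shell
# Stand-in for an exported tree; the file contents are made up.
work=$(mktemp -d)
cd "$work"
mkdir -p src/kern/main
echo "/* placeholder */" > src/kern/main/main.c

# Pack with the same pipeline as in step 10.
tar cf - src | gzip -c > mygroup-asst0.tar.gz

# Remove the exported tree once the archive exists.
rm -rf src/

# List what the archive holds without unpacking it.
contents=$(gunzip -c mygroup-asst0.tar.gz | tar tf -)
echo "$contents"
```

If the listing is empty or is missing files you expect, recreate the archive before removing the exported src directory.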

What (and how) to hand in

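Like most CS courses, we'll be using the online submit utility. Your asst0 directory should contain everything you need to submit, specifically the files produced in the sections above: setup.script, cvsinit.script, code-reading.txt, build.script, run.script, newbuild.script, gdb.script, cvs-use.script, and mygroup-asst0.tar.gz.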

Review the document Preparing your Assignments for Submission. After you have completed everything described in that document, run the submit command. Do not print out this assignment, regardless of what the handout tells you to do!!