CS 222 -- Algorithms at the End of the Wire
Handouts and Class Materials
- [9/22] There's an "assignment" up on compression. It is not actually due, but is meant to give you problems to try/practice to make sure you understand the material. Some of it is on papers we'll get to later so you might not be able to do all the problems immediately. I think we'll plan one or more sections with the TA Daniel for people who want to work through the problems together.
- [9/11] On Canvas there's a video release form you need to sign --
it is necessary for us to record the course. It's in the form of a quiz.
Please check on Canvas (a link will be in your Canvas announcements).
- [9/11] Just a reminder that the schedule for readings, etc. may change as we go.
- [9/11] I'm putting up preliminary project directions, as people have asked about them. For the record, it's still very early to talk about projects -- we haven't even started information theory, compression, etc. !! -- but some people want to think ahead, and it's good to have expectations up.
- [9/3] Discussions for next week are up.
- [8/19] I'm told that there's help for (extension students, but maybe Harvard students can use it too)
- [8/19] Equipment Requirements: I've been told to put on the syllabus that iPads with a stylus, or related devices, are required for homework, office hours, class discussions, etc. So I am doing so! This will allow you to borrow equipment from Harvard -- the contact number for this is (617) 495-7777. Of course, if you don't manage to borrow one we will find ways to cope, but it will certainly be a lot better if you do -- it's a lot easier to work on problems with peers (or with me) with a device.
- [8/15] I'll have a Piazza page up soon. This will be for class questions; we'll probably use a separate tool for class discussion issues.
The Piazza link is piazza.com/harvard/fall2020/cs222 . I can enter you if you can't sign up.
- [8/15] We will plan to use Gradescope+Canvas to turn in assignments. Please make sure you are
signed up to Gradescope+Canvas. (This probably won't be set up for a week or two.)
Basic Class Information
All reading dates are tentative and will be confirmed in class.
We may go faster, we may go slower. We may have to move things
for other reasons.
You should consider the assigned papers a minimum of what you should be
reading for this class. Feel free to explore on the Web or otherwise
(additional suggested readings for each topic will be listed as well). We are just
scratching the surface of these topics; there's much more out there.
Unit 0: Fun Stuff to Start Us Off
Please read all these, preferably before class begins, to see if you're interested.
Unit 1: Data Sketches
- Class 3 [9/10]: MapReduce: Simplified Data Processing on Large Clusters, by Dean and Ghemawat.
- Class 3 [9/10]: A Model of Computation for MapReduce, by Karloff, Suri, and Vassilvitskii.
- Questions for Discussion Groups: Compare and contrast the theory and practice of the MapReduce paradigm. What kind of tasks might MapReduce not be good for?
Are there any eventual scaling problems for this paradigm? Suppose you had access to a large-scale MapReduce system -- what would you
most want to use it for?
- Class 4 [9/15]: Constant Time Updates in Hierarchical Heavy Hitters, by Ran Ben Basat, Gil Einziger, Roy Friedman, Marcelo Caggiani Luizelli, and Erez Waisbard.
- Class 4 [9/15]: Faster and More Accurate Measurement through Additive-Error Counters, by Ran Ben Basat, Gil Einziger, Michael Mitzenmacher, and Shay Vargaftik.
- Questions for Discussion Groups: The sketching framework offers lots of opportunities, or "tricks", for putting together pieces in various ways to get better results. Describe the base operations that get put together in these papers. Do you find these combinations compelling? Can you think of other areas where putting together a combination of ideas in the right way might be useful?
Additional useful papers and other stuff for Unit 1:
Unit 2: Compression and Basic Information Theory
- [9/22] and ongoing: We will be covering the basics of compression and
information theory, including Huffman coding, arithmetic coding, LZ-style coding, etc.
Some good online introductions to the material include:
Information Theory, Inference, and Learning Algorithms, specifically part 1 (Data Compression) of MacKay's book. (Though the whole book is good.)
Introduction to Data Compression, notes by Guy Blelloch.
- Class 6 [9/22]: For today the plan is (likely to be) that you review the last class's lecture on compression topics, and we discuss/do problems/etc. in class.
- Class 7 [9/24]: On Compressing Social Networks, by F. Chierichetti, R. Kumar, S. Lattanzi, M. Mitzenmacher, A. Panconesi, and P. Raghavan.
- Class 7 [9/24]: Permuting Web and Social Graphs, by P. Boldi, M. Santini, and S. Vigna.
- Questions for 9/24: How are compressing social networks and the web alike, and different? What role does having an underlying model
play in determining how to compress these types of structures? What properties of the model(s) appear important?
Unit 3: Link Information
- [Class 18] 11/03: No class for election day; please work on reviews for the mock PC program.
- [Class 19] 11/05: Tentatively: last guest speaker.
- [Class 20] 11/10: Fountain Codes, by MacKay.
- [Class 20] 11/10: Digital Fountains: A Survey and Look Forward, by Mitzenmacher.
- [Class 20] 11/10: XORs in the Air: Practical Wireless Network Coding, by Katti, Rahul, Hu, Katabi, Medard, and Crowcroft.
- [Class 20] 11/10: NOTE: not required, just background reading if you like! Network Coding: An Instant Primer, by Fragouli, Le Boudec, and Widmer.
- Questions for 11/10: Explain, in your own words, what a Digital Fountain is. How close can we get to a Digital Fountain in practice?
- Questions for 11/10: What is the difference between network coding and digital fountains? Where might each be helpful?
- [Class 21] 11/12: Embedded ethics lecture. Please read the below before lecture and answer the discussion question.
- [Class 21] 11/12: Quantifying Fairness in Queueing Systems.
- Questions for 11/12: The authors of the paper discuss two scheduling policies: First-In-First-Out and Shortest-Job-First. Which do you think is more fair, and why?