- [7/30] We'll have a Piazza page set up before class begins. We may use Canvas for turning in assignments. We'll also post a link to online lectures as soon as possible.

- 9/1: Background: Review Markov chains.
- Start with Wikipedia. The first external link there is to a useful book chapter on Markov chains.

- 9/6: Authoritative Sources in a Hyperlinked Environment by Jon Kleinberg.
- 9/6: The PageRank Citation Ranking: Bringing Order to the Web by Page, Brin, Motwani, and Winograd.
- Questions for 9/6: How does PageRank differ from Kleinberg's algorithm? How is it the same? Can you think of ways to improve Kleinberg's algorithm, or PageRank?
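
As a warm-up for the 9/6 questions, here is a minimal sketch of both iterations on a toy graph. It is not the code from either paper: the function names, the four-page graph, the damping value of 0.85, the fixed iteration count, and the uniform treatment of pages with no out-links are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iters=100):
    """Power iteration for PageRank.

    links maps each page to the list of pages it links to; every link
    target must also appear as a key. Assumed conventions: uniform
    teleportation, and dangling pages spread their rank uniformly.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = links[v]
            if out:
                share = damping * rank[v] / len(out)
                for w in out:
                    new[w] += share
            else:
                # Dangling page: hand its rank to everyone equally.
                for w in nodes:
                    new[w] += damping * rank[v] / n
        rank = new
    return rank


def _normalize(scores):
    """Scale a score dict to unit Euclidean length (no-op if all zero)."""
    norm = sum(s * s for s in scores.values()) ** 0.5 or 1.0
    return {v: s / norm for v, s in scores.items()}


def hits(links, iters=100):
    """Hub/authority iteration in the style of Kleinberg's algorithm."""
    nodes = sorted(links)
    hub = {v: 1.0 for v in nodes}
    auth = {v: 1.0 for v in nodes}
    for _ in range(iters):
        # Authority score: sum of hub scores of pages linking in.
        auth = {v: 0.0 for v in nodes}
        for v in nodes:
            for w in links[v]:
                auth[w] += hub[v]
        auth = _normalize(auth)
        # Hub score: sum of authority scores of pages linked to.
        hub = _normalize({v: sum(auth[w] for w in links[v]) for v in nodes})
    return hub, auth


# Hypothetical toy graph: a and b both link to c, and c links to d.
toy = {"a": ["c"], "b": ["c"], "c": ["d"], "d": []}
print("PageRank:   ", pagerank(toy))
print("Authorities:", hits(toy)[1])
```

On this toy graph the two methods disagree about the top page: PageRank passes c's score along to d, while the authority scores favor c, which has the most in-links from good hubs. Note also that the sketch runs HITS on the whole graph, whereas Kleinberg's paper runs it on a query-specific subgraph.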

- 9/8: Improved Algorithms for Topic Distillation in a Hyperlinked Environment by Bharat and Henzinger.
- 9/8: Analysis of a Very Large AltaVista Query Log by Henzinger, Marais, Moricz, and Silverstein.
- Questions for 9/8 (Bharat and Henzinger): What techniques are suggested for improving on Kleinberg's algorithm? Do they appear worthwhile given the costs?
- Questions for 9/8 (query log paper): What sort of data do the authors try to mine from the query log? Which seems the most useful? Can you think of anything they should have looked for but did not?

Additional useful papers for Unit 1:

- Graph Structure in the Web by Broder et al.
- The Link Database: Fast Access to Graphs of the Web by Randall et al.
- The EigenTrust Algorithm for Reputation Management in P2P Networks by Kamvar, Schlosser, and Garcia-Molina.
- Trust-Based Recommendation Systems by Andersen et al.
- PicASHOW: Pictorial Authority Search by Hyperlinks on the Web by Lempel and Soffer.
- The Stochastic Approach for Link-Structure Analysis (SALSA) and the TKC Effect by Lempel and Moran.
- Power Laws, Pareto Distributions, and Zipf's Law by Mark Newman.
- On Power-Law Relationships of the Internet Topology by Faloutsos, Faloutsos, and Faloutsos.
- Power-Law Distributions in Empirical Data by Clauset, Shalizi, and Newman.
- The Anatomy of the Long Tail by Goel, Broder, Gabrilovich, and Pang.
- Aggregating Inconsistent Information by Ailon, Charikar, and Newman.
- Editorial: The Future of Power Law Research by Mitzenmacher.

- 9/15, and ongoing: We will be covering the basics of compression and information theory, including Huffman coding, arithmetic coding, LZ-style coding, etc. Some good online introductions to the material include:
  - Information Theory, Inference, and Learning Algorithms by MacKay, specifically Part I (Data Compression). (Though the whole book is good.)
  - Introduction to Data Compression, notes by Guy Blelloch.
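
For a first feel of Huffman coding before the readings, here is a minimal sketch that builds a prefix code from symbol counts and compares its cost against the empirical entropy. The function names and the sample string are assumptions for illustration; the notes above develop the real treatment.

```python
import heapq
import math
from collections import Counter


def huffman_code(text):
    """Return a dict mapping each symbol in text to a bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {sym: "0" for sym in freq}
    # Heap entries are (weight, tiebreak, partial code table); the unique
    # integer tiebreak keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def entropy_bits(text):
    """Empirical entropy of text in bits per symbol."""
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in Counter(text).values())


sample = "abracadabra"  # hypothetical input
code = huffman_code(sample)
encoded = "".join(code[ch] for ch in sample)
print(code)
print(len(encoded) / len(sample), "bits/symbol vs entropy", entropy_bits(sample))
```

The average code length should land between the entropy and the entropy plus one bit per symbol; arithmetic coding is the standard way to close that gap.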

-->