I’m assuming that you have Python installed and are using a fairly recent version from the 2 series, e.g., Python 2.7. If you have a Mac, note that even though it comes with a version of Python installed, you probably want to go ahead and install it yourself; you’ll run into fewer problems if you do this.

If you’re just getting started with Python, I highly recommend Mark Lutz’s Learning Python. This is an excellent O’Reilly book that clearly introduces Python’s syntax and data structures, and how to really leverage the language through “Pythonic” thinking. A few years ago, I spent about two weeks reading this book and going through the examples, and then I was ready to start using Python for my daily work. Also check out the awesome new Online Python Tutor by Philip Guo.

If you want to do any sort of scientific computing or data analysis in Python, you almost definitely want NumPy. If you are trying to construct a MATLAB-like environment in Python, then at a bare minimum you want NumPy + IPython + matplotlib. A great way to get all this good stuff and more is via the Enthought Python distribution (EPD).

- NumPy (short for Numerical Python) gives you a powerful N-dimensional array object, fast linear algebra routines, and lots of convenience functions. It is the basis of many other Python packages for scientific computing and data analysis. If you are a MATLAB user, check out NumPy for MATLAB users.
- IPython is an incredibly useful Python shell for interactive computing and exploratory data analysis. It integrates with matplotlib and with many GUI toolkits, so you can plot interactively from the shell.
- matplotlib is a rich 2D plotting environment. Its look and interfaces should feel very familiar to any MATLAB user.
- SciPy is a suite of tools for scientific computing. Check out their Getting Started page.
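
To give a feel for the array object and linear algebra routines mentioned above, here is a minimal sketch. It assumes NumPy is installed and imported under the conventional alias `np`; the matrix and vector are made-up values chosen only for illustration.

```python
import numpy as np

# A 2-D array (a matrix) and a 1-D array (a vector); values are illustrative.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Arithmetic applies element-wise to the whole array at once.
doubled = A * 2

# Solve the linear system A x = b.
x = np.linalg.solve(A, b)

# Check the solution by multiplying back; the residual should be close to zero.
residual = np.dot(A, x) - b

print(A.shape)
print(x)
```

MATLAB users will recognize the pattern: build arrays, operate on them as whole objects, and call a library routine for the linear algebra.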

A lot of my code also depends on tabular, a package of Python modules for working with tabular data, built on top of NumPy. You should install NumPy 1.6 or higher before installing tabular.

If you use Git, the best way to get tabular is to clone the git repository and then do the usual installation for Python packages:

```
$ git clone https://github.com/yamins81/tabular.git
$ cd tabular
$ sudo python setup.py install
```

While not recommended, you can also get it from PyPI or use pip. You can download and unzip the PyPI archive and then do the usual installation for Python packages:

```
$ cd tabular
$ sudo python setup.py install
```

Here are a few highlights from the Python documentation:

- built-in functions - `all()`, `len()`, `max()`, etc.
- built-in constants - `True`, `False`, `None`, etc.
- built-in types - tons of basic stuff
- csv - CSV file reading and writing
- datetime - basic date and time types
- hashlib - secure hashes and message digests
- pickle - Python object serialization
- os - miscellaneous operating system interfaces
- re - regular expression operations
- time - time access and conversions
- unittest - unit testing framework
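
As a quick tour of several of the modules above, the following sketch parses a few CSV lines, extracts dates with a regular expression, does some date arithmetic, hashes a byte string, and round-trips a list through pickle. The sample data and the helper `parse_date` are made up for illustration; the code sticks to features that should behave the same on Python 2.7 and Python 3.

```python
import csv
import hashlib
import pickle
import re
from datetime import date, timedelta

# csv: the reader accepts any iterable of lines (here, a made-up list of strings).
lines = ["name,joined", "ada,2012-03-01", "grace,2012-03-15"]
rows = list(csv.reader(lines))
header, records = rows[0], rows[1:]

# re: capture year, month, and day from an ISO-style date string.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_date(text):
    """Hypothetical helper: turn 'YYYY-MM-DD' into a datetime.date."""
    year, month, day = date_pattern.match(text).groups()
    return date(int(year), int(month), int(day))


# datetime: subtracting dates gives a timedelta; adding a timedelta gives a date.
joined = [parse_date(record[1]) for record in records]
gap = joined[1] - joined[0]
week_later = joined[0] + timedelta(days=7)

# hashlib: hex digest of a byte string.
digest = hashlib.sha256(b"hello").hexdigest()

# pickle: serialize an object to bytes and load it back.
restored = pickle.loads(pickle.dumps(records))

# built-in functions: max(), len(), and all() over the parsed records.
longest_name = max(len(record[0]) for record in records)
all_dated = all(date_pattern.match(record[1]) for record in records)
```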

Use pip for installing and managing Python packages whenever possible. Its basic usage looks like this:

`$ pip install coolpackage`

Here’s a partial list of Python packages that I use:

- BeautifulSoup for parsing HTML and XML (download the .tar for 3.0.8)
- hcluster for agglomerative clustering
- mechanize for programmatic web browsing
- pymongo for using MongoDB
- nose for better unit testing
- pip for managing Python dependencies
- Sphinx for generating these and other Python docs
- StarCluster for interacting with Amazon EC2
- xlrd for dealing with MS Excel files
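
One way to check which packages are already importable before reaching for pip is sketched below. The function `find_missing` is a hypothetical helper, not part of pip or any of the packages listed, and `no_such_module_xyz` is a deliberately fake name. Keep in mind that a package's import name can differ from the name you pass to pip.

```python
import importlib


def find_missing(module_names):
    """Return the names in module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Two standard-library modules plus one name that should not exist.
wanted = ["csv", "re", "no_such_module_xyz"]
print(find_missing(wanted))
```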

And finally, other tools that I use all the time: