Introduction to Programming: more stuff!

In the last section, we learned the most fundamental programming concepts, like loops and conditionals. This was enough to let us solve the Hamming distance problem. However, we skipped some very useful concepts that are worth going over.

Libraries

In solving our Hamming distance problem, we used some pre-existing functions like range and len, all of which are built into Python. Relying only on built-in functions is actually quite unusual - a lot of the time when solving a problem, we'll want to reuse code that other people have already written but which is not a default part of Python.

This is done with libraries, also called Python packages or modules. To load a library into Python, use the import statement.
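As a minimal sketch of the forms an import can take, here is the math module, which ships with Python. It is chosen only for illustration and is not used elsewhere in this section; the variable names are made up.

```python
# Three common ways to import a library. math is part of Python's
# standard library, so nothing extra needs to be installed.
import math              # names are accessed as math.<name>
import math as m         # the same module under a shorter alias
from math import sqrt    # bring a single name in directly

root_a = math.sqrt(16)
root_b = m.sqrt(16)
root_c = sqrt(16)

# All three calls reach the same function, so the results are identical.
print(root_a)
```

The plain `import math` form is the one used throughout this section, since it makes clear which library each function comes from.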

Here's a library called requests, which is used to read web pages. We could write our own function to connect to a website and read a webpage, but it would be a lot of work. Instead, import requests:

In [1]:
import requests

Now all the functions in the requests library are available to us, prefixed with the library's name. We can ask the requests.get() function to retrieve a web page for us. It will get the source code (the HTML) for the page we ask for.

In [2]:
webpage = requests.get("http://unimelb.edu.au/")
print(webpage.text)

We won't show the output of the above command as it's very long - try running it yourself!

There are a lot of libraries out there, far too many to know them all. You'll generally learn about them as you need them.

Two important Python libraries for scientists are scipy and numpy.

  • numpy provides many functions for efficiently working with matrices, vectors, random numbers, etc.
  • scipy has many parts, including functions for statistics, optimisation and signal processing.

Let's look at an example:

In [4]:
import numpy
In [5]:
# 10 random numbers drawn from the standard normal distribution (mean 0, variance 1)
x = numpy.random.randn(10)
print(x)
[ 0.83575465 -0.38725235 -1.85253931 -0.18169539 -1.63805177  1.2733453
 -1.58089026 -1.02711677 -0.90158365  0.06021346]

Libraries sometimes define their own data types. In this case, the randn() function has returned a numpy array rather than a built-in Python list, which we can see with the type() function:

In [6]:
print(type(x))
<type 'numpy.ndarray'>
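To see why a separate data type can be useful, the following sketch compares a numpy array with a plain Python list. The variable names are made up for illustration, and the values are fixed rather than random so that the behaviour is easy to follow.

```python
import numpy

values = [1.0, 2.0, 3.0, 4.0]    # an ordinary Python list
arr = numpy.array(values)        # the same numbers as a numpy array

# Multiplying a list repeats it, giving a list of 8 elements.
doubled_list = values * 2

# Multiplying an array works element by element, like a vector.
doubled_arr = arr * 2

print(len(doubled_list))
print(doubled_arr)

# Arrays also carry numerical methods that lists do not have.
print(arr.mean())
```

Because arithmetic applies to every element at once, numpy arrays are usually more convenient than lists for numerical work.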