02 - Python Basics

Ways to run Python code

In this workshop we're using a software platform called the Jupyter Notebook, which lets you run Python code inside your web browser. For example, click on the next cell and press Ctrl+Enter to run this snippet of Python:

In [1]:
print("Hello world")
Hello world

Python can also be run interactively at the command line in your terminal window, where >>> represents the interactive Python prompt and quit() is the simplest way to exit.

$ python
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 12:04:33) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hello world")
Hello world
>>> quit()

More commonly people write Python scripts. These are plain text files usually ending with the .py extension, which can be run like this:

$ python example.py
...

Any self-contained snippet of Python (including many of the examples here) could be run this way - but doing it in the notebook is in many ways easier, especially for interactive work and for keeping notes alongside the code. Jupyter Notebooks really shine when producing graphics with Python, as we will see this afternoon.

Strings

Python strings can be defined using double quotes or single quotes. It doesn't matter which you use, but the opening and closing quotes have to match. Strings can be added together (concatenated) with the + operator, or repeated by multiplying by an integer:

In [2]:
name = "Hello"
message = name + " world"
print(message)
print(message * 3)
Hello world
Hello worldHello worldHello world
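
As a small illustration of the quoting rule above, the following sketch shows that both quote styles give the same string, and that using one style lets you include the other inside the text. The variable names are just for illustration.

```python
# Single and double quotes define exactly the same string
greeting_double = "Hello world"
greeting_single = 'Hello world'
print(greeting_double == greeting_single)

# Using one style of quote lets you put the other inside the string
quoted = 'She said "Hello world"'
print(quoted)
```

The first print shows True, and the second shows the text with its inner double quotes intact.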

It is very common to want to combine strings together, often including numbers or other values. A widely used approach to string formatting works with percent sign placeholders:

  • %s to insert a string
  • %i to insert an integer number
  • %f to insert a floating point number

(This convention was introduced in the C programming language, which was enormously influential in later programming language design.)

In [3]:
name = "Peter"
message = "Hello %s, your name has %i letters" % (name, len(name))
print(message)
Hello Peter, your name has 5 letters
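
The example above uses %s and %i but not %f. As a minimal sketch, %f shows six decimal places by default, and a variant such as %.2f limits the number of decimal places shown. The variable names here are only for illustration.

```python
fraction = 1 / 3

# %f on its own shows six decimal places
default_text = "Default: %f" % fraction
# %.2f rounds the displayed value to two decimal places
short_text = "Two places: %.2f" % fraction

print(default_text)
print(short_text)
```

This prints "Default: 0.333333" followed by "Two places: 0.33". Only the displayed text is rounded; the value stored in fraction is unchanged.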

Lists

The Python list serves as a general purpose data structure for holding an ordered collection of values. This is similar to an 'array' in other languages.

You can have lists of strings, lists of integers, etc. The length of a list is defined as the number of elements in the list.

In [4]:
names = ["Peter", "Sue", "Leighton"]
print(len(names))
3
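
As a further sketch, here is a list of integers rather than strings. It also shows two things not covered above: picking out a single element using square brackets (Python counts from zero), and the empty list, which has length zero.

```python
# A list of integers (here, the lengths of the three names above)
lengths = [5, 3, 8]
print(len(lengths))

# Square brackets pick out one element; counting starts at zero
first_length = lengths[0]
print(first_length)

# An empty list is valid, and has length zero
nothing = []
print(len(nothing))
```

This prints 3, then 5, then 0.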

For loops

Most programming languages, including Python, have several ways to repeat a block of code multiple times. Python's for loop works with a loop variable (letter in the example below) which takes in turn each of the values to be looped over (here the letters in string variable message):

In [5]:
message = "Hello world"
for letter in message:
    print(letter)
H
e
l
l
o
 
w
o
r
l
d

Another common situation is to loop over a list of values:

In [6]:
for value in ["alpha", "beta", "gamma", "delta"]:
    print(value)
alpha
beta
gamma
delta

Later in the workshop you'll see this syntax used with other constructs, including parsing a sequence file, where we loop over each sequence record in the file.
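
Loops are often used to build up a result rather than just printing each value. The following sketch, using a variable name chosen for illustration, adds up the lengths of the strings in a list:

```python
total_letters = 0
for value in ["alpha", "beta", "gamma", "delta"]:
    # Add the length of the current string to the running total
    total_letters = total_letters + len(value)
print(total_letters)
```

This prints 19, since the four words have 5, 4, 5 and 5 letters.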

Defining Functions

Often as your Python code gets longer you will find you repeat snippets of code. In this situation it is usually best to turn the repeated code into a function, which can be defined once and then used multiple times. This avoids duplication, so any later fix or change only has to be made in one place.

In [7]:
# Python keyword def is short for define
# Here defining a function taking one argument
def make_message(name):
    length = len(name)
    # Python keyword return exits the function with this value:
    return "Hello %s, your name is %i characters long" % (name, length)

print(make_message("Peter"))
print(make_message("Sue"))
print(make_message("Leighton"))
Hello Peter, your name is 5 characters long
Hello Sue, your name is 3 characters long
Hello Leighton, your name is 8 characters long

For loops are also very important for reducing duplicated code, so in this little example rather than calling our function three times we could do this:

In [8]:
# Assumes you've already executed the cells above which defined
# the list *names* and the function *make_message*
for name in names:
    print(make_message(name))
Hello Peter, your name is 5 characters long
Hello Sue, your name is 3 characters long
Hello Leighton, your name is 8 characters long

The examples we have shown so far are functions taking a single argument, but functions can take multiple arguments. This example is a function which requires two arguments:

In [9]:
def letter_frequency(text, letter):
    return text.count(letter) / len(text)

sequence = "AGTGACACAGGT"
for base in "ACGT":
    print("Frequency of letter %s is %f" % (base, letter_frequency(sequence, base)))
Frequency of letter A is 0.333333
Frequency of letter C is 0.166667
Frequency of letter G is 0.333333
Frequency of letter T is 0.166667
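
As a sketch of how the pieces so far fit together, the function below (a hypothetical helper, not part of the workshop material) also takes two arguments, and uses a for loop to count several letters at once. Here it gives the combined frequency of G and C in the same sequence.

```python
def combined_frequency(text, letters):
    # Count every occurrence of each of the given letters
    total = 0
    for letter in letters:
        total = total + text.count(letter)
    # Divide by the length to get a fraction between 0 and 1
    return total / len(text)

sequence = "AGTGACACAGGT"
gc_text = "Combined frequency of G and C is %f" % combined_frequency(sequence, "GC")
print(gc_text)
```

This prints "Combined frequency of G and C is 0.500000", matching the sum of the separate G and C frequencies shown above. Like letter_frequency, it assumes the text is not empty, since dividing by a length of zero would give an error.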

This example also introduced something new for counting the letters in a string. Python strings have lots of methods, a special kind of Python function which acts on the object itself via this .method(...) syntax.

In [10]:
print(message.upper())
print(message.lower())
print(message.count("l"))
HELLO WORLD
hello world
3
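
Two more string methods, shown here as a brief sketch, are replace and startswith. String methods like these return a new value and leave the original string unchanged.

```python
message = "Hello world"

# replace returns a new string with the substitution made
new_message = message.replace("world", "Python")
print(new_message)

# startswith returns True or False
print(message.startswith("Hello"))

# The original string has not been altered
print(message)
```

This prints "Hello Python", then True, then the original "Hello world".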

Resources

We've tried to introduce a minimum of concepts and syntax here. There will be more Python examples later on, but we won't have time to explain them in detail as we want to focus on the bioinformatics instead.