We use Python for this:
Use what your colleagues (tend to) use
To analyse and visualise experimental data
Tabular (comma-separated) data
We can do this with a little programming
Before we begin…
cd ~/Desktop
mkdir python-novice-inflammation
cd python-novice-inflammation
LIVE DEMO
Before we begin…
cp 2017-03-23-standrews/lessons/python-01/data/python-novice-inflammation-data.zip ./
cp 2017-03-23-standrews/lessons/python-01/data/python-novice-inflammation-code.zip ./
unzip python-novice-inflammation-data.zip
unzip python-novice-inflammation-code.zip
(you can download the files via the Etherpad: http://pad.software-carpentry.org/2017-03-23-standrews)
LIVE DEMO
Jupyter
At the command line, start Jupyter notebook:
jupyter notebook
Jupyter landing page
Jupyter documents are composed of cells
A Jupyter cell can have one of several types
Markdown
Markdown allows us to enter formatted text.
Shift + Enter runs the current cell
A variable called name, containing "Samia"
The print() function shows the contents of a variable
weight_kg = 55
print(weight_kg)
2.2 * weight_kg
print("weight in pounds", 2.2 * weight_kg)
weight_kg = 57.5
print("weight in kilograms is now:", weight_kg)
weight_lb = 2.2 * weight_kg
print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)
weight_kg = 100
print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)
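The last two print() calls above report the same weight_lb even though weight_kg has changed. A minimal sketch of why, reusing the names above (stale_lb is a hypothetical name added only for illustration): assignment stores the value calculated at that moment, not the formula, so weight_lb has to be recalculated explicitly.

```python
# Assignment stores the value computed at that moment, not a formula.
weight_kg = 57.5
weight_lb = 2.2 * weight_kg      # calculated while weight_kg is 57.5

weight_kg = 100                  # changing weight_kg later...
stale_lb = weight_lb             # ...leaves weight_lb untouched (hypothetical name)

weight_lb = 2.2 * weight_kg      # recalculate to bring it up to date
print('stale:', stale_lb, 'recalculated:', weight_lb)
```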
What are the values in mass and age after the following code is executed?
mass = 47.5
age = 122
mass = mass * 2.0
age = age - 20
mass == 47.5, age == 122
mass == 95.0, age == 102
mass == 47.5, age == 102
mass == 95.0, age == 122
What does the following code print out?
first, second = 'Grace', 'Hopper'
third, fourth = second, first
print(third, fourth)
Hopper Grace
Grace Hopper
"Grace Hopper"
"Hopper Grace"
In a Jupyter notebook or IPython terminal…
%whos will show you all defined variables
data/inflammation-01.csv
$ head data/inflammation-01.csv
0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0
0,1,2,1,2,1,3,2,2,6,10,11,5,9,4,4,7,16,8,6,18,4,12,5,12,7,11,5,11,3,3,5,4,4,5,5,1,1,0,1
0,1,1,3,3,2,6,2,5,9,5,7,4,5,4,15,5,11,9,10,19,14,12,17,7,12,11,7,4,2,10,5,4,2,2,3,2,2,1,1
0,0,2,0,4,2,2,1,6,7,10,7,9,13,8,8,15,10,10,7,17,4,4,7,6,15,6,4,9,11,3,5,6,3,3,4,2,3,2,1
0,1,1,3,3,1,3,5,2,4,4,7,6,5,3,10,8,10,6,17,9,14,9,7,13,9,12,6,7,7,9,6,3,2,2,4,2,0,1,1
0,0,1,2,2,4,2,1,6,4,7,6,6,9,9,15,4,16,18,12,12,5,18,9,5,3,10,3,12,7,8,4,7,3,5,4,4,3,2,1
0,0,2,2,4,2,2,5,5,8,6,5,11,9,4,13,5,12,10,6,9,17,15,8,9,3,13,7,8,2,8,8,4,2,3,5,4,1,1,1
0,0,1,2,3,1,2,3,5,3,7,8,8,5,10,9,15,11,18,19,20,8,5,13,15,10,6,10,6,7,4,9,3,5,2,5,3,2,2,1
0,0,0,3,1,5,6,5,5,8,2,4,11,12,10,11,9,10,17,11,6,16,12,6,8,14,6,13,10,11,4,6,4,7,6,3,2,1,0,0
0,1,1,2,1,3,5,3,5,8,6,8,12,5,13,6,13,8,16,8,18,15,16,14,12,7,3,8,9,11,2,5,4,5,1,4,1,2,0,0
numpy library
Python libraries
Python contains many powerful, general tools
Load libraries with import:
import numpy
import seaborn
JUPYTER MAGIC
Jupyter … is through magic
%pylab inline
import numpy
import seaborn
Jupyter notebooks
numpy, seaborn, pylab
numpy: work with matrices and arrays in Python
seaborn: attractive statistical summary graphs
pylab: numerical operations and visualisation in Python
Calling %pylab inline shows graphics within the notebook itself
numpy provides a function loadtxt() to load tabular data:
numpy.loadtxt(fname='data/inflammation-01.csv', delimiter=',')
loadtxt() belongs to numpy
fname: an argument expecting the path to a file
delimiter: an argument expecting the character that separates columns
... indicate missing rows or columns
Trailing dots mark floating-point values (1 == 1. == 1.0)
data
type(data)
print(data.dtype)
print(data.shape)
LIVE DEMO
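The calls above inspect a variable called data, which holds the array returned by loadtxt(). A minimal sketch of the whole sequence, assuming a tiny in-memory stand-in for the file (the first four values of the first three rows shown earlier) so that it runs without data/inflammation-01.csv; csv_text is a hypothetical name.

```python
import io
import numpy

# Stand-in for data/inflammation-01.csv: first four values of its first three rows
csv_text = "0,0,1,3\n0,1,2,1\n0,1,1,3\n"

# loadtxt() accepts a path or an open file-like object; keep the result in a variable
data = numpy.loadtxt(fname=io.StringIO(csv_text), delimiter=',')

print(type(data))    # the array type provided by numpy
print(data.dtype)    # the type of each element in the array
print(data.shape)    # (number of rows, number of columns)
```

With the real file, only the fname argument changes: fname='data/inflammation-01.csv'.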
data
data.<attribute>, e.g. data.shape
print('first value in data:', data[0, 0])
print('middle value in data:', data[30, 20])
LIVE DEMO
Slices are written with : (colon).
print(data[0:4, 0:10])
print(data[5:10, 0:10])
LIVE DEMO
If the start of a slice is omitted, Python assumes the first element
If the end of a slice is omitted, Python assumes the end element
QUESTION: What would : on its own indicate?
small = data[:3, 36:]
print('small is:')
print(small)
LIVE DEMO
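The slicing rules above can be tried on a small array where every value is easy to track. This is only a sketch: grid is a hypothetical 4 x 5 stand-in for the inflammation data.

```python
import numpy

# A small stand-in array: 4 rows, 5 columns, holding the values 0..19
grid = numpy.arange(20).reshape(4, 5)

top_left = grid[0:2, 0:3]    # rows 0-1, columns 0-2 (the end index is excluded)
from_start = grid[:2, :3]    # omitted start: slicing begins at the first element
to_end = grid[2:, 3:]        # omitted end: slicing runs to the last element
everything = grid[:, :]      # ':' on its own selects the whole axis

print(top_left)
print(to_end)
```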
We can take slices of any series, not just arrays.
element = 'oxygen'
print('first three characters:', element[0:3])
first three characters: oxy
What is the value of element[:4]?
oxyg
gen
oxy
en
arrays know how to perform operations on their values
+, -, *, /, etc. are elementwise
doubledata = data * 2.0
print('original:')
print(data[:3, 36:])
print('doubledata:')
print(doubledata[:3, 36:])
tripledata = doubledata + data
print('tripledata:')
print(tripledata[:3, 36:])
LIVE DEMO
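Elementwise arithmetic is easier to check by eye on a tiny array. A minimal sketch, where sample is a hypothetical 2 x 2 stand-in for data:

```python
import numpy

sample = numpy.array([[1.0, 2.0],
                      [3.0, 4.0]])   # hypothetical stand-in for data

doubled = sample * 2.0        # every element is multiplied by 2.0
tripled = doubled + sample    # elements are added position by position

print(doubled)
print(tripled)
```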
numpy functions
numpy provides functions to operate on arrays
print(numpy.mean(data))
maxval, minval, stdval = numpy.max(data), numpy.min(data), numpy.std(data)
print('maximum inflammation:', maxval)
print('minimum inflammation:', minval)
print('standard deviation:', stdval)
maxval, minval, stdval = data.max(), data.min(), data.std()
print('maximum inflammation:', maxval)
print('minimum inflammation:', minval)
print('standard deviation:', stdval)
LIVE DEMO
patient_0 = data[0, :] # Row zero only, all columns
print('maximum inflammation for patient 0:', patient_0.max())
print('maximum inflammation for patient 0:', numpy.max(data[0, :]))
print('maximum inflammation for patient 2:', numpy.max(data[2, :]))
LIVE DEMO
numpy operations on axes
numpy functions take an axis= parameter: 0 (one result per column) or 1 (one result per row)
print(numpy.max(data, axis=1))
print(data.mean(axis=0))
LIVE DEMO
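The axis= parameter is easy to get the wrong way round, so here is a small sketch on made-up numbers: sample is a hypothetical array with 2 patients (rows) and 3 days (columns).

```python
import numpy

# Hypothetical data: 2 patients (rows) x 3 days (columns)
sample = numpy.array([[1.0, 5.0, 3.0],
                      [4.0, 2.0, 6.0]])

per_patient_max = numpy.max(sample, axis=1)   # works along each row: one value per patient
per_day_mean = sample.mean(axis=0)            # works down each column: one value per day

print(per_patient_max)
print(per_day_mean)
```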
Here’s one I prepared earlier (for the Software Sustainability Institute):
matplotlib
matplotlib is the de facto standard plotting library in Python
We imported seaborn earlier, which makes matplotlib output nicer
We called %pylab inline earlier, which puts matplotlib output in the notebook
import matplotlib.pyplot
image = matplotlib.pyplot.imshow(data)
LIVE DEMO
matplotlib .imshow()
.imshow() renders matrix values as an image
matplotlib .plot()
.plot() renders a line graph
ave_inflammation = numpy.mean(data, axis=0)
ave_plot = matplotlib.pyplot.plot(ave_inflammation)
LIVE DEMO
.mean() looks artificial
max_plot = matplotlib.pyplot.plot(numpy.max(data, axis=0))
min_plot = matplotlib.pyplot.plot(numpy.min(data, axis=0))
LIVE DEMO
Can you create a plot showing the standard deviation (numpy.std()) of the inflammation data for each day across all patients?
fig = matplotlib.pyplot.figure()
ax = fig.add_subplot()
ax.set_ylabel()
ax.plot()
LIVE DEMO
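The four calls listed above fit together roughly as follows. This is one possible sketch, not the only layout: sample is a hypothetical 2 x 3 array, and the 'Agg' backend is an assumption made so the code draws off-screen outside a notebook (it is not needed after %pylab inline).

```python
import matplotlib
matplotlib.use('Agg')            # assumption: draw off-screen; unnecessary in a notebook
import matplotlib.pyplot
import numpy

# Hypothetical data: 2 patients (rows) x 3 days (columns)
sample = numpy.array([[0.0, 1.0, 2.0],
                      [0.0, 3.0, 6.0]])

fig = matplotlib.pyplot.figure()              # an empty figure
ax = fig.add_subplot(1, 1, 1)                 # a 1 x 1 grid of axes, first position
ax.set_ylabel('standard deviation')           # label the y-axis
lines = ax.plot(numpy.std(sample, axis=0))    # one value per day (column)
```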
Can you modify the last plot to display the three graphs on top of one another, instead of side by side?
for loops
word = "lead"
print(word[0])
print(word[1])
print(word[2])
print(word[3])
LIVE DEMO
for loops
for loops perform actions for every item in a collection
word = "lead"
for char in word:
    print(char)
LIVE DEMO
for loops
for element in collection:
    <do things with element>
The for loop statement ends in a colon, :
The loop body is indented, e.g. with a tab (\t)
for loop cycles
length = 0
for vowel in 'aeiou':
    length = length + 1
print('There are', length, 'vowels')
LIVE DEMO
letter = 'z'
for letter in 'abc':
    print(letter)
print('after the loop, letter is', letter)
LIVE DEMO
range()
The range() function creates a sequence of numbers
It returns a range type that can be iterated over.
range(3)
range(2, 5)
range(3, 10, 3)
for val in range(3, 10, 3):
    print(val)
LIVE DEMO
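A range does not show its numbers until it is iterated over. As a quick sketch, converting each of the ranges above to a list makes the start, stop and step rules visible (the variable names are hypothetical):

```python
up_to_three = list(range(3))          # start defaults to 0; stop value is excluded
two_to_five = list(range(2, 5))       # start at 2, stop before 5
in_threes = list(range(3, 10, 3))     # start at 3, stop before 10, step of 3

print(up_to_three)
print(two_to_five)
print(in_threes)
```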
Exponentiation is a Python built-in: print(5 ** 3)
Can you use a for loop to calculate 5 ** 3 using only multiplication?
Can you write a loop that takes a string, e.g. Newton, and produces a new string with the characters in reverse order, e.g. notweN?
enumerate()
The enumerate() function creates paired indices and values for elements of a sequence
enumerate("aeiou")
for idx, val in enumerate("aeiou"):
    print(idx, val)
\[y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4\]
coeffs = [2, 4, 3, 2, 1]
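One way to evaluate the polynomial above is to let enumerate() supply the power that goes with each coefficient. A minimal sketch, assuming coeffs is ordered a_0 to a_4 and using x = 2 as an arbitrary input:

```python
coeffs = [2, 4, 3, 2, 1]    # a_0 ... a_4, in that order (assumed)
x = 2                       # arbitrary input value chosen for illustration

y = 0
for idx, coeff in enumerate(coeffs):
    y = y + coeff * x ** idx    # add the term a_idx * x^idx

print(y)    # 2 + 8 + 12 + 16 + 16
```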
lists are a built-in Python datatype
odds = [1, 3, 5, 7]
print('odds are:', odds)
print('first and last:', odds[0], odds[-1])
for number in odds:
    print(number)
LIVE DEMO
lists, like strings, are sequences
list elements can be changed: lists are mutable
strings are not mutable
names = ['Newton', 'Darwing', 'Turing'] # typo in Darwin's name
print('names is originally:', names)
names[1] = 'Darwin' # correct the name
print('final value of names:', names)
name = 'Darwin'
name[0] = 'd' # raises TypeError: strings do not support item assignment
Changing lists in-place
my_list = [1, 2, 3, 4]
your_list = my_list
my_list[1] = 0
print("my list:", my_list)
print("your list:", your_list)
LIVE DEMO
What happened to your_list?
list copies
Copy a list by slicing it or using the list() function
new_list = old_list[:]
my_list = [1, 2, 3, 4]
your_list = my_list[:] # or list(my_list)
print("my list:", my_list)
print("your list:", your_list)
my_list[1] = 0
print("my list:", my_list)
print("your list:", your_list)
LIVE DEMO
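The difference between the two demos above comes down to whether two names refer to the same list. A short sketch using the is operator to check this (alias and duplicate are hypothetical names):

```python
my_list = [1, 2, 3, 4]
alias = my_list            # a second name for the same list object
duplicate = my_list[:]     # a new list holding the same elements

my_list[1] = 0             # change the original in place

print(alias is my_list, alias)           # same object, so the change is visible
print(duplicate is my_list, duplicate)   # separate object, so it is unchanged
```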
lists
lists can contain any datatype, even other lists
x = [['pepper', 'zucchini', 'onion'],
     ['cabbage', 'lettuce', 'garlic'],
     ['apple', 'pear', 'banana']]
LIVE DEMO
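Indexing a nested list works one level at a time. A brief sketch with the list x from above (the result names are hypothetical):

```python
x = [['pepper', 'zucchini', 'onion'],
     ['cabbage', 'lettuce', 'garlic'],
     ['apple', 'pear', 'banana']]

first_row = x[0]        # one index returns an inner list
first_item = x[0][0]    # a second index reaches inside that inner list
wrapped = [x[0]]        # square brackets around it build a new list containing one list

print(first_row)
print(first_item)
print(wrapped)
```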
list functions
lists are Python objects and have useful functions
odds.append(9)
print("odds after adding a value:", odds)
odds.reverse()
print("odds after reversing:", odds)
print(odds.pop())
print("odds after popping:", odds)
LIVE DEMO
Overloading: an operator (e.g. +) having more than one meaning, depending on the thing it operates on.
vowels = ['a', 'e', 'i', 'o', 'u']
vowels_welsh = ['a', 'e', 'i', 'o', 'u', 'w', 'y']
print(vowels + vowels_welsh)
counts = [2, 4, 6, 8, 10]
repeats = counts * 2
print(repeats)
What do ‘addition’ (+) and ‘multiplication’ (*) do for lists?
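To see overloading directly, the same two operators can be applied to a list and to a numpy array built from it. This is a small sketch; array_counts and the result names are hypothetical.

```python
import numpy

counts = [2, 4, 6, 8, 10]
array_counts = numpy.array(counts)     # same values, held in a numpy array

joined = counts + counts               # lists: + concatenates
repeated = counts * 2                  # lists: * repeats the list

summed = array_counts + array_counts   # arrays: + adds element by element
scaled = array_counts * 2              # arrays: * multiplies each element

print(joined, repeated)
print(summed, scaled)
```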
<something>
if some condition is true
The if statement:
if <condition>:
    <executed if condition is True>
num = 37
if num > 100:
    print('greater')
print('done')
LIVE DEMO
if-else statements
An if statement executes code if the condition evaluates as true
What if the condition evaluates as false?
if <condition>:
    <executed if condition is True>
else:
    <executed if condition is not True>
num = 37
if num > 100:
    print('greater')
else:
    print('not greater')
print('done')
LIVE DEMO
if-elif-else
elif (else if)
if <condition1>:
    <executed if condition1 is True>
elif <condition2>:
    <executed if condition2 is True and condition1 is not True>
else:
    <executed if no conditions True>
num = -3
if num > 0:
    print(num, "is positive")
elif num == 0:
    print(num, "is zero")
else:
    print(num, "is negative")
LIVE DEMO
and, or and not
if (1 > 0) and (-1 > 0):
    print('both parts are true')
else:
    print('at least one part is false')
LIVE DEMO
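The demo above only exercises and. As a quick sketch, the same two comparisons can be combined with each of the three operators (the variable names are hypothetical):

```python
left = (1 > 0)       # True
right = (-1 > 0)     # False

both = left and right      # True only if both parts are true
either = left or right     # True if at least one part is true
negated = not right        # flips True to False and False to True

print(both, either, negated)
```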
What is the result of executing the code below?
if 4 > 5:
    print('A')
elif 4 == 5:
    print('B')
elif 4 < 5:
    print('C')
A
B
C
B and C
== (equality) and in (membership)
print(1 == 1)
print(1 == 2)
print('a' in 'toast')
print('b' in 'toast')
print(1 in [1, 2, 3])
print(1 in range(3))
print(1 in range(2, 10))
LIVE DEMO
letters = 'abcdefghijklmnopqrstuvwxyz'
vowels = 'aeiou'
result = [l.upper() for l in letters if l in vowels]
print(result)
LIVE DEMO
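The list comprehension above packs a for loop and an if test into one expression. A sketch of the equivalent explicit loop, for comparison (loop_result and comp_result are hypothetical names):

```python
letters = 'abcdefghijklmnopqrstuvwxyz'
vowels = 'aeiou'

# Explicit loop: test each letter and keep the upper-cased vowels
loop_result = []
for letter in letters:
    if letter in vowels:
        loop_result.append(letter.upper())

# List comprehension: the same filter and transformation in one expression
comp_result = [letter.upper() for letter in letters if letter in vowels]

print(loop_result)
print(comp_result)
```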
os module
The os module allows interaction with the filesystem
import os
%pylab inline
import matplotlib.pyplot
import numpy as np
import os
import seaborn
LIVE DEMO
os.listdir()
The .listdir() function lists the contents of a directory
The result can be filtered with a for loop or list comprehension
List the contents of the 'data' directory:
os.listdir('data')
files = [f for f in os.listdir('data') if f.startswith('inflammation')]
print(files)
os.path.join()
The os.listdir() function only returns filenames, not the path (relative or absolute)
os.path.join() builds a path from directory and filenames, suitable for the underlying OS
print(os.path.join('data', 'inflammation-01.csv'))
LIVE DEMO
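The two os functions are usually combined: list a directory, filter the names, then join each name back onto the directory. A self-contained sketch, assuming a temporary directory with three made-up empty files in place of the real data directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Create three empty, made-up files to stand in for the data directory
    for name in ['inflammation-01.csv', 'inflammation-02.csv', 'small-01.csv']:
        open(os.path.join(tmpdir, name), 'w').close()

    # listdir() returns bare filenames; join() turns each into a usable path
    paths = sorted(os.path.join(tmpdir, f) for f in os.listdir(tmpdir)
                   if f.startswith('inflammation'))
    basenames = [os.path.basename(p) for p in paths]

print(basenames)
```

sorted() is used here because os.listdir() returns names in arbitrary order.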
Now we have all the tools we need to load all the inflammation data files, and visualise the mean, minimum and maximum values in an array of plots.
List the files with os and a list comprehension
Load each file with np.loadtxt()
Summarise with np.mean(), np.max(), etc.
Plot with matplotlib .add_subplot()
# Build the path to every inflammation file in the data directory
filenames = [os.path.join('data', f) for f in os.listdir('data')
             if f.startswith('inflammation')]
for f in filenames:
    print(f)
    data = np.loadtxt(fname=f, delimiter=',')
    # One figure per file, with three plots side by side
    fig = matplotlib.pyplot.figure(figsize=(10.0, 3.0))
    axes1 = fig.add_subplot(1, 3, 1)
    axes2 = fig.add_subplot(1, 3, 2)
    axes3 = fig.add_subplot(1, 3, 3)
    # Daily mean, maximum and minimum across all patients
    axes1.set_ylabel('average')
    axes1.plot(np.mean(data, axis=0))
    axes2.set_ylabel('max')
    axes2.plot(np.max(data, axis=0))
    axes3.set_ylabel('min')
    axes3.plot(np.min(data, axis=0))
    fig.tight_layout()
    matplotlib.pyplot.show()
LIVE DEMO