09 - Programming for `KEGG`¶

Table of Contents¶

Introduction
Python imports
Running a remote KEGG query

Introduction¶

The KEGG browser interface, while able to integrate searches across comprehensive and quite disparate datasets, does not always present the most convenient interface to extract that information (such as downloading FASTA sequences for an entry). As with all browser-based interfaces, it can also be tedious and time-consuming to point-and-click your way through a large number of searches.

This notebook presents examples of using `KEGG` programmatically via the Biopython libraries. You will control the searches using Python code in this notebook.

As with all programmatic searches, there are a number of advantages to an automated approach:

It is easy to set up repeatable searches for many sequences, or collections of sequences
It is easy to read in the search results and conduct downstream analyses that add value to your search

Where it is not practical to submit a large number of queries via a web form (because it is tiresome to point-and-click over and over again), this can be handled programmatically instead. A programmatic approach also gives you the opportunity to set custom options that help refine your query. And if you need to repeat a query, it is trivial to use the same settings every time.

The Biopython interface to KEGG has a further advantage that we will not cover in this lesson: it allows for a much greater range of image manipulations for the pathway maps that KEGG returns.

The `KEGG` interface is not as well documented as some other resources (such as `NCBI` or `Ensembl`), and `KEGG` does not provide any usage guidelines. To avoid risking overloading the service, Biopython restricts us to three calls per second.
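If you script many queries in a loop, one way to be considerate of the service is to pause between requests yourself. The sketch below is a minimal, hypothetical `Throttle` helper (it is not part of Biopython or KEGG). Its default of three calls per second simply mirrors the figure quoted above, and the `clock` and `sleep` parameters are assumptions added only so that the behaviour can be checked without really waiting.

```python
import time


class Throttle:
    """Keep successive calls at least 1 / calls_per_second seconds apart."""

    def __init__(self, calls_per_second=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / calls_per_second
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous call, if any

    def wait(self):
        """Sleep just long enough to respect the minimum interval."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


throttle = Throttle(calls_per_second=3.0)
for query in ["kegg", "pathway", "ksk"]:
    throttle.wait()  # a real loop would send its KEGG request after this line
    print("ready to send query:", query)
```

Calling `throttle.wait()` immediately before each request spaces the requests out, however quickly the rest of the loop runs.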

Be warned also that the conditions of service include:

"This service should not be used for bulk data downloads".

Python imports¶

# Show plots as part of the notebook
%matplotlib inline

# Show images and HTML inline (HTML is needed by the PDF() helper below)
from IPython.display import HTML, Image

# Standard library packages
import io
import os

# Import Biopython modules to interact with KEGG
from Bio import SeqIO
from Bio.KEGG import REST
from Bio.KEGG.KGML import KGML_parser
from Bio.Graphics.KGML_vis import KGMLCanvas

# Import Pandas, so we can use dataframes
import pandas as pd

Python functions¶

In the cell below, we define a couple of useful functions that convert some returned output into Pandas dataframe form, and display .pdf images directly in the notebook.

You do not need to understand these to follow the lesson.

# A bit of code that will help us display the PDF output
def PDF(filename):
    return HTML('<iframe src=%s width=700 height=350></iframe>' % filename)

# Some code to return a Pandas dataframe, given tabular text
def to_df(result):
    return pd.read_table(io.StringIO(result), header=None)

Running a remote `KEGG` query¶

There is typically only a single step involved in obtaining a result from `KEGG` with Biopython:

run one of the functions provided by Bio.KEGG.REST, and catch the result in a variable.

The available functions are:

kegg_conv() - convert identifiers from KEGG to those for other databases
kegg_find() - find KEGG entries with matching query data
kegg_get() - retrieve data for a specific entry from KEGG
kegg_info() - get information about a KEGG database
kegg_link() - find entries in KEGG using a database cross-reference
kegg_list() - list entries in a database

The generic form of using these functions to get information from KEGG and place the output in the variable myvar is:

myvar = REST.<function>(<query>, <arg1>, <arg2>, ...).read()

where <function> is one of the functions above, <query> is a string containing your query for KEGG, and <arg1>, <arg2> and so on are arguments that may be required for some of the functions.

You will use some of these functions in the notebook cells below to get information from KEGG.
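It can help to know that each of these functions corresponds to one operation of the KEGG REST service, whose URLs take the general form `<base>/<operation>/<argument>/...`. The sketch below builds such a URL by hand, purely to illustrate that mapping. `kegg_url` is a hypothetical helper, and the base address is an assumption about where the service currently lives; in practice Biopython constructs and fetches the URL for you.

```python
KEGG_BASE = "https://rest.kegg.jp"  # assumed base address of the KEGG REST service

# Operations matching the kegg_*() functions listed above
OPERATIONS = {"conv", "find", "get", "info", "link", "list"}


def kegg_url(operation, *arguments):
    """Return the REST URL for an operation and its arguments."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown KEGG operation: {operation!r}")
    return "/".join([KEGG_BASE, operation, *arguments])


print(kegg_url("info", "kegg"))
print(kegg_url("list", "pathway", "ksk"))
```

Pasting one of the printed URLs into a browser should show the same plain text that the corresponding `REST` function returns.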

`kegg_info()`¶

This function returns basic information about a specified `KEGG` database - much like visiting the landing page for that database.

For instance, to get information about the KEGG databases as a whole, you can call kegg_info("kegg") to get a handle from KEGG describing the databases, then .read() the handle and catch the resulting text in a variable:

result = REST.kegg_info("kegg").read()

We could convert this text to a Pandas dataframe with the to_df() function defined above:

to_df(result)

(Note that not all data is suited to `pandas` dataframe representation.)

Alternatively, we can print the text to output directly with the print() function:

print(result)

# Perform the query
result = REST.kegg_info("kegg").read()

# Print the result
print(result)

# Convert result to dataframe
# NOTE: kegg_info() requests do not produce a suitable data format for dataframe representation
#to_df(result)

kegg             Kyoto Encyclopedia of Genes and Genomes
kegg             Release 85.0+/03-04, Mar 18
                 Kanehisa Laboratories
                 pathway     567,969 entries
                 brite       203,224 entries
                 module      461,504 entries
                 orthology    21,840 entries
                 genome        5,616 entries
                 genes     25,801,250 entries
                 compound     18,257 entries
                 glycan       11,015 entries
                 reaction     10,828 entries
                 rclass        3,108 entries
                 enzyme        7,146 entries
                 disease       2,035 entries
                 drug         10,486 entries
                 dgroup        2,048 entries
                 environ         856 entries
                 network         296 entries
                 variant         123 entries

This gives us a similar overview of the available resources to the KEGG landing page. However, the kegg_info() function is a little more powerful, as it can find information about specific databases:

# Print information about the PATHWAY database
result = REST.kegg_info("pathway").read()
print(result)

pathway          KEGG Pathway Database
path             Release 85.0+/03-04, Mar 18
                 Kanehisa Laboratories
                 567,969 entries

linked db        module
                 ko
                 genome
                 <org>
                 compound
                 glycan
                 reaction
                 rclass
                 enzyme
                 network
                 disease
                 drug
                 pubmed

and even about specific organisms (identified with their three-letter code):

# Print information about Kitasatospora setae
result = REST.kegg_info("ksk").read()
print(result)

T01648           Kitasatospora setae KEGG Genes Database
ksk              Release 85.0+/03-04, Mar 18
                 Kanehisa Laboratories
                 7,673 entries

linked db        pathway
                 brite
                 module
                 ko
                 genome
                 enzyme
                 ncbi-proteinid
                 uniprot
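Because kegg_info() returns free-form text rather than a table, extracting values from it takes a little string handling. The sketch below is one possible approach, not a Biopython feature: `parse_info` is a hypothetical helper, and it assumes the layout seen in the database-specific outputs above (an "N entries" line, followed by a "linked db" block). The sample text is copied from the K. setae output.

```python
import re

SAMPLE = """\
T01648           Kitasatospora setae KEGG Genes Database
ksk              Release 85.0+/03-04, Mar 18
                 Kanehisa Laboratories
                 7,673 entries

linked db        pathway
                 brite
                 module
                 ko
                 genome
                 enzyme
                 ncbi-proteinid
                 uniprot
"""


def parse_info(text):
    """Return (entry_count, linked_databases) from kegg_info() text."""
    count = None
    linked = []
    in_linked = False
    for line in text.splitlines():
        if line.startswith("linked db"):
            # Everything from here on names a linked database
            in_linked = True
            line = line[len("linked db"):]
        stripped = line.strip()
        if not stripped:
            continue
        if in_linked:
            linked.append(stripped)
        else:
            match = re.fullmatch(r"([\d,]+) entries", stripped)
            if match:
                count = int(match.group(1).replace(",", ""))
    return count, linked


count, linked = parse_info(SAMPLE)
print(count)
print(linked)
```

In the notebook you would pass `result` from `REST.kegg_info("ksk").read()` instead of `SAMPLE`.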

`kegg_list()`¶

The `kegg_list()` function returns a table of entry identifiers and definitions for a specified database.

For example, to list all the entries in the PATHWAY database, you could use:

# Get all entries in the PATHWAY database as a dataframe
result = REST.kegg_list("pathway").read()
to_df(result)

and to restrict the results only to those pathways that are present in K. setae, you can filter the database results with a query string ksk, as the second argument:

# Get all entries in the PATHWAY database for K. setae as a dataframe
result = REST.kegg_list("pathway", "ksk").read()
to_df(result)

QUESTIONS

How many entries are in the complete PATHWAY database?
How many entries in the PATHWAY database are also present in K. setae?
Are these the same answers you got in `lesson 08`?

If, instead of specifying one of the top-level KEGG databases, you specify an organism code, KEGG will return a list of gene entries for that organism:

# Get all genes from K. setae as a dataframe
result = REST.kegg_list("ksk").read()
to_df(result)
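The text returned by kegg_list() has one entry per line, with the identifier and the description separated by a tab. If you want to count or filter entries without a dataframe, a sketch like the one below may be enough. `parse_listing` is a hypothetical helper, and `SAMPLE` is a hand-written stand-in in the same two-column layout (using pathway names that appear elsewhere in this notebook), not actual KEGG output; the exact identifier format returned by the service may differ.

```python
# Hand-written sample in the tab-separated layout; not real kegg_list() output
SAMPLE = (
    "map00061\tFatty acid biosynthesis\n"
    "map00270\tCysteine and methionine metabolism\n"
    "map00480\tGlutathione metabolism\n"
)


def parse_listing(text):
    """Return a list of (identifier, description) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. a trailing newline
        identifier, _, description = line.partition("\t")
        rows.append((identifier, description))
    return rows


rows = parse_listing(SAMPLE)
print(len(rows), "entries")
print([ident for ident, desc in rows if "metabolism" in desc])
```

Applied to the real `result` strings above, `len(parse_listing(result))` is one way to approach the counting questions.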

`kegg_find()`¶

The `kegg_find()` function will search a named `KEGG` database with a specified query term.

For instance, to query the GENES database with the entry accession KSE_17560 you could use:

# Find a specific entry with a precise search term
result = REST.kegg_find("genes", "KSE_17560").read()
to_df(result)

With the query above, KEGG returns information for the exact entry we've requested. But we can also use less precise search terms, and combine them with the + symbol. For example, to search for shiga toxin we would use the query:

"shiga+toxin"

# Find all shiga toxin genes
result = REST.kegg_find("genes", "shiga+toxin").read()
to_df(result)

We can restrict this search to specific organisms, such as Escherichia coli O111 H-11128 (EHEC), by supplying its three-letter code (here, eoi) as the database to be searched:

# Find all shiga toxin genes in eoi
result = REST.kegg_find("eoi", "shiga+toxin").read()
to_df(result)

The kegg_find() query string can also search in specific fields of the entry. The format for this is:

"<query_value>/<field>"

So, to search for all compounds with a molecular weight between 300 and 310 mass units, you can use the code:

# Find all compounds with mass between 300 and 310 units
result = REST.kegg_find("compound", "300-310/mol_weight").read()
to_df(result)
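The two query conventions described above (joining words with + and appending /<field>) are easy to get slightly wrong when typing queries by hand. As a small illustration, the hypothetical helper `find_query` below assembles the string for you. It only builds the text; it does not contact KEGG.

```python
def find_query(phrase, field=None):
    """Build a kegg_find() query string from a phrase and optional field."""
    query = "+".join(phrase.split())  # "shiga toxin" -> "shiga+toxin"
    if field is not None:
        query = f"{query}/{field}"  # "<query_value>/<field>"
    return query


print(find_query("shiga toxin"))
print(find_query("300-310", field="mol_weight"))
```

The returned strings can be passed as the second argument to `REST.kegg_find()`, exactly as the literal strings are in the cells above.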

`kegg_get()`¶

Most functions you've seen so far will return two columns of data: the first column being the entry accession, and the second column being a description of that entry, or the requested value.

The `kegg_get()` function lets us retrieve specific entries from `KEGG` - such as our search results - in named formats.

For example, the first compound in our search for molecular weights in the range 300-310 above has entry accession cpd:C00051. We can recover this entry as follows:

# Get the entry information for cpd:C00051
result = REST.kegg_get("cpd:C00051").read()
print(result)

ENTRY       C00051                      Compound
NAME        Glutathione;
            5-L-Glutamyl-L-cysteinylglycine;
            N-(N-gamma-L-Glutamyl-L-cysteinyl)glycine;
            gamma-L-Glutamyl-L-cysteinyl-glycine;
            GSH;
            Reduced glutathione
FORMULA     C10H17N3O6S
EXACT_MASS  307.0838
MOL_WEIGHT  307.3235
REMARK      Same as: D00014
REACTION    R00094 R00115 R00120 R00274 R00494 R00497 R00499 R00527 
            R00547 R00900 R01108 R01109 R01110 R01111 R01113 R01262 
            R01292 R01736 R01875 R01917 R01918 R02530 R02824 R03059 
            R03082 R03167 R03522 R03822 R03915 R03956 R03984 R04039 
            R04090 R04860 R05267 R05269 R05402 R05403 R05714 R05717 
            R05748 R06982 R07002 R07003 R07004 R07023 R07024 R07025 
            R07026 R07034 R07035 R07069 R07070 R07083 R07084 R07091 
            R07092 R07093 R07094 R07100 R07113 R07116 R07124 R08280 
            R08350 R08351 R08352 R08353 R08354 R08355 R08511 R08512 
            R08678 R09338 R09367 R09368 R09409 R11411 R11650 R11652 
            R11659 R11734 R11736 R11737 R11739 R11861 R11905 R11929 
            R11947
PATHWAY     map00270  Cysteine and methionine metabolism
            map00480  Glutathione metabolism
            map01100  Metabolic pathways
            map02010  ABC transporters
            map04216  Ferroptosis
            map04918  Thyroid hormone synthesis
            map04976  Bile secretion
MODULE      M00118  Glutathione biosynthesis, glutamate => glutathione
ENZYME      1.5.4.1         1.8.1.7         1.8.1.9         1.8.1.10        
            1.8.3.3         1.8.4.1         1.8.4.2         1.8.4.3         
            1.8.4.4         1.8.4.7         1.8.4.9         1.8.5.1         
            1.8.5.7         1.8.5.8         1.11.1.9        1.11.1.12       
            1.13.11.18      1.14.14.43      1.14.14.45      1.20.4.2        
            2.3.2.2         2.3.2.15        2.5.1.18        2.5.1.-         
            2.8.1.3         3.1.2.6         3.1.2.7         3.1.2.12        
            3.1.2.13        3.4.19.13       3.5.1.78        3.5.1.-         
            4.3.2.7         4.4.1.5         4.4.1.20        4.4.1.22        
            4.4.1.34        6.3.1.8         6.3.1.9         6.3.2.3
BRITE       Compounds with biological roles [BR:br08001]
             Vitamins and Cofactors
              Cofactors
               Coenzymes
                C00051  Glutathione
            Anatomical Therapeutic Chemical (ATC) classification [BR:br08303]
             V VARIOUS
              V03 ALL OTHER THERAPEUTIC PRODUCTS
               V03A ALL OTHER THERAPEUTIC PRODUCTS
                V03AB Antidotes
                 V03AB32 Glutathione
                  D00014  Glutathione (JP17)
            Therapeutic category of drugs in Japan [BR:br08301]
             1  Agents affecting nervous system and sensory organs
              13  Agents affecting sensory organs
               131  Ophthalmic agents
                1319  Others
                 D00014  Glutathione (JP17)
             3  Agents affecting metabolism
              39  Other agents affecting metabolism
               392  Antidotes
                3922  Glutathiones
                 D00014  Glutathione (JP17)
            Drugs listed in the Japanese Pharmacopoeia [BR:br08311]
             Chemicals
              D00014  Glutathione
            Drug classes of therapeutic agents [br08360.html]
             Endocrine and hormonal agents
              D00014
            Animal drugs in Japan [BR:br08331]
             96  Agents affecting metabolism
              967  Agents for liver disease and antidotes
               9676  Other amino acid and preparations
                C00051  Glutathione
DBLINKS     CAS: 70-18-8
            PubChem: 3353
            ChEBI: 16856
            ChEMBL: CHEMBL1514919 CHEMBL1543
            KNApSAcK: C00001518
            PDB-CCD: GSH
            3DMET: B01138
            NIKKAJI: J10.686K
ATOM        20
            1   O6a O    24.0100  -16.3100
            2   C6a C    25.2000  -15.6100
            3   C1c C    26.4600  -16.3100
            4   C1b C    27.6500  -15.6100
            5   C1b C    28.8400  -16.3100
            6   C5a C    30.1000  -15.6100
            7   N1b N    31.2900  -16.3100
            8   C1c C    32.4800  -15.6100
            9   C5a C    33.7400  -16.3100
            10  N1b N    34.9300  -15.6100
            11  C1b C    36.1200  -16.3100
            12  C6a C    37.3800  -15.6100
            13  O6a O    38.5700  -16.3100
            14  O6a O    25.2000  -14.2100
            15  N1a N    26.4600  -17.7100
            16  O5a O    30.1000  -14.2100
            17  C1b C    32.4800  -14.2100
            18  S1a S    33.6700  -13.5100
            19  O5a O    33.7400  -17.7100
            20  O6a O    37.3800  -14.2100
BOND        19
            1     1   2 1
            2     2   3 1
            3     3   4 1
            4     4   5 1
            5     5   6 1
            6     6   7 1
            7     7   8 1
            8     8   9 1
            9     9  10 1
            10   10  11 1
            11   11  12 1
            12   12  13 1
            13    2  14 2
            14    3  15 1 #Down
            15    6  16 2
            16    8  17 1 #Up
            17   17  18 1
            18    9  19 2
            19   12  20 2
///

QUESTIONS

What information is returned in the default result?
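The default kegg_get() result is a flat-file record: field names occupy the first 12 columns, values start at column 13, continuation lines leave the field name blank, and `///` ends the record. The sketch below collects each field's lines into a dictionary. `parse_flat_file` is a hypothetical helper written for this layout, and `SAMPLE` is a shortened copy of the glutathione entry above. One limitation: fields that repeat within a record, or sub-fields indented beneath another field, are treated as ordinary keys, so this is only suitable for simple lookups.

```python
KEY_WIDTH = 12  # field names occupy the first 12 columns

# Shortened copy of the cpd:C00051 record shown above
SAMPLE = "\n".join([
    "ENTRY".ljust(KEY_WIDTH) + "C00051                      Compound",
    "NAME".ljust(KEY_WIDTH) + "Glutathione;",
    " " * KEY_WIDTH + "GSH;",
    " " * KEY_WIDTH + "Reduced glutathione",
    "FORMULA".ljust(KEY_WIDTH) + "C10H17N3O6S",
    "MOL_WEIGHT".ljust(KEY_WIDTH) + "307.3235",
    "///",
])


def parse_flat_file(text):
    """Return a dict mapping each field name to its list of value lines."""
    fields = {}
    current = None
    for line in text.splitlines():
        if line.startswith("///"):
            break  # end of record
        key = line[:KEY_WIDTH].strip()
        value = line[KEY_WIDTH:].strip()
        if key:
            current = key  # a new field starts here
        if current is None or not value:
            continue
        fields.setdefault(current, []).append(value)
    return fields


record = parse_flat_file(SAMPLE)
print(record["FORMULA"][0])
print(record["NAME"])
```

With the full `result` string from the cell above, `parse_flat_file(result).keys()` lists the fields in the default record.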

KEGG provides a number of different entry types, which cannot all be recovered in exactly the same ways. For instance, the COMPOUND entries typically have an associated molecular structure image, which can be recovered with kegg_get() by specifying the format to be "image":

# Display molecular structure for cpd:C00051
result = REST.kegg_get("cpd:C00051", "image").read()
Image(result)

GENE entries are sequences, so can be recovered as their database entries (default), or as FASTA format nucleotide and/or protein sequences:

# Get entry information for KSE_17560
result = REST.kegg_get("ksk:KSE_17560").read()
print(result)

ENTRY       KSE_17560         CDS       T01648
NAME        dxs1
DEFINITION  (GenBank) putative 1-deoxy-D-xylulose-5-phosphate synthase
ORTHOLOGY   K01662  1-deoxy-D-xylulose-5-phosphate synthase [EC:2.2.1.7]
ORGANISM    ksk  Kitasatospora setae
PATHWAY     ksk00730  Thiamine metabolism
            ksk00900  Terpenoid backbone biosynthesis
            ksk01100  Metabolic pathways
            ksk01110  Biosynthesis of secondary metabolites
            ksk01130  Biosynthesis of antibiotics
MODULE      ksk_M00096  C5 isoprenoid biosynthesis, non-mevalonate pathway
BRITE       KEGG Orthology (KO) [BR:ksk00001]
             Metabolism
              Metabolism of cofactors and vitamins
               00730 Thiamine metabolism
                KSE_17560 (dxs1)
              Metabolism of terpenoids and polyketides
               00900 Terpenoid backbone biosynthesis
                KSE_17560 (dxs1)
            Enzymes [BR:ksk01000]
             2. Transferases
              2.2  Transferring aldehyde or ketonic groups
               2.2.1  Transketolases and transaldolases
                2.2.1.7  1-deoxy-D-xylulose-5-phosphate synthase
                 KSE_17560 (dxs1)
POSITION    complement(1952373..1954298)
MOTIF       Pfam: DXP_synthase_N Transket_pyr Transketolase_C TPP_enzyme_C E1_dh Transketolase_N DUF4054 PFOR_II
DBLINKS     NCBI-ProteinID: BAJ27580
            NITE: KSE_17560
            UniProt: E4N8P9
AASEQ       641
            MPLLSQITGPADLRRLHPEQLPLLADEIRDFLIDAVTRTGGHLGPNLGVVELSIALHRVF
            DSPRDRVLWDTGHQAYVHKLLTGRQDFSRLRAKDGLSGYPSRAESEHDLIENSHASTALG
            YADGIAKANQLLGADRHTVAVIGDGALTGGMAWEALNNIAEAEDRPLVIVVNDNERSYAP
            TIGGLAHHLATLRTTRGYERFLAWGKDALQRTPVVGPPLFDALHGAKKGFKDAFAPQGMF
            EDLGLKYLGPIDGHDIAAVEQALRQARNFGGPVIVHCLTVKGRGYRPAEQDEADRFHAVG
            PIDPYTCLPISPSAGASWTSVFSQEMLALGAERPDLVAVTAAMLHPVGLGPFAAAHPGRT
            YDVGIAEQHAVASAAGLATGGLHPVVAVYATFLNRAFDQVLMDVALHKLGVTFVLDRAGV
            TGNDGASHNGMWDMSILQVVPGLRLAAPRDADRLREQLREAVAVEDAPTVVRFPKGDLGP
            EIPAVERIGGVDVLARTGPSPDVLLVAVGSMAPACLDAAALLAAEGITATVVDPRWVKPV
            DPALVALAAAHRMVVTVEDNGRAGGVGAAVAQAMRDAEVDTPLRDLGVPQEFLAHASRGE
            ILEEIGLTGTGVAAQTAAYARRLLPGTRSGAQEYRPRVPRK
NTSEQ       1926
            atgccactgctgagccagatcaccgggcccgccgacctcagacgactgcaccccgagcag
            ctgccgctgctcgccgacgagatccgcgacttcctgatcgacgccgtcacccgcaccggc
            ggccacctcggccccaacctcggcgtggtcgagctcagcatcgccctacaccgggtcttc
            gactccccgcgcgaccgcgtcctgtgggacaccggccaccaggcctacgtgcacaagctg
            ctcaccggccggcaggacttcagccggctgcgcgccaaggacggcctctccggctacccc
            tcgcgcgccgagtccgaacacgacctgatcgagaactcgcacgcctccaccgcgctcggc
            tacgccgacggcatcgccaaggccaaccaactgctcggcgccgaccggcacaccgtcgcc
            gtgatcggcgacggcgcgctcaccggcggcatggcctgggaggcgctcaacaacatcgcc
            gaggccgaggaccgcccgctggtcatcgtcgtcaacgacaacgagcgctcctacgcgccc
            accatcggcggcctcgcccaccacctcgccaccctgcgcaccacccgcggctacgagcgc
            ttcctcgcctggggcaaggacgccctgcagcgcacccccgtggtcgggccgccgctgttc
            gacgcgctgcacggcgccaagaagggcttcaaggacgccttcgccccgcagggcatgttc
            gaggacctcggtctgaagtacctcggcccgatcgacggccacgacatcgccgccgtcgaa
            caggcgctgcgccaggcccggaacttcggcgggcccgtcatcgtgcactgcctgaccgtc
            aagggccgcggctaccggcccgccgagcaggacgaggccgaccgcttccacgccgtcggc
            ccgatcgacccgtacacctgcctgccgatctcgccgtccgccggggcctcctggacttcg
            gtgttcagccaggagatgctcgccctcggcgccgagcggcccgacctggtcgccgtcacc
            gccgcgatgctgcaccccgtcgggctcggcccgttcgccgccgcgcaccccgggcggacc
            tacgacgtcgggatcgccgagcagcacgccgtcgcctccgccgccggcctggccaccggg
            gggctgcaccccgtcgtcgcggtgtacgcgaccttcctgaaccgggccttcgaccaggtg
            ctgatggacgtcgcgctgcacaagctgggcgtcaccttcgtgctcgaccgggccggggtc
            accggcaacgacggggcctcgcacaacggcatgtgggacatgtcgatcctgcaggtcgtg
            cccgggctgcggctggccgcgccgcgcgacgccgaccggctgcgcgaacagctccgggag
            gccgtcgcggtcgaggacgcgcccaccgtggtgcgcttccccaagggcgacctcggcccc
            gagatcccggcggtcgagcggatcggcggcgtcgacgtgctggcccgcaccggccccagc
            cccgacgtgctgctggtcgccgtcggctcgatggcccccgcctgcctggacgccgccgcg
            ctgctcgccgccgagggcatcaccgccaccgtcgtcgacccgcgctgggtcaagcccgtc
            gaccccgccctcgtcgcgctggccgccgcgcaccggatggtggtcaccgtcgaggacaac
            gggcgggccggcggcgtcggcgccgccgtcgcccaggcgatgcgggacgccgaggtcgac
            accccgctgcgcgacctcggcgtcccgcaggagttcctggcgcacgcctcgcgcggtgag
            atcctggaggagatcggactcaccggcaccggcgtcgccgcccagaccgccgcctacgcc
            cgccgcctgctgcccggcacccggagcggcgcccaggagtaccggccccgggtgccgcgc
            aagtag
///

# Get coding sequence for KSE_17560
result = REST.kegg_get("ksk:KSE_17560", "ntseq").read()
print(result)

>ksk:KSE_17560 K01662 1-deoxy-D-xylulose-5-phosphate synthase [EC:2.2.1.7] | (GenBank) dxs1; putative 1-deoxy-D-xylulose-5-phosphate synthase (N)
atgccactgctgagccagatcaccgggcccgccgacctcagacgactgcaccccgagcag
ctgccgctgctcgccgacgagatccgcgacttcctgatcgacgccgtcacccgcaccggc
ggccacctcggccccaacctcggcgtggtcgagctcagcatcgccctacaccgggtcttc
gactccccgcgcgaccgcgtcctgtgggacaccggccaccaggcctacgtgcacaagctg
ctcaccggccggcaggacttcagccggctgcgcgccaaggacggcctctccggctacccc
tcgcgcgccgagtccgaacacgacctgatcgagaactcgcacgcctccaccgcgctcggc
tacgccgacggcatcgccaaggccaaccaactgctcggcgccgaccggcacaccgtcgcc
gtgatcggcgacggcgcgctcaccggcggcatggcctgggaggcgctcaacaacatcgcc
gaggccgaggaccgcccgctggtcatcgtcgtcaacgacaacgagcgctcctacgcgccc
accatcggcggcctcgcccaccacctcgccaccctgcgcaccacccgcggctacgagcgc
ttcctcgcctggggcaaggacgccctgcagcgcacccccgtggtcgggccgccgctgttc
gacgcgctgcacggcgccaagaagggcttcaaggacgccttcgccccgcagggcatgttc
gaggacctcggtctgaagtacctcggcccgatcgacggccacgacatcgccgccgtcgaa
caggcgctgcgccaggcccggaacttcggcgggcccgtcatcgtgcactgcctgaccgtc
aagggccgcggctaccggcccgccgagcaggacgaggccgaccgcttccacgccgtcggc
ccgatcgacccgtacacctgcctgccgatctcgccgtccgccggggcctcctggacttcg
gtgttcagccaggagatgctcgccctcggcgccgagcggcccgacctggtcgccgtcacc
gccgcgatgctgcaccccgtcgggctcggcccgttcgccgccgcgcaccccgggcggacc
tacgacgtcgggatcgccgagcagcacgccgtcgcctccgccgccggcctggccaccggg
gggctgcaccccgtcgtcgcggtgtacgcgaccttcctgaaccgggccttcgaccaggtg
ctgatggacgtcgcgctgcacaagctgggcgtcaccttcgtgctcgaccgggccggggtc
accggcaacgacggggcctcgcacaacggcatgtgggacatgtcgatcctgcaggtcgtg
cccgggctgcggctggccgcgccgcgcgacgccgaccggctgcgcgaacagctccgggag
gccgtcgcggtcgaggacgcgcccaccgtggtgcgcttccccaagggcgacctcggcccc
gagatcccggcggtcgagcggatcggcggcgtcgacgtgctggcccgcaccggccccagc
cccgacgtgctgctggtcgccgtcggctcgatggcccccgcctgcctggacgccgccgcg
ctgctcgccgccgagggcatcaccgccaccgtcgtcgacccgcgctgggtcaagcccgtc
gaccccgccctcgtcgcgctggccgccgcgcaccggatggtggtcaccgtcgaggacaac
gggcgggccggcggcgtcggcgccgccgtcgcccaggcgatgcgggacgccgaggtcgac
accccgctgcgcgacctcggcgtcccgcaggagttcctggcgcacgcctcgcgcggtgag
atcctggaggagatcggactcaccggcaccggcgtcgccgcccagaccgccgcctacgcc
cgccgcctgctgcccggcacccggagcggcgcccaggagtaccggccccgggtgccgcgc
aagtag

# Get protein sequence for KSE_17560
result = REST.kegg_get("ksk:KSE_17560", "aaseq").read()
print(result)

>ksk:KSE_17560 K01662 1-deoxy-D-xylulose-5-phosphate synthase [EC:2.2.1.7] | (GenBank) dxs1; putative 1-deoxy-D-xylulose-5-phosphate synthase (A)
MPLLSQITGPADLRRLHPEQLPLLADEIRDFLIDAVTRTGGHLGPNLGVVELSIALHRVF
DSPRDRVLWDTGHQAYVHKLLTGRQDFSRLRAKDGLSGYPSRAESEHDLIENSHASTALG
YADGIAKANQLLGADRHTVAVIGDGALTGGMAWEALNNIAEAEDRPLVIVVNDNERSYAP
TIGGLAHHLATLRTTRGYERFLAWGKDALQRTPVVGPPLFDALHGAKKGFKDAFAPQGMF
EDLGLKYLGPIDGHDIAAVEQALRQARNFGGPVIVHCLTVKGRGYRPAEQDEADRFHAVG
PIDPYTCLPISPSAGASWTSVFSQEMLALGAERPDLVAVTAAMLHPVGLGPFAAAHPGRT
YDVGIAEQHAVASAAGLATGGLHPVVAVYATFLNRAFDQVLMDVALHKLGVTFVLDRAGV
TGNDGASHNGMWDMSILQVVPGLRLAAPRDADRLREQLREAVAVEDAPTVVRFPKGDLGP
EIPAVERIGGVDVLARTGPSPDVLLVAVGSMAPACLDAAALLAAEGITATVVDPRWVKPV
DPALVALAAAHRMVVTVEDNGRAGGVGAAVAQAMRDAEVDTPLRDLGVPQEFLAHASRGE
ILEEIGLTGTGVAAQTAAYARRLLPGTRSGAQEYRPRVPRK
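The FASTA text returned for "ntseq" and "aaseq" is straightforward to split into a header and a sequence. The sketch below does this with plain string handling (Biopython's SeqIO module could be used instead). `parse_fasta` and `cds_matches_protein` are hypothetical helpers, and `SAMPLE` is a deliberately shortened record with an abbreviated header, not the full output. The length check assumes a coding sequence that includes its stop codon, which fits the lengths reported in the entry above (NTSEQ 1926, AASEQ 641).

```python
# Shortened record: abbreviated header and only the first 24 bases
SAMPLE = ">ksk:KSE_17560 (shortened for illustration)\natgccactgctg\nagccagatcacc\n"


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("text does not look like a FASTA record")
    return lines[0][1:], "".join(lines[1:])


def cds_matches_protein(nt_length, aa_length):
    """True if the CDS is three bases per residue plus one stop codon."""
    return nt_length == 3 * (aa_length + 1)


header, sequence = parse_fasta(SAMPLE)
print(header.split()[0], len(sequence))
print(cds_matches_protein(1926, 641))
```

In the notebook, the two lengths could come from `len(parse_fasta(result)[1])` for the "ntseq" and "aaseq" results respectively.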

Retrieving pathways¶

`KEGG` is practically synonymous with its excellent pathway diagrams, and it should be no surprise that you can retrieve these using Python, too. You can get these images directly with `kegg_get()`, using the `"image"` format.

To specify one of the generic pathway maps, you can combine the map prefix with the pathway number to make the query mapNNNNN, as in the cells below.

# Get map of fatty-acid biosynthesis
result = REST.kegg_get("map00061", "image").read()
Image(result)

# Get map of central metabolism
result = REST.kegg_get("map01100", "image").read()
Image(result)

If you want to retrieve the pathway map corresponding to a particular organism, then you can replace the prefix map with the three-letter code for that organism, as in the examples below for Kitasatospora where map is replaced with ksk:

# Get map of fatty-acid biosynthesis in Kitasatospora
result = REST.kegg_get("ksk00061", "image").read()
Image(result)

# Get map of central metabolism in Kitasatospora
result = REST.kegg_get("ksk01100", "image").read()
Image(result)
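Since the organism-specific identifier is just the generic one with its prefix swapped, the conversion can be automated when you want the same pathway for several organisms. The sketch below shows one way to do this. `organism_pathway` is a hypothetical helper, and it assumes generic identifiers of the form map followed by five digits, as in the examples above.

```python
import re


def organism_pathway(map_id, organism):
    """Swap the 'map' prefix of a generic pathway ID for an organism code."""
    match = re.fullmatch(r"map(\d{5})", map_id)
    if match is None:
        raise ValueError(f"not a generic pathway identifier: {map_id!r}")
    return organism + match.group(1)


for map_id in ["map00061", "map01100"]:
    print(map_id, "->", organism_pathway(map_id, "ksk"))
```

Each converted identifier can then be passed to `REST.kegg_get()` with the "image" format, as in the cells above.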

KEGG provides copious information about pathways in the accompanying database entries, which can be obtained by not providing a download format:

# Get data for fatty-acid biosynthesis in Kitasatospora
result = REST.kegg_get("ksk00061").read()
print(result)

ENTRY       ksk00061                    Pathway
NAME        Fatty acid biosynthesis - Kitasatospora setae
CLASS       Metabolism; Lipid metabolism
PATHWAY_MAP ksk00061  Fatty acid biosynthesis
MODULE      ksk_M00082  Fatty acid biosynthesis, initiation [PATH:ksk00061]
            ksk_M00083  Fatty acid biosynthesis, elongation [PATH:ksk00061]
ORGANISM    Kitasatospora setae [GN:ksk]
GENE        KSE_65020  putative acyl-CoA carboxylase [KO:K01962 K01963] [EC:2.1.3.15 6.4.1.2 2.1.3.15 6.4.1.2]
            KSE_72490  putative acyl-CoA carboxylase [KO:K01962 K01963] [EC:2.1.3.15 6.4.1.2 2.1.3.15 6.4.1.2]
            KSE_72500  putative acyl-CoA carboxylase [KO:K02160]
            KSE_72510  putative acyl-CoA carboxylase [KO:K01961] [EC:6.3.4.14 6.4.1.2]
            KSE_26830  accA; putative acetyl-CoA carboxylase biotin carboxylase [KO:K11263] [EC:6.3.4.14 6.4.1.3 6.4.1.2]
            KSE_29850  putative propionyl-CoA carboxylase alpha subunit [KO:K11263] [EC:6.3.4.14 6.4.1.3 6.4.1.2]
            KSE_24970  fabD1; putative malonyl-CoA--acyl carrier protein transacylase [KO:K00645] [EC:2.3.1.39]
            KSE_42300  fabD2; putative malonyl-CoA--acyl carrier protein transacylase [KO:K00645] [EC:2.3.1.39]
            KSE_73570  bfmI; malonyl transferase [KO:K00645] [EC:2.3.1.39]
            KSE_27260  hypothetical protein [KO:K00648] [EC:2.3.1.180]
            KSE_27270  hypothetical protein [KO:K00648] [EC:2.3.1.180]
            KSE_33040  fabH3; putative 3-oxoacyl-[acyl-carrier-protein] synthase III [KO:K00648] [EC:2.3.1.180]
            KSE_65900  hypothetical protein [KO:K00648] [EC:2.3.1.180]
            KSE_24980  fabH1; putative 3-oxoacyl-[acyl-carrier-protein] synthase III [KO:K00648] [EC:2.3.1.180]
            KSE_65240  hypothetical protein [KO:K00648] [EC:2.3.1.180]
            KSE_65510  fabH5; putative 3-oxoacyl-[acyl-carrier-protein] synthase III [KO:K00648] [EC:2.3.1.180]
            KSE_65850  hypothetical protein [KO:K00648] [EC:2.3.1.180]
            KSE_67490  hypothetical protein [KO:K00648] [EC:2.3.1.180]
            KSE_73340  fabH8; putative 3-oxoacyl-[acyl-carrier-protein] synthase III [KO:K00648] [EC:2.3.1.180]
            KSE_71440  fabH6; putative 3-oxoacyl-[acyl-carrier-protein] synthase III [KO:K00648] [EC:2.3.1.180]
            KSE_27340  fabH2; putative 3-oxoacyl-[acyl-carrier-protein] synthase III [KO:K00648] [EC:2.3.1.180]
            KSE_42420  putative 3-oxoacyl-[acyl-carrier-protein] synthase [KO:K09458] [EC:2.3.1.179]
            KSE_15050  fabF1; putative 3-oxoacyl-[acyl-carrier-protein] synthase II [KO:K09458] [EC:2.3.1.179]
            KSE_65980  putative 3-oxoacyl-[acyl-carrier-protein] synthase [KO:K09458] [EC:2.3.1.179]
            KSE_25000  fabF2; putative 3-oxoacyl-[acyl-carrier-protein] synthase II [KO:K09458] [EC:2.3.1.179]
            KSE_67350  fabF4; putative 3-oxoacyl-[acyl-carrier-protein] synthase II [KO:K09458] [EC:2.3.1.179]
            KSE_65990  putative 3-oxoacyl-[acyl-carrier-protein] synthase [KO:K09458] [EC:2.3.1.179]
            KSE_59460  fabF7; putative 3-oxoacyl-[acyl-carrier-protein] synthase II [KO:K09458] [EC:2.3.1.179]
            KSE_65090  fabF3; putative 3-oxoacyl-[acyl-carrier-protein] synthase II [KO:K09458] [EC:2.3.1.179]
            KSE_65120  putative 3-oxoacyl-[acyl-carrier-protein] synthase [KO:K09458] [EC:2.3.1.179]
            KSE_67380  putative 3-oxoacyl-[acyl-carrier-protein] synthase [KO:K09458] [EC:2.3.1.179]
            KSE_73360  fabF5; putative 3-oxoacyl-[acyl-carrier-protein] synthase II [KO:K09458] [EC:2.3.1.179]
            KSE_27320  fabF6; putative 3-oxoacyl-[acyl-carrier-protein] synthase II [KO:K09458] [EC:2.3.1.179]
            KSE_42370  fabG; putative 3-oxoacyl-[acyl-carrier-protein] reductase [KO:K00059] [EC:1.1.1.100]
            KSE_56930  putative oxidoreductase [KO:K00059] [EC:1.1.1.100]
            KSE_57080  putative 3-oxoacyl-[acyl-carrier-protein] reductase [KO:K00059] [EC:1.1.1.100]
            KSE_65960  putative 3-oxoacyl-[acyl-carrier-protein] reductase [KO:K00059] [EC:1.1.1.100]
            KSE_54450  putative oxidoreductase [KO:K00059] [EC:1.1.1.100]
            KSE_64480  putative 3-oxoacyl-[acyl-carrier-protein] reductase [KO:K00059] [EC:1.1.1.100]
            KSE_17510  putative 3-oxoacyl-[acyl-carrier-protein] reductase [KO:K00059] [EC:1.1.1.100]
            KSE_11260  putative oxidoreductase [KO:K00059] [EC:1.1.1.100]
            KSE_46920  putative 3-oxoacyl-[acyl-carrier-protein] reductase [KO:K00059] [EC:1.1.1.100]
            KSE_08640  putative oxidoreductase [KO:K00059] [EC:1.1.1.100]
            KSE_75270  putative oxidoreductase [KO:K00059] [EC:1.1.1.100]
            KSE_42390  fabZ; putative beta-hydroxyacyl-[acyl-carrier-protein] dehydratase [KO:K02372] [EC:4.2.1.59]
            KSE_57090  fabI; putative enoyl-[acyl-carrier-protein] reductase [KO:K00208] [EC:1.3.1.10 1.3.1.9]
            KSE_54350  desA; acyl-[acyl-carrier-protein] desaturase [KO:K03921] [EC:1.14.19.26 1.14.19.11 1.14.19.2]
            KSE_25760  fadD4; putative long-chain fatty-acid--CoA ligase [KO:K01897] [EC:6.2.1.3]
            KSE_21700  fadD1; putative long-chain fatty-acid--CoA ligase [KO:K01897] [EC:6.2.1.3]
            KSE_17190  fadD3; putative long-chain fatty-acid--CoA ligase [KO:K01897] [EC:6.2.1.3]
            KSE_73410  bfmM; acyl-CoA ligase [KO:K01897] [EC:6.2.1.3]
            KSE_62790  fadD2; putative long-chain fatty-acid--CoA ligase [KO:K01897] [EC:6.2.1.3]
            KSE_13030  hypothetical protein [KO:K01897] [EC:6.2.1.3]
COMPOUND    C00024  Acetyl-CoA
            C00083  Malonyl-CoA
            C00154  Palmitoyl-CoA
            C00229  Acyl-carrier protein
            C00249  Hexadecanoic acid
            C00712  (9Z)-Octadecenoic acid
            C01203  Oleoyl-[acyl-carrier protein]
            C01209  Malonyl-[acyl-carrier protein]
            C01530  Octadecanoic acid
            C01571  Decanoic acid
            C02679  Dodecanoic acid
            C03939  Acetyl-[acyl-carrier protein]
            C04088  Octadecanoyl-[acyl-carrier protein]
            C04180  cis-Dec-3-enoyl-[acp]
            C04246  But-2-enoyl-[acyl-carrier protein]
            C04618  (3R)-3-Hydroxybutanoyl-[acyl-carrier protein]
            C04619  (3R)-3-Hydroxydecanoyl-[acyl-carrier protein]
            C04620  (3R)-3-Hydroxyoctanoyl-[acyl-carrier protein]
            C04633  (3R)-3-Hydroxypalmitoyl-[acyl-carrier protein]
            C04688  (3R)-3-Hydroxytetradecanoyl-[acyl-carrier protein]
            C05223  Dodecanoyl-[acyl-carrier protein]
            C05744  Acetoacetyl-[acp]
            C05745  Butyryl-[acp]
            C05746  3-Oxohexanoyl-[acp]
            C05747  (R)-3-Hydroxyhexanoyl-[acp]
            C05748  trans-Hex-2-enoyl-[acp]
            C05749  Hexanoyl-[acp]
            C05750  3-Oxooctanoyl-[acp]
            C05751  trans-Oct-2-enoyl-[acp]
            C05752  Octanoyl-[acp]
            C05753  3-Oxodecanoyl-[acp]
            C05754  trans-Dec-2-enoyl-[acp]
            C05755  Decanoyl-[acp]
            C05756  3-Oxododecanoyl-[acp]
            C05757  (R)-3-Hydroxydodecanoyl-[acp]
            C05758  trans-Dodec-2-enoyl-[acp]
            C05759  3-Oxotetradecanoyl-[acp]
            C05760  trans-Tetradec-2-enoyl-[acp]
            C05761  Tetradecanoyl-[acp]
            C05762  3-Oxohexadecanoyl-[acp]
            C05763  trans-Hexadec-2-enoyl-[acp]
            C05764  Hexadecanoyl-[acp]
            C06423  Octanoic acid
            C06424  Tetradecanoic acid
            C08362  (9Z)-Hexadecenoic acid
            C16219  3-Oxostearoyl-[acp]
            C16220  (R)-3-Hydroxyoctadecanoyl-[acp]
            C16221  (2E)-Octadecenoyl-[acp]
            C16520  Hexadecenoyl-[acyl-carrier protein]
            C20794  n-7 Unsaturated acyl-[acyl-carrier protein]
REFERENCE   PMID:12061798
  AUTHORS   Salas JJ, Ohlrogge JB.
  TITLE     Characterization of substrate specificity of plant FatA and FatB acyl-ACP thioesterases.
  JOURNAL   Arch Biochem Biophys 403:25-34 (2002)
            DOI:10.1016/S0003-9861(02)00017-6
REFERENCE   PMID:12518017
  AUTHORS   Zhang YM, Marrakchi H, White SW, Rock CO.
  TITLE     The application of computational methods to explore the diversity and structure of bacterial fatty acid synthase.
  JOURNAL   J Lipid Res 44:1-10 (2003)
            DOI:10.1194/jlr.R200016-JLR200
REFERENCE   PMID:11337402
  AUTHORS   Voelker T, Kinney AJ.
  TITLE     VARIATIONS IN THE BIOSYNTHESIS OF SEED-STORAGE LIPIDS.
  JOURNAL   Annu Rev Plant Physiol Plant Mol Biol 52:335-361 (2001)
            DOI:10.1146/annurev.arplant.52.1.335
REFERENCE   PMID:17573542
  AUTHORS   Barker GC, Larson TR, Graham IA, Lynn JR, King GJ.
  TITLE     Novel insights into seed fatty acid synthesis and modification pathways from genetic diversity and quantitative trait Loci analysis of the Brassica C genome.
  JOURNAL   Plant Physiol 144:1827-42 (2007)
            DOI:10.1104/pp.107.096172
KO_PATHWAY  ko00061
///

Retrieving pathway components¶

As the database entry for ksk00061 above shows, the pathway comprises many GENE and COMPOUND entries, but the flat-file format that is returned makes it awkward to extract them.
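If you do want to pull one section out of the flat-file text yourself, the following is a minimal sketch rather than a complete parser. It assumes the layout seen in the entry above: a keyword in the first 12 columns, with continuation lines left blank in those columns. The function names `kegg_section` and `id_and_description` are hypothetical helpers, not part of Biopython, and the sample text is a shortened copy of the ksk00061 entry.

```python
def kegg_section(entry_text, key):
    """Return the data lines of one top-level section (e.g. "GENE")."""
    values = []
    in_section = False
    for line in entry_text.splitlines():
        if line.startswith("///"):  # end-of-entry marker
            break
        field, data = line[:12].strip(), line[12:].strip()
        if field:  # a new keyword (or sub-keyword) starts on this line
            in_section = field == key
        if in_section and data:
            values.append(data)
    return values


def id_and_description(data_line):
    """Split 'C00024  Acetyl-CoA' into ('C00024', 'Acetyl-CoA')."""
    identifier, _, description = data_line.partition("  ")
    return identifier, description.strip()


# A shortened copy of the entry above, rebuilt with the 12-column layout
rows = [
    ("ENTRY", "ksk00061                    Pathway"),
    ("GENE", "KSE_57090  fabI; putative enoyl-[acyl-carrier-protein] reductase"),
    ("", "KSE_13030  hypothetical protein [KO:K01897] [EC:6.2.1.3]"),
    ("COMPOUND", "C00024  Acetyl-CoA"),
    ("", "C00083  Malonyl-CoA"),
    ("REFERENCE", "PMID:12061798"),
    ("  AUTHORS", "Salas JJ, Ohlrogge JB."),
    ("KO_PATHWAY", "ko00061"),
]
sample = "\n".join(f"{key:<12}{data}" for key, data in rows) + "\n///\n"

for data_line in kegg_section(sample, "COMPOUND"):
    print(id_and_description(data_line))
```

This works for simple sections, but it becomes fragile as soon as you need nested fields such as REFERENCE, which is one reason to prefer the link-based approach described next.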

You can use the `kegg_link()` function to identify the components of a pathway, by specifying first the `<database>` you want to make a connection to, then the `<entry>` identifier for the database entry you are interested in:

result = REST.kegg_link(<database>, <entry>).read()

For instance, to identify the COMPOUND entries represented in the map00061 pathway, you would compose the query:

result = REST.kegg_link("compound", "map00061").read()

as below:

# Get compounds involved with fatty-acid biosynthesis (reference pathway map00061)
result = REST.kegg_link("compound", "map00061").read()
to_df(result)
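The text returned by `kegg_link()` is a two-column, tab-separated table: the entry you asked about, then the linked identifier with a database prefix such as `cpd:`. When you only need the bare identifiers as a Python list, something like the sketch below may be enough. The helper name `link_targets` is hypothetical, and the sample string stands in for a real response so that the cell runs without a network connection.

```python
def link_targets(link_text):
    """Return the second-column identifiers, with database prefixes removed."""
    targets = []
    for line in link_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:  # skip blank or malformed lines
            continue
        # 'cpd:C00024' -> 'C00024'
        targets.append(parts[1].split(":", 1)[-1])
    return targets


# Two lines in the same shape as the kegg_link() result for map00061
sample_links = "path:map00061\tcpd:C00024\npath:map00061\tcpd:C00083\n"
print(link_targets(sample_links))
```

In the notebook you would pass `result` from the cell above in place of `sample_links`.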

You can use any of the databases in KEGG with this function, though not every combination of database and entry will return a result.

You can use this function to query generic pathways against the very useful reference databases of KEGG:

ko: KEGG orthologues - a collection of functional orthologues
ec: EC numbers - a collection of Enzyme Commission classifications
rn: REACTION entries - descriptions of chemical interconversions
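To collect links to several of these reference databases in one go, you could loop over the database names, as in the sketch below. The names `links_for` and `kegg_fetch` are hypothetical. The query function is passed in as an argument so that the looping logic can be tried with a stand-in function, and the pause between calls is an assumed, conservative value intended to keep the request rate low.

```python
import time


def links_for(entry, databases, fetch, pause=0.5):
    """Map each target database to the identifiers linked from `entry`."""
    linked = {}
    for database in databases:
        text = fetch(database, entry)
        # Keep the second (target) column of each tab-separated line
        linked[database] = [
            line.split("\t")[1] for line in text.splitlines() if "\t" in line
        ]
        time.sleep(pause)  # assumed delay between successive requests
    return linked


def kegg_fetch(database, entry):
    """Query KEGG through Biopython (needs a network connection)."""
    from Bio.KEGG import REST  # imported here so links_for() also works offline

    return REST.kegg_link(database, entry).read()


# Example call (performs three remote queries):
# links_for("map00061", ["ko", "ec", "rn"], kegg_fetch)
```

Keeping the number of databases small matters here, given the request limit and the condition of service on bulk downloads quoted in the introduction.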

For example, to identify reactions that are involved in the fatty-acid synthesis pathway, and then get the database entry for one of these, you could use the queries in the cells below:

# Get reactions involved with fatty-acid biosynthesis
result = REST.kegg_link("rn", "map00061").read()
to_df(result)

# Get the database entry for reaction R00742
result = REST.kegg_get("R00742").read()
print(result)

ENTRY       R00742                      Reaction
NAME        acetyl-CoA:carbon-dioxide ligase (ADP-forming)
DEFINITION  ATP + Acetyl-CoA + HCO3- <=> ADP + Orthophosphate + Malonyl-CoA
EQUATION    C00002 + C00024 + C00288 <=> C00008 + C00009 + C00083
COMMENT     two-step reaction (see R04385 + R04386)
RCLASS      RC00002  C00002_C00008
            RC00040  C00024_C00083
            RC00367  C00083_C00288
ENZYME      6.4.1.2
PATHWAY     rn00061  Fatty acid biosynthesis
            rn00254  Aflatoxin biosynthesis
            rn00620  Pyruvate metabolism
            rn00640  Propanoate metabolism
            rn00720  Carbon fixation pathways in prokaryotes
            rn01100  Metabolic pathways
            rn01110  Biosynthesis of secondary metabolites
            rn01120  Microbial metabolism in diverse environments
            rn01130  Biosynthesis of antibiotics
            rn01200  Carbon metabolism
            rn01212  Fatty acid metabolism
MODULE      M00082  Fatty acid biosynthesis, initiation
            M00375  Hydroxypropionate-hydroxybutylate cycle
            M00376  3-Hydroxypropionate bi-cycle
ORTHOLOGY   K01946  acetyl-CoA carboxylase / biotin carboxylase 2 [EC:6.4.1.2 6.3.4.14 2.1.3.15]
            K01961  acetyl-CoA carboxylase, biotin carboxylase subunit [EC:6.4.1.2 6.3.4.14]
            K01962  acetyl-CoA carboxylase carboxyl transferase subunit alpha [EC:6.4.1.2 2.1.3.15]
            K01963  acetyl-CoA carboxylase carboxyl transferase subunit beta [EC:6.4.1.2 2.1.3.15]
            K01964  acetyl-CoA/propionyl-CoA carboxylase [EC:6.4.1.2 6.4.1.3]
            K02160  acetyl-CoA carboxylase biotin carboxyl carrier protein
            K11262  acetyl-CoA carboxylase / biotin carboxylase 1 [EC:6.4.1.2 6.3.4.14 2.1.3.15]
            K11263  acetyl-CoA/propionyl-CoA carboxylase, biotin carboxylase, biotin carboxyl carrier protein [EC:6.4.1.2 6.4.1.3 6.3.4.14]
            K15036  acetyl-CoA/propionyl-CoA carboxylase [EC:6.4.1.2 6.4.1.3 2.1.3.15]
            K15037  biotin carboxyl carrier protein
            K18472  acetyl-CoA/propionyl-CoA carboxylase carboxyl transferase subunit [EC:6.4.1.2 6.4.1.3 2.1.3.15]
            K18603  acetyl-CoA/propionyl-CoA carboxylase [EC:6.4.1.2 6.4.1.3]
            K18604  acetyl-CoA/propionyl-CoA carboxylase [EC:6.4.1.2 6.4.1.3 2.1.3.15]
            K18605  biotin carboxyl carrier protein
DBLINKS     RHEA: 11311
///
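The EQUATION field of a reaction entry lists KEGG compound identifiers on either side of `<=>`, so it can be split into left-hand and right-hand compounds with a few lines of Python. The sketch below uses a hypothetical helper, `parse_equation`, and assumes the equation fits on a single line, as it does for R00742. Any stoichiometric coefficients would be left attached to the identifier.

```python
def parse_equation(entry_text):
    """Return (left, right) lists of compound terms from the EQUATION field."""
    for line in entry_text.splitlines():
        if line.startswith("EQUATION"):
            equation = line[len("EQUATION"):].strip()
            break
    else:
        raise ValueError("no EQUATION field found")
    left, right = equation.split(" <=> ")
    return left.split(" + "), right.split(" + ")


# The relevant lines from the R00742 entry shown above
sample_entry = (
    "ENTRY       R00742                      Reaction\n"
    "EQUATION    C00002 + C00024 + C00288 <=> C00008 + C00009 + C00083\n"
    "///\n"
)
substrates, products = parse_equation(sample_entry)
print(substrates)
print(products)
```

In the notebook you would pass `result` from the `kegg_get()` cell above instead of `sample_entry`.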

Exercise 01 (15min)¶

The UniProt record Q05655 describes a human protein kinase. Using KEGG, can you discover:

Which genes are associated with this UniProt entry?

	0	1
0	path:map00010	Glycolysis / Gluconeogenesis
1	path:map00020	Citrate cycle (TCA cycle)
2	path:map00030	Pentose phosphate pathway
3	path:map00040	Pentose and glucuronate interconversions
4	path:map00051	Fructose and mannose metabolism
5	path:map00052	Galactose metabolism
6	path:map00053	Ascorbate and aldarate metabolism
7	path:map00061	Fatty acid biosynthesis
8	path:map00062	Fatty acid elongation
9	path:map00071	Fatty acid degradation
10	path:map00072	Synthesis and degradation of ketone bodies
11	path:map00073	Cutin, suberine and wax biosynthesis
12	path:map00100	Steroid biosynthesis
13	path:map00120	Primary bile acid biosynthesis
14	path:map00121	Secondary bile acid biosynthesis
15	path:map00130	Ubiquinone and other terpenoid-quinone biosynt...
16	path:map00140	Steroid hormone biosynthesis
17	path:map00190	Oxidative phosphorylation
18	path:map00195	Photosynthesis
19	path:map00196	Photosynthesis - antenna proteins
20	path:map00220	Arginine biosynthesis
21	path:map00230	Purine metabolism
22	path:map00231	Puromycin biosynthesis
23	path:map00232	Caffeine metabolism
24	path:map00240	Pyrimidine metabolism
25	path:map00250	Alanine, aspartate and glutamate metabolism
26	path:map00253	Tetracycline biosynthesis
27	path:map00254	Aflatoxin biosynthesis
28	path:map00260	Glycine, serine and threonine metabolism
29	path:map00261	Monobactam biosynthesis
...	...	...
493	path:map07057	Antiparkinsonian agents
494	path:map07110	Benzoic acid family
495	path:map07112	1,2-Diphenyl substitution family
496	path:map07114	Naphthalene family
497	path:map07117	Benzodiazepine family
498	path:map07211	Serotonin receptor agonists/antagonists
499	path:map07212	Histamine H1 receptor antagonists
500	path:map07213	Dopamine receptor agonists/antagonists
501	path:map07214	beta-Adrenergic receptor agonists/antagonists
502	path:map07215	alpha-Adrenergic receptor agonists/antagonists
503	path:map07216	Catecholamine transferase inhibitors
504	path:map07217	Renin-angiotensin system inhibitors
505	path:map07218	HIV protease inhibitors
506	path:map07219	Cyclooxygenase inhibitors
507	path:map07220	Cholinergic and anticholinergic drugs
508	path:map07221	Nicotinic cholinergic receptor antagonists
509	path:map07222	Peroxisome proliferator-activated receptor (PP...
510	path:map07223	Retinoic acid receptor (RAR) and retinoid X re...
511	path:map07224	Opioid receptor agonists/antagonists
512	path:map07225	Glucocorticoid and mineralocorticoid receptor ...
513	path:map07226	Progesterone, androgen and estrogen receptor a...
514	path:map07227	Histamine H2/H3 receptor agonists/antagonists
515	path:map07228	Eicosanoid receptor agonists/antagonists
516	path:map07229	Angiotensin receptor and endothelin receptor a...
517	path:map07230	GABA-A receptor agonists/antagonists
518	path:map07231	Sodium channel blocking drugs
519	path:map07232	Potassium channel blocking and opening drugs
520	path:map07233	Ion transporter inhibitors
521	path:map07234	Neurotransmitter transporter inhibitors
522	path:map07235	N-Metyl-D-aspartic acid receptor antagonists

	0	1
0	path:ksk00010	Glycolysis / Gluconeogenesis - Kitasatospora s...
1	path:ksk00020	Citrate cycle (TCA cycle) - Kitasatospora setae
2	path:ksk00030	Pentose phosphate pathway - Kitasatospora setae
3	path:ksk00040	Pentose and glucuronate interconversions - Kit...
4	path:ksk00051	Fructose and mannose metabolism - Kitasatospor...
5	path:ksk00052	Galactose metabolism - Kitasatospora setae
6	path:ksk00053	Ascorbate and aldarate metabolism - Kitasatosp...
7	path:ksk00061	Fatty acid biosynthesis - Kitasatospora setae
8	path:ksk00071	Fatty acid degradation - Kitasatospora setae
9	path:ksk00072	Synthesis and degradation of ketone bodies - K...
10	path:ksk00121	Secondary bile acid biosynthesis - Kitasatospo...
11	path:ksk00130	Ubiquinone and other terpenoid-quinone biosynt...
12	path:ksk00190	Oxidative phosphorylation - Kitasatospora setae
13	path:ksk00220	Arginine biosynthesis - Kitasatospora setae
14	path:ksk00230	Purine metabolism - Kitasatospora setae
15	path:ksk00240	Pyrimidine metabolism - Kitasatospora setae
16	path:ksk00250	Alanine, aspartate and glutamate metabolism - ...
17	path:ksk00253	Tetracycline biosynthesis - Kitasatospora setae
18	path:ksk00260	Glycine, serine and threonine metabolism - Kit...
19	path:ksk00261	Monobactam biosynthesis - Kitasatospora setae
20	path:ksk00270	Cysteine and methionine metabolism - Kitasatos...
21	path:ksk00280	Valine, leucine and isoleucine degradation - K...
22	path:ksk00281	Geraniol degradation - Kitasatospora setae
23	path:ksk00290	Valine, leucine and isoleucine biosynthesis - ...
24	path:ksk00300	Lysine biosynthesis - Kitasatospora setae
25	path:ksk00310	Lysine degradation - Kitasatospora setae
26	path:ksk00311	Penicillin and cephalosporin biosynthesis - Ki...
27	path:ksk00330	Arginine and proline metabolism - Kitasatospor...
28	path:ksk00332	Carbapenem biosynthesis - Kitasatospora setae
29	path:ksk00340	Histidine metabolism - Kitasatospora setae
...	...	...
100	path:ksk01057	Biosynthesis of type II polyketide products - ...
101	path:ksk01059	Biosynthesis of enediyne antibiotics - Kitasat...
102	path:ksk01100	Metabolic pathways - Kitasatospora setae
103	path:ksk01110	Biosynthesis of secondary metabolites - Kitasa...
104	path:ksk01120	Microbial metabolism in diverse environments -...
105	path:ksk01130	Biosynthesis of antibiotics - Kitasatospora setae
106	path:ksk01200	Carbon metabolism - Kitasatospora setae
107	path:ksk01210	2-Oxocarboxylic acid metabolism - Kitasatospor...
108	path:ksk01212	Fatty acid metabolism - Kitasatospora setae
109	path:ksk01220	Degradation of aromatic compounds - Kitasatosp...
110	path:ksk01230	Biosynthesis of amino acids - Kitasatospora setae
111	path:ksk01501	beta-Lactam resistance - Kitasatospora setae
112	path:ksk01502	Vancomycin resistance - Kitasatospora setae
113	path:ksk01503	Cationic antimicrobial peptide (CAMP) resistan...
114	path:ksk02010	ABC transporters - Kitasatospora setae
115	path:ksk02020	Two-component system - Kitasatospora setae
116	path:ksk02024	Quorum sensing - Kitasatospora setae
117	path:ksk02060	Phosphotransferase system (PTS) - Kitasatospor...
118	path:ksk03010	Ribosome - Kitasatospora setae
119	path:ksk03018	RNA degradation - Kitasatospora setae
120	path:ksk03020	RNA polymerase - Kitasatospora setae
121	path:ksk03030	DNA replication - Kitasatospora setae
122	path:ksk03050	Proteasome - Kitasatospora setae
123	path:ksk03060	Protein export - Kitasatospora setae
124	path:ksk03070	Bacterial secretion system - Kitasatospora setae
125	path:ksk03410	Base excision repair - Kitasatospora setae
126	path:ksk03420	Nucleotide excision repair - Kitasatospora setae
127	path:ksk03430	Mismatch repair - Kitasatospora setae
128	path:ksk03440	Homologous recombination - Kitasatospora setae
129	path:ksk04122	Sulfur relay system - Kitasatospora setae

	0	1
0	ksk:KSE_00010t	ttrA1; putative helicase
1	ksk:KSE_00020t	hypothetical protein
2	ksk:KSE_00030t	hypothetical protein
3	ksk:KSE_00040t	hypothetical protein
4	ksk:KSE_00060t	putative helicase
5	ksk:KSE_00070t	hypothetical protein
6	ksk:KSE_00080t	hypothetical protein
7	ksk:KSE_00090t	hypothetical protein
8	ksk:KSE_00100t	hypothetical protein
9	ksk:KSE_00110t	hypothetical protein
10	ksk:KSE_00120t	putative transposase
11	ksk:KSE_00130t	hypothetical protein
12	ksk:KSE_00140t	hypothetical protein
13	ksk:KSE_00150t	hypothetical protein
14	ksk:KSE_00160t	hypothetical protein
15	ksk:KSE_00170t	hypothetical protein
16	ksk:KSE_00180t	hypothetical protein
17	ksk:KSE_00190t	hypothetical protein
18	ksk:KSE_00200t	putative sesquiterpene cyclase
19	ksk:KSE_00210t	hypothetical protein
20	ksk:KSE_00220t	hypothetical protein
21	ksk:KSE_00230t	hypothetical protein
22	ksk:KSE_00240t	hypothetical protein
23	ksk:KSE_00250t	hypothetical protein
24	ksk:KSE_00260t	hypothetical protein
25	ksk:KSE_00270t	hypothetical protein
26	ksk:KSE_00280t	hypothetical protein
27	ksk:KSE_00290t	hypothetical protein
28	ksk:KSE_00300t	putative transposase
29	ksk:KSE_00310t	hypothetical protein
...	...	...
7643	ksk:KSE_76430t	hypothetical protein
7644	ksk:KSE_76440t	putative transposase
7645	ksk:KSE_76450t	hypothetical protein
7646	ksk:KSE_76460t	hypothetical protein
7647	ksk:KSE_76470t	hypothetical protein
7648	ksk:KSE_76480t	hypothetical protein
7649	ksk:KSE_76490t	hypothetical protein
7650	ksk:KSE_76500t	hypothetical protein
7651	ksk:KSE_76510t	hypothetical protein
7652	ksk:KSE_76520t	hypothetical protein
7653	ksk:KSE_76530t	hypothetical protein
7654	ksk:KSE_76540t	putative sesquiterpene cyclase
7655	ksk:KSE_76550t	hypothetical protein
7656	ksk:KSE_76560t	hypothetical protein
7657	ksk:KSE_76570t	hypothetical protein
7658	ksk:KSE_76580t	hypothetical protein
7659	ksk:KSE_76590t	hypothetical protein
7660	ksk:KSE_76600t	hypothetical protein
7661	ksk:KSE_76610t	hypothetical protein
7662	ksk:KSE_76620t	putative transposase
7663	ksk:KSE_76630t	hypothetical protein
7664	ksk:KSE_76640t	hypothetical protein
7665	ksk:KSE_76650t	hypothetical protein
7666	ksk:KSE_76660t	hypothetical protein
7667	ksk:KSE_76670t	hypothetical protein
7668	ksk:KSE_76680t	putative helicase
7669	ksk:KSE_76700t	hypothetical protein
7670	ksk:KSE_76710t	hypothetical protein
7671	ksk:KSE_76720t	hypothetical protein
7672	ksk:KSE_76730t	ttrA2; putative helicase

	0	1
0	ece:Z1464	stx2A; shiga-like toxin II A subunit encoded b...
1	ece:Z1465	stx2B; shiga-like toxin II B subunit encoded b...
2	ece:Z3343	stx1B; shiga-like toxin 1 subunit B encoded wi...
3	ece:Z3344	stx1A; shiga-like toxin 1 subunit A encoded wi...
4	ecs:ECs1205	Shiga toxin 2 subunit A
5	ecs:ECs1206	Shiga toxin 2 subunit B
6	ecs:ECs2973	Shiga toxin I subunit B
7	ecs:ECs2974	Shiga toxin I subunit A
8	ecf:ECH74115_2905	shigatoxin 2, subunit B
9	ecf:ECH74115_2906	shiga toxin subunit A
10	ecf:ECH74115_3532	shiga toxin 2 B subunit
11	ecf:ECH74115_3533	shiga toxin subunit A
12	etw:ECSP_2722	stx2cB; Shiga-like toxin II subunit B precursor
13	etw:ECSP_2723	stx2A1; Shiga-like toxin II subunit A precursor
14	etw:ECSP_3252	stx2B; shiga toxin II subunit B
15	etw:ECSP_3253	stx2A2; shiga toxin II subunit A
16	elx:CDCO157_1154	Shiga toxin 2 subunit A
17	elx:CDCO157_1155	Shiga toxin 2 subunit B
18	elx:CDCO157_2738	Shiga toxin I subunit B precursor
19	elx:CDCO157_2739	Shiga toxin I subunit A precursor
20	eoj:ECO26_1599	Shiga toxin 1 subunit A
21	eoj:ECO26_1600	Shiga toxin 1 subunit B
22	eoi:ECO111_2429	Shiga toxin 2 subunit B
23	eoi:ECO111_2430	Shiga toxin 2 subunit A
24	eoi:ECO111_3361	Shiga toxin 1 subunit A
25	eoi:ECO111_3362	Shiga toxin 1 subunit B
26	eoh:ECO103_2844	Shiga toxin 2 subunit B
27	eoh:ECO103_2845	Shiga toxin 2 subunit A
28	eoh:ECO103_5197	Shiga toxin 1 subunit A
29	eoh:ECO103_5198	Shiga toxin 1 subunit B
...	...	...
78	vg:26516283	stxB, AU083_gp11; Escherichia phage phi191; sh...
79	vg:26516284	stxA, AU083_gp12; Escherichia phage phi191; sh...
80	vg:26519429	AU154_gp39; Shigella phage Ss-VASD; Stx1 A sub...
81	vg:26519430	AU154_gp40; Shigella phage Ss-VASD; Stx1 B sub...
82	vg:1481767	stx2A, Stx2II_p143; Escherichia phage Stx2 II;...
83	vg:1481768	stx2B, Stx2II_p144; Escherichia phage Stx2 II;...
84	vg:26798065	AXI88_gp33; Shigella phage 75/02 Stx; shiga to...
85	vg:26798066	AXI88_gp34; Shigella phage 75/02 Stx; shiga to...
86	vg:1481747	stx1A, Stx1_p142; Escherichia Stx1 converting ...
87	vg:1481748	stx1B, Stx1_p143; Escherichia Stx1 converting ...
88	vg:1261950	stxA2, 933Wp40; Enterobacteria phage 933W; Shi...
89	vg:1262010	stxB2, 933Wp41; Enterobacteria phage 933W; Shi...
90	vg:2641645	stxA1, PBV4795_ORF40; Enterobacteria phage BP-...
91	vg:2641657	stxB1, PBV4795_ORF41; Enterobacteria phage BP-...
92	vg:929695	stxA2e, P27p25; Enterobacteria phage phiP27; s...
93	vg:929727	stxB2e, P27p26; Enterobacteria phage phiP27; s...
94	vg:4397483	stx2A, Stx2-86_gp01; Stx2-converting phage 86;...
95	vg:4397484	stx2B, Stx2-86_gp02; Stx2-converting phage 86;...
96	vg:6159405	stx2A, pMIN27_41; Escherichia phage Min27; Shi...
97	vg:6159351	stx2B, pMIN27_42; Escherichia phage Min27; Shi...
98	vg:6973138	Stx2-1717_gp41; Stx2-converting phage 1717; ve...
99	vg:6972909	stx2cB, Stx2-1717_gp42; Stx2-converting phage ...
100	vg:6973079	YYZ_gp39; Enterobacteria phage YYZ-2008; Shiga...
101	vg:6973080	YYZ_gp40; Enterobacteria phage YYZ-2008; Shiga...
102	vg:13828571	stx2A, D300_gp43; Escherichia phage P13374; sh...
103	vg:13828535	stx2B, D300_gp42; Escherichia phage P13374; sh...
104	vg:14005228	F366_gp36; Escherichia phage TL-2011c; Shiga t...
105	vg:14005229	F366_gp37; Escherichia phage TL-2011c; Shiga t...
106	vg:1262249	stx2A, VT2-Sap42; Enterobacteria phage VT2-Sak...
107	vg:1262250	stx2B, VT2-Sap43; Enterobacteria phage VT2-Sak...

	0	1
0	eoi:ECO111_2429	Shiga toxin 2 subunit B
1	eoi:ECO111_2430	Shiga toxin 2 subunit A
2	eoi:ECO111_3361	Shiga toxin 1 subunit A
3	eoi:ECO111_3362	Shiga toxin 1 subunit B

	0	1
0	cpd:C00051	307.323480
1	cpd:C00200	306.336960
2	cpd:C00219	304.466880
3	cpd:C00239	307.197122
4	cpd:C00270	309.269860
5	cpd:C00357	301.187702
6	cpd:C00365	308.181882
7	cpd:C00389	302.235700
8	cpd:C00732	308.372760
9	cpd:C00777	300.435120
10	cpd:C00836	301.507760
11	cpd:C00891	302.494060
12	cpd:C00892	304.509940
13	cpd:C00941	305.181242
14	cpd:C01143	308.116884
15	cpd:C01169	307.429440
16	cpd:C01294	301.191002
17	cpd:C01416	303.352940
18	cpd:C01513	306.320420
19	cpd:C01541	308.327940
20	cpd:C01564	305.368820
21	cpd:C01617	304.251580
22	cpd:C01632	306.313820
23	cpd:C01670	306.360180
24	cpd:C01682	304.299700
25	cpd:C01709	302.278760
26	cpd:C01804	304.347600
27	cpd:C01851	303.352940
28	cpd:C02197	300.392060
29	cpd:C02354	305.181242
...	...	...
472	cpd:C20121	302.346242
473	cpd:C20129	308.327940
474	cpd:C20149	302.451000
475	cpd:C20201	302.364880
476	cpd:C20203	306.396640
477	cpd:C20208	302.451000
478	cpd:C20329	304.466880
479	cpd:C20389	308.341160
480	cpd:C20423	308.116884
481	cpd:C20428	300.435120
482	cpd:C20429	300.435120
483	cpd:C20431	300.349000
484	cpd:C20559	305.221002
485	cpd:C20693	302.407940
486	cpd:C20726	306.205762
487	cpd:C20848	308.116884
488	cpd:C20939	301.709520
489	cpd:C20962	300.311000
490	cpd:C20978	308.328160
491	cpd:C21053	309.269860
492	cpd:C21107	306.205762
493	cpd:C21255	307.343400
494	cpd:C21256	305.327520
495	cpd:C21257	305.327520
496	cpd:C21258	304.342760
497	cpd:C21259	306.358640
498	cpd:C21296	304.423820
499	cpd:C21323	306.310520
500	cpd:C21561	304.466880
501	cpd:C21562	304.466880

	0	1
0	path:map00061	cpd:C00024
1	path:map00061	cpd:C00083
2	path:map00061	cpd:C00154
3	path:map00061	cpd:C00229
4	path:map00061	cpd:C00249
5	path:map00061	cpd:C00712
6	path:map00061	cpd:C01203
7	path:map00061	cpd:C01209
8	path:map00061	cpd:C01530
9	path:map00061	cpd:C01571
10	path:map00061	cpd:C02679
11	path:map00061	cpd:C03939
12	path:map00061	cpd:C04088
13	path:map00061	cpd:C04180
14	path:map00061	cpd:C04246
15	path:map00061	cpd:C04618
16	path:map00061	cpd:C04619
17	path:map00061	cpd:C04620
18	path:map00061	cpd:C04633
19	path:map00061	cpd:C04688
20	path:map00061	cpd:C05223
21	path:map00061	cpd:C05744
22	path:map00061	cpd:C05745
23	path:map00061	cpd:C05746
24	path:map00061	cpd:C05747
25	path:map00061	cpd:C05748
26	path:map00061	cpd:C05749
27	path:map00061	cpd:C05750
28	path:map00061	cpd:C05751
29	path:map00061	cpd:C05752
30	path:map00061	cpd:C05753
31	path:map00061	cpd:C05754
32	path:map00061	cpd:C05755
33	path:map00061	cpd:C05756
34	path:map00061	cpd:C05757
35	path:map00061	cpd:C05758
36	path:map00061	cpd:C05759
37	path:map00061	cpd:C05760
38	path:map00061	cpd:C05761
39	path:map00061	cpd:C05762
40	path:map00061	cpd:C05763
41	path:map00061	cpd:C05764
42	path:map00061	cpd:C06423
43	path:map00061	cpd:C06424
44	path:map00061	cpd:C08362
45	path:map00061	cpd:C16219
46	path:map00061	cpd:C16220
47	path:map00061	cpd:C16221
48	path:map00061	cpd:C16520
49	path:map00061	cpd:C20794

	0	1
0	path:map00061	rn:R00742
1	path:map00061	rn:R01280
2	path:map00061	rn:R01624
3	path:map00061	rn:R01626
4	path:map00061	rn:R01706
5	path:map00061	rn:R02814
6	path:map00061	rn:R03370
7	path:map00061	rn:R04014
8	path:map00061	rn:R04355
9	path:map00061	rn:R04428
10	path:map00061	rn:R04429
11	path:map00061	rn:R04430
12	path:map00061	rn:R04533
13	path:map00061	rn:R04534
14	path:map00061	rn:R04535
15	path:map00061	rn:R04536
16	path:map00061	rn:R04537
17	path:map00061	rn:R04543
18	path:map00061	rn:R04544
19	path:map00061	rn:R04566
20	path:map00061	rn:R04568
21	path:map00061	rn:R04724
22	path:map00061	rn:R04725
23	path:map00061	rn:R04726
24	path:map00061	rn:R04952
25	path:map00061	rn:R04953
26	path:map00061	rn:R04954
27	path:map00061	rn:R04955
28	path:map00061	rn:R04956
29	path:map00061	rn:R04957
30	path:map00061	rn:R04958
31	path:map00061	rn:R04959
32	path:map00061	rn:R04960
33	path:map00061	rn:R04961
34	path:map00061	rn:R04962
35	path:map00061	rn:R04963
36	path:map00061	rn:R04964
37	path:map00061	rn:R04965
38	path:map00061	rn:R04966
39	path:map00061	rn:R04967
40	path:map00061	rn:R04968
41	path:map00061	rn:R04969
42	path:map00061	rn:R04970
43	path:map00061	rn:R07639
44	path:map00061	rn:R07762
45	path:map00061	rn:R07763
46	path:map00061	rn:R07764
47	path:map00061	rn:R07765
48	path:map00061	rn:R08157
49	path:map00061	rn:R08158
50	path:map00061	rn:R08159
51	path:map00061	rn:R08161
52	path:map00061	rn:R08162
53	path:map00061	rn:R08163
54	path:map00061	rn:R10700
55	path:map00061	rn:R10707
56	path:map00061	rn:R10714
