UniProt

UniProt provides three key databases:

- UniProtKB is likely to be the database you use most frequently to find information on gene product/protein molecular function. It is the central hub of functional information on proteins, and collates functional annotations from many other databases, ontologies, and references. It keeps records of how annotations are derived (e.g. experimentally or computationally), and is divided into two sections: one contains manually-annotated records (UniProtKB/Swiss-Prot), and a second contains computational annotations that are waiting for manual curation (UniProtKB/TrEMBL).
- UniParc is a comprehensive, non-redundant database that contains most of the publicly-available protein sequences from a range of sources.
- UniRef provides clustered sets of sequences from UniProtKB and UniParc. A number of clusterings at different stringencies are provided.

These databases can be queried in a number of ways, including:

- the UniProt website (http://www.uniprot.org/) in your web browser
- the UniProt website, using a programming language

UniProt website

Go to the UniProt webpage. The landing page offers options for each of the three main databases: UniProtKB, UniRef, and UniParc. It also offers sets of complete proteomes for a range of organisms, and databases of proteins organised by supporting data, such as literature, taxonomic classification, and subcellular location.

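The programmatic route listed above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive client: it assumes UniProt's REST service at https://rest.uniprot.org (an address not given in the text above), the helper name search_url is hypothetical, and "kinase" is only a placeholder search term. The function builds the search URL without sending it.

```python
from urllib.parse import urlencode

# Assumption: base address of UniProt's REST service (not stated in the tutorial text).
REST_BASE = "https://rest.uniprot.org"

# The three databases described above, as they appear in REST paths.
DATABASES = {"uniprotkb", "uniref", "uniparc"}


def search_url(database, query, fmt="json", size=10):
    """Build (but do not send) a search URL for one of the three databases.

    search_url is a hypothetical helper written for this sketch.
    """
    if database not in DATABASES:
        raise ValueError(f"unknown database: {database}")
    params = urlencode({"query": query, "format": fmt, "size": size})
    return f"{REST_BASE}/{database}/search?{params}"


# "kinase" is a placeholder search term chosen for illustration.
url = search_url("uniprotkb", "kinase")
print(url)

# To fetch the results, the URL could be passed to urllib.request.urlopen(url);
# that network call is left out so the sketch runs offline.
```

Keeping URL construction separate from the network call makes it easy to inspect the request in a browser before running it from code.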
UniProtKB

- Click on the UniProtKB link. This will take you to the UniProtKB front page, with a summary of entries, and a number of links.
- Click on the Search button.
- The results can be filtered with "kitasatospora" as an organism or by taxonomy. Click on the organism filter.
- Find the History button. Click on this button. A small window will open, with a link to Previous versions. Click on this link.
- Click on the Compare button.

In this section, you'll use the advanced searches to identify candidate human proteins that are found in the nucleus, and have been associated with some disease activity or function.

- Select Organism [OS] with search term "Human". The dropdown will offer you several options as you type, but do not select them (you could also have entered the organism "Homo sapiens" here).
- Select AND on the left, and select Subcellular location with search term "nucleus". The dropdown will offer several options but, again, do not select them. At this point, allow any assertion method for the evidence code.
- Click on the plus symbol (+) to get another search term field.
- Select AND on the left, and select Pathology & Biotech with class "Disease", and no search term. At this point, allow any assertion method for the evidence code.
- Set the Evidence option for the Pathology & Biotech part of the search to manual "Experimental" evidence.
- Set the Term option for the Pathology & Biotech part of the search to "melanoma".

UniProtKB Search Results

After the search above, you should be left with a small set of proteins that satisfy the following criteria:

- they are human proteins
- their subcellular location matches "nucleus"
- they have a Pathology & Biotech "Disease" annotation matching "melanoma", supported by manual "Experimental" evidence

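The advanced search assembled in the steps above can also be written, roughly, as a single query string. The sketch below is an approximation under stated assumptions: the field names organism_name, cc_scl_term and cc_disease are assumed UniProtKB query fields and should be checked against UniProt's own query-field documentation, the REST address https://rest.uniprot.org is likewise an assumption, the helper build_query is hypothetical, and the manual "Experimental" evidence restriction is not represented.

```python
from urllib.parse import urlencode


def build_query(clauses):
    """Join field:value clauses with AND, quoting values that contain spaces.

    build_query is a hypothetical helper written for this sketch.
    """
    parts = []
    for field, value in clauses.items():
        if " " in value:
            value = f'"{value}"'
        parts.append(f"({field}:{value})")
    return " AND ".join(parts)


# Assumed field names; verify them before relying on the results.
clauses = {
    "organism_name": "human",   # Organism [OS] with search term "Human"
    "cc_scl_term": "nucleus",   # Subcellular location with search term "nucleus"
    "cc_disease": "melanoma",   # Pathology & Biotech, class "Disease", term "melanoma"
}

query = build_query(clauses)
print(query)

# Assumption: REST search endpoint; "tsv" requests tab-separated output.
url = "https://rest.uniprot.org/uniprotkb/search?" + urlencode(
    {"query": query, "format": "tsv"}
)
print(url)
```

Because the evidence filter is left out, a query like this may return more entries than the search built in the browser.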
Results can be downloaded compressed (gzipped) or as raw records, in formats including FASTA, Excel, tab-separated, XML, and RDF.

Using the UniProtKB search tools, can you find and download sets of proteins that satisfy the following requirements: