REST API for the OMA Browser
The OMA Browser now provides its own REST API, which serves as a window into its database. The API abstracts the OMA Browser's data and makes it accessible from other programming languages. Because the API is RESTful, users can retrieve the data they need through plain HTTP requests sent to the server; the example requests in this document use the base URL https://omabrowser.org/api/.
The OMA REST API supports many data formats, and its predominant use of JSON makes it very browser-compatible. To further ease access, we have also developed a Python library and an R package that serve as user-friendly wrappers around the API. Their documentation is linked below:
- R package: OmaDB bioconductor package; project repository
- Python library: OmaDB python package
Pagination in the API
Note that some endpoints, such as the list of all Hierarchical Orthologous Groups (HOGs), return a very long list. To keep the response size and time manageable, the API uses pagination: each request returns a subset of the results (by default, chunks of 100 objects), and subsequent requests "page" through the remaining results until the end is reached. Pagination is controlled with the "page" query parameter, which selects the page, and the "per_page" query parameter, which sets the number of objects per page.
Information about pagination is provided in the headers of the API response, which is a common way to implement pagination for APIs. The HTTP response includes a "Link" header with the URLs of the next, previous, first and last pages. In addition, the "X-Total-Count" header gives the total number of objects that the request returns over all pages. Here is an example request:
curl -I "https://omabrowser.org/api/genome/?page=2"
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Link: <https://omabrowser.org/api/genomes/>; rel="first", <https://omabrowser.org/api/genomes/>; rel="prev", <https://omabrowser.org/api/genomes/?page=3>; rel="next", <https://omabrowser.org/api/genomes/?page=22>; rel="last"
Vary: Accept
X-Total-Count: 2198
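As an illustration, a client could split the Link header into its individual relations roughly as follows. This is a minimal Python sketch, not part of the OMA API or the OmaDB packages: the helper name `parse_link_header` is hypothetical, and the header value is copied from the example response above.

```python
import re


def parse_link_header(value):
    """Map each rel name (first, prev, next, last) to its URL."""
    links = {}
    # Each entry in the header has the form: <URL>; rel="NAME"
    for match in re.finditer(r'<([^>]*)>\s*;\s*rel="([^"]*)"', value):
        url, rel = match.groups()
        links[rel] = url
    return links


# Link header value copied from the example response.
example = (
    '<https://omabrowser.org/api/genomes/>; rel="first", '
    '<https://omabrowser.org/api/genomes/>; rel="prev", '
    '<https://omabrowser.org/api/genomes/?page=3>; rel="next", '
    '<https://omabrowser.org/api/genomes/?page=22>; rel="last"'
)
links = parse_link_header(example)
print(links["next"])  # https://omabrowser.org/api/genomes/?page=3
```

A client can keep requesting `links["next"]` until the response no longer contains a "next" relation. If the third-party requests library is used, a similar parsed view of the header is available through the `links` attribute of its response object.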
HTTP status codes
Available API Endpoints
Below we list all the available endpoints together with a brief description of what they return and the parameters they take.
function
list
Annotate a sequence with GO functions based on all annotations in OMA. The sequence is expected to be a simple string of amino acids and can be passed as a query parameter.
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
query required | the sequence to be annotated |
genome
list
List of all the genomes present in the current release.
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
page | A page number within the paginated result set. |
per_page | Number of results to return per page. |
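The `page` and `per_page` parameters can be combined with the total reported in the "X-Total-Count" header to enumerate every page up front. The sketch below is only an illustration: the function name `page_urls` is hypothetical, and the endpoint URL and the total of 2198 are taken from the pagination example earlier in this document.

```python
import math
from urllib.parse import urlencode


def page_urls(endpoint, total_count, per_page=100):
    """Return one URL per page needed to cover total_count objects."""
    n_pages = math.ceil(total_count / per_page)
    return [
        f"{endpoint}?{urlencode({'page': page, 'per_page': per_page})}"
        for page in range(1, n_pages + 1)
    ]


# 2198 objects at the default 100 per page need 22 requests.
urls = page_urls("https://omabrowser.org/api/genome/", total_count=2198)
print(len(urls))  # 22
print(urls[-1])   # https://omabrowser.org/api/genome/?page=22&per_page=100
```

Following the "next" relation of the Link header is the more robust choice when the result set may change between requests; precomputing the URLs is mainly convenient for issuing requests in parallel.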
read
Retrieve the information available for a given genome.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
genome_id required | a unique identifier for a genome - either its NCBI taxon id or the UniProt species code |
proteins
Retrieve the list of all the protein entries available for a genome.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
genome_id required | a unique identifier for a genome - either its NCBI taxon id or the UniProt species code |
group
list
List of all the OMA Groups in the current release.
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
page | A page number within the paginated result set. |
per_page | Number of results to return per page. |
read
Retrieve the information available for a given OMA group.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
group_id required | a unique identifier for an OMA group - either its group number, its fingerprint or an entry id of one of its members |
close_groups
Retrieve the sorted list of closely related groups for a given OMA group.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
group_id required | a unique identifier for an OMA group - either its group number, its fingerprint or an entry id of one of its members |
hog
list
List of all the HOGs identified by OMA.
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
level | filter the list of HOGs by a specific taxonomic level. |
compare_with | compares the HOGs at `level` with those at the level passed in this argument (which must be a parent level) and annotates all HOGs with the evolutionary events that occurred between the two points in time. |
page | A page number within the paginated result set. |
per_page | Number of results to return per page. |
read
Retrieve the detail available for a given HOG, along with its deepest level (i.e. root level) as well as the list of all the taxonomic levels that the HOG spans through.
For a given hog_id, the endpoint searches the deepest taxonomic level that has this ID, unless a more recent level has been chosen with the level query parameter, in which case the following information is returned for all induced hogs.
The endpoint returns one object per HOG with its level, URLs to its members, and lists of parent and children HOGs. Parent HOGs are at more ancient levels and children HOGs at more recent levels, and each is separated from the query HOG by at least one duplication event on the lineage. That is, parent_hogs contains HOGs for which we infer a duplication event leading to the query HOG level, whereas children_hogs contains HOGs for which at least one duplication event happened after the query HOG level. In addition, we indicate alternative levels, i.e. levels between which we infer that no event happened for this specific HOG.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
hog_id required | a unique identifier for a hog_group - either its hog id or one of its member proteins |
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
level | taxonomic level of restriction for a HOG. The special level 'root' can be used to identify the level at the roothog, i.e. the deepest level of that HOG. |
members
Retrieve a list of all the protein members for a given hog_id.
The hog_id parameter uses an encoding of the inferred duplication events along the evolution of the family using the LOFT schema (see https://doi.org/10.1186/1471-2105-8-83).
The hog_id changes only after duplication events and hence, the ID remains the same for potentially many taxonomic levels. If no level parameter is provided, this endpoint returns the deepest level that contains this specific ID.
If a level is provided, the endpoint returns the members with respect to this level. Note that if the level is a more ancient taxonomic level than the deepest level for the specified hog_id, the endpoint returns the members for that more ancient level (adjusting the hog_id in the result object accordingly). The special level "root" will always return the members of the root HOG together with its deepest level.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
hog_id required | a unique identifier for a hog_group - either its hog id starting with "HOG:" or one of its member proteins, in which case the specific HOG ID of that protein is used. (Example: HOG:0001221.1a, P12345) |
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
level | taxonomic level of restriction for a HOG - default is its deepest level for a given HOG ID. (Example: Mammalia) |
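The interplay between `hog_id` and the optional `level` parameter can be sketched as a small URL builder. Treat this as an illustration under stated assumptions: the `/hog/<hog_id>/members/` path layout is assumed rather than taken from this document, the function name `hog_members_url` is hypothetical, and the base URL comes from the curl example in the pagination section.

```python
from urllib.parse import quote, urlencode

# Base URL as used in the curl example of the pagination section.
API_ROOT = "https://omabrowser.org/api"


def hog_members_url(hog_id, level=None):
    """Build a members request for a HOG ID or a member protein ID."""
    # ASSUMPTION: the "/hog/<hog_id>/members/" layout is not stated in
    # this document. The colon of "HOG:" is kept unescaped for readability.
    url = f"{API_ROOT}/hog/{quote(hog_id, safe=':')}/members/"
    if level is not None:
        # With a level, members are reported with respect to that level.
        url += "?" + urlencode({"level": level})
    return url


# No level: the deepest level that contains this specific ID is used.
print(hog_members_url("HOG:0001221.1a"))
# Explicit level, and the special level "root" via a member protein.
print(hog_members_url("HOG:0001221.1a", level="Mammalia"))
print(hog_members_url("P12345", level="root"))
```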
similar_profile_hogs
Returns the HOGs with the most similar phylogenetic profiles.
The profiles are based on the number of duplications, losses and retained genes along the phylogenetic tree. Hence, the profiles are computed on the deepest level only, and all sub-HOG IDs will return the same similar HOGs.
Similar profile search is only useful for hogs that have a certain size, i.e. 100 species. For smaller query HOGs, the result will simply be empty.
The result contains, for both the query HOG and the similar HOGs, a field in_species that lists all species in which at least one copy of the gene is present in the HOG.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
hog_id required | a unique identifier for a hog_group - either its hog id starting with "HOG:" or one of its member proteins, in which case the specific HOG ID of that protein is used. (Example: HOG:0450897, P12345) |
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
max_results | the number of similar profiles to return. Must be a positive number less than 50. By default, the 10 HOGs with the most similar profiles are returned. (Example: 20) |
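Since `max_results` has a documented range, a client may want to check it before sending the request. A minimal sketch, assuming the value is passed as a plain integer; the function name `similar_profile_query` is hypothetical.

```python
from urllib.parse import urlencode


def similar_profile_query(max_results=10):
    """Return the query string for a similar_profile_hogs request."""
    # Documented constraint: a positive number less than 50.
    if not isinstance(max_results, int) or not 0 < max_results < 50:
        raise ValueError("max_results must be a positive integer less than 50")
    return urlencode({"max_results": max_results})


print(similar_profile_query(20))  # max_results=20
```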
pairs
read
List the pairwise relations between two genomes.
The relations are orthologs if the genomes are different, and close paralogs and homoeologs if they are the same.
By using the query parameters 'chr1' and 'chr2', one can limit the relations to a certain chromosome for one or both genomes. The id of the chromosome corresponds to the ids returned by the genome endpoint.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
genome_id1 required | a unique identifier for the first genome - either its NCBI taxon id or the UniProt species code |
genome_id2 required | a unique identifier for the second genome - either its NCBI taxon id or the UniProt species code |
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
chr1 | id of the chromosome of interest in the first genome |
chr2 | id of the chromosome of interest in the second genome |
rel_type | limit relations to a certain type of relations, e.g. '1:1'. |
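The optional filters can be combined freely, and only the ones that are set need to appear in the query string. The following sketch illustrates this; the `/pairs/<genome_id1>/<genome_id2>/` path layout is an assumption, the function name `pairs_url` is hypothetical, and the chromosome id "1" is only a placeholder for an id returned by the genome endpoint.

```python
from urllib.parse import urlencode

# Base URL as used in the curl example of the pagination section.
API_ROOT = "https://omabrowser.org/api"


def pairs_url(genome_id1, genome_id2, chr1=None, chr2=None, rel_type=None):
    """Build a pairs request, adding only the filters that were given."""
    filters = {"chr1": chr1, "chr2": chr2, "rel_type": rel_type}
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    # ASSUMPTION: the path layout below is not stated in this document.
    url = f"{API_ROOT}/pairs/{genome_id1}/{genome_id2}/"
    return f"{url}?{query}" if query else url


# UniProt species codes for human and mouse, restricted to 1:1 relations.
print(pairs_url("HUMAN", "MOUSE", rel_type="1:1"))
# Same genome twice (close paralogs and homoeologs), one chromosome only.
print(pairs_url("HUMAN", "HUMAN", chr1="1"))
```

Note that `urlencode` percent-encodes the colon in '1:1' as `1%3A1`, which a server decodes back to the original value.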
protein
bulk_retrieve
Retrieve the information available for multiple protein IDs at once.
The POST request must contain a JSON-encoded list of up to 1000 IDs for which the information is returned.
If an ID is not unique or is unknown, an empty element is returned for that query element.
Changed in version 1.7: the endpoint now returns a list of (query_id, target) tuples instead of a plain list of proteins in the order of the query IDs.
Request Body
The request body should be "application/json" encoded, and should contain a single item.
Parameter | Description |
---|---|
ids required | A list of ids of proteins to retrieve |
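Because a single request accepts at most 1000 IDs, longer lists have to be split across several requests. A minimal sketch of that batching step follows; the function name `bulk_request_bodies` is hypothetical and the IDs are placeholders, not real protein identifiers.

```python
import json

# Limit stated in the endpoint description.
MAX_IDS_PER_REQUEST = 1000


def bulk_request_bodies(ids, batch_size=MAX_IDS_PER_REQUEST):
    """Yield JSON request bodies, each holding at most batch_size IDs."""
    ids = list(ids)
    for start in range(0, len(ids), batch_size):
        # The body contains a single item, "ids", as documented.
        yield json.dumps({"ids": ids[start:start + batch_size]})


# Placeholder IDs, for illustration only.
bodies = list(bulk_request_bodies(f"ID{i}" for i in range(2500)))
print([len(json.loads(body)["ids"]) for body in bodies])  # [1000, 1000, 500]
```

Each body would then be sent in its own POST request with the "application/json" content type. Since version 1.7 returns (query_id, target) tuples, results from several batches can be merged without relying on the order of the responses.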
read
Retrieve the information available for a protein entry.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
entry_id required | a unique identifier for a protein - either its entry number, omaid or its canonical id |
domains
List of the domains present in a protein.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
entry_id required | a unique identifier for a protein - either its entry number, omaid or its canonical id |
gene_ontology
Gene ontology information available for a protein.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
entry_id required | a unique identifier for a protein - either its entry number, omaid or its canonical id |
hog_derived_orthologs
List of the orthologs derived from the hog for a given protein.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
entry_id required | a unique identifier for a protein - either its entry number, omaid or its canonical id |
homoeologs
List of all the homoeologs for a given protein.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
entry_id required | a unique identifier for a protein - either its entry number, omaid or its canonical id |
isoforms
List of isoforms for a protein.
The result contains a list of proteins with information on their locus and exon structure for all the isoforms recorded in OMA that belong to the gene of the query protein.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
entry_id required | a unique identifier for a protein - either its entry number, omaid or its canonical id |
orthologs
List of all the identified pairwise orthologues for a protein. Filtering specific subtypes of orthology is possible by specifying a rel_type query parameter.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
entry_id required | a unique identifier for a protein - either its entry number, omaid or its canonical id |
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
rel_type | filter for orthologs of a specific relationship type only (Example: 1:1) |
xref
List of cross-references for a protein.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
entry_id required | a unique identifier for a protein - either its entry number, omaid or its canonical id |
sequence
list
Identify a protein sequence.
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
query required | the sequence to be searched. |
search | argument to choose the search strategy. Can be set to 'exact', 'approximate' or 'mixed'. Defaults to 'mixed', which first tries to find an exact match and, if no target can be found, falls back to the approximate search strategy to identify the query sequence in the database. |
full_length | a boolean indicating whether, for exact matches, the query sequence must match the full target sequence. By default, a partial exact match is also reported as an exact match. |
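The three parameters of the sequence search can be assembled as in the sketch below. Several details are assumptions: the function name `sequence_query` is hypothetical, the peptide is a made-up example, and encoding the boolean as the string "true" is a guess, since this document does not specify how boolean values are written in the query string.

```python
from urllib.parse import urlencode

# Search strategies listed in the parameter table.
SEARCH_STRATEGIES = ("exact", "approximate", "mixed")


def sequence_query(sequence, search="mixed", full_length=False):
    """Return the query string for a sequence search request."""
    if search not in SEARCH_STRATEGIES:
        raise ValueError(f"search must be one of {SEARCH_STRATEGIES}")
    params = {"query": sequence, "search": search}
    if full_length:
        # ASSUMPTION: the boolean is sent as the literal string "true".
        params["full_length"] = "true"
    return urlencode(params)


# Made-up peptide; exact matches must cover the full target sequence.
print(sequence_query("MKTAYIAKQR", search="exact", full_length=True))
```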
summary
shared_ancestry > read
Returns the fraction of shared ancestry between two species of interest.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
genome_id1 required | a unique identifier for the first genome - either its NCBI taxon id or the UniProt species code |
genome_id2 required | a unique identifier for the second genome - either its NCBI taxon id or the UniProt species code |
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
type | type of orthology information to compute the fraction of shared ancestry, either 'hogs' (default) or 'vps'. |
taxonomy
list
Retrieve the taxonomic tree that is available in the current release.
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
type | the type of the returned data - either dictionary (default), newick or phyloxml. |
members | list of members to get the induced taxonomy from. The list is supposed to be a comma-separated list. Member IDs can be either their ncbi taxon IDs or their UniProt species codes - they just have to be consistent. |
collapse | whether taxonomic levels with a single child should be collapsed. Defaults to yes. |
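The `members` parameter expects one comma-separated value in which all IDs are of the same kind. A client-side check for this could look like the following sketch. The function name `taxonomy_query` is hypothetical, and treating "all digits" as the test for an NCBI taxon ID is a simplifying assumption.

```python
from urllib.parse import urlencode

# Output formats listed in the parameter table.
TREE_TYPES = ("dictionary", "newick", "phyloxml")


def taxonomy_query(members=None, tree_type="dictionary"):
    """Return the query string for a taxonomy list request."""
    if tree_type not in TREE_TYPES:
        raise ValueError(f"type must be one of {TREE_TYPES}")
    params = {"type": tree_type}
    if members:
        members = [str(member) for member in members]
        # ASSUMPTION: purely numeric values are NCBI taxon IDs,
        # everything else is a UniProt species code.
        numeric = [member.isdigit() for member in members]
        if any(numeric) and not all(numeric):
            raise ValueError("do not mix taxon IDs and species codes")
        params["members"] = ",".join(members)
    return urlencode(params)


print(taxonomy_query(["HUMAN", "MOUSE", "YEAST"], tree_type="newick"))
print(taxonomy_query([9606, 10090]))
```

`urlencode` writes the separating commas as `%2C`, which a server decodes back to commas.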
read
Retrieve the subtree rooted at the taxonomic level indicated.
Path Parameters
The following parameters should be included in the URL path.
Parameter | Description |
---|---|
root_id required | either the taxon id, species name or the five-letter UniProt species code for a root taxonomic level |
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
type | the type of the returned data - either dictionary (default) or newick. |
collapse | whether taxonomic levels with a single child should be collapsed. Defaults to yes. |
version
xref
list
List all the cross-references that match a certain pattern.
Query Parameters
The following parameters should be included as part of a URL query string.
Parameter | Description |
---|---|
search | the pattern to be searched for. The pattern must be at least 3 characters long in order to return a hit. |