OBO flat file import

From WandoraWiki

Revision as of 19:27, 28 January 2008

Converts an OBO flat file v1.2 ontology to a topic map and merges the converted topic map into Wandora. Start the import with the menu option File > Import > OBO import.... Wandora also accepts OBO files dropped onto the Wandora window. If an OBO file is dropped onto the layer stack, a new layer is created for the imported file.

The OBO flat file format is used mainly in bioinformatics to store and share ontologies related to the biosciences. It was initially developed for the Gene Ontology. Today, however, over 60 different public ontologies exist in OBO format. These ontologies can be browsed and downloaded at the Open Biological Ontologies Foundry.

As an example of OBO import, we have converted the Gene Ontology to a topic map. Below is a screenshot of Wandora with the Gene Ontology topic map open. In addition to OBO import, Wandora can also export a topic map back to the OBO flat file format. Read more at OBO flat file export and OBO round trip.


[Screenshot: Gene ontology.gif]


Conversion details

OBO import creates a root topic for each namespace. The root topic is named obo (namespace), where namespace is the name of the namespace. Each namespace-specific root topic is associated with a collection of meta-topics:

  • category (namespace) collects all subsets specified in the OBO file. A subset is a named category of terms in the OBO file and is defined in the OBO file header with the tag subsetdef. If the namespace doesn't contain categories, category (namespace) is not created.
  • header (namespace) is a header topic collecting the OBO header properties. Each header property is stored as a text occurrence whose type is generated from the header tag name.
  • term (namespace) collects all OBO terms in the given namespace. An OBO term is an instance of the term (namespace) topic.
  • obsolete (namespace) collects all obsolete terms in the given namespace. An obsolete OBO term is an instance of the obsolete (namespace) topic. An obsolete term topic is also an instance of term (namespace).
  • synonym (namespace) collects all OBO term synonyms in the given namespace. An OBO term synonym is an instance of the synonym (namespace) topic.
  • description (namespace) collects all dbxref description topics.
  • typedef collects all relationship topics used in the OBO file. Typical relationships used in an OBO file are part_of and develops_from. The namespace is not used in relationship names.
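
The naming scheme above can be sketched in a few lines of Python. This is only an illustration of the list, not Wandora's implementation: the function name meta_topic_names and the has_subsets flag are hypothetical, and the sketch assumes the topic names are written literally as "kind (namespace)".

```python
def meta_topic_names(namespace, has_subsets=True):
    """Return the root and meta-topic names for one OBO namespace.

    Hypothetical helper: it only mirrors the naming pattern described
    in the list above and does not call any Wandora API.
    """
    kinds = ["header", "term", "obsolete", "synonym", "description"]
    if has_subsets:
        # category (namespace) exists only when the file defines subsets.
        kinds.insert(0, "category")

    names = {"root": f"obo ({namespace})"}
    for kind in kinds:
        names[kind] = f"{kind} ({namespace})"

    # Relationship types are collected under a single topic whose name
    # does not include the namespace.
    names["typedef"] = "typedef"
    return names
```

For a namespace called biological_process, the sketch yields names such as obo (biological_process) and term (biological_process).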


OBO terms

An OBO term is described by a [Term] stanza in the OBO file. Each OBO term is converted into a topic. Wandora gives the term topic a subject identifier constructed from the term id. The subject identifier pattern is

http://www.wandora.org/obo/ID

where ID is the term's id. For example, the term GO:0010480 gets the subject identifier http://www.wandora.org/obo/GO:0010480. Wandora also gives the term topic a base name constructed from the term name and id. For example, the term GO:0010480 is given the base name microsporocyte differentiation (GO:0010480). Wandora also gives the term topic an English display variant name, which is the term name as given in the OBO file.
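
As a minimal sketch, the two patterns can be written as plain string operations. The function names subject_identifier and base_name are hypothetical and chosen only for this example. The sketch also assumes the id is appended to the base URL without any escaping, which is what the GO:0010480 example suggests.

```python
OBO_SI_BASE = "http://www.wandora.org/obo/"


def subject_identifier(term_id):
    """Build the subject identifier for a term id (pattern: base + ID)."""
    return OBO_SI_BASE + term_id


def base_name(term_name, term_id):
    """Build the base name from the term name and id: 'name (ID)'."""
    return f"{term_name} ({term_id})"


si = subject_identifier("GO:0010480")
name = base_name("microsporocyte differentiation", "GO:0010480")
```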

Additionally, the term id is attached to the term topic as an obo-id text occurrence. The term definition and comment are also attached to the term topic as text occurrences of type definition and comment. Wandora gives each definition origin and origin description a topic of its own. An association of type definition-origin links the term topic, the definition origin (the defining authorship), and the description of that origin. Wandora creates one definition-origin association for each definition origin.
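
The following sketch shows how a def tag value could be split into the definition text and its origins, with one definition-origin association per origin. It is not Wandora's parser. It assumes the OBO 1.2 layout of a quoted text followed by a bracketed, comma-separated dbxref list, and it ignores escaped characters and commas inside descriptions. The role names term, definition-origin and description, and all ids in the usage example, are hypothetical.

```python
import re

# Quoted definition text followed by a bracketed dbxref list.
DEF_RE = re.compile(r'^"(?P<text>[^"]*)"\s*\[(?P<xrefs>.*)\]\s*$')
# One dbxref: an origin id, optionally followed by a quoted description.
XREF_RE = re.compile(
    r'\s*(?P<origin>[^\s",]+)(?:\s+"(?P<desc>[^"]*)")?\s*(?:,|$)'
)


def parse_def(value):
    """Split a def tag value into (text, [(origin, description or None)])."""
    match = DEF_RE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised def value: {value!r}")
    origins = [
        (x.group("origin"), x.group("desc"))
        for x in XREF_RE.finditer(match.group("xrefs"))
    ]
    return match.group("text"), origins


def definition_origin_associations(term_id, value):
    """Create one definition-origin association per definition origin."""
    _, origins = parse_def(value)
    associations = []
    for origin, description in origins:
        players = {"term": term_id, "definition-origin": origin}
        if description is not None:
            # The description is a player of this association only.
            players["description"] = description
        associations.append({"type": "definition-origin", "players": players})
    return associations
```

A def value with two dbxrefs, such as "Some text." [EX_REF:1, EX_REF:2 "a note"], produces two associations, and only the second one carries a description player.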

Wandora creates a stub topic for each alternative id, using the subject identifier schema described above, and links the term topic to the alternative term with alternative-term associations.

Wandora creates an association of type Namespace to link the term topic and its namespace.

Xrefs are used in the OBO format to link similar terms in external ontologies. Wandora creates a stub topic for the xref term and links the term topic and the xref topic with an xref association. The xref term is assumed to get its detailed structure and properties when the ontology describing that term is merged in.
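
The idea of a stub topic that is filled in by a later merge can be sketched with a dictionary keyed by subject identifier, since topics sharing a subject identifier merge in Topic Maps. This is a simplified model, not Wandora's topic map API. The helper names and the topic dictionary layout are hypothetical, and the sketch assumes xref stubs use the same subject identifier pattern as terms, which the text states only for alternative ids.

```python
OBO_SI_BASE = "http://www.wandora.org/obo/"


def new_topic(si):
    """Create an empty topic record identified only by its subject identifier."""
    return {"si": si, "base_name": None, "occurrences": {}}


def get_or_create_stub(topics, term_id):
    """Return the topic for term_id, creating a bare stub if it is missing."""
    si = OBO_SI_BASE + term_id
    return topics.setdefault(si, new_topic(si))


def merge_topic(topics, incoming):
    """Merge an incoming topic into the map by subject identifier."""
    existing = topics.setdefault(incoming["si"], new_topic(incoming["si"]))
    if incoming.get("base_name"):
        existing["base_name"] = incoming["base_name"]
    existing["occurrences"].update(incoming.get("occurrences", {}))
    return existing
```

After importing the ontology that describes the xref term, the stub and the full topic share a subject identifier, so the map still holds a single topic that now carries the name and occurrences.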

An OBO term may have multiple synonym names. Each synonym name has a scope, a type, and an origin. Although scope and type are also features of variant names in Topic Maps, an OBO synonym is not converted to a variant name but to a topic associated with the term. This design decision is due to Wandora's rather rigid variant name schema. Wandora creates a topic for each term synonym, scope, type, origin, and origin description. An association of type synonym is created to link the term and the synonym. If the synonym has a scope, a type, and a described origin, the association has 5 players.

Note that the origin description is not a real property of the origin but a floating property of each synonym and definition-origin association. This design decision is due to the observation that dbxref descriptions in synonyms and definitions are not used consistently. In fact, some OBO ontologies use dbxrefs and their descriptions as if they were slots and properties. I don't really know whether this was the intention of the OBO file format authors, but the format allows such usage and it has been used. As a result, a dbxref's description may vary across the ontology, and each description is valid only in its given context.
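
A small sketch illustrates why the description is kept on the association rather than on the origin topic. The association records, ids and descriptions below are made-up examples, and the flat dictionary layout is an assumption for illustration, not Wandora's data model.

```python
from collections import defaultdict


def descriptions_by_origin(associations):
    """Collect every description attached to each origin across associations."""
    seen = defaultdict(set)
    for assoc in associations:
        if assoc.get("description") is not None:
            seen[assoc["origin"]].add(assoc["description"])
    return dict(seen)


# Hypothetical associations: the same dbxref is described differently
# in a definition and in a synonym.
associations = [
    {"type": "definition-origin", "term": "EX:0000001",
     "origin": "EX_REF:1", "description": "original definition"},
    {"type": "synonym", "term": "EX:0000002",
     "origin": "EX_REF:1", "description": "abbreviation"},
]

conflicts = {
    origin: descs
    for origin, descs in descriptions_by_origin(associations).items()
    if len(descs) > 1
}
```

Here EX_REF:1 ends up with two different descriptions. Storing a single description on the origin topic would lose one of them, whereas each association keeps the description that is valid in its own context.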
