Topic map conversions of Gellish ontologies

From WandoraWiki

Gellish, also known as STEPlib, is a controlled vocabulary used especially in engineering to specify products, facilities, and processes. The Gellish vocabulary contains concepts and named binary relations between concepts; these relations are also called facts. Each Gellish relation also carries metadata describing the relation's discipline, definition, dates of creation and modification, signature (originator), reference, and status.

The Gellish ontologies were converted to topic maps with Wandora's Gellish ontology extractor. The original Gellish ontologies (version 8.0) were downloaded from SourceForge in March 2009.

Download Topic map conversions of Gellish ontologies

All converted Gellish ontologies are available as Wandora project files and zipped XTM 2.0 dumps.


Short name | Long name | Original version | Wandora project file | Zipped XTM 2.0
Activities | Occurrencies, including activities, events and processes | 8.0 (2008-04-21) | Download 751K | Download 837K
Aspects | Aspects, Properties, States, Qualities, Roles & Knowledge about facts | 8.0 (2008-05-21) | Download 765K | Download 773K
Documents | Documents, Data, Information & Encoding & Symbols & Annotation | 8.0 (2008-04-09) | Download 862K | Download 891K
Electrical | Electrical, Instrumentation, Control, IT, including Valves, Functions & Roles of Signals | 8.0 (2008-04-09) | Download 1.2M | Download 1.2M
Geo | Geographic objects and Lifeforms - Organisms, Persons and Organizations | 8.0 (2008-04-09) | Download 272K | Download 292K
Math | Mathematical & Geometric objects & Spatial Aspects | 8.0 (2008-08-03) | Download 1.0M | Download 1.0M
Physical objects | Physical objects, incl. radiation, fluid batches, static equipment, civil items, plants & process units, piping, connection & protection material | 8.0 (2008-05-14) | Download 1.1M | Download 1.2M
Qualitative aspects | Qualitative aspects & Qualitative information, Elements, Fluids, Solids and Waves, incl. Construction material | 8.0 (2008-04-21) | Download 2.4M | Download 2.9M
Roles | Roles of physical objects and roles of aspects | 8.0 (2008-04-14) | Download 2.6M | Download 2.6M
Rotating | Rotating Equipment, including Electrical machines, Transport equipment, (Un)Loading facilities, and Solids handling | 8.0 (2008-04-07) | Download 641K | Download 749K
Top | Upper ontology - Gellish grammar - TOP of specialization hierarchy + Extensions to Upper ontology - TOP of specialization hierarchy | 8.0 (2008-04-07) | Download 885K | Download 852K
Units | Scales and Units of measure + Scales and Units of measure - Special symbols (Unicode) | 8.0 (2008-08-03) | Download 590K | Download 559K

History

2009-04-08 First release.

Conversion details

The topic map conversions use the Gellish Excel spreadsheets as input (see the screen capture below).

Gellish excel.gif

First, the Excel spreadsheets were converted to tab-delimited text. The tab-delimited text files were then processed with Wandora's Gellish ontology extractor, which was written to convert them to the Topic Maps format. The conversion reads a tab-delimited file one line at a time and converts each line to a fact topic and, from it, to the related concept, date, and other topics. In detail:

  • Each line in the extracted file represents one Gellish fact. A Gellish fact has an identifier, specified in column H. A topic is created for the fact. The topic's subject identifier and basename are derived from the fact's identifier.
  • Each fact is essentially a named binary relation.
    • The relation has a name, specified in columns I and J. Column I contains the relation identifier and column J the name of the relation. A topic is created for the relation name. The topic's subject identifier is derived from the relation identifier. The topic's basename is derived from the relation name and identifier.
    • The relation's left hand operand (a player in Topic Maps terms) is specified in columns E and G. Column E contains the left hand operand identifier and column G the operand name. A topic is created for the left hand operand. The topic's subject identifier is derived from the identifier.
    • The relation's right hand operand (a player in Topic Maps terms) is specified in columns K and M. Column K contains the right hand operand identifier and column M the operand name. A topic is created for the right hand operand. The topic's subject identifier is derived from the identifier.
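The column reading described in the list above can be sketched in a few lines of Python. This is only an illustration, not the code of Wandora's extractor: the helper names col and parse_fact_line and the dictionary keys are hypothetical, and only the column letters come from the description. The sample row reuses the synonym fact discussed later on this page.

```python
# Hypothetical sketch; only the column letters (E, G, H, I, J, K, M)
# are taken from the conversion description.

def col(letter: str) -> int:
    """Map a spreadsheet column letter (A..Z) to a zero-based index."""
    return ord(letter.upper()) - ord("A")


def parse_fact_line(line: str) -> dict:
    """Split one tab-delimited line into the fields that define a fact."""
    cells = line.rstrip("\n").split("\t")
    # Pad short rows so every column letter up to Z can be read safely.
    cells += [""] * (26 - len(cells))
    return {
        "left_id": cells[col("E")].strip(),
        "left_name": cells[col("G")].strip(),
        "fact_id": cells[col("H")].strip(),
        "relation_id": cells[col("I")].strip(),
        "relation_name": cells[col("J")].strip(),
        "right_id": cells[col("K")].strip(),
        "right_name": cells[col("M")].strip(),
    }


# Build a sample row; all other columns are left empty.
row = [""] * 26
row[col("E")] = "190 699"
row[col("G")] = "produce conceptual process design"
row[col("H")] = "1 190 042"
row[col("I")] = "1 981"
row[col("J")] = "is a synonym of"
row[col("K")] = "190 699"
row[col("M")] = "conceptual process design"

fact = parse_fact_line("\t".join(row))
print(fact["fact_id"], "/", fact["relation_name"])
```

Addressing cells by column letter keeps the sketch close to the wording of the list, at the cost of assuming that the tab-delimited export preserves the spreadsheet's column order.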

To bind together all the topics created in the previous steps, an association with four players is created. The association's type is fact. The four players have distinct roles:

  • Fact topic with role fact.
  • Left hand concept topic with role left hand object.
  • Relation type concept topic with role relation type.
  • Right hand concept topic with role right hand object.
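As a rough illustration of the association above, the following Python sketch stores one player per role. The Association class and the make_fact_association function are hypothetical and do not reflect Wandora's API. The role names and the association type come from the list; the player strings are placeholders, because the exact basename formats of fact and relation topics are not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class Association:
    """A typed association holding one player topic per role."""
    type: str
    players: dict = field(default_factory=dict)  # role name -> player topic


def make_fact_association(fact, left, relation, right) -> Association:
    """Bind the four topics of one Gellish fact with distinct roles."""
    return Association(
        type="fact",
        players={
            "fact": fact,
            "left hand object": left,
            "relation type": relation,
            "right hand object": right,
        },
    )


# Placeholder topic names, used for illustration only.
assoc = make_fact_association(
    fact="fact topic",
    left="left hand concept topic",
    relation="relation type concept topic",
    right="right hand concept topic",
)

for role, player in assoc.players.items():
    print(f"{role}: {player}")
```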

A special note is needed here. A Gellish concept always has a unique identifier, but its name (specified in columns G, J, and M) varies. For example, the Gellish vocabulary includes the relation is a synonym of, which binds two names of a single concept: the left hand operand identifier equals the right hand operand identifier, but the left hand operand name differs from the right hand operand name. This naming method is somewhat challenging for the Topic Maps data model. Modelling Gellish synonyms as Topic Maps variant names would lose the simple association model for facts. Modelling Gellish synonyms as base names would lose the inheritance structure of the different names.

The solution maps different Gellish names to different concept topics, i.e. in the Topic Maps conversion multiple topics may represent the same Gellish concept. Topics representing the same Gellish concept differ in the URI fragment number of their subject identifiers. For example, consider the Gellish fact

Left hand id | Left hand name | Fact id | Relation id | Relation name | Right hand id | Right hand name
190 699 | produce conceptual process design | 1 190 042 | 1 981 | is a synonym of | 190 699 | conceptual process design

When this fact is converted to topics and associations, separate topics are generated for the left hand concept and the right hand concept, although they essentially represent the same Gellish concept. The left hand concept is given the subject identifier

http://www.wandora.org/gellish/190699#0

and basename

produce conceptual process design (190699#0)

while the right hand concept is given the subject identifier

http://www.wandora.org/gellish/190699#1

and basename

conceptual process design (190699#1)

If a Gellish concept topic with an identical name already exists, it is reused and no new topic is created. No explicit association is created to bind together the different topics that represent the same Gellish concept. One has to compare the subject identifiers without their fragments to deduce that topics refer to the same concept. Note also that the URI fragment numbering starts from 0 (zero) and increases by one each time the concept appears with a different name. This numbering scheme has consequences for merging Gellish topic maps: the fragment numbering in the first topic map does not necessarily match the numbering in the second, so it is very likely that the wrong topics merge. If you wish to merge Gellish topic maps, you should convert all the Gellish ontologies to be merged sequentially with Wandora.
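The numbering scheme and its effect on merging can be sketched as follows. This is a minimal illustration under stated assumptions, not Wandora's implementation: the ConceptNamer class and the same_concept function are hypothetical, and the sketch assumes that fragment numbers are assigned per concept identifier in the order in which new names are encountered. Only the base URI and the identifier format are taken from the example above.

```python
BASE = "http://www.wandora.org/gellish/"


class ConceptNamer:
    """Assigns one subject identifier per (concept id, name) pair."""

    def __init__(self):
        self._fragments = {}  # compact concept id -> {name: fragment number}

    def subject_identifier(self, concept_id: str, name: str) -> str:
        compact = concept_id.replace(" ", "")  # "190 699" -> "190699"
        names = self._fragments.setdefault(compact, {})
        if name not in names:
            # Numbering starts from 0 and grows with each new name.
            names[name] = len(names)
        return f"{BASE}{compact}#{names[name]}"


def same_concept(si_a: str, si_b: str) -> bool:
    """Compare two subject identifiers while ignoring the fragment."""
    return si_a.split("#")[0] == si_b.split("#")[0]


# One conversion run: both names of concept 190 699 get their own fragment.
first = ConceptNamer()
a = first.subject_identifier("190 699", "produce conceptual process design")
b = first.subject_identifier("190 699", "conceptual process design")

# A separate run that meets only the second name starts again from 0.
second = ConceptNamer()
c = second.subject_identifier("190 699", "conceptual process design")

print(a)                   # fragment 0 in the first run
print(b)                   # fragment 1 in the first run
print(c == a)              # True: different names share one identifier
print(same_concept(a, b))  # True: same Gellish concept
```

In the sketch, the identifier produced by the second run collides with the identifier of a differently named topic from the first run, which is the kind of wrong merge described above. Sharing one ConceptNamer across all conversions would avoid the collision.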

After creating the fact topic and the association, Wandora processes the remaining columns. The common procedure is to create either a topic or an occurrence for the column value and then attach it to the fact topic. The processed columns are:

  • Column D with fact's discipline. Discipline is converted to a topic. Discipline topic is associated with the fact topic.
  • Column N with definition. One could read this column as a definition of the right hand object, but because the Gellish ontology format binds it to a fact, the definition is converted to an occurrence and attached to the fact topic.
  • Column O with full definition. Full definition is converted to an occurrence. Full definition occurrence is attached to the fact topic.
  • Column R with remarks. Remarks are converted to an occurrence. Remarks occurrence is attached to the fact topic.
  • Column S with status. Status is converted to a topic. Status topic is associated with the fact topic.
  • Column U with date of start. Date is converted to a topic. Date topic is associated with the fact topic.
  • Column V with date of latest change. Date is converted to a topic. Date topic is associated with the fact topic.
  • Column W with change author. Change author is converted to a topic. Author topic is associated with the fact topic.
  • Column X with reference author. Reference author is converted to a topic. Reference topic is associated with the fact topic.
  • Column Y with a fact collection name. Fact collection name is converted to a topic. Collection topic is associated with the fact topic.
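The column handling listed above amounts to a small lookup table. The sketch below is an illustration only: COLUMN_RULES and attach_columns are hypothetical names, the skipping of empty cells is an assumption, and the sample definition text is a placeholder. The column letters, the labels, and the choice between topic and occurrence follow the list.

```python
# Column letter -> (label, kind), following the list of processed columns.
COLUMN_RULES = {
    "D": ("discipline", "topic"),
    "N": ("definition", "occurrence"),
    "O": ("full definition", "occurrence"),
    "R": ("remarks", "occurrence"),
    "S": ("status", "topic"),
    "U": ("date of start", "topic"),
    "V": ("date of latest change", "topic"),
    "W": ("change author", "topic"),
    "X": ("reference author", "topic"),
    "Y": ("fact collection name", "topic"),
}


def attach_columns(cells):
    """Return (associated topics, occurrences) for one fact row."""
    associated, occurrences = {}, {}
    for letter, (label, kind) in COLUMN_RULES.items():
        index = ord(letter) - ord("A")
        value = cells[index].strip() if index < len(cells) else ""
        if not value:
            continue  # assumption: empty cells produce nothing
        if kind == "topic":
            associated[label] = value
        else:
            occurrences[label] = value
    return associated, occurrences


# Sample row with two filled cells; the definition text is a placeholder.
cells = [""] * 26
cells[ord("D") - ord("A")] = "information management"
cells[ord("N") - ord("A")] = "placeholder definition text"

topics, occs = attach_columns(cells)
print(topics)
print(occs)
```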

The remaining columns are not processed. In particular, column B (language of the left hand object) and columns F and L (operand cardinalities) are not converted to topics or associations.

Below is a screen capture of Wandora with the Documents ontology. The current topic represents a Gellish fact stating that HTTP open SSL is a synonym of hypertext transfer protocol over secure socket layer. The fact's discipline is information management, and the fact is signed by Andries van Renssen.

Gellish example.gif


Next, the user clicks the topic hypertext transfer protocol over secure socket layer (970276#0), and Wandora opens it for detailed inspection (see below).

Gellish example 02.gif


The user then clicks the topic communication protocol (910839#0), and Wandora opens it for detailed inspection (see below).

Gellish example 03.gif

This kind of topic browsing is a common feature in Wandora, and the user can continue browsing Gellish topics in the same way. Alongside the current topic, Wandora displays all associations and occurrences attached to it.

Metrics

Ontology | # of topics | # of associations | # of topic base names | # of subject identifiers | # of subject locators | # of occurrences | # of distinct topic classes | # of distinct types of associations | # of distinct roles in associations | # of distinct players in associations
Activities | 9826 | 36516 | 9820 | 9850 | 0 | 7858 | 9 | 9 | 12 | 9813
Aspects | 10212 | 34742 | 10206 | 10251 | 0 | 6131 | 9 | 9 | 12 | 10199
Documents | 11060 | 42948 | 11054 | 11095 | 0 | 6861 | 9 | 9 | 12 | 11047
Electrical | 15746 | 59133 | 15740 | 15781 | 0 | 9205 | 9 | 9 | 12 | 15733
Geo | 4592 | 17318 | 4586 | 4599 | 0 | 1735 | 9 | 9 | 12 | 4579
Math | 17265 | 68404 | 17259 | 17280 | 0 | 8642 | 9 | 9 | 12 | 17252
Physical objects | 14944 | 58142 | 14938 | 15013 | 0 | 10294 | 9 | 9 | 12 | 14931
Qualitative aspects | 35706 | 141559 | 35700 | 35735 | 0 | 15468 | 9 | 9 | 12 | 35693
Roles | 35706 | 141559 | 35700 | 35735 | 0 | 15468 | 9 | 9 | 12 | 35693
Rotating | 9002 | 30781 | 8996 | 9024 | 0 | 5212 | 9 | 9 | 12 | 8989
Top | 12236 | 46099 | 12230 | 12252 | 0 | 3928 | 9 | 9 | 12 | 12223
Units | 8199 | 38173 | 8193 | 8211 | 0 | 2725 | 9 | 9 | 12 | 8186

License

Lesser General Public License (LGPL)
