Importing RDF

From WandoraWiki

Revision as of 14:01, 29 May 2015

Wandora reads RDF XML, N3, Turtle and JSON-LD files. To import a file, select File > Import > Simple RDF XML Import..., File > Import > Simple RDF N3 Import..., File > Import > Simple RDF Turtle Import... or File > Import > Simple RDF JSON-LD Import.... Alternatively, you can drag and drop RDF files onto the layer stack, which automatically imports each dropped file into a new layer. Wandora converts the imported RDF triples to topics, associations and occurrences. The conversion schema is very simple and pays no attention to the semantics of the RDF file. The conversion works as follows.

  • A topic is always created for the RDF subject and the RDF predicate. Both topics are typed with Wandora's predefined type topics.
  • If the object is an RDF literal, an occurrence (text data) is created for the subject topic. The occurrence's type is the predicate topic and its value is the RDF literal. The occurrence's scope is derived from the lang attribute. If no lang attribute is found, the scope is language independent.
  • If the object is not an RDF literal, a topic is created for the object and associated with the subject topic. The association's type is the predicate topic. Both roles are Wandora's predefined topics. The object topic is typed with Wandora's predefined type topic.
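
The rules above can be summarized in a small self-contained sketch. It uses neither Wandora's nor Jena's API: the class name and the describe method are hypothetical and exist only for illustration, and the returned strings merely describe what the import would create for one triple.

```java
final class ConversionSketch {

    /**
     * Describes the topic map construct produced for one triple.
     * All parameters are plain strings; this is not the Wandora API.
     */
    public static String describe(String subject, String predicate, String object,
                                  boolean objectIsLiteral, String lang) {
        if (objectIsLiteral) {
            // Literal object: occurrence on the subject topic, typed by the predicate topic.
            String scope = (lang == null || lang.isEmpty()) ? "language independent" : lang;
            return "occurrence on <" + subject + "> type <" + predicate
                    + "> scope [" + scope + "] value \"" + object + "\"";
        }
        // Non-literal object: association between the subject and object topics.
        return "association type <" + predicate + "> between <"
                + subject + "> and <" + object + ">";
    }

    public static void main(String[] args) {
        System.out.println(describe("http://example.org/alice",
                "http://www.w3.org/2000/01/rdf-schema#label", "Alice", true, "en"));
        System.out.println(describe("http://example.org/alice",
                "http://example.org/knows", "http://example.org/bob", false, null));
    }
}
```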

Created topics do not contain base names or variant names. Each created topic gets one subject identifier, which is the URI of the corresponding RDF resource. Wandora uses the Jena RDF framework to read RDF files. Below is the Java code snippet that Wandora uses to handle RDF statements.


   public void handleStatement(Statement stmt, TopicMap map,
                               Topic subjectType,
                               Topic predicateType,
                               Topic objectType) throws TopicMapException {
       
       Resource subject   = stmt.getSubject();     // get the subject
       Property predicate = stmt.getPredicate();   // get the predicate
       RDFNode object     = stmt.getObject();      // get the object
       String lan         = null;                  // language attribute
       
       Topic subjectTopic = getOrCreateTopic(map, subject.toString());
       Topic predicateTopic = getOrCreateTopic(map, predicate.toString());
       
       subjectTopic.addType(subjectType);
       predicateTopic.addType(predicateType);
      
       if(object.isLiteral()) {
           try { lan = stmt.getLanguage(); } catch(Exception e) { /* LANG ATTRIBUTE NOT FOUND! */ }
           if(lan==null || lan.length()==0) {
              subjectTopic.setData(predicateTopic,
                                getOrCreateTopic(map, occurrenceScopeSI),
                                                 ((Literal) object).getString());
           }
           else {
              subjectTopic.setData(predicateTopic,
                                getOrCreateTopic(map, XTMPSI.getLang(lan)),
                                                 ((Literal) object).getString());
           }
       }
       else if(object.isResource()) {
           Topic objectTopic = getOrCreateTopic(map, object.toString());
           Association association = map.createAssociation(predicateTopic);
           association.addPlayer(subjectTopic, subjectType);
           association.addPlayer(objectTopic, objectType);
           objectTopic.addType(objectType);
       }
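       else if(object.isURIResource()) {
           // NOTE: effectively unreachable, because every URI resource
           // already passes the isResource() test above.
           log("URIResource found but not handled!");
       }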
   }
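
The snippet above relies on a getOrCreateTopic helper that is not shown here. Presumably it looks a topic up by its subject identifier and creates the topic only when none exists yet. The following sketch illustrates that pattern with a plain map. It is an assumption about the helper's behavior, not Wandora's implementation, and the Topic class in it is a hypothetical stand-in.

```java
import java.util.HashMap;
import java.util.Map;

final class TopicRegistrySketch {

    /** Hypothetical minimal topic: one subject identifier, no base or variant names. */
    static final class Topic {
        final String subjectIdentifier;
        Topic(String subjectIdentifier) { this.subjectIdentifier = subjectIdentifier; }
    }

    private final Map<String, Topic> topicsBySI = new HashMap<>();

    /** Returns the topic for the given URI, creating it on first use. */
    public Topic getOrCreateTopic(String si) {
        return topicsBySI.computeIfAbsent(si, Topic::new);
    }

    public int size() {
        return topicsBySI.size();
    }

    public static void main(String[] args) {
        TopicRegistrySketch map = new TopicRegistrySketch();
        Topic first = map.getOrCreateTopic("http://example.org/alice");
        Topic second = map.getOrCreateTopic("http://example.org/alice");
        // The same URI yields the same topic, so repeated statements do not duplicate topics.
        System.out.println(first == second);
    }
}
```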


Post-processing the imported RDF

To make the imported RDF more topic-map-like, you may want to modify it after the import. This chapter discusses post-processing techniques that make an RDF-imported topic map more convenient to use.

Constructing base names

Topics originating from RDF(S) have no base names, so the first step is to add them. You can create a base name from a topic's subject identifier with the Make base name with SI tool, found in the topic table's context menu under Topics > Base names. The base name is constructed automatically from the filename and anchor of the subject identifier URL. If the topic map contains subject identifiers with identical filenames, take extra care with these topics to prevent them from merging automatically.
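
To see why identical filenames need care, consider the sketch below. It assumes that the base name is simply the part of the subject identifier after the last slash, that is, filename plus anchor. The tool's exact rules may differ, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BaseNameSketch {

    /** Assumed rule: the base name is whatever follows the last '/' of the SI. */
    public static String baseNameFromSI(String si) {
        String s = si;
        while (s.endsWith("/")) {               // ignore trailing slashes
            s = s.substring(0, s.length() - 1);
        }
        int slash = s.lastIndexOf('/');
        return slash >= 0 ? s.substring(slash + 1) : s;
    }

    /** Groups subject identifiers by derived base name so that clashes become visible. */
    public static Map<String, List<String>> groupByBaseName(List<String> sis) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String si : sis) {
            groups.computeIfAbsent(baseNameFromSI(si), k -> new ArrayList<>()).add(si);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> sis = List.of(
                "http://example.org/schema/foaf#Person",
                "http://a.example/people/index.html",
                "http://b.example/places/index.html");
        groupByBaseName(sis).forEach((name, uris) ->
                System.out.println(name + " <- " + uris
                        + (uris.size() > 1 ? "  (would share a base name)" : "")));
    }
}
```

Here the two index.html identifiers end up with the same derived name, which is the situation the paragraph above warns about.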

The second step is to clean up the base names. You can use Topics > Base names > Regex replace... to filter out undesired parts of the base names. If you start the tool in the context of a layer, it processes all base names found in the layer's topic map. For example, to strip a leading prefix string from base names you could use the regular expression

prefix(.+)

and replacement

$1
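
The effect of this expression can be tried outside Wandora. The sketch below assumes the tool follows standard java.util.regex semantics, which is not confirmed here, and the clean method is a hypothetical helper. It also shows that the unanchored expression matches the word prefix anywhere in a name, so anchoring it with ^ is safer when only a leading prefix should be removed.

```java
final class RegexReplaceSketch {

    /** Applies a regex replacement to one base name using java.util.regex semantics. */
    public static String clean(String baseName, String regex, String replacement) {
        return baseName.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        System.out.println(clean("prefixPerson", "prefix(.+)", "$1"));     // Person
        System.out.println(clean("myprefixPerson", "prefix(.+)", "$1"));   // myPerson
        System.out.println(clean("myprefixPerson", "^prefix(.+)", "$1"));  // myprefixPerson
    }
}
```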

Constructing variant names

The third step is to generate variant names from RDF label occurrences. RDF documents generally carry labels attached to RDF concepts, and the labels may be language dependent. If such labels exist, each one becomes a label occurrence of the corresponding topic. To generate variant names from these occurrences, select all RDF topics and use the tool Topics > Variant names > Make display variants with occurrences. The tool copies the occurrence texts to variant names.

If the variant construction was successful, you may want to remove the label occurrences. To remove occurrences of a given type, use the tool Topics > Occurrences > Delete occurrences with type.... The tool finds all occurrence types in use and asks which occurrences to remove. Again, to process every topic in the topic map, start the tool in the context of a layer.
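
The two steps, copying label occurrences to display variants and then deleting the occurrences, can be pictured with the following sketch. It models a topic with two plain maps keyed by language. The Topic class and both methods are hypothetical stand-ins for illustration and do not reflect how the Wandora tools are implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class VariantNameSketch {

    /** Hypothetical topic: label occurrences and display variants, both keyed by language. */
    static final class Topic {
        final Map<String, String> labelOccurrences = new LinkedHashMap<>();
        final Map<String, String> displayVariants = new LinkedHashMap<>();
    }

    /** Step 1: copy each label occurrence text to a display variant of the same language. */
    public static void makeDisplayVariants(Topic t) {
        t.displayVariants.putAll(t.labelOccurrences);
    }

    /** Step 2: delete the label occurrences, but only if every one has a variant. */
    public static void deleteLabelOccurrences(Topic t) {
        if (t.displayVariants.keySet().containsAll(t.labelOccurrences.keySet())) {
            t.labelOccurrences.clear();
        }
    }

    public static void main(String[] args) {
        Topic t = new Topic();
        t.labelOccurrences.put("en", "Person");
        t.labelOccurrences.put("fi", "Henkilö");
        makeDisplayVariants(t);
        deleteLabelOccurrences(t);
        System.out.println("variants: " + t.displayVariants);
        System.out.println("occurrences left: " + t.labelOccurrences.size());
    }
}
```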

Processing associations

The final step is to change the roles of the RDF-originated associations. By default these roles are

You cannot simply rename the role topics, because all players share the same roles. Instead, modify the associations with the Change association role... and Change association type... tools found in the context menu of the association table. In general this step includes the following subtasks:

  • Create all new role and association type topics
  • For each association type

See also

Wandora also contains several RDF extractors that automatically recognize the RDF namespace and create valid base names and association roles for the extracted topics and associations. By default these simple RDF extractors are located in the File > Extract > Simple RDF extract menu. Current RDF extractors are

