SPARQL extractor

From WandoraWiki

Wandora's SPARQL extractor transforms SPARQL result sets into topic maps. It takes a SPARQL query and a SPARQL endpoint address as input, sends the query to the given endpoint, and converts the returned result set into a topic map. Note that the SPARQL extractor is not a SPARQL endpoint itself, nor does it provide a SPARQL interface to topic maps. Even so, the ability to turn SPARQL result sets into topic maps is very useful, because several public SPARQL endpoints exist, such as DBpedia, data.gov.uk and data.gov. Special emphasis is given to the Finnish SPARQL endpoint of Helsinki Region Infoshare (HRI), which opens Helsinki region data sets for mash-ups.
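The "send the query to the endpoint" step can be pictured with a minimal sketch of a SPARQL Protocol GET request. This is not Wandora's own code. The function name build_sparql_request and the DBpedia endpoint address used below are assumptions made for illustration, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request.

    The query travels in the standard `query` URL parameter, and the
    Accept header asks for the SPARQL JSON results format.
    """
    separator = "&" if "?" in endpoint else "?"
    url = endpoint + separator + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Assumed endpoint address and a trivial query, for illustration only.
request = build_sparql_request(
    "https://dbpedia.org/sparql",
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5",
)
print(request.full_url)
```

Passing the returned object to urllib.request.urlopen would actually perform the HTTP call; that step is left out here so the sketch runs without network access.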

The SPARQL extractor is started with the menu option File > Extract > Other > SPARQL extractor.... Once started, Wandora opens a dialog window for the endpoint address and the SPARQL query. In addition to the generic options, some endpoints have been predefined to make the extractor easier to use. Each predefined endpoint has its own tab in the dialog.

Select the endpoint-specific tab to send a query to that endpoint.


Sparql extractor 02.gif


Converting a SPARQL result set to a topic map is simple. Each row in the result set becomes an association in which the column values are player topics. The association type is a generic one named SPARQL Result Set N, where N is a query-specific hash number. Thus, if the result set has three columns, Wandora creates ternary associations with three players each. The number of associations equals the number of rows in the result set. The conversion transforms all result set values into topics, both resources and literals. Note that Wandora automatically merges equal literals (and a literal with a resource that has the same name), so be careful with literal values in the result set.
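The conversion rule above can be illustrated with a small, self-contained model. This is a sketch under stated assumptions, not Wandora's implementation: the TopicMap class and the convert function are hypothetical, zlib.crc32 merely stands in for Wandora's unspecified query hash, and using column names as association roles is an assumption.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class TopicMap:
    """Toy topic map: topics are identified by name only."""
    topics: set = field(default_factory=set)
    associations: list = field(default_factory=list)

    def topic(self, name: str) -> str:
        # Adding an existing name is a no-op, so equal values merge.
        self.topics.add(name)
        return name


def convert(query: str, columns: list, rows: list, tm: TopicMap) -> str:
    """Turn every result row into one association of the result-set type."""
    # Stand-in hash; the hash function Wandora uses is not documented here.
    type_name = f"SPARQL Result Set {zlib.crc32(query.encode('utf-8'))}"
    tm.topic(type_name)
    for row in rows:
        # Assumption: each column name acts as the role of its player.
        players = {col: tm.topic(value) for col, value in zip(columns, row)}
        tm.associations.append((type_name, players))
    return type_name


tm = TopicMap()
type_name = convert(
    "SELECT ?a ?b WHERE { ... }",  # placeholder query text
    ["a", "b"],
    [("x", "y"), ("x", "z")],      # made-up rows sharing the value "x"
    tm,
)
print(len(tm.associations), sorted(tm.topics))
```

Two rows yield two associations, while the shared value "x" appears as a single topic, which mirrors the merging behaviour the text warns about.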


About SPARQL and queries

This page does not teach you how to write SPARQL queries; good SPARQL tutorials are available elsewhere.

In addition to general SPARQL knowledge, the user should know what kinds of queries can be sent to a particular SPARQL endpoint. As a rule of thumb, consult the documentation of the endpoint. It usually explains what kind of information is stored in the database, what the base classes are, and so on. The documentation usually also shows example queries that help the user get started.

Example - DBpedia and French movies

In this rather long example, the Wandora user first sends a query to DBpedia and receives a list of French movies. The user then enriches one of the created movie topics using Wandora's DBpedia extractor. First, the user starts the SPARQL extractor as shown below.


Sparql extractor 01.gif


Wandora opens the SPARQL extractor dialog window. The user selects the DBpedia tab and writes her query in the text area labeled DBpedia SPARQL query.


Sparql extractor 03.gif


The user then presses the Extract button, and Wandora connects to the DBpedia SPARQL endpoint and sends the query (along with other options) to it. After the extraction finishes, the user should see the topic SPARQL Result Set appear in the topic tree in the left column, below the Wandora class. The result set of the extracted SPARQL query is an instance of this topic. Indeed, if the user opens the topic labeled SPARQL Result Set 1616769989, the topic holds an association table with a single column, and the column topics represent French movies.


Sparql extractor 05.gif


The user now opens one of the movie topics, named Jules_and_Jim. The topic has very little information about the actual movie, but its subject identifier refers to dbpedia.org, which allows the user to continue with further extractions.


Sparql extractor 06.gif


The user chooses the menu option File > Extract > Subjects > DBpedia extractor....


Sparql extractor 07.gif


Wandora opens a dialog asking for a concept name. The user presses the Get context button, and Wandora picks up the name of the movie topic Jules_and_Jim.


Sparql extractor 08.gif


Next, the user presses the Extract button, and Wandora fetches the RDF representation of the topic Jules_and_Jim from DBpedia. When the extraction finishes, the topic Jules_and_Jim contains a lot of extra information about the movie. Because the information was originally RDF, it is not modeled in a very topic-map-like way, but the user now knows a great deal more about the movie.


Sparql extractor 09.gif

Second example - Helsinki Region Infoshare and Helsinki's population centers

In this example, the Wandora user sends a query to the SPARQL endpoint of Helsinki Region Infoshare and receives regional information about Helsinki's suburbs and population centers. The information includes geographical coordinates and polygons of city regions, the hierarchy of regions, and sometimes a description of a region. The user builds an occurrence from one region description and translates it into English and Swedish using the Google Translate feature.

First, the user starts the SPARQL extractor, selects the HRI tab, and writes her query in the text area.


Sparql extractor 10.gif


The user presses the Extract button, and Wandora sends the query to the HRI endpoint and transforms the result set into a topic map. When the extraction finishes, a new result set topic is available below the SPARQL Result Set topic. If the user opens the topic SPARQL Result Set 1540517348, she sees an association table with three columns. The table contains information about Helsinki region population centers. For a user who knows RDF, the table is easy to interpret: each association represents an RDF triple, where area is the subject, pred is the predicate, and obj is the object.
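To make the three-column reading concrete, here is a minimal sketch that parses a response in the W3C SPARQL JSON results format into (area, pred, obj) rows. The sample document is invented for illustration (the example.org URI is not real HRI data), and the helper name rows_from_json is hypothetical; it does not describe how Wandora itself parses results.

```python
import json

# Made-up response in the SPARQL 1.1 Query Results JSON Format.
SAMPLE = """
{
  "head": {"vars": ["area", "pred", "obj"]},
  "results": {"bindings": [
    {
      "area": {"type": "uri", "value": "http://example.org/area/Alppila"},
      "pred": {"type": "uri",
               "value": "http://www.w3.org/2000/01/rdf-schema#label"},
      "obj":  {"type": "literal", "xml:lang": "fi", "value": "Alppila"}
    }
  ]}
}
"""


def rows_from_json(text: str):
    """Return (columns, rows); unbound variables become None."""
    doc = json.loads(text)
    columns = doc["head"]["vars"]
    rows = [
        tuple(binding[c]["value"] if c in binding else None for c in columns)
        for binding in doc["results"]["bindings"]
    ]
    return columns, rows


columns, rows = rows_from_json(SAMPLE)
for area, pred, obj in rows:
    # Each row reads as one RDF triple: subject, predicate, object.
    print(area, pred, obj)
```

Each tuple corresponds to one row of the association table described above, with the column order taken from the head.vars list of the response.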


Sparql extractor 12.gif


The table contains a variety of predicates, as shown below.


Sparql extractor 13.gif


Looking at a single area, _091_122_Alppila, shows all relations of that area. Alppila is a district of the city of Helsinki.


Sparql extractor 14.gif


Looking further reveals that some Helsinki city districts have descriptions, as shown below.


Sparql extractor 15.gif


The user opens one such description topic and creates an occurrence from the topic's base name.


Sparql extractor 16.gif


Sparql extractor 17.gif


Sparql extractor 18.gif


Sparql extractor 19.gif


The user then translates the description occurrence using Google Translate. She right-clicks the occurrence cell in the topic panel and chooses the menu option Translate with Google. Wandora translates the text, and the description topic now has three occurrences: the original Finnish description and its English and Swedish translations.


Sparql extractor 20.gif


Sparql extractor 21.gif


Sparql extractor 22.gif


Sparql extractor 23.gif


Third example - Europeana and audio files

Bob DuCharme has written a blog post, Finding Europeana audio with SPARQL. The post includes a SPARQL query:

PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>  

SELECT ?title ?mediaURL ?creator ?source WHERE {
  ?resource edm:type "SOUND" ;
            ore:proxyIn ?proxy ;
            dc:title ?title ;
            dc:creator ?creator ;
            dc:source ?source . 
  ?proxy edm:isShownBy ?mediaURL . 
 }
OFFSET 600
LIMIT 100

In this example, the Wandora user sends Bob's query to Europeana's SPARQL endpoint. The query returns quadruples of creator, mediaURL, source and title. Wandora transforms each quadruple into an association and each association player into a topic. The mediaURL topics carry the audio URL as a subject identifier. To listen to the audio, the user can open the subject identifier URL in an external application. The following screen captures show these steps in Wandora.


Sparql extractor example3 1.gif


Sparql extractor example3 2.gif


Sparql extractor example3 3.gif


Sparql extractor example3 4.gif


Sparql extractor example3 5.gif


Sparql extractor example3 6.gif


Sparql extractor example3 7.gif
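Outside Wandora, the "open the audio URL" step of the third example can be sketched as follows. This is a rough illustration only: the media_urls helper is hypothetical, and the rows are made-up placeholders rather than real Europeana results.

```python
def media_urls(columns, rows):
    """Collect the distinct mediaURL values of a result set, in row order."""
    index = columns.index("mediaURL")
    # dict.fromkeys keeps the first occurrence of each URL and drops repeats.
    return list(dict.fromkeys(row[index] for row in rows))


# Made-up rows using the column names of the query above.
columns = ["title", "mediaURL", "creator", "source"]
rows = [
    ("Song A", "http://example.org/audio/1.mp3", "Creator X", "Archive 1"),
    ("Song A (copy)", "http://example.org/audio/1.mp3", "Creator X", "Archive 1"),
    ("Song B", "http://example.org/audio/2.mp3", "Creator Y", "Archive 2"),
]

for url in media_urls(columns, rows):
    print(url)
```

Passing one of the printed URLs to webbrowser.open from the Python standard library would hand it to the system's default application, which is analogous to opening the subject identifier in an external application from Wandora.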