R in Wandora

From WandoraWiki
(Difference between revisions)
Line 1: Line 1:
Wandora can be used with the [http://www.r-project.org/ R language]. R is an environment for statistical computing and graphing. Properties of the topic map and its topics can be accessed from R and statistics and graphs can be generated from them.
+
[http://www.r-project.org/ R] may be used to compute graph statistics related to Topic Maps or other groups of topics in Wandora. The selected group of topics is represented in R as an [http://igraph.sourceforge.net/ igraph]. The igraph library is thus a prerequisite for graph analysis of Topic Maps in R. Several use cases are presented below. Refer to the [http://igraph.sourceforge.net/doc/R/00Index.html igraph documentation] for further methods of statistical graph analysis.
  
== Setting up R ==
+
== Handling topics in R ==
  
To use R in Wandora you need to install R, then install a few R libraries using the package manager inside R and finally possibly adjust some environment parameters in the Wandora startup script. These steps are explained in detail below.
+
The R interface is exposed through the R console found in the toolbar in Wandora. Scripts may also be run using the [http://www.wandora.org/wiki/R_topic_panel R topic panel] in '''view -> Add topic panel -> R'''.
  
=== Installing R ===
+
We may acquire the targeted set of topics from Wandora using methods of classes that have been exposed by the [http://www.wandora.org/api/org/wandora/application/tools/r/RBridge.html Java-R-bridge]. For example, we may query for all the visible topics simply by calling <code>getAllTopics()</code>, defined in <code>rinit.r</code>, or take the more explicit route:
  
Download R from the R website at http://www.r-project.org/ and follow the installation instructions there. The default installation installs both 32-bit and 64-bit versions. If you decide to only install one, make sure it matches your Java runtime environment version.
+
# Retrieve the Wandora Java object using getWandora() from org.wandora.application.Wandora
 +
# Retrieve the topics themselves using Java method calls and unwrap the returned iterator into a list suitable for further use.
  
On Linux environments R may also be available using the package repository of your Linux distribution. In Ubuntu the name of the package you need is ''r-base''.
+
wandora <- J("org.wandora.application.Wandora")$getWandora()
 +
tm <- wandora$getTopicMap()
 +
ts <- unwrapIterator(tm$getTopics())
  
=== Installing required R libraries ===
+
An igraph object is next created with <code>makeGraphJava</code> where the helper class [http://www.wandora.org/api/org/wandora/application/tools/r/RHelper.html org.wandora.application.tools.r.RHelper] is utilized for heavy lifting. A similar but slower function using R is implemented in <code>makeGraph</code>. The resulting object is next used for the actual statistical analysis.
  
At the very least you must install the ''rJava'' library. Do this using the package manager inside R. First run R using the administrator account. On Windows right click the icon and select "Run as Administrator". The installation may have created two icons on your desktop or the start menu, one for 32-bit R and one for 64-bit. You should use the one that matches your Java runtime environment. Note that you might have a 32-bit Java runtime environment even if your Windows is 64-bit.
+
===Example #1: Calculate the graph diameter===
  
On Linux run R as root, for example in a console using "sudo R".
+
We may now use R to replicate the functionality of the [http://www.wandora.org/wiki/Topic_map_diameter Topic Map diameter] calculator provided in Wandora. In comparison this process should prove to be faster, less memory consuming and therefore more suitable for large datasets.
  
When installing a package you will be prompted to select a mirror for download. Just select your country or one close to it. To install the package issue the command in R:
+
g <- makeGraphJava(ts)
 +
d <- diameter(g)
  
  install.packages("rJava")
+
A full example is written out in <code>GetDiameter.r</code> found among other examples in build/resources/r.
  
Most likely you will also want to install the ''igraph'' library. It is needed to plot network graphs.
+
===Further statistics===
  
  install.packages("igraph")
+
Having constructed the graph there are a multitude of statistics we may compute using functions provided in igraph. In addition to the diameter calculation introduced above a few basic ones are listed here.
  
Currently Wandora has a problem with the default graphics device in Windows environment. To be able to plot anything in Windows you will need to install the ''JavaGD'' graphics device. This is only needed in Windows.
+
* [http://en.wikipedia.org/wiki/Centrality#Eigenvector_centrality Eigenvector centrality] and [http://en.wikipedia.org/wiki/Betweenness_centrality betweenness centrality] using <code>evcent(g)</code> and <code>betweenness(g)</code>
 +
* Degree distribution via <code>degree.distribution(g)</code>
  
  install.packages("JavaGD")
 
  
=== Setting up environment variables ===
+
<gallery perrow=2 widths=300px heights=300px>
 +
File:random500.jpg|Betweenness centrality plotted against eigenvector centrality for a random generated graph with 500 topics and 1000 associations.
 +
File:service.jpg|Betweenness centrality plotted against eigenvector centrality for the [[The Service Map - Helsinki region public services]]
 +
</gallery>
  
Next make sure that the environment variables are set up correctly in the Wandora startup script. In Windows open the ''bin/SetR.bat'' file and in Linux the ''bin/SetR.sh'' file. If you did a standard installation of R then the Linux start-up script likely needs no changes at all.
+
== Mapping vertices to topics ==
  
The Windows start-up script, however, has two things that may need adjusting. Make sure the first line points to your R installation directory. In particular, the R version number may need to be changed. Also make sure that the processor architecture matches your Java installation. Note that you may have a 32-bit Java even if your system is 64-bit. If you aren't sure which Java version you have you can simply try both settings and see which one works. The architecture is specified on lines 5 and 6. For 32-bit use
+
<code>makeGraphJava(topics)</code> transforms the Topic Map representation into a rudimentary undirected graph where topics are represented as vertices identified by a running index. Associations are in turn represented as a set of edges pairing the vertices together. After the transformation it is difficult to target a specific topic by its SI or other identifier in Wandora, since all topic data excluding associations is lost.
  
  set R_ARCH=i386
+
In order to preserve the mapping between topics and respective vertices we may construct the map by hand and use it to call <code>makeGraph(topics, indiceMap)</code> in rinit.r. More specifically
  REM set R_ARCH=x64
+
  
And for 64-bit
+
# Fetch all the topics we want in the graph. Here we fetch all topics contained in the Topic Map.
 +
# Construct the indice map where each subject identifier string maps to an index in the graph.
 +
# Get the string representation of a subject identifier of each topic. This is equivalent to t.getOneSubjectIdentifier( ).toString();
 +
#Append the si-index pair to the map. Indices range from 1 to <code>length(ts)</code>.
 +
# Call makeGraph with indices specified in order to make sure our indices are used to construct the graph.
 +
<pre>
 +
ts <- getAllTopics()
 +
ind <- list()
 +
for(t in ts){
 +
  si <- .jcall(t,"Lorg/wandora/topicmap/Locator;","getOneSubjectIdentifier")
 +
  si <- .jcall(si,"Ljava/lang/String;","toString")
 +
  ind [[si]]<-length(ind)+1
 +
}
 +
g <- makeGraph(ts,ind)
 +
</pre>
 +
We may now select a topic in Wandora and import it to R with <code>getContextTopics()</code>. A detailed example follows.
  
  REM set R_ARCH=i386
+
# Get the selected topics from Wandora.
  set R_ARCH=x64
+
# Get the string representation of the SI and using it look up the index from the map  constructed earlier.
 +
<pre>
 +
cts <- getContextTopics()
 +
cind <- list()
 +
for (ct in cts){
 +
  si <- .jcall(ct, "Lorg/wandora/topicmap/Locator;","getOneSubjectIdentifier")
 +
  si <- .jcall(si, "Ljava/lang/String;","toString")
 +
  cind[[length(cind)+1]] <- ind[[si]]
 +
}
 +
</pre>
  
Other parameters should be correct unless you have customized your R installation beyond a standard setup.
+
Having found the indices of vertices for the selected topics we can now compute statistics related to those vertices in the graph. For example we may compute the immediate neighborhood sizes of those topics with <code>neighborhood.size(g,1,cind)</code>.
  
You should now be able to use R inside Wandora.
+
A full example is again written out in GetContextTopicsNeighbours.r in build/resources/r. We may also define a reverse lookup for SIs in the following manner:
  
== Plotting in Windows ==
+
<pre>
 +
getSI <- function(i){
 +
  for(t in ts){
 +
    si <- .jcall(t,"Lorg/wandora/topicmap/Locator;","getOneSubjectIdentifier")
 +
    si <- .jcall(si,"Ljava/lang/String;","toString")
 +
    if(ind[[si]] == i){
 +
      return(si)
 +
    }
 +
  }
 +
  return("")
 +
}
 +
</pre>
  
The default graphics device doesn't work correctly in Windows. If you try to plot anything you will get an unresponsive graphics window. If at any time you accidentally open it you can close it cleanly in Wandora R Console with
+
===Example #2: Find the topics in a community===
  
  dev.off()
+
This lookup may be utilized in finding topics for a subset of vertices from the graph. In this case we first compute communities for a set of topics. We then pick a community and find the topics in it.
  
To work around this issue you need to use the JavaGD graphics device. You first need to load the JavaGD library with
+
Again, fetch all topics in Wandora.
 +
ts <- getAllTopics()
  
  library("JavaGD")
+
Construct the vertex index to topic SI mapping discussed above.
  
Then initialize the graphics device with
+
<pre>
 +
ind <- list()
 +
for(t in ts){
 +
  si <- .jcall(t,"Lorg/wandora/topicmap/Locator;","getOneSubjectIdentifier")
 +
  si <- .jcall(si,"Ljava/lang/String;","toString")
 +
  ind[[si]]<-length(ind)+1
 +
}
 +
g <- makeGraph(ts,ind)
 +
</pre>
  
  JavaGD()
+
Get the communities of the graph g. Here we use random walk to distinguish communities.
 +
 +
ns <- walktrap.community(g)
  
This will open an empty graphics window. You can then use plot normally to plot in this window.
+
Get all the vertex IDs in the community with the ID 10
  
== R console in Wandora ==
+
<pre>
 +
bigComVer <- list()
 +
mem <- membership(ns)
 +
for(v in seq_along(mem)){
 +
  if(mem[[v]] == 10){
 +
    bigComVer[[length(bigComVer)+1]] <- v
 +
  }
 +
}
 +
</pre>
  
You can open the R console in Wandora by clicking the ''R console'' button in the top toolbar. Assuming that you have installed R and set up the environment correctly you will get the standard R greeting with R version and license information.
+
Finally find the SIs for the vertices we found above.
 +
 +
sis <- lapply(bigComVer,getSI)
  
  R version 2.11.1 (2010-05-31)
+
This approach may also be used with the diameter calculation to find the vertices on the longest of the shortest paths in the topic map. We find the vertex IDs with
  Copyright (C) 2010 The R Foundation for Statistical Computing
+
  ISBN 3-900051-07-0
+
 
+
  R is free software and comes with ABSOLUTELY NO WARRANTY.
+
  You are welcome to redistribute it under certain conditions.
+
  Type 'license()' or 'licence()' for distribution details.
+
 
+
    Natural language support but running in an English locale
+
 
+
  R is a collaborative project with many contributors.
+
  Type 'contributors()' for more information and
+
  'citation()' on how to cite R or R packages in publications.
+
 
+
  Type 'demo()' for some demos, 'help()' for on-line help, or
+
  'help.start()' for an HTML browser interface to help.
+
  Type 'q()' to quit R.
+
 
+
  >
+
  
Otherwise you'll get an error message and instructions about how to setup your R environment.
+
d <- get.diameter(g)
  
You can issue R commands in the text area at the bottom part of the window. This includes almost everything you can do in R. One notable exception is that the help system doesn't work properly, so "?plot" and the like don't do anything. A few other functions have also been disabled because they don't work very well when R is run inside Java. These functions include ''q'', ''quit'', ''demo'', ''contributors'' and ''citation''.
+
and find the corresponding SIs with
 
+
You can browse the topic map in Wandora while having the R console open. This way you can select topics in the main Wandora window and then get references to those topics in the R environment (see next section).
+
 
+
== Using R with topic maps in Wandora ==
+
 
+
There are a couple of ways to access the topic map in R. Some of these rely heavily on the Java topic map API used in Wandora. You will need to call the Java methods of the topic objects. To find out more about the API look at the [http://www.wandora.org/wandora/docs/api/ javadocs] of Wandora. Mostly you will only need to look at the [http://www.wandora.org/wandora/docs/api/org/wandora/topicmap/Topic.html Topic] and [http://www.wandora.org/wandora/docs/api/org/wandora/topicmap/TopicMap.html TopicMap] classes and a few classes related to them. The Java methods are accessed with the $ indexing operator. For example if the variable ''t'' contains a topic you can get the base name of that with
+
 
+
  t$getBaseName()
+
 
+
The R environment in Wandora is initialized by running the ''/build/resources/conf/rinit.r'' file. This file defines some functions that make accessing the topic map slightly easier. You can get a reference to the topic map object itself with ''getTopicMap'' function in R. Alternatively you can get a list of all the topics in the topic map with ''getAllTopics'' or the currently selected topics in Wandora with ''getContextTopics''. For example, to get the base names of the currently selected topics in Wandora use
+
 
+
  lapply(getContextTopics(),function(t) t$getBaseName())
+
 
+
To plot something you first need to gather the data you want to plot. For example, you could plot a histogram that visualizes the amount of associations the topics have in the topic map. '''Note that you may have to copy this and other examples on this page one line at a time'''.
+
 
+
  ts<-getAllTopics() # get a list of all topics
+
  as<-lapply(ts,function(t) t$getAssociations()$size()) # as has the number of associations in each topic
+
  fac<-factor(unlist(as)) # make a factor from as
+
  plot(fac) # plot the factor
+
 
+
In Windows you should open the JavaGD graphics device before the last plot line. Do this with
+
 
+
  library("JavaGD") # loads the JavaGD, only need to do this once per R session
+
  JavaGD() # opens a JavaGD graphics window
+
 
+
There is also a function to set up a graph object that can be used with the ''igraph'' library. Use the ''makeGraphJava'' function and pass it a list of topic objects. It returns a graph object that can be plotted directly. See the [http://igraph.sourceforge.net/doc/R/00Index.html igraph R documentation] for more information about how to customize the plot. In particular, the plot function may be passed parameters relating to the layout of the graph. Before using ''makeGraphJava'' you have to load the igraph library. For example, to plot a network of the currently selected topics in a circular layout use the following. Of course, you must have selected some topics in Wandora for this to work.
+
 
+
  library("igraph") # loads the library, only need to do this once per R session
+
  plot(makeGraphJava(getContextTopics()),layout=layout.circle)
+
 
+
Again in Windows you need to remember to use the JavaGD library before plotting.
+
 
+
Instead of getting a list of selected topics you can get a list of selected associations with ''getContextAssociations''. After this you can get the players with ''getPlayers'' which takes as parameters a list of associations and a role, this can either be a topic object or a string giving the base name of the role. So for example you could select some associations and then get the topics playing the role ''value'' with
+
 
+
  getPlayers(getContextAssociations(),"value")
+
 
+
You can convert topics to strings or numbers using ''as.character'' or ''as.numeric'' respectively. These use the topic base name to do the conversion. If you want to use a variant name or get the data from an occurrence you will have to use the topic map API to get the desired value. But if you have your numeric data in the base name and can get it listed in a table in Wandora and then selected then you can get a simple vector of numbers with something like
+
 
+
  sapply( getPlayers(getContextAssociations(),"value"), as.numeric )
+
 
+
Note that the ''as.numeric'' is fairly lenient when converting topic base names to numbers and knows how to skip non numeric characters in the base name.
+
 
+
== Example ==
+
 
+
In this example we will use the [[SPARQL extractor]] to extract some data relating to the demographics of Helsinki. We will then make a map plot of the districts of Helsinki with colours indicating the total population in the district.
+
 
+
First we use the SPARQL extractor to extract the data. Go to ''File'' menu and select ''Extract/Other/SPARQL extractor''.
+
 
+
[[Image:Rexample1.png]]
+
 
+
We are going to use the [http://www.hri.fi/en/ Helsinki Region Infoshare] SPARQL end point so select the HRI tab. Then clear the query text area and replace it with following. It will get the population of all areas of Helsinki that themselves don't have any sub areas.
+
 
+
SELECT ?area ?poly ?value
+
WHERE {
+
  ?area rdf:type dimension:Alue;
+
      geo:polygon ?poly.
+
  ?item rdf:type scv:Item;
+
      rdf:value ?value;
+
      dimension:ikäryhmä ikäryhmä:Väestö_yhteensä;
+
      dimension:vuosi vuosi:_2009;
+
      dimension:yksikkö yksikkö:Henkilöä;
+
      dimension:alue ?area.
+
  OPTIONAL { ?narrower skos:broader ?area . }
+
  FILTER ( !BOUND(?narrower) )
+
}
+
  
Then click ''Extract''.
+
sis <- lapply(d,getSI)
  
[[Image:Rexample2.png]]
+
== Importing graphs from R to Wandora ==
  
After a few seconds you should get a message informing you that one result set was extracted.
+
Above we've used the rJava bridge to import topics from Wandora to R. We may also import graphs from R to Wandora as topics and associations. Auxiliary data may be specified to use as base names and occurrence data for the extracted topics.
  
Next find the result set topic on the left hand side (labelled 1 below) and double click it to open the topic. Then select all the rows in the result set by right clicking somewhere on the association table (labelled 2). In the context menu choose ''Select/Select all''. Then open the R console by clicking the button in the toolbar (labelled 3).
+
===Example #3: the bull graph===
  
[[Image:Rexample3.png]]
+
As a simple example we use igraph to generate a [http://en.wikipedia.org/wiki/Bull_graph bull graph] which we will import to Wandora. First we create the graph.
  
Then give the following commands to R. You can copy and paste all of it at once in the input box at the lower part of the window. If you aren't doing this in Windows then you can remove lines 5 and 6 (the JavaGD part). Press enter or the ''Evaluate'' button to evaluate the commands.
+
g <- graph.famous("bull");
  
associations<-getContextAssociations()
+
Next we attach a name to each vertex of the graph. The order in the list corresponds to the vertex IDs in the graph. We could just as well use the actual names specified in the graph if the vertices are already named.
polygons<-getPlayers(associations,"poly")
+
polygons<-lapply(polygons,function(p){extractPolygon(getDisplayName(p),reverse=TRUE)})
+
values<-sapply(getPlayers(associations,"value"),as.numeric)
+
library("JavaGD") # Remove these two lines if you
+
JavaGD()          # aren't doing this on Windows
+
plotPolygons(polygons,values)
+
  
[[Image:Rexample4.png]]
+
names <- c("first","second","third","fourth","fifth")
  
You should get an R plot window containing the final plot.
+
Now we can call <code>createTopics</code> in <code>rinit.r</code> with
  
[[Image:Rexample5.png]]
+
createTopics(g,baseNames=names)
  
We'll now go through the short R code line by line to see what it does.
+
We may further add to the created topics by specifying an array of occurrence data in the form
  
1 associations<-getContextAssociations()
+
{| class="wikitable"
2 polygons<-getPlayers(associations,"poly")
+
| Occ. type / Vertex ID
3 polygons<-lapply(polygons,function(p){extractPolygon(getDisplayName(p),reverse=TRUE)})
+
| 1
4 values<-sapply(getPlayers(associations,"value"),as.numeric)
+
| 2
5 library("JavaGD") # Remove these two lines if you
+
| 3
6 JavaGD()          # aren't doing this on Windows
+
| 4
7 plotPolygons(polygons,values)
+
| ...
 +
|-
 +
| Occurrence type 1
 +
| value 1
 +
| value 2
 +
| value 3
 +
| value 4
 +
| ...
 +
|-
 +
| Occurrence type 2
 +
| value 1
 +
| value 2
 +
| value 3
 +
| value 4
 +
| ...
 +
|-
 +
|...
 +
|...
 +
|...
 +
|...
 +
|...
 +
|...
 +
|}
  
The first line gets the associations we selected in Wandora. The second line gets all the players of those associations that play the role "poly" (strictly speaking "poly" is the base name of the role topic). The third line extracts the polygon data from those topics. This is done by applying the anonymous function to all topics in the ''polygons'' list. The ''extractPolygon'' function extracts the polygon data from a string which we get from the display name of the topic with the ''getDisplayName'' function. The ''reverse'' parameter to ''extractPolygon'' swaps x and y axes. This is done because latitude (the y axis) is first in the polygon data. The fourth line gets the population for each area. This is done by getting the player topics of role "value" and converting them to numeric. Lines five and six setup the JavaGD display device. Finally on line seven the polygons and their associated values are plotted with ''plotPolygons''.
+
As a trivial example we may add the base names as occurrences of type foo with
 +
<pre>
 +
occ<-list()
 +
occ[["foo"]]<-names
 +
createTopics(g,baseNames=names,occurrences=occ)
 +
</pre>
  
Note that all of ''getContextAssociations'', ''getPlayers'', ''extractPolygon'', ''getDisplayName'' and ''plotPolygons'' are defined in the ''rinit.r'' R script that is loaded when you first start the R console in Wandora. In addition to this, ''as.numeric'' is extended to handle topic objects there as well.
+
[[File:R_1.png|600px]]
  
== R language resources ==
+
A vertex of a graph imported into Wandora: the name 'second' is set as a base name as well as an occurrence of type foo
  
This short introduction to the R language integration in Wandora doesn't even try to teach you the R language. If you find the R language interesting and would like to know more, please refer to the available online [http://cran.r-project.org/manuals.html manuals] such as
 
  
* [http://cran.r-project.org/doc/manuals/R-intro.html An introduction to R]
+
[[File:R_visualization.png|600px]]
* [http://cran.r-project.org/doc/manuals/R-lang.html R language Definition]
+
  
Also, [http://cran.r-project.org/other-docs.html Contributed Documentation] page has an excellent list of R language resources.
+
The imported graph is here visualized with the [[D3 graph service module | D3 graph visualization]]
  
 
== See also ==
 
== See also ==
  
* [[Statistical analysis of Topic Maps in R]]
+
* [[R in Wandora]]

Revision as of 10:24, 15 July 2013

R may be used to compute graph statistics related to Topic Maps or other groups of topics in Wandora. The selected group of topics is represented in R as an igraph. The igraph library is thus a prerequisite for graph analysis of Topic Maps in R. Several use cases are presented below. Refer to the igraph documentation for further methods of statistical graph analysis.


Handling topics in R

The R interface is exposed through the R console found in the toolbar in Wandora. Scripts may also be run using the R topic panel in view -> Add topic panel -> R.

We may acquire the targeted set of topics from Wandora using methods of classes that have been exposed by the Java-R-bridge. For example, we may query for all the visible topics simply by calling getAllTopics(), defined in rinit.r, or take the more explicit route:

  1. Retrieve the Wandora Java object using getWandora() from org.wandora.application.Wandora.
  2. Retrieve the topics themselves using Java method calls and unwrap the returned iterator into a list suitable for further use.
wandora <- J("org.wandora.application.Wandora")$getWandora()
tm <- wandora$getTopicMap()
ts <- unwrapIterator(tm$getTopics())

An igraph object is next created with makeGraphJava where the helper class org.wandora.application.tools.r.RHelper is utilized for heavy lifting. A similar but slower function using R is implemented in makeGraph. The resulting object is next used for the actual statistical analysis.
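
To get a feel for the kind of object described here, the following sketch builds a comparable undirected igraph by hand, outside Wandora. The edge list is made up for illustration, and the sketch assumes igraph 1.0 or later, where make_graph, vcount and ecount are available. It is not a description of what RHelper does internally.

```r
library(igraph)

# Hypothetical data: four topics identified by running indices 1..4,
# and three binary associations given as pairs of vertex indices.
edges <- c(1, 2,
           2, 3,
           3, 4)

# Build an undirected graph; n fixes the number of vertices.
g <- make_graph(edges, n = 4, directed = FALSE)

vcount(g)  # number of vertices (topics)
ecount(g)  # number of edges (association pairs)
```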

Example #1: Calculate the graph diameter

We may now use R to replicate the functionality of the Topic Map diameter calculator provided in Wandora. In comparison this process should prove to be faster, less memory consuming and therefore more suitable for large datasets.

g <- makeGraphJava(ts)
d <- diameter(g)

A full example is written out in GetDiameter.r found among other examples in build/resources/r.
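
The diameter call itself can be tried without Wandora on a graph whose diameter is known in advance. The sketch below uses a ring generated with igraph's make_ring (assuming igraph 1.0 or later); the ring stands in for a real topic map graph.

```r
library(igraph)

# A ring of 10 vertices: the two farthest vertices are 5 steps apart.
g <- make_ring(10)

d <- diameter(g)
d  # 5
```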

Further statistics

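Once the graph has been constructed, igraph provides a multitude of statistics that can be computed from it. In addition to the diameter calculation introduced above, a few basic ones are listed here.

  • Eigenvector centrality and betweenness centrality using evcent(g) and betweenness(g)
  • Degree distribution via degree.distribution(g)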
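
As a small illustration of such statistics, the sketch below computes eigenvector centrality, betweenness centrality and the degree distribution on a star graph, where the results are easy to check by hand. It assumes igraph 1.0 or later and uses the newer function names eigen_centrality and degree_distribution; the star graph is a stand-in for a graph built from topics.

```r
library(igraph)

# Star graph: vertex 1 is the hub, vertices 2..5 are leaves.
g <- make_star(5, mode = "undirected")

ev <- eigen_centrality(g)$vector  # eigenvector centrality, scaled to max 1
bt <- betweenness(g)              # number of shortest paths through each vertex
dd <- degree_distribution(g)      # relative frequency of degrees 0..4

which.max(ev)  # 1, the hub
bt             # 6 0 0 0 0
dd             # 0.0 0.8 0.0 0.0 0.2
```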


Mapping vertices to topics

makeGraphJava(topics) transforms the Topic Map representation into a rudimentary undirected graph where topics are represented as vertices identified by a running index. Associations are in turn represented as a set of edges pairing the vertices together. After the transformation it is difficult to target a specific topic by its SI or other identifier in Wandora, since all topic data excluding associations is lost.

To preserve the mapping between topics and their vertices we may construct the map by hand and pass it to makeGraph(topics, indiceMap), defined in rinit.r. More specifically:

  1. Fetch all the topics we want in the graph. Here we fetch all topics contained in the Topic Map.
  2. Construct the index map, where each subject identifier string maps to an index in the graph.
  3. Get the string representation of a subject identifier of each topic. This is equivalent to t.getOneSubjectIdentifier().toString() in Java.
  4. Append the SI-index pair to the map. Indices range from 1 to length(ts).
  5. Call makeGraph with the index map so that our indices are used to construct the graph.
 ts <- getAllTopics()
 ind <- list()
 for(t in ts){
   si <- .jcall(t,"Lorg/wandora/topicmap/Locator;","getOneSubjectIdentifier")
   si <- .jcall(si,"Ljava/lang/String;","toString")
   ind [[si]]<-length(ind)+1
 }
 g <- makeGraph(ts,ind)
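
The map-building loop can be tried in plain R without a topic map. In the sketch below the subject identifiers are hypothetical strings standing in for the values returned by the Java calls; only the list handling is the same as above.

```r
# Hypothetical subject identifiers standing in for the Java topic objects.
sis <- c("http://example.org/si/a",
         "http://example.org/si/b",
         "http://example.org/si/c")

ind <- list()
for (si in sis) {
  # Each new SI gets the next free index, so indices run from 1 to length(sis).
  ind[[si]] <- length(ind) + 1
}

ind[["http://example.org/si/b"]]  # 2
```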

We may now select a topic in Wandora and import it to R with getContextTopics(). A detailed example follows.

  1. Get the selected topics from Wandora.
  2. Get the string representation of the SI and using it look up the index from the map constructed earlier.
 cts <- getContextTopics()
 cind <- list()
 for (ct in cts){
   si <- .jcall(ct, "Lorg/wandora/topicmap/Locator;","getOneSubjectIdentifier")
   si <- .jcall(si, "Ljava/lang/String;","toString")
   cind[[length(cind)+1]] <- ind[[si]]
 }

Having found the indices of vertices for the selected topics we can now compute statistics related to those vertices in the graph. For example we may compute the immediate neighborhood sizes of those topics with neighborhood.size(g,1,cind).
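
The neighbourhood computation can be checked on a small graph outside Wandora. The sketch below assumes igraph 1.0 or later, where ego_size is the newer name for neighborhood.size; the star graph and the selected indices are hypothetical.

```r
library(igraph)

g <- make_star(5, mode = "undirected")  # vertex 1 is the hub

# Hypothetical selection: the hub and one leaf.
cind <- c(1, 2)

# Order-1 neighbourhood sizes; the count includes the vertex itself.
sizes <- ego_size(g, order = 1, nodes = cind)
sizes  # 5 2
```

If the indices were collected into a list, as in the loop above, flattening them with unlist(cind) before the call may be necessary; this has not been verified against Wandora's rinit.r.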

A full example is again written out in GetContextTopicsNeighbours.r in build/resources/r. We may also define a reverse lookup for SIs in the following manner:

 getSI <- function(i){
   for(t in ts){
     si <- .jcall(t,"Lorg/wandora/topicmap/Locator;","getOneSubjectIdentifier")
     si <- .jcall(si,"Ljava/lang/String;","toString")
     if(ind[[si]] == i){
       return(si)
     }
   }
   return("")
 }
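
Since the index map already carries the SI strings as its names, a reverse lookup can also be sketched directly against the map, without calling back into Java for every topic. The function name getSIFromMap and the example identifiers below are hypothetical and not part of rinit.r.

```r
# Hypothetical index map of the same shape as ind above:
# names are SI strings, values are vertex indices.
ind <- list("http://example.org/si/a" = 1,
            "http://example.org/si/b" = 2,
            "http://example.org/si/c" = 3)

# Flatten the map into a named vector, then look the index up by value.
getSIFromMap <- function(i, ind) {
  idx <- unlist(ind)
  hit <- names(idx)[idx == i]
  if (length(hit) == 0) "" else hit[[1]]
}

getSIFromMap(2, ind)  # "http://example.org/si/b"
getSIFromMap(9, ind)  # ""
```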

Example #2: Find the topics in a community

This lookup may be utilized in finding topics for a subset of vertices from the graph. In this case we first compute communities for a set of topics. We then pick a community and find the topics in it.

Again, fetch all topics in Wandora.

ts <- getAllTopics()

Construct the vertex index to topic SI mapping discussed above.

 ind <- list()
 for(t in ts){
   si <- .jcall(t,"Lorg/wandora/topicmap/Locator;","getOneSubjectIdentifier")
   si <- .jcall(si,"Ljava/lang/String;","toString")
   ind[[si]]<-length(ind)+1
 }
 g <- makeGraph(ts,ind)

Get the communities of the graph g. Here we use the walktrap algorithm, which relies on random walks, to distinguish communities.

ns <- walktrap.community(g)

Get all the vertex IDs in the community with the ID 10:

 bigComVer <- list()
 mem <- membership(ns)
 for(v in seq_along(mem)){
   if(mem[[v]] == 10){
     bigComVer[[length(bigComVer)+1]] <- v
   }
 }

Finally find the SIs for the vertices we found above.

sis <- lapply(bigComVer,getSI)
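
The community steps can be rehearsed on a graph with an obvious community structure. The sketch below assumes igraph 1.0 or later and uses cluster_walktrap, the newer name for walktrap.community; the two joined cliques are hypothetical data. Instead of a fixed community ID it picks the community that vertex 1 belongs to, and a vectorized which() replaces the loop.

```r
library(igraph)

# Two 5-vertex cliques joined by a single edge (hypothetical data).
g <- disjoint_union(make_full_graph(5), make_full_graph(5))
g <- add_edges(g, c(5, 6))

ns  <- cluster_walktrap(g)
mem <- as.integer(membership(ns))  # community ID of every vertex

# Vertex IDs that share a community with vertex 1.
target  <- mem[1]
members <- which(mem == target)
members
```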

This approach may also be used with the diameter calculation to find the vertices on the longest of the shortest paths in the topic map. We find the vertex IDs with

d <- get.diameter(g)

and find the corresponding SIs with

sis <- lapply(d,getSI)
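
The same idea can be tried on a simple path graph, where the diameter path covers every vertex. The sketch assumes igraph 1.0 or later, where get_diameter is the newer name for get.diameter; the vertex names are hypothetical and stand in for SIs.

```r
library(igraph)

# A simple path 1 - 2 - 3 - 4 - 5 with hypothetical vertex names.
g <- make_ring(5, circular = FALSE)
vnames <- c("a", "b", "c", "d", "e")

# Vertex IDs along one longest shortest path.
path <- as.integer(get_diameter(g))

# Names of the vertices on that path.
vnames[path]
```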

Importing graphs from R to Wandora

Above we've used the rJava bridge to import topics from Wandora to R. We may also import graphs from R to Wandora as topics and associations. Auxiliary data may be specified to use as base names and occurrence data for the extracted topics.

Example #3: the bull graph

As a simple example we use igraph to generate a bull graph which we will import to Wandora. First we create the graph.

g <- graph.famous("bull");

Next we attach a name to each vertex of the graph. The order in the list corresponds to the vertex IDs in the graph. We could just as well use the actual names specified in the graph if the vertices are already named.

names <- c("first","second","third","fourth","fifth")

Now we can call createTopics in rinit.r with

createTopics(g,baseNames=names)

We may further add to the created topics by specifying an array of occurrence data in the form

Occ. type / Vertex ID | 1       | 2       | 3       | 4       | ...
Occurrence type 1     | value 1 | value 2 | value 3 | value 4 | ...
Occurrence type 2     | value 1 | value 2 | value 3 | value 4 | ...
...                   | ...     | ...     | ...     | ...     | ...

As a trivial example we may add the base names as occurrences of type foo with

 occ<-list()
 occ[["foo"]]<-names
 createTopics(g,baseNames=names,occurrences=occ)
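
Before handing data to createTopics it may help to check that every vector supplies exactly one value per vertex. The sketch below does this in plain igraph without calling createTopics, which needs the Wandora environment. It assumes igraph 1.0 or later, where make_graph("Bull") is the newer form of graph.famous("bull"). The second occurrence type, "degree", is hypothetical, and whether createTopics expects character values has not been verified.

```r
library(igraph)

g <- make_graph("Bull")
names <- c("first", "second", "third", "fourth", "fifth")

occ <- list()
occ[["foo"]] <- names                        # occurrence type used in the text
occ[["degree"]] <- as.character(degree(g))   # hypothetical second type

# Every vector must supply exactly one value per vertex.
stopifnot(length(names) == vcount(g))
stopifnot(all(lengths(occ) == vcount(g)))
```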

R 1.png

A vertex of a graph imported into Wandora: the name 'second' is set as a base name as well as an occurrence of type foo


R visualization.png

The imported graph is here visualized with the D3 graph visualization

See also

  • R in Wandora