Topic map conversion of WordNet
WordNet is a large lexical database for English, developed at the Cognitive Science Laboratory of Princeton University. This topic map conversion is based on the W3C's RDF version of WordNet 2.0.
Download WordNet topic map
There are two versions of the WordNet topic map available:
- Wandora project file with separate layers. This version is aimed at Wandora users who want to explore, filter, and modify WordNet.
- Single merged XTM dump. This version is a general topic map file that can be used in many topic map applications.
Usage in Wandora
The topic map version of WordNet contains over 100 000 topics and associations, and requires at least 2 GB of memory to work properly in Wandora. To give Wandora that much memory, start the application with bin/Wandora-huge.bat or adjust Java's memory settings in bin/Wandora.bat. Below is a screenshot of Wandora with WordNet's meeting topic open. Note the layer structure.
Conversion details
The topic map conversion of WordNet is based on the W3C's RDF version of WordNet. Slightly simplified, the conversion had the following steps:
- Import each RDF file of WordNet into Wandora as a separate layer. For each imported layer:
  - Manually convert RDF triples to topic map associations
  - Map the RDF subject and object to topic map roles
  - Manually fix certain subject identifiers of imported topics
- Create a lightweight topic hierarchy to connect the WordNet topics to Wandora's topic tree.
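The triple-to-association step above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure Wandora actually runs: the `Association` class, the `triple_to_association` function, the role identifiers, and the `example.org` synset identifiers are all hypothetical. Only the `hyponymOf` subject identifier is taken from the tables on this page.

```python
from dataclasses import dataclass

WN_SCHEMA = "http://www.w3.org/2006/03/wn/wn20/schema/"

# Hypothetical role identifiers; the real roles are not listed on this page.
ROLE_SUBJECT = "http://example.org/role/subject"
ROLE_OBJECT = "http://example.org/role/object"


@dataclass(frozen=True)
class Association:
    """A binary topic map association: a type plus (role, player) pairs."""
    type_si: str    # subject identifier of the association type
    players: tuple  # ((role_si, player_si), (role_si, player_si))


def triple_to_association(subject, predicate, obj,
                          subject_role=ROLE_SUBJECT,
                          object_role=ROLE_OBJECT):
    """Map one RDF triple to an association.

    The predicate becomes the association type; the subject and object
    become players of the two chosen roles.
    """
    return Association(
        type_si=predicate,
        players=((subject_role, subject), (object_role, obj)),
    )


# Hypothetical synset identifiers, used only for illustration.
triple = (
    "http://example.org/synset/a",
    WN_SCHEMA + "hyponymOf",
    "http://example.org/synset/b",
)

assoc = triple_to_association(*triple)
print(assoc.type_si)
for role, player in assoc.players:
    print(role, "->", player)
```

The role arguments are parameters because choosing roles per association type was the part of the conversion that needed a decision; a real conversion would pass type-specific roles instead of the generic defaults used here.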
I (akivela) was actually a little surprised at how easily the RDF version converted to a topic map. The most demanding step was deciding which roles to use in associations. The following sections describe the most important base names and subject identifiers of the topic map conversion.
Synsets
Synsets are class topics that collect all words under word categories. The categories follow the W3C's and WordNet's categories. Single words are instances of these class topics.
Base name | Subject identifiers |
AdjectiveSatelliteSynset (wordnet) | http://www.w3.org/2006/03/wn/wn20/schema/AdjectiveSatelliteSynset |
AdjectiveSynset (wordnet) | http://www.w3.org/2006/03/wn/wn20/schema/AdjectiveSynset |
AdverbSynset (wordnet) | http://www.w3.org/2006/03/wn/wn20/schema/AdverbSynset |
FullSynset (wordnet) | http://www.wandora.net/wordnet/synset |
NounSynset (wordnet) | http://www.w3.org/2006/03/wn/wn20/schema/NounSynset |
VerbSynset (wordnet) | http://www.w3.org/2006/03/wn/wn20/schema/VerbSynset |
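The class and instance relation between a synset class and a word topic can be pictured with a small Python sketch. The `Topic` class below is a hypothetical stand-in, not Wandora's API, and the subject identifier and base name of the `meeting` topic are made up for illustration. Only the NounSynset and VerbSynset base names and subject identifiers come from the table above.

```python
from dataclasses import dataclass, field

WN_SCHEMA = "http://www.w3.org/2006/03/wn/wn20/schema/"


@dataclass
class Topic:
    """Minimal stand-in for a topic: a name, an identifier, and its types."""
    base_name: str
    subject_identifier: str
    types: list = field(default_factory=list)

    def is_instance_of(self, cls):
        # A topic is an instance of every topic listed as one of its types.
        return cls in self.types


# Class topics, with values taken from the table above.
noun_synset = Topic("NounSynset (wordnet)", WN_SCHEMA + "NounSynset")
verb_synset = Topic("VerbSynset (wordnet)", WN_SCHEMA + "VerbSynset")

# Hypothetical word topic; its name and identifier are illustrative only.
meeting = Topic(
    "meeting (illustrative)",
    "http://example.org/wordnet/meeting",
    types=[noun_synset],
)

print(meeting.is_instance_of(noun_synset))
print(meeting.is_instance_of(verb_synset))
```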
Association types
Base name | Subject identifiers |
Attribute (wordnet) | http://www.wandora.net/wordnet/type/attribute |
Causes (wordnet) | http://www.wandora.net/wordnet/type/causes |
ClassifiedByRegion (wordnet) | http://www.wandora.net/wordnet/type/classifiedByRegion |
ClassifiedByTopic (wordnet) | http://www.wandora.net/wordnet/type/classifiedByTopic |
ClassifiedByUsage (wordnet) | http://www.w3.org/2006/03/wn/wn20/schema/classifiedByUsage |
Entails (wordnet) | http://www.w3.org/2006/03/wn/wn20/schema/entails |
HyponymOf (wordnet) | http://www.w3.org/2006/03/wn/wn20/schema/hyponymOf |
MemberMeronymOf (wordnet) | http://www.wandora.net/wordnet/type/memberMeronymOf |
PartMeronynOf (wordnet) | http://www.wandora.net/wordnet/type/partMeronymOf |
SameVerbGroup (wordnet) | http://www.wandora.net/wordnet/type/sameVerbGroupAs |
SimilarTo (wordnet) | http://www.wandora.net/wordnet/type/similarTo |
Association roles
Occurrence types
Base name | Subject identifiers |
synsetId (wordnet) | http://www.w3.org/2006/03/wn/wn20/schema/synsetId |