Introduction to Layered Topic Maps

From WandoraWiki

Merging of topics is a very powerful feature of Topic Maps. However, it also introduces some problems. Wandora uses a layered topic map paradigm to counter some of these problems and to introduce new ways of using topic maps. Unfortunately, the layer paradigm introduces some new problems of its own, which are also discussed here.


Problems with merging

Suppose we have two topic maps that both describe artists but were created from different points of view. One deals mostly with biographical information about the artists and the other has details about the works the artists have created. They overlap mostly, but not completely: most of the artists in one topic map are also present in the other, but in some cases there is no biographical information about an artist who has works in the other topic map, or vice versa. Suppose also that both topic maps contain the date of birth and date of death of most artists. In some cases these might not have been known and thus are not in the topic map.

We can merge these two topic maps to form a topic map with all the combined information. After the merge, however, it is impossible to determine where a given piece of information originated. Any artist might have been in either or both of the topic maps, just as a date of birth might have been in either or both. Suppose we find a mistake in the date of birth of some artist and want to fix it. There is no way to know which of the original topic maps had the wrong date without inspecting the original topic maps. With associations and a greater number of merged topic maps, things become even more complex.
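The loss of origin information can be illustrated with a minimal sketch. It assumes a drastically simplified model in which a topic map is a plain dictionary from a subject to its properties; the names `biographies`, `works`, `merge_flat` and `sources_of` are hypothetical and are not part of Wandora.

```python
# Hypothetical, simplified topic maps: subject -> {property: value}.
biographies = {"artist:a": {"born": "1850"}, "artist:b": {"born": "1901"}}
works = {"artist:a": {"born": "1851", "work": "Painting 1"}}


def merge_flat(*topic_maps):
    """Merge maps into one; values are pooled and their origin is not recorded."""
    merged = {}
    for topic_map in topic_maps:
        for subject, properties in topic_map.items():
            for key, value in properties.items():
                merged.setdefault(subject, {}).setdefault(key, set()).add(value)
    return merged


def sources_of(named_maps, subject, key, value):
    """Find which of the separately kept maps states the given value."""
    return [
        name
        for name, topic_map in named_maps.items()
        if topic_map.get(subject, {}).get(key) == value
    ]


flat = merge_flat(biographies, works)
# The merged topic holds both dates, but nothing says which map each came from.
print(flat["artist:a"]["born"])

# Keeping the maps separate makes the question answerable.
kept_separate = {"biographies": biographies, "works": works}
print(sources_of(kept_separate, "artist:a", "born", "1851"))
```

In the merged result the two birth dates sit side by side with no trace of their source, whereas the separately kept maps can still be queried for the origin of a value.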

The same problem arises when you want to use a third party topic map that is periodically updated. For example, you might have a topic map about your record collection merged with a third party topic map that contains basic information about a large number of records. When a new version of this third party topic map is released, it is very difficult to update your personal topic map with the changes in the third party topic map.

In many cases it is also useful to edit small topic maps that focus on a certain aspect but to view the topic map with full information from many different sources. Consider the previous record collection example. The third party topic map will most likely be extremely large compared to the topic map of your personal record collection. It would be desirable to maintain your personal record collection topic map separately from the entire record database. Merging large topic maps can also be computationally expensive and take a long time to complete. Thus it may not be possible to perform a large merge operation after every small change to the original topic maps.

Layered topic maps in Wandora

Wandora uses a layered topic map paradigm. Topic maps are organized in layers, each layer containing one topic map. Each topic map can be made visible or invisible. The topic map that the user sees is the merged topic map of all visible topic maps. Internally, however, Wandora does not perform a complete merge operation at any time, so switching layer visibility is a relatively fast operation. Only the topics visible on the current page are merged. Because changing layer visibility is fast, the user can easily focus on one topic map or view the entire merged information.
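The idea of merging only what is being viewed can be sketched as follows. This is a simplified illustration, not Wandora's implementation: it assumes that topics merge only when they share a subject identifier, and the classes `Topic` and `Layer` and the function `collect_merging_topics` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Topic:
    subject_identifiers: set
    base_name: Optional[str] = None


@dataclass
class Layer:
    name: str
    topics: list
    visible: bool = True


def collect_merging_topics(layers, subject_identifier):
    """Collect (layer name, topic) pairs from visible layers that merge with one topic.

    Only the requested topic is resolved; no layer is merged as a whole.
    """
    known = {subject_identifier}
    found = []
    changed = True
    while changed:  # repeat until no new identifiers pull in further topics
        changed = False
        for layer in layers:
            if not layer.visible:
                continue
            for topic in layer.topics:
                if any(seen is topic for _, seen in found):
                    continue
                if topic.subject_identifiers & known:
                    found.append((layer.name, topic))
                    known |= topic.subject_identifiers
                    changed = True
    return found


layers = [
    Layer("own", [Topic({"si:a"}, "My record")]),
    Layer("third party", [Topic({"si:a", "si:b"}, "Record")]),
    Layer("hidden", [Topic({"si:b"})], visible=False),
]

visible_now = [name for name, _ in collect_merging_topics(layers, "si:a")]

# Toggling visibility is just a flag change; the next lookup sees the new layer.
layers[2].visible = True
visible_after = [name for name, _ in collect_merging_topics(layers, "si:a")]
```

Because nothing is precomputed across whole layers, changing `visible` costs nothing by itself; the work happens only when a topic is actually looked up.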

The order of layers is used to decide which piece of information is visible when there are different values for some property of a topic. In the reduced topic maps that Wandora uses, a topic can have only one base name, only one subject locator and only one occurrence for each scope. If two topics that will be merged have different values for one of these properties, we have to decide which of the possible values is actually used in the merged topic. The value from the topmost layer is used. If there are several different values in the topmost layer, one of them is chosen arbitrarily. Note that we wouldn't need to do this if Wandora used complete standard topic maps, because in those a topic may have several base names, subject locators and occurrences. We could simply have them all in the merged topic.
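The resolution rule described above can be sketched in a few lines. The function name `resolve_value` and the input shape (a list of layer name and candidate values, ordered from the topmost layer down) are assumptions made for illustration. The sketch takes the first value in a layer as the "arbitrary" choice.

```python
def resolve_value(values_by_layer):
    """Pick the value shown in the merged topic.

    values_by_layer: list of (layer_name, [values]) ordered topmost layer first.
    Returns (layer_name, value) from the topmost layer that has any value,
    or None if no layer provides one.
    """
    for layer_name, values in values_by_layer:
        if values:
            # Several values in the same layer: the choice is arbitrary.
            # Taking the first one is an assumption of this sketch.
            return layer_name, values[0]
    return None


base_names = [
    ("top", []),                      # topic has no base name in this layer
    ("middle", ["Name A", "Name B"]),  # two merging topics with different names
    ("bottom", ["Name C"]),
]
chosen = resolve_value(base_names)
```

Here `chosen` comes from the `middle` layer, since it is the highest layer that provides a base name at all; the value in `bottom` is hidden by it.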

Each layer contains one topic map. You can use different topic map implementations in each layer. For example, you can use a topic map stored in a database in one layer and an in-memory implementation in another. It is also possible to use a layered topic map as a layer, which makes it possible to organize layers into several layer groups. This results in a tree structure where layered topic maps resemble folders in a file system.

At all times, one of the layers is selected. All edit operations in the topic map are done in this selected layer. This may cause implicit duplication of topics when the visible topic is not in the selected layer, which is discussed in more detail below. If you use layered topic maps in layers, the choice of selected layer is slightly more complicated. All edit operations are always done in a leaf layer, that is, a layer that is not itself a layered topic map. Hence, if you select a layer that is a layered topic map, all edit operations will actually be done in some child layer of the selected layer. This child is the last selected layer under the layered topic map, and it is highlighted in the GUI with a slightly lighter colour than the selected layer.
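The way a selection is resolved down to a leaf layer can be sketched as a walk through the layer tree. The class `LayerNode`, its `last_selected` field and the function `edit_target` are hypothetical. Falling back to the first child when nothing has been selected yet is an assumption of this sketch, not documented Wandora behaviour.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LayerNode:
    name: str
    children: list = field(default_factory=list)   # non-empty for layered topic maps
    last_selected: Optional["LayerNode"] = None     # last selected child, if any


def edit_target(layer):
    """Follow last-selected children until a leaf layer is reached."""
    while layer.children:
        # Assumption: with no earlier selection, use the first child.
        layer = layer.last_selected or layer.children[0]
    return layer


database_layer = LayerNode("database layer")
memory_layer = LayerNode("memory layer")
group = LayerNode(
    "group",
    children=[database_layer, memory_layer],
    last_selected=memory_layer,
)
root = LayerNode("root", children=[group], last_selected=group)

target = edit_target(root)
```

Selecting `root` or `group` therefore ends up editing `memory layer`, while selecting a leaf edits that leaf directly.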

Note that the layers in Wandora resemble the layer paradigm used in many image manipulation programs. Just like Wandora, these programs allow the user to switch layer visibility, and all edit operations are done in the layer the user has selected. Likewise, topmost layers override information from the layers below by drawing over them.

Applications of topic map merging and layer paradigm

Combining information

Each layer can be a local topic map stored in memory or in a database, or it can be a remote topic map. Having several local topic maps allows you to keep different kinds of information separate while still being able to inspect the combined information easily.

You can also combine your own topic maps with topic maps provided by a third party. The third party topic map may be provided as an online resource that can be used directly in Wandora, or you may have downloaded it as an XTM file to use with your own topic map. Whichever is the case, you probably want to keep the third party topic map separate from your own topic map.

If the third party topic map is updated, it is easy to update just that one layer. This might even be done automatically if you use it as an online resource. Had you actually merged the third party topic map with your own topic map into a single file, it would be very hard to find out which parts of the file need to be updated. Simply merging the new version again might leave deleted topics or erroneous information in the topic map.

Adapter layers

The merging rules can be used to make an adapter topic map that combines two different topic maps. This is needed when two or more topic maps that have overlapping information but use different vocabularies and subject identifiers need to be merged. Because of the different vocabularies and subject identifiers, topics that represent the same concepts would not be merged. An adapter topic map, in a sense, translates the identifiers back and forth. This is done by creating topics that contain identifiers from both topic maps. Topics from both topic maps then merge with the topics in the adapter topic map, and thus the topics in the two original topic maps are merged together.

[Figure: Layered adapter example]

The middle circle represents the adapter topic, which has two subject identifiers, one for each of the topics to be combined. This causes the different topics in the other two layers to be combined with the adapter topic, and thus all three are combined. The adapter topic doesn't contain any other information, so it does not interfere in any way with the information contained in either of the other two topic maps.
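The effect of an adapter topic can be demonstrated with a small sketch that groups topics by shared subject identifiers. The function `merge_groups`, the topic names and the example URLs are hypothetical, and the sketch only considers subject identifiers as a reason to merge.

```python
def merge_groups(topics):
    """Group topic names that merge because they share a subject identifier.

    topics: dict mapping a topic name to its set of subject identifiers.
    Returns a list of sets of topic names, one set per merged topic.
    """
    groups = []  # list of (names, identifiers); identifier sets stay disjoint
    for name, identifiers in topics.items():
        names = {name}
        ids = set(identifiers)
        remaining = []
        for group_names, group_ids in groups:
            if group_ids & ids:
                # Shared identifier: fold the existing group into the new one.
                names |= group_names
                ids |= group_ids
            else:
                remaining.append((group_names, group_ids))
        remaining.append((names, ids))
        groups = remaining
    return [names for names, _ in groups]


# Two layers describe the same artist with different identifiers.
without_adapter = {
    "layer1:artist": {"http://example.org/one/artist"},
    "layer2:painter": {"http://example.org/two/painter"},
}

# The adapter topic carries both identifiers and nothing else.
with_adapter = dict(without_adapter)
with_adapter["adapter:artist"] = {
    "http://example.org/one/artist",
    "http://example.org/two/painter",
}

separate = merge_groups(without_adapter)
combined = merge_groups(with_adapter)
```

Without the adapter the two topics stay apart; adding the adapter topic yields a single group containing all three topics.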

Implicit duplication of information

All editing in Wandora is done in the layer that is currently selected. However, this layer might not contain some of the topics required to perform the edit operation. For example, you may try to change the base name of a topic that is not present in the selected layer. A base name cannot exist without a topic, so the topic must first be duplicated in the selected layer. Similar situations may arise when dealing with associations and occurrences: association and occurrence types, association roles and players may not exist in the selected layer.

When topics are duplicated this way, only minimal versions of the topics are created in the selected topic map. This means that an empty topic with nothing but a single subject identifier is created. The one subject identifier is enough to make sure that the new base name or association uses the desired topic but at the same time interferes with the information in the other topic maps as little as possible. The needed topic may have several subject identifiers in which case one of these is chosen arbitrarily.
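A minimal sketch of this stub creation is shown below. It assumes topics are plain dictionaries and that matching is done on subject identifiers only; `ensure_topic_in_layer` is a hypothetical helper. For reproducibility the sketch picks the alphabetically first identifier, whereas the text only says the choice is arbitrary.

```python
def ensure_topic_in_layer(selected_layer_topics, visible_identifiers):
    """Return the selected layer's version of a visible topic, creating a stub if needed.

    selected_layer_topics: list of topic dicts in the selected layer (modified in place).
    visible_identifiers: set of all subject identifiers of the visible (merged) topic.
    """
    for topic in selected_layer_topics:
        if topic["subject_identifiers"] & visible_identifiers:
            return topic  # the topic already exists in the selected layer

    # Minimal stub: one subject identifier and no other information.
    chosen = sorted(visible_identifiers)[0]  # assumption: any identifier would do
    stub = {"subject_identifiers": {chosen}, "base_name": None}
    selected_layer_topics.append(stub)
    return stub


selected_layer = []  # the selected layer starts out empty
visible = {"http://example.org/b", "http://example.org/a"}

stub = ensure_topic_in_layer(selected_layer, visible)
stub["base_name"] = "Edited name"  # the edit lands on the stub

# A second edit of the same visible topic reuses the stub instead of creating another.
again = ensure_topic_in_layer(selected_layer, visible)
```

The stub merges with the visible topic through its single identifier, so the new base name shows up on the merged topic while the other layers stay untouched.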

If the user accidentally edits the wrong layer, implicit duplication may result in unnecessary topic stubs being created in topic maps. These stubs may then clutter the topic map. It is therefore important that the user has the correct layer selected at all times. Making layers read-only is one way to make sure that they are not accidentally edited.

Ambiguity of operations

The visible topic may be a combination of several topics in the selected layer. This often makes some operations ambiguous. For example, when changing a base name, we would normally choose the version of the visible topic in the selected layer or, if no such topic exists, duplicate it implicitly. However, there may be several such topics. Currently, these cases are resolved by simply using one of the topics.

One might think that performing the operation on all possible topics would be more intuitive. However, in many cases this would result in loss of information. For example, when editing a base name or adding a subject identifier, performing the operation on all topics would give them all the same base name or subject identifier. The topics would then be merged, which is probably not what was intended.
