Finding a topic

From WandoraWiki
Latest revision as of 20:25, 16 July 2015

[Image: Finder example.gif]


Finder is used to locate and open topics. The Finder tab is located beside the Topics tab, as shown above.

Finder is a simple free-text search. It tries to locate the given search word in topics. You can search with any topic element or combination of elements. The search result appears below the search field. Double-clicking a topic in the search result opens the topic in the topic panel. Right-clicking a topic opens a context menu with a large number of topic tools.

Search words used in Finder can contain Java-specific regular expression characters such as the dot. Finder doesn't restrict search word length. As an extreme example, you could search with a single dot, and the search would return every topic in Wandora. Viewing very large result sets is time-consuming and may cause out-of-memory errors (Java's OutOfMemoryError) in Wandora. This is especially true when you are accessing database topic maps.
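The following is a minimal Java sketch of how search words behave as regular expressions. It assumes, for illustration only, that a topic matches when the expression is found anywhere in its name. The class and method names are hypothetical, and this is not Wandora's actual Finder code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch only; not Wandora's Finder implementation.
class FinderRegexSketch {

    // Returns the names in which the regular expression is found anywhere.
    // Assumption: "found anywhere" (Matcher.find) is the matching rule.
    static List<String> search(String regex, List<String> topicNames) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String name : topicNames) {
            if (pattern.matcher(name).find()) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Wandora", "Topic map", "Finder");
        System.out.println(search("and", names));    // prints [Wandora]
        System.out.println(search(".", names));      // prints [Wandora, Topic map, Finder]
        System.out.println(search("^F.*r$", names)); // prints [Finder]
    }
}
```

Because a single dot matches any character, it matches every non-empty name, which is why such a query can produce a very large result set.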

Search

In addition to the Finder tab, topics can be searched by selecting Edit > Search... or by pressing CTRL-F. Both actions open the Search and query dialog window:


[Image: Find search tab.gif]


To make a search, write your query in the text field and press the Search button. The search query is interpreted as a regular expression, which allows rather complicated searches. Search results are shown below the search field, like this:


[Image: Find results.gif]


To open any topic in the search table, double-click the topic name, or right-click the topic name and select the menu option Open topic. In addition to regular expression searches, Wandora also features string similarity search.

String similarity

The Similar tab is used for searching similar topics. The user can search for topics whose names only resemble the query string. In other words, it is sufficient that the search query matches the topic name only partly. Wandora uses Sam Chapman's SimMetrics open source library to calculate string similarities. Below is a screen capture of Wandora's Similarity tab.


[Image: Find similarity tab.gif]


Available similarity types are:

  • Levenshtein distance
  • Needleman-Wunsch distance
  • Smith-Waterman distance
  • Block distance
  • Monge Elkan distance
  • Jaro distance
  • Jaro Winkler
  • SoundEx distance

In addition to the similarity type, the user can set the threshold for required similarity. The similarity threshold is a value between 0 and 100. If the threshold is close to 100, the compared strings must be very similar and only minimal differences are allowed. If the threshold is close to 0, the compared strings can be very different and are still considered similar.
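As a rough illustration of how a threshold between 0 and 100 can filter matches, here is a Java sketch built on a plain Levenshtein distance. The normalization (distance divided by the longer string's length), the inclusive comparison, and all class and method names are assumptions made for this example. Wandora delegates the calculation to SimMetrics, whose exact formulas may differ.

```java
// Illustrative sketch only; not the SimMetrics or Wandora implementation.
class SimilarityThresholdSketch {

    // Classic Levenshtein edit distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int substitution = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + substitution);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 100 means identical, 0 means nothing in common.
    static double similarityPercent(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 100.0;
        }
        return 100.0 * (1.0 - (double) levenshtein(a, b) / longest);
    }

    // Assumption: a string is accepted when its similarity reaches the threshold.
    static boolean isSimilar(String a, String b, double threshold) {
        return similarityPercent(a, b) >= threshold;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));   // prints 3
        System.out.println(isSimilar("kitten", "sitting", 50)); // prints true
        System.out.println(isSimilar("kitten", "sitting", 90)); // prints false
    }
}
```

With this normalization, "kitten" and "sitting" score roughly 57, so they pass a threshold of 50 but fail a threshold of 90.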

Some similarity measures use the Gap cost and Tokenizer settings. The first specifies the penalty caused by a gap in a word (usually a space character). The latter is used to split text into words.
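To give a feel for what a gap cost does in an alignment-based measure, here is a simplified Needleman-Wunsch style distance in Java. It treats a gap as a position where one string has a character and the other has none, and charges a fixed mismatch cost of 1. Both choices, and all names, are assumptions for illustration and do not reproduce SimMetrics' scoring.

```java
// Illustrative sketch only; a simplified alignment cost, not SimMetrics' code.
class GapCostSketch {

    // Minimum total cost of aligning a with b.
    // A matching pair costs 0, a mismatching pair costs 1 (assumed),
    // and each gap costs gapCost.
    static double alignmentCost(String a, String b, double gapCost) {
        double[][] d = new double[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i * gapCost;
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j * gapCost;
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                double pair = a.charAt(i - 1) == b.charAt(j - 1) ? 0.0 : 1.0;
                double aligned = d[i - 1][j - 1] + pair;
                double gapInB = d[i - 1][j] + gapCost;
                double gapInA = d[i][j - 1] + gapCost;
                d[i][j] = Math.min(aligned, Math.min(gapInB, gapInA));
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(alignmentCost("cat", "cart", 2.0)); // prints 2.0
        System.out.println(alignmentCost("cat", "cart", 0.5)); // prints 0.5
        System.out.println(alignmentCost("cat", "cut", 2.0));  // prints 1.0
    }
}
```

A higher gap cost makes strings of different lengths look less alike, while a lower one tolerates inserted or missing characters.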

The Difference instead similarity option changes the similarity measure to a difference measure. If it is selected, Wandora searches for strings that are maximally different from the given string.

Wandora shows similarity search results below the search settings. To open any topic in the result table, double-click a topic name.

Query scripts

The Query tab is used to write and run queries. A query is a small script written in JavaScript that uses query directives from the Java package org.wandora.query2. Directives can be chained into sequential and parallel pipelines. When a query is executed, it takes topics and associations as input and generates a result containing topics and literals. Results are shown below the Run query button. Query scripts can be stored and restored with the selector at the top and the Add button. Stored scripts are written to Wandora's options file. Read the query language documentation for more information about the possibilities of Wandora's query language.
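The idea of chaining directives sequentially and in parallel can be pictured with the conceptual Java sketch below. Everything in it is hypothetical: the Directive interface, the sequential and parallel helpers, and the use of plain strings in place of topics are invented for illustration and are not the org.wandora.query2 API. Real queries are written in JavaScript against the actual directives described in the query language documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Conceptual sketch only; these types are NOT part of org.wandora.query2.
class DirectivePipelineSketch {

    // Hypothetical directive: turns a list of input values into result values.
    interface Directive extends Function<List<String>, List<String>> {}

    // Sequential pipeline: the output of the first directive feeds the second.
    static Directive sequential(Directive first, Directive second) {
        return input -> second.apply(first.apply(input));
    }

    // Parallel pipeline: both directives see the same input.
    // Assumption: their results are simply concatenated.
    static Directive parallel(Directive left, Directive right) {
        return input -> {
            List<String> out = new ArrayList<>(left.apply(input));
            out.addAll(right.apply(input));
            return out;
        };
    }

    public static void main(String[] args) {
        // Two toy directives standing in for real query steps.
        Directive upper = in -> {
            List<String> out = new ArrayList<>();
            for (String s : in) out.add(s.toUpperCase());
            return out;
        };
        Directive mark = in -> {
            List<String> out = new ArrayList<>();
            for (String s : in) out.add(s + "!");
            return out;
        };
        List<String> input = Arrays.asList("topic");
        System.out.println(sequential(upper, mark).apply(input)); // prints [TOPIC!]
        System.out.println(parallel(upper, mark).apply(input));   // prints [TOPIC, topic!]
    }
}
```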


[Image: Find query tab.gif]

TMQL

Wandora supports Topic Map Query Language (TMQL), using the TMQL4J engine. TMQL scripts can be written and executed in the Search and query dialog by selecting the TMQL tab. The TMQL tab has a text area for the TMQL script. A TMQL script is executed by pressing the Run query button. TMQL scripts can be stored and restored with the selector at the top and the Add button. Stored scripts are written to Wandora's options file.


[Image: Finder tmql tab.gif]