Finding a topic

From WandoraWiki

Revision as of 13:20, 28 November 2009

Finder example.gif


Finder is used to locate and open topics. The Finder tab is located beside the Topics tab, as shown above.

Finder is a simple free-text search that tries to locate the given search word in topics. You can search with any topic element or combination of elements. The search result appears below the search field. Double-clicking a topic in the search result opens it in the topic panel. Right-clicking a topic opens a context menu with a large number of topic tools.

Search words used in Finder can contain Java-specific regular expression characters such as the dot. Finder doesn't restrict search word length. As an extreme example, you could search with a single dot as the search word, which returns every topic in Wandora. Viewing very large result sets is time-consuming and may cause out-of-memory errors (Java's OutOfMemoryError) in Wandora. This is especially true when you are accessing database topic maps.
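
The effect of the dot can be illustrated with plain java.util.regex, outside Wandora. The following is a minimal sketch, not Wandora's own code: the class FinderRegexSketch, the method search, and the sample topic names are hypothetical, and it assumes substring-style matching (Matcher.find), which may differ from what Finder actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class FinderRegexSketch {

    // Returns the names in which the search word, read as a regular expression, occurs.
    // Assumption: substring-style matching via find(); Wandora's real matching may differ.
    static List<String> search(List<String> topicNames, String searchWord) {
        Pattern pattern = Pattern.compile(searchWord);
        List<String> hits = new ArrayList<>();
        for (String name : topicNames) {
            if (pattern.matcher(name).find()) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical topic names used only for illustration.
        List<String> names = Arrays.asList("Wandora", "Topic map", "v1.0", "v100");

        // A single dot matches any character, so every non-empty name is a hit.
        System.out.println(search(names, "."));

        // The dot in "1.0" is a wildcard, so both "v1.0" and "v100" match.
        System.out.println(search(names, "1.0"));

        // Pattern.quote makes the whole search word literal: only "v1.0" matches.
        System.out.println(search(names, Pattern.quote("1.0")));
    }
}
```

To search for a literal dot, the character can also be escaped by hand, for example by typing 1\.0 as the search word.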

Find

In addition to the Finder tab, topics can be searched by selecting Edit > Find or pressing CTRL-F in Wandora. This opens the Find dialog window, as shown below:


Find search tab.gif


Typing a word into the search field and pressing OK starts the search. The search word is interpreted as a regular expression, which allows rather complicated searches. Search results are shown in a separate dialog window, as shown below:


Find results.gif


To open any topic in the search results, double-click the topic name, or right-click it and select Open topic. The search result dialog is closed by pressing the Close button. The search dialog is restored by clicking Again. In addition to traditional regular expression searches, Wandora also features a string similarity search.

String similarity

The String similarity tab is used to perform similarity searches on the topic map. String similarity lets the Wandora user find topics using strings that only resemble the matched string. Wandora uses Sam Chapman's open source SimMetrics library to calculate string similarities. Below is a screenshot of Wandora's String similarity tab.


Find similarity tab.gif


Available similarity types are

  • Levenshtein distance
  • Needleman-Wunsch distance
  • Smith-Waterman distance
  • Block distance
  • Monge Elkan distance
  • Jaro distance
  • Jaro Winkler
  • SoundEx distance

Similarity types are discussed in the SimMetrics documentation. In addition to the similarity type, the user has to choose a similarity threshold. The threshold defines the limit that the measured similarity must exceed for the tested string to count as similar and be included in the search results. If the threshold is near 100, strings must be very similar, i.e. only minimal differences are allowed. If the threshold is near 0, the tested string can be very different and still be included in the result set.
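
The role of the threshold can be sketched with a hand-written Levenshtein distance. This is only an illustration and does not use the SimMetrics API: the class SimilarityThresholdSketch, its method names, the sample topic names, and the normalization of the distance to a 0..100 scale by the longer string's length are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SimilarityThresholdSketch {

    // Classic dynamic-programming Levenshtein edit distance (two-row variant).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity on a 0..100 scale.
    // Assumption: distance is normalized by the length of the longer string.
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 100.0;
        }
        return 100.0 * (1.0 - (double) levenshtein(a, b) / longest);
    }

    // Keeps only the names whose similarity to the query exceeds the threshold.
    static List<String> search(List<String> names, String query, double threshold) {
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (similarity(name, query) > threshold) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical topic names used only for illustration.
        List<String> names = Arrays.asList("Wandora", "Wandera", "Pandora", "Topic map");
        System.out.println(search(names, "Wandora", 80)); // [Wandora, Wandera, Pandora]
        System.out.println(search(names, "Wandora", 90)); // [Wandora]
    }
}
```

With this normalization, a one-character difference in a seven-letter name gives a similarity of roughly 86, so it passes a threshold of 80 but not one of 90.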

Some similarity measures use the Gap cost and Tokenizer settings. The former specifies the penalty caused by a gap (space character) in a word. The latter is used to split text into words.

The Difference instead similarity option changes the similarity measure into a difference measure. With this option you are searching for strings that are maximally different from the given string.

Similarity search results are shown in a separate dialog window. To open any topic in the search results, double-click the topic name in the results dialog. The results dialog is closed by clicking Close. If the search results are unsatisfying, the user can click the Again button and fine-tune the similarity search settings.

Query scripts

The Query script tab is used to write and run queries against Wandora's topic maps. A query is a small script that returns topics and string data distilled from the topic map. Wandora uses a non-standard query language that resembles functional languages such as LISP. Wandora's query language has a tutorial page of its own.


Find query tab.gif