Finding a topic

From WandoraWiki
Revision as of 16:58, 30 January 2009 by Akivela (Talk | contribs)

Finder example.gif


Finder is used to locate and open topics. It is located beside the Topics tab, as shown above.

Finder is a simple free-text search that tries to locate the given search word in topics. You can search by any topic element or combination of elements. The search results appear below the search field. Double-clicking a topic in the results opens it in the topic panel. Right-clicking a topic opens a context menu with a large number of topic tools.

Search words used in Finder can contain Java regular expression characters such as the dot. Finder doesn't restrict the length of the search word. As an extreme example, you could search with a single dot as the search word, which returns every topic in Wandora. Viewing very large result sets is time-consuming and may cause an OutOfMemoryError in Wandora.
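The effect of regular expression characters in a search word can be sketched in plain Java. This is only an illustration, not Wandora's actual search code: the class name FinderRegexSketch, the search method, and the sample topic names are hypothetical, and it assumes the search word is matched anywhere inside a name, with case sensitivity left at the Java default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class FinderRegexSketch {

    // Returns the names in which the search word, read as a Java regular
    // expression, matches somewhere (assumption: partial matches count).
    static List<String> search(String searchWord, List<String> topicNames) {
        Pattern pattern = Pattern.compile(searchWord);
        List<String> hits = new ArrayList<>();
        for (String name : topicNames) {
            if (pattern.matcher(name).find()) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical topic names, used only for this illustration.
        List<String> names = Arrays.asList("Wandora", "Topic map", "Finder");

        // A single dot matches any one character, so every non-empty name is a hit.
        System.out.println(search(".", names));      // prints [Wandora, Topic map, Finder]

        // Here the dot stands for exactly one arbitrary character.
        System.out.println(search("F.nder", names)); // prints [Finder]
    }
}
```

Because the dot matches any character, a search for a literal dot would need to be escaped as \. under these assumptions.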

Find

In addition to the Finder tab, topics can be searched by selecting Edit > Find or pressing CTRL-F in Wandora. This opens the Find dialog window, as shown below.


Find search tab.gif


Typing a word into the search field and pressing OK starts the search. The search results are shown in a separate dialog window:


Find results.gif


In the example above, the user has searched with the word title. To open a topic in the search results, double-click the topic name, or right-click it and select Open topic. The search result dialog closes when you press the Close button. Clicking Again restores the search dialog.

The String similarity tab is used to perform similarity searches on the topic map. Wandora uses Sam Chapman's open source SimMetrics library to calculate string similarities.


Find similarity tab.gif


The available similarity types are:

  • Levenshtein distance
  • Needleman-Wunsch distance
  • Smith-Waterman distance
  • Block distance
  • Monge Elkan distance
  • Jaro distance
  • Jaro Winkler
  • SoundEx distance

The similarity types are discussed in the SimMetrics documentation. In addition to the similarity type, the user has to choose a similarity threshold. The threshold defines the limit that the measured similarity must exceed before a tested string is considered similar and included in the search results. If the threshold is near 100, the strings must be very similar, i.e. only minimal differences are allowed. If the threshold is near 0, the tested string can be very different and still be included in the result set.
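How a threshold filters candidates can be sketched with a hand-written Levenshtein measure. This is a minimal illustration under stated assumptions and does not use or reproduce the SimMetrics API: the class SimilarityThresholdSketch and its methods are hypothetical, and it assumes the edit distance is scaled to a 0..100 similarity by dividing by the length of the longer string.

```java
final class SimilarityThresholdSketch {

    // Classic dynamic-programming Levenshtein edit distance
    // (insertions, deletions and substitutions each cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 100 for identical strings, 0 for maximally different ones.
    static double similarityPercent(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 100.0;
        }
        return 100.0 * (1.0 - (double) levenshtein(a, b) / longest);
    }

    // A candidate is included only if its similarity exceeds the threshold.
    static boolean isIncluded(String query, String candidate, double threshold) {
        return similarityPercent(query, candidate) > threshold;
    }

    public static void main(String[] args) {
        // "kitten" and "sitting" differ by 3 edits over 7 characters, roughly 57 percent similar.
        System.out.println(isIncluded("kitten", "sitting", 50)); // prints true
        System.out.println(isIncluded("kitten", "sitting", 90)); // prints false
    }
}
```

With a threshold of 50 the pair is accepted, while a threshold of 90 rejects it, which mirrors the near-0 and near-100 behaviour described above.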

Some similarity measures use the Gap cost and Tokenizer settings. The former specifies the penalty caused by a gap in a word (space character). The latter is used to split the text into words.

The Difference instead similarity option changes the similarity measure into a difference measure. You are then searching for strings that are maximally different from the given string.