Can Wandora automatically extract the topic of a text?

Forum is for miscellaneous user help requests.

Postby Butty » Tue Sep 15, 2009 4:34 pm

Hello AKI,
I understand that Wandora provides many extractors. The Simple Text Document Extractor can convert simple text documents to topic occurrences. It reads the document, creates a simple topic for the document, and places the text content into an occurrence attached to the document topic.

I just wonder: what does "occurrence" mean? And can the Simple Text Document Extractor automatically extract the topic of a text and visualize the topic?
Butty

Postby akivela » Wed Sep 16, 2009 11:11 am

Hello

Although the Topic Maps standard suggests that an occurrence is an information packet that instantiates the topic's subject, I tend to think of occurrences as simple free-text properties attached to a topic. In a way, occurrences let you store free text in your topic map. Wandora's way of handling occurrences reflects this design decision.

The Simple Text Document extractor creates a topic and attaches the given text to it as an occurrence, just as you described. The extractor doesn't analyze the text; the text is simply stored as an occurrence.
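To make the idea concrete, here is a minimal sketch of what "a topic with a text occurrence" amounts to. It is not Wandora's API: the `Topic` class, the `extract_simple_text` function, and the "text content" occurrence type are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical stand-in for a topic; not Wandora's actual class."""
    name: str
    # Occurrences modeled as free-text properties keyed by an occurrence type.
    occurrences: dict = field(default_factory=dict)


def extract_simple_text(document_name: str, text: str) -> Topic:
    """Create one topic for the document and store the text without analyzing it."""
    topic = Topic(name=document_name)
    topic.occurrences["text content"] = text  # assumed occurrence type name
    return topic


doc_topic = extract_simple_text("notes.txt", "Helsinki is the capital of Finland.")
print(doc_topic.name)
print(doc_topic.occurrences["text content"])
```

Note that nothing in the sketch looks inside the text. The document becomes one topic, and the text is carried along as a property of that topic.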

However, Wandora contains features for digging topics out of text occurrences. Say you have used the Simple Text Document extractor and have a topic with a text occurrence. You can now open the occurrence editor by clicking the occurrence text cell (in the Traditional topic panel), which shows the first words of your occurrence. The occurrence editor is a simple text editor window where you can edit the occurrence text. The window has a menu bar with a Refine menu. If you look into the Refine menu, you should see these menu options:

* Make topics
* Find topics
* Classify
* Insert

Classify contains the submenus Classify with OpenCalais and Classify with SemanticHacker. Selecting the OpenCalais option sends your occurrence text to OpenCalais (http://www.opencalais.com) and retrieves topics describing the occurrence text. Wandora then associates your topic (the occurrence carrier) with all retrieved topics. You might say that Wandora, in a way, resolves topics for free occurrence text, although an external web service (OpenCalais) does the actual topic extraction.

Selecting Classify with SemanticHacker performs a similar operation but uses http://www.semantichacker.com to retrieve topics for the occurrence text. You need a valid SemanticHacker user token to use the Classify with SemanticHacker option.
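The classification flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Topic`, `classify`, and `fake_classifier` are hypothetical names, and the classifier is a local word-list stub standing in for the external web service. It does not call OpenCalais or SemanticHacker and does not reflect their actual APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Topic:
    """Hypothetical stand-in for a topic; not Wandora's actual class."""
    name: str
    occurrences: dict = field(default_factory=dict)
    associated: List["Topic"] = field(default_factory=list)


def fake_classifier(text: str) -> List[str]:
    """Stub for the external service: returns known terms found in the text."""
    known_terms = ["Helsinki", "Finland"]  # assumed vocabulary for the demo
    return [term for term in known_terms if term in text]


def classify(carrier: Topic, occurrence_type: str,
             service: Callable[[str], List[str]]) -> None:
    """Send the occurrence text to the service and associate the results."""
    text = carrier.occurrences[occurrence_type]
    for label in service(text):
        # Each retrieved label becomes a topic associated with the carrier.
        carrier.associated.append(Topic(name=label))


carrier = Topic("notes.txt", {"text content": "Helsinki is the capital of Finland."})
classify(carrier, "text content", fake_classifier)
print([t.name for t in carrier.associated])  # ['Helsinki', 'Finland']
```

The occurrence carrier itself stays unchanged. Classification only adds associations from it to the topics the service returns.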

Wandora also contains separate extractors for OpenCalais and SemanticHacker in File > Extract > Classification. The OpenCalais extractor is discussed at http://www.wandora.org/wandora/wiki/index.php?title=OpenCalais_classifier. The SemanticHacker extractor is discussed at http://www.wandora.org/wandora/wiki/index.php?title=SemanticHacker_classifier.

I hope I answered your question. If not, please drop a line.

Kind Regards,
Aki Kivelä
Wandora Team

