IS IT POSSIBLE TO OPEN A SER FILE

Posted: Wed Oct 26, 2011 3:31 pm
by gimley
Hello,
I am a newbie to Wandora, have installed the system, and find it impressive. How do I get Wandora to handle the SER files of the Stanford NER engine? All my attempts to get the SER files to run have failed.

Many thanks in advance for any help.

Posted: Wed Oct 26, 2011 4:18 pm
by akivela
Hi Gimley

Thank you for using Wandora.

Wandora's Stanford NER extractor has a configuration dialog, which opens if you hold down the CTRL key while selecting the Stanford NER extractor option in the menu. The configuration dialog contains a class path field for the SER file. Note that the path is relative to Wandora's build folder. It's probably best to place the SER file into

classes/org/wandora/application/tools/extractors/stanfordner/classifiers

This folder contains all the default SER files. If you don't manage to change the SER file, or if Wandora's Stanford NER extractor generates errors or exceptions after the change, please drop us a line.
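
To double-check the location before opening the configuration dialog, a minimal Python sketch like the one below can list the classifier files found in that folder. It is only an illustration: the helper name `list_ser_files` and its `build_dir` argument are made up for this example, and it assumes the classifiers keep the `.ser.gz` suffix.

```python
from pathlib import Path

# Folder named above, relative to Wandora's build folder.
CLASSIFIER_SUBDIR = Path(
    "classes/org/wandora/application/tools/extractors/stanfordner/classifiers"
)


def list_ser_files(build_dir):
    """Return the sorted names of *.ser.gz files in the classifiers folder.

    `build_dir` is the path to Wandora's build folder (an assumption of
    this sketch; adjust it to your own installation).
    """
    folder = Path(build_dir) / CLASSIFIER_SUBDIR
    if not folder.is_dir():
        return []  # folder missing: nothing to list
    return sorted(p.name for p in folder.glob("*.ser.gz"))
```

Calling `list_ser_files` with the path of your own build folder should return the file names you copied there. An empty list suggests the files ended up somewhere else.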

Kind Regards,
Aki / Wandora Team

Posted: Wed Oct 26, 2011 7:34 pm
by gimley
Hi Aki,
Many thanks for your kind reply. I went to the Wandora path on my machine:
F:\wandora-2011-09-19\wandora\build\classes\org\wandora\application\tools\extractors\stanfordner\classifiers\
I copied all the files from the Stanford NER classifiers into that path. The app asked for an external file containing a sequence classifier.
I used the default, which was:
/ner-eng-ie.crf-3-all2008.ser.gz
and which is in that path.
I clicked OK. I saw a small message in the lower right corner of the screen saying Stanford Named Entity Recogniser, but after that nothing happened.
Sorry to hassle you like this, but the NER just won't extract or run.
What have I done wrong?
Many thanks

Posted: Thu Oct 27, 2011 11:00 am
by akivela
Hi Gimley

Once you have successfully configured Wandora's Stanford NER extractor by pointing it to the SER file, you can perform Stanford NER extractions by following these simple steps:

1. Select the menu option File > Extract > Classification > Stanford Named Entity Recognizer. When it is selected, Wandora opens a dialog window.

2. In the dialog window, select the tab that matches your data source. The available data sources are Files, Urls and Raw. If you just want to test drive the classifier, I suggest you select the Raw tab.

3. If you selected the Raw tab, write (or copy and paste) some English text into the text field. If you selected the Urls tab, enter some URLs in the text field. If you selected the Files tab, select some text files to be classified.

4. Once you have specified the source data, press the Extract button. Wandora now reads your data and performs the classification using the Stanford Named Entity Recognizer.

5. Once the extraction/classification is finished, Wandora shows a log window. The log window contains the number of entities found. Close the log window.

6. If the extraction/classification was successful, Wandora should now contain new topics and associations. You can view the topics in Wandora's topic tree. The topic tree is located on the left, under the Topics tab. The tree root, labeled Wandora Class, now has two new branches labeled Document and Stanford NER. You'll find all data source documents under Document and all found entity topics under Stanford NER. Each data source document topic has associations that link the document to all recognized entities.
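
The structure described in step 6 can be pictured with a small Python sketch. This is not Wandora's API, just plain dictionaries. The function name `build_topic_tree` and its input format are hypothetical, and only the labels Wandora Class, Document and Stanford NER come from the steps above. The entity branch is kept flat here, which may be simpler than what Wandora actually shows.

```python
def build_topic_tree(extractions):
    """Build a toy topic tree from {document name: [entity names]}.

    Returns (tree, associations): the tree mirrors the two new branches
    under the root, and associations link each document to its entities.
    """
    documents = sorted(extractions)
    # An entity found in several documents still gets a single topic.
    entities = sorted({e for found in extractions.values() for e in found})
    tree = {"Wandora Class": {"Document": documents, "Stanford NER": entities}}
    associations = {doc: sorted(set(found)) for doc, found in extractions.items()}
    return tree, associations
```

For example, passing two made-up documents that both mention the same place yields one shared entity topic that is associated with both documents.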

I hope you'll be able to perform successful classifications by following these instructions. Note that Wandora can also extract entities out of occurrences, but that's another use case. If you still face problems, please drop us a line.

Kind Regards,
Aki / Wandora Team

Posted: Thu Oct 27, 2011 1:55 pm
by gimley
Many thanks. I tried it and it worked. One of my colleagues asked me whether we can extract the .ser file itself. Just a question out of curiosity.
Thanks once more for all the pains you have taken.

Posted: Thu Oct 27, 2011 4:05 pm
by akivela
I am happy to hear you managed to perform extraction successfully.

A SER file is a serialized classifier model used by the Stanford Named Entity Recognizer. It contains the vocabulary and entity recognition rules, and it defines what kinds of entities the classifier can potentially recognize. Unfortunately, Wandora can't extract SER files per se. Wandora just uses them.
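
As an illustration of why such a file can be loaded but not read like a text file: assuming it is a gzip-compressed, Java-serialized object (which is how Stanford NER's `.ser.gz` classifiers are generally distributed), the Python sketch below checks just those two layers by looking at the leading bytes. The function name `looks_like_java_ser_gz` is hypothetical, and the sketch does not load or interpret the model itself.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"         # first two bytes of any gzip file
JAVA_STREAM_MAGIC = b"\xac\xed"  # first two bytes of a Java serialization stream


def looks_like_java_ser_gz(path):
    """Return True if `path` is gzip data wrapping a Java serialization stream."""
    with open(path, "rb") as raw:
        if raw.read(2) != GZIP_MAGIC:
            return False  # not gzip-compressed at all
    try:
        with gzip.open(path, "rb") as stream:
            return stream.read(2) == JAVA_STREAM_MAGIC
    except (OSError, EOFError, zlib.error):
        return False  # gzip header present, but the data could not be decompressed
```

A True result only says the file has the expected container format. Whether the Stanford NER version bundled with Wandora can actually load it is a separate question.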

Kind Regards,
Aki / Wandora Team