Stanford Named Entity Recognizer integration

From WandoraWiki

Latest revision as of 17:01, 12 August 2011

Stanford Named Entity Recognizer (NER, http://nlp.stanford.edu/software/CRF-NER.shtml) is an open source Java library for named entity recognition: it extracts named entities such as persons, organizations, and locations from a given text. Wandora features a tool, also called Stanford Named Entity Recognizer, that uses the Stanford NER Java library to extract topics and associations from a given text, for example an occurrence. The tool is located in the Wandora application menu under File > Extract > Classification. It is also available in the occurrence editor and the browser plugin.

Stanford NER is included in the Wandora distribution package, and the embedded Stanford Named Entity Recognizer tool processes the given text locally.
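
As a rough illustration of what named entity extraction yields, the sketch below groups entities found in text that has already been tagged in the inline XML style Stanford NER can emit (for example <PERSON>...</PERSON>). The class name InlineXmlEntities, the method group, and the sample sentence are hypothetical. This is not Wandora's extraction code, and it uses only the Java standard library rather than the Stanford NER API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineXmlEntities {
    // Matches <TYPE>text</TYPE>, where TYPE is an upper-case label such as PERSON.
    // The backreference \1 requires the closing tag to repeat the opening label.
    private static final Pattern ENTITY = Pattern.compile("<([A-Z]+)>(.+?)</\\1>");

    /** Groups tagged entity strings by label, keeping the order of first appearance. */
    public static Map<String, List<String>> group(String tagged) {
        Map<String, List<String>> byType = new LinkedHashMap<>();
        Matcher m = ENTITY.matcher(tagged);
        while (m.find()) {
            byType.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(m.group(2));
        }
        return byType;
    }

    public static void main(String[] args) {
        // Hypothetical tagged sentence, used only for illustration.
        String tagged = "<PERSON>Ada Lovelace</PERSON> visited <LOCATION>London</LOCATION>.";
        System.out.println(group(tagged));
    }
}
```

In a topic map setting, each label could plausibly become a type topic and each entity string an instance topic. How Wandora actually maps entities to topics and associations is not shown here.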

Configuring Stanford NER

Holding down the CTRL key while starting the tool in Wandora opens a configuration dialog. In this dialog the user can change NER's sequence classifier by selecting a different NER model. The sequence classifier contains all the information NER uses to recognize entities.

Wandora currently includes all default sequence classifiers of Stanford NER. They are located in build/classes/org/wandora/application/tools/extractors/stanfordner/classifiers:

  • ner-eng-ie.crf-3-all2008.ser.gz
  • ner-eng-ie.crf-3-all2008-distsim.ser.gz
  • ner-eng-ie.crf-4-conll.ser.gz
  • ner-eng-ie.crf-4-conll-distsim.ser.gz
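
The file names above appear to follow a common pattern, and the sketch below takes one apart when comparing models. It assumes, without confirmation from Wandora or Stanford, that the number after "crf-" is the count of entity classes, that the next part labels the training data, and that a "-distsim" suffix marks a variant using distributional similarity features. The class ClassifierName and its fields are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClassifierName {
    // Assumed naming scheme: ner-eng-ie.crf-<classes>-<corpus>[-distsim].ser.gz
    private static final Pattern NAME =
        Pattern.compile("ner-eng-ie\\.crf-(\\d+)-([a-z0-9]+)(-distsim)?\\.ser\\.gz");

    public final int classes;      // assumed: number of entity classes
    public final String corpus;    // assumed: label of the training data
    public final boolean distsim;  // true if the name carries the -distsim suffix

    private ClassifierName(int classes, String corpus, boolean distsim) {
        this.classes = classes;
        this.corpus = corpus;
        this.distsim = distsim;
    }

    /** Parses a classifier file name, rejecting names that do not fit the assumed scheme. */
    public static ClassifierName parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized classifier name: " + fileName);
        }
        return new ClassifierName(Integer.parseInt(m.group(1)), m.group(2), m.group(3) != null);
    }

    public static void main(String[] args) {
        String[] names = {
            "ner-eng-ie.crf-3-all2008.ser.gz",
            "ner-eng-ie.crf-3-all2008-distsim.ser.gz",
            "ner-eng-ie.crf-4-conll.ser.gz",
            "ner-eng-ie.crf-4-conll-distsim.ser.gz"
        };
        for (String n : names) {
            ClassifierName c = parse(n);
            System.out.println(n + " -> classes=" + c.classes
                + ", corpus=" + c.corpus + ", distsim=" + c.distsim);
        }
    }
}
```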

To train your own NER model, see the Stanford NER FAQ (http://nlp.stanford.edu/software/crf-faq.shtml).

See also

  • Zemanta extractor
  • GATE/ANNIE
  • UClassify integration