MediaWiki extractor

From WandoraWiki
Revision as of 20:54, 9 January 2010 by Akivela (Talk | contribs)


Wandora's MediaWiki extractor allows you to gather topics and associations from large knowledge repositories such as Wikipedia. The extractor cannot handle the HTML version of a MediaWiki page; it requires the XML export of the page. The MediaWiki extractor reads the XML dump of a MediaWiki page and creates a topic for the page. The page content is attached to the topic as a text data occurrence. The extractor is started with File > Extract > Wiki > MediaWiki extractor. You can extract data from local XML files or directly from a MediaWiki site using the export URL of the page. For example, the export URL of this page is
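To illustrate what the extractor consumes, the sketch below parses a MediaWiki XML export and pulls out each page's title and wikitext, roughly the data the extractor turns into a topic and its text occurrence. This is a minimal Python sketch, not Wandora's implementation; the sample document and the export namespace version (`export-0.10`) are assumptions, as real dumps embed whatever versioned namespace the wiki software emits.

```python
import xml.etree.ElementTree as ET

# Minimal sample mimicking the MediaWiki XML export format.
# Real dumps carry a versioned namespace; export-0.10 is assumed here.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>MediaWiki extractor</title>
    <revision>
      <text>Wandora's MediaWiki extractor gathers topics...</text>
    </revision>
  </page>
</mediawiki>"""

NS = {"mw": "http://www.mediawiki.org/xml/export-0.10/"}

def pages(xml_text):
    """Yield (title, wikitext) pairs, one per exported page."""
    root = ET.fromstring(xml_text)
    for page in root.findall("mw:page", NS):
        title = page.findtext("mw:title", default="", namespaces=NS)
        text = page.findtext("mw:revision/mw:text", default="", namespaces=NS)
        yield title, text

for title, text in pages(SAMPLE):
    # Each page would become one topic; the wikitext its occurrence.
    print(title)
```

A dump exported via a page's export URL can contain many `<page>` elements, so the function yields one pair per page rather than assuming a single result.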

Note: Neither Wandora nor its authors can grant you permission to use the content of any MediaWiki site. Wandora provides nothing but the technology to create topic maps from MediaWiki pages. Read the content license of the MediaWiki site carefully before using the extractor.

Postprocessing MediaWiki extracted topics

The MediaWiki extractor does not process the content of extracted pages. However, it is possible to create associations out of page content using another tool in Wandora. The context menu has a tool, Topics > Associations > Find associations in occurrence..., that can be used to extract associations from text data. The tool requires the type and scope of the processed occurrence, the topic's role in the new associations, and a regular expression used to recognize extracted topics in the text data.
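The regex step above can be sketched as follows: every match of the supplied pattern in the occurrence text becomes a candidate topic to associate with the topic carrying the occurrence. This is an illustrative Python sketch of that matching step only, not Wandora's implementation; the sample text and the two-capitalized-words pattern are assumptions standing in for whatever expression you supply in the tool.

```python
import re

# Hypothetical occurrence text attached to a topic.
occurrence = "Authors: Ada Lovelace, Alan Turing, Grace Hopper"

# Hypothetical pattern: two capitalized words in a row.
# In Wandora you would supply your own regular expression.
pattern = re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+")

# Each match becomes a topic, associated with the occurrence's carrier
# topic under the role you chose in the tool's dialog.
found = pattern.findall(occurrence)
print(found)  # → ['Ada Lovelace', 'Alan Turing', 'Grace Hopper']
```

The quality of the resulting associations depends entirely on how well the regular expression fits the structure of the occurrence text.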

See also

Wandora also contains a separate Wikipedia extractor, which is a graphical front end for the MediaWiki extractor described here.
