Geo microformat extractor


The Geo microformat extractor reads HTML documents and fragments and creates topics and associations for geo structures. Geo structures mark up geographical coordinates (latitude and longitude). To start the extractor, choose the menu option File > Extract > Microformats > Geo microformat extractor..., or select it as the drag and drop extractor and drop HTML fragments directly from a web browser.

Geo microformat extraction example

First, the Wandora user selects the Geo extractor as the current drag and drop extractor.


Geo example wandora.gif


The Italian Wikipedia contains a page for Genova, a city in northern Italy. The right column of the Genova page has an info box with Genova's geographical coordinates. The user selects the coordinate text as shown below and drags the text fragment over Wandora's drag and drop extractor.


Geo example wikipedia.gif


When the user drops the fragment, Wandora recognizes the geo microformat structure and creates equivalent topic map structures. The recognized geo microformat structure is:

  <span class="latitude">44°24′40.16″N</span>
  <span class="longitude">8°55′57.58″E</span>

The text Coordinate: that precedes the coordinates in the selection is interpreted as a label for the geo-location. The extracted structures can be found in Wandora with the finder and the keyword geo. The screenshot below shows the extracted association.


Geo example wandora loc.gif
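
The recognition step described above can be illustrated outside Wandora. The following is a minimal Python sketch, not Wandora's actual implementation: it assumes the fragment contains elements whose class attribute includes latitude or longitude, and it treats any text before the first coordinate as the label. The names GeoSpanParser and extract_geo are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class GeoSpanParser(HTMLParser):
    """Collects text from elements classed 'latitude' or 'longitude'.

    Hypothetical helper for illustration; it does not handle elements
    nested inside the coordinate spans.
    """

    def __init__(self):
        super().__init__()
        self.current = None     # coordinate field currently being read
        self.found = {}         # field name -> coordinate string
        self.label_parts = []   # text seen before the first coordinate

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in ("latitude", "longitude"):
            if name in classes:
                self.current = name
                self.found[name] = ""

    def handle_endtag(self, tag):
        # Any closing tag ends the current coordinate field.
        self.current = None

    def handle_data(self, data):
        if self.current:
            self.found[self.current] += data
        elif not self.found:
            # Assumption: text preceding the coordinates is the label.
            self.label_parts.append(data)


def extract_geo(fragment):
    """Return (label, latitude, longitude) as plain strings."""
    parser = GeoSpanParser()
    parser.feed(fragment)
    parser.close()
    label = "".join(parser.label_parts).strip()
    return label, parser.found.get("latitude"), parser.found.get("longitude")


fragment = (
    'Coordinate: <span class="latitude">44°24′40.16″N</span> '
    '<span class="longitude">8°55′57.58″E</span>'
)
print(extract_geo(fragment))
```

As in Wandora, the coordinates are kept as strings here; no numeric interpretation takes place.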


Note that Wandora handles coordinates as strings. Because different sites use different coordinate representations, Wandora may fail to merge identical coordinates that are written differently. Note also that Wandora currently has no feature for estimating the distance between coordinates.
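
Both limitations can be worked around by post-processing the extracted strings outside Wandora. The sketch below is one possible approach and is not part of Wandora: it assumes coordinates are written in degrees, minutes and seconds with a hemisphere letter, converts them to decimal degrees so that differently formatted strings compare equal, and estimates distance with the haversine formula on a spherical Earth of radius 6371 km. The function names dms_to_decimal and haversine_km are hypothetical.

```python
import math
import re

# Degrees, minutes, seconds and a hemisphere letter, with optional spaces.
# Accepts both the prime characters (′ ″) and the ASCII quotes (' ").
_DMS = re.compile(r"""(\d+)°\s*(\d+)[′']\s*(\d+(?:\.\d+)?)[″"]\s*([NSEW])""")


def dms_to_decimal(text):
    """Convert a string such as '44°24′40.16″N' to signed decimal degrees."""
    match = _DMS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported coordinate format: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    radius_km = 6371.0  # mean Earth radius; a spherical approximation
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Two spellings of the same latitude differ as strings
# but are equal after normalization.
print("44°24′40.16″N" == "44° 24′ 40.16″ N")
print(dms_to_decimal("44°24′40.16″N") == dms_to_decimal("44° 24′ 40.16″ N"))
```

Other notations, such as plain decimal degrees, would need additional parsing rules before they could be compared in the same way.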
