A Million First Steps in Topic Maps
The British Library released images and information from 17th, 18th and 19th century books under the title "A million first steps". The information was released as a series of structured text files on GitHub, and the images were stored on Flickr. Both the images and the information are in the public domain.
The Wandora Team has converted the data files on GitHub into topic map serializations. The serializations are in XTM 2.0 format and can be viewed and edited in many topic map applications, such as Wandora and Ontopia. The information has been divided into separate XTM files, each containing information about books published during one year. The filename reflects the publishing year. The topic map conversions carry the same license as the original data files, i.e. public domain. The topic map files don't contain actual images or image files, only links to the images on Flickr.
Information about the topic map data model used:
- Each book topic has a basename that is a combination of the book's title and identifier.
- Each book topic has a subject identifier that is derived from the book's identifier. The identifiers don't resolve.
- Each book topic has an English display name that is the book's title.
- If the book has a Digital Service Library identifier, it is attached to the book topic as an occurrence. A PDF link to the book is also attached to the book topic as a separate occurrence.
- If the book has an Ark id, it is attached to the book topic as an occurrence.
- An author topic is associated with the book topic.
- A publication date topic is associated with the book topic.
- Image topics are associated with the book topic.
- A place of publishing topic is associated with the book topic.
- Each image topic has a subject identifier and a subject locator that resolve to the original image file on Flickr.
- Each image topic has a basename that is the image's identifier.
- Each image topic has occurrences for the image number, the page number and the volume number.
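The data model above can also be inspected without a topic map application. The following is a minimal sketch, not part of the Wandora conversion itself, that reads topics from an XTM 2.0 document with Python's standard library. The sample fragment is hand-written for illustration: the topic ids, the example.org addresses and the title are assumptions and do not come from the released files. The function name summarize_topics is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of XTM 2.0 documents.
XTM_NS = {"xtm": "http://www.topicmaps.org/xtm/"}

# Hand-written sample; ids, URLs and the title are made up for illustration.
SAMPLE_XTM = """<topicMap xmlns="http://www.topicmaps.org/xtm/" version="2.0">
  <topic id="book-1">
    <subjectIdentifier href="http://example.org/book/000000001"/>
    <name><value>An Example Title (000000001)</value></name>
  </topic>
  <topic id="image-1">
    <subjectIdentifier href="http://example.org/image/1"/>
    <subjectLocator href="http://example.org/image/1.jpg"/>
    <name><value>image-1</value></name>
  </topic>
</topicMap>
"""


def summarize_topics(xtm_text):
    """Return one dict per topic with its id, names, identifiers and locators."""
    root = ET.fromstring(xtm_text)
    topics = []
    for topic in root.findall("xtm:topic", XTM_NS):
        topics.append({
            "id": topic.get("id"),
            # Topic names are stored as <name><value>...</value></name>.
            "names": [v.text for v in topic.findall("xtm:name/xtm:value", XTM_NS)],
            "identifiers": [e.get("href")
                            for e in topic.findall("xtm:subjectIdentifier", XTM_NS)],
            "locators": [e.get("href")
                         for e in topic.findall("xtm:subjectLocator", XTM_NS)],
        })
    return topics


if __name__ == "__main__":
    for summary in summarize_topics(SAMPLE_XTM):
        print(summary["id"], summary["names"], summary["locators"])
```

To try this on one of the yearly XTM files, the file's contents would be passed to the function in place of the sample string. Occurrences and associations are left out of the sketch to keep it short.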
What can be done with the topic map conversions of "A million first steps"? The following paragraphs describe some ideas.
Wandora and other topic map applications can serve as a viewer for the book data and especially the images. Topic map applications can also provide alternative publishing options that create a web site or a specific visualization out of the topic maps.
The user can easily enrich the information captured in the topic map, either manually or semi-automatically. A topic map is fundamentally a graph, and topic map applications contain powerful tools for altering the graph. Topic maps are also incremental: two or more topic maps can be merged. This enables linked-data-style merging of the book information with other information sources. Wandora has over 50 different information extractors.
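The merging idea can be illustrated with a small sketch. In topic maps, two topics that share a subject identifier are treated as the same subject and are merged. The code below imitates only that one rule, using plain Python dictionaries instead of a real topic map engine. It is not how Wandora implements merging, and all identifiers, names and occurrence values in the sample data are hypothetical.

```python
def merge_topic_maps(*topic_maps):
    """Merge topics from several topic maps by shared subject identifiers.

    Each topic is a dict with "identifiers", "names" and "occurrences"
    (a list of (type, value) pairs). Only the subject identifier rule
    is imitated here; other merging rules are ignored.
    """
    merged = []
    for topic_map in topic_maps:
        for topic in topic_map:
            target = {
                "identifiers": set(topic["identifiers"]),
                "names": set(topic["names"]),
                "occurrences": set(topic["occurrences"]),
            }
            remaining = []
            for existing in merged:
                if existing["identifiers"] & target["identifiers"]:
                    # Same subject: absorb the existing topic into the target.
                    target["identifiers"] |= existing["identifiers"]
                    target["names"] |= existing["names"]
                    target["occurrences"] |= existing["occurrences"]
                else:
                    remaining.append(existing)
            remaining.append(target)
            merged = remaining
    return merged


# Hypothetical book topic, shaped like the data model described above.
BOOKS = [{
    "identifiers": ["http://example.org/book/1"],
    "names": ["An Example Title"],
    "occurrences": [("ark-id", "ark:/00000/example")],
}]

# Hypothetical topics from some other information source.
EXTRA = [
    {
        "identifiers": ["http://example.org/book/1", "http://example.org/other/1"],
        "names": ["An Example Title"],
        "occurrences": [("note", "hypothetical enrichment")],
    },
    {
        "identifiers": ["http://example.org/person/1"],
        "names": ["Some Author"],
        "occurrences": [],
    },
]

if __name__ == "__main__":
    for topic in merge_topic_maps(BOOKS, EXTRA):
        print(sorted(topic["identifiers"]), sorted(topic["occurrences"]))
```

Here the two book topics collapse into one topic that carries both occurrences, while the author topic stays separate because it shares no identifier with the others.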