HSOpen Apurahat as a topic map
Helsingin Sanomat, a major Finnish newspaper and media company, arranged an open data hackathon titled HS Open #2 - Follow the Money on the 23rd of May 2011. Helsingin Sanomat released two data sets for the event. The first data set included financial support reports from members of the Finnish parliament. The second data set covered grants received by Finnish artists between 2000 and 2010. The Wandora Team participated in the hackathon and would like to thank the organizers. As our contribution to the event, we created a special importer for the Wandora application that converts the data set of artists' grants to a topic map format. As a result, the data set can be imported into any topic map application, such as Wandora, for further analysis. Having created the importer, we are now ready to publish the topic map conversion of the original data set.
HSOpen Apurahat topic map and its license
The Topic Maps conversion of the HSOpen Apurahat data set is distributed as
- XTM 2.0 topic map serialization.
- XTM 2.0 topic map serialization (including Wandora's base ontology).
- Wandora project file.
The conversion date was the 3rd of June 2011. The original data set is available here. To open the XTM 2.0 serialization in Wandora, see How to import existing topic map to Wandora. To open the Wandora project file, see How to save and load project.
The Topic Maps conversion of HSOpen Apurahat is licensed under Creative Commons Attribution 1.0 Finland. The source of the original data set is Helsingin Sanomat. The Topic Maps conversion of HSOpen Apurahat has been created by Aki Kivelä, Wandora Team.
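The XTM 2.0 serialization is plain XML, so it can also be inspected outside Wandora. The following is a minimal sketch in Python that counts topics, associations and instances of a type in an XTM 2.0 document. The sample document and its topic identifiers (henkilo, paatos, artist-1, category-1) are hypothetical and are not taken from the published file; the identifiers in the real serialization will most likely differ.

```python
import xml.etree.ElementTree as ET

# Namespace used by XTM 2.0 documents.
XTM_NS = "http://www.topicmaps.org/xtm/"

# Hypothetical miniature topic map, NOT an excerpt of the real data set.
SAMPLE_XTM = """<topicMap xmlns="http://www.topicmaps.org/xtm/" version="2.0">
  <topic id="henkilo"/>
  <topic id="paatos"/>
  <topic id="artist-1">
    <instanceOf><topicRef href="#henkilo"/></instanceOf>
  </topic>
  <topic id="category-1"/>
  <association>
    <type><topicRef href="#paatos"/></type>
    <role>
      <type><topicRef href="#henkilo"/></type>
      <topicRef href="#artist-1"/>
    </role>
    <role>
      <type><topicRef href="#category-1"/></type>
      <topicRef href="#category-1"/>
    </role>
  </association>
</topicMap>
"""


def count_constructs(xtm_text):
    """Return (number of topics, number of associations) in an XTM 2.0 string."""
    root = ET.fromstring(xtm_text)
    topics = root.findall(f"{{{XTM_NS}}}topic")
    associations = root.findall(f"{{{XTM_NS}}}association")
    return len(topics), len(associations)


def count_instances(xtm_text, type_href):
    """Count instanceOf references that point to the given type."""
    root = ET.fromstring(xtm_text)
    path = (
        f"{{{XTM_NS}}}topic/{{{XTM_NS}}}instanceOf/{{{XTM_NS}}}topicRef"
    )
    return sum(1 for ref in root.findall(path) if ref.get("href") == type_href)


topic_count, association_count = count_constructs(SAMPLE_XTM)
person_count = count_instances(SAMPLE_XTM, "#henkilo")
```

For a real file, the text could be read from disk and passed to the same functions. The whole document is parsed into memory, which should be acceptable for a topic map of this size.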
Topic map conversion details
- The language of the topic map is Finnish.
- The conversion contains 11993 topics and 38952 associations. The number of person (henkilö) topics is 5238 and the number of grant (päätös) associations is 16594. Below is a screen capture of the Wandora application showing some additional numbers calculated from the topic map.
- The original data set contains each artist's birth day, month and year. The Topic Maps conversion contains only the birth year. The month and day have been dropped.
- The original data set contains errors. As the conversion was an automatic process, the topic map contains the same errors.
- Connection distribution of topic map is
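As an illustration of what a connection distribution measures, here is a minimal sketch in Python. It assumes that "connection distribution" means the number of topics that take part in exactly k associations, for each k; this interpretation is an assumption, and Wandora may calculate the figure differently. The associations below are hypothetical sample data.

```python
from collections import Counter

# Hypothetical associations: each tuple lists the topics playing a role in
# one association. These are NOT values from the real topic map.
associations = [
    ("artist-1", "kirjallisuus"),
    ("artist-1", "kuvataide"),
    ("artist-2", "kirjallisuus"),
    ("artist-3", "kirjallisuus"),
]


def connection_distribution(associations):
    """Map k -> number of topics that take part in exactly k associations."""
    per_topic = Counter()
    for players in associations:
        # set() ensures a topic is counted once per association even if it
        # plays several roles in it.
        for topic in set(players):
            per_topic[topic] += 1
    return dict(sorted(Counter(per_topic.values()).items()))


distribution = connection_distribution(associations)
```

In this sample three topics have one connection, one topic has two and one topic has three. Topics without any associations do not appear in the result, because the sketch only sees topics that occur in at least one association.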
Screen captures and some visualizations
The first screen capture shows the HSOpen Apurahat topic map open in the Wandora application. The user is looking at grant categories and has opened the grants of critics and reviewers (arvostelijat in Finnish).
The second screen capture shows a Wandora user looking at the grants of one specific artist. By default Wandora shows information in a table. The table lists all grants received by the artist, as well as the artist's language, home town, sex and birth time.
The third screen capture shows a single home town of artists, in this case MIKKELI. Again, Wandora shows all related information in a table. The table contains all artists that have lived in MIKKELI.
Next, the Wandora user switches from the table view to a graph view and opens the grant categories once again. Now Wandora shows a simple star-shaped graph where the grant category (Hakemusluokka in Finnish) topic is in the middle and its instances are the branches. The length of a branch is proportional to the number of artists in the category. The order of the branches carries no special information. I added the category labels and light gray arrows to the screen capture in Adobe Photoshop.
Now the Wandora user expands all category nodes, and Wandora shows nodes for all artists that have received a grant in each category. The magenta circles are groups of graph nodes in their default position. The darker magenta edges are relationships between artists and grant categories. The interesting part is the edges between categories. A single artist may have received grants from two (or more) different categories. For example, there are artists that have received grants from both literature (kirjallisuus) and visual arts (kuvataide). These multi-talented artists appear near one category, and an additional edge connects the artist to the other category. As a consequence, it looks as if the categories were directly connected.
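The same observation can be made without a graph view. The following minimal Python sketch lists artists that have grants in more than one category and counts how many artists link each pair of categories. The artist identifiers and grant rows are hypothetical sample data; only the category names kirjallisuus and kuvataide come from the data set.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (artist, category) pairs, one per grant decision.
grants = [
    ("artist-1", "kirjallisuus"),
    ("artist-1", "kuvataide"),
    ("artist-2", "kirjallisuus"),
    ("artist-3", "kuvataide"),
    ("artist-3", "kirjallisuus"),
    ("artist-3", "kuvataide"),
]


def categories_by_artist(grants):
    """Collect the distinct grant categories of each artist."""
    result = defaultdict(set)
    for artist, category in grants:
        result[artist].add(category)
    return result


def bridging_artists(grants):
    """Artists with grants in at least two different categories."""
    return {
        artist: sorted(categories)
        for artist, categories in categories_by_artist(grants).items()
        if len(categories) > 1
    }


def category_links(grants):
    """Count, for each category pair, the artists that connect the pair."""
    links = Counter()
    for categories in categories_by_artist(grants).values():
        for pair in combinations(sorted(categories), 2):
            links[pair] += 1
    return dict(links)


bridges = bridging_artists(grants)
links = category_links(grants)
```

Each entry in links corresponds to one apparent edge between two category "balloons" in the graph, weighted by the number of artists that create it.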
Next, Wandora starts to animate the nodes and edges. After a while, the graph has opened up. Again, the artists who have received grants in several categories appear to connect different category "balloons". The graph also clearly shows the relative size of the categories. Literature (kirjallisuus) appears to be the most frequently granted art form in Finland.
The Wandora user now closes the categories of literature (kirjallisuus) and visual arts (kuvataide) and tries to find details in the messy center of the graph.
The Wandora user zooms into the black frame shown in the image above. When the user is close enough, node labels appear.
What next? Wandora offers a wide range of graph filtering options. The Wandora user could, for example, focus on a single artist and visualize her grants. If Wandora can't create a suitable visualization, the user can always export the whole topic map graph, or just a part of it, to Gephi, yEd or other graph applications using Wandora's export options. Wandora also offers some very simple matrix export options, which may be useful if you want to analyze the data in R, for example.
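To give an idea of what such exports contain, here is a minimal Python sketch that writes an edge list and an artist-by-category matrix as CSV text. It is not Wandora's exporter and does not reproduce its file formats. The grant rows are hypothetical, and the Source and Target column names are an assumption based on the header that tools such as Gephi commonly accept for edge tables.

```python
import csv
import io
from collections import Counter

# Hypothetical (artist, category) pairs, one per grant decision.
grants = [
    ("artist-1", "kirjallisuus"),
    ("artist-1", "kuvataide"),
    ("artist-2", "kirjallisuus"),
]


def edge_list_csv(grants):
    """Return the grants as a two-column CSV edge list."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])  # assumed header names
    writer.writerows(grants)
    return buffer.getvalue()


def matrix_csv(grants):
    """Return an artist x category matrix of grant counts as CSV."""
    artists = sorted({artist for artist, _ in grants})
    categories = sorted({category for _, category in grants})
    counts = Counter(grants)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["artist"] + categories)
    for artist in artists:
        writer.writerow([artist] + [counts[(artist, c)] for c in categories])
    return buffer.getvalue()


edges_text = edge_list_csv(grants)
matrix_text = matrix_csv(grants)
```

A matrix in this shape could be loaded in R with read.csv, using the first column as row names, and then passed to clustering or correlation functions.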