New York Times Article Search API extractor


The New York Times API Extractor parses articles from the New York Times Article Search service into Topic Map data. The service requires credentials for access. API keys for Article Search and other NYT endpoints can be requested on the NYT Developer Pages. Wandora does not save the API key between sessions.


As an example, the service is queried for articles related to Barack Obama. Extractors using NYT services are found in File > Extract > News > New York Times API Extractor.

Nyt article 1.png

The Article Search dialog mirrors the options available in the Article Search service. In addition to a search query, the service allows selecting a subset of fields to return for each article. The results may also be restricted to a date range, offset by a page number (each page holds 10 articles), or sorted newest first or oldest first.
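The dialog options map onto query parameters of the service. The following is a minimal sketch of how such a request URL could be assembled, not Wandora's actual implementation. The endpoint URL and the parameter names (`q`, `fl`, `begin_date`, `end_date`, `page`, `sort`, `api-key`) are assumptions based on version 2 of the Article Search API and should be checked against the NYT developer documentation; `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Article Search service (v2); verify before use.
BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_search_url(api_key, query, fields=None, begin_date=None,
                     end_date=None, page=0, sort=None):
    """Build a request URL from the options shown in the dialog.

    All parameter names are assumptions about the service, not taken
    from the Wandora source code.
    """
    params = {"q": query, "page": page, "api-key": api_key}
    if fields:
        # Subset of fields to return for each article.
        params["fl"] = ",".join(fields)
    if begin_date:
        # Dates are assumed to use the YYYYMMDD format.
        params["begin_date"] = begin_date
    if end_date:
        params["end_date"] = end_date
    if sort:
        # Assumed values: "newest" or "oldest".
        params["sort"] = sort
    return BASE_URL + "?" + urlencode(params)


url = build_search_url("YOUR-API-KEY", "Barack Obama",
                       fields=["headline", "keywords"],
                       sort="newest", page=2)
print(url)
```

Optional parameters are only added when set, so that the service defaults apply otherwise.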

Nyt article 2.png

If an API key isn't already stored, the extractor prompts for one.

Nyt article 3.png

Often the query returns more than one page (10 articles), and the extractor prompts for further action. The pagination options are:

  • Stopping and not retrieving additional articles
  • Extracting one more page and stopping
  • Extracting one more page and prompting again
  • Extracting 10 more pages and prompting again
  • Extracting all remaining pages
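
The pagination choices above can be sketched as a small planning function. This is an illustrative sketch under stated assumptions rather than the extractor's own code: the option names, the `plan_next` helper, and zero-based page numbering are all hypothetical. Only the page size of 10 articles comes from the text above.

```python
import math

PAGE_SIZE = 10  # one page holds 10 articles, as described above

# Hypothetical option names: (pages to fetch, prompt again afterwards).
# None means "all remaining pages".
OPTIONS = {
    "stop": (0, False),
    "one_then_stop": (1, False),
    "one_then_prompt": (1, True),
    "ten_then_prompt": (10, True),
    "all": (None, False),
}


def total_pages(total_hits, page_size=PAGE_SIZE):
    """Number of pages needed to hold total_hits articles."""
    return math.ceil(total_hits / page_size)


def plan_next(choice, total_hits, pages_done):
    """Return (page numbers to fetch next, whether to prompt again).

    Page numbers are assumed to be zero-based.
    """
    remaining = max(total_pages(total_hits) - pages_done, 0)
    count, prompt_again = OPTIONS[choice]
    if count is None:
        count = remaining
    count = min(count, remaining)
    next_pages = list(range(pages_done, pages_done + count))
    # Prompting again only makes sense if pages are still left afterwards.
    return next_pages, prompt_again and count < remaining


pages, ask_again = plan_next("ten_then_prompt", total_hits=1234, pages_done=1)
print(len(pages), ask_again)
```

Capping the count at the number of remaining pages keeps the "10 more pages" option from requesting pages past the end of the result set.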

The NYT article database is large, and vague queries might return result sets on the order of thousands of articles. Review the API rate limits before running large queries to avoid exhausting the available quota.

Nyt article 4.png

The fetched articles are then parsed into topics and associations representing relations between articles, keywords, and so on.

Nyt article 5.png

In particular, the keyword structure used by the service is parsed into a topic structure associating articles with common keywords.
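
The idea of associating articles through shared keywords can be illustrated with plain Python dictionaries. This sketch does not use Wandora's topic map classes and is not the extractor's actual parsing code. The article layout (a `web_url` string and a `keywords` list of `name`/`value` pairs) is an assumption about the service response, and the sample data and `group_by_keyword` helper are hypothetical.

```python
from collections import defaultdict


def group_by_keyword(articles):
    """Map each (keyword type, keyword value) pair to the articles using it.

    Each resulting key plays the role of a keyword topic, and each list
    entry the role of an association between that keyword and an article.
    """
    index = defaultdict(list)
    for article in articles:
        for keyword in article.get("keywords", []):
            key = (keyword["name"], keyword["value"])
            index[key].append(article["web_url"])
    return dict(index)


# Hypothetical sample data in the assumed response layout.
articles = [
    {"web_url": "https://example.org/a1",
     "keywords": [{"name": "persons", "value": "Obama, Barack"},
                  {"name": "subject", "value": "Elections"}]},
    {"web_url": "https://example.org/a2",
     "keywords": [{"name": "persons", "value": "Obama, Barack"}]},
]

for key, urls in group_by_keyword(articles).items():
    print(key, urls)
```

Articles that share a keyword end up in the same list, which corresponds to two articles being linked through a common keyword topic.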

Nyt article 6.png
