R: Get articles and show in command line

Using R, get started with our Historical API. Fetch an article and print the API's response in the command line.

About this tutorial

What you'll learn

  • How to make a POST API call to our Historical API using R.
  • How filters work in the v1/articles endpoint.

What you'll need

  • Access to the Command Prompt (Windows) or Terminal (macOS).
  • R installed on your machine.
  • A text editor or, preferably, RStudio.

Step 1: Check for R

Before we get started, we'll check that we have R installed on our machine.

Open the Command Prompt / Terminal and enter the following command:

R --version

If you have R installed, running this command will return the R version.

If the version is lower than 3, we recommend upgrading to a newer version.
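The same check can also be run from inside an R session. The sketch below uses base R's getRversion(); the 3.0.0 threshold simply restates the recommendation above, and the variable names are only illustrative.

```r
# Compare the running R version against the minimum recommended above.
current_version <- getRversion()
meets_minimum <- current_version >= "3.0.0"

if (meets_minimum) {
  message("R ", as.character(current_version), " is recent enough for this tutorial.")
} else {
  message("R ", as.character(current_version), " is older than 3.0.0; consider upgrading.")
}
```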

Step 2: Install and Import the Libraries We Will Need

There are a number of libraries available in R that will make our lives easier. A library is pre-written code that lets us perform common tasks more quickly. In this tutorial, we'll be using two libraries: 'httr' to make API calls and 'jsonlite' to handle JSON data structures.

To install the libraries, run the following lines in the R console (start it by typing R in your command line) or in the RStudio console:

install.packages('httr')
install.packages('jsonlite')
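If you re-run your setup often, you may prefer to install only what is missing. The following is a minimal sketch using base R's requireNamespace(); missing_packages is a hypothetical helper name chosen for this example, not a function from any library.

```r
# Return the subset of `pkgs` that is not yet installed.
# `missing_packages` is a hypothetical helper, not part of any library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# For this tutorial you would check the two packages used below and
# install whatever is missing:
#   to_install <- missing_packages(c("httr", "jsonlite"))
#   if (length(to_install) > 0) install.packages(to_install)

# Demonstration with packages that ship with every R installation:
missing_packages(c("stats", "utils"))
```

The install step is left as a comment so that running the sketch never downloads anything by surprise.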

Let's open a text editor or RStudio. This is where we'll write our R script. A script is a set of instructions that your computer will know how to run using R.

At the top of your file, add the following to import the libraries we need.

library(httr)
library(jsonlite)

Step 3: Set our endpoint and API key details

We want to have an easy place to store things like our API key and the details of the endpoint we're using. We'll want to add the API key to the endpoint URL as a query parameter as well.

We'll do that by defining two variables:

  • api_key for our key
  • api_endpoint for the full URL of the endpoint, using the api_key as well

In your file, add the following lines:

api_key <- 'YOUR_API_KEY'
api_endpoint <- paste0('https://api.newswhip.com/v1/articles?key=', api_key)
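If you would rather not keep the key in the script itself, one option is to read it from an environment variable. This is a sketch under assumptions: NEWSWHIP_API_KEY is a variable name made up for this example, and the Sys.setenv() line is only there so the snippet runs on its own.

```r
# Read the key from an environment variable instead of the script.
# NEWSWHIP_API_KEY is a name assumed for this example; use any name you like.
# For a real run, set it outside the script (for example in ~/.Renviron).
Sys.setenv(NEWSWHIP_API_KEY = "YOUR_API_KEY")  # placeholder for demonstration only

api_key <- Sys.getenv("NEWSWHIP_API_KEY")
if (!nzchar(api_key)) {
  stop("NEWSWHIP_API_KEY is not set.")
}

# Same endpoint construction as in this step.
api_endpoint <- paste0("https://api.newswhip.com/v1/articles?key=", api_key)
print(api_endpoint)
```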

Step 4: Define our query

Next up, we're going to tell our script what parameters we want to send when we make our request.

The articles endpoint lets us filter the results on all kinds of things (you'll find the full reference in the /articles documentation). In this example, we're going to:

  • Search for happiness
  • Restrict our result to just one entry

We'll write the query that we'll need to run, and assign that to a variable called data. We do that so that we can reuse the query whenever we like, just by referring to the data variable.

It's very important to note the formatting of this query. It is made up of words, numbers, and some punctuation symbols.

  • Any word you use in the query string has to be surrounded by double quotes. This signals to the API how it should treat the information you are passing.
  • We also add a backslash \ before every double quote, so "happiness" becomes \"happiness\". The backslash tells R to treat the quotation mark as a literal character in the string; it is required inside a double-quoted R string and optional but harmless inside the single-quoted strings used here. Quotation marks are also special characters in our API. You can find out more in our Lucene query strings article.
  • Numerical values can be used without any extra symbols.
  • Boolean values and operators are keywords such as true and false. For these, the casing matters: they should all be lower case, unlike TRUE and FALSE in R.
  • And, as you will see, everything needs to be wrapped in curly brackets, e.g. {}

So, here is the query we're going to use:

data <- '{\"filters\": [\"happiness\"], \"size\": 1}'
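To see what the escaping does, and as an alternative to typing the JSON by hand, here is a small sketch using jsonlite's toJSON(). It assumes the API accepts compact JSON without spaces, as any standard JSON parser would; the variable names escaped, plain, and query are only illustrative.

```r
library(jsonlite)

# The escaped and unescaped spellings produce the same R string,
# because \" inside a string is just a literal double quote.
escaped <- '{\"filters\": [\"happiness\"], \"size\": 1}'
plain <- '{"filters": ["happiness"], "size": 1}'
identical(escaped, plain)

# Build the same query from an R list. list("happiness") keeps the
# filter as a JSON array; auto_unbox = TRUE writes size as 1, not [1].
query <- list(filters = list("happiness"), size = 1)
data <- toJSON(query, auto_unbox = TRUE)
print(data)
```

Building the body from a list lets jsonlite handle the quoting, which helps once the query grows beyond a couple of fields.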

Step 5: Run the API query

Next, we'll run the query.

POST is a command from the httr library that handles our API request. We'll assign the output of the request to a variable that we'll call r.

We're also going to use the command stop_for_status(), again from httr, to get a more meaningful error message if there is some problem with our query. This is a really helpful step in understanding what to do if something goes wrong.

r <- POST(api_endpoint, body = data)
stop_for_status(r)
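If you want the error to include whatever the API sent back, the request can be wrapped in a small function. This is a sketch, not the only way to do it: post_query is a hypothetical name, and it assumes the API returns a readable message in the body of a failed response, which this tutorial does not confirm.

```r
# `post_query` is a hypothetical wrapper, not part of httr.
# It sends the request and, on an HTTP error, stops with the status
# code and the raw response body so the cause is easier to spot.
post_query <- function(endpoint, body) {
  r <- httr::POST(endpoint, body = body)

  if (httr::http_error(r)) {
    stop(
      "Request failed with HTTP status ", httr::status_code(r), ": ",
      httr::content(r, "text", encoding = "UTF-8"),
      call. = FALSE
    )
  }

  r
}

# Usage, with the variables defined in the earlier steps:
#   r <- post_query(api_endpoint, data)
```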

Step 6: Extracting the response text

So far, we've written the code to define an API query and execute it. Now we need to show the output in the command prompt.

To do this, we'll do two things:

  • We'll extract the text stored in r, parse it, and assign the articles it contains to a new variable called response.
  • We'll print that text to the command prompt you're using.

The data we get back from the API is in a format called JSON, so we need to use the fromJSON command from jsonlite to transform the data into something more human-readable. Then we can print our results.

response <- fromJSON(content(r, "text", encoding = "UTF-8"))$articles

print(response)
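To get a feel for what fromJSON returns without calling the API, you can parse a hand-written stand-in. The sample below is an assumption for illustration: its two field names and values are borrowed from the example output in Step 8, and a real response will likely contain more fields.

```r
library(jsonlite)

# A hand-written stand-in for the API's reply, not real API output.
sample_json <- paste0(
  '{"articles": [{"headline": ',
  '"People Who Put Up Christmas Decorations Early Are Happier, Says Expert", ',
  '"predicted_interactions": 52994}]}'
)

# fromJSON turns the array of article objects into a data frame.
articles <- fromJSON(sample_json)$articles

str(articles)      # one row per article, one column per field
articles$headline  # pull out a single column
```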

Step 7: Write a function

Let's make all this work a little easier to re-use by rewriting it all as a function.

Making a function like this means we can copy this piece of code into other work and easily re-use everything we've learned.

We'll write the function so that each call can:

  • search for a different keyword,
  • use a different API key (so anyone you share the work with can use their own key, and you don't have to share yours), and
  • set a different limit on the number of articles returned.

get_newswhip_articles <- function(api_key, keyword, limit) {

  api_endpoint <- paste0('https://api.newswhip.com/v1/articles?key=', api_key)

  data <- paste0('{\"filters\": [\"',
                 keyword, '\"], \"size\": ',
                 limit, ', \"find_related\": false}')

  r <- httr::POST(api_endpoint, body = data)
  httr::stop_for_status(r)

  jsonlite::fromJSON(httr::content(r, "text", encoding = "UTF-8"))$articles

}
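Pasting the keyword straight into the JSON string works for simple words, but a keyword that itself contains a double quote would produce invalid JSON. One possible variation, sketched here with a hypothetical helper called build_query, lets jsonlite do the escaping and checks the arguments first.

```r
# `build_query` is a hypothetical helper, not part of httr or jsonlite.
# It checks its inputs and builds the same three-field query as the
# function above, letting jsonlite escape the keyword.
build_query <- function(keyword, limit) {
  stopifnot(is.character(keyword), length(keyword) == 1)
  stopifnot(is.numeric(limit), length(limit) == 1, limit >= 1)

  jsonlite::toJSON(
    list(filters = list(keyword), size = limit, find_related = FALSE),
    auto_unbox = TRUE
  )
}

# A keyword containing double quotes still yields valid JSON.
build_query('say "cheese"', limit = 1)
```

The result could be passed as the body argument of POST in place of the pasted string.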

Step 8: Result!

We can now run our function and print out some results.

Below I've limited the print to only show the headline for each article and the volume of social interactions the article is predicted to get.

happiness <- get_newswhip_articles(api_key = 'YOUR_API_KEY', keyword = 'happiness', limit = 1)
cats <- get_newswhip_articles(api_key = 'YOUR_API_KEY', keyword = 'cats', limit = 1)

print(happiness[c("headline", "predicted_interactions")])
print(cats[c("headline", "predicted_interactions")])

The results will look something like this:

##                                                                 headline
## 1 People Who Put Up Christmas Decorations Early Are Happier, Says Expert
##   predicted_interactions
## 1                  52994
##                                                                                                 headline
## 1 Dream job: A cat sanctuary is seeking a caretaker to live on a Greek island and look after its 55 cats
##   predicted_interactions
## 1                 371430
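Because each result is a data frame, results from several searches can be stacked and sorted. The sketch below rebuilds the two results by hand from the values printed above so that it runs without an API key; it assumes both results share the same columns, which may not hold for every field a real response contains.

```r
# Stand-ins for the two results, using the values printed above.
happiness <- data.frame(
  headline = "People Who Put Up Christmas Decorations Early Are Happier, Says Expert",
  predicted_interactions = 52994,
  stringsAsFactors = FALSE
)
cats <- data.frame(
  headline = paste(
    "Dream job: A cat sanctuary is seeking a caretaker to live on",
    "a Greek island and look after its 55 cats"
  ),
  predicted_interactions = 371430,
  stringsAsFactors = FALSE
)

# Tag each result with its search keyword, stack them, and sort
# from most to fewest predicted interactions.
happiness$keyword <- "happiness"
cats$keyword <- "cats"
combined <- rbind(happiness, cats)
ranked <- combined[order(combined$predicted_interactions, decreasing = TRUE), ]

print(ranked[c("keyword", "predicted_interactions")])
```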

Resources

Useful documentation

The full script

You can download the rmarkdown file here.

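library(httr)
library(jsonlite)

#' Search for articles from NewsWhip which include a specific keyword
#'
#' @param api_key Your NewsWhip API key.
#' @param keyword The keyword to search for.
#' @param limit The maximum number of articles to return.
#' @importFrom httr POST stop_for_status content
#' @importFrom jsonlite fromJSON
#' @export
get_newswhip_articles <- function(api_key, keyword, limit) {

  # Build the endpoint URL with the API key as a query parameter
  api_endpoint <- paste0('https://api.newswhip.com/v1/articles?key=', api_key)

  # Assemble the JSON query
  data <- paste0('{\"filters\": [\"',
                 keyword, '\"], \"size\": ',
                 limit, ', \"find_related\": false}')

  # Send the request and stop with a clear error if it failed
  r <- POST(api_endpoint, body = data)
  stop_for_status(r)

  # Parse the JSON response and return the articles
  fromJSON(content(r, "text", encoding = "UTF-8"))$articles

}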

cats <- get_newswhip_articles(api_key = 'YOUR_API_KEY', keyword = 'cats', limit = 1)