Objective

To create a believable President Barack Obama speech generator in R using a selection of his speeches, addresses, and interviews as training data.

Rules

Generating a speech by randomly sampling blocks of text from the provided documents is technically a valid approach (albeit not an impressive one), with one caveat: each block must be no longer than one sentence.
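As a point of reference, the baseline approach described above could be sketched as follows. This is only an illustration under stated assumptions: the `corpus` vector is a toy stand-in for the provided documents, the names `split_sentences` and `sample_speech` are hypothetical, and the punctuation-based splitting rule is a simplification that will mis-split abbreviations.

```r
# Toy stand-in for the contest documents (assumption: not the real corpus)
corpus <- c(
  "There are only so many shortcuts. Ultimately, we have to change the law.",
  "And people have to remain focused on that."
)

# Split on whitespace that follows sentence-ending punctuation
split_sentences <- function(text) {
  pieces <- unlist(strsplit(text, "(?<=[.!?])\\s+", perl = TRUE))
  pieces <- trimws(pieces)
  pieces[nzchar(pieces)]
}

sentences <- split_sentences(corpus)

# Baseline generator: every sampled block is exactly one sentence
sample_speech <- function(n, pool = sentences) {
  sample(pool, size = n, replace = TRUE)
}

set.seed(1)
sample_speech(3)
```

Because each sampled block is a single sentence, this stays within the rule, though it only reproduces sentences that already exist in the training text.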

Data

To train your algorithms, we have prepared 8 text documents containing a selection of Obama’s words from various speeches and interviews. The documents are available as an 82 KB ZIP archive.

Submission

Each submission must be emailed to Mikhail Popov (mikhail@mpopov.com) by 11:59 pm on May 20th as a single R script that, when run, uses the documents we provided (above) to create a self-contained function named speech. Given a number n, this function must generate an Obama speech that is n sentences long.

speech must not rely on any objects outside of it. All objects that are not speech will be removed from the global environment before speech is used to generate text. See instructions below for creating a self-contained function.

Example

speech(3)
## [1] "There are only so many shortcuts."         
## [2] "Ultimately, we have to change the law."    
## [3] "And people have to remain focused on that."

Evaluation

The submissions will be evaluated by the Pittsburgh useR group organizers (for code readability, performance, and memory usage) and competing teams (blind peer review).
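The announcement does not say how performance and memory usage are measured. One way to get a rough idea before submitting is sketched below; it assumes base R's system.time and object.size are adequate proxies, and the generator shown (`make_speech` with a repeated sentence as its pool) is a hypothetical stand-in, not a real entry.

```r
# Hypothetical stand-in for a submitted generator (assumption, not a real entry)
make_speech <- function(pool) {
  force(pool)
  function(n) sample(pool, size = n, replace = TRUE)
}
speech <- make_speech(rep("Ultimately, we have to change the law.", 1000))

# Run time for generating 100 sentences
timing <- system.time(out <- speech(100))
timing[["elapsed"]]

# Memory held by each object captured in the function's environment, in bytes
captured <- environment(speech)
sizes <- vapply(
  ls(captured),
  function(nm) as.numeric(object.size(get(nm, envir = captured))),
  numeric(1)
)
sizes
```

The sizes are taken from the objects in the function's environment because that is where a self-contained function keeps its data.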

Weighted Scoring

Prize

RStudio is a trademark of RStudio, Inc. A one (1) year subscription to shinyapps.io Standard Edition (a $1,100 value): Unlimited Applications, 1,000 Active Hours, Authentication, Multiple Instances, and Email Support.

Runners up will get a variety of prizes, including Hands-On Programming with R by Garrett Grolemund.


Winners

We received a single submission from Taylor Pospisil and Lee Richardson, which can be found at https://github.com/Pittsburgh-useR-Group/RObama/tree/master/submission.


Instructions for Creating Self-Contained Functions 

You can use the following template and accompanying example for creating a function that contains all the data and models it needs:

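make <- function(Object1,Object2) {
  # Force evaluation now so the values are stored in this function's environment
  force(Object1)
  force(Object2)
  # The returned function keeps a reference to that environment
  return(function(n=NULL) {
    # Look the objects up through the enclosing environment, not the global one
    obj1 <- get('Object1',environment())
    obj2 <- get('Object2',environment())
    str(obj1)
    str(obj2)
    # List the variables local to this call
    ls()
  })
}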

Let’s see it in action:

set.seed(0)
str(x <- rnorm(10))
##  num [1:10] 1.263 -0.326 1.33 1.272 0.415 ...
str(y <- rnorm(10))
##  num [1:10] 0.764 -0.799 -1.148 -0.289 -0.299 ...
speech <- make(x,y)
rm(x,y,make)
ls()
## [1] "speech"
speech()
##  num [1:10] 1.263 -0.326 1.33 1.272 0.415 ...
##  num [1:10] 0.764 -0.799 -1.148 -0.289 -0.299 ...
## [1] "n"    "obj1" "obj2"
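
The same pattern can carry a fitted model instead of raw vectors. The sketch below is one possible illustration, not the organizers' or any team's method: it builds a first-order, word-level Markov chain from a toy training vector and returns a function that depends only on the captured transition table. The names `make_markov`, `train`, and `transitions`, and the `<s>` and `</s>` markers, are assumptions made for this example.

```r
# Toy training text (assumption: stand-in for the contest documents)
train <- c(
  "There are only so many shortcuts.",
  "Ultimately, we have to change the law.",
  "And people have to remain focused on that."
)

make_markov <- function(sentences) {
  # Tokenize each sentence and add start/end markers
  tokens <- lapply(strsplit(sentences, "\\s+"), function(w) c("<s>", w, "</s>"))
  from <- unlist(lapply(tokens, function(w) head(w, -1)))
  to   <- unlist(lapply(tokens, function(w) tail(w, -1)))
  # Transition table: for each word, the words observed to follow it
  transitions <- split(to, from)
  # Drop intermediates so the closure carries only the table
  rm(sentences, tokens, from, to)

  function(n = 1) {
    one_sentence <- function() {
      words <- character(0)
      current <- "<s>"
      repeat {
        candidates <- transitions[[current]]
        current <- candidates[sample.int(length(candidates), 1)]
        if (current == "</s>") break
        words <- c(words, current)
      }
      paste(words, collapse = " ")
    }
    vapply(seq_len(n), function(i) one_sentence(), character(1))
  }
}

speech <- make_markov(train)
rm(train, make_markov)
speech(3)
```

Since the chain moves one word at a time, no copied block is longer than a single word, and the generator keeps working after `train` and `make_markov` are removed.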