Training and Testing a Prophet Model

As I start looking for non-academic positions, I wanted to practice forecasting, as I didn’t really have much experience with these types of models. NOTE: This is for practicing forecasting skills, and you should not trust this model with your own stocks. After plenty of reading, I finally have some understanding of how to utilize these models. This post started because even after a BA, two master’s degrees, and a doctorate, my brother still has no clue what I do. He, along with most of my family, thinks I am a Clinical Psychologist.

So to try and make my brother understand what I do, I thought I would show him with something that he has become interested in recently: stocks. For this post, I’ll

  • load stock data from Google Finance

  • visualize some data

  • train a Prophet model

  • test that Prophet model

  • see how well that model predicts other stock data

Below are all the packages I used.

library(tidyverse)
library(tidymodels)
library(prophet)
library(lubridate)
library(modeltime)
library(timetk)

Loading Data

To load the Google Finance data, I decided to pick a stock that my brother owns, which in this case was JetBlue. A cool feature of Google Finance and Google Sheets is that you can use the following formula in the first cell of the first column of a Google Sheet: =GOOGLEFINANCE("JBLU", "price", DATE(2000,1,1), DATE(2025, 1, 1), "DAILY") and it will give you the dates and stock closing values for whatever period you’d like. The example above pulls Google financial data for JBLU, the ticker symbol for JetBlue. It provides the price of the stock from the first day that there is data on JetBlue, which in this case is April 12th, 2002. You can also choose the sampling interval for the stock prices; I decided to look at daily data.
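Swapping the last argument changes the sampling interval; for example, "WEEKLY" is the other interval Google’s GOOGLEFINANCE documentation lists alongside "DAILY":

=GOOGLEFINANCE("JBLU", "price", DATE(2000,1,1), DATE(2025, 1, 1), "WEEKLY")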

JetBlue Sheet

Here I have a copy of my Google Sheet for JetBlue that I will use to train and test my Prophet model. Instead of keeping a .csv file on my local machine, I decided to keep this on Google Drive so that it constantly updates through the Google Finance function. This meant that I had to use the googlesheets4 package to load the data from the Google Sheet. I also created a renamed copy of the date column (ds) and converted it from a date-time variable to a date variable.

googlesheets4::gs4_deauth()

theme_set(theme_light())

jet <- 
  googlesheets4::read_sheet('https://docs.google.com/spreadsheets/d/1SpRXsC3kXDaQLUfC6cPIOvsqxDF6updhgHRJeT8PTog/edit#gid=0') %>% 
  janitor::clean_names() %>% 
  mutate(ds = as_date(date))

Cleaning Up the Data

Based on some visualizations below, I also decided to create some additional variables from the date variable. Specifically, I used lubridate's wday() function to create a new variable with the day of the week for each date. I also used the ts_clean_vec function from timetk to clean outliers from the stock price values. The function has additional arguments, like applying a Box-Cox transformation, but that is for a multiplicative trend, which this series does not appear to have since the variation in the outcome does not grow exponentially. I’ll also set 2002 as the reference level for the year variable and make sure that my data is arranged by date.

jetblue <- jet %>% 
  mutate(actual_day = wday(ds,
                           label = TRUE),   # day of the week
         clean = ts_clean_vec(close)) %>%   # outlier-cleaned closing prices
  separate(col = date,
           into = c('year_num', 'month_num', 'day_num'),
           sep = '-') %>% 
  mutate(year_num = as.factor(year_num),
         year_num = relevel(year_num, ref = '2002')) %>% 
  separate(col = day_num,                   # split the time portion off the day
           into = c('day_num', 'drop'),
           sep = ' ') %>%
  mutate(day_num = as.numeric(day_num),
         month_num = as.factor(month_num)) %>% 
  select(-drop) %>% 
  arrange(ds)

Visualizing Data

Starting with some quick visualizations, we can see that the only period where there is a real difference in the variation of the stock prices is the beginning of 2020. I wonder what that could have been, friends.

jetblue %>% 
  group_by(year_num, month_num) %>% 
  summarize(var_value = sd(close)^2) %>% 
  ungroup() %>% 
  ggplot(aes(month_num, var_value)) + 
  geom_point() + 
  facet_wrap(vars(year_num))

Next, we can look at histograms of the outcome of interest. The red dashed lines mark three standard deviations from the mean, and we can see that there are potential outliers in the original stock prices. We can also see that cleaning the variable removed those potential outliers.

only_numeric <- jetblue %>% 
  select(close, clean)

map2(only_numeric,
     names(only_numeric),
     ~ggplot(data = only_numeric,
             aes(.x)) + 
       geom_histogram(color = 'white',
                      fill = 'dodgerblue') +
       geom_vline(xintercept = mean(.x) + 3 * sd(.x), # upper outlier cutoff
                  color = 'red',
                  size = 1.25,
                  linetype = 2) + 
       geom_vline(xintercept = mean(.x) - 3 * sd(.x), # lower outlier cutoff
                  color = 'red',
                  size = 1.25,
                  linetype = 2) + 
       labs(title = .y))
## $close

## 
## $clean

I’ll also be making a lot of use of the purrr package and its map functions, which are part of the tidyverse. Looking at the time series visualization from timetk's plot_time_series function, we can again see that cleaning the stock prices removed the outliers. So from here on out, I’ll be using the cleaned stock prices.

map2(only_numeric,
     names(only_numeric),
     ~only_numeric %>% 
       plot_time_series(jetblue$ds,
                        .x,
                        .interactive = FALSE) + 
       labs(title = .y))
## $close

## 
## $clean

We can also look for anomalies, or points that deviate from the trend. Using the plot_anomaly_diagnostics function from the timetk package, I can see all the anomalies in the data. I also used tk_anomaly_diagnostics with ggplot to create my own visualization of the same data. Lastly, we’ll deal with those anomalies by removing them from the dataset. This is not too much of a problem because the Prophet model should be able to handle them fairly easily.

jetblue %>% 
  plot_anomaly_diagnostics(ds,
                           clean,
                           .facet_ncol = 1,
                           .interactive = FALSE)

jetblue %>% 
  tk_anomaly_diagnostics(ds,
                         clean) %>% 
  ggplot(aes(ds, observed)) + 
  geom_line() + 
  geom_point(aes(color = anomaly)) +
  viridis::scale_color_viridis(option = 'D',
                               discrete = TRUE,
                               begin = .5,
                               end = 0)

anomaly <- jetblue %>%
  tk_anomaly_diagnostics(ds,
                         clean)

jetblue <- left_join(jetblue, anomaly) %>%
  filter(anomaly != 'Yes')
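A quick aside: left_join() with no by argument joins on every column the two tables share and prints a message naming them. Since ds is the only column these two tables share, the join above is equivalent to this more explicit version:

jetblue <- left_join(jetblue, anomaly, by = 'ds') %>%
  filter(anomaly != 'Yes')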

We can also look into additional regressors for the model by examining seasonality. We can see some fluctuation in stock prices across the years, so we’ll include the year variable as another regressor on the stock prices.

jetblue %>% 
  plot_seasonal_diagnostics(ds,
                            clean,
                            .interactive = FALSE)

Training the Prophet Model

Before we begin, I’m going to designate 10 cores to run any models in parallel.

set.seed(05262022)

parallel::detectCores()
## [1] 12
parallel_start(10,
               .method = 'parallel')
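The models from here on will run in parallel. When you’re finished, modeltime's companion function releases those workers:

# Release the parallel workers started above
parallel_stop()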

First, instead of the normal initial_split used for training and testing splits, we’ll use the initial_time_split function from tidymodels, which keeps the rows in time order and puts the first portion of the data (75% by default) into the training set and the remainder into the testing set.

set.seed(05262022)
jet_split <- initial_time_split(jetblue)
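If you’d rather have a strict 80/20 split, initial_time_split() takes a prop argument (a minimal sketch with a hypothetical object name; the split is deterministic, so no seed is needed):

# Hypothetical 80/20 time-ordered split; prop defaults to 3/4
jet_split_80 <- initial_time_split(jetblue, prop = 0.8)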

Prophet Model Function

I decided to create my own Prophet function to be able to use it for both training the model and testing it. In this function, I’ve also included parameters that can be changed to see if the model performs better or worse. Lastly, the train = TRUE argument allows us to practice with the training dataset; then, when we’re happy with the model, we can set it to FALSE to test the model. For our model, we’ll be predicting stock prices with date and comparing each year to the reference year (2002).

prophet_mod <- function(splits,
                        changepoints = .05,
                        seasonality = .01,
                        holiday = .01,
                        season_type = 'additive',
                        day_season = 'auto',
                        week_season = 'auto',
                        year_season = 'auto',
                        train = TRUE){
  library(tidyverse)
  library(tidymodels)
  library(modeltime)
  library(prophet)
  
  # training and testing portions of the split
  analy_data <- analysis(splits)
  assess_data <- assessment(splits)
  
  model <- prophet_reg() %>% 
    set_engine(engine = 'prophet',
               verbose = TRUE) %>% 
    set_args(prior_scale_changepoints = changepoints,
             prior_scale_seasonality = seasonality,
             prior_scale_holidays = holiday,
             season = season_type,
             seasonality_daily = day_season,
             seasonality_weekly = week_season,
             seasonality_yearly = year_season) %>% 
    fit(clean ~ ds + year_num, 
        data = analy_data)
  
  if(train == TRUE){
    # calibrate and score on the training data
    train_cali <- model %>% 
      modeltime_calibrate(new_data = analy_data)
    
    train_acc <- train_cali %>% 
      modeltime_accuracy()
    
    return(list(train_cali, train_acc))
  }
  
  else{
    # calibrate and score on the testing data
    test_cali <- model %>% 
      modeltime_calibrate(new_data = assess_data)
    
    test_acc <- test_cali %>% 
      modeltime_accuracy()
    
    return(list(test_cali, test_acc))
  }
}

It is worth noting that I’m using the modeltime package to run the Prophet model because I believe it is easier to use (especially for later steps) than the prophet package directly, but both can be implemented in this function. Let’s try running this model with some random parameter values I chose from the Prophet website, before realizing that the modeltime tuning parameters are on a log scale.
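To see those scales for yourself, modeltime exports dials parameter objects for these arguments (they are also used for tuning below); printing them shows the ranges on the transformed (log10) scale. A quick sketch:

# Printing the dials objects shows each parameter's transformed range
prior_scale_changepoints()
prior_scale_seasonality()
prior_scale_holidays()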

set.seed(05262022)
baseline <- prophet_mod(jet_split,
                 train = TRUE) %>% 
  pluck(2)
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
## Converting to Modeltime Table.
baseline
## # A tibble: 1 x 9
##   .model_id .model_desc           .type    mae  mape  mase smape  rmse   rsq
##       <int> <chr>                 <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1         1 PROPHET W/ REGRESSORS Fitted 0.942  8.90  4.42  8.73  1.30 0.950
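For reference, here is roughly what the two headline metrics compute. This is a minimal sketch with hypothetical actual/predicted vectors (not the model output above), using the yardstick functions that modeltime_accuracy() draws on:

# Hypothetical vectors, only to illustrate the metrics
library(yardstick)

actual    <- c(13.2, 13.5, 13.1, 12.8, 12.9)
predicted <- c(13.0, 13.4, 13.3, 12.9, 12.7)

rmse_vec(truth = actual, estimate = predicted)        # typical size of an error
mase_vec(truth = actual, estimate = predicted, m = 1) # error scaled by a naive lag-1 forecast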

So with the model, we can see that the Mean Absolute Scaled Error (MASE) is 4.4198696 and the Root Mean Square Error (RMSE) is 1.3000727. Not bad for an initial run. Let’s look at how the model fits the training data.

prophet_mod(jet_split,
                 train = TRUE) %>%  
  pluck(1) %>% 
  modeltime_forecast(new_data = training(jet_split),
                     actual_data = jetblue) %>% 
  plot_modeltime_forecast(.interactive = FALSE) +
  labs(title = 'Prophet Baseline Model')

So the model appears to follow the trend line. We’ll try to tune some of these parameters to see if we can make the model better.

Tuning the Model

Now, I’ll tune the prior scale values for the model. I’ll use grid_latin_hypercube from the dials package in tidymodels to choose 5 sets of parameter values to run. I’m also using rolling_origin from the rsample package in tidymodels because we are working with time series data: instead of creating random samples, it builds each resample from consecutive observations.

set.seed(05262022)

proph_model <- prophet_reg() %>%
  set_engine(engine = 'prophet',
             verbose = TRUE) %>%
  set_args(prior_scale_changepoints = tune(),
           prior_scale_seasonality = tune(),
           prior_scale_holidays = tune(),
           season = 'additive',
           seasonality_daily = 'auto',
           seasonality_weekly = 'auto',
           seasonality_yearly = 'auto')

proph_rec <-
  recipe(clean ~ ds + year_num,
         data = training(jet_split))


set.seed(05262022)
train_fold <-
  rolling_origin(training(jet_split),
                 initial = 270,      # observations in the first analysis set
                 assess = 90,        # observations in each assessment set
                 skip = 30,          # spacing between resamples
                 cumulative = TRUE)  # let the analysis set grow over time

set.seed(05262022)
grid_values <-
  grid_latin_hypercube(prior_scale_changepoints(),
                       prior_scale_seasonality(),
                       prior_scale_holidays(),
                       size = 5)

set.seed(05262022)
proph_fit <- tune_grid(object = proph_model,
                       preprocessor = proph_rec,
                       resamples = train_fold,
                       grid = grid_values,
                       control = control_grid(verbose = TRUE,
                                              save_pred = TRUE,
                                              allow_par = TRUE))


tuned_metrics <- collect_metrics(proph_fit)
tuned_metrics %>%
  filter(.metric == 'rmse') %>% 
  arrange(mean)

saveRDS(tuned_metrics,
        file = 'tuned_metrics.rds')
metrics <-
  readr::read_rds('C:/Users/cpppe/Desktop/github_projects/personal_website/content/post/2022-06-02-prophet-model/tuned_metrics.rds')

metrics %>% 
  filter(.metric == 'rmse') %>% 
  arrange(mean)
## # A tibble: 5 x 9
##   prior_scale_change~ prior_scale_sea~ prior_scale_hol~ .metric .estimator  mean
##                 <dbl>            <dbl>            <dbl> <chr>   <chr>      <dbl>
## 1             3.53             0.0170           1.12    rmse    standard    2.43
## 2             0.884           36.4              0.0131  rmse    standard    2.56
## 3             0.00139          0.00166          0.00172 rmse    standard    2.56
## 4             0.0549           0.261            0.231   rmse    standard    2.66
## 5            43.0              3.80            12.2     rmse    standard    2.93
## # ... with 3 more variables: n <int>, std_err <dbl>, .config <chr>

For the sake of not waiting for this to render, I decided to save the metrics gathered from the tuned Prophet model to an RDS file. We can see that the best RMSE value was 2.4252669, with a prior scale changepoint value of 3.5347457, a prior scale seasonality value of 0.0170306, and a prior scale holiday value of 1.1198542.

Final Training Model

I then decided to run the Prophet model on the training dataset with the new parameter values.

final_train <- prophet_mod(jet_split,
                 changepoints = 3.53,
                 seasonality = .017,
                 holiday = 1.12,
                 train = TRUE) %>%  
  pluck(2)

final_train
## # A tibble: 1 x 9
##   .model_id .model_desc           .type    mae  mape  mase smape  rmse   rsq
##       <int> <chr>                 <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1         1 PROPHET W/ REGRESSORS Fitted 0.822  7.60  3.86  7.50  1.20 0.957
prophet_mod(jet_split,
            changepoints = 3.53,
            seasonality = .017,
            holiday = 1.12,
            train = TRUE) %>%  
  pluck(1) %>% 
  modeltime_forecast(new_data = training(jet_split),
                     actual_data = jetblue) %>% 
  plot_modeltime_forecast(.interactive = FALSE) +
  labs(title = 'JetBlue Stock Prices - Training Model')

We can see that when using the whole training set, we have an RMSE of 1.1989755 and a MASE of 3.8565717, so both metrics improved slightly.

Testing the Model

Finally, let’s test our Prophet model to see how well the model fits.

prophet_mod(jet_split,
            changepoints = 3.53,
            seasonality = .017,
            holiday = 1.12,
            train = FALSE) %>%
  pluck(1) %>% 
  modeltime_forecast(new_data = testing(jet_split),
                     actual_data = jetblue) %>% 
  plot_modeltime_forecast(.interactive = FALSE) +
  labs(title = 'JetBlue Stock Prices - Testing Model')

test_model <- prophet_mod(jet_split,
            changepoints = 3.53,
            seasonality = .017,
            holiday = 1.12,
            train = FALSE) %>%
  pluck(2)

test_model
## # A tibble: 1 x 9
##   .model_id .model_desc           .type   mae  mape  mase smape  rmse   rsq
##       <int> <chr>                 <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1         1 PROPHET W/ REGRESSORS Test   18.2  124.  64.7  67.2  20.7 0.461

Well, that doesn’t look very good, and we can see that with the metrics. The MASE has gotten much worse (64.6905915) and so has the RMSE (20.7184712).

Forecasting Ahead a Year

Well, our model did not fit well to the testing data, but let’s see how the model looks when refit to the full data and forecasted forward a year. In a year, it seems that JetBlue stock will remain roughly around the same value. It is important to note, though, that the confidence intervals are large: with a 95% interval spanning from -28.39 (not possible for a stock price) to 52.49, there is not much confidence that JetBlue stock prices will remain where they are now in a year.

future <- jetblue %>% 
  future_frame(.length_out = '1 year', .bind_data = TRUE)

future <-
  future %>%
  select(-year_num, -month_num, -day_num) %>%
  mutate(date2 = ds) %>%
  separate(col = date2,
           into = c('year_num', 'month_num', 'day_num'),
           sep = '-') %>%
  mutate(year_num = as.factor(year_num),
         year_num = relevel(year_num, ref = '2002'),
         month_num = as.factor(month_num),
         day_num = as.numeric(day_num)) %>% 
  arrange(ds)

glimpse(future)
## Rows: 5,363
## Columns: 17
## $ close         <dbl> 13.33, 13.40, 13.57, 13.36, 13.10, 12.93, 12.45, 12.56, ~
## $ ds            <date> 2002-04-12, 2002-04-15, 2002-04-16, 2002-04-17, 2002-04~
## $ actual_day    <ord> Fri, Mon, Tue, Wed, Thu, Fri, Mon, Tue, Wed, Thu, Fri, M~
## $ clean         <dbl> 13.33, 13.40, 13.57, 13.36, 13.10, 12.93, 12.45, 12.56, ~
## $ observed      <dbl> 13.33, 13.40, 13.57, 13.36, 13.10, 12.93, 12.45, 12.56, ~
## $ season        <dbl> -0.0032169416, 0.0018169103, -0.0061626547, 0.0002063279~
## $ trend         <dbl> 13.40625, 13.41578, 13.42532, 13.43486, 13.44440, 13.453~
## $ remainder     <dbl> -0.073029136, -0.017601542, 0.150839468, -0.075068069, -~
## $ seasadj       <dbl> 13.33322, 13.39818, 13.57616, 13.35979, 13.09264, 12.933~
## $ remainder_l1  <dbl> -2.223341, -2.223341, -2.223341, -2.223341, -2.223341, -~
## $ remainder_l2  <dbl> 2.246437, 2.246437, 2.246437, 2.246437, 2.246437, 2.2464~
## $ anomaly       <chr> "No", "No", "No", "No", "No", "No", "No", "No", "No", "N~
## $ recomposed_l1 <dbl> 11.17969, 11.19426, 11.19582, 11.21173, 11.22842, 11.227~
## $ recomposed_l2 <dbl> 15.64947, 15.66404, 15.66560, 15.68150, 15.69819, 15.697~
## $ year_num      <fct> 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 20~
## $ month_num     <fct> 04, 04, 04, 04, 04, 04, 04, 04, 04, 04, 04, 04, 04, 05, ~
## $ day_num       <dbl> 12, 15, 16, 17, 18, 19, 22, 23, 24, 25, 26, 29, 30, 1, 2~
test_model1 <- prophet_mod(jet_split,
            changepoints = 3.53,
            seasonality = .017,
            holiday = 1.12,
            train = FALSE) %>%
  pluck(1)

test_model1 %>% 
  modeltime_refit(data = future) %>% 
  modeltime_forecast(new_data = future,
                     actual_data = jetblue) %>% 
  plot_modeltime_forecast(.interactive = FALSE) +
  labs(title = 'Forecasted JetBlue Stock Prices')

Testing the Algorithm on a Different Airline’s Stock Prices

Let’s take this a step further and see how well our algorithm fits a different airline’s stock price data. We will use the final Prophet model to see how well it makes predictions using all of American Airlines’ data.

link <- 'https://docs.google.com/spreadsheets/d/11DWSWLFXT84uGg_mBvVYJevQOsN7ghYovJefH87BJXc/edit#gid=0'

amer <- 
  googlesheets4::read_sheet(link) %>% 
  janitor::clean_names() %>% 
  mutate(ds = as_date(date))

american <-
  amer %>% 
  mutate(actual_day = wday(ds,
                           label = TRUE),
         clean = ts_clean_vec(close)) %>% 
  separate(col = date,
           into = c('year_num', 'month_num', 'day_num'),
           sep = '-') %>% 
  mutate(year_num = as.factor(year_num),
         year_num = relevel(year_num, ref = '2013')) %>% 
  separate(col = day_num,
           into = c('day_num', 'drop'),
           sep = ' ') %>%
  mutate(day_num = as.numeric(day_num),
         month_num = as.factor(month_num)) %>% 
  select(-drop) %>% 
  arrange(ds)


model <- prophet_reg() %>% 
    set_engine(engine = 'prophet',
               verbose = TRUE) %>% 
    set_args(prior_scale_changepoints = 3.53,
             prior_scale_seasonality = .017,
             prior_scale_holidays = 1.12,
             season = 'additive',
             seasonality_daily = 'auto',
             seasonality_weekly = 'auto',
             seasonality_yearly = 'auto') %>% 
    fit(clean ~ ds + year_num, 
        data = american)

model_cali <- model %>% 
  modeltime_calibrate(new_data = american)
    
model_cali %>% 
  modeltime_accuracy()
## # A tibble: 1 x 9
##   .model_id .model_desc           .type    mae  mape  mase smape  rmse   rsq
##       <int> <chr>                 <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1         1 PROPHET W/ REGRESSORS Fitted  1.73  6.20  2.73  6.11  2.20 0.968
future_amer <- american %>% 
  future_frame(.length_out = '1 year', .bind_data = TRUE)

future_amer <-
  future_amer %>%
  select(-year_num, -month_num, -day_num) %>%
  mutate(date2 = ds) %>%
  separate(col = date2,
           into = c('year_num', 'month_num', 'day_num'),
           sep = '-') %>%
  mutate(year_num = as.factor(year_num),
         year_num = relevel(year_num, ref = '2013'),
         month_num = as.factor(month_num),
         day_num = as.numeric(day_num))

model_cali %>% 
  modeltime_forecast(new_data = american,
                     actual_data = american) %>% 
  plot_modeltime_forecast(.interactive = FALSE)  + 
  labs(title = 'Predicted American Airlines Stock Prices')

model_cali %>% 
  modeltime_refit(data = future_amer) %>%
  modeltime_forecast(new_data = future_amer,
                     actual_data = american) %>% 
  plot_modeltime_forecast(.interactive = FALSE) + 
  labs(title = 'Forecasted American Airlines Stock Prices')
