Prophet is a library developed at Facebook for forecasting time-series data. The main motivation for developing the library is that current popular forecasting methods require training in statistical modelling to produce high-quality forecasts. Prophet, on the other hand, creates useful business forecasts and makes the resulting model easy to interpret.
In modelling trended data, two types of growth are often used: linear growth and saturating growth, both illustrated below. In linear growth the data steadily shifts upwards, whereas in saturating growth the increments decline rapidly as the data approaches a certain carrying capacity.
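To make the difference concrete, here is a minimal sketch of the two growth shapes using only the standard library. The function names and the parameter values are illustrative assumptions, and the saturating curve is the standard logistic function rather than a copy of Prophet's internal code.

```python
import math


def linear_trend(t, k, m):
    """Linear growth: the value rises by the same amount k at every step."""
    return k * t + m


def logistic_trend(t, cap, k, m):
    """Saturating (logistic) growth that levels off at the carrying capacity cap.

    k controls how fast the curve rises and m is the time of its midpoint.
    """
    return cap / (1.0 + math.exp(-k * (t - m)))


if __name__ == "__main__":
    # Illustrative parameters only: slope 2, capacity 100, midpoint at t = 10.
    for t in range(0, 21, 5):
        print(t, linear_trend(t, 2.0, 10.0), round(logistic_trend(t, 100.0, 0.5, 10.0), 2))
```

The linear series keeps gaining the same amount per step, while the logistic series gains less and less as it nears the capacity of 100.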
Prophet supports both types of trend modelling; the user chooses between them with the growth= option. Where Prophet really shines, though, is that the trend is allowed to change. Let's illustrate this with an example:
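A changing trend can be sketched as a piecewise-linear function whose slope is adjusted at a set of changepoints. The code below is a simplified, pure-Python illustration of that idea, not Prophet's implementation; the function name, the changepoint location and the slope values are all assumptions made for the example.

```python
def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Continuous piecewise-linear trend.

    k, m: base slope and intercept.
    changepoints: sorted times at which the slope changes.
    deltas: slope adjustment that applies from each changepoint onwards.
    """
    slope, offset = k, m
    for s, delta in zip(changepoints, deltas):
        if t >= s:
            slope += delta
            # Shift the intercept so the two segments meet at time s.
            offset -= s * delta
    return slope * t + offset


if __name__ == "__main__":
    # Slope 2 until t = 10, then slope 0.5 afterwards (illustrative values).
    for t in (0, 5, 10, 15, 20):
        print(t, piecewise_linear_trend(t, 2.0, 0.0, [10], [-1.5]))
```

In Prophet itself the flexibility of the trend is controlled through constructor arguments such as changepoint_prior_scale and n_changepoints; the sketch only shows the shape of the resulting curve.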
One of the most important results in signal processing is that any reasonably well-behaved periodic function can be represented as a sum of sine and cosine functions. Such a decomposition, known as a Fourier series, is illustrated in Figure 3. Increasing the number of terms, also known as the order of the approximation, lets the approximation follow finer detail.
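The effect of the order can be checked numerically. The sketch below approximates a square wave with its truncated Fourier series and measures the error for several orders; the choice of a square wave, the grid size and the helper names are assumptions made for illustration.

```python
import math


def square_wave(x):
    """Periodic target: +1 on the first half of each 2*pi period, -1 on the second."""
    return 1.0 if math.sin(x) >= 0 else -1.0


def fourier_partial_sum(x, order):
    """Fourier series of the square wave truncated at harmonic `order`.

    Only odd sine harmonics are non-zero: (4 / pi) * sum(sin(n x) / n).
    """
    return (4.0 / math.pi) * sum(
        math.sin(n * x) / n for n in range(1, order + 1, 2)
    )


def rms_error(order, points=1000):
    """Root-mean-square gap between the square wave and its approximation."""
    # Midpoint grid over one period, which avoids the jumps at 0 and pi.
    xs = [2.0 * math.pi * (i + 0.5) / points for i in range(points)]
    sq = sum((square_wave(x) - fourier_partial_sum(x, order)) ** 2 for x in xs)
    return math.sqrt(sq / points)


if __name__ == "__main__":
    for order in (1, 3, 9, 27):
        print(order, round(rms_error(order), 4))
```

The printed error shrinks as the order grows, which is the numerical counterpart of the curves getting closer to the target in the figure.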
The Prophet authors chose a default number of Fourier terms for each seasonality: 10 for yearly, 3 for weekly and 4 for daily. In addition, a custom seasonality period can be specified. Figure 4 shows a dataset that exhibits a 5-day seasonality: getting the period right is essential for a good fit, whilst the prior_scale parameters (which make the seasonality more lenient) aren't nearly as important. In general, the longer the period, the higher the order required for a good fit.
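As a rough sketch of what a seasonality of a given period and order amounts to, the helper below builds the sine and cosine regressors for one time value, and a second helper registers a 5-day seasonality on a Prophet model through its add_seasonality method. The helper names, the seasonality name "five_day" and the order 3 are assumptions chosen for illustration.

```python
import math


def fourier_features(t, period, order):
    """Sine/cosine regressors for one time value t (same units as period).

    Returns 2 * order values: sin and cos for harmonics 1..order.
    """
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        features.extend([math.sin(angle), math.cos(angle)])
    return features


def add_five_day_seasonality(model, fourier_order=3):
    """Register a 5-day seasonality on a Prophet model before it is fitted.

    `model` is expected to be a prophet.Prophet instance; the name and the
    order used here are illustrative, not recommended settings.
    """
    model.add_seasonality(name="five_day", period=5, fourier_order=fourier_order)
    return model


if __name__ == "__main__":
    # The features repeat exactly every `period` time units.
    print(fourier_features(2.0, 5.0, 2))
    print(fourier_features(7.0, 5.0, 2))
```

Because every regressor repeats with the stated period, a wrong period cannot be rescued by a higher order or a looser prior, which matches the observation about Figure 4.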
Just as there are two types of trend component, there are two types of seasonality: additive (the default, illustrated in Figure 4) and multiplicative. The official documentation illustrates the multiplicative counterpart really well. Finally, note that Figure 4 is missing the uncertainty around the seasonal component: to get it, one should switch from the default training method, MAP, to MCMC by specifying the mcmc_samples= parameter when constructing the model.
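The difference between the two seasonality types comes down to how the seasonal effect is combined with the trend. The sketch below shows both combinations on plain numbers; the function name and the values are assumptions, and the keyword dictionary at the end only illustrates how the corresponding options might be passed to the Prophet constructor.

```python
def combine(trend, seasonal, mode="additive"):
    """Combine a trend value with a seasonal effect.

    additive:       y = trend + seasonal        (seasonal in the units of y)
    multiplicative: y = trend * (1 + seasonal)  (seasonal as a fraction of trend)
    """
    if mode == "additive":
        return trend + seasonal
    if mode == "multiplicative":
        return trend * (1.0 + seasonal)
    raise ValueError(f"unknown mode: {mode!r}")


# Keyword arguments one might pass as Prophet(**prophet_kwargs).
# The value 300 is an arbitrary illustration, not a recommendation.
prophet_kwargs = {"seasonality_mode": "multiplicative", "mcmc_samples": 300}


if __name__ == "__main__":
    for trend in (100.0, 200.0):
        print(trend, combine(trend, 10.0), combine(trend, 0.1, "multiplicative"))
```

With an additive effect the seasonal swing stays the same size as the trend grows, whereas a multiplicative effect scales with the level of the series.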
Trying with real-world data
This dataset contains sales data aggregated per day and is particularly well-suited for the Prophet model, as it has a strong seasonal component that is commonly observed in sales records.
The function above plots the dataset and the predictions using just a few lines of code. The components of the fitted model can also be examined with ease (see the figure above).
Finally, the Prophet authors developed a new technique for diagnosing the model based on the "rolling origin" method. Below there's a plot of 4 fitted models, each trained on progressively more data. Each model then makes a forecast for the chunk of data that it wasn't trained on, and the absolute percentage error, abs(df['y'] - df['yhat']) / df['y'], is computed for every forecast point; averaging these values gives the MAPE. The plotted blue line is the 10% rolling average. We see that predictions are about 10% off, which is a great starting point!
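The rolling-origin procedure itself is easy to sketch without any forecasting library. In the code below a seasonal-naive forecaster stands in for the real model, purely as an assumption to keep the example self-contained; the function names, the 7-day period and the fold sizes are likewise illustrative.

```python
def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.10 means 10%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive_forecast(history, horizon, period=7):
    """Stand-in model: repeat the last observed seasonal cycle."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


def rolling_origin_mape(series, initial, horizon, step):
    """Train on series[:cutoff], score the next `horizon` points, move the cutoff."""
    scores = []
    cutoff = initial
    while cutoff + horizon <= len(series):
        train = series[:cutoff]
        test = series[cutoff:cutoff + horizon]
        scores.append(mape(test, seasonal_naive_forecast(train, horizon)))
        cutoff += step
    return scores


if __name__ == "__main__":
    # Synthetic weekly pattern with a mild upward drift (made-up numbers).
    series = [100.0 + 0.5 * i + 10.0 * (i % 7) for i in range(56)]
    print(rolling_origin_mape(series, initial=28, horizon=7, step=7))
```

Each entry in the result is the MAPE of one fold, so four cutoffs correspond to four progressively larger training sets. Prophet ships its own version of this procedure in the prophet.diagnostics module (cross_validation and performance_metrics), which is the natural choice when the model being diagnosed is a Prophet model.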