Example Data

The core.NeuralForecast class lets you efficiently fit multiple NeuralForecast models on large sets of time series. It operates on a pandas DataFrame df in which the unique_id and ds columns identify each series and its datestamps, and the y column holds the target variable. To assist development, this module declares datasets that are used throughout NeuralForecast’s unit tests.
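To make the expected input layout concrete, here is a minimal sketch of a long-format panel built by hand with pandas. The series names and values are made up for illustration; only the column names unique_id, ds, and y come from the description above.

```python
import pandas as pd

# Two hypothetical series stacked in long format: one row per (series, datestamp).
df = pd.DataFrame(
    {
        "unique_id": ["series_a"] * 3 + ["series_b"] * 2,
        "ds": list(pd.date_range("2024-01-01", periods=3, freq="D"))
        + list(pd.date_range("2024-01-01", periods=2, freq="D")),
        "y": [10.0, 11.5, 12.1, 3.0, 2.7],
    }
)

# Series may have different lengths; each one is identified by its unique_id.
lengths = df.groupby("unique_id").size()
print(lengths)
```

Stacking series this way keeps ragged panels in a single DataFrame without padding.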

1. Synthetic Panel Data

`generate_series`

generate_series(
    n_series,
    freq="D",
    min_length=50,
    max_length=500,
    n_temporal_features=0,
    n_static_features=0,
    equal_ends=False,
    seed=0,
)

Generate synthetic panel series. Generates n_series series of frequency freq with different lengths in the interval [min_length, max_length]. If n_temporal_features > 0, each series gets temporal features with random values. If n_static_features > 0, a static DataFrame is returned along with the temporal DataFrame. If equal_ends == True, all series end on the same date.

Parameters:

Name	Type	Description	Default
`n_series`	`int`	Number of series for synthetic panel.	required
`freq`	`str`	Frequency of the data; any pandas frequency alias. Defaults to “D”.	`'D'`
`min_length`	`int`	Minimal length of synthetic panel’s series. Defaults to 50.	`50`
`max_length`	`int`	Maximal length of synthetic panel’s series. Defaults to 500.	`500`
`n_temporal_features`	`int`	Number of temporal exogenous variables for synthetic panel’s series. Defaults to 0.	`0`
`n_static_features`	`int`	Number of static exogenous variables for synthetic panel’s series. Defaults to 0.	`0`
`equal_ends`	`bool`	If True, all series end at the same datestamp `ds`. Defaults to False.	`False`
`seed`	`int`	Random seed for reproducibility. Defaults to 0.	`0`

Returns:

Type	Description
`DataFrame`	Synthetic panel with columns [`unique_id`, `ds`, `y`] and any exogenous features.

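from neuralforecast.utils import generate_series

# Generate two synthetic series and show the first rows of each.
synthetic_panel = generate_series(n_series=2)
synthetic_panel.groupby('unique_id').head(4)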

temporal_df, static_df = generate_series(n_series=1000, n_static_features=2,
                                         n_temporal_features=4, equal_ends=False)
static_df.head(2)

2. AirPassengers Data

The classic Box & Jenkins airline data: monthly totals of international airline passengers, 1949 to 1960. Several forecasting libraries use it as a reference. Because the series shows a clear trend and seasonality, it offers a quick way to showcase a model’s predictive performance.

from neuralforecast.utils import AirPassengersDF

AirPassengersDF.head(12)

import matplotlib.pyplot as plt

# Plot the monthly AirPassengers series.
fig, ax = plt.subplots(1, 1, figsize = (20, 7))
plot_df = AirPassengersDF.set_index('ds')

plot_df[['y']].plot(ax=ax, linewidth=2)
ax.set_title('AirPassengers Forecast', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(prop={'size': 15})
ax.grid()

import numpy as np
import pandas as pd

n_static_features = 3
n_series = 5

static_features = np.random.uniform(low=0.0, high=1.0,
                        size=(n_series, n_static_features))
static_df = pd.DataFrame.from_records(static_features,
                   columns=[f'static_{i}' for i in range(n_static_features)])
static_df['unique_id'] = np.arange(n_series)

static_df

3. Panel AirPassengers Data

An extension of the classic Box & Jenkins airline data: monthly totals of international airline passengers, 1949 to 1960. It includes two series with static, temporal, and future exogenous variables, which helps explore the performance of models like NBEATSx and TFT.

fig, ax = plt.subplots(1, 1, figsize = (20, 7))
plot_df = AirPassengersPanel.set_index('ds')

plot_df.groupby('unique_id')['y'].plot(ax=ax, legend=True)
ax.set_title('AirPassengers Panel Data', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(title='unique_id', prop={'size': 15})
ax.grid()

fig, ax = plt.subplots(1, 1, figsize = (20, 7))
plot_df = AirPassengersPanel[AirPassengersPanel.unique_id=='Airline1'].set_index('ds')

plot_df[['y', 'trend', 'y_[lag12]']].plot(ax=ax, linewidth=2)
ax.set_title('Box-Cox AirPassengers Data', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(prop={'size': 15})
ax.grid()

4. Time Features

We have developed a utility that generates normalized calendar features for use as absolute positional embeddings in Transformer-based models. These embeddings capture seasonal patterns in time series data and can be easily incorporated into the model architecture. Additionally, the features can be used as exogenous variables in other models to inform them of calendar patterns in the data.

References

Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wancai Zhang. “Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting”

`augment_calendar_df`

augment_calendar_df(df, freq='H')

Augment a dataframe with calendar features based on frequency. Frequency mappings:

Q - [month]
M - [month]
W - [Day of month, week of year]
D - [Day of week, day of month, day of year]
B - [Day of week, day of month, day of year]
H - [Hour of day, day of week, day of month, day of year]
T - [Minute of hour*, hour of day, day of week, day of month, day of year]
S - [Second of minute, minute of hour, hour of day, day of week, day of month, day of year]

*The minute feature returns a number from 0 to 3 corresponding to the 15-minute period it falls into.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	DataFrame to augment with calendar features.	required
`freq`	`str`	Frequency string for determining which features to add. Defaults to “H”.	`'H'`

Returns:

Type	Description
`Tuple[DataFrame, List[str]]`	Tuple of (augmented DataFrame, list of feature column names).
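The footnote on the minute feature can be illustrated with plain integer division. This is a sketch of the bucketing idea only, not the library's code; quarter_of_hour is a hypothetical helper name.

```python
from datetime import datetime


def quarter_of_hour(ts: datetime) -> int:
    """Return 0-3 for the 15-minute period of the hour that ts falls into."""
    return ts.minute // 15


print(quarter_of_hour(datetime(2024, 1, 1, 9, 14)))  # first quarter of the hour
print(quarter_of_hour(datetime(2024, 1, 1, 9, 45)))  # last quarter of the hour
```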

`time_features_from_frequency_str`

time_features_from_frequency_str(freq_str)

Returns a list of time features that are appropriate for the given frequency string.

Parameters:

Name	Type	Description	Default
`freq_str`	`str`	Frequency string of the form [multiple][granularity] such as “12H”, “5min”, “1D” etc.	required

Returns:

Type	Description
`List[TimeFeature]`	List of time features appropriate for the frequency.
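The [multiple][granularity] form of the frequency string can be illustrated with a small parser. This is a sketch under the assumption that the multiple is an optional run of digits followed by a letter code; split_freq_str is a hypothetical helper, and the library itself may rely on pandas to interpret the string.

```python
import re
from typing import Tuple


def split_freq_str(freq_str: str) -> Tuple[int, str]:
    """Split a frequency string such as '12H' into (multiple, granularity)."""
    match = re.fullmatch(r"(\d*)([A-Za-z]+)", freq_str)
    if match is None:
        raise ValueError(f"Unsupported frequency string: {freq_str!r}")
    multiple, granularity = match.groups()
    # A missing multiple (for example 'D') is treated as 1.
    return (int(multiple) if multiple else 1, granularity)


for freq in ["12H", "5min", "1D", "D"]:
    print(freq, split_freq_str(freq))
```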

`WeekOfYear`

Bases: TimeFeature

Week of year encoded as value between [-0.5, 0.5].

`MonthOfYear`

Bases: TimeFeature

Month of year encoded as value between [-0.5, 0.5].

`DayOfYear`

Bases: TimeFeature

Day of year encoded as value between [-0.5, 0.5].

`DayOfMonth`

Bases: TimeFeature

Day of month encoded as value between [-0.5, 0.5].

`DayOfWeek`

Bases: TimeFeature

Day of week encoded as value between [-0.5, 0.5].

`HourOfDay`

Bases: TimeFeature

Hour of day encoded as value between [-0.5, 0.5].

`MinuteOfHour`

Bases: TimeFeature

Minute of hour encoded as value between [-0.5, 0.5].

`SecondOfMinute`

Bases: TimeFeature

Second of minute encoded as value between [-0.5, 0.5].

`TimeFeature`

TimeFeature()
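As an illustration of how a calendar field can be encoded as a value in [-0.5, 0.5], the sketch below applies a linear min-max scaling. This scaling is an assumption chosen for illustration; the exact constants used by the feature classes above may differ, and scale_to_unit_interval is a hypothetical helper.

```python
from datetime import datetime


def scale_to_unit_interval(value: int, lowest: int, highest: int) -> float:
    """Map an integer in [lowest, highest] linearly onto [-0.5, 0.5]."""
    return (value - lowest) / (highest - lowest) - 0.5


ts = datetime(2024, 12, 31, 23, 0)

features = {
    "month_of_year": scale_to_unit_interval(ts.month, 1, 12),
    "day_of_week": scale_to_unit_interval(ts.weekday(), 0, 6),
    "hour_of_day": scale_to_unit_interval(ts.hour, 0, 23),
}
print(features)
```

Centering the features around zero keeps them on a scale comparable to normalized model inputs.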

from neuralforecast.utils import AirPassengersPanel, augment_calendar_df

AirPassengerPanelCalendar, calendar_cols = augment_calendar_df(df=AirPassengersPanel, freq='M')
AirPassengerPanelCalendar.head()

plot_df = AirPassengerPanelCalendar[AirPassengerPanelCalendar.unique_id=='Airline1'].set_index('ds')
plt.plot(plot_df['month'])
plt.grid()
plt.xlabel('Datestamp')
plt.ylabel('Normalized Month')
plt.show()

`get_indexer_raise_missing`

get_indexer_raise_missing(idx, vals)

Get index positions for values, raising an error if any are missing.

Parameters:

Name	Type	Description	Default
`idx`	`Index`	Index to search in.	required
`vals`	`List[str]`	Values to find indices for.	required

Returns:

Type	Description
`List[int]`	List of index positions.

Raises:

Type	Description
`ValueError`	If any values are missing from the index.
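The behavior described above can be sketched with pandas' Index.get_indexer, which marks missing values with -1. This is an illustrative reimplementation, not the library's source; positions_or_raise is a hypothetical name.

```python
from typing import List

import pandas as pd


def positions_or_raise(idx: pd.Index, vals: List[str]) -> List[int]:
    """Return the position of each value in idx, or raise if any is absent."""
    positions = idx.get_indexer(vals)  # pandas marks missing values with -1
    missing = [v for v, pos in zip(vals, positions) if pos == -1]
    if missing:
        raise ValueError(f"Values missing from the index: {missing}")
    return positions.tolist()


columns = pd.Index(["unique_id", "ds", "y", "trend"])
print(positions_or_raise(columns, ["y", "trend"]))
```

Failing loudly here avoids silently indexing with -1, which would select the last element instead of signalling a missing column.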

5. Prediction Intervals

`PredictionIntervals`

PredictionIntervals(n_windows=2, method='conformal_distribution', step_size=1)

Class for storing prediction interval metadata.

Parameters:

Name	Type	Description	Default
`n_windows`	`int`	Number of windows to evaluate. Defaults to 2.	`2`
`method`	`str`	One of the supported methods for computing prediction intervals: conformal_error or conformal_distribution. Defaults to “conformal_distribution”.	`'conformal_distribution'`
`step_size`	`int`	Step size between each cross-validation window. Defaults to 1.	`1`
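To show how n_windows and step_size interact, the sketch below lays out cross-validation windows for a single series. It assumes each window forecasts horizon steps and that the last window ends at the final observation; series_length and horizon are hypothetical inputs, and window_bounds is not part of the library.

```python
from typing import List, Tuple


def window_bounds(
    series_length: int, horizon: int, n_windows: int, step_size: int
) -> List[Tuple[int, int]]:
    """Return (start, end) positions of each validation window, end exclusive."""
    # Total observations held out so that the last window ends at the series end.
    test_size = horizon + step_size * (n_windows - 1)
    first_start = series_length - test_size
    if first_start <= 0:
        raise ValueError("Series is too short for the requested windows.")
    return [
        (first_start + i * step_size, first_start + i * step_size + horizon)
        for i in range(n_windows)
    ]


print(window_bounds(series_length=100, horizon=12, n_windows=2, step_size=1))
```

With the default n_windows=2 and step_size=1, the two windows overlap on all but one observation.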

`PredictionIntervals.method`

method = method

`PredictionIntervals.n_windows`

n_windows = n_windows

`PredictionIntervals.step_size`

step_size = step_size

`add_conformal_distribution_intervals`

add_conformal_distribution_intervals(
    model_fcsts,
    cs_df,
    model,
    cs_n_windows,
    n_series,
    horizon,
    level=None,
    quantiles=None,
)

Add conformal intervals based on conformal scores using the distribution strategy. This strategy creates forecast paths based on errors and calculates quantiles using those paths.

Parameters:

Name	Type	Description	Default
`model_fcsts`	`array`	Model forecasts array.	required
`cs_df`	`DFType`	DataFrame containing conformal scores.	required
`model`	`str`	Model name.	required
`cs_n_windows`	`int`	Number of conformal score windows.	required
`n_series`	`int`	Number of series.	required
`horizon`	`int`	Forecast horizon.	required
`level`	`Optional[List[Union[int, float]]]`	Confidence levels for prediction intervals. Defaults to None.	`None`
`quantiles`	`Optional[List[float]]`	Quantiles for prediction intervals. Defaults to None.	`None`

Returns:

Type	Description
`Tuple[array, List[str]]`	Tuple of (forecasts with intervals, column names).
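The distribution strategy can be sketched for a single series with NumPy. The forecasts and errors below are made-up numbers, and the array shapes are simplified relative to the function's actual inputs; the point is only to show paths being built from errors and quantiles taken over those paths.

```python
import numpy as np

# Hypothetical point forecasts for one series over a 3-step horizon.
forecast = np.array([100.0, 110.0, 120.0])  # shape (horizon,)

# Hypothetical absolute errors from 2 earlier validation windows.
abs_errors = np.array(
    [
        [4.0, 6.0, 8.0],
        [2.0, 3.0, 10.0],
    ]
)  # shape (n_windows, horizon)

# Build forecast paths by subtracting and adding each window's errors.
paths = np.vstack([forecast - abs_errors, forecast + abs_errors])

# An 80% interval corresponds to the 0.1 and 0.9 quantiles of the paths.
lower, upper = np.quantile(paths, [0.1, 0.9], axis=0)
print(lower, upper)
```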

`add_conformal_error_intervals`

add_conformal_error_intervals(
    model_fcsts,
    cs_df,
    model,
    cs_n_windows,
    n_series,
    horizon,
    level=None,
    quantiles=None,
)

Add conformal intervals based on conformal scores using the error strategy. This strategy creates prediction intervals based on absolute errors.

Parameters:

Name	Type	Description	Default
`model_fcsts`	`array`	Model forecasts array.	required
`cs_df`	`DFType`	DataFrame containing conformal scores.	required
`model`	`str`	Model name.	required
`cs_n_windows`	`int`	Number of conformal score windows.	required
`n_series`	`int`	Number of series.	required
`horizon`	`int`	Forecast horizon.	required
`level`	`Optional[List[Union[int, float]]]`	Confidence levels for prediction intervals. Defaults to None.	`None`
`quantiles`	`Optional[List[float]]`	Quantiles for prediction intervals. Defaults to None.	`None`

Returns:

Type	Description
`Tuple[array, List[str]]`	Tuple of (forecasts with intervals, column names).
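The error strategy can be sketched for a single series with NumPy. The numbers are made up and the shapes are simplified relative to the function's actual inputs; the sketch assumes the interval half-width at each horizon step is the level-quantile of the absolute errors at that step.

```python
import numpy as np

# Hypothetical point forecasts for one series over a 3-step horizon.
forecast = np.array([100.0, 110.0, 120.0])  # shape (horizon,)

# Hypothetical absolute errors from 3 earlier validation windows.
abs_errors = np.array(
    [
        [4.0, 6.0, 8.0],
        [2.0, 3.0, 10.0],
        [3.0, 5.0, 9.0],
    ]
)  # shape (n_windows, horizon)

level = 80  # confidence level in percent

# Half-width per horizon step: the level-quantile of the absolute errors.
width = np.quantile(abs_errors, level / 100, axis=0)
lower, upper = forecast - width, forecast + width
print(lower, upper)
```

Unlike the distribution strategy, the resulting intervals are symmetric around the point forecast by construction.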

`get_prediction_interval_method`

get_prediction_interval_method(method)

Get the prediction interval method function by name.

Parameters:

Name	Type	Description	Default
`method`	`str`	Name of the prediction interval method.	required

Returns:

Type	Description
`Callable`	The corresponding method function.

Raises:

Type	Description
`ValueError`	If the method is not supported.
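A name-to-function lookup of this kind is often written as a dictionary dispatch. The sketch below uses placeholder functions and hypothetical names (lookup_method, METHODS) to show the pattern; it is not the library's implementation.

```python
from typing import Callable, Dict


def error_strategy() -> str:
    """Placeholder standing in for an error-based interval function."""
    return "conformal_error"


def distribution_strategy() -> str:
    """Placeholder standing in for a distribution-based interval function."""
    return "conformal_distribution"


# Hypothetical registry mapping method names to their functions.
METHODS: Dict[str, Callable[[], str]] = {
    "conformal_error": error_strategy,
    "conformal_distribution": distribution_strategy,
}


def lookup_method(name: str) -> Callable[[], str]:
    """Return the function registered under name, or raise ValueError."""
    if name not in METHODS:
        raise ValueError(
            f"Unsupported method {name!r}; choose one of {sorted(METHODS)}"
        )
    return METHODS[name]


print(lookup_method("conformal_error")())
```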

`quantiles_to_level`

quantiles_to_level(quantiles)

Convert a list of quantiles to confidence levels.

Parameters:

Name	Type	Description	Default
`quantiles`	`List[float]`	List of quantiles (e.g., [0.1, 0.5, 0.9]).	required

Returns:

Type	Description
`List[Union[int, float]]`	List of corresponding confidence levels.
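The conversion can be sketched under the assumption that intervals are central, so a quantile q bounds an interval whose level is 100 minus twice the tail mass in percent. The helper name quantiles_to_levels_sketch is hypothetical, and the library's rounding and ordering may differ.

```python
from typing import List


def quantiles_to_levels_sketch(quantiles: List[float]) -> List[float]:
    """Map quantiles to the central confidence levels (in percent) they bound."""
    levels = set()
    for q in quantiles:
        tail = min(q, 1 - q)  # probability mass outside the interval on one side
        levels.add(round(100 - 200 * tail, 2))
    return sorted(levels)


print(quantiles_to_levels_sketch([0.1, 0.5, 0.9]))
```

In this sketch the 0.1 and 0.9 quantiles map to the same 80% level, and the median maps to level 0.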

`level_to_quantiles`

level_to_quantiles(level)

Convert a list of confidence levels to quantiles.

Parameters:

Name	Type	Description	Default
`level`	`List[Union[int, float]]`	List of confidence levels (e.g., [80, 90]).	required

Returns:

Type	Description
`List[float]`	List of corresponding quantiles.
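The conversion can be sketched under the assumption that each level describes a central interval, so a level L in percent is bounded by the quantiles 0.5 - L/200 and 0.5 + L/200. The helper name levels_to_quantiles_sketch is hypothetical, and the library's rounding and ordering may differ.

```python
from typing import List, Union


def levels_to_quantiles_sketch(levels: List[Union[int, float]]) -> List[float]:
    """Map confidence levels (in percent) to the quantiles bounding each interval."""
    quantiles = set()
    for lv in levels:
        half = lv / 200  # half of the interval's probability mass
        quantiles.add(round(0.5 - half, 4))
        quantiles.add(round(0.5 + half, 4))
    return sorted(quantiles)


print(levels_to_quantiles_sketch([80, 90]))
```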
