Forecast Error Depending on HorizonΒΆ

This example shows the dependence of the mean squared error (forecast error) on a growing forecast horizon. The models KNN and SVR are compared. In this example, the turbine ‘Tehachapi’ is the target turbine. The forecast is based on the whole wind park, which is defined by the latter id and a given radius of 3 kilometres. For the mapping of pattern-label combinations the Power Mapping is used. The power mapping is based on the generaltimeseriesmodel. The feature window and the forecast horizon both consist of 3 elements of every time series. Because of performance issues, in this example only every fifth element is used for training and testing.

SVR MSE: [4.4304203419626083, 9.4536349362948791, 16.338651435656267, 24.075246222490247, 32.810941995992401, 39.547388164044555, 47.020385913675938, 51.778744886359043]
KNN MSE: [5.8521237, 11.07809, 17.458475, 24.78684, 31.418432, 35.450794, 43.349888, 49.662083]
# Author: Jendrik Poloczek <>
# License: BSD 3 clause

import numpy as np
import math
import matplotlib.pyplot as plt
from windml.datasets.nrel import NREL
from windml.mapping.power_mapping import PowerMapping
from sklearn.grid_search import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

def compute_mse(regressor, horizon):
    # get wind park and corresponding target. 
    windpark = NREL().get_windpark(NREL.park_id['tehachapi'], 3, 2004, 2005)
    target = windpark.get_target()

    # use power mapping for pattern-label mapping. 
    feature_window = 3
    mapping = PowerMapping()
    X = mapping.get_features_park(windpark, feature_window, horizon)
    y = mapping.get_labels_turbine(target, feature_window, horizon)

    # train roughly for the year 2004, test for 2005.
    train_to = int(math.floor(len(X) * 0.5))
    test_to = len(X)
    train_step, test_step = 25, 25

    if(regressor == 'svr'):
        reg = SVR(kernel='rbf', epsilon=0.1, C = 100.0,\
                gamma = 0.0001).fit(X_train,y_train)
        mse = mean_squared_error(reg.predict(X_test),y_test)
    elif(regressor == 'knn'):
        reg = KNeighborsRegressor(10, 'uniform').fit(X_train,y_train)
        mse = mean_squared_error(reg.predict(X_test),y_test)
    return mse

regressors = ['svr', 'knn']
horizons = range(2, 18, 2)
errors = {} 

for regressor in regressors:
    for horizon in horizons:
        errors[regressor].append(compute_mse(regressor, horizon))
print "SVR MSE:", errors['svr']
print "KNN MSE:", errors['knn']

    fig = plt.figure(figsize=(8, 5))
    ax = fig.add_subplot(111)
    N = len(horizons)
    ind = np.arange(N)                # the x locations for the groups
    width = 0.4                       # the width of the bars
    ax.set_xticks(ind + width)
    ax.set_xticklabels([str(i) for i in horizons])
    ## the bars
    rects1 =,errors['svr'] , width, color='#6795cf')
    rects2 =, errors['knn'], width, color='#918f90')
    plt.title("MSE depending on Forecast Horizon")
    plt.xlabel("Forecast Horizon")
    plt.legend(loc='lower right')


Fork me on GitHub