python - Time series cross-validation using linear regression from scikit-learn
I'm using a linear regression model from scikit-learn for an explanatory fit on a time series:
    from sklearn import linear_model
    import numpy as np

    # 100 samples with 2 features each; scikit-learn expects shape (n_samples, n_features)
    x = np.column_stack([np.random.random(100), np.random.random(100)])
    y = np.random.random(100)

    regressor = linear_model.LinearRegression()
    regressor.fit(x, y)
    y_hat = regressor.predict(x)

I want to cross-validate the prediction. As far as I know, I can't use the cross-validation tools in sklearn (like KFold) because they break the data down randomly, and I need the folds to be sequential. For example,
    data_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    # first train set
    train = [1]
    # first test set
    test = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    # fit, predict, evaluate

    # train set
    train = [1, 2]
    # test set
    test = [3, 4, 5, 6, 7, 8, 9, 10]
    # fit, predict, evaluate

    ...

    # train set
    train = [1, 2, 3, 4, 5, 6, 7, 8]
    # test set
    test = [9, 10]
    # fit, predict, evaluate

Is this possible using sklearn?
You do not need scikit-learn for this kind of folding. Slicing is sufficient, like:
    step = 1
    # start at `step` so the first training slice is not empty
    for i in range(step, len(data_set), step):
        train = data_set[:i]
        test = data_set[i:]
        # etc...
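To make the slicing idea reusable, here is a minimal sketch of it wrapped in a generator. The name `expanding_window_splits` and its `min_train` and `step` parameters are hypothetical, chosen for illustration; they are not part of scikit-learn. It yields index lists rather than data, so the same folds can be applied to both the features and the target.

```python
def expanding_window_splits(n_samples, min_train=1, step=1):
    """Yield (train_indices, test_indices) for sequential, expanding-window folds.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    if min_train < 1 or step < 1:
        raise ValueError("min_train and step must be positive")
    for split in range(min_train, n_samples, step):
        # Train on everything before the split point, test on everything after it.
        yield list(range(split)), list(range(split, n_samples))


data_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
folds = []
for train_idx, test_idx in expanding_window_splits(len(data_set)):
    train = [data_set[i] for i in train_idx]
    test = [data_set[i] for i in test_idx]
    folds.append((train, test))
    # fit on `train`, predict on `test`, evaluate here

print(folds[0])   # ([1], [2, 3, 4, 5, 6, 7, 8, 9, 10])
print(folds[-1])  # ([1, 2, 3, 4, 5, 6, 7, 8, 9], [10])
```

With NumPy arrays shaped `(n_samples, n_features)`, the index lists can be used directly, for example `regressor.fit(x[train_idx], y[train_idx])` followed by `regressor.predict(x[test_idx])`. If you would rather stay inside scikit-learn, `sklearn.model_selection.TimeSeriesSplit` also produces sequential folds with a growing training window, although each of its test folds is a fixed-size block rather than the whole remainder of the series.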