
There is a library on GitHub called timeseriescv that implements combinatorial cross-validation, and I am trying to use it with GridSearchCV. Unlike the standard sklearn cross-validators, which provide a "get_n_splits" method that returns the number of splits, this package has no such method.

The docstring states: "The samples are decomposed into n_splits folds containing equal numbers of samples, without shuffling. In each cross validation round, n_test_splits folds are used as the test set, while the other folds are used as the training set. There are as many rounds as n_test_splits folds among the n_splits folds."

So if I were to implement get_n_splits myself, how would I compute the number of rounds from n_splits and n_test_splits? Here is a snippet of code:

from timeseriescv.cross_validation import CombPurgedKFoldCV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

cv = CombPurgedKFoldCV(n_splits=5, n_test_splits=2)
cv.X = X
cv.y = y
cv.pred_times = train.Pred_Date
cv.eval_times = train.Eval_Date

print(cv.n_splits, cv.n_test_splits)

for i, (train_indexes, test_indexes) in enumerate(cv.split(X=X, y=y, pred_times=train.Pred_Date, eval_times=train.Eval_Date)):
    print(i, len(train_indexes), len(test_indexes))  # inspect the split sizes of each round

param_grid = {'C': [0.1, 0.5, 0.75, 1, 1.5], "tol": [1e3, 1e4], "max_iter": [500, 1000]}

model = LogisticRegression(fit_intercept=True, class_weight="balanced")
gs = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=cv, verbose=0)  # [(train_indexes, test_indexes)]
gs.fit(X, y)

Here is the error:

n_splits = cv_orig.get_n_splits(X, y, groups)

AttributeError: 'CombPurgedKFoldCV' object has no attribute 'get_n_splits'
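Going by the docstring quoted above, each round corresponds to one way of choosing n_test_splits test folds out of n_splits folds, so the number of rounds should be the binomial coefficient C(n_splits, n_test_splits). Below is a minimal sketch of an adapter built on that reading. The class name SklearnCompatibleCV is hypothetical (it is not part of timeseriescv or scikit-learn), and it assumes the wrapped object exposes n_splits, n_test_splits, and a split method taking the pred_times and eval_times keywords used in the snippet above. It has not been tested against timeseriescv itself.

```python
from itertools import combinations
from math import comb  # requires Python 3.8+


class SklearnCompatibleCV:
    """Hypothetical adapter; not part of timeseriescv or scikit-learn."""

    def __init__(self, base_cv, pred_times, eval_times):
        # base_cv is assumed to expose n_splits, n_test_splits and
        # split(X, y, pred_times=..., eval_times=...).
        self.base_cv = base_cv
        self.pred_times = pred_times
        self.eval_times = eval_times

    def get_n_splits(self, X=None, y=None, groups=None):
        # One round per way of picking n_test_splits test folds
        # out of n_splits folds.
        return comb(self.base_cv.n_splits, self.base_cv.n_test_splits)

    def split(self, X, y=None, groups=None):
        # groups is accepted only to match the scikit-learn signature
        # and is ignored; the stored times are forwarded instead.
        yield from self.base_cv.split(
            X, y, pred_times=self.pred_times, eval_times=self.eval_times
        )


# Sanity check of the counting argument for n_splits=5, n_test_splits=2:
# enumerating the test-fold combinations gives the same count as comb().
assert comb(5, 2) == len(list(combinations(range(5), 2)))
```

With the snippet above, this would be passed as cv=SklearnCompatibleCV(cv, train.Pred_Date, train.Eval_Date). Another option that avoids get_n_splits altogether is to materialize the splits first, since GridSearchCV also accepts an iterable of (train, test) index pairs: cv=list(cv.split(X=X, y=y, pred_times=train.Pred_Date, eval_times=train.Eval_Date)).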

mirekphd