types - Error while testing DecisionTreeClassifier in Scikit Learn with Python
I read data from a CSV file; the first line is strings and the rest are decimals. I had to convert the data from this file from string to Decimal, and I am now trying to run a decision tree classifier on this data. I can train on the data fine, but when I call DecisionTreeClassifier.score() I get the error message: "unknown is not supported"
Here is my code:
    cval = KFold(len(file)-1, n_folds=10, shuffle=True)
    for train_index, test_index in cval:
        obfa_train, obfa_test = np.array(obfa)[train_index], np.array(obfa)[test_index]
        ttime_train, ttime_test = np.array(ttime)[train_index], np.array(ttime)[test_index]
        model = tree.DecisionTreeClassifier()
        model = model.fit(obfa_train.tolist(), ttime_train.tolist())
        print model.score(obfa_test.tolist(), ttime_test.tolist())

I filled obfa and ttime with these lines earlier:
    ttime.append(Decimal(file[i][11].strip('"')))
    obfa[i-1][j-1] = Decimal(file[i][j].strip('"'))

So obfa is a 2D array and ttime is 1D. I tried removing "tolist()" in the above code, but it did not affect the error. Here is the error report that it prints:
    in <module>()
    ---> print model.score(obfa_test.tolist(), ttime_test.tolist())

    in score(self, X, y, sample_weight)
             """
             from .metrics import accuracy_score
    -->      return accuracy_score(y, self.predict(X), sample_weight=sample_weight)

    in accuracy_score(y_true, y_pred, normalize, sample_weight)
             # Compute accuracy for each possible representation
    ->       y_type, y_true, y_pred = _check_clf_targets(y_true, y_pred)
             if y_type == 'multilabel-indicator':
                 score = (y_pred != y_true).sum(axis=1) == 0

    in _check_clf_targets(y_true, y_pred)
             if (y_type not in ["binary", "multiclass", "multilabel-indicator", "multilabel-sequences"]):
    -->          raise ValueError("{0} is not supported".format(y_type))
             if y_type in ["binary", "multiclass"]:

    ValueError: unknown is not supported

I added print statements to check the dimensions of the input parameters, and this is what they printed:
    obfa_test.shape: (48L, 12L)
    ttime_test.shape: (48L,)

I am confused as to why the error report shows 3 required parameters for score() when the documentation only has 2. What is the "self" parameter? Can anyone help me solve this error?
This seems reminiscent of the error discussed here. The problem seems to stem from the datatype you're using to fit and score your model. Instead of Decimal when filling your input data arrays, try float. And, just so I don't give an inaccurate answer: you can't use floats/continuous values as the target of a DecisionTreeClassifier. If you want to predict floats, use a DecisionTreeRegressor. Otherwise, try using integers or strings as the target (but that might be steering away from the task you're trying to accomplish).
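A minimal sketch of the float conversion suggested above, using only the standard library. The rows and the column layout (last column as the target) are hypothetical stand-ins for the asker's file, and parse_cell is an illustrative helper, not part of scikit-learn.

```python
from decimal import Decimal


def parse_cell(cell):
    """Strip surrounding double quotes from a CSV cell and parse it as a float."""
    return float(cell.strip('"'))


# Hypothetical rows standing in for the data lines of the file
# (the header line of strings is assumed to be skipped already).
rows = [
    ['"1.5"', '"2.0"', '"10.25"'],
    ['"3.0"', '"4.5"', '"11.75"'],
]

# Assumption: every column except the last is a feature, the last is the target.
features = [[parse_cell(c) for c in row[:-1]] for row in rows]
targets = [parse_cell(row[-1]) for row in rows]

# Plain floats map onto a native numeric dtype once wrapped in np.array,
# whereas Decimal values end up in an array of generic Python objects.
assert not any(isinstance(t, Decimal) for t in targets)

print(features)
print(targets)
```

With the targets stored as floats, they are continuous values, so they would go to a DecisionTreeRegressor rather than a DecisionTreeClassifier, as noted above.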
As for the self question at the end, that's a syntactic idiosyncrasy of Python. When you call model.score(...), Python is sort of treating it as score(model, ...). I'm afraid I don't know much more than that right now, but it isn't necessary to answer the original question. Here's an answer that better addresses that particular question.
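To illustrate the point about self, here is a small sketch with a hypothetical Model class (it is not scikit-learn code). Calling the method on the instance and calling it through the class with the instance passed explicitly give the same result, which is why the traceback lists one more parameter than the documentation.

```python
class Model(object):
    """Hypothetical stand-in for an estimator, used only to show how self works."""

    def __init__(self, offset):
        self.offset = offset

    def score(self, X, y):
        # self is the instance the method was called on; X and y are
        # the two arguments the caller actually writes.
        return self.offset + len(X) + len(y)


model = Model(offset=1)

# Usual call: Python passes model as self automatically.
bound = model.score([1, 2], [3, 4])

# Equivalent explicit call through the class.
unbound = Model.score(model, [1, 2], [3, 4])

assert bound == unbound
```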