random forest - randomForest prediction issue in R
I have been using the randomForest package for a classification model with categorical (factor) predictors. The predict function fails when there are new levels in the test data that are not in the training data. To overcome that, I have added the missing levels as follows:
mname <- get(model)
for (k in 1:ncol(dataframe)) {
  col <- names(dataframe[k])
  modellevels <- mname$forest$xlevels[[col]]
  orig <- levels(dataframe[, k])
  misses <- setdiff(modellevels, orig)
  origordered <- orig[order(match(orig, modellevels))]       # not sure if ordering matters
  missordered <- misses[order(match(misses, modellevels))]   # not sure if ordering matters
  if (length(misses) > 0) {
    levels(dataframe[, k]) <- as.character(c(origordered, missordered))
  }
}
prediction <- data.frame(predict(mname, dataframe, type = "prob"))$x1
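One thing worth checking in the loop above is what assigning to `levels()` actually does. In base R, `levels(x) <- new_labels` renames the existing levels by position; it does not reorder or extend them while keeping the values. The sketch below uses a made-up level set (`model_levels`, a stand-in for `mname$forest$xlevels[[col]]`) and a toy factor to show the difference from rebuilding the factor with `factor(..., levels = ...)`. This is an illustration under those assumptions, not a confirmed diagnosis of the behaviour described in this post.

```r
# Hypothetical level set standing in for mname$forest$xlevels[[col]]
model_levels <- c("a", "b", "c")

# Toy test column whose levels are a subset stored in a different order
x <- factor(c("c", "a", "c"), levels = c("c", "a"))

# Approach from the post: reorder the labels, then assign them to levels()
orig <- levels(x)                                        # "c" "a"
misses <- setdiff(model_levels, orig)                    # "b"
orig_ordered <- orig[order(match(orig, model_levels))]   # "a" "c"
relabelled <- x
levels(relabelled) <- c(orig_ordered, misses)            # renames by position
as.character(relabelled)   # "a" "c" "a": the values have been swapped

# Alternative: rebuild the factor against the model's level set
aligned <- factor(as.character(x), levels = model_levels)
as.character(aligned)      # "c" "a" "c": the values are preserved
levels(aligned)            # "a" "b" "c"
```

If the levels of a column are already in the same relative order as the model's levels, the relabelling leaves the values alone and only appends the missing labels, so whether this matters depends on how each dataframe was built.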
This prediction returns different values for the same input vector depending on the size of the dataframe provided for prediction.
For example, if I send a dataframe with 2 rows, it returns values of 0, 0, whereas if I pass a bigger dataframe with the same 2 rows included, it returns a prediction of 0.066667 for those same 2 rows.
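A quick way to see whether the two inputs really are "the same" to R is to compare the factor levels and the underlying integer codes, not just the printed values. The sketch below uses invented data (`big`, `small`, and a `colour` column are hypothetical names) and assumes the small dataframe was built separately, so its factor only knows the levels present in its own rows.

```r
# Hypothetical data standing in for the two inputs described above
big <- data.frame(colour = factor(c("red", "green", "blue", "red")))
small <- data.frame(colour = factor(c("red", "green")))  # built separately

# The first two rows match by value...
same_values <- identical(as.character(big$colour[1:2]),
                         as.character(small$colour))

# ...but the level sets differ, and so do the integer codes
levels(big$colour)            # "blue" "green" "red"
levels(small$colour)          # "green" "red"
as.integer(big$colour[1:2])   # 3 2
as.integer(small$colour)      # 2 1
```

If a check like this shows different levels or codes for the two dataframes, any step that edits levels by position can treat the same rows differently.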
Could someone clarify the reason for this behavior? I am unable to share the data here.