Weka: Text Classification on an ARFF file


This is a basic question. I am trying to classify text files into 20 different classes.

I therefore have a project structure with two folders called train and test. The train folder contains 20 different folders, and each of those again holds many files related to one particular class, e.g. weather, atheism, etc.

I have created a train.arff file from the entire train folder. When the data is visualized in Weka, I can see only 2 attributes. I have provided a link to a screenshot below:

Screenshot in Weka
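Two attributes is what one would expect if every file became a single instance, with its whole text stored as one string attribute and its folder name stored as the class. The plain-Java sketch below shows what such an ARFF file could look like. It does not use Weka, and the names `ArffSketch`, `quote` and `toArff`, the attribute names `text` and `class`, and the sample documents are all assumptions made for illustration. It also assumes class names contain no spaces or commas.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ArffSketch {
    // Escape backslashes, single quotes and line breaks so the whole
    // document fits into one single-quoted ARFF string value.
    static String quote(String s) {
        return "'" + s.replace("\\", "\\\\")
                      .replace("'", "\\'")
                      .replace("\r", " ")
                      .replace("\n", "\\n") + "'";
    }

    // Build ARFF text with two attributes: the raw document and its class.
    // The map keys play the role of the folder names under train/.
    static String toArff(String relation, Map<String, List<String>> docsByClass) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        sb.append("@attribute text string\n");
        sb.append("@attribute class {")
          .append(String.join(",", docsByClass.keySet()))
          .append("}\n\n");
        sb.append("@data\n");
        for (Map.Entry<String, List<String>> e : docsByClass.entrySet()) {
            for (String doc : e.getValue()) {
                sb.append(quote(doc)).append(",").append(e.getKey()).append("\n");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the files on disk.
        Map<String, List<String>> docs = new LinkedHashMap<>();
        docs.put("weather", List.of("Heavy rain is expected today."));
        docs.put("atheism", List.of("It's a debate about belief."));
        System.out.print(toArff("train", docs));
    }
}
```

With the data in this shape, the individual words are not yet attributes; they only appear once the string attribute is turned into a word vector.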

My doubt is how I can view the various files under these folders and remove stopwords and punctuation and apply stemming. How do I go about preprocessing? If any links or resources are available, please suggest them and provide the necessary links.

I found the videos below quite helpful when I first got my hands on text classification using Weka. You might want to take a look.

You might want to use the StringToWordVector filter to see the effect of each word as an attribute; this is indeed described in detail in the first and last videos. Within the filter settings you can give a stopwords list and choose in each run whether to use it or not. The same goes for stemming, which you can change as well. With the documentation and the videos you should understand it easily.
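To get a feel for what such preprocessing does to a document, here is a minimal plain-Java sketch of the general idea: lowercase the text, split on anything that is not a letter (which drops punctuation), optionally skip stopwords, optionally strip suffixes, and count the remaining words. It does not call Weka and is not how the filter is implemented. The class `BagOfWordsSketch`, the tiny stopword list and the crude `stem` helper are assumptions made for illustration; a real run would use a full stopword list and a proper stemmer.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class BagOfWordsSketch {
    // Tiny illustrative stopword list; a real list is much longer.
    static final Set<String> STOPWORDS =
            Set.of("a", "an", "the", "is", "are", "in", "of", "and", "to");

    // Crude suffix stripping for illustration only; not a real stemmer.
    static String stem(String w) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (w.length() > suffix.length() + 2 && w.endsWith(suffix)) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Turn one document into word counts; each distinct word would become
    // one attribute in a word-vector representation.
    static Map<String, Integer> wordCounts(String text, boolean useStopwords, boolean useStemming) {
        Map<String, Integer> counts = new TreeMap<>();
        // Splitting on non-letters removes punctuation and digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty()) continue;
            if (useStopwords && STOPWORDS.contains(token)) continue;
            String word = useStemming ? stem(token) : token;
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String doc = "The storms are moving in, and the rain is falling!";
        // Same document with and without stopword removal and stemming.
        System.out.println(wordCounts(doc, true, true));
        System.out.println(wordCounts(doc, false, false));
    }
}
```

Running both variants on the same text mirrors the advice above: toggle the stopword list and the stemmer between runs and compare which word attributes remain.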

