How to merge text files using mapping and reducing in Java Spark MLlib?
I have a large dataset stored on Hadoop (a YARN cluster) and want to train a support vector machine classifier on it. Features are extracted for each data point in the dataset and saved in LIBSVM format. Spark MLlib can read such files using MLUtils.loadLibSVMFile(SparkContext sc, String path). Every file has one line of doubles ending in a newline character; the line represents the values of the features.
I want to concatenate these files into one JavaRDD. Can I use .textFile("../*") with some kind of .join or .union statement? I do not understand how to do this.
Could somebody please be so kind as to help? I think more people would like to know how to do this efficiently.
    sparkContext.textFile("/path/to/file/*")

will read all matched files and represent them as a single large RDD.
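For illustration, a minimal sketch of what this means in Java. The second half shows that the glob is equivalent to reading files individually and chaining union(), which is what the question asks about; the part-file names and app name are hypothetical:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class MergeTextFiles {
        public static void main(String[] args) {
            JavaSparkContext jsc =
                new JavaSparkContext(new SparkConf().setAppName("MergeTextFiles"));

            // One call with a glob pattern reads every matching file
            // into a single RDD of lines:
            JavaRDD<String> all = jsc.textFile("/path/to/file/*");

            // Equivalent, but more verbose: read files one by one and
            // chain union() (the part-file names are hypothetical):
            JavaRDD<String> a = jsc.textFile("/path/to/file/part-00000");
            JavaRDD<String> b = jsc.textFile("/path/to/file/part-00001");
            JavaRDD<String> merged = a.union(b);

            jsc.stop();
        }
    }

So no explicit join or union is needed; the glob already merges the files into one RDD.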
And I think

    MLUtils.loadLibSVMFile(sc, "/path/to/file/*")

will load the features for you. Have you tried it?
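A minimal end-to-end sketch in Java, assuming the data really is in LIBSVM format under the given path (the path and the iteration count are placeholders). Note that MLUtils.loadLibSVMFile takes the underlying SparkContext, so from Java you pass jsc.sc():

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.mllib.classification.SVMModel;
    import org.apache.spark.mllib.classification.SVMWithSGD;
    import org.apache.spark.mllib.regression.LabeledPoint;
    import org.apache.spark.mllib.util.MLUtils;

    public class TrainSVMFromLibSVM {
        public static void main(String[] args) {
            JavaSparkContext jsc =
                new JavaSparkContext(new SparkConf().setAppName("TrainSVMFromLibSVM"));

            // loadLibSVMFile takes the underlying SparkContext (jsc.sc());
            // the glob makes it read every matching file at once.
            JavaRDD<LabeledPoint> data =
                MLUtils.loadLibSVMFile(jsc.sc(), "/path/to/file/*").toJavaRDD();

            // Train a linear SVM; 100 iterations is an arbitrary choice here.
            SVMModel model = SVMWithSGD.train(data.rdd(), 100);

            jsc.stop();
        }
    }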