eclipse - Hadoop writing output to HDFS file


I have written my first MapReduce program. When I run it in Eclipse, it writes the output file and works as expected. When I run it from the command line using hadoop jar myjar.jar, the results are not written to the output file: the output files (_SUCCESS, part-r-00000) get created, but they are empty. Is there a persistence issue? The counters show reduce input records = 12 but reduce output records = 0. In Eclipse, reduce output records is not 0. Any help is appreciated. Thanks.

[cloudera@quickstart Desktop]$ sudo hadoop jar checkjar.jar hdfs://quickstart.cloudera:8020/user/cloudera/input.csv hdfs://quickstart.cloudera:8020/user/cloudera/output9
15/04/28 22:09:06 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/04/28 22:09:07 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/04/28 22:09:08 INFO input.FileInputFormat: Total input paths to process : 1
15/04/28 22:09:09 INFO mapreduce.JobSubmitter: number of splits:1
15/04/28 22:09:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1430279123629_0011
15/04/28 22:09:10 INFO impl.YarnClientImpl: Submitted application application_1430279123629_0011
15/04/28 22:09:10 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1430279123629_0011/
15/04/28 22:09:10 INFO mapreduce.Job: Running job: job_1430279123629_0011
15/04/28 22:09:22 INFO mapreduce.Job: Job job_1430279123629_0011 running in uber mode : false
15/04/28 22:09:22 INFO mapreduce.Job:  map 0% reduce 0%
15/04/28 22:09:32 INFO mapreduce.Job:  map 100% reduce 0%
15/04/28 22:09:46 INFO mapreduce.Job:  map 100% reduce 100%
15/04/28 22:09:46 INFO mapreduce.Job: Job job_1430279123629_0011 completed
15/04/28 22:09:46 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=265
        FILE: Number of bytes written=211403
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=365
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=8175
        Total time spent by all reduces in occupied slots (ms)=10124
        Total time spent by all map tasks (ms)=8175
        Total time spent by all reduce tasks (ms)=10124
        Total vcore-seconds taken by all map tasks=8175
        Total vcore-seconds taken by all reduce tasks=10124
        Total megabyte-seconds taken by all map tasks=8371200
        Total megabyte-seconds taken by all reduce tasks=10366976
    Map-Reduce Framework
        Map input records=12
        Map output records=12
        Map output bytes=235
        Map output materialized bytes=265
        Input split bytes=120
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=265
        Reduce input records=12
        Reduce output records=0
        Spilled Records=24
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=172
        CPU time spent (ms)=1150
        Physical memory (bytes) snapshot=346574848
        Virtual memory (bytes) snapshot=1705988096
        Total committed heap usage (bytes)=196481024
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=245
    File Output Format Counters
        Bytes Written=0
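
As an aside, the WARN line in the log above goes away if the driver implements the Tool interface and is launched through ToolRunner. A minimal sketch, assuming a hypothetical driver class named JoinDriver (the post does not show the driver):

package com.mapreduce.assgn4;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class JoinDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "reduce-side join");
        job.setJarByClass(JoinDriver.class);
        job.setMapperClass(JoinMapper.class);
        job.setReducerClass(JoinReducer.class);
        // Matches the reducer as currently declared; see the answer below
        // for the NullWritable variant
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic options such as -D key=value before
        // handing the remaining arguments to run()
        System.exit(ToolRunner.run(new JoinDriver(), args));
    }
}

It is invoked the same way (hadoop jar checkjar.jar input output); only the generic-option handling changes.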

JoinReducer.java

package com.mapreduce.assgn4;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class JoinReducer extends Reducer<Text, Text, Text, Text> {

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        List<String> tableOneTuples = new ArrayList<String>();
        List<String> tableTwoTuples = new ArrayList<String>();

        // Split the tagged values ("tableName#fields") back into their source tables
        for (Text value : values) {
            String[] splitValues = value.toString().split("#");
            String tableName = splitValues[0];
            if (tableName.equals(JoinMapper.tableOne)) {
                tableOneTuples.add(splitValues[1]);
            } else {
                tableTwoTuples.add(splitValues[1]);
            }
        }
        System.out.println(tableOneTuples.size());
        System.out.println(tableTwoTuples.size());

        // Cross product: emit one joined line per pair of matching tuples
        String finalJoinString = null;
        for (String tableOneValue : tableOneTuples) {
            for (String tableTwoValue : tableTwoTuples) {
                finalJoinString = tableOneValue + "," + tableTwoValue;
                finalJoinString = key.toString() + "," + finalJoinString;
                context.write(null, new Text(finalJoinString));
            }
        }
    }
}
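
For context, the reducer expects each value to arrive tagged with its source table (the "tableName#fields" format and the JoinMapper.tableOne constant). The mapper is not shown in the post; a minimal sketch of what it might look like, where the CSV column layout and the tag constants are assumptions:

package com.mapreduce.assgn4;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Tags the reducer compares against; the names are assumptions
    public static final String tableOne = "one";
    public static final String tableTwo = "two";

    @Override
    public void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Assumed layout: tableName,joinKey,rest-of-record
        String[] fields = line.toString().split(",", 3);
        String tag = fields[0].equals(tableOne) ? tableOne : tableTwo;
        // Key by the join column; tag the payload so the reducer can
        // separate tuples from the two tables again
        context.write(new Text(fields[1]), new Text(tag + "#" + fields[2]));
    }
}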

Your context.write in the reducer has a bug. You need to use NullWritable to get a null key in the output:

context.write(NullWritable.get(), new Text(finalJoinString));
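
For this fix to compile, the reducer's declared output key type and the driver configuration have to agree with the new key. A sketch of the two spots that change, assuming an otherwise standard Job setup (the driver is not shown in the post):

// In JoinReducer: the third type parameter becomes NullWritable
public class JoinReducer extends Reducer<Text, Text, NullWritable, Text> { ... }

// In the driver: the final output key class changes too, and since the
// map output stays (Text, Text) it must now be set explicitly
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(Text.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);

With TextOutputFormat, a NullWritable key is skipped along with the key/value separator, so each output line contains only the joined string.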
