java - Map task stuck at 50%


I have the input and output key/value classes for my mapper and reducer set as below.

    // reducer
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(MapperOutput.class);

    // mapper
    job.setMapOutputKeyClass(LongWritable.class);
    job.setMapOutputValueClass(MapperOutput.class);

Here MapperOutput is a custom class that I defined, and it implements the Writable interface.
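For readers unfamiliar with Writable, here is a minimal sketch of what such a value class usually looks like. The real MapperOutput is not shown in the post, so everything here is an assumption: WritableLike is a local stand-in with the same two methods as org.apache.hadoop.io.Writable (so the sketch compiles without Hadoop), and MapperOutputSketch, itemCounts and WritableRoundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in with the same two methods as org.apache.hadoop.io.Writable,
// declared locally so the sketch compiles without Hadoop on the classpath.
interface WritableLike {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical value type; the fields of the real MapperOutput are unknown.
class MapperOutputSketch implements WritableLike {
    long transactionCount;
    int[] itemCounts = new int[0];

    // Hadoop creates Writable values reflectively, so a no-arg constructor is needed.
    MapperOutputSketch() { }

    MapperOutputSketch(long transactionCount, int[] itemCounts) {
        this.transactionCount = transactionCount;
        this.itemCounts = itemCounts.clone();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(transactionCount);
        out.writeInt(itemCounts.length); // length prefix for the array
        for (int c : itemCounts) {
            out.writeInt(c);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in exactly the order write() emitted them.
        transactionCount = in.readLong();
        itemCounts = new int[in.readInt()];
        for (int i = 0; i < itemCounts.length; i++) {
            itemCounts[i] = in.readInt();
        }
    }
}

class WritableRoundTrip {
    // Serializes a value to bytes and reads it back into a fresh instance.
    static MapperOutputSketch roundTrip(MapperOutputSketch value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            value.write(new DataOutputStream(bytes));
            MapperOutputSketch copy = new MapperOutputSketch();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MapperOutputSketch copy = roundTrip(new MapperOutputSketch(42L, new int[] {3, 1, 4}));
        System.out.println(copy.transactionCount + " " + copy.itemCounts.length);
    }
}
```

A round trip like this is a cheap way to check a custom value class outside Hadoop: a missing no-arg constructor, or write and readFields that disagree on field order, will break the job between the map and reduce phases.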

A part of my mapper function is below.

    public void map(LongWritable arg0, Text arg1, Context context)
            throws IOException
    {
        try
        {
            String tran = null;
            String ip = arg1.toString();
            System.out.println(ip);
            BufferedReader br = new BufferedReader(new StringReader(ip));
            HSynopsis bDelta = null;
            HSynopsis b = null, bNew = null;

            hashEntries = (int) Math.floor(calculateHashEntries()); // hash table size
            System.out.println("hash entries: " + hashEntries);

            // Initialize the main hash table and the delta hash table
            hashTable = new ArrayList<>(hashEntries);
            for (int i = 0; i < hashEntries; i++)
            {
                hashTable.add(i, null);
            }

            deltaHashTable = new ArrayList<>(hashEntries);
            for (int i = 0; i < hashEntries; i++)
            {
                deltaHashTable.add(i, null);
            }

            while ((tran = br.readLine()) != null)
            {
                createBinaryRep(tran);
                for (int i = 0; i < deltaHashTable.size(); i++)
                {
                    bDelta = deltaHashTable.get(i);
                    if (bDelta != null)
                    {
                        if (bDelta.nlast_access >= (alpha * transactionCount))
                        {
                            // Transmit bDelta to the coordinator
                            MapperOutput mp = new MapperOutput(transactionCount, bDelta);
                            context.write(new LongWritable(i), mp);

                            // Merge bDelta into b
                            b = hashTable.get(i);
                            bNew = merge(b, bDelta);
                            hashTable.set(i, bNew);

                            // Release bDelta
                            deltaHashTable.set(i, null);
                        }
                    }
                }
            }
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

My reducer task is below.

    public void reduce(LongWritable index, Iterator<MapperOutput> mpValues, Context context)
    {
        while (mpValues.hasNext())
        {
            /* some code here */
        }

        context.write(index, mp);
    }

As the mapper code shows, the algorithm requires me to send output to the reducer as and when the condition is satisfied (inside the for loop), and after writing to the context the mapper continues executing the loop.

When I try to run the code on a single-node Hadoop cluster, I get the following log.

    15/04/29 03:19:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    15/04/29 03:19:23 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    15/04/29 03:19:23 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
    15/04/29 03:19:23 INFO input.FileInputFormat: Total input paths to process : 2
    15/04/29 03:19:23 WARN snappy.LoadSnappy: Snappy native library not loaded
    15/04/29 03:19:24 INFO mapred.JobClient: Running job: job_local599819429_0001
    15/04/29 03:19:24 INFO mapred.LocalJobRunner: Waiting for map tasks
    15/04/29 03:19:24 INFO mapred.LocalJobRunner: Starting task: attempt_local599819429_0001_m_000000_0
    15/04/29 03:19:24 INFO util.ProcessTree: setsid exited with exit code 0
    15/04/29 03:19:24 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@74ff364a
    15/04/29 03:19:24 INFO mapred.MapTask: Processing split: file:/home/pooja/adm/frequentpatternmining/input/file.dat~:0+24
    15/04/29 03:19:24 INFO mapred.MapTask: io.sort.mb = 100
    15/04/29 03:19:24 INFO mapred.MapTask: data buffer = 79691776/99614720
    15/04/29 03:19:24 INFO mapred.MapTask: record buffer = 262144/327680
    15/04/29 03:19:24 INFO mapred.MapTask: Starting flush of map output
    15/04/29 03:19:24 INFO mapred.MapTask: Starting flush of map output
    15/04/29 03:19:25 INFO mapred.JobClient:  map 0% reduce 0%
    15/04/29 03:19:30 INFO mapred.LocalJobRunner:
    15/04/29 03:19:31 INFO mapred.JobClient:  map 50% reduce 0%

The map task gets stuck at 50% and doesn't proceed.

When I run the map function separately (not in Hadoop), I don't have this problem of an infinite loop.

Can someone please help me with this?

Edit 1: My input file is on the order of KB. Could that be causing a problem with the distribution of data to the mappers?

Edit 2: As mentioned in the answer, I changed Iterator to Iterable. The map still gets stuck, now at 100%, and after some time it restarts.

I see the following in the JobTracker log:

    2015-04-29 13:26:28,026 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201504291300_0003_m_000000_0: Task attempt_201504291300_0003_m_000000_0 failed to report status for 600 seconds. Killing!
    2015-04-29 13:26:28,026 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201504291300_0003_m_000000_0'

You have mistakenly used Iterator in the reduce function instead of Iterable.

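You need to use Iterable because you are using the new MapReduce API: the reduce(Object, Iterable, org.apache.hadoop.mapreduce.Reducer.Context) method is the one the framework calls, once for each key with the collection of values for that key, in the sorted inputs. A reduce method that takes an Iterator has a different signature, so it does not override that method and is never invoked.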
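To illustrate why the parameter type matters, here is a small self-contained sketch in plain Java, with no Hadoop dependency. BaseReducer is a hypothetical stand-in for org.apache.hadoop.mapreduce.Reducer and is simplified to keys and values only; the other class names are also made up for this example. It assumes, as in Hadoop's Reducer, that the default reduce passes every value through unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Hadoop's Reducer: the "framework" only ever
// calls reduce(key, Iterable), and the default passes values through.
class BaseReducer {
    final List<String> emitted = new ArrayList<>();

    protected void reduce(Long key, Iterable<String> values) {
        for (String v : values) {
            emitted.add(key + ":" + v); // identity behaviour
        }
    }

    // Plays the role of the framework driving the reducer.
    final void run(Long key, List<String> values) {
        reduce(key, values);
    }
}

// Takes an Iterator: this is a new overload, so run() never reaches it.
class IteratorReducer extends BaseReducer {
    public void reduce(Long key, Iterator<String> values) {
        int n = 0;
        while (values.hasNext()) {
            values.next();
            n++;
        }
        emitted.add(key + ":count=" + n);
    }
}

// Takes an Iterable: a real override, which @Override lets the compiler verify.
class IterableReducer extends BaseReducer {
    @Override
    protected void reduce(Long key, Iterable<String> values) {
        int n = 0;
        for (String ignored : values) {
            n++;
        }
        emitted.add(key + ":count=" + n);
    }
}

class OverloadDemo {
    public static void main(String[] args) {
        BaseReducer wrong = new IteratorReducer();
        wrong.run(7L, Arrays.asList("a", "b"));
        System.out.println(wrong.emitted); // [7:a, 7:b]  (custom logic skipped)

        BaseReducer right = new IterableReducer();
        right.run(7L, Arrays.asList("a", "b"));
        System.out.println(right.emitted); // [7:count=2] (custom logic used)
    }
}
```

Annotating the reduce method with @Override is a cheap safeguard: with the Iterator signature the compiler rejects the annotation, so the mistake shows up at build time instead of as a job that silently runs the default reducer.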

