scala - Spark Fold vs Reduce in performance?


In a big data processing job, does the function "fold" have lower computation performance than the function "reduce"?

For instance, I have the following 2 functions:

    array1.indices.zip(array1).map(x => x._1 * x._2).reduce(_ + _)
    array1.indices.zip(array1).map(x => x._1 * x._2).fold(0.0) {_ + _}

array1 is a huge RDD array. Which function has higher computation performance, given the same cluster settings?

The performance is indeed the same, as muhuk pointed out: the guts of the Spark implementation merely call the corresponding method on the iterator.

fold source:

    (iter: Iterator[T]) => iter.fold(zeroValue)(cleanOp)

reduce source:

    iter =>
      if (iter.hasNext) Some(iter.reduceLeft(cleanF))
      else None

So both are just calling the Scala iterator implementations on each partition.
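To make the comparison concrete, here is a minimal sketch that applies the two per-partition functions quoted above to a plain Scala iterator, with no Spark involved. The names `FoldVsReduceSketch`, `foldPartition`, and `reducePartition` are made up for this illustration, and the small local `array1` only stands in for the RDD from the question.

```scala
object FoldVsReduceSketch {
  // Hypothetical helper mirroring the per-partition function quoted for fold.
  def foldPartition[T](zeroValue: T)(op: (T, T) => T): Iterator[T] => T =
    iter => iter.fold(zeroValue)(op)

  // Hypothetical helper mirroring the per-partition function quoted for reduce.
  def reducePartition[T](f: (T, T) => T): Iterator[T] => Option[T] =
    iter => if (iter.hasNext) Some(iter.reduceLeft(f)) else None

  def main(args: Array[String]): Unit = {
    // Small local array standing in for the RDD in the question (assumption).
    val array1 = Array(1.0, 2.0, 3.0, 4.0)

    // An iterator can be consumed only once, so build a fresh one per call.
    def products: Iterator[Double] =
      array1.indices.zip(array1).map(x => x._1 * x._2).iterator

    val foldFn = foldPartition(0.0)(_ + _)
    val reduceFn = reducePartition[Double](_ + _)

    println(foldFn(products))   // sum computed via fold
    println(reduceFn(products)) // same sum via reduceLeft, wrapped in Option

    // The only visible difference in this sketch is the empty case.
    println(foldFn(Iterator.empty))   // the zero value
    println(reduceFn(Iterator.empty)) // None
  }
}
```

Both helpers walk the iterator once and apply the same binary operation, so on non-empty data the work done is essentially the same. What differs is the handling of empty input: fold falls back to its zero value, while the reduce helper yields `None`.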

