scala - Apache Spark DISTINCT showing unexpected behaviour?
I am trying to apply the distinct transformation on a sample RDD of strings (1, 2, 3, 3). When I apply distinct I expect the array (1, 2, 3), but I am getting a blank array instead. Can someone please explain this behaviour?
scala> val l = sc.textFile("/users/shubhro/documents/datafiles/clean/samplerdd")
l: org.apache.spark.rdd.RDD[String] = /users/shubhro/documents/datafiles/clean/samplerdd MapPartitionsRDD[34] at textFile at <console>:26

scala> l.collect()
res19: Array[String] = Array(1, 2, 3, 3)

scala> l.distinct().collect()
res20: Array[String] = Array()
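For comparison, here is a minimal, self-contained sketch of how distinct() normally behaves on the same data when the RDD is built in memory with sc.parallelize instead of read from a file (the app name and local master setting are illustrative choices, not from the question):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DistinctSketch {
  def main(args: Array[String]): Unit = {
    // Local SparkContext purely for illustration.
    val conf = new SparkConf().setAppName("distinct-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Same sample values as the question, built in memory rather than from a file.
    val l = sc.parallelize(Seq("1", "2", "3", "3"))

    // distinct() deduplicates via a shuffle, so the output order is not
    // guaranteed; sorting makes the printed result deterministic.
    println(l.distinct().collect().sorted.mkString(", "))

    sc.stop()
  }
}
```

Note that distinct() is implemented with a shuffle (it maps each element to a key, reduces by key, then maps back), so it involves a full shuffle stage that a plain collect() does not.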