pyspark - Does Spark Streaming process RDDs one by one?


I wrote a Spark Streaming program in PySpark.

It receives a live input text stream via socketTextStream, applies transformations, and saves the result as a CSV file with saveAsTextFile. No Spark Streaming window operation is used, and no previous data is required to create the output.
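A minimal sketch of that kind of setup, assuming a local socket source and a placeholder transformation (the hostname, port, and the split/join logic are illustrative, not from the original question):

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="SocketToCsv")
    ssc = StreamingContext(sc, 5)  # 5-second batch interval (assumed)

    # Receive the live text stream (hostname/port are placeholders)
    lines = ssc.socketTextStream("localhost", 9999)

    # Stateless transformation only -- no window operation, no prior data needed
    csv_rows = lines.map(lambda line: ",".join(line.split()))

    # Write each batch out as text files (one output directory per interval)
    csv_rows.saveAsTextFiles("output/batch")

    ssc.start()
    ssc.awaitTermination()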

But it seems Spark does not start processing an RDD in the DStream until the previous RDD finishes, even when the previous RDD uses only a few partitions and little CPU/memory.

Is this Spark's default behaviour? Is there a way to change it?
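For reference, serial batch execution is indeed Spark Streaming's default: the job scheduler runs one batch's jobs at a time. The undocumented spark.streaming.concurrentJobs setting is often cited as a way to let batches overlap; a sketch of setting it follows (use with caution, since it is unsupported and can make output arrive out of order):

    from pyspark import SparkConf, SparkContext
    from pyspark.streaming import StreamingContext

    conf = (SparkConf()
            .setAppName("ConcurrentBatches")
            # Undocumented: number of streaming jobs run concurrently (default 1).
            # Values > 1 allow a new batch to start before the previous finishes.
            .set("spark.streaming.concurrentJobs", "2"))

    sc = SparkContext(conf=conf)
    ssc = StreamingContext(sc, 5)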

Can you kindly post your code and the problem you are facing?

Conceptually, the data received within each time interval forms an RDD at the end of that interval (that's the idea behind the mini-batch data abstraction).
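To make that concrete, a small sketch that observes one RDD per interval using foreachRDD (the counting logic is purely illustrative):

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="MiniBatchDemo")
    ssc = StreamingContext(sc, 2)  # each 2-second interval yields one RDD

    lines = ssc.socketTextStream("localhost", 9999)

    def show_batch(time, rdd):
        # Called once per interval: that interval's data arrives as a single RDD
        print("batch at %s: %d records" % (time, rdd.count()))

    lines.foreachRDD(show_batch)
    ssc.start()
    ssc.awaitTermination()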

