
I often write this helper to partition an RDD, akin to the `partition` method on collections in the Scala standard library.

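import org.apache.spark.rdd.RDD

// Splits xs into (elements matching predicate, elements that do not).
def partition[T](xs: RDD[T], predicate: T => Boolean): (RDD[T], RDD[T]) = {
  (xs.filter(predicate), xs.filter(!predicate(_)))
}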
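
For reference, here is a minimal sketch of the standard-library behaviour the helper above mirrors. It uses a plain `List` rather than an RDD, so it runs without Spark. The names `PartitionDemo`, `xs`, and `isEven` are only illustrative.

```scala
object PartitionDemo {
  // Illustrative input and predicate (hypothetical names).
  val xs: List[Int] = List(1, 2, 3, 4, 5, 6)
  val isEven: Int => Boolean = _ % 2 == 0

  // Standard-library partition: one call returns both halves.
  val (evens, odds) = xs.partition(isEven)

  // The same split written as two filters, the shape the RDD helper uses.
  val viaFilters: (List[Int], List[Int]) =
    (xs.filter(isEven), xs.filter(!isEven(_)))

  def main(args: Array[String]): Unit = {
    println(evens)      // List(2, 4, 6)
    println(odds)       // List(1, 3, 5)
    println(viaFilters) // (List(2, 4, 6),List(1, 3, 5))
  }
}
```

On a local collection the two forms give the same result. The two-filter form evaluates the predicate twice per element.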

I have never been able to find such a method in the Spark API. Does one exist?

Synesso
