Return a new RDD by applying a function to each element of this RDD.

# Assumes an existing SparkContext named sc (e.g. in the pyspark shell).
rdd = sc.parallelize(["b", "a", "c"])
sorted(rdd.map(lambda x: (x, 1)).collect())
# [('a', 1), ('b', 1), ('c', 1)]

Here is what the code above does:
1. We create an RDD with three elements: “a”, “b”, and “c”.
2. We map each element x to the tuple (x, 1).
3. We collect the RDD's elements into a Python list on the driver.
4. We sort the list.

The result is a sorted list of tuples.
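
The same four steps can be sketched in plain Python, without Spark, to show that map produces exactly one output per input element. This is only an analogy and not how Spark runs the job: the names elements, to_pair, pairs, and result are hypothetical, the built-in map stands in for rdd.map, and the input list is an assumption chosen to match the expected output shown above.

```python
# Plain-Python analogue of the RDD example (no Spark required).
elements = ["b", "a", "c"]            # stands in for sc.parallelize([...])


def to_pair(x):
    """Pair an element with the number 1, like the lambda in the example."""
    return (x, 1)


pairs = list(map(to_pair, elements))  # stands in for rdd.map(...).collect()
result = sorted(pairs)                # the same sorting step as in the example

print(pairs)   # [('b', 1), ('a', 1), ('c', 1)]
print(result)  # [('a', 1), ('b', 1), ('c', 1)]
```

Note that the mapped list keeps the input order; only the final sorted call reorders the tuples.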
