If you are exporting a very large dataset, you can't call collect() or a similar action to read all the data from the RDD into the single driver program: the full dataset may not fit in the driver's memory, and the job could fail with an out-of-memory error. Instead, a large RDD has to be saved in a way that never gathers all of it in one place. See these two sections for more information.
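
As a rough illustration of the alternatives to collect(), here is a minimal sketch in Python. It assumes `rdd` is a PySpark RDD. The helper names `export_in_parallel` and `export_via_driver` and the `write_line` callback are hypothetical, introduced only for this example. `saveAsTextFile()` and `toLocalIterator()` are the RDD methods being demonstrated.

```python
def export_in_parallel(rdd, output_dir):
    """Write the RDD from the executors; nothing is gathered on the driver."""
    # saveAsTextFile writes one part file per partition under output_dir,
    # so the data never has to fit in a single process.
    rdd.saveAsTextFile(output_dir)


def export_via_driver(rdd, write_line):
    """Stream records through the driver one partition at a time.

    write_line is a hypothetical callback (for example, a file's write
    method wrapped to append a newline). Returns the number of records.
    """
    count = 0
    # toLocalIterator fetches one partition at a time, so the driver only
    # needs enough memory for the largest single partition.
    for record in rdd.toLocalIterator():
        write_line(str(record))
        count += 1
    return count
```

The first helper is usually the better fit when the destination is a distributed file system, since the work stays on the executors. The second may be useful when the records must pass through the driver, for instance to be written to a single local file, at the cost of processing the partitions one after another.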