Knowledge Base
1. Best Practices
   1.1. Avoid GroupByKey
   1.2. Don't copy all elements of a large RDD to the driver
   1.3. Gracefully Dealing with Bad Input Data
2. General Troubleshooting
   2.1. Job aborted due to stage failure: Task not serializable:
   2.2. Missing Dependencies in Jar Files
   2.3. Error running start-all.sh - Connection refused
   2.4. Network connectivity issues between Spark components
3. Performance & Optimization
   3.1. How Many Partitions Does An RDD Have?
   3.2. Data Locality
4. Spark Streaming
   4.1. ERROR OneForOneStrategy
Databricks Spark Knowledge Base
Best Practices
Avoid GroupByKey
Don't copy all elements of a large RDD to the driver
Gracefully Dealing with Bad Input Data
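Each of the three practices listed above has its own page in this chapter. As a quick orientation, the sketch below is a minimal example (assuming a local SparkContext and toy in-memory data, not taken from the book itself) of the pattern each page recommends: reduceByKey instead of groupByKey, take(n) instead of collect(), and Try with flatMap to drop malformed records.

import scala.util.Try
import org.apache.spark.{SparkConf, SparkContext}

object BestPracticesSketch {
  def main(args: Array[String]): Unit = {
    // Local master and app name are illustrative assumptions.
    val conf = new SparkConf().setAppName("best-practices-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Avoid GroupByKey: reduceByKey combines values on each partition before
    // the shuffle, so far less data crosses the network than with groupByKey.
    val words = sc.parallelize(Seq("a", "b", "a", "c", "b", "a"))
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)

    // Don't copy all elements of a large RDD to the driver: take(n) brings back
    // only a bounded number of rows, where collect() would bring back every element.
    counts.take(10).foreach(println)

    // Gracefully dealing with bad input data: parse inside Try and flatMap the
    // resulting Option, so malformed records are dropped instead of failing the job.
    val raw = sc.parallelize(Seq("1", "2", "not-a-number", "4"))
    val parsed = raw.flatMap(s => Try(s.toInt).toOption)
    println(parsed.sum())

    sc.stop()
  }
}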