Databricks Spark Reference Applications

Batch Data Import

This section covers batch importing data into Apache Spark, as seen in the non-streaming examples from Chapter 1. Those examples load data from files all at once into one RDD and process that RDD; then the job completes and the program exits. In a production system, you could set up a cron job to kick off a batch job each night that processes the previous day's log files and publishes statistics for that day.
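
As a rough illustration of the nightly pattern described above, here is a minimal PySpark sketch. It assumes logs are written to one directory per day under a `<base_dir>/YYYY-MM-DD/*.log` layout; that layout, the application name, and the function names `previous_day_glob` and `run_nightly_job` are all hypothetical and not taken from the reference applications. Counting lines stands in for whatever processing a real job would do.

```python
from datetime import date, timedelta


def previous_day_glob(base_dir: str, run_date: date) -> str:
    """Return a glob matching the log files written the day before run_date.

    The <base_dir>/YYYY-MM-DD/*.log layout is an assumption for illustration.
    """
    day = run_date - timedelta(days=1)
    return f"{base_dir.rstrip('/')}/{day.isoformat()}/*.log"


def run_nightly_job(base_dir: str, run_date: date) -> int:
    """Load the previous day's logs into one RDD, process it, and exit."""
    # Imported here so the path helper above can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="NightlyLogBatch")  # hypothetical app name
    try:
        # textFile accepts glob patterns, so one call loads every matching file.
        lines = sc.textFile(previous_day_glob(base_dir, run_date))
        # Placeholder processing step: count the log lines.
        return lines.count()
    finally:
        # Stop the context so the program exits cleanly once the job completes.
        sc.stop()
```

A script that calls `run_nightly_job("/var/log/myapp", date.today())` could then be scheduled from cron and launched with `spark-submit`; the base directory here is only an example path.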

Importing From Files covers caveats when importing data from files.
Importing from Databases links to examples of reading data from databases.