A Spark job reads from a Kafka topic, manipulates the data as Datasets/DataFrames, and writes it to Cassandra.
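For orientation, here is a minimal Structured Streaming sketch of such a pipeline. It is only an illustration, not the actual code in `Main.scala`: it assumes Spark 3.x with the `spark-sql-kafka` and `spark-cassandra-connector` dependencies on the classpath, and the topic name, broker address and JSON payload layout are assumptions; the keyspace/table names match the cqlsh check below.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.StructType

object PipelineSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-cassandra-sketch")
      .master("local[*]")
      .config("spark.cassandra.connection.host", "localhost")
      .getOrCreate()
    import spark.implicits._

    // Read the raw Kafka stream; the value column arrives as bytes.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker address
      .option("subscribe", "test_topic")                    // assumed topic name
      .load()

    // Parse the JSON payload into typed columns (assumed schema).
    val schema = new StructType().add("id", "int").add("value", "double")
    val parsed = raw
      .select(from_json($"value".cast("string"), schema).as("payload"))
      .select("payload.*")

    // Write each micro-batch to Cassandra via the spark-cassandra-connector.
    parsed.writeStream
      .option("checkpointLocation", "/tmp/kafka-cassandra-sketch-checkpoint")
      .foreachBatch { (batch: DataFrame, _: Long) =>
        batch.write
          .format("org.apache.spark.sql.cassandra")
          .option("keyspace", "my_keyspace")
          .option("table", "test_table")
          .mode("append")
          .save()
      }
      .start()
      .awaitTermination()
  }
}
```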
- Inside the `setup` directory, run `docker-compose up -d` to launch instances of `zookeeper`, `kafka` and `cassandra`.
- Wait for a few seconds and then run `docker ps` to make sure all three services are running.
- Then run `pip install -r requirements.txt`.
- `main.py` generates some random data and publishes it to a topic in Kafka (a rough Scala sketch of the same idea follows this list).
- Run the Spark app using `sbt clean compile run` in a console. This app listens on a topic (check `Main.scala`) and writes the data to Cassandra.
- Run `main.py` again to write some test data to a Kafka topic.
- Finally, check whether the data has been published to Cassandra:
  - Go to cqlsh: `docker exec -it cas_01_test cqlsh localhost`
  - And then run `select * from my_keyspace.test_table;`
- Another branch, `avro-example`, contains Avro deserialization code (see the Avro sketch after this list).
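`main.py` is a Python script; as a language-consistent illustration of the same idea, the sketch below publishes a handful of random JSON records to Kafka with the plain Kafka producer client. The topic name, broker address and payload shape are assumptions, not necessarily what `main.py` actually sends.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import scala.util.Random

object ProducerSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092") // assumed broker address
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // Publish a few random JSON records, mirroring what main.py does.
      (1 to 10).foreach { i =>
        val payload = s"""{"id": $i, "value": ${Random.nextDouble()}}"""
        producer.send(new ProducerRecord[String, String]("test_topic", payload)) // assumed topic
      }
      producer.flush()
    } finally {
      producer.close()
    }
  }
}
```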
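The `avro-example` branch's actual code may differ; the sketch below shows one common way to decode Avro-encoded Kafka values in Spark, using the `spark-avro` package's `from_avro` (Spark 3.x). The record schema and topic name are assumed for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.avro.functions.from_avro
import org.apache.spark.sql.functions.col

object AvroSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("avro-deserialization-sketch")
      .master("local[*]")
      .getOrCreate()

    // Assumed writer schema for the Kafka payload.
    val schemaJson =
      """{"type":"record","name":"TestRecord","fields":[
        |  {"name":"id","type":"int"},
        |  {"name":"value","type":"double"}
        |]}""".stripMargin

    val decoded = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker address
      .option("subscribe", "test_topic")                    // assumed topic name
      .load()
      // from_avro turns the binary value column into a struct column.
      .select(from_avro(col("value"), schemaJson).as("record"))
      .select("record.*")

    // Print decoded rows to the console for a quick check.
    decoded.writeStream
      .format("console")
      .option("checkpointLocation", "/tmp/avro-sketch-checkpoint")
      .start()
      .awaitTermination()
  }
}
```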
Credits:
- This repository has borrowed some snippets from the killrweather app.