Kafka is a message broker that passes messages between producers and consumers, whereas Spark Structured Streaming consumes static and streaming data from sources such as Kafka, Flume, Twitter, or a socket. That data can be processed and analysed with high-level algorithms, including machine learning, and the results pushed out to an external storage system. The main advantage of Structured Streaming is that the result is updated continuously and incrementally as streaming data continues to arrive.
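To make the idea of a continuously updated result more concrete, here is a minimal plain-Python sketch. It is not Spark code: it only imitates how a running aggregate can be refreshed as each new batch of data arrives, instead of being recomputed from scratch. The class name `RunningWordCount` and its `update` method are hypothetical and chosen just for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


class RunningWordCount:
    """Keeps a word count that is refreshed as each batch of lines arrives."""

    def __init__(self) -> None:
        # State carried over between batches (hypothetical, in-memory only).
        self._counts: Counter = Counter()

    def update(self, batch: Iterable[str]) -> Dict[str, int]:
        # Fold only the new lines into the existing state, then return
        # a snapshot of the full result so far.
        for line in batch:
            self._counts.update(line.split())
        return dict(self._counts)


counter = RunningWordCount()
first = counter.update(["kafka spark", "spark"])  # result after batch 1
second = counter.update(["kafka"])                # result after batch 2
```

Each call to `update` returns the complete result so far, which is roughly what a streaming query in complete output mode gives you after every trigger.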
Kafka has its own streams library, which is best suited to transforming data from one Kafka topic to another, whereas Spark Streaming can be integrated with almost any type of system. For more detail you can refer to this blog.
In this blog I’ll cover an end-to-end integration of Kafka with Spark Structured Streaming, using Kafka as the source and Spark Structured Streaming as the sink.
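As a rough preview of what such an integration can look like, here is a small PySpark-style sketch. It is an assumption-laden illustration rather than the exact pipeline of the full post: the broker address `localhost:9092`, the topic name `demo-topic`, the console output, and the helper names `kafka_source_options` and `start_console_query` are all made up for this example. Running the streaming part also assumes that PySpark is installed and that the Spark Kafka connector package (`spark-sql-kafka-0-10`) is available to the session.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Build the options passed to Spark's Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,  # assumed broker address
        "subscribe": topic,                            # assumed topic name
        "startingOffsets": "earliest",                 # read the topic from the beginning
    }


def start_console_query(spark, options: Dict[str, str]):
    """Read from Kafka and print each micro-batch to the console.

    `spark` is expected to be an existing SparkSession.
    """
    stream = spark.readStream.format("kafka").options(**options).load()
    # Kafka delivers key and value as binary, so cast them to strings first.
    messages = stream.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
    return messages.writeStream.format("console").outputMode("append").start()


# Hypothetical values, for illustration only.
options = kafka_source_options("localhost:9092", "demo-topic")
```

With a running broker and a Spark session in hand, calling `start_console_query(spark, options).awaitTermination()` would keep the query alive and print new messages as they arrive.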