Simple sample Kafka producer that reads image files from a given directory and stores them in a Kafka topic. The Kafka magic is done inside kafka.KafkaImageProducer. Together with the kafka-picture-consumer it demonstrates Kafka's two working modes, Queuing and Publish/Subscribe.
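The essence of such a producer can be sketched with the plain kafka-clients API. This is only an illustration, not the actual kafka.KafkaImageProducer: the class name, directory scan, and serializer choices below are assumptions; the topic and broker defaults match the table further down.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class ImageProducerSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());

        // Collect the png frames sorted by file name, so the movie frames
        // are appended to the log in playback order.
        List<Path> frames;
        try (Stream<Path> files = Files.list(Paths.get("."))) {
            frames = files.filter(p -> p.toString().endsWith(".png"))
                          .sorted()
                          .collect(Collectors.toList());
        }

        // Publish every frame as one Kafka message (file name as key, bytes as value).
        try (Producer<String, byte[]> producer = new KafkaProducer<>(props)) {
            for (Path frame : frames) {
                byte[] payload = Files.readAllBytes(frame);
                producer.send(new ProducerRecord<>("images", frame.getFileName().toString(), payload));
            }
        }
    }
}
```

Sorting before sending matters: Kafka preserves the order of messages within a partition, so whatever order the producer appends is the order consumers will replay.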
Start this application together with the kafka-picture-consumer, lean back, and enjoy your movie. 😏
You'll need a running Kafka and Zookeeper instance; you'll find some hints on creating the test setup at the end of this document. Furthermore, you'll need a movie split into single frames (google for "vlc scene video filter") or simply use the memoryImageProducer (the default)...
In the easiest case you simply run

```
java -jar kafka-picture-producer-0.1.0.jar
```

(make sure you changed to the target folder before execution), but there are also some command line arguments:

```
java -jar kafka-picture-producer-0.1.0.jar [--imagePath] [--zookeeper.broker.host] [--kafka.topic] [--kafka.broker.host] [--kafka.partition.count] [--kafka.replication.count] [--imageProducer]
```
argument name | argument value | default |
---|---|---|
--zookeeper.broker.host | zookeeper host (needed for topic creation only) | localhost:2181 |
--kafka.topic | topic the images are published to | images |
--kafka.broker.host | a Kafka broker to connect to initially | localhost:9092 |
--kafka.partition.count | number of partitions for the topic | 1 |
--kafka.replication.count | number of brokers this topic is replicated to | 1 |
--imageProducer | name of the image producer to use; possible values are memoryImageProducer and fileSystemImageProducer | memoryImageProducer |
--imagePath | path to read images (png) from; used by fileSystemImageProducer | . |
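For example, to read prepared movie frames from a directory instead of using the built-in memoryImageProducer (the path here is just a placeholder for wherever you stored your frames):

```shell
java -jar kafka-picture-producer-0.1.0.jar \
  --imageProducer=fileSystemImageProducer \
  --imagePath=/path/to/frames
```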
These command line arguments can also be set in the application.yml.
This project uses Spring Boot as application framework and Apache Maven to build. The application was written against Kafka 3.2. Simply run

```
./mvnw clean install
```

to build the executable jar file; you will then find it under target.
The easiest way to get started locally is to use Docker. Please consult the Kafka documentation to learn more about the commands needed to run Kafka inside Docker.
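As a rough sketch only (the image names, versions, and environment variables below are assumptions based on the commonly used Confluent images, not something this project prescribes), a single-node setup matching the defaults above could look like this:

```shell
# Zookeeper (the producer needs it for topic creation), on localhost:2181
docker run -d --name zookeeper -p 2181:2181 \
  -e ZOOKEEPER_CLIENT_PORT=2181 \
  confluentinc/cp-zookeeper:7.2.0

# A single Kafka broker, reachable from the host at localhost:9092
docker run -d --name kafka -p 9092:9092 --link zookeeper \
  -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
  -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
  confluentinc/cp-kafka:7.2.0
```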
The demo is meant to show what Apache Kafka means by log (an ordered set of messages, stored regardless of whether or how often they are consumed) and to explain the differences between the Queuing and Publish/Subscribe modes of Apache Kafka. Start the test setup.
Now start two consumer instances of the kafka-picture-consumer with the same consumer group id, e.g.

```
java -Djava.awt.headless=false -jar kafka-picture-consumer-2.0.0.jar --kafka.group.id=1 &
java -Djava.awt.headless=false -jar kafka-picture-consumer-2.0.0.jar --kafka.group.id=1 &
```

and another one with a different consumer group id, like so:

```
java -Djava.awt.headless=false -jar kafka-picture-consumer-2.0.0.jar --kafka.group.id=2 &
```
Arrange the windows so you can see them all.
Now start the producer and observe what happens on the consumer windows.
Note that the messages (the images in this demo) are only stored for 3 minutes, so hurry 😏
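The 3-minute lifetime corresponds to a topic retention of 180000 ms. If you want to inspect or change it yourself, a topic-level override can be set with the stock Kafka tooling (shown here against the default topic and broker; whether this project sets retention this way or programmatically is an assumption):

```shell
# Override retention of the images topic to 3 minutes (180000 ms)
kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name images \
  --add-config retention.ms=180000
```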
So what will happen? You'll see the movie running (too fast, for sure 😏) in two consumer windows; one will stay blank. As the consumer group id is displayed in the title of each window, you'll find that the consumer staying blank has the same consumer group id as another consumer. This is Kafka's Queuing mode, with the edge case that there are fewer partitions than consumers.
This demo also shows the Publish/Subscribe mode. You'll find that consumers with different consumer group ids each pull all the messages, so this works like broadcasting.
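The assignment behaviour behind both observations can be sketched in plain Java, with no Kafka involved (the round-robin strategy below is a simplification of Kafka's real partition assignors): within one group, each partition goes to at most one consumer, so surplus consumers idle; every group independently gets the full set of partitions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AssignmentSketch {
    // Round-robin assignment of a topic's partitions to the consumers of ONE group.
    public static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) assignment.put(consumer, new ArrayList<>());
        for (int p = 0; p < partitionCount; p++) {
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Group 1 has two consumers, but the topic has only one partition:
        // one consumer gets partition 0, the other stays idle (the blank window).
        System.out.println(assign(List.of("g1-a", "g1-b"), 1));

        // Group 2 independently gets its own full copy of the log (Publish/Subscribe).
        System.out.println(assign(List.of("g2-a"), 1));
    }
}
```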
What else do you see? The single movie frames are shown in the correct order: the first message that came in is consumed first.
Now start an additional consumer (I hope you do so within the 3-minute lifetime of the messages), like so:

```
java -Djava.awt.headless=false -jar kafka-picture-consumer-2.0.0.jar --kafka.group.id=3 &
```
You'll find the movie starts playing. This is because Kafka keeps the messages (until the configured retention time expires) regardless of whether or how often they were consumed.