Sample dataset generator for Aiven for Apache Kafka®
====================================================

Learning to work with streaming data is much more fun with actual data, so to get you started on your Apache Kafka® journey we help you create fake streaming data to a topic.

.. Note::

    The following example is based on Docker images, which require `Docker <https://www.docker.com/>`_ or `Podman <https://podman.io/>`_ to be executed.

The following example assumes you have an Aiven for Apache Kafka® service running. You can create one following the :doc:`dedicated instructions`.

Fake data generator on Docker
-----------------------------

To learn data streaming, you need a continuous flow of data, and for that you can use the `Dockerized fake data producer for Aiven for Apache Kafka® <https://github.com/aiven/fake-data-producer-for-apache-kafka-docker>`_.

To start using the generator:

1. Clone the repository:

   .. code::

      git clone https://github.com/aiven/fake-data-producer-for-apache-kafka-docker

2. Copy the file ``conf/env.conf.sample`` to ``conf/env.conf``.

3. Create a new access token via the `Aiven Console <https://console.aiven.io/>`_ or the following command in the :doc:`Aiven CLI`, changing the ``max-age-seconds`` appropriately for the duration of your test:

   .. code::

      avn user access-token create \
          --description "Token used by Fake data generator" \
          --max-age-seconds 3600 \
          --json | jq -r '.[].full_token'

   .. Tip::

       The above command uses `jq <https://stedolan.github.io/jq/>`_ to parse the result of the Aiven CLI command. If you don't have ``jq`` installed, you can remove the ``| jq -r '.[].full_token'`` section from the above command and parse the JSON result manually to extract the access token.

4. Edit the ``conf/env.conf`` file, filling in the following placeholders:

   * ``my_project_name``: the name of your Aiven project
   * ``my_kafka_service_name``: the name of your Aiven for Apache Kafka instance
   * ``my_topic_name``: the name of the target topic, which can be any name
   * ``my_aiven_email``: the email address used as username to log in to Aiven services
   * ``my_aiven_token``: the access token generated in the previous step

5. Build the Docker image with:

   .. code::

      docker build -t fake-data-producer-for-apache-kafka-docker .

   .. Tip::

       Every time you change any parameters in the ``conf/env.conf`` file, you need to rebuild the Docker image for the changes to take effect.

6. Start the streaming data flow with:

   .. code::

      docker run fake-data-producer-for-apache-kafka-docker

7. Once the Docker image is running, check in the target Aiven for Apache Kafka® service that the topic is populated. This can be done in the *Topics* tab of the `Aiven Console <https://console.aiven.io/>`_, if the Kafka REST option is enabled. Alternatively, you can use tools like :doc:`kcat` to achieve the same.
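As an example, the topic content can be inspected with ``kcat``. The following is a minimal sketch, not a ready-to-run command: the broker host and port are placeholders taken from your service's connection information, and the certificate files (``ca.pem``, ``service.cert``, ``service.key``) are assumed to have been downloaded from your service's connection details:

.. code::

   kcat -b <broker-host>:<broker-port>           \
        -X security.protocol=SSL                 \
        -X ssl.ca.location=ca.pem                \
        -X ssl.key.location=service.key          \
        -X ssl.certificate.location=service.cert \
        -C -t my_topic_name -o beginning -e

The ``-C`` flag runs ``kcat`` in consumer mode, ``-o beginning`` reads the topic from the first offset, and ``-e`` exits once the end of the topic is reached, so you get a one-shot dump of the messages produced so far.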
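The tip in step 3 mentions parsing the JSON result manually when ``jq`` is not available. A minimal sketch with standard tools follows; the JSON response here is made up purely for illustration (the real one comes from ``avn user access-token create --json``, and the ``.[].full_token`` filter in step 3 implies a list of token objects with a ``full_token`` field):

.. code::

   # Hypothetical response shape, with an illustrative token value
   response='[{"full_token": "example-token-value"}]'

   # Extract the full_token field with sed instead of jq
   echo "$response" | sed -n 's/.*"full_token": *"\([^"]*\)".*/\1/p'

This prints only the token value, which you can then paste into ``conf/env.conf`` as described in step 4.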