streams-bootstrap
streams-bootstrap provides base classes and utility functions for Kafka Streams applications.
It provides a common way to
- configure Kafka Streams applications
- deploy streaming applications on Kubernetes via Helm charts
- reprocess data
Visit our blog post and demo for an overview and an example application.
The common configuration and deployments on Kubernetes are supported by
the Streams Explorer, which makes it possible to explore and monitor data
pipelines in Apache Kafka.
You can add streams-bootstrap via Maven Central.
Gradle:
implementation group: 'com.bakdata.kafka', name: 'streams-bootstrap-cli', version: '3.0.0'
Gradle (Kotlin DSL):
implementation(group = "com.bakdata.kafka", name = "streams-bootstrap-cli", version = "3.0.0")
Maven:
<dependency>
    <groupId>com.bakdata.kafka</groupId>
    <artifactId>streams-bootstrap-cli</artifactId>
    <version>3.0.0</version>
</dependency>
For other build tools or versions, refer to the latest version in MvnRepository.
Create a subclass of KafkaStreamsApplication and implement createApp(). The returned StreamsApp defines the abstract methods buildTopology() and getUniqueAppId(). You can define the topology of your application in buildTopology().
import com.bakdata.kafka.KafkaStreamsApplication;
import com.bakdata.kafka.SerdeConfig;
import com.bakdata.kafka.StreamsApp;
import com.bakdata.kafka.StreamsTopicConfig;
import com.bakdata.kafka.TopologyBuilder;
import java.util.Map;
import org.apache.kafka.common.serialization.Serdes.StringSerde;
import org.apache.kafka.streams.kstream.KStream;
public class MyStreamsApplication extends KafkaStreamsApplication {
    public static void main(final String[] args) {
        startApplication(new MyStreamsApplication(), args);
    }

    @Override
    public StreamsApp createApp() {
        return new StreamsApp() {
            @Override
            public void buildTopology(final TopologyBuilder builder) {
                final KStream<String, String> input = builder.streamInput();

                // your topology

                input.to(builder.getTopics().getOutputTopic());
            }

            @Override
            public String getUniqueAppId(final StreamsTopicConfig topics) {
                return "streams-bootstrap-app-" + topics.getOutputTopic();
            }

            @Override
            public SerdeConfig defaultSerializationConfig() {
                return new SerdeConfig(StringSerde.class, StringSerde.class);
            }

            // Optionally you can define custom Kafka properties
            @Override
            public Map<String, Object> createKafkaProperties() {
                return Map.of(
                        // your config
                );
            }
        };
    }
}
The following configuration options are available (see the usage sketch below):
- --bootstrap-servers, --bootstrap-server: List of Kafka bootstrap servers (comma-separated) (required)
- --schema-registry-url: The URL of the Schema Registry
- --kafka-config: Kafka Streams configuration (<String=String>[,<String=String>...])
- --input-topics: List of input topics (comma-separated)
- --input-pattern: Pattern of input topics
- --output-topic: The output topic
- --error-topic: A topic to write errors to
- --labeled-input-topics: Additional labeled input topics if you need to specify multiple topics with different message types (<String=String>[,<String=String>...])
- --labeled-input-patterns: Additional labeled input patterns if you need to specify multiple topics with different message types (<String=String>[,<String=String>...])
- --labeled-output-topics: Additional labeled output topics if you need to specify multiple topics with different message types (<String=String>[,<String=String>...])
- --application-id: Unique application ID to use for Kafka Streams. Can also be provided by implementing StreamsApp#getUniqueAppId()
- --volatile-group-instance-id: Whether the group instance id is volatile, i.e., it will change on a Streams shutdown.
Additionally, the following commands are available:
- clean: Reset the Kafka Streams application. Additionally, delete the consumer group and all output and intermediate topics associated with the Kafka Streams application.
- reset: Clear all state stores, consumer group offsets, and internal topics associated with the Kafka Streams application.
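These options are passed to the application as regular program arguments and parsed by startApplication() (see the main method in the example above). As a minimal sketch, not part of streams-bootstrap itself, the broker address, topic names, and the runner class below are invented for illustration:
import com.bakdata.kafka.KafkaStreamsApplication;

public class MyStreamsApplicationRunner {
    public static void main(final String[] args) {
        // Pass the CLI options as regular program arguments (placeholder values)
        KafkaStreamsApplication.startApplication(new MyStreamsApplication(), new String[]{
                "--bootstrap-servers", "localhost:9092",
                "--input-topics", "my-input",
                "--output-topic", "my-output",
                "--kafka-config", "max.poll.records=100"
        });
    }
}
In a packaged application you would pass the same options on the command line instead of hard-coding them.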
Create a subclass of KafkaProducerApplication.
import com.bakdata.kafka.KafkaProducerApplication;
import com.bakdata.kafka.ProducerApp;
import com.bakdata.kafka.ProducerBuilder;
import com.bakdata.kafka.ProducerRunnable;
import com.bakdata.kafka.SerializerConfig;
import java.util.Map;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.common.serialization.StringSerializer;
public class MyProducerApplication extends KafkaProducerApplication {
    public static void main(final String[] args) {
        startApplication(new MyProducerApplication(), args);
    }

    @Override
    public ProducerApp createApp() {
        return new ProducerApp() {
            @Override
            public ProducerRunnable buildRunnable(final ProducerBuilder builder) {
                return () -> {
                    try (final Producer<Object, Object> producer = builder.createProducer()) {
                        // your producer
                    }
                };
            }

            @Override
            public SerializerConfig defaultSerializationConfig() {
                return new SerializerConfig(StringSerializer.class, StringSerializer.class);
            }

            // Optionally you can define custom Kafka properties
            @Override
            public Map<String, Object> createKafkaProperties() {
                return Map.of(
                        // your config
                );
            }
        };
    }
}
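The buildRunnable body above is only a placeholder. As a rough sketch (the topic name, key, and value below are invented for illustration), the runnable could send a single record and flush before the producer is closed:
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

@Override
public ProducerRunnable buildRunnable(final ProducerBuilder builder) {
    return () -> {
        try (final Producer<Object, Object> producer = builder.createProducer()) {
            // Send one record to a placeholder topic; try-with-resources closes the producer
            producer.send(new ProducerRecord<>("my-output-topic", "my-key", "my-value"));
            producer.flush();
        }
    };
}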
The following configuration options are available (see the usage sketch below):
- --bootstrap-servers, --bootstrap-server: List of Kafka bootstrap servers (comma-separated) (required)
- --schema-registry-url: The URL of the Schema Registry
- --kafka-config: Kafka producer configuration (<String=String>[,<String=String>...])
- --output-topic: The output topic
- --labeled-output-topics: Additional labeled output topics (<String=String>[,<String=String>...])
Additionally, the following command is available:
- clean: Delete all output topics associated with the Kafka Producer application.
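As with the streams application, these options are plain program arguments. A minimal sketch (the broker address, topic name, and runner class are placeholders, not part of streams-bootstrap):
import com.bakdata.kafka.KafkaProducerApplication;

public class MyProducerApplicationRunner {
    public static void main(final String[] args) {
        // Pass the CLI options as regular program arguments (placeholder values)
        KafkaProducerApplication.startApplication(new MyProducerApplication(), new String[]{
                "--bootstrap-servers", "localhost:9092",
                "--output-topic", "my-output"
        });
    }
}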
For the configuration and deployment to Kubernetes, you can use the Helm charts.
To configure your streams app, you can use the values.yaml as a starting point. We also provide a chart to clean your streams app.
To configure your producer app, you can use the values.yaml as a starting point. We also provide a chart to clean your producer app.
If you want to contribute to this project, you can simply clone the repository and build it via Gradle. All dependencies should be included in the Gradle files; there are no external prerequisites.
> git clone [email protected]:bakdata/streams-bootstrap.git
> cd streams-bootstrap && ./gradlew build
Please note that we have code styles for Java. They are basically the Google style guide with some small modifications.
We are happy if you want to contribute to this project. If you find any bugs or have suggestions for improvements, please open an issue. We are also happy to accept your PRs. Just open an issue beforehand and let us know what you want to do and why.
This project is licensed under the MIT license. Have a look at the LICENSE for more details.