From 87513dc98fa83f2ec0de36c35c2246a8c3bb18e9 Mon Sep 17 00:00:00 2001
From: Jeremy Beard
Date: Fri, 16 Feb 2018 12:20:45 -0500
Subject: [PATCH] [ENV-239] Update example docs with more information (#162)

---
 examples/filesystem/README.md | 2 +-
 examples/fix/README.adoc      | 2 ++
 examples/traffic/README.md    | 4 +++-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/examples/filesystem/README.md b/examples/filesystem/README.md
index 4d875c7..414fc5b 100644
--- a/examples/filesystem/README.md
+++ b/examples/filesystem/README.md
@@ -12,7 +12,7 @@ This example demonstrates a simple HDFS-based data processing pipeline.
 
 **Run the Envelope job**
 
-    spark2-submit target/envelope-*.jar examples/filesystem/filesystem.conf
+    spark2-submit build/envelope/target/envelope-*.jar examples/filesystem/filesystem.conf
 
 **Grab the results**
 
diff --git a/examples/fix/README.adoc b/examples/fix/README.adoc
index 7fe921d..58da2d5 100644
--- a/examples/fix/README.adoc
+++ b/examples/fix/README.adoc
@@ -6,6 +6,8 @@ The configuration for this example is found link:fix.conf[here]. The messages do
 
 ## Running the example
 
+. Modify `create_fix_tables.sql`, `fix.conf`, and `fix_generator.conf` to point to your cluster. If your cluster has secured Kafka, you will also need to modify the configuration files and below `spark2-submit` calls (see the FIX HBase example for more details) and the `kafka-topics` and `kafka-console-consumer` calls (see the test steps in the link:https://www.cloudera.com/documentation/kafka/latest/topics/kafka_security.html#concept_lcn_4mm_s5[Cloudera Kafka documentation] for more details).
+
 . Create the required Kudu tables using the provided Apache Impala script:
 
     impala-shell -f create_fix_tables.sql
diff --git a/examples/traffic/README.md b/examples/traffic/README.md
index 2edc15a..ddcd9cb 100644
--- a/examples/traffic/README.md
+++ b/examples/traffic/README.md
@@ -2,10 +2,12 @@
 
 The traffic example is an Envelope pipeline that retrieves measurements of traffic congestion and stores an aggregated view of the traffic congestion at a point in time using the current measurement and all of those in the previous 60 seconds. Within Envelope this uses the Apache Spark Streaming window operations functionality. This example demonstrates use cases that need to do live aggregations of recently received messages prior to user querying.
 
-A sample configuration file is provided for reference. After creating the required Apache Kudu tables using the provided Apache Impala scripts, the example can be run as:
+A sample configuration file is provided for reference. After creating the required Apache Kudu tables using the provided Apache Impala scripts, and modifying the configuration file to point to your cluster, the example can be run as:
 
     SPARK_KAFKA_VERSION=0.10 spark2-submit envelope-*.jar traffic.conf
 
+Note that if your cluster has secured Kafka, you will also need to modify the configuration file and `spark2-submit` call -- see the FIX HBase example for more details.
+
 An Apache Kafka producer to generate sample messages for the example, and push them in to the "traffic" topic, can be run as:
 
     spark2-submit --class com.cloudera.labs.envelope.examples.TrafficGenerator envelope-*.jar kafkabrokerhost:9092 traffic
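
The secured-Kafka changes that these doc updates point readers to are not spelled out here; the FIX HBase example and the linked Cloudera Kafka documentation remain the authoritative steps. As a rough sketch only: on a Kerberized cluster, the traffic example's `spark2-submit` call would typically also ship a JAAS configuration and keytab so the Kafka 0.10 client can authenticate. The file names (`jaas.conf`, `user.keytab`) and the principal below are placeholders, and the matching Kafka client properties inside the Envelope configuration file itself are not shown.

    # Hypothetical jaas.conf (keytab path and principal are placeholders):
    #
    #   KafkaClient {
    #     com.sun.security.auth.module.Krb5LoginModule required
    #     useKeyTab=true
    #     keyTab="./user.keytab"
    #     principal="envelope@EXAMPLE.COM";
    #   };
    #
    # Ship the JAAS file and keytab with the job, and point the driver and
    # executor JVMs at the JAAS file via the standard JAAS system property.
    SPARK_KAFKA_VERSION=0.10 spark2-submit \
      --files jaas.conf,user.keytab \
      --driver-java-options "-Djava.security.auth.login.config=./jaas.conf" \
      --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./jaas.conf" \
      envelope-*.jar traffic.conf

The pipeline-side settings (such as `security.protocol` for the Kafka input) would also need to be added to `traffic.conf`, as the READMEs note, but the exact Envelope parameter names are left to the FIX HBase example.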