added gcp integration tests
martinzink committed Mar 8, 2022
1 parent c4da329 commit c65e698
Showing 20 changed files with 221 additions and 72 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -202,7 +202,7 @@ jobs:
if [ -d ~/.ccache ]; then mv ~/.ccache .; fi
mkdir build
cd build
-cmake -DUSE_SHARED_LIBS= -DSTRICT_GSL_CHECKS=AUDIT -DENABLE_JNI=OFF -DDISABLE_JEMALLOC=ON -DENABLE_AWS=ON -DENABLE_LIBRDKAFKA=ON -DENABLE_MQTT=ON -DENABLE_AZURE=ON -DENABLE_SQL=ON -DENABLE_SPLUNK=ON -DENABLE_OPC=ON -DENABLE_SCRIPTING=ON -DENABLE_LUA_SCRIPTING=ON -DENABLE_KUBERNETES=ON -DDOCKER_BUILD_ONLY=ON -DDOCKER_CCACHE_DUMP_LOCATION=$HOME/.ccache ..
+cmake -DUSE_SHARED_LIBS= -DSTRICT_GSL_CHECKS=AUDIT -DENABLE_JNI=OFF -DDISABLE_JEMALLOC=ON -DENABLE_AWS=ON -DENABLE_LIBRDKAFKA=ON -DENABLE_MQTT=ON -DENABLE_AZURE=ON -DENABLE_SQL=ON -DENABLE_SPLUNK=ON -DENABLE_GCP=ON -DENABLE_OPC=ON -DENABLE_SCRIPTING=ON -DENABLE_LUA_SCRIPTING=ON -DENABLE_KUBERNETES=ON -DDOCKER_BUILD_ONLY=ON -DDOCKER_CCACHE_DUMP_LOCATION=$HOME/.ccache ..
make docker
- id: install_deps
run: |
23 changes: 12 additions & 11 deletions PROCESSORS.md
@@ -1458,17 +1458,18 @@ Puts content into a Google Cloud Storage bucket

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

-| Name | Default Value | Allowable Values | Description |
-|----|----|----|----|
-| **Bucket Name** | | | The name of the Bucket to upload to. If left empty the _gcs.bucket_ attribute will be used by default.<br>**Supports Expression Language: true** |
-| **Object Name** | | | The name of the object to be uploaded. If left empty the _filename_ attribute will be used by default.<br>**Supports Expression Language: true** |
-| **NumberOfRetries** | 6 | integers | How many retry attempts should be made before routing to the failure relationship. |
-| **GcpCredentials** | | [GcpCredentialsControllerService](CONTROLLERS.md#GcpCredentialsControllerService) | The Controller Service used to obtain Google Cloud Platform credentials. |
-| Object ACL | | authenticatedRead<br>bucketOwnerFullControl<br>bucketOwnerRead<br>private<br>projectPrivate<br>publicRead | Access Control to be attached to the object uploaded. Not providing this will revert to bucket defaults. For more information please visit [Google Cloud Access control lists](https://cloud.google.com/storage/docs/access-control/lists#predefined-acl) |
-| Server Side Encryption Key | | | An AES256 Encryption Key (encoded in base64) for server-side encryption of the object.<br>**Supports Expression Language: true** |
-| CRC32 Checksum location | | | The name of the attribute where the crc32 checksum is stored for server-side validation.<br>**Supports Expression Language: true** |
-| MD5 Hash Location | | | The name of the attribute where the md5 hash is stored for server-side validation.<br>**Supports Expression Language: true** |
-| Content Type | | | The Content Type of the uploaded object. If not set, "mime.type" flow file attribute will be used. If not set, "mime.type" flow file attribute will be used. In case of neither of them is specified, this information will not be sent to the server. <br>**Supports Expression Language: true** |
+| Name | Default Value | Allowable Values | Description |
+|----|----|----|----|
+| **Bucket Name** | | | The name of the Bucket to upload to. If left empty the _gcs.bucket_ attribute will be used by default.<br>**Supports Expression Language: true** |
+| **Object Name** | | | The name of the object to be uploaded. If left empty the _filename_ attribute will be used by default.<br>**Supports Expression Language: true** |
+| **NumberOfRetries** | 6 | integers | How many retry attempts should be made before routing to the failure relationship. |
+| **GCP Credentials Provider Service** | | [GcpCredentialsControllerService](CONTROLLERS.md#GcpCredentialsControllerService) | The Controller Service used to obtain Google Cloud Platform credentials. |
+| Object ACL | | authenticatedRead<br>bucketOwnerFullControl<br>bucketOwnerRead<br>private<br>projectPrivate<br>publicRead | Access Control to be attached to the object uploaded. Not providing this will revert to bucket defaults. For more information please visit [Google Cloud Access control lists](https://cloud.google.com/storage/docs/access-control/lists#predefined-acl) |
+| Server Side Encryption Key | | | An AES256 Encryption Key (encoded in base64) for server-side encryption of the object.<br>**Supports Expression Language: true** |
+| CRC32 Checksum location | | | The name of the attribute where the crc32 checksum is stored for server-side validation.<br>**Supports Expression Language: true** |
+| MD5 Hash Location | | | The name of the attribute where the md5 hash is stored for server-side validation.<br>**Supports Expression Language: true** |
+| Content Type | | | The Content Type of the uploaded object. If not set, the "mime.type" flow file attribute will be used. If neither is specified, this information will not be sent to the server.<br>**Supports Expression Language: true** |
+| Endpoint Override URL | | | Overrides the default Google Cloud Storage endpoints |

### Relationships

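The Content Type fallback described in the property table above is a two-step lookup. In pseudo-Python (a sketch of the documented behavior only, not the processor's actual C++ implementation):

```python
def resolve_content_type(property_value, flow_file_attributes):
    # Resolution order from the property table: an explicitly set
    # Content Type property wins; otherwise fall back to the flow file's
    # mime.type attribute; if neither is present, no content type is
    # sent to the server.
    if property_value:
        return property_value
    return flow_file_attributes.get('mime.type')  # None means "omit"
```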
1 change: 1 addition & 0 deletions cmake/DockerConfig.cmake
@@ -45,6 +45,7 @@ add_custom_target(
-c ENABLE_ENCRYPT_CONFIG=${ENABLE_ENCRYPT_CONFIG}
-c ENABLE_NANOFI=${ENABLE_NANOFI}
-c ENABLE_SPLUNK=${ENABLE_SPLUNK}
-c ENABLE_GCP=${ENABLE_GCP}
-c ENABLE_SCRIPTING=${ENABLE_SCRIPTING}
-c ENABLE_LUA_SCRIPTING=${ENABLE_LUA_SCRIPTING}
-c ENABLE_KUBERNETES=${ENABLE_KUBERNETES}
4 changes: 2 additions & 2 deletions cmake/GoogleCloudCpp.cmake
@@ -56,8 +56,8 @@ set(GOOGLE_CLOUD_CPP_ENABLE storage CACHE INTERNAL storage-api)
set(GOOGLE_CLOUD_CPP_ENABLE_MACOS_OPENSSL_CHECK OFF CACHE INTERNAL macos-openssl-check)
set(BUILD_TESTING OFF CACHE INTERNAL testing-off)
FetchContent_Declare(google-cloud-cpp
-URL https://github.com/googleapis/google-cloud-cpp/archive/refs/tags/v1.35.0.tar.gz
-URL_HASH SHA256=e4e9eac1e7999eff195db270bc2a719004660b3730ebb5d2f444f2d2057e49b2
+URL https://github.com/googleapis/google-cloud-cpp/archive/refs/tags/v1.37.0.tar.gz
+URL_HASH SHA256=a7269b21d5e95bebff7833ebb602bcd5bcc79e82a59449cc5d5b350ff2f50bbc
PATCH_COMMAND "${PC}")
add_compile_definitions(_SILENCE_CXX20_REL_OPS_DEPRECATION_WARNING _SILENCE_CXX17_CODECVT_HEADER_DEPRECATION_WARNING CURL_STATICLIB)
FetchContent_MakeAvailable(google-cloud-cpp)
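Bumping a FetchContent dependency like this requires updating URL_HASH to the SHA256 of the new release tarball. A quick local verification (plain Python, assuming the tarball has already been downloaded; the file name is a placeholder):

```python
import hashlib

# Expected hash, as declared in GoogleCloudCpp.cmake above.
EXPECTED = "a7269b21d5e95bebff7833ebb602bcd5bcc79e82a59449cc5d5b350ff2f50bbc"

sha256 = hashlib.sha256()
with open("google-cloud-cpp-1.37.0.tar.gz", "rb") as tarball:  # placeholder path
    for chunk in iter(lambda: tarball.read(8192), b""):
        sha256.update(chunk)

assert sha256.hexdigest() == EXPECTED, "URL_HASH mismatch"
```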
3 changes: 2 additions & 1 deletion docker/Dockerfile
@@ -51,6 +51,7 @@ ARG ENABLE_AZURE=OFF
ARG ENABLE_ENCRYPT_CONFIG=ON
ARG ENABLE_NANOFI=OFF
ARG ENABLE_SPLUNK=OFF
ARG ENABLE_GCP=OFF
ARG DISABLE_CURL=OFF
ARG DISABLE_JEMALLOC=ON
ARG DISABLE_CIVET=OFF
@@ -119,7 +120,7 @@ RUN cmake -DSTATIC_BUILD= -DSKIP_TESTS=true -DENABLE_ALL="${ENABLE_ALL}" -DENABL
-DENABLE_TENSORFLOW="${ENABLE_TENSORFLOW}" -DENABLE_AWS="${ENABLE_AWS}" -DENABLE_BUSTACHE="${ENABLE_BUSTACHE}" -DENABLE_SFTP="${ENABLE_SFTP}" \
-DENABLE_OPENWSMAN="${ENABLE_OPENWSMAN}" -DENABLE_AZURE="${ENABLE_AZURE}" -DENABLE_NANOFI=${ENABLE_NANOFI} -DENABLE_SYSTEMD=OFF \
-DDISABLE_CURL="${DISABLE_CURL}" -DDISABLE_JEMALLOC="${DISABLE_JEMALLOC}" -DDISABLE_CIVET="${DISABLE_CIVET}" -DENABLE_SPLUNK=${ENABLE_SPLUNK} \
--DDISABLE_EXPRESSION_LANGUAGE="${DISABLE_EXPRESSION_LANGUAGE}" -DDISABLE_ROCKSDB="${DISABLE_ROCKSDB}" \
+-DDISABLE_EXPRESSION_LANGUAGE="${DISABLE_EXPRESSION_LANGUAGE}" -DDISABLE_ROCKSDB="${DISABLE_ROCKSDB}" -DENABLE_GCP="${ENABLE_GCP}" \
-DDISABLE_LIBARCHIVE="${DISABLE_LIBARCHIVE}" -DDISABLE_LZMA="${DISABLE_LZMA}" -DDISABLE_BZIP2="${DISABLE_BZIP2}" \
-DENABLE_SCRIPTING="${ENABLE_SCRIPTING}" -DDISABLE_PYTHON_SCRIPTING="${DISABLE_PYTHON_SCRIPTING}" -DENABLE_LUA_SCRIPTING="${ENABLE_LUA_SCRIPTING}" \
-DENABLE_KUBERNETES="${ENABLE_KUBERNETES}" \
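Together with the DockerConfig.cmake change above, this threads the new ENABLE_GCP option from the host's `make docker` invocation through the image build arguments into the in-container cmake configure, so the GCP extension can be toggled per image build.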
6 changes: 6 additions & 0 deletions docker/test/integration/MiNiFi_integration_test_driver.py
@@ -203,6 +203,12 @@ def check_splunk_event(self, splunk_container_name, query):
def check_splunk_event_with_attributes(self, splunk_container_name, query, attributes):
assert self.cluster.check_splunk_event_with_attributes(splunk_container_name, query, attributes)

def check_google_cloud_storage(self, gcs_container_name, content):
assert self.cluster.check_google_cloud_storage(gcs_container_name, content)

def check_empty_gcs_bucket(self, gcs_container_name):
assert self.cluster.is_gcs_bucket_empty(gcs_container_name)

def check_minifi_log_contents(self, line, timeout_seconds=60, count=1):
self.check_container_log_contents("minifi-cpp", line, timeout_seconds, count)

19 changes: 19 additions & 0 deletions docker/test/integration/features/google_cloud_storage.feature
@@ -0,0 +1,19 @@
Feature: Sending data to Google Cloud Storage using PutGcsObject

Background:
Given the content of "/tmp/output" is monitored

Scenario: A MiNiFi instance can upload data to Google Cloud storage
Given a GetFile processor with the "Input Directory" property set to "/tmp/input"
And a file with the content "hello_gcs" is present in "/tmp/input"
And a Google Cloud storage server is set up
And a PutGcsObject processor
And PutGcsObject processor is set up with a GcpCredentialsControllerService to communicate with the Google Cloud storage server
And a PutFile processor with the "Directory" property set to "/tmp/output"
And the "success" relationship of the GetFile processor is connected to the PutGcsObject
And the "success" relationship of the PutGcsObject processor is connected to the PutFile

When all instances start up

Then a flowfile with the content "hello_gcs" is placed in the monitored directory in less than 45 seconds
And object with the content "hello_gcs" is present in the Google Cloud storage
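The new Gherkin steps need matching step definitions in the framework's behave layer, which are not shown in this diff. Assuming the conventions visible elsewhere in the test suite — a `context.test` driver object and an `acquire_container` helper keyed by engine name — the wiring might look like the following sketch:

```python
from behave import given, then


@given('a Google Cloud storage server is set up')
def step_impl(context):
    # Spins up the fake-gcs-server container added in this commit.
    context.test.acquire_container('fake-gcs-server', engine='fake-gcs-server')


@then('object with the content "{content}" is present in the Google Cloud storage')
def step_impl(context, content):
    # Delegates to the new driver method, which asserts via the cluster.
    context.test.check_google_cloud_storage('fake-gcs-server', content)
```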
@@ -0,0 +1,17 @@
from ..core.ControllerService import ControllerService


class GcpCredentialsControllerService(ControllerService):
def __init__(self, name=None, credentials_location=None, json_path=None, raw_json=None):
super(GcpCredentialsControllerService, self).__init__(name=name)

self.service_class = 'GcpCredentialsControllerService'

if credentials_location is not None:
self.properties['Credentials Location'] = credentials_location

if json_path is not None:
self.properties['Service Account JSON File'] = json_path

if raw_json is not None:
self.properties['Service Account JSON'] = raw_json
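Only one of the three credential sources is expected per instance. A hypothetical construction for the JSON-file case (the path is illustrative, and the Credentials Location value is an assumption based on the controller service's documented allowable values, not part of this diff):

```python
# Configure the controller service to read a service account key from a
# file mounted into the MiNiFi container (illustrative values).
credentials_service = GcpCredentialsControllerService(
    name='GCPCredentialsControllerService',
    credentials_location='Use Service Account JSON File',
    json_path='/tmp/resources/mock_service_account.json')
```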
10 changes: 10 additions & 0 deletions docker/test/integration/minifi/core/DockerTestCluster.py
@@ -264,6 +264,16 @@ def enable_splunk_hec_ssl(self, container_name, splunk_cert_pem, splunk_key_pem,
"-auth", "admin:splunkadmin"])
return code == 0

@retry_check()
def check_google_cloud_storage(self, gcs_container_name, content):
(code, output) = self.client.containers.get(gcs_container_name).exec_run(["grep", "-r", content, "/storage"])
return code == 0

@retry_check()
def is_gcs_bucket_empty(self, container_name):
(code, output) = self.client.containers.get(container_name).exec_run(["ls", "/storage/test-bucket"])
return code == 0 and output == b''

def query_postgres_server(self, postgresql_container_name, query, number_of_rows):
(code, output) = self.client.containers.get(postgresql_container_name).exec_run(["psql", "-U", "postgres", "-c", query])
output = output.decode(self.get_stdout_encoding())
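The `@retry_check()` decorator applied above is defined elsewhere in the test framework; conceptually it re-runs a boolean check a few times before giving up, which absorbs the emulator's startup latency. A minimal sketch of that idea (parameter names and defaults are illustrative, not the framework's actual signature):

```python
import functools
import time


def retry_check(max_tries=5, wait_time=1):
    # Re-run a boolean check until it passes or the attempts run out.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for _ in range(max_tries):
                if func(*args, **kwargs):
                    return True
                time.sleep(wait_time)
            return False
        return wrapper
    return decorator
```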
27 changes: 27 additions & 0 deletions docker/test/integration/minifi/core/FakeGcsServerContainer.py
@@ -0,0 +1,27 @@
import logging
import os
from .Container import Container


class FakeGcsServerContainer(Container):
def __init__(self, name, vols, network, image_store, command=None):
super().__init__(name, 'fake-gcs-server', vols, network, image_store, command)

def get_startup_finished_log_entry(self):
return "server started at http"

def deploy(self):
if not self.set_deployed():
return

logging.info('Creating and running google cloud storage server docker container...')
self.client.containers.run(
"fsouza/fake-gcs-server:latest",
detach=True,
name=self.name,
network=self.network.name,
entrypoint=self.command,
ports={'4443/tcp': 4443},
volumes=[os.environ['TEST_DIRECTORY'] + "/resources/fake-gcs-server-data:/data"],
command='-scheme http -host fake-gcs-server')
logging.info('Added container \'%s\'', self.name)
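fake-gcs-server emulates the Google Cloud Storage JSON API, and with the `-scheme http` flag used above it speaks plain HTTP on the published port 4443. A smoke test against a running container might look like this (host, port mapping, and project name are assumptions for illustration):

```python
import requests

# List buckets via the emulator's GCS JSON API endpoint.
response = requests.get('http://localhost:4443/storage/v1/b',
                        params={'project': 'test-project'})  # project is arbitrary here
response.raise_for_status()
print(response.json())
```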
@@ -25,6 +25,7 @@
from .KafkaBrokerContainer import KafkaBrokerContainer
from .S3ServerContainer import S3ServerContainer
from .AzureStorageServerContainer import AzureStorageServerContainer
from .FakeGcsServerContainer import FakeGcsServerContainer
from .HttpProxyContainer import HttpProxyContainer
from .PostgreSQLServerContainer import PostgreSQLServerContainer
from .MqttBrokerContainer import MqttBrokerContainer
@@ -97,6 +98,8 @@ def acquire_container(self, name, engine='minifi-cpp', command=None):
return self.containers.setdefault(name, S3ServerContainer(name, self.vols, self.network, self.image_store, command))
elif engine == 'azure-storage-server':
return self.containers.setdefault(name, AzureStorageServerContainer(name, self.vols, self.network, self.image_store, command))
elif engine == 'fake-gcs-server':
return self.containers.setdefault(name, FakeGcsServerContainer(name, self.vols, self.network, self.image_store, command))
elif engine == 'postgresql-server':
return self.containers.setdefault(name, PostgreSQLServerContainer(name, self.vols, self.network, self.image_store, command))
elif engine == 'mqtt-broker':
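With the new branch in place, tests obtain the GCS emulator the same way as the other backing services; `setdefault` caches the instance, so repeated calls with the same name return the same container. For example (the `cluster` object is illustrative):

```python
# Returns the cached FakeGcsServerContainer, creating it on first use.
gcs_server = cluster.acquire_container('fake-gcs-server', engine='fake-gcs-server')
gcs_server.deploy()
```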
14 changes: 14 additions & 0 deletions docker/test/integration/minifi/processors/PutGcsObject.py
@@ -0,0 +1,14 @@
from ..core.Processor import Processor


class PutGcsObject(Processor):
def __init__(
self):
super(PutGcsObject, self).__init__(
'PutGcsObject',
properties={
'Bucket Name': 'test-bucket',
'Endpoint Override URL': 'fake-gcs-server:4443',
'Number of retries': 2
},
auto_terminate=["success", "failure"])
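These defaults line up with the feature file above: the emulator's bucket and endpoint override are pre-wired, so a scenario only needs to instantiate the processor. Assuming the base `Processor` class keeps the property dict the same way the controller service above does, a test could still override a default after construction (the bucket name here is hypothetical):

```python
put_gcs = PutGcsObject()
# Tweak a pre-wired property for a specific scenario.
put_gcs.properties['Bucket Name'] = 'another-bucket'
```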