Add support for inmemory repository as a native provider in cts/pts #138

Closed
planetf1 opened this issue Mar 28, 2022 · 21 comments
@planetf1
Member

Currently, if the cts/pts charts are set up to use the 'native' providers and we set

 tut:
   connectorProvider: "org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider"

then the cts (for example) will fail to initialize with

 > Configuring technology under test:

{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-url-root?url=https://cts-platform:9443)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-type?typeName=TUT)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/organization-name?name=Egeria)
{"class":"VoidResponse","relatedHTTPCode":400,"exceptionClassName":"org.odpi.openmetadata.adminservices.ffdc.exception.OMAGConfigurationErrorException","actionDescription":"setLocalMetadataCollectionName","exceptionErrorMessage":"OMAG-ADMIN-400-008 The local repository mode has not been set for OMAG server tut","exceptionErrorMessageId":"OMAG-ADMIN-400-008","exceptionErrorMessageParameters":["tut"],"exceptionSystemAction":"The local repository mode must be enabled before the event mapper connection is set.  The system is unable to configure the local server.","exceptionUserAction":"The local repository mode is supplied by the caller to the OMAG server. This call to enable the local repository needs to be made before the call to set the event mapper connection."}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/local-repository/metadata-collection-name/TUT_MDR)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/event-bus?topicURLRoot=egeria)
-- Unknown native repository provider: org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider -- exiting.

This is because the cts/pts logic for native providers only recognizes the graph repository:

if [ "${TUT_TYPE}" = "native" ]; then
  if [ "${CONNECTOR_PROVIDER}" = "org.odpi.openmetadata.adapters.repositoryservices.graphrepository.repositoryconnector.GraphOMRSRepositoryConnectorProvider" ]; then
    curl -f -k -w "\n   (%{http_code} - %{url_effective})\n" --silent -X POST \
      "${EGERIA_ENDPOINT}/open-metadata/admin-services/users/${EGERIA_USER}/servers/${TUT_SERVER}/local-repository/mode/local-graph-repository" || exit $?
  else
    echo "-- Unknown native repository provider: ${CONNECTOR_PROVIDER} -- exiting."
    exit 1
  fi
fi

This condition should be extended to recognize the in-memory repository and perform the appropriate configuration.
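
A minimal sketch of how the extended check might look. The helper names native_repository_mode and configure_native_repository are hypothetical and are not taken from the chart's script. The in-memory-repository mode path is the one that appears in the configuration log later in this issue, and the environment variables are those already used by the existing snippet.

```shell
#!/bin/sh
# Hypothetical helper: map a native connector provider class name to the
# local-repository mode path segment used by the admin services.
# Prints the mode and returns 0, or prints nothing and returns 1 if unknown.
native_repository_mode() {
  case "$1" in
    org.odpi.openmetadata.adapters.repositoryservices.graphrepository.repositoryconnector.GraphOMRSRepositoryConnectorProvider)
      echo "local-graph-repository" ;;
    org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider)
      echo "in-memory-repository" ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical wrapper: set the local repository mode for the technology under test.
# Uses the same curl options as the existing script.
configure_native_repository() {
  mode=$(native_repository_mode "${CONNECTOR_PROVIDER}") || {
    echo "-- Unknown native repository provider: ${CONNECTOR_PROVIDER} -- exiting."
    return 1
  }
  curl -f -k -w "\n   (%{http_code} - %{url_effective})\n" --silent -X POST \
    "${EGERIA_ENDPOINT}/open-metadata/admin-services/users/${EGERIA_USER}/servers/${TUT_SERVER}/local-repository/mode/${mode}"
}
```

The native branch could then be reduced to a single call such as configure_native_repository || exit $?, and a further native provider would only need one more case entry.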

@planetf1 planetf1 self-assigned this Mar 28, 2022
planetf1 added a commit to planetf1/egeria-charts that referenced this issue Mar 28, 2022
planetf1 added a commit that referenced this issue Mar 28, 2022
#138 add support for in-memory-repository to cts/pts charts
planetf1 added a commit to planetf1/egeria-charts that referenced this issue Mar 28, 2022
planetf1 added a commit that referenced this issue Mar 28, 2022
#138 additional version update to trigger gh pages after infrastructu…
@planetf1
Member Author

A further issue, this time with the ordering of the calls:

 > Configuring technology under test:

{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-url-root?url=https://cts-platform:9443)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-type?typeName=TUT)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/organization-name?name=Egeria)
{"class":"VoidResponse","relatedHTTPCode":400,"exceptionClassName":"org.odpi.openmetadata.adminservices.ffdc.exception.OMAGConfigurationErrorException","actionDescription":"setLocalMetadataCollectionName","exceptionErrorMessage":"OMAG-ADMIN-400-008 The local repository mode has not been set for OMAG server tut","exceptionErrorMessageId":"OMAG-ADMIN-400-008","exceptionErrorMessageParameters":["tut"],"exceptionSystemAction":"The local repository mode must be enabled before the event mapper connection is set.  The system is unable to configure the local server.","exceptionUserAction":"The local repository mode is supplied by the caller to the OMAG server. This call to enable the local repository needs to be made before the call to set the event mapper connection."}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/local-repository/metadata-collection-name/TUT_MDR)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/event-bus?topicURLRoot=egeria)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/local-repository/mode/in-memory-repository)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/cohorts/cts)

Will re-order so that the repository mode is set before the metadata collection name.

@planetf1
Member Author

Note that the failing call above only sets the metadata collection name. Because an HTTP 200 response is still returned (the 400 appears only as relatedHTTPCode in the response body), configuration continues. I have also opened a docs issue suggesting that the documentation on the sequencing of these calls be updated.
This only affects cts; we don't set the metadata collection name in pts.
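
Since curl -f only reacts to the HTTP status line, a failure reported in the body slips through. A minimal sketch of one way an init script could catch it, assuming the compact JSON shown in the log; related_http_code_ok is a hypothetical helper and not part of the chart:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the JSON body passed as $1 carries
# "relatedHTTPCode":200. Uses sed rather than jq so that it has no extra
# dependencies; it assumes the key and its value sit on the same line.
related_http_code_ok() {
  code=$(printf '%s\n' "$1" | sed -n 's/.*"relatedHTTPCode": *\([0-9][0-9]*\).*/\1/p')
  [ "${code}" = "200" ]
}

# Possible usage (the URL is a placeholder held in a variable by the caller):
#   body=$(curl -f -k --silent -X POST "${url}") || exit $?
#   related_http_code_ok "${body}" || { echo "${body}"; exit 1; }
```

With a check like this the OMAG-ADMIN-400-008 response would stop the configuration at the failing call instead of letting it run on.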

@planetf1
Member Author

After fix:

> Configuring technology under test:

{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-url-root?url=https://cts-platform:9443)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-type?typeName=TUT)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/organization-name?name=Egeria)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/event-bus?topicURLRoot=egeria)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/local-repository/mode/in-memory-repository)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/local-repository/metadata-collection-name/TUT_MDR)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/cohorts/cts)
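
The ordering can also be checked mechanically against the init log. A rough sketch, assuming the log format shown above (one line per call containing the effective URL); mode_set_before_collection_name is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical check: read an init log on stdin and succeed only if the first
# local-repository/mode/ call appears before the first
# local-repository/metadata-collection-name/ call.
mode_set_before_collection_name() {
  awk '
    /local-repository\/mode\//                     { if (!m) m = NR }
    /local-repository\/metadata-collection-name\// { if (!n) n = NR }
    END { exit !(m && n && m < n) }
  '
}

# Possible usage, with the pod name taken from kubectl get pods:
#   kubectl logs <cts-init-pod> | mode_set_before_collection_name || echo "calls out of order"
```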

planetf1 added a commit to planetf1/egeria-charts that referenced this issue Mar 29, 2022
planetf1 added a commit that referenced this issue Mar 29, 2022
#138 correct ordering of configuration api calls
planetf1 added a commit to planetf1/egeria-charts that referenced this issue Mar 29, 2022
planetf1 added a commit that referenced this issue Mar 29, 2022
#138 increment version for publishing
@planetf1
Member Author

With these changes done, CTS is still not launching:

jonesn:charts/ (ctsupd) $ helm repo update && helm search repo egeria  --devel                                                                                    [8:48:06]
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "egeria" chart repository
...Successfully got an update from the "strimzi" chart repository
...Successfully got an update from the "bitnami" chart repository
Update Complete. ⎈Happy Helming!⎈
NAME                  	CHART VERSION     	APP VERSION	DESCRIPTION
egeria/egeria-base    	3.7.0-prerelease.0	3.7        	Egeria simple deployment (platform, react UI)
egeria/egeria-cts     	3.7.0-prerelease.4	3.7        	Egeria Conformance Test Suite deployment to Kub...
egeria/egeria-pts     	3.7.0-prerelease.3	3.7        	Egeria Performance Test Suite deployment to Kub...
egeria/odpi-egeria-lab	3.7.0-prerelease.0	3.7        	Egeria lab environment

jonesn:charts/ (ctsupd) $ cat ~/etc/cts-inmem.yml                                                                                                                 [8:47:36]
tut:
  connectorProvider: "org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider"
jonesn:charts/ (ctsupd) $ helm install cts egeria/egeria-cts  -f ~/etc/cts-inmem.yml --devel                                                                      [8:47:42]
NAME: cts
LAST DEPLOYED: Tue Mar 29 08:47:54 2022
NAMESPACE: cts1
STATUS: deployed
REVISION: 1
TEST SUITE: None
jonesn:charts/ (ctsupd) $ helm list                                                                                                                               [8:47:56]
NAME	NAMESPACE	REVISION	UPDATED                             	STATUS  	CHART                        	APP VERSION
cts 	cts1     	1       	2022-03-29 08:47:54.799022 +0100 BST	deployed	egeria-cts-3.7.0-prerelease.4	3.7
jonesn:charts/ (ctsupd) $ kubectl get pods                                                                                                                        [8:48:01]
NAME                                        READY   STATUS     RESTARTS   AGE
cts-init--1-zjm9v                           0/1     Init:0/2   0          10s
cts-platform-7f4445dd6c-vd5w6               0/1     Running    0          10s
cts-report--1-4p2c8                         0/1     Init:0/4   0          10s
strimzi-cluster-operator-587cb79468-4ff4x   0/1     Running    0          10s

Leave things to start up, until:

jonesn:charts/ (ctsupd) $ kubectl get pods                                                                                                                        [8:51:27]
NAME                                           READY   STATUS      RESTARTS   AGE
cts-init--1-zjm9v                              0/1     Completed   0          3m34s
cts-platform-7f4445dd6c-vd5w6                  1/1     Running     0          3m34s
cts-report--1-4p2c8                            0/1     Init:3/4    0          3m34s
cts-strimzi-entity-operator-546b5ddc5b-mswlv   3/3     Running     0          25s
cts-strimzi-kafka-0                            1/1     Running     0          110s
cts-strimzi-zookeeper-0                        1/1     Running     0          3m17s
strimzi-cluster-operator-587cb79468-4ff4x      1/1     Running     0          3m34s
jonesn:charts/ (ctsupd) $ kubectl logs cts-init--1-zjm9v                                                                                                          [8:51:30]
-- Environment variables --
CTS_PLATFORM_PORT_9443_TCP_PORT=9443
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_SERVICE_PORT=443
CTS_FACTOR=5
DEV_PORT_9443_TCP=tcp://172.21.100.246:9443
CTS_STRIMZI_KAFKA_BOOTSTRAP_PORT_9091_TCP_PROTO=tcp
CONNECTOR_CONFIG={}
HOSTNAME=cts-init--1-zjm9v
EGERIA_USER=admin
CTS_REPORT_NAME=cts
CTS_STRIMZI_ZOOKEEPER_CLIENT_SERVICE_HOST=172.21.118.32
CTS_STRIMZI_ZOOKEEPER_CLIENT_PORT_2181_TCP=tcp://172.21.118.32:2181
CTS_PLATFORM_PORT=tcp://172.21.35.220:9443
CTS_STRIMZI_KAFKA_BOOTSTRAP_SERVICE_PORT_TCP_REPLICATION=9091
CTS_STRIMZI_KAFKA_BOOTSTRAP_PORT_9092_TCP=tcp://172.21.14.172:9092
CTS_STRIMZI_KAFKA_BOOTSTRAP_PORT_9092_TCP_PROTO=tcp
CTS_STRIMZI_KAFKA_BOOTSTRAP_PORT_9091_TCP=tcp://172.21.14.172:9091
PWD=/opt/config
CTS_STRIMZI_ZOOKEEPER_CLIENT_PORT=tcp://172.21.118.32:2181
DEV_PORT_9443_TCP_PROTO=tcp
CTS_STRIMZI_ZOOKEEPER_CLIENT_SERVICE_PORT=2181
JUPYTER_PORT_8888_TCP_ADDR=172.21.130.91
JUPYTER_SERVICE_PORT=8888
KAFKA_ENDPOINT=cts-strimzi-kafka-bootstrap:9092
JUPYTER_PORT_8888_TCP_PORT=8888
CTS_STRIMZI_KAFKA_BOOTSTRAP_SERVICE_PORT_TCP_CLIENTS=9092
HOME=/
CTS_STRIMZI_KAFKA_BOOTSTRAP_PORT_9092_TCP_PORT=9092
KUBERNETES_PORT_443_TCP=tcp://172.21.0.1:443
CTS_PLATFORM_SERVICE_PORT=9443
CTS_PLATFORM_SERVICE_HOST=172.21.35.220
JUPYTER_SERVICE_HOST=172.21.130.91
EGERIA_ENDPOINT=https://cts-platform:9443
CTS_STRIMZI_KAFKA_BOOTSTRAP_PORT_9091_TCP_ADDR=172.21.14.172
CTS_PLATFORM_PORT_9443_TCP_ADDR=172.21.35.220
CTS_STRIMZI_ZOOKEEPER_CLIENT_SERVICE_PORT_TCP_CLIENTS=2181
DEV_PORT=tcp://172.21.100.246:9443
EGERIA_SERVER=cts
CTS_STRIMZI_KAFKA_BOOTSTRAP_SERVICE_HOST=172.21.14.172
STRICT_SSL=false
EGERIA_COHORT=cts
TUT_TYPE=native
CTS_STRIMZI_KAFKA_BOOTSTRAP_PORT_9091_TCP_PORT=9091
JUPYTER_PORT_8888_TCP=tcp://172.21.130.91:8888
DEV_PORT_9443_TCP_PORT=9443
CTS_PLATFORM_SERVICE_PORT_CHASSIS=9443
CTS_STRIMZI_ZOOKEEPER_CLIENT_PORT_2181_TCP_PORT=2181
CTS_STRIMZI_ZOOKEEPER_CLIENT_PORT_2181_TCP_PROTO=tcp
TERM=xterm
CTS_STRIMZI_ZOOKEEPER_CLIENT_PORT_2181_TCP_ADDR=172.21.118.32
TUT_SERVER=tut
CTS_PLATFORM_PORT_9443_TCP=tcp://172.21.35.220:9443
SHLVL=2
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_ADDR=172.21.0.1
JUPYTER_PORT=tcp://172.21.130.91:8888
CTS_STRIMZI_KAFKA_BOOTSTRAP_SERVICE_PORT=9091
DEV_SERVICE_HOST=172.21.100.246
KUBERNETES_SERVICE_HOST=172.21.0.1
DEV_PORT_9443_TCP_ADDR=172.21.100.246
KUBERNETES_PORT=tcp://172.21.0.1:443
KUBERNETES_PORT_443_TCP_PORT=443
DEV_SERVICE_PORT=9443
CONNECTOR_PROVIDER=org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider
CTS_STRIMZI_KAFKA_BOOTSTRAP_PORT=tcp://172.21.14.172:9091
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
JUPYTER_PORT_8888_TCP_PROTO=tcp
CTS_STRIMZI_KAFKA_BOOTSTRAP_PORT_9092_TCP_ADDR=172.21.14.172
CTS_PLATFORM_PORT_9443_TCP_PROTO=tcp
_=/usr/bin/env
-- End of Environment variables --

-- Configuring platform with required servers...

 > Configuring conformance test suite driver:

{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/cts/server-url-root?url=https://cts-platform:9443)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/cts/server-type?typeName=Conformance)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/cts/event-bus?topicURLRoot=egeria)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/cts/cohorts/cts)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/cts/conformance-suite-workbenches/repository-workbench/repositories)

 > Configuring technology under test:

{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-url-root?url=https://cts-platform:9443)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-type?typeName=TUT)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/organization-name?name=Egeria)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/event-bus?topicURLRoot=egeria)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/local-repository/mode/in-memory-repository)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/local-repository/metadata-collection-name/TUT_MDR)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/cohorts/cts)

-- End of configuration

-- Running the conformance test suite...

 > Starting conformance test suite:

{"class":"SuccessMessageResponse","relatedHTTPCode":200,"successMessage":"Tue Mar 29 07:51:11 GMT 2022 cts is running the following services: [Open Metadata Repository Services (OMRS), Connected Asset Services, Conformance Suite Services]"}

 > Starting the technology under test:

{"class":"SuccessMessageResponse","relatedHTTPCode":200,"successMessage":"Tue Mar 29 07:51:14 GMT 2022 tut is running the following services: [Open Metadata Repository Services (OMRS)]"}

-- End of conformance test suite startup

jonesn:charts/ (ctsupd) $ kubectl logs cts-platform-7f4445dd6c-vd5w6 | gist -f cts-platform                                                                       [8:52:23]
https://gist.github.com/68291150111fb9aaca366ef220a9afe2
jonesn:charts/ (ctsupd) $ kubectl logs cts-platform-7f4445dd6c-vd5w6 | tail -12                                                                                   [8:52:51]
Tue Mar 29 07:51:12 GMT 2022 tut Cohort OCF-FILE-REGISTRY-STORE-CONNECTOR-0115 Creating new cohort registry store ./data/servers/tut/cohorts/cts.registrystore
Tue Mar 29 07:51:12 GMT 2022 tut Cohort OCF-FILE-REGISTRY-STORE-CONNECTOR-0115 Creating new cohort registry store ./data/servers/tut/cohorts/cts.registrystore
Tue Mar 29 07:51:12 GMT 2022 tut Cohort OMRS-AUDIT-0060 Registering with open metadata repository cohort cts using metadata collection id c49209eb-d830-447e-9187-e6ddc5fee9e2
Tue Mar 29 07:51:12 GMT 2022 tut Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Tue Mar 29 07:51:12 GMT 2022 tut Startup OCF-KAFKA-TOPIC-CONNECTOR-0010 The Apache Kafka producer for topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances is starting up with 0 buffered messages
Tue Mar 29 07:51:12 GMT 2022 tut Startup OMRS-AUDIT-0015 The listener thread for an OMRS Topic Connector for topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances has started
Tue Mar 29 07:51:12 GMT 2022 tut Cohort OMRS-AUDIT-0062 Requesting registration information from other members of the open metadata repository cohort cts
Tue Mar 29 07:51:12 GMT 2022 tut Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Tue Mar 29 07:51:12 GMT 2022 tut Startup OMRS-AUDIT-0031 The local repository outbound event manager is starting with 1 type definition event consumer(s) and 1 instance event consumer(s)
Tue Mar 29 07:51:12 GMT 2022 tut Startup OMRS-AUDIT-0032 The local repository outbound event manager is sending out the 874 type definition events that were generated and buffered during server initialization
Tue Mar 29 07:51:14 GMT 2022 tut Startup OMAG-ADMIN-0004 The tut server has successfully completed start up.  The following services are running: [Open Metadata Repository Services (OMRS)]
Tue Mar 29 07:52:53 GMT 2022 cts Information CONFORMANCE-SUITE-0008 The Open Metadata Repository Conformance Workbench repository-workbench is waiting for server tut to join the cohort

This last message repeats, and the test never gets underway.

This looks like a cohort/server initialization ordering issue.

Need to check: is Kafka working OK? What do the local and remote registrations on the server look like?

@planetf1
Member Author

Note: the full startup log from the platform is at https://gist.github.com/68291150111fb9aaca366ef220a9afe2

@planetf1
Member Author

Checking cohort members, it seems neither member knows about the other, i.e. there are no remote registrations:

jonesn:~/ $ curl -k -X GET https://localhost:9443/servers/tut/open-metadata/repository-services/users/me/metadata-highway/cohorts/cts/remote-members              [9:10:21]
{"class":"CohortMembershipListResponse","relatedHTTPCode":200,"offset":0,"pageSize":0}%
jonesn:~/ $ curl -k -X GET https://localhost:9443/servers/tut/open-metadata/repository-services/users/me/metadata-highway/cohorts/tut/remote-members              [9:10:29]
{"class":"CohortMembershipListResponse","relatedHTTPCode":200,"offset":0,"pageSize":0}%
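
Both responses above contain no member list at all. A small sketch for scripting that check, on the assumption that a populated response would carry a "cohortMembers" array (that key name does not appear in this issue and is an assumption); has_remote_members is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the remote-members response body in $1
# contains a "cohortMembers" key (assumed name of the member list),
# fail if it does not, as in the two empty responses shown above.
has_remote_members() {
  case "$1" in
    *'"cohortMembers"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage, through the same port-forward as the curl calls above:
#   body=$(curl -k --silent -X GET "https://localhost:9443/servers/tut/open-metadata/repository-services/users/me/metadata-highway/cohorts/cts/remote-members")
#   has_remote_members "${body}" || echo "tut has no remote members in cohort cts"
```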

Both are registered:

jonesn:~/ $ http --verify=no --pretty=format GET https://localhost:9443/servers/cts/open-metadata/repository-services/users/me/metadata-highway/local-registration
HTTP/1.1 200
Connection: keep-alive
Content-Type: application/json
Date: Tue, 29 Mar 2022 08:12:42 GMT
Keep-Alive: timeout=60
Transfer-Encoding: chunked

{
    "class": "CohortMembershipResponse",
    "cohortMember": {
        "metadataCollectionId": "03e855d4-b434-4bc9-998d-49820154cd6f",
        "metadataCollectionName": "cts",
        "repositoryConnection": {
            "class": "Connection",
            "connectorType": {
                "class": "ConnectorType",
                "connectorProviderClassName": "org.odpi.openmetadata.adapters.repositoryservices.rest.repositoryconnector.OMRSRESTRepositoryConnectorProvider",
                "description": "Cohort member client connector that provides access to open metadata located in a remote repository via REST calls.",
                "displayName": "REST Cohort Member Client Connector",
                "guid": "75ea56d1-656c-43fb-bc0c-9d35c5553b9e",
                "headerVersion": 0,
                "qualifiedName": "Egeria:OMRSRepositoryConnector:CohortMemberClient:REST",
                "type": {
                    "class": "ElementType",
                    "elementOrigin": "LOCAL_COHORT",
                    "elementTypeDescription": "A set of properties describing a type of connector.",
                    "elementTypeId": "954421eb-33a6-462d-a8ca-b5709a1bd0d4",
                    "elementTypeName": "ConnectorType",
                    "elementTypeVersion": 1,
                    "elementVersion": 0,
                    "headerVersion": 0
                }
            },
            "endpoint": {
                "address": "https://cts-platform:9443/servers/cts",
                "class": "Endpoint",
                "headerVersion": 0
            },
            "headerVersion": 0
        },
        "serverName": "cts",
        "serverType": "Conformance Suite Services"
    },
    "relatedHTTPCode": 200
}

jonesn:~/ $ http --verify=no --pretty=format GET https://localhost:9443/servers/tut/open-metadata/repository-services/users/me/metadata-highway/local-registration
HTTP/1.1 200
Connection: keep-alive
Content-Type: application/json
Date: Tue, 29 Mar 2022 08:13:11 GMT
Keep-Alive: timeout=60
Transfer-Encoding: chunked

{
    "class": "CohortMembershipResponse",
    "cohortMember": {
        "metadataCollectionId": "c49209eb-d830-447e-9187-e6ddc5fee9e2",
        "metadataCollectionName": "TUT_MDR",
        "organizationName": "Egeria",
        "repositoryConnection": {
            "class": "Connection",
            "connectorType": {
                "class": "ConnectorType",
                "connectorProviderClassName": "org.odpi.openmetadata.adapters.repositoryservices.rest.repositoryconnector.OMRSRESTRepositoryConnectorProvider",
                "description": "Cohort member client connector that provides access to open metadata located in a remote repository via REST calls.",
                "displayName": "REST Cohort Member Client Connector",
                "guid": "75ea56d1-656c-43fb-bc0c-9d35c5553b9e",
                "headerVersion": 0,
                "qualifiedName": "Egeria:OMRSRepositoryConnector:CohortMemberClient:REST",
                "type": {
                    "class": "ElementType",
                    "elementOrigin": "LOCAL_COHORT",
                    "elementTypeDescription": "A set of properties describing a type of connector.",
                    "elementTypeId": "954421eb-33a6-462d-a8ca-b5709a1bd0d4",
                    "elementTypeName": "ConnectorType",
                    "elementTypeVersion": 1,
                    "elementVersion": 0,
                    "headerVersion": 0
                }
            },
            "endpoint": {
                "address": "https://cts-platform:9443/servers/tut",
                "class": "Endpoint",
                "headerVersion": 0
            },
            "headerVersion": 0
        },
        "serverName": "tut",
        "serverType": "TUT"
    },
    "relatedHTTPCode": 200
}

@planetf1
Member Author

To clarify the hostnames: I'd run 'kubectl port-forward cts-platform-7f4445dd6c-vd5w6 9443:9443 &', which is why the requests above go to localhost. The hostnames in the registrations refer to cts-platform, which is a valid address as per:

jonesn:~/ $ kubectl get services                                                                                                                                  [9:14:44]
NAME                           TYPE           CLUSTER-IP       EXTERNAL-IP                         PORT(S)                      AGE
cts-platform                   ClusterIP      172.21.35.220    <none>                              9443/TCP                     26m
cts-strimzi-kafka-bootstrap    ClusterIP      172.21.14.172    <none>                              9091/TCP,9092/TCP            25m
cts-strimzi-kafka-brokers      ClusterIP      None             <none>                              9090/TCP,9091/TCP,9092/TCP   25m
cts-strimzi-zookeeper-client   ClusterIP      172.21.118.32    <none>                              2181/TCP                     26m
cts-strimzi-zookeeper-nodes    ClusterIP      None             <none>                              2181/TCP,2888/TCP,3888/TCP   26m
dev                            LoadBalancer   172.21.100.246   7a37b71c-eu-gb.lb.appdomain.cloud   9443:30355/TCP               59d
jupyter                        LoadBalancer   172.21.130.91    7bd0e82e-eu-gb.lb.appdomain.cloud   8888:30613/TCP               59d

@planetf1
Member Author

planetf1 commented Mar 29, 2022

The full cohort config for each server is in the gists below:

jonesn:~/ $ http --verify=no --pretty=format GET https://localhost:9443/servers/tut/open-metadata/repository-services/users/me/metadata-highway/cohort-descriptions | gist -f 'tut-cohort'

https://gist.github.com/8c1a80b2e22daafe83d163d8004befc5

jonesn:~/ $ http --verify=no --pretty=format GET https://localhost:9443/servers/cts/open-metadata/repository-services/users/me/metadata-highway/cohort-descriptions | gist -f 'cts-cohort'

https://gist.github.com/d6c084bb1eb07dd9481148b7a7a14819

@planetf1
Member Author

From the overall server logs

  • We only see one entry from each server for 'The local server is attempting to connect to Kafka, attempt 1', which suggests there were no retries and both servers connected without problems
  • We see 'The Apache Kafka producer for topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances is starting up with 0 buffered messages' for each server

We see both servers try to register:

jonesn:~/ $ cat /tmp/log | grep 'Registry Event'                                                                                                                 [10:39:17]
Tue Mar 29 07:51:07 GMT 2022 cts Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Tue Mar 29 07:51:07 GMT 2022 cts Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Tue Mar 29 07:51:12 GMT 2022 tut Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Tue Mar 29 07:51:12 GMT 2022 tut Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration

This all looks good.

This looks like it could be a problem with the consumer group/offset, or with the topics/Kafka backend...

@planetf1
Member Author

With Strimzi we can get the Kafka topics via the CRD, i.e.:

jonesn:~/ $ kubectl get kafkatopic                                                                                                                               [10:43:08]
NAME                                                                                                                             CLUSTER       PARTITIONS   REPLICATION FACTOR   READY
consumer-offsets---84e7a678d08f4bd226872e5cdd4eb527fadc1c6a                                                                      lab-strimzi   50           1                    True
egeria.omag.openmetadata.repositoryservices.cohort.devcohort.omrstopic.instances---eb305a0f1a427dc244fd2e4a050c43fc27dde9d3      lab-strimzi   1            1                    True
egeria.omag.openmetadata.repositoryservices.cohort.devcohort.omrstopic.registration---14bb4ecab33b90a8fc54eaea81fa2db65d3c6e41   lab-strimzi   1            1                    True
egeria.omag.openmetadata.repositoryservices.cohort.devcohort.omrstopic.types---7344d0419fd783b43b45d9d32a0d47d4fae85171          lab-strimzi   1            1                    True
egeria.openmetadata.repositoryservices.cohort.cts.omrstopic.instances---b4c6331ba685d73784260e70d32f1d99b25735f5                 cts-strimzi   1            1                    True
egeria.openmetadata.repositoryservices.cohort.cts.omrstopic.registration---88fc921a95a340e412da2fe4168b8b642244b348              cts-strimzi   1            1                    True
egeria.openmetadata.repositoryservices.cohort.cts.omrstopic.types---f7b0cc1a81869085a7f122bde5bea121695089e7                     cts-strimzi   1            1                    True
strimzi-store-topic---effb8e3e057afce1ecf67c3f5d8e4e3ff177fc55                                                                   lab-strimzi   1            1                    True
strimzi-topic-operator-kstreams-topic-store-changelog---b75e702040b99be8a9263134de3507fc0cc4017b                                 lab-strimzi   1            1                    True

@planetf1
Member Author

planetf1 commented Mar 29, 2022

(logging debug process here for benefit of future attempts)

Commands can also be issued directly on the Strimzi Kafka brokers; in this case I have just one. Nice to see the Strimzi container has the client tools present:

jonesn:~/ $ kubectl exec cts-strimzi-kafka-0 -- /opt/kafka/bin/kafka-topics.sh --bootstrap-server cts-strimzi-kafka-bootstrap:9092 --list                        [11:14:23]
__consumer_offsets
__strimzi-topic-operator-kstreams-topic-store-changelog
__strimzi_store_topic
egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances
egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.types

@planetf1
Member Author

jonesn:~/ $ kubectl exec cts-strimzi-kafka-0 -- /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server cts-strimzi-kafka-bootstrap:9092 --list               [11:14:36]
ecca35f2-ef3b-44d4-9d21-7b30f23bbed9
9068b027-b891-4820-ae93-4c51f588fa74
__strimzi-topic-operator-kstreams
jonesn:~/ $ kubectl exec cts-strimzi-kafka-0 -- /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server cts-strimzi-kafka-bootstrap:9092 --describe --group ecca35f2-ef3b-44d4-9d21-7b30f23bbed9

GROUP                                TOPIC                                                                    PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                                                          HOST            CLIENT-ID
ecca35f2-ef3b-44d4-9d21-7b30f23bbed9 egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration 0          4               4               0               consumer-ecca35f2-ef3b-44d4-9d21-7b30f23bbed9-1-3e1aba8d-9054-43d8-935d-42ea51617551 /172.17.16.166  consumer-ecca35f2-ef3b-44d4-9d21-7b30f23bbed9-1
ecca35f2-ef3b-44d4-9d21-7b30f23bbed9 egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.types        0          1748            1748            0               consumer-ecca35f2-ef3b-44d4-9d21-7b30f23bbed9-3-25943500-f72b-4f71-a0f7-36aeea7ed8e5 /172.17.16.166  consumer-ecca35f2-ef3b-44d4-9d21-7b30f23bbed9-3
ecca35f2-ef3b-44d4-9d21-7b30f23bbed9 egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances    0          0               0               0               consumer-ecca35f2-ef3b-44d4-9d21-7b30f23bbed9-5-2b92da1b-9f0f-4ee7-b6cb-5505dccc145b /172.17.16.166  consumer-ecca35f2-ef3b-44d4-9d21-7b30f23bbed9-5
jonesn:~/ $ kubectl exec cts-strimzi-kafka-0 -- /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server cts-strimzi-kafka-bootstrap:9092 --describe --group 9068b027-b891-4820-ae93-4c51f588fa74

GROUP                                TOPIC                                                                    PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                                                           HOST            CLIENT-ID
9068b027-b891-4820-ae93-4c51f588fa74 egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.types        0          1748            1748            0               consumer-9068b027-b891-4820-ae93-4c51f588fa74-9-3e4d3c28-57f6-44a4-8799-8e2fdbac969a  /172.17.16.166  consumer-9068b027-b891-4820-ae93-4c51f588fa74-9
9068b027-b891-4820-ae93-4c51f588fa74 egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances    0          0               0               0               consumer-9068b027-b891-4820-ae93-4c51f588fa74-11-40d9fb94-5682-4818-ad87-21b7ccc4b6e6 /172.17.16.166  consumer-9068b027-b891-4820-ae93-4c51f588fa74-11
9068b027-b891-4820-ae93-4c51f588fa74 egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration 0          4               4               0               consumer-9068b027-b891-4820-ae93-4c51f588fa74-7-cf36ec15-a1af-40c9-aaaa-63a9d72dc37e  /172.17.16.166  consumer-9068b027-b891-4820-ae93-4c51f588fa74-7

@planetf1
Member Author

planetf1 commented Mar 29, 2022

This is pretty much what I'd expect

  • 4 messages relating to cohort registration
  • 1748 messages relating to types

The offsets are consistent with this having started from a clean slate.
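
To make this check repeatable, the `--describe` output can be summed with a little awk. This is a minimal sketch, assuming the column layout shown above (LAG is the sixth column); the function name `total_lag` is made up for illustration, and the sample rows are abbreviated from the real output.

```shell
#!/bin/sh
# total_lag: sum the LAG column of `kafka-consumer-groups.sh --describe` output read from stdin.
# Assumes the layout shown above: GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG ...
# The header row, blank lines and any non-numeric LAG value are skipped.
total_lag() {
  awk '$6 ~ /^[0-9]+$/ { sum += $6 } END { print sum + 0 }'
}

# Example with rows abbreviated from the output above; a fully caught-up group sums to 0.
printf '%s\n' \
  'GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID' \
  'ecca35f2 egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration 0 4 4 0 c1 /172.17.16.166 c1' \
  'ecca35f2 egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.types 0 1748 1748 0 c3 /172.17.16.166 c3' \
  | total_lag
```

Against a live cluster, the same function can take the output of the `kubectl exec ... kafka-consumer-groups.sh --describe --group ...` command used above on stdin.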

So why aren't the cohort members communicating? Why did the registration process not seem to work?

Any ideas, @mandy-chessell?

@planetf1
Member Author

Checking topic in more detail:

jonesn:~/ $ kubectl exec cts-strimzi-kafka-0 -- /opt/kafka/bin/kafka-topics.sh --bootstrap-server cts-strimzi-kafka-bootstrap:9092 --describe --topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Topic: egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration	TopicId: opavW4KNS8mYcB7-HqDjOA	PartitionCount: 1	ReplicationFactor: 1	Configs: message.format.version=3.0-IV1
	Topic: egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration	Partition: 0	Leader: 0	Replicas: 0	Isr: 0

@planetf1
Member Author

Taking a look at one of these topics (after a restart, so the ids differ from those in the commands above):

jonesn:charts/ (ctsupd*) $ kubectl exec cts-strimzi-kafka-0 -- /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server cts-strimzi-kafka-bootstrap:9092 --topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration --from-beginning
{"class":"OMRSEventV1","protocolVersionId":"OMRS V1.0","timestamp":1648550710143,"originator":{"metadataCollectionId":"e15204a1-83f5-4521-a13c-fe103640a6c4","serverName":"cts","serverType":"Conformance Suite Services"},"eventCategory":"REGISTRY","registryEventSection":{"registryEventType":"REGISTRATION_EVENT","registrationTimestamp":1648550710141,"metadataCollectionName":"cts","remoteConnection":{"class":"Connection","headerVersion":0,"connectorType":{"class":"ConnectorType","headerVersion":0,"type":{"class":"ElementType","headerVersion":0,"elementOrigin":"LOCAL_COHORT","elementVersion":0,"elementTypeId":"954421eb-33a6-462d-a8ca-b5709a1bd0d4","elementTypeName":"ConnectorType","elementTypeVersion":1,"elementTypeDescription":"A set of properties describing a type of connector."},"guid":"75ea56d1-656c-43fb-bc0c-9d35c5553b9e","qualifiedName":"Egeria:OMRSRepositoryConnector:CohortMemberClient:REST","displayName":"REST Cohort Member Client Connector","description":"Cohort member client connector that provides access to open metadata located in a remote repository via REST calls.","connectorProviderClassName":"org.odpi.openmetadata.adapters.repositoryservices.rest.repositoryconnector.OMRSRESTRepositoryConnectorProvider"},"endpoint":{"class":"Endpoint","headerVersion":0,"address":"https://cts-platform:9443/servers/cts"}}}}
{"class":"OMRSEventV1","protocolVersionId":"OMRS V1.0","timestamp":1648550710217,"originator":{"serverName":"cts","serverType":"Conformance Suite Services"},"eventCategory":"REGISTRY","registryEventSection":{"registryEventType":"REFRESH_REGISTRATION_REQUEST"}}
{"class":"OMRSEventV1","protocolVersionId":"OMRS V1.0","timestamp":1648550713658,"originator":{"metadataCollectionId":"8474ae28-7556-4f89-b3bd-7e6e6243bc73","serverName":"tut","serverType":"TUT","organizationName":"Egeria"},"eventCategory":"REGISTRY","registryEventSection":{"registryEventType":"REGISTRATION_EVENT","registrationTimestamp":1648550713657,"metadataCollectionName":"TUT_MDR","remoteConnection":{"class":"Connection","headerVersion":0,"connectorType":{"class":"ConnectorType","headerVersion":0,"type":{"class":"ElementType","headerVersion":0,"elementOrigin":"LOCAL_COHORT","elementVersion":0,"elementTypeId":"954421eb-33a6-462d-a8ca-b5709a1bd0d4","elementTypeName":"ConnectorType","elementTypeVersion":1,"elementTypeDescription":"A set of properties describing a type of connector."},"guid":"75ea56d1-656c-43fb-bc0c-9d35c5553b9e","qualifiedName":"Egeria:OMRSRepositoryConnector:CohortMemberClient:REST","displayName":"REST Cohort Member Client Connector","description":"Cohort member client connector that provides access to open metadata located in a remote repository via REST calls.","connectorProviderClassName":"org.odpi.openmetadata.adapters.repositoryservices.rest.repositoryconnector.OMRSRESTRepositoryConnectorProvider"},"endpoint":{"class":"Endpoint","headerVersion":0,"address":"https://cts-platform:9443/servers/tut"}}}}
{"class":"OMRSEventV1","protocolVersionId":"OMRS V1.0","timestamp":1648550713678,"originator":{"serverName":"tut","serverType":"TUT","organizationName":"Egeria"},"eventCategory":"REGISTRY","registryEventSection":{"registryEventType":"REFRESH_REGISTRATION_REQUEST"}}

This is all as I'd expect..
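
For a quicker read of who sent what, the raw events can be reduced to one "server event-type" pair per line. This is a rough sketch using sed only. It relies on `serverName` appearing before `registryEventType` in each event, as it does in the dump above, and a JSON-aware tool would be more robust. The function name `summarise_registry_events` is hypothetical.

```shell
#!/bin/sh
# summarise_registry_events: read OMRS registry events (one JSON document per line) from stdin
# and print "<serverName> <registryEventType>" for each line that contains both fields.
# Relies on the field order seen in the topic dump above; lines that do not match are dropped.
summarise_registry_events() {
  sed -n 's/.*"serverName":"\([^"]*\)".*"registryEventType":"\([^"]*\)".*/\1 \2/p'
}

# Example using two events trimmed from the dump above.
printf '%s\n' \
  '{"class":"OMRSEventV1","originator":{"serverName":"cts","serverType":"Conformance Suite Services"},"eventCategory":"REGISTRY","registryEventSection":{"registryEventType":"REFRESH_REGISTRATION_REQUEST"}}' \
  '{"class":"OMRSEventV1","originator":{"serverName":"tut","serverType":"TUT","organizationName":"Egeria"},"eventCategory":"REGISTRY","registryEventSection":{"registryEventType":"REFRESH_REGISTRATION_REQUEST"}}' \
  | summarise_registry_events
```

When piping from `kafka-console-consumer.sh --from-beginning`, adding `--timeout-ms` lets the consumer exit once the topic is drained, so the pipeline finishes.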

@planetf1
Member Author

After more testing

  • Graph consistently starts ok
  • Inmem consistently does not start

Yet the infrastructure is the same (clean each time, fresh strimzi cluster) and the charts are identical.

The only difference is that in the inmem case I pass

tut:
  connectorProvider: "org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider"

whilst in the graph case, the default value is used which is (from values.yaml):

connectorProvider:
  "org.odpi.openmetadata.adapters.repositoryservices.graphrepository.repositoryconnector.GraphOMRSRepositoryConnectorProvider"

The config logs prove this is respected, and the value is only used for this expression in the configuration job:

if [ "${TUT_TYPE}" = "native" ]; then
  # Pick the local repository mode that matches the requested native provider.
  if [ "${CONNECTOR_PROVIDER}" = "org.odpi.openmetadata.adapters.repositoryservices.graphrepository.repositoryconnector.GraphOMRSRepositoryConnectorProvider" ]; then
    curl -f -k -w "\n   (%{http_code} - %{url_effective})\n" --silent -X POST \
      "${EGERIA_ENDPOINT}/open-metadata/admin-services/users/${EGERIA_USER}/servers/${TUT_SERVER}/local-repository/mode/local-graph-repository" || exit $?
  elif [ "${CONNECTOR_PROVIDER}" = "org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider" ]; then
    curl -f -k -w "\n   (%{http_code} - %{url_effective})\n" --silent -X POST \
      "${EGERIA_ENDPOINT}/open-metadata/admin-services/users/${EGERIA_USER}/servers/${TUT_SERVER}/local-repository/mode/in-memory-repository" || exit $?
  else
    echo "-- Unknown native repository provider: ${CONNECTOR_PROVIDER} -- exiting."
    exit 1
  fi
fi
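
The branch above is effectively a lookup from provider class to repository mode. Here is a minimal sketch of the same mapping as a `case` statement, which would make adding another native provider a one-line change. The function name `repo_mode_for` is hypothetical and not part of the chart.

```shell
#!/bin/sh
# repo_mode_for: print the local-repository mode path segment for a native provider class.
# Returns non-zero (with a message on stderr) for any provider it does not recognise.
repo_mode_for() {
  case "$1" in
    org.odpi.openmetadata.adapters.repositoryservices.graphrepository.repositoryconnector.GraphOMRSRepositoryConnectorProvider)
      echo "local-graph-repository" ;;
    org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider)
      echo "in-memory-repository" ;;
    *)
      echo "-- Unknown native repository provider: $1 -- exiting." >&2
      return 1 ;;
  esac
}

# Example: resolve the mode for the in-memory provider.
mode="$(repo_mode_for "org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider")" \
  && echo "mode: ${mode}"
```

The resolved value would then be appended to the `.../local-repository/mode/` URL in a single curl call instead of one call per branch.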

So it comes down to either timing or config differences.

Next Steps

  • replay the same config used by the chart manually
  • compare with notebook cts
  • check the code for the cts and inmem repositories for any explanation, particularly why the cohort config seems incorrect.
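
A manual replay of the first step could look roughly like the sketch below. It covers only the admin calls that are visible in this issue (the job log at the top and the mode call from the chart), not the chart's full configuration. It assumes each call is a POST, as the mode call is. The function names and the `DRY_RUN` switch are hypothetical.

```shell
#!/bin/sh
# Defaults taken from the job log at the top of this issue; override via the environment.
EGERIA_ENDPOINT="${EGERIA_ENDPOINT:-https://cts-platform:9443}"
EGERIA_USER="${EGERIA_USER:-admin}"
TUT_SERVER="${TUT_SERVER:-tut}"

# tut_config_urls: print, in order, the admin URLs seen in this issue for an in-memory TUT.
tut_config_urls() {
  base="${EGERIA_ENDPOINT}/open-metadata/admin-services/users/${EGERIA_USER}/servers/${TUT_SERVER}"
  echo "${base}/server-url-root?url=${EGERIA_ENDPOINT}"
  echo "${base}/server-type?typeName=TUT"
  echo "${base}/organization-name?name=Egeria"
  echo "${base}/local-repository/mode/in-memory-repository"
}

# replay_tut_config: with DRY_RUN=1 (the default) only print the calls;
# with DRY_RUN=0 issue them using the chart's curl flags and stop at the first failure.
replay_tut_config() {
  tut_config_urls | while read -r url; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "POST ${url}"
    else
      curl -f -k --silent -X POST "${url}" || exit $?
    fi
  done
}

replay_tut_config
```

Running it once in dry-run mode and diffing the printed URLs against the chart's job log is a cheap way to confirm the manual replay matches before sending anything.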

@cmgrote
Member

cmgrote commented Mar 29, 2022

My only thought here is that most repos (Graph, XTDB, etc) will have some initialization time before they're "ready" and joining into the cohort -- I suspect that the in-memory repo's initialization time is negligible (basically instantaneous).

Most likely, then, this is something akin to a race condition, e.g. where the in-memory repository has perhaps finished its initialization just as (or just before) the CTS driver starts its wait loop. A simple test fix would be to add a "sleep" between the steps where the CTS server (instance) and the TUT server (instance) are started, and see if that resolves it?

@planetf1
Member Author

planetf1 commented Mar 29, 2022

I agree, though our cohort design shouldn't allow for a gap, so I think we also need to understand whether the issue is in cohort initialization or is specific to the cts workbench code in the server.

I can add a sleep in the run - and / or maybe start tut first - that may give us reliability. If so, I'll externalize it so it's easy to revert to the broken behaviour for future debugging.

@planetf1
Member Author

I've left the order as-is and added a configurable delay.

For in-mem it consistently works with 3s, fails with 2s - though this could well be quite specific to my infrastructure.

I've set the default as 10s, which should be reasonable.
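
For illustration, the delay step could be sketched as below. `TUT_START_DELAY` and `wait_before_tut` are hypothetical names, and the chart's actual setting may be named and wired differently. Only the 10s default and the idea of a 0-delay run for reproducing the failure come from this thread.

```shell
#!/bin/sh
# wait_before_tut: pause between starting the CTS server and starting the TUT server.
# TUT_START_DELAY (hypothetical name) is a whole number of seconds; the default is 10.
wait_before_tut() {
  delay="${TUT_START_DELAY:-10}"
  # Reject anything that is not a non-negative integer.
  case "${delay}" in
    ''|*[!0-9]*) echo "Invalid delay: ${delay}" >&2; return 1 ;;
  esac
  echo "Waiting ${delay}s before starting the technology under test"
  sleep "${delay}"
}

# Usage (not run here, to avoid sleeping):
#   wait_before_tut                      # default 10s pause
#   TUT_START_DELAY=0 wait_before_tut    # no pause, i.e. the original timing
```

Keeping the value in one variable means the failing behaviour can be restored for debugging just by setting it to 0.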

The main focus of this issue is completed now. However the cohort config failure is a concern.
I'll plan to open an issue in egeria, and point to this issue + a 0-delay configuration as the reproducing test case.

@planetf1
Member Author

With these changes, CTS inmem completed successfully with our typical profile results.

@planetf1
Member Author

CTS now working. Issue opened against egeria to understand any issues with the core.
See odpi/egeria#6353
