Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another Pack from IBM (MONITOR_INGEST) #163

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

Anshika-Gautam
Copy link

This is another pack from IBM similar to the first one monitor_mqtt

@@ -30,7 +30,7 @@ jobs:
- run:
name: Download dependencies
command: |
git clone -b ${CI_BRANCH:-master} [email protected]:StackStorm-Exchange/ci.git ~/ci
git clone -b ${CI_BRANCH:-master} [email protected]:Anshika-Gautam/ci.git ~/ci
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to fix this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our pack is using mam-sdk which is dependent on iotfunctions package for its functioning. The "[email protected]:StackStorm-Exchange/ci.git ~/ci" is installing the pip version 9.0.3 which is not able to get the iotfunctions package from the specified git repository. It is resulting into the below mentioned error on circle:


`Using /home/circleci/virtualenv/lib/python3.6/site-packages
Finished processing dependencies for st2common==3.4.dev0

Exited with code exit status 1
CircleCI received exit code 1`


We had removed the pip restriction in the package "[email protected]:Anshika-Gautam/ci.git ~/ci" and after that we are able to download all the dependencies for our pack. But the run test step is failing for the pack now as shown in snapshot below:
Screenshot 2021-02-19 at 7 18 13 PM

monitor_ingest/README.md Show resolved Hide resolved
monitor_ingest/README.md Outdated Show resolved Hide resolved
monitor_ingest/config.schema.yaml Outdated Show resolved Hide resolved
type: "string"
required: true
json_schema_path:
description: "json Schema is must to validate CSV in case of action_type : DataClean"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this a JSON schema for?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The json schema defines particular format in which the data for ingestion should be supplied for our pack's data_ingest action

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does _path refer to a file on disk?

And does this mean that the file will need to exist on all st2actionrunner nodes?

Copy link
Author

@Anshika-Gautam Anshika-Gautam Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blag Yes _path refers to a file on disk. What are you trying point with st2actionrunner nodes being using the path? Can you please explain?
Note - The _path is mandatory to execute setup_entity action.


def run(self):
# define validation elements
print('1. Starting data Clean Action ..')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should replace prints with self.logger.debug()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated the changes for print

if errors:
errors_index_rows = [e.row for e in errors]
print('5. Cleaning input CSV data ..')
data_clean = data.drop(index=errors_index_rows)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's great practice to hard code paths like this in your actions. You're going to run into problems with there is >1 instance of your action running at the same time

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had removed the hard code paths. Thanks for your suggestion.

@@ -0,0 +1,102 @@
import json
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a naming perspective, a better name would be data_clean_csv you can leave out the _action

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -0,0 +1,37 @@
import yaml
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same with the name here, maybe just data_ingest_csv

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

dimension_data_path = None
function_data_path = None

if self._action_type == "SetupEntityAction":
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why these are not parameters to the action and instead hard coded in the config?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our config file is actually working as a handler to perform different actions on the entity. We had updated the code to follow a modular approach now. Instead of if loops we tried using functions to call a particular action type.

@abharast
Copy link

abharast commented Mar 2, 2021

Hi @nmaludy

could you please suggest us with the issue related to pip version as it made us blocked in this submission
#163 (comment)

-Abhay

entity_name:
description: "Entity Name is must in case of action_type : LoadCsv"
type: "string"
data_file_path:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the file identified here come from?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blag It is the same file which is residing on disk. This variable here is pointing to the path of the file.

raise ValueError('Missing action type key in config file')

def run(self):
operations_completed = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like using a mutable dictionary as a global variable. That's not intuitive and difficult to debug. Since all of the setup_* functions are mutually exclusive, simply have them return the result you would like to return to StackStorm.

Copy link
Author

@Anshika-Gautam Anshika-Gautam Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blag removed mutable dictionary


"""----------STATUS----------"""
self.logger.info('RESULT :')
for name, status in operations_completed.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the logic in this loop is overcomplicated. Why not just return a tuple of (success, result_text) to StackStorm? Seems much cleaner to me.

Copy link
Author

@Anshika-Gautam Anshika-Gautam Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blag simplified the logic to return boolean values

@Anshika-Gautam
Copy link
Author

@nmaludy @blag we have a blocker regarding pip version for downloading dependencies. As mentioned in thread #163 (comment).

@Anshika-Gautam Anshika-Gautam requested a review from blag March 4, 2021 10:29
@cognifloyd
Copy link
Member

StackStorm-Exchange/ci#102 was merged today which updates the pinned version of pip to 20.0.2

Please push another commit to restart the tests.

The following error was encountered while running circleci
Traceback (most recent call last):
  File "/home/circleci/ci/.circle/validate.py", line 21, in <module>
    from st2common.models.api.pack import PackAPI
  File "/tmp/st2/st2common/st2common/models/api/pack.py", line 25, in <module>
    from st2common.util import schema as util_schema
  File "/tmp/st2/st2common/st2common/util/schema/__init__.py", line 112, in <module>
    "allOf": _validators.allOf_draft4,
AttributeError: module 'jsonschema._validators' has no attribute 'allOf_draft4'
Unable to retrieve pack name.

In order to avoid this error the jsonschema is restricted to version 3.0.0
@cognifloyd
Copy link
Member

Eww. That's a nasty dependency conflict.

ERROR: st2common 3.5.dev0 has requirement jsonschema==2.6.0, but you'll have jsonschema 3.0.0 which is incompatible.
ERROR: orquesta 1.3.0 has requirement jsonschema!=2.5.0,<3.0.0,>=2.0.0, but you'll have jsonschema 3.0.0 which is incompatible.
ERROR: mam-sdk 0.0.0 has requirement jsonschema>=3.2.0, but you'll have jsonschema 3.0.0 which is incompatible.

jsonschema._validators.allOf_draft4 was moved and dropped in the 3.0 series. So, st2common and orquesta will need to be updated to support jsonschema 3+ before packs can use newer versions of jsonschema.

@Anshika-Gautam
Copy link
Author

@cognifloyd are you working on updating Stackstorm packs to use the latest versions of the packages like jsonschema and others. As you can see the conflict is because of the different versions these dependencies(st2,mam-sdk) are using.

@cognifloyd
Copy link
Member

I have a variety of other things I'm working on contributing. Recently I helped to get pip pinned to the same version in several of the StackStorm repos. That's how I came across this new pack. :)

Would you be able to look into what it will take to update jsonschema in https://github.com/StackStorm/orquesta and https://github.com/StackStorm/st2 ?

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@cognifloyd
Copy link
Member

I'm closing and reopening this to trigger the latest CI.

@cognifloyd cognifloyd closed this May 23, 2022
@cognifloyd cognifloyd reopened this May 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants