
MCPClient error check Gearman worker creation #674

Conversation

minusdavid

This patch adds a try/except block to the MCPClient when creating
a Gearman worker in startThread().

Without this patch, if the MCPClient configuration item
"MCPArchivematicaServer" has an invalid value, no Gearman worker
will be created and Archivematica will be stuck thinking that a
job is executing indefinitely with no indication of what happened
in the user interface or the logs.

To test, open "/etc/archivematica/MCPClient/clientConfig.conf",
and change "MCPArchivematicaServer" to something invalid like
"buffalo" or "localhost::9999", and then try to do a standard
transfer in the Archivematica dashboard UI. In the micro-service
"Verify transfer compliance", you'll get stuck at "Job: Set file
permissions". It will say it's still executing but the job will
never actually run.
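The pattern the patch describes can be sketched as follows. This is a minimal stand-in, not the actual MCPClient code: `create_gearman_worker` here is a hypothetical substitute for the real `gearman.GearmanWorker([...])` call, and the logger name is illustrative.

```python
import logging

logger = logging.getLogger(__name__)


def create_gearman_worker(server_addr):
    # Stand-in for gearman.GearmanWorker([server_addr]); raises on a
    # malformed address such as "localhost::9999", as the real library
    # would when parsing the host list.
    host, sep, port = server_addr.partition(":")
    if not host or (sep and not port.isdigit()):
        raise ValueError("invalid Gearman server address: %r" % server_addr)
    return (host, int(port) if port else 4730)


def start_thread(server_addr):
    """Create the worker, logging failures instead of dying silently."""
    try:
        worker = create_gearman_worker(server_addr)
    except Exception:
        # Without this, the thread dies and nothing reaches the logs.
        logger.exception("Unable to create Gearman worker for %r", server_addr)
        return None
    return worker
```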

@sevein
Member

sevein commented Jul 12, 2017

Hi @minusdavid! This is a good improvement. Thanks for spending time fixing it.

There was a similar issue in MCPServer where the child threads execute in their own context. @Hwesta solved this elegantly using a decorator here: 8efc709. Would you consider taking the same approach?

Ultimately I think we should establish a channel so the threads can communicate with the main thread. There is an example here: https://stackoverflow.com/a/2830127. You may want to try that approach instead.
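The channel idea from that Stack Overflow answer can be sketched roughly like this. All names here are illustrative, not Archivematica code: each child thread reports its startup result back to the main thread over a shared `Queue`.

```python
import queue  # "Queue" module in Python 2
import threading


def worker(name, status_queue):
    """Report startup success or failure back to the main thread."""
    try:
        # ... the Gearman worker would be created here ...
        status_queue.put((name, "ok", None))
    except Exception as err:
        status_queue.put((name, "error", err))


def main():
    status_queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=("worker-%d" % i, status_queue))
        for i in range(2)
    ]
    for t in threads:
        t.start()
    # The main thread now knows what happened in each child thread and
    # could retry, alert, or shut down instead of hanging indefinitely.
    results = [status_queue.get(timeout=5) for _ in threads]
    for t in threads:
        t.join()
    return results
```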

@minusdavid
Author

I'm intrigued by the channel idea. While I started using Python a few years ago for fun at home, I am very new to multithreading (I'm used to forking multiple processes and using IPC to exchange data in Perl rather than using threads), and I only wrote my first Python multithreading program using Queue a week or two ago, so I'm still trying to wrap my head around how memory is used for multithreading.

At a glance, it looks like Queue manages some shared memory using locking to be thread-safe, and in the case of that Stack Overflow link it passes information back by putting items in the Queue. The accepted answer seems to have some reasonable criticisms attached to it, though. I don't know that a Queue is the best way to pass messages, although it seems to be the most common approach on Google in both Python and Java. In Perl, I fork off some worker processes, set up pipes between them for exchanging data, and catch the SIGCHLD signal to see whether a child died or exited successfully. But neither really seems an option with multithreading.

I like the idea of the main thread knowing what's happening with the other threads, though, so that it can take any necessary action. I think I'll leave that one up to you folk as the core developers for the future.

I like the decorator idea, although I think that may involve copying src/MCPServer/lib/utils.py to src/MCPClient/lib/utils.py. Do these modules share any code? Maybe I could put the actual code into a module in /usr/lib/archivematica/archivematicaCommon, and then MCPServer/lib/utils.py and MCPClient/lib/utils.py could create their respective loggers and pass that to a wrap() function from that shared module?

Looks like startThread already has a @auto_close_db decorator, but I don't see why we couldn't stack another one on.
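The shared-module idea could look roughly like this sketch. The module location, function names, and logger names are assumptions modeled on the MCPServer decorator, not actual Archivematica code:

```python
# Hypothetical shared module under archivematicaCommon, e.g. thread_utils.py
import functools


def log_exceptions(logger):
    """Return a decorator that logs any exception escaping a thread target.

    MCPServer and MCPClient would each call this with their own logger,
    so the shared code stays logger-agnostic.
    """
    def wrap(func):
        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                logger.exception("Uncaught exception in %s", func.__name__)
                raise
        return wrapped
    return wrap
```

In MCPClient/lib/utils.py (again hypothetically) this would be `wrap = log_exceptions(logging.getLogger("archivematica.mcp.client"))`, and since decorators stack, `@wrap` could sit alongside the existing `@auto_close_db` on startThread.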

@sevein
Member

sevein commented Jul 13, 2017

> I like the idea of the main thread knowing what's happening with the other threads, though, so that it can take any necessary action. I think I'll leave that one up to you folk as the core developers for the future.

Ok, no problem! Thanks for sharing your thoughts.

> I like the decorator idea, although I think that may involve copying src/MCPServer/lib/utils.py to src/MCPClient/lib/utils.py. Do these modules share any code? Maybe I could put the actual code into a module in /usr/lib/archivematica/archivematicaCommon, and then MCPServer/lib/utils.py and MCPClient/lib/utils.py could create their respective loggers and pass that to a wrap() function from that shared module?

Sounds like a good plan, @minusdavid. We may have some more feedback once we look at the code. Stacking decorators is fine, yep! Thank you again - there's a lot of room for improvement in these areas of AM.

@sevein
Member

sevein commented Jul 13, 2017

@minusdavid, I've noticed that you're submitting this PR against stable/1.6.x but it should be using qa/1.x instead. Also, please remember to sign the contributor agreement - you can read more about it here: CONTRIBUTING.md.

@minusdavid
Author

> Sounds like a good plan, @minusdavid. We may have some more feedback once we look at the code. Stacking decorators is fine, yep! Thank you again - there's a lot of room for improvement in these areas of AM.

Looks like there's no unit test for this one, so I suppose I'll have to run up the whole thing to test out the decorator. I might look at that next week then...

> @minusdavid, I've noticed that you're submitting this PR against stable/1.6.x but it should be using qa/1.x instead. Also, please remember to sign the contributor agreement - you can read more about it here: CONTRIBUTING.md.

Since there's no master branch, I was wondering what best to submit against. If there's time next week, I'll look at redoing the PR.

I noticed the contributor agreement, but wasn't sure if that needed to be done until you approved the changes? I'll have to look at that too.

@sevein
Member

sevein commented Jul 26, 2017

> I noticed the contributor agreement, but wasn't sure if that needed to be done until you approved the changes? I'll have to look at that too.

It needs to be done before merging, that's all!

@minusdavid
Author

I'll have to do the contributor agreement later.

I'm not sure that I know how to test my change actually. I suppose I could change "deploy-pub/playbooks/archivematica/src/archivematica" to the branch I want to work on. I suppose I should've set up that Vagrant install with the qa branch rather than the stable branch, but maybe that will work for now since it's a fairly targeted change.

@minusdavid minusdavid force-pushed the dev/gh-issue-673-mcpclient-error-checking branch from 50e0d04 to cc7b9cb Compare August 3, 2017 06:53
@minusdavid minusdavid changed the base branch from stable/1.6.x to qa/1.x August 3, 2017 06:53
@minusdavid
Author

Ahh looks like my qa/1.x is behind the times...

@minusdavid minusdavid force-pushed the dev/gh-issue-673-mcpclient-error-checking branch from cc7b9cb to e1accdd Compare August 3, 2017 07:05
@minusdavid
Author

Looks like the CI build is failing... I'm guessing because the MCPClient fails differently than before?

@minusdavid
Author

So I've fixed the base branch.

I still need to do the contributor agreement and change the try/except to use a decorator instead...

But still not sure the best way to test this. I don't know that I can use deploy-pub... unless I re-run it against qa/1.x instead of stable/1.6.x. Is that how developers do it?

What's the standard way that developers test their changes for Archivematica?

@minusdavid
Author

I've sent in the contributor agreement.

Now I'm trying to test my changes...

I used Vagrant to set up a 1.6.x box and now I want to switch to a qa box without re-doing the whole thing... but I suppose there's no harm in destroying it and creating a new one now.

I've updated vars-singlenode-qa.yml to use my own archivematica_src_am_version and archivematica_src_am_repo variables. I suppose I could've left them with the default and then switched branches later though?

@minusdavid
Author

minusdavid commented Aug 8, 2017

Struggling to get a development environment up and running for qa/1.x. I think there might be a bug in archivematica-storage-server qa/0.x?

As per artefactual/deploy-pub#40, I keep getting this error:

TASK [artefactual.archivematica-src : Run SS django collectstatic] *************
fatal: [am-local]: FAILED! => "./manage.py collectstatic --noinput" failed with:

Traceback (most recent call last):
  File "./manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/share/python/archivematica-storage-service/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 354, in execute_from_command_line
    utility.execute()
  File "/usr/share/python/archivematica-storage-service/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 346, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/share/python/archivematica-storage-service/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 182, in fetch_command
    settings.INSTALLED_APPS
  File "/usr/share/python/archivematica-storage-service/local/lib/python2.7/site-packages/django/conf/__init__.py", line 48, in __getattr__
    self._setup(name)
  File "/usr/share/python/archivematica-storage-service/local/lib/python2.7/site-packages/django/conf/__init__.py", line 44, in _setup
    self._wrapped = Settings(settings_module)
  File "/usr/share/python/archivematica-storage-service/local/lib/python2.7/site-packages/django/conf/__init__.py", line 92, in __init__
    mod = importlib.import_module(self.SETTINGS_MODULE)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/opt/archivematica/archivematica-storage-service/storage_service/storage_service/settings/production.py", line 33, in <module>
    ALLOWED_HOSTS = get_env_variable('DJANGO_ALLOWED_HOSTS').split(',')
  File "/opt/archivematica/archivematica-storage-service/storage_service/storage_service/settings/base.py", line 19, in get_env_variable
    raise ImproperlyConfigured(error_msg)
django.core.exceptions.ImproperlyConfigured: Set the DJANGO_ALLOWED_HOSTS environment variable
        to retry, use: --limit @/vagrant/singlenode.retry

@minusdavid
Author

I don't want to abandon this pull request, but if I can't get a dev environment up soon, I think I just won't be able to devote more time to this. I probably should've stopped working on this ages ago, but I'm just stubborn and interested.

@jraddaoui
Contributor

jraddaoui commented Aug 8, 2017

The following PR should fix the error in TASK [artefactual.archivematica-src : Run SS django collectstatic]

artefactual-labs/ansible-archivematica-src#123

@minusdavid
Author

I think once #704 is resolved, I should be able to put together a development environment to test my changes here, so I'll hold out for that.

@minusdavid
Author

Oh, one thing. Any idea why the Travis CI build is failing?

@jhsimpson
Member

Looks like a flake8 formatting problem.

./src/MCPClient/lib/archivematicaClient.py:171:1: E302 expected 2 blank lines, found 1

If you add an extra blank line in there, and push another commit, it should pass.
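For reference, E302 is the flake8 check requiring two blank lines between top-level definitions; the fix looks like this (the function names are illustrative, not the actual archivematicaClient.py code):

```python
def first_function():
    return 1


def second_function():  # the two blank lines above satisfy flake8 E302
    return 2
```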

@minusdavid minusdavid force-pushed the dev/gh-issue-673-mcpclient-error-checking branch from e1accdd to 40026e0 Compare August 9, 2017 00:19
@minusdavid minusdavid force-pushed the dev/gh-issue-673-mcpclient-error-checking branch from 40026e0 to d5b0a92 Compare August 9, 2017 00:26
@minusdavid
Author

@jhsimpson Thanks for the explanation. I'm not familiar with flake8 or Travis CI, so it wasn't until you copied that line that I saw the line number and realised what it was saying. All good now.

@sevein
Member

sevein commented Aug 9, 2017

> I think once #704 is resolved, I should be able to put together a development environment to test my changes here, so I'll hold out for that.

@minusdavid, #704 is now fixed!

@minusdavid
Author

Alas, still no luck. Looks like the problem was with the symlinks in the stable branches...

@sevein
Member

sevein commented Sep 25, 2017

Closing this; please read the original issue #673 to know more.

@sevein sevein closed this Sep 25, 2017