Kicost hangs when parallel scraping on Mac-OS #174
Comments
Hi @timoalho, could you attach the XML or, if it is something for work, check with @xesscorp how to access it? |
@hildogjr The problem appears using the |
Thanks @timoalho for being a "beta user" and contributor. multiprocessing.pool.RemoteTraceback: The above exception was the direct cause of the following exception: Traceback (most recent call last): |
@hildogjr Indeed, the commit where the problem appears is the one which adds throttling_delay. I've been trying to debug this on my machine, specifically I think the problem is in the synchronization in
However, fiddling around with these has not revealed anything, nor fixed the problem. So right now I'm basically just shotgun debugging. |
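For readers following the synchronization discussion, here is a minimal sketch of one way a per-distributor throttling delay can be shared between concurrent workers. It is not KiCost's implementation: the `Throttle` class, `scrape_all`, and the part names are hypothetical, and it uses threads (Python 3) for simplicity. Sharing the same state across processes would need `multiprocessing` primitives instead.

```python
import threading
import time


class Throttle:
    """Enforce a minimum delay between requests to one distributor (hypothetical helper)."""

    def __init__(self, delay_s):
        self.delay_s = delay_s
        self._lock = threading.Lock()
        self._next_allowed = 0.0  # monotonic time of the next permitted request

    def wait(self):
        # Reserve a time slot while holding the lock, then sleep after releasing it,
        # so a sleeping worker never blocks the others from reserving their own slots.
        with self._lock:
            now = time.monotonic()
            start = max(now, self._next_allowed)
            self._next_allowed = start + self.delay_s
        time.sleep(max(0.0, start - now))
        return start


def scrape_all(part_numbers, throttle):
    """Pretend to scrape parts from several threads; return the reserved start times."""
    starts = []
    starts_lock = threading.Lock()

    def worker(part):
        slot = throttle.wait()  # a real worker would issue its request after this
        with starts_lock:
            starts.append(slot)

    threads = [threading.Thread(target=worker, args=(p,)) for p in part_numbers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(starts)
```

The lock is held only while a slot is reserved, never while sleeping or scraping, which keeps the critical section short and makes it harder for one stalled worker to hold up all the others.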
Please ignore much of my last comment, I only now realized that |
I tried reverting the implementation of throttling, but it doesn't seem to help, it still hangs. I.e. the main loop part of
The code still hangs. |
I confirm, @timoalho; it may be a race condition. Sometimes, when I am debugging new features, I run more than one KiCost (main) instance at once and get this error. When I run the files one at a time, I don't. |
I'm not seeing any hanging with test3.xml. I've run it with the latest KiCost under Python 2.7 and 3.6 on Windows 7. I've tried it with throttling both on and off with no change. test.xml is throwing an error because something has been changed with the code that handles custom part data as described in the usage section of the manual. |
So, is it a false positive? (On Ubuntu 16.04 with Python 2.7 and 3, test3.xml also runs with no error.) |
I don't know if it's a fake error, I think it's an error that may only occur on OSX. This is not the first time I've seen that with OSX. I'm not sure if the error is in KiCost or within some Python library that OSX is using. |
@timoalho, could you attach a |
Sure. Attached is a log from
and then hung. I waited for an hour, with no further progress, after which I interrupted the run with CTRL + C (note that this seems to have generated quite a lot of output at the end of the log). |
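Instead of waiting an hour for a hung run, a timeout on result retrieval makes a stuck job visible quickly. The sketch below is an illustration only, not KiCost code: `scrape_part`, `scrape_with_timeout`, and the part names are hypothetical, and a `ThreadPool` is used to keep the example self-contained. The results returned by a process `Pool` offer the same `get(timeout=...)` call.

```python
import multiprocessing
import time
from multiprocessing.pool import ThreadPool


def scrape_part(part, delay_s):
    """Stand-in for a scraping call; `delay_s` simulates a slow or stuck request."""
    time.sleep(delay_s)
    return part


def scrape_with_timeout(jobs, timeout_s):
    """Run jobs in a pool; report which ones did not finish within `timeout_s` each."""
    done, stuck = [], []
    pool = ThreadPool(processes=len(jobs))
    try:
        pending = [(part, pool.apply_async(scrape_part, (part, delay)))
                   for part, delay in jobs]
        for part, result in pending:
            try:
                done.append(result.get(timeout=timeout_s))
            except multiprocessing.TimeoutError:
                stuck.append(part)  # surface the hang instead of waiting forever
    finally:
        pool.terminate()
    return done, stuck
```

For example, `scrape_with_timeout([("R1", 0.0), ("U9", 1.0)], timeout_s=0.2)` reports `U9` as stuck, which would tell us which part the run stalls on without needing CTRL + C.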
Here's another run, but with the log taken before pressing CTRL+C, to reduce clutter: |
Thanks @timoalho. Could you now run it sequentially? ( Your log reports a lot of failures, apparently from the Farnell submodule, but I will know better if the messages from the different threads are not mixed (this is the reason for the Please also check with farnell and newark excluded from the scrape. @xesscorp, see that the part numbers taken from the page table have a lot of |
Sorry about the delay, but here's finally the log from a sequential run: |
@timoalho, after the last part scraped (U9), KiCost just hangs for no apparent reason (even in sequential mode)? |
I just tried another run of test3.xml using the latest pull from the kicost master branch. I didn't get any stalls or other strange behavior running on Windows 7 Professional with Python 3.6.4. As for the results, everything but RS and TME seemed pretty well populated. |
I just tried the most recent version, with the same result: this time it hung at 17/27, and I aborted after about half an hour. The log was taken just before pressing CTRL+C, to avoid having it littered with messages from the abort. The CLI command was again:
|
Could you test in serial (not parallel) mode?
|
Sure! This time the serial version hung too, at 8/27, I aborted after several hours. The command was:
I again grabbed the log before pressing CTRL + C. |
I also tried running with
i.e. excluding farnell and newark, and this time it finished (I don't know in what time though, as I left it running when leaving work). I'll retry the parallel version with the exclusions, and also just to be sure try running the serial version without the exclusions, to see if the problem is intermittent. |
Parallel with exclusions, hung at 12/27 after about an hour:
You might also be interested in the console output after pressing CTRL + C, as that seems to tell us where some of the threads were waiting:
|
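The output after CTRL + C shows where threads were waiting only once the run is aborted. Python's standard `faulthandler` module can produce a similar stack dump on demand. The sketch below is a generic illustration, not part of KiCost: `dump_all_thread_stacks` and `waiting_worker` are hypothetical names, and it covers only the threads of one process, so each worker process in a pool would have to do the same for itself.

```python
import faulthandler
import tempfile
import threading


def dump_all_thread_stacks():
    """Return the current stack of every thread in this process as text."""
    with tempfile.TemporaryFile(mode="w+") as f:
        # faulthandler writes straight to the file descriptor, so a real file is needed.
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read()


def waiting_worker(started, release):
    """Stand-in for a scraper thread that is stuck waiting on a synchronization primitive."""
    started.set()
    release.wait()


if __name__ == "__main__":
    started, release = threading.Event(), threading.Event()
    t = threading.Thread(target=waiting_worker, args=(started, release), daemon=True)
    t.start()
    started.wait()
    print(dump_all_thread_stacks())  # the dump lists waiting_worker among the frames
    release.set()
```

On Unix-like systems, `faulthandler.register(signal.SIGUSR1)` lets a running process dump its stacks when sent that signal, and `faulthandler.dump_traceback_later(timeout)` dumps them automatically if the program is still running after `timeout` seconds. Either might help pin down where a hang like this one sits.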
@hildogjr Rerunning the serial version, kicost indeed finished in 42 minutes:
|
I was insisting because I do not have access to a Mac, and creating a virtual machine for it is quite complicated. |
My refactoring branch reimplements the whole locking mechanism. Maybe this issue will be resolved during that process. I can try on a Mac after the refactoring has stabilized. |
I tested my refactored version on a Mac and it works now. |
@timoalho, if it is not asking too much, could you please confirm that the refactored version in https://github.com/mmmaisel/KiCost works before we merge it into the main branch? |
@hildogjr sure, no problem! I'll try to get around to doing that tomorrow. |
The @mmmaisel refactored branch seems to work!! The number of parts in test3.xml seems to be 162 now (previously the progress indicator showed xx/27, now xx/162), is this actually the number of parts times the number of distributors or what has increased? Anyway, here's the delightful final result:
Big thanks to @hildogjr , @xesscorp and @mmmaisel for persevering with this, some of the features in the new version are really useful to me, but without the parallel scraping it was unusably slow :) |
...and also the BOMs for my project, in which I originally discovered the problem, work now! A very minor gripe: when using multiple BOMs, the output file is written to the directory where the first BOM is located, whereas I'd have expected it to be placed in the current directory. Took me a while to find it! |
Thanks for the feedback, @timoalho. Since this feature was implemented, you are the first user to give us an opinion on it. |
KiCost hangs on all at least moderately complicated BoMs when scraping, at least on OS X (see below for exact versions). For example, when running
where test3.xml is the included test BoM, after scraping for a while, KiCost hangs, displaying
or something similar. Waiting for up to an hour does not help, whereas the same BoM finishes in less than two minutes on earlier versions. For the same BoM and consecutive runs on the same machine, the hang mostly happens on the same part, but otherwise seems to vary.
After testing various points in the commit history, I've narrowed the problem down to appearing at commit 6e960bc490f86567945f21b477b533471923958c, on Jan 15th. The previous commit, 7417ab20596e9921b9a6b0f90b11b8463d96bbcd, on Jan 13th, still works (although crashes on a different error after the scraping is finished).
This appears very much like a race condition somewhere in the code.
Tested on OS X 10.11.6, Python 3.6.2.