
Kicost hangs when parallel scraping on Mac-OS #174

Closed
timoalho opened this issue Feb 6, 2018 · 32 comments
Labels: bug (Bugs that impact main KiCost functionality)

@timoalho (Contributor) commented Feb 6, 2018

KiCost hangs when scraping any moderately complicated BoM, at least on OS X (see below for exact versions). For example, when running

kicost -i  tests/test3.xml

where test3.xml is the included test BoM, after scraping for a while, KiCost hangs, displaying

Progress:  69%|█████████████████████▎         | 22/32 [00:56<00:25,  2.56s/part]

or something similar. Waiting for up to an hour does not help, whereas the same BoM finishes in less than two minutes on earlier versions. For the same BoM and consecutive runs on the same machine, the hang mostly happens on the same part, but otherwise seems to vary.

After testing various points in the commit history, I've narrowed the problem down to commit 6e960bc490f86567945f21b477b533471923958c, on Jan 15th. The previous commit, 7417ab20596e9921b9a6b0f90b11b8463d96bbcd, on Jan 13th, still works (although it crashes with a different error after the scraping is finished).

This appears very much like a race condition somewhere in the code.

Tested on OS X 10.11.6, Python 3.6.2.

@hildogjr (Owner) commented Feb 6, 2018

Hi @timoalho, could you attach the XML or, if it is work-related, check with @xesscorp how we can access it?
I am also using this latest commit of KiCost (on Linux/Ubuntu) and, also using the multi-file feature to create a big spreadsheet/BOM (100 or more parts), I could not reproduce the error. I am having issues with RS and Mouser, but those are already reported.

@timoalho (Contributor, Author) commented Feb 6, 2018

@hildogjr The problem appears using the test3.xml which is in the tests directory in this very repo.

@hildogjr (Owner) commented Feb 6, 2018

Thanks @timoalho for being a "beta user" and contributor.
I can confirm an error. Tracking down the origin, it may be related to the addition of the throttling_delay feature.
I get no error with tests/test3.xml, but tests/test.xml returns:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.5/dist-packages/kicost/distributors/web_routines.py", line 204, in scrape_part
dist_module = dist_modules[d]
KeyError: 'mine'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/bin/kicost", line 11, in
sys.exit(main())
File "/usr/local/lib/python3.5/dist-packages/kicost/main.py", line 244, in main
scrape_retries=args.retries, throttling_delay=args.throttling_delay)
File "/usr/local/lib/python3.5/dist-packages/kicost/kicost.py", line 199, in kicost
id, url, part_num, price_tiers, qty_avail = result.get()
File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
KeyError: 'mine'

@hildogjr added the bug label on Feb 6, 2018
@timoalho (Contributor, Author) commented Feb 6, 2018

@hildogjr Indeed, the commit where the problem appears is the one which adds throttling_delay. I've been trying to debug this on my machine; specifically, I think the problem is in the synchronization in scrape_part. The following lines seem a bit suspicious, but I don't understand the code well enough to say for sure whether any of these is a problem:

  • d = choice(distributors) is outside the synchronization, i.e. the distributor is chosen without locking the list
  • distributors.remove(d) is also outside the lock, so the distributor is also removed without locking.
  • I don't see what prevents two threads from picking the same distributor, as the removal happens only after a lengthy delay where the actual scraping happens, and the lock is open. Unless I'm misunderstanding the code?
  • why is the lock acquired non-blocking, since the code just goes into a busy loop drawing different random distributors if the lock is not acquired?

However, fiddling around with these has not revealed anything, nor fixed the problem. So right now I'm basically just shotgun debugging.
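For reference, here is a minimal sketch of the kind of blocking, per-distributor throttle the points above are circling around. The names throttle_lock, throttle_timeouts and the per-distributor delay follow the ones used in scrape_part, but the helper function itself is hypothetical and only illustrates the pattern, assuming throttle_timeouts is a dict shared between the workers (e.g. a Manager().dict() under multiprocessing):

from time import time, sleep

def wait_for_throttle(d, throttle_lock, throttle_timeouts, delay):
    """Block until distributor `d` may be scraped again, then reserve the next slot."""
    while True:
        with throttle_lock:  # blocking acquire instead of the non-blocking busy loop
            now = time()
            if throttle_timeouts[d] <= now:
                # Reserve the next window while still holding the lock, so two
                # workers cannot claim the same slot for this distributor.
                throttle_timeouts[d] = now + delay
                return
            wait = throttle_timeouts[d] - now
        sleep(wait)  # sleep outside the lock so workers for other distributors can run

The point of the sketch is only that the check and the reservation of the next timeout happen inside one lock acquisition, and the waiting happens outside it.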

@timoalho (Contributor, Author) commented Feb 6, 2018

Please ignore much of my last comment, I only now realized that distributors is not shared between threads...

@timoalho (Contributor, Author) commented Feb 6, 2018

I tried reverting the throttling implementation, but it doesn't seem to help; it still hangs. That is, the main loop of scrape_part I have is now (note that almost everything is commented out):

# Scrape the part data from each distributor website or the local HTML.
# Create a list of the distributor keys and randomly choose one of the
# keys to scrape. After scraping, remove the distributor key.
# Do this until all the distributors have been scraped.
# distributors = list(distributor_dict.keys())
# while distributors:
for d in distributor_dict:

    # d = choice(distributors)  # Randomly choose one of the available distributors.

    try:
        # dist_module = getattr(THIS_MODULE, d)
        dist_module = dist_modules[d]
    except AttributeError:
        dist_module = dist_modules[distributor_dict[d]['module']]

    # Try to access the list of distributor throttling timeouts.
    # Abort if some other process is already using the timeouts.
    # if throttle_lock.acquire(blocking=False):

        # Check the throttling timeout for the chosen distributor to see if
        # another access to its website is allowed.
        # if throttle_timeouts[d] <= time():

            # Update the timeout for this distributor website and release the sync. lock.
            # throttle_timeouts[d] = time() + distributor_dict[d]['throttling_delay']
            # throttle_lock.release()

    # Get the HTML tree for the part.
    html_tree, url[d] = get_part_html_tree(part, d, dist_module.get_part_html_tree, local_part_html, scrape_retries, scrape_logger)

    # Call the functions that extract the data from the HTML tree.
    part_num[d] = dist_module.get_part_num(html_tree)
    qty_avail[d] = dist_module.get_qty_avail(html_tree)
    price_tiers[d] = dist_module.get_price_tiers(html_tree)

    # The part data has been scraped from this distributor, so remove it from the list.
    # distributors.remove(d)

        # If the timeout for this distributor has not expired, then release
        # the sync. lock and try another distributor.
        #else:
        #    throttle_lock.release()

# Return the part data.
return id, url, part_num, price_tiers, qty_avail

The code still hangs.

@hildogjr (Owner) commented Feb 7, 2018

I am assigning this to @xesscorp to help in the discussion, because this was coded by @devbisme and I don't yet completely understand how DummyLock() and the throttle_timeouts work.

@hildogjr (Owner) commented Feb 7, 2018

I agree, @timoalho, it may be some race condition. Sometimes, when I am debugging new features, I run more than one KiCost (main) instance at a time and get this error. After running each file one at a time, I didn't.

@xesscorp (Collaborator) commented:

I'm not seeing any hanging with test3.xml. I've run it with the latest KiCost under Python 2.7 and 3.6 on Windows 7. I've tried it with throttling both on and off with no change.

test.xml is throwing an error because something has changed in the code that handles custom part data, as described in the usage section of the manual.

@hildogjr (Owner) commented:

So, is it a false-positive error? (Because, yes, on Ubuntu 16.04 with Python 2.7 and 3, test3.xml gives no error.)

@xesscorp (Collaborator) commented:

I don't know if it's a false error; I think it's an error that may only occur on OS X. This is not the first time I've seen something like that with OS X. I'm not sure whether the error is in KiCost or within some Python library that OS X is using.
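For what it's worth (this is an assumption, not a confirmed diagnosis): on Python 3.6 the multiprocessing default start method on OS X is fork, and forking a parent that holds locks or runs helper threads is a classic source of deadlocks that only show up on that platform. A quick experiment would be to force the spawn start method before the pool is created, e.g.:

import multiprocessing

if __name__ == '__main__':
    # 'spawn' starts fresh interpreters instead of forking the parent process,
    # which sidesteps fork-safety problems with locks and native libraries.
    multiprocessing.set_start_method('spawn')
    # ... then create the Pool and run the scraping as usual ...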

@hildogjr (Owner) commented Feb 15, 2018

@timoalho, could you attach a kicost --debug 8 run (obsessive debug messages) to give us some clue?
About the other issue, I found the source: it is related to create_local_part_html(). The function is including a 'mine' distributor.

@timoalho (Contributor, Author) commented:

Sure. Attached is a log from kicost -i test3.xml --debug 8 > issue174.log. In about a minute or two, it got to

Progress:  59%|██████████████████▍            | 19/32 [01:03<00:30,  2.34s/part] 

and then hung. I waited for an hour, with no further progress, after which I interrupted the run with CTRL + C (note that this seems to have generated quite a lot of output at the end of the log).

issue174.log

@timoalho (Contributor, Author) commented:

Here's another run, but with the log taken before pressing CTRL+C, to reduce clutter:
issue174_run2.log

@hildogjr self-assigned this on Feb 20, 2018
@hildogjr (Owner) commented Feb 20, 2018

Thanks @timoalho. Could you now run it in a sequential way (-s or np 1)?
kicost -i %file% --debug 8 -s

Your log is reporting a lot of failures, which appear to come from the Farnell submodule, but I will know better if the messages from the different threads are not mixed together (that is why I ask for -s; I forgot to ask for this before).

Please also check excluding Farnell and Newark from the scrape: kicost -i %file% --exclude farnell newark.

@xesscorp, note that the part numbers taken from the page tables contain a lot of \t, and this happens in more than one distributor submodule, but mainly in Farnell and Newark; could that create some complication when matching? (Solution: fix these two modules and add a pn.strip() in all submodules.)
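A minimal sketch of the kind of cleanup meant by that last point, assuming pn is whatever string a distributor submodule pulled out of the page table (the helper name is made up):

import re

def clean_part_num(pn):
    """Collapse internal tabs/newlines and trim a part number scraped from an HTML table."""
    if pn is None:
        return None
    return re.sub(r'\s+', ' ', pn).strip()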

@timoalho (Contributor, Author) commented:

Sorry about the delay, but here's finally the log from a sequential run:

issue174_seq.log

@hildogjr (Owner) commented:

@timoalho, so after the last part scraped (U9), KiCost just hangs for no apparent reason (even in sequential mode)?
I took a look and didn't find anything strange in the log (yet).

@xesscorp (Collaborator) commented:

I just tried another run of test3.xml using the latest pull from the kicost master branch. I didn't get any stalls or other strange behavior running on Windows 7 Professional with Python 3.6.4. As for the results, everything but RS and TME seemed pretty well populated.

@timoalho (Contributor, Author) commented Apr 3, 2018

I just tried the most recent version, with the same result: this time it hung at 17/27, and I aborted after about half an hour. The log was taken just before pressing CTRL+C, to avoid having it littered with messages from the abort.

The CLI command was again:

kicost -i test3.xml --debug 8 > issue174_180403.log

issue174_180403.log

@hildogjr (Owner) commented Apr 3, 2018

Could you test in serial (not parallel) mode?

kicost -i test3.xml --debug 8 --serial > issue174_180403.log

@timoalho (Contributor, Author) commented Apr 4, 2018

Sure! This time the serial version hung too, at 8/27; I aborted after several hours. The command was:

kicost -i test3.xml --debug 8 --serial > issue174_ser_180404.log

I again grabbed the log before pressing CTRL + C.

issue174_ser_180404.log

@timoalho (Contributor, Author) commented Apr 5, 2018

I also tried running with

kicost -i test3.xml --debug 8 --serial --exclude farnell newark > issue174_ser_180405.log

i.e. excluding farnell and newark, and this time it finished (I don't know how long it took, though, as I left it running when leaving work).

I'll retry the parallel version with the exclusions, and also just to be sure try running the serial version without the exclusions, to see if the problem is intermittent.

issue174_ser_180405.log

@timoalho (Contributor, Author) commented Apr 5, 2018

Parallel with exclusions, hung at 12/27 after about an hour:

kicost -i test3.xml --debug 8 --exclude farnell newark > issue174_excl_180405.log

issue174_excl_180405.log

You might also be interested in the console output after pressing CTRL + C, as that seems to tell us where some of the threads were waiting:

    kicost -i test3.xml --debug 8 --exclude farnell newark > issue174_ser.log
Progress:  44%|████████████████▍                    | 12/27 [00:49<00:42,  2.86s/part]^CProcess ForkPoolWorker-66:
Process ForkPoolWorker-58:
Process ForkPoolWorker-73:
Process ForkPoolWorker-74:
Process ForkPoolWorker-64:
Process ForkPoolWorker-63:
Process ForkPoolWorker-75:
Process ForkPoolWorker-72:
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Process ForkPoolWorker-71:
Process ForkPoolWorker-36:
Process ForkPoolWorker-57:
Process ForkPoolWorker-40:
Process ForkPoolWorker-70:
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semProcess ForkPoolWorker-62:
Process ForkPoolWorker-65:
Process ForkPoolWorker-68:
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Process ForkPoolWorker-53:
Traceback (most recent call last):
  File "/usr/local/bin/kicost", line 11, in <module>
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
    load_entry_point('kicost==0.1.42', 'console_scripts', 'kicost')()
  File "/usr/local/lib/python3.6/site-packages/kicost-0.1.42-py3.6.egg/kicost/__main__.py", line 275, in main
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
    local_currency=args.currency)
  File "/usr/local/lib/python3.6/site-packages/kicost-0.1.42-py3.6.egg/kicost/kicost.py", line 297, in kicost
Process ForkPoolWorker-67:
    while(not r.ready()):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 628, in ready
    return self._event.is_set()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 342, in get
    res = self._reader.recv_bytes()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/connection.py", line 407, in _rProcess ForkPoolWorker-69:
ecv_bytes
    buf = self._recv(4)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Process ForkPoolWorker-49:
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Process ForkPoolWorker-35:
lock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Process ForkPoolWorker-59:
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Process ForkPoolWorker-55:
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Process ForkPoolWorker-51:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Process ForkPoolWorker-34:
Process ForkPoolWorker-39:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 341, in get
    with self._rlock:
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt

@timoalho (Contributor, Author) commented Apr 5, 2018

@hildogjr Rerunning the serial version, kicost indeed finished in 42 minutes:

kicost -i test3.xml --debug 8 --serial > issue174_ser_180405.log
Progress: 100%|████████████████████████████████████| 27/27 [42:27<00:00, 122.26s/part]

issue174_ser_180405.log

@hildogjr (Owner) commented Apr 11, 2018

I was insisting because I have no access to a Mac, and creating a virtual machine for it is quite complicated.
The serial version is really slow, but now I know that the problem is related to the parallel scrape. It appears that the lock behaves differently between operating systems.

@hildogjr changed the title from "Kicost hangs when scraping" to "Kicost hangs when parallel scraping on Mac-OS" on Apr 11, 2018
@mmmaisel (Contributor) commented Jun 1, 2018

My refactoring branch reimplements the whole locking mechanism. Maybe this issue will be resolved in the process.

I can try on a Mac after the refactoring has stabilized.

@mmmaisel (Contributor) commented Jun 4, 2018

I tested my refactored version on a Mac and it works now.

@hildogjr (Owner) commented Jun 4, 2018

@timoalho, please, if it is not asking too much, could you confirm the refactored version at https://github.com/mmmaisel/KiCost before we merge it into the main branch?

@timoalho (Contributor, Author) commented Jun 6, 2018

@hildogjr sure, no problem! I'll try to get around to doing that tomorrow.

@timoalho (Contributor, Author) commented Jun 7, 2018

The @mmmaisel refactored branch seems to work!! The number of parts in test3.xml seems to be 162 now (previously the progress indicator showed xx/27, now it shows xx/162); is this actually the number of parts times the number of distributors, or what has increased?

Anyway, here's the delightful final result:

Progress: 100%|█████████████████████████████| 162/162 [10:35<00:00, 18.58s/part]

Big thanks to @hildogjr , @xesscorp and @mmmaisel for persevering with this, some of the features in the new version are really useful to me, but without the parallel scraping it was unusably slow :)
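(A quick check of that guess, which is an inference rather than something confirmed in the thread: 162 / 27 = 6, which would match six distributor modules being scraped per part, so the new total does look like parts × distributors.)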

@timoalho (Contributor, Author) commented Jun 7, 2018

...and also the BOMs for my project, in which I originally discovered the problem, work now!

A very minor gripe: when using multiple BOMs, the output file is written to the directory where the first BOM is located, whereas I'd have expected it to be placed in the current directory. Took me a while to find it!

@hildogjr (Owner) commented Jun 7, 2018

Thanks for the opinion, @timoalho; since the implementation of this feature you are the first user to give us feedback.
As for the other issues, such as the parallel scrape and the counter (I now think it is counting distributors*parts), I will check before releasing on PyPI.

@hildogjr closed this as completed on Jun 7, 2018