Error when checking specific link #750

Closed
martindholmes opened this issue Apr 18, 2018 · 2 comments

Comments

@martindholmes

I'm using LinkChecker 9.3 on Ubuntu 17.10, and I reliably get this error with the following URL (output produced with -Dall):

linkchecker -Dall http://mythologica.fr/grec/heracles0.htm 
DEBUG 2018-04-18 10:39:32,091 MainThread Python 2.7.14 (default, Sep 23 2017, 22:06:14) 
[GCC 7.2.0] on linux2
DEBUG 2018-04-18 10:39:32,091 MainThread reading configuration from ['/home/mholmes/.linkchecker/linkcheckerrc']
INFO 2018-04-18 10:39:32,094 MainThread Checking intern URLs only; use --check-extern to check extern URLs.
DEBUG 2018-04-18 10:39:32,100 MainThread configuration: [('aborttimeout', 300),
 ('allowedschemes', []),
 ('authentication', []),
 ('blacklist', {}),
 ('checkextern', False),
 ('cookiefile', None),
 ('csv', {}),
 ('debugmemory', False),
 ('dot', {}),
 ('enabledplugins', []),
 ('externlinks', []),
 ('fileoutput', []),
 ('gml', {}),
 ('gxml', {}),
 ('html', {}),
 ('ignorewarnings', []),
 ('internlinks', []),
 ('localwebroot', None),
 ('logger', 'TextLogger'),
 ('loginextrafields', {}),
 ('loginpasswordfield', 'password'),
 ('loginurl', None),
 ('loginuserfield', 'login'),
 ('maxfilesizedownload', 5242880),
 ('maxfilesizeparse', 1048576),
 ('maxhttpredirects', 10),
 ('maxnumurls', None),
 ('maxrequestspersecond', 10),
 ('maxrunseconds', None),
 ('nntpserver', None),
 ('none', {}),
 ('output', 'text'),
 ('pluginfolders', []),
 ('proxy', {}),
 ('quiet', False),
 ('recursionlevel', -1),
 ('sitemap', {}),
 ('sql', {}),
 ('sslverify', True),
 ('status', True),
 ('status_wait_seconds', 5),
 ('text', {}),
 ('threads', 10),
 ('timeout', 60),
 ('trace', False),
 ('useragent',
  u'Mozilla/5.0 (compatible; LinkChecker/9.3; +http://wummel.github.io/linkchecker/)'),
 ('verbose', False),
 ('warnings', True),
 ('xml', {})]
DEBUG 2018-04-18 10:39:32,100 MainThread HttpUrl handles url http://mythologica.fr/grec/heracles0.htm
DEBUG 2018-04-18 10:39:32,100 MainThread checking syntax
DEBUG 2018-04-18 10:39:32,101 MainThread Add intern pattern u'^https?://(www\\.|)mythologica\\.fr\\/grec'
DEBUG 2018-04-18 10:39:32,101 MainThread Link pattern u'^https?://(www\\.|)mythologica\\.fr\\/grec' strict=False
DEBUG 2018-04-18 10:39:32,101 MainThread queueing http://mythologica.fr/grec/heracles0.htm
LinkChecker 9.3              Copyright (C) 2000-2014 Bastian Kleineidam
LinkChecker comes with ABSOLUTELY NO WARRANTY!
This is free software, and you are welcome to redistribute it
under certain conditions. Look at the file `LICENSE' within this
distribution.
Get the newest version at http://wummel.github.io/linkchecker/
Write comments and bugs to https://github.com/wummel/linkchecker/issues
Support this project at http://wummel.github.io/linkchecker/donations.html

Start checking at 2018-04-18 10:39:32-007
DEBUG 2018-04-18 10:39:32,104 CheckThread-http://mythologica.fr/grec/heracles0.htm Checking http link
base_url=u'http://mythologica.fr/grec/heracles0.htm'
parent_url=None
base_ref=None
recursion_level=0
url_connection=None
line=0
column=0
page=0
name=u''
anchor=u''
cache_url=http://mythologica.fr/grec/heracles0.htm
DEBUG 2018-04-18 10:39:32,104 CheckThread-http://mythologica.fr/grec/heracles0.htm checking connection
 1 thread active,     0 links queued,    0 links in   0 URLs checked, runtime 1 seconds
DEBUG 2018-04-18 10:39:33,111 CheckThread-http://mythologica.fr/grec/heracles0.htm u'http://mythologica.fr/robots.txt' parse lines
DEBUG 2018-04-18 10:39:33,111 CheckThread-http://mythologica.fr/grec/heracles0.htm Parsed rules:
User-agent: *
Allow: /
DEBUG 2018-04-18 10:39:33,112 CheckThread-http://mythologica.fr/grec/heracles0.htm u'http://mythologica.fr/robots.txt' check allowance for:
  user agent: u'Mozilla/5.0 (compatible; LinkChecker/9.3; +http://wummel.github.io/linkchecker/)'
  url: u'http://mythologica.fr/grec/heracles0.htm' ...
DEBUG 2018-04-18 10:39:33,112 CheckThread-http://mythologica.fr/grec/heracles0.htm /grec/heracles0.htm Allow: / True
DEBUG 2018-04-18 10:39:33,112 CheckThread-http://mythologica.fr/grec/heracles0.htm  ... rule line Allow: /
DEBUG 2018-04-18 10:39:33,113 CheckThread-http://mythologica.fr/grec/heracles0.htm Prepare request with {'headers': {}, 'url': u'http://mythologica.fr/grec/heracles0.htm', 'method': 'GET'}
DEBUG 2018-04-18 10:39:33,114 CheckThread-http://mythologica.fr/grec/heracles0.htm Send request with {'verify': False, 'timeout': 60, 'stream': True, 'allow_redirects': False}
DEBUG 2018-04-18 10:39:33,284 CheckThread-http://mythologica.fr/grec/heracles0.htm follow all redirections
DEBUG 2018-04-18 10:39:33,456 CheckThread-http://mythologica.fr/grec/heracles0.htm Redirected to u'https://mythologica.fr/grec/heracles0.htm'
DEBUG 2018-04-18 10:39:33,456 CheckThread-http://mythologica.fr/grec/heracles0.htm Intern URL u'https://mythologica.fr/grec/heracles0.htm'
DEBUG 2018-04-18 10:39:33,456 CheckThread-http://mythologica.fr/grec/heracles0.htm task_done https://mythologica.fr/grec/heracles0.htm


********** Oops, I did it again. *************

You have found an internal error in LinkChecker. Please write a bug report
at https://github.com/wummel/linkchecker/issues
and include the following information:
- the URL or file you are testing
- the system information below

When using the commandline client:
- your commandline arguments and any custom configuration files.
- the output of a debug run with option "-Dall"

Not disclosing some of the information above due to privacy reasons is ok.
I will try to help you nonetheless, but you have to give me something
I can work with ;) .

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/linkcheck/director/checker.py", line 104, in check_url
    line: self.check_url_data(url_data)
    locals:
      self = <local> <Checker(CheckThread-http://mythologica.fr/grec/heracles0.htm, started 140603819489024)>
      self.check_url_data = <local> <bound method Checker.check_url_data of <Checker(CheckThread-http://mythologica.fr/grec/heracles0.htm, started 140603819489024)>>
      url_data = <local> <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>
  File "/usr/lib/python2.7/dist-packages/linkcheck/director/checker.py", line 120, in check_url_data
    line: check_url(url_data, self.logger)
    locals:
      check_url = <global> <function check_url at 0x7fe0e55c9ed8>
      url_data = <local> <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>
      self = <local> <Checker(CheckThread-http://mythologica.fr/grec/heracles0.htm, started 140603819489024)>
      self.logger = <local> <linkcheck.director.logger.Logger object at 0x7fe0e4ea7550>
  File "/usr/lib/python2.7/dist-packages/linkcheck/director/checker.py", line 52, in check_url
    line: url_data.check()
    locals:
      url_data = <local> <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>
      url_data.check = <local> <bound method HttpUrl.check of <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>>
  File "/usr/lib/python2.7/dist-packages/linkcheck/checker/urlbase.py", line 424, in check
    line: self.local_check()
    locals:
      self = <local> <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>
      self.local_check = <local> <bound method HttpUrl.local_check of <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>>
  File "/usr/lib/python2.7/dist-packages/linkcheck/checker/urlbase.py", line 442, in local_check
    line: self.check_connection()
    locals:
      self = <local> <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>
      self.check_connection = <local> <bound method HttpUrl.check_connection of <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>>
  File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 137, in check_connection
    line: self.follow_redirections(request)
    locals:
      self = <local> <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>
      self.follow_redirections = <local> <bound method HttpUrl.follow_redirections of <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>>
      request = <local> <PreparedRequest [GET]>
  File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 263, in follow_redirections
    line: self._add_ssl_info()
    locals:
      self = <local> <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>
      self._add_ssl_info = <local> <bound method HttpUrl._add_ssl_info of <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>>
  File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 193, in _add_ssl_info
    line: sock = self._get_ssl_sock()
    locals:
      sock = <not found>
      self = <local> <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>
      self._get_ssl_sock = <local> <bound method HttpUrl._get_ssl_sock of <https link, base_url=u'http://mythologica.fr/grec/heracles0.htm', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=http://mythologica.fr/grec/heracles0.htm>>
  File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 184, in _get_ssl_sock
    line: if raw_connection.sock is None:
    locals:
      raw_connection = <local> None
      raw_connection.sock = <local> !AttributeError: 'NoneType' object has no attribute 'sock'
      None = <builtin> None
AttributeError: 'NoneType' object has no attribute 'sock'
System info:
LinkChecker 9.3
Released on: 16.7.2014
Python 2.7.14 (default, Sep 23 2017, 22:06:14) 
[GCC 7.2.0] on linux2
Requests: 2.18.1
Qt: 4.8.7 / PyQt: 4.11.4
Modules: Sqlite, Gconf
Local time: 2018-04-18 10:39:33-007
sys.argv: ['/usr/bin/linkchecker', '-Dall', 'http://mythologica.fr/grec/heracles0.htm']

LANGUAGE = 'en_CA:en'
LANG = 'en_CA.UTF-8'
Default locale: ('en', 'UTF-8')

Statistics:
Downloaded: 0B.
No statistics available since no URLs were checked.

That's it. 0 links in 0 URLs checked. 0 warnings found. 0 errors found.
 Stopped checking at 2018-04-18 10:39:33-007 (1 seconds)
******** LinkChecker internal error, over and out ********
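
For anyone reading the traceback above: the failure is at `if raw_connection.sock is None:`, where `raw_connection` is itself `None`, so the attribute lookup raises before the check can run. Below is a minimal sketch of a defensive guard for that pattern. It is not LinkChecker's actual code: `FakeConnection`, `FakeRawResponse`, the `_connection` attribute name, and `get_ssl_sock` are hypothetical stand-ins used only to illustrate the idea.

```python
class FakeConnection(object):
    """Hypothetical stand-in for a low-level HTTP connection object."""

    def __init__(self, sock=None):
        self.sock = sock


class FakeRawResponse(object):
    """Hypothetical stand-in for a raw response that may have no connection."""

    def __init__(self, connection=None):
        # Assumed attribute name, chosen for illustration only.
        self._connection = connection


def get_ssl_sock(raw_response):
    """Return the underlying socket, or None when no connection is attached.

    Checking the connection object itself before touching `.sock` avoids
    the AttributeError shown in the traceback.
    """
    raw_connection = getattr(raw_response, "_connection", None)
    if raw_connection is None:
        # No connection attached, so there is no socket to inspect.
        return None
    return raw_connection.sock


# The unguarded access fails the same way as in the report:
try:
    None.sock
except AttributeError as exc:
    unguarded_error = str(exc)

# The guarded helper returns None instead of raising:
guarded_result = get_ssl_sock(FakeRawResponse(connection=None))
```

With a guard like this, the caller would have to treat a `None` socket as "no SSL information available" rather than as an internal error. Whether that is the right behaviour for LinkChecker is a decision for the maintainers.
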
@dpalic

dpalic commented Jun 12, 2018

Thank you for the issue report. Sadly, this project is dead; a new team now maintains LinkChecker at https://github.com/linkcheck/linkchecker
For more details, please see #708.
Please also close this issue and report it afresh on the new repo: https://github.com/linkcheck/linkchecker/issues

@martindholmes

Done!
