Mercurial error on Update #1

Open
Ellerbrok opened this issue Feb 16, 2024 · 4 comments

Comments

@Ellerbrok

Hi there,

There seems to be an issue with UTF-8 here. After installing the extension in Mercurial, I get the following error message when I try to "Update" to a newer revision of my repository.

** Mercurial version (6.4.2).  TortoiseHg version (6.4.2)
** Command: 
** CWD: C:\Program Files\TortoiseHg
** Encoding: cp1252
** Extensions loaded: mercurial_keyring unknown, rebase, strip, tortoisehg.util.configitems, win32lfn
** Python version: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]
** Windows version: sys.getwindowsversion(major=6, minor=2, build=9200, platform=2, service_pack='')
** Processor architecture: x64
** Qt-5.15.2 PyQt-5.15.7 QScintilla-2.13.3
Traceback (most recent call last):
  File "tortoisehg\hgqt\cmdui.pyc", line 649, in runCommand
  File "tortoisehg\hgqt\update.pyc", line 398, in runCommand
  File "tortoisehg\hgqt\update.pyc", line 342, in isclean
  File "mercurial\context.pyc", line 1460, in modified
  File "mercurial\util.pyc", line 1760, in __get__
  File "mercurial\context.pyc", line 1425, in _status
  File "mercurial\localrepo.pyc", line 3388, in status
  File "mercurial\context.pyc", line 432, in status
  File "mercurial\context.pyc", line 2001, in _buildstatus
  File "mercurial\context.pyc", line 1906, in _dirstatestatus
  File "mercurial\dirstate.pyc", line 1681, in status
  File "mercurial\dirstate.pyc", line 1505, in walk
  File "mercurial\windows.pyc", line 599, in statfiles
  File "C:/Program Files/TortoiseHg/win32lfn.py", line 116, in fn
    path = stringtobytes(uncabspath(args[0]))
  File "C:/Program Files/TortoiseHg/win32lfn.py", line 97, in uncabspath
    path = bytestostring(path)
  File "C:/Program Files/TortoiseHg/win32lfn.py", line 377, in bytestostring
    string = string.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdc in position 43: invalid continuation byte
@Ellerbrok
Author

I also tried the most recent Mercurial version:

** Mercurial version (6.5.1).  TortoiseHg version (6.5.1)
** Command: 
** CWD: C:\Program Files\TortoiseHg
** Encoding: cp1252
** Extensions loaded: mercurial_keyring unknown, rebase, strip, tortoisehg.util.configitems, win32lfn
** Python version: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]
** Windows version: sys.getwindowsversion(major=6, minor=2, build=9200, platform=2, service_pack='')
** Processor architecture: x64
** Qt-5.15.2 PyQt-5.15.7 QScintilla-2.13.3
Traceback (most recent call last):
  File "tortoisehg\hgqt\cmdui.pyc", line 649, in runCommand
  File "tortoisehg\hgqt\update.pyc", line 398, in runCommand
  File "tortoisehg\hgqt\update.pyc", line 342, in isclean
  File "mercurial\context.pyc", line 1460, in modified
  File "mercurial\util.pyc", line 1760, in __get__
  File "mercurial\context.pyc", line 1425, in _status
  File "mercurial\localrepo.pyc", line 3408, in status
  File "mercurial\context.pyc", line 432, in status
  File "mercurial\context.pyc", line 2001, in _buildstatus
  File "mercurial\context.pyc", line 1906, in _dirstatestatus
  File "mercurial\dirstate.pyc", line 1681, in status
  File "mercurial\dirstate.pyc", line 1505, in walk
  File "mercurial\windows.pyc", line 599, in statfiles
  File "C:/Program Files/TortoiseHg/win32lfn.py", line 116, in fn
    path = stringtobytes(uncabspath(args[0]))
  File "C:/Program Files/TortoiseHg/win32lfn.py", line 97, in uncabspath
    path = bytestostring(path)
  File "C:/Program Files/TortoiseHg/win32lfn.py", line 377, in bytestostring
    string = string.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdc in position 43: invalid continuation byte

@Clonkex
Owner

Clonkex commented Feb 16, 2024

Hmm. I guess something isn't encoded as utf-8 in your repo that was in mine 😕 I wonder if it could be related to being on Windows 8 🤔 What happens if you change line 377 from string = string.decode('utf-8') to string = string.decode('latin-1')? It's been a while since I worked on this and I never fully understood it to begin with so I can't say whether that's likely to work, but it's worth a shot.

If that works, or if that at least changes the error, we might need to change that part to try decoding as utf-8 and if that fails decode as something else. Or maybe Python has a way to properly detect the encoding of a string, if such a thing is possible. I'm not actually sure what data is being passed to that function, so it's a bit tricky to know what it should be doing exactly.

Or, if you have a Windows 10 box, it might be worth testing whether your repo and this extension work there. My suspicion is that Windows 10 may be handling things as Unicode where Windows 8 still returned directory listings in older encodings, or something along those lines.
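
To make the "try UTF-8, then fall back" idea concrete, here is a minimal sketch. The path and the `decode_path` helper are hypothetical and not part of win32lfn.py; the cp1252 fallback is an assumption taken from the "Encoding: cp1252" line in the log. In cp1252 and latin-1 the byte 0xdc is "Ü", which would be consistent with a file name containing that character, although the thread does not confirm it. As far as I know the standard library has no reliable encoding detector, so a fixed fallback is used here.

```python
# Hypothetical path: "Ü" stored as the single cp1252 byte 0xDC.
raw = "C:\\repo\\Übersicht.txt".encode("cp1252")

# Decoding as UTF-8 fails the same way as in the traceback: 0xDC announces
# a two-byte sequence, but the next byte is not a continuation byte.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc


def decode_path(data, fallback="cp1252"):
    """Try UTF-8 first, then the fallback encoding (assumed: cp1252)."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # errors="replace" keeps this from raising on bytes cp1252 leaves undefined.
        return data.decode(fallback, errors="replace")


decoded = decode_path(raw)
print(error.reason)
print(decoded)
```

UTF-8 is tried first because it is strict: bytes that are not UTF-8 usually fail to decode, whereas a single-byte codec such as cp1252 accepts almost anything and would silently misread real UTF-8.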

@Ellerbrok
Author

Ellerbrok commented Feb 16, 2024

Hi, in fact this is Windows 11. Maybe something in the repository is UTF-16?

I found something that might help, but I have not tested it in the .py file because I have no experience with Python.

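import os

def force_decode(string, codecs=['utf8', 'cp1252', 'latin-1', 'utf16']):
    # Return the result of the first codec that decodes without error
    for i in codecs:
        try:
            return string.decode(i)
        except UnicodeDecodeError:
            pass

# rootPath is assumed to be defined elsewhere
for item in os.listdir(rootPath):
    # Convert to Unicode (on Python 3 only bytes need decoding)
    if isinstance(item, bytes):
        item = force_decode(item)
    print(item)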

@Clonkex
Owner

Clonkex commented Feb 16, 2024

How strange! The log is reporting Windows 8 (version 6.2 is Windows 8, as is build 9200).

Ok, try changing this bit at line 375:

def bytestostring(string):
    if isinstance(string, bytes):
        string = string.decode('utf-8')
    return string

...to this:

def bytestostring(string):
    if isinstance(string, bytes):
        string = force_decode(string)
    return string

def force_decode(string, codecs=['utf8', 'cp1252', 'latin-1', 'utf16' ]):
    for i in codecs:
        try:
            return string.decode(i)
        except UnicodeDecodeError:
            pass

...and see if that helps. I have no real experience in Python except for the occasional Blender script so I can't say if this is right, but I think it should work.
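
One detail about the codec list is worth spelling out: 'latin-1' maps every possible byte value, so it never raises, and any codec listed after it (here 'utf16') is never tried. The sketch below is a variant of `force_decode` under that observation, not the extension's actual code; the sample byte strings are made up for illustration.

```python
def force_decode(data, codecs=("utf-8", "cp1252", "latin-1")):
    """Decode bytes with the first codec in `codecs` that accepts them."""
    for codec in codecs:
        try:
            return data.decode(codec)
        except UnicodeDecodeError:
            continue
    # Not reachable while "latin-1" is in the list; guards against an edited list.
    raise ValueError("none of %r could decode %r" % (codecs, data))


utf8_name = b"\xc3\x9cber"   # "Über" encoded as UTF-8
cp1252_name = b"\xdcber"     # "Über" encoded as cp1252
odd_name = b"\x81"           # invalid UTF-8 and undefined in cp1252

print(force_decode(utf8_name))
print(force_decode(cp1252_name))
print(repr(force_decode(odd_name)))  # only latin-1 accepts this byte

# Re-encoding as UTF-8 does not give back the original cp1252 bytes.
print(force_decode(cp1252_name).encode("utf-8") == cp1252_name)
```

The last line matters if the decoded path is later turned back into bytes. The traceback shows the result being passed to `stringtobytes`, whose body is not shown in this thread. If it encodes as UTF-8, a name that was decoded via the cp1252 fallback would come back as different bytes than Mercurial passed in, so that function may need a matching change.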
