Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vCard: Use sabre/vobject library #9641

Open
respiranto opened this issue Sep 19, 2024 · 9 comments
Open

vCard: Use sabre/vobject library #9641

respiranto opened this issue Sep 19, 2024 · 9 comments

Comments

@respiranto
Copy link
Contributor

The vCard code currently has its own custom vCard parser. There are many issues with the current code and I deemed it better to use an external library, sabre/vobject being the apparently best choice.

See in particular the bug report #9593.

This may also help to address #9590.

@respiranto
Copy link
Contributor Author

@respiranto
Copy link
Contributor Author

respiranto commented Sep 19, 2024

Question 1: How should / can I report errors on parsing?

  • (a) When loading an individual vCard (constructor / load(); should generally be from the database),
  • (b) When import()ing (generally from a user provided vCard file).

For (b), I'd like to cause a user-visible message like "Failed importing $person_name", "Failed importing record without name or e-mail address ($record_data)", "Failed importing vCard due to unsupported encoding".

@respiranto
Copy link
Contributor Author

Question 2 (backslash handling): Would it be OK to break #4189 (and Roundcube\Tests\Framework\VCardTest::test_parse_five)?

  • In v3.0 and v4.0, backslashes that are not explicitly permitted as part of an escape sequence are generally illegal.
  • In v2.1, backslashes should apparently generally be treated like any other character.
  • The current code drops stray backslashes (e.g., \: becomes :).
    • This seems to be invalid for v2.1. (For v3.0 and v4.0, the input is illegal.)
  • sabre/vobject seems to parse stray (invalid) backslashes as themselves.
    • Therefore, I cannot distinguish them from propely (v3.0/v4.0)-escaped backslashes (\\), except by working on the raw input (which I very much would like to avoid).
    • Sidenote: sabre/vobject seems to incorrectly parse v3.0/v4.0 escape sequences for v2.1.

@respiranto
Copy link
Contributor Author

Question 3 (charset support): Would it be OK to reduce the set of supported charsets to that supported by sabre/vobject?

  • This would notably remove support for UTF-16, breaking Roundcube\Tests\Framework\VCardTest::test_import.
  • sabre/vobject supports UTF-8, ISO-8859-1, Windows-1252.

@respiranto
Copy link
Contributor Author

respiranto commented Sep 21, 2024

Suggestion 4 (charset support): Restrict document charset to (and assume) UTF-8.

  • The CHARSET parameter would still be respected (for v2.1; v3.0 and v4.0 do not define it).
  • Reason:
    • We can drop the detect_charset() heuristic, which may give wrong results.
  • Justification:
    • v4.0 mandates UTF-8. - RFC 6350#3.1
    • v3.0:
      • "Character set can only be specified on the CHARSET parameter on the Content-Type MIME header field." - RFC 2426#5
      • "UTF-8 preferred" - RFC 2426#4
      • It seems v3.0 is supposed to have a mime header, which does not seem to be the case here.
    • v2.1:
      • "The default character set is ASCII." - vCard 2.1 (section 2)
        • I see no harm in extending this to UTF-8.
      • "Some transports (e.g., MIME based electronic mail) may also provide a character set property at the transport wrapper level." - vCard 2.1 (section 2)
      • Again, we do not have a MIME header or similar AFAIK.
  • See also Question 3.

@alecpl
Copy link
Member

alecpl commented Sep 21, 2024

  1. I would postpone error handling to later, after the whole migration is done. I guess we can do something like rcube_addressbook::set_error() and friends.
  2. It would be good to verify if GMail is still doing that. It would be good to be able to import from GMail without issues.
  3. I think we could do UTF-8 only, but we probably should investigate what GMail and others use, so we can import data from them without much hassle.
  4. Where is question 4?

@respiranto
Copy link
Contributor Author

  1. OK
  2. I'll check that. Otherwise, if we assume UTF-8, removing illegal backslashes should actually not be too difficult.
  3. Any particular other software or providers that should be tested? Next to GMail, I'd definitely test Thunderbird.
  4. Reference should have been to Question 3. Corrected.

@respiranto
Copy link
Contributor Author

Testing:

  • GMail: v3.0, UTF-8, replaces : by \:.
  • Thunderbird: v4.0, UTF-8, : unchanged, \: correctly encoded as \\:.

GMail example:

BEGIN:VCARD
VERSION:3.0
FN:Jane Doé
N:Doé;Jane;;;
item1.URL:https\://example.org/
item1.X-ABLabel:
NOTE:noté\\\:foo\:bar
CATEGORIES:myContacts
END:VCARD

Thunderbird example:

BEGIN:VCARD
VERSION:4.0
N:Doé;Jane;;;
FN:Jane Doé
NOTE:noté:foo\\:bar
UID:fdc25d89-21a0-4069-a93f-2d1a13a6260c
URL:http://exampe.org
END:VCARD

@respiranto
Copy link
Contributor Author

Note that I have found 15 issues (mostly bugs) for sabre/vobject so far, all of which we should be able to work with, some of which may require workarounds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants