Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contacts import: Whitespace & vCard object separators #9593

Open
2 tasks done
respiranto opened this issue Aug 17, 2024 · 9 comments
Open
2 tasks done

Contacts import: Whitespace & vCard object separators #9593

respiranto opened this issue Aug 17, 2024 · 9 comments

Comments

@respiranto
Copy link
Contributor

respiranto commented Aug 17, 2024

Prerequisites

  • I have searched for duplicate or closed issues
  • I can recreate the issue with all plugins disabled

Describe the issue

This bug report describes several related bugs, related to

  • a) whitespace (space, tab) in imported vCard files, and
  • b) vCard object separators (BEGIN:VCARD, END:VCARD).

Bug 1: Leading whitespace in line continuation silently dropped

Example (note that vCard mandates CRLF as newline sequence):

BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;;
FN:Jane Doe
NOTE:an
  example
END:VCARD

The NOTE value is parsed as anexample instead of an example (only the first whitespace character should be dropped - see RFC 2426).

In particular, this means that a Roundcube export followed by a Roundcube import may silently fail to recreate the original data.


Note 2: Leading and trailing whitespace in a logical line is silently dropped

Any number of space/tab characters at the start or end of a logical line (or a component value, such as in N) is dropped.

This is not a bug, IMHO, given that Roundcube also strips surrounding whitespace when entering data via its web UI.

I think it might be related, however.


Note 3: Repeated BEGIN:VCARD: All ignored until last

If there are multiple BEGIN:VCARD lines before any END:VCARD line, all lines (not only BEGIN:VCARD lines) until the last of those BEGIN:VCARD lines are ignored.

I would say this is also not a bug, because such input is invalid. A warning or error message would certainly be nice, though.


Note 4: Repeated END:VCARD cause duplication

If a VCard object is terminated by more than one END:VCARD line, the entry is imported as often as there are END:VCARD lines.

If on import, one does not choose to "[r]eplace the entire address book", only one instance is imported, but with the note "Skipped (n-1) existing entries: [...]".

Any physical lines after the first END:VCARD that are neither BEGIN:VCARD nor END:VCARD are apparently ignored.

Again, not necessarily a bug, because any such input is of invalid syntax (but a warning or error message would be nice).


Bug 5: vCard object separators wrongly recognized in line continuations

If a physical line is of the form [ \t]+(BEGIN|END):VCARD, it is used as line continuation, but also recognized as vcard object start/end marker.

Example:

NOTE:example
 END:VCARD

This is treated the same as:

NOTE:exampleEND:VCARD
END:VCARD

Note: For BEGIN:VCARD, the use as line continuation can only be assumed, given that preceding lines are ignored (see Note 3).

While this bug may seem unlikely in practice, I actually witnessed it with real data, likely due to past import/export errors.


Bug 6: vCard object separators not parsed as logical lines

  • (6.1) Any logical line BEGIN:VCARD or END:VCARD that is broken into multiple physical lines using \r\n[ \t] is not recognized as such.
  • (6.2) On the other hand, if a physical line BEGIN:VCARD or END:VCARD is followed by a line continuation (i.e., a line starting with [ \t]), this is (incorrectly) recognized as the corresponding vCard object separator, and the line continuation is silently ignored--unless Bug 5 applies.

Example for (6.1):

BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;;
FN:Jane Doe
EMAIL:[email protected]
END:
 VCARD

The above example fails to import (and is instead attempted to be parsed as CSV--without success).

I acknowledge that this bug hardly occurs in practice. I found it while investigating the other bugs.


Originally reported at the Debian BTS as Bug 1078775, where I was asked to report the bug here instead. The bug report has slightly changed.

What browser(s) are you seeing the problem on?

Firefox

What version of PHP are you using?

v8.2

What version of Roundcube are you using?

master / 58721e3

JavaScript errors

No response

PHP errors

No response

@alecpl alecpl added this to the later milestone Aug 17, 2024
@johndoh
Copy link
Contributor

johndoh commented Aug 25, 2024

fixing the first part may be as simple as tightening up the regex pattern since the RFC states:

A logical line MAY be continued on the next physical line anywhere between two characters by inserting a CRLF immediately followed by a single white space character (space, ASCII decimal 32, or horizontal tab, ASCII decimal 9)

--- a/program/lib/Roundcube/rcube_vcard.php
+++ b/program/lib/Roundcube/rcube_vcard.php
@@ -681,7 +681,7 @@ class rcube_vcard
     private static function vcard_decode($vcard)
     {
         // Perform RFC2425 line unfolding and split lines
-        $vcard = preg_replace(["/\r/", "/\n\\s+/"], '', $vcard);
+        $vcard = preg_replace(["/\r/", "/\n\\s/", "/\n\\t/"], '', $vcard);
         $lines = explode("\n", $vcard);
         $result = [];

@respiranto
Copy link
Contributor Author

fixing the first part may be as simple as tightening up the regex pattern since the RFC states:

A logical line MAY be continued on the next physical line anywhere between two characters by inserting a CRLF immediately followed by a single white space character (space, ASCII decimal 32, or horizontal tab, ASCII decimal 9)

--- a/program/lib/Roundcube/rcube_vcard.php
+++ b/program/lib/Roundcube/rcube_vcard.php
@@ -681,7 +681,7 @@ class rcube_vcard
     private static function vcard_decode($vcard)
     {
         // Perform RFC2425 line unfolding and split lines
-        $vcard = preg_replace(["/\r/", "/\n\\s+/"], '', $vcard);
+        $vcard = preg_replace(["/\r/", "/\n\\s/", "/\n\\t/"], '', $vcard);
         $lines = explode("\n", $vcard);
         $result = [];

I believe the \s is incorrect in your suggestion, and should be replaced by a literal space character ( ).

I am not sure why \r is just dropped, but sticking with that, I'd suggest:

--- a/program/lib/Roundcube/rcube_vcard.php
+++ b/program/lib/Roundcube/rcube_vcard.php
@@ -681,7 +681,7 @@ class rcube_vcard
     private static function vcard_decode($vcard)
     {
         // Perform RFC2425 line unfolding and split lines
-        $vcard = preg_replace(["/\r/", "/\n\\s+/"], '', $vcard);
+        $vcard = str_replace(["\r", "\n ", "\n\t"], '', $vcard);
         $lines = explode("\n", $vcard);
         $result = [];
 

@pabzm
Copy link
Member

pabzm commented Sep 12, 2024

@respiranto Could you maybe put the suggested change into a pull request, so we can properly review and then merge it? That would be great! (If you have solutions for the other topics, please feel free to do the same! We're only a very small team and this issue might take a long time to get resolved unless someone helps.)

respiranto added a commit to respiranto/roundcubemail that referenced this issue Sep 14, 2024
Previously, multiple whitespace characters at the start of a
continuation line would all be dropped, instead of only the first one.

Also,
 - restrict line continuation characters to SPACE and TAB.

Note that, like before, this identifies the CR (`\r`) character with the
empty string, and thereby notably does not require a CRLF (`\r\n`)
sequence (which is mandated by RFCs 2426, 2425) for line termination
(i.e., `\n` suffices).

Fixes: Bug 1 of issue roundcube#9593.
@respiranto
Copy link
Contributor Author

respiranto commented Sep 14, 2024

Pull request for bug 1 submitted.

W.r.t. the others:

@pabzm
Copy link
Member

pabzm commented Sep 15, 2024

I would agree to it, even though I'd prefer another library because vobject appears to be unmaintained (no changes or release since years).

@alecpl What do you think?

@alecpl
Copy link
Member

alecpl commented Sep 16, 2024

It is maintained https://github.com/sabre-io/vobject/releases, and is probably the only option. I'm using the library in Kolab. So, I'm for it, but it won't be a trivial task.

@pabzm
Copy link
Member

pabzm commented Sep 16, 2024

Oh, I looked at the old repository. Thank you for the correction!

@respiranto Would you dare an attempt? We would of course be here to help if you get stuck or would like feedback.

respiranto added a commit to respiranto/roundcubemail that referenced this issue Sep 16, 2024
@respiranto
Copy link
Contributor Author

I do dare an attempt. Should I open a new issue for that (notably as a place to ask questions)?

@pabzm
Copy link
Member

pabzm commented Sep 17, 2024

Great!

Should I open a new issue for that (notably as a place to ask questions)?

Yes, I think that's a good idea!

alecpl pushed a commit that referenced this issue Sep 18, 2024
* vcard: Fix whitespace handling in line cont's

Previously, multiple whitespace characters at the start of a
continuation line would all be dropped, instead of only the first one.

Also,
 - restrict line continuation characters to SPACE and TAB.

Note that, like before, this identifies the CR (`\r`) character with the
empty string, and thereby notably does not require a CRLF (`\r\n`)
sequence (which is mandated by RFCs 2426, 2425) for line termination
(i.e., `\n` suffices).

Fixes: Bug 1 of issue #9593.

* vcard: Add test for #9593/1

* Fix coding style
alecpl pushed a commit that referenced this issue Sep 18, 2024
* vcard: Fix whitespace handling in line cont's

Previously, multiple whitespace characters at the start of a
continuation line would all be dropped, instead of only the first one.

Also,
 - restrict line continuation characters to SPACE and TAB.

Note that, like before, this identifies the CR (`\r`) character with the
empty string, and thereby notably does not require a CRLF (`\r\n`)
sequence (which is mandated by RFCs 2426, 2425) for line termination
(i.e., `\n` suffices).

Fixes: Bug 1 of issue #9593.

* vcard: Add test for #9593/1

* Fix coding style
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants