Problem with subject =?UTF-8?Q? (2) #420

Open
paulocardozo opened this issue Jun 27, 2023 · 17 comments
Labels: bug, validating, Windows

Comments

@paulocardozo

Describe the bug
When parsing an e-mail message, I run into this issue when calling getSubject().

Used config
I'm using the default package config.

Code to Reproduce
The troubling code section which produces the reported bug.

$client = Client::account('default');
$client->connect();
$inbox = $client->getFolderByPath('INBOX');
$query = $inbox->messages();
$messages = $query->where(['UNSEEN'])->from($email)->get();

foreach ($messages as $message) {
    echo $message->getSubject();
}

Expected behavior
In 99% of cases this script works, but for one specific message it returns the subject as:

=?utf-8?Q?Confirmaci=C3=B3n_reserva_Free_Tour_Flo?= =?utf-8?Q?rencia_Esencial_-_Buendiatours.com?=
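For reference, the value above consists of two RFC 2047 encoded words that together form a single subject. The following is only a minimal sketch of what decoding them involves, not the library's implementation; `decodeEncodedWords` is a hypothetical helper name, and the charset conversion step assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper for illustration; not part of webklex/php-imap.
function decodeEncodedWords(string $header): string
{
    // RFC 2047: whitespace that separates two adjacent encoded words is dropped.
    $header = preg_replace('/(\?=)[\r\n\t ]+(=\?)/', '$1$2', $header) ?? $header;

    return preg_replace_callback(
        '/=\?([^?]+)\?([BQ])\?([^?]*)\?=/i',
        static function (array $m): string {
            [, $charset, $encoding, $payload] = $m;
            $bytes = strtoupper($encoding) === 'B'
                ? (string) base64_decode($payload)
                // In Q encoding "_" stands for a space; the rest is quoted-printable.
                : quoted_printable_decode(str_replace('_', ' ', $payload));

            // Assumption: only non-UTF-8 charsets need converting (requires mbstring).
            return strcasecmp($charset, 'UTF-8') === 0
                ? $bytes
                : mb_convert_encoding($bytes, 'UTF-8', $charset);
        },
        $header
    ) ?? $header;
}

$raw = '=?utf-8?Q?Confirmaci=C3=B3n_reserva_Free_Tour_Flo?= '
     . '=?utf-8?Q?rencia_Esencial_-_Buendiatours.com?=';
echo decodeEncodedWords($raw), PHP_EOL;
// prints: Confirmación reserva Free Tour Florencia Esencial - Buendiatours.com
```

Note that this sketch decodes each encoded word separately. For non-UTF-8 charsets, a multibyte character split across two encoded words could be converted incorrectly, so joining the payloads before decoding may be the safer design.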

Desktop / Server (please complete the following information):

  • OS: Windows 11 Enterprise
  • PHP: XAMPP / PHP 8.2.7
  • Version: v5.3
  • Provider: Hetzner Online GmbH
@paulocardozo
Author

@Webklex I tried what you suggested, but found no solution. I'm available to explore the issue, but nothing I tried worked.

@Webklex
Owner

Webklex commented Jun 27, 2023

Hi @paulocardozo,
thanks a lot for reporting this issue here. I really appreciate it!

Since you are using a Windows environment, I'm intrigued whether it's related to #413.
If you try to access the text body of a message - do you actually receive it or are the headers included as well?

Additionally, please donate an anonymized version of the troubling mail. This will allow me to create a dedicated test case for this issue.
anonymized = remove all personal information you don't want to share with the world :)

Once again, thanks for taking the time and effort to make this library better!

Best regards and happy coding,

Webklex added the bug and validating labels on Jun 27, 2023
@paulocardozo
Author

@Webklex I can provide it to you in private.

@paulocardozo
Author

paulocardozo commented Jun 27, 2023

@Webklex It seems to be a local problem. I'm building from scratch an application that constantly checks a mailbox. In the older version of the application your package works well, but in the newer one it does not. I'll investigate further and post as soon as I have news.

Thanks!

@paulocardozo
Author

@Webklex I just noticed that the same code works on a Unix server (Hostinger) but not locally on Windows. It seems very strange to me; I'm still investigating.

@Webklex
Owner

Webklex commented Jun 28, 2023

Hi @paulocardozo ,
thanks for the followups. This sounds indeed interesting - but what could it possibly be? Perhaps different default php mods?
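One way to compare the two environments is to print which relevant extensions are loaded on each machine. This is only a rough diagnostic sketch: `extensionReport` is a hypothetical helper name, and the choice of `iconv`, `mbstring` and `imap` is an assumption based on the functions discussed in this thread.

```php
<?php
// Report which of the given extensions are loaded in the current PHP runtime.
function extensionReport(array $names): array
{
    $report = [];
    foreach ($names as $name) {
        $report[$name] = extension_loaded($name);
    }
    return $report;
}

foreach (extensionReport(['iconv', 'mbstring', 'imap']) as $name => $loaded) {
    printf("%-9s %s\n", $name, $loaded ? 'loaded' : 'missing');
}

// The iconv implementation can differ between platforms, so print it when available.
if (defined('ICONV_IMPL')) {
    printf("iconv: %s %s\n", ICONV_IMPL, ICONV_VERSION);
}
```

Running the same script on the Windows machine and on the Unix server and comparing the output would show whether the two setups differ in these extensions.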

@paulocardozo
Author

paulocardozo commented Jun 28, 2023

@Webklex I really don't know. I've checked the php.ini of every PHP version installed here and nothing seems wrong.

This is the old version, which works:

Working (Old) - ExtractBookingEmailsJob.zip

This is the newest version, which does not work:

Not Working (Newest) - ExtractTourBookingsEmailsJob.zip

@Webklex
Owner

Webklex commented Jun 28, 2023

Honestly, I have no idea right now. If anything regarding this comes to mind, I'll let you know for sure.
Thanks again for your help!

@paulocardozo
Author

paulocardozo commented Jun 28, 2023

@Webklex Well, as I'm far behind on a project, I've tested the solution below, which is related to #410 (comment), and it worked.

`private static function decodeSubject($subject) {
    $parts = preg_match_all("/(=\?[^\?]+\?[BQ]\?)([^\?]+)(\?=)[\r\n\t ]*/i", $subject, $m);

    $joined_parts = '';
    if (count($m[1]) > 1 && !empty($m[2])) {
        // Example: GyRCQGlNVTtZRTkhIT4uTlMbKEI=
        $joined_parts = $m[1][0].implode('', $m[2]).$m[3][0];

        $subject_decoded = iconv_mime_decode($joined_parts, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, "UTF-8");

        if ($subject_decoded && trim($subject_decoded) != trim(rtrim($joined_parts, '='))) {
            return $subject_decoded;
        }
    }

    // iconv_mime_decode() can't decode:
    // =?iso-2022-jp?B?IBskQiFaSEcyPDpuQC4wTU1qIVs3Mkp2JSIlLyU3JSItahsoQg==?=
    $subject_decoded = iconv_mime_decode($subject, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, "UTF-8");

    // Sometimes iconv_mime_decode() can't decode some parts of the subject:
    // =?iso-2022-jp?B?IBskQiFaSEcyPDpuQC4wTU1qIVs3Mkp2JSIlLyU3JSItahsoQg==?=
    // =?iso-2022-jp?B?GyRCQGlNVTtZRTkhIT4uTlMbKEI=?=
    if (preg_match_all("/=\?[^\?]+\?[BQ]\?/i", $subject_decoded)) {
        $subject_decoded = \imap_utf8($subject);
    }

    if (!$subject_decoded) {
        $subject_decoded = $subject;
    }

    return $subject_decoded;

}`

@freescout-helpdesk
Contributor

freescout-helpdesk commented Jun 28, 2023

FYI: in our project we've completely replaced the $this->decode($header->subject) call with a function we developed (see #410), because the current solution ($this->decode()) is often unable to decode the subject properly:
https://github.com/freescout-helpdesk/freescout/blob/dist/overrides/webklex/php-imap/src/Header.php#L208

And it works like a charm now. So we have not seen any subject which this function could not decode.

@daniel89fg

I have version 5.5 and the MailHelper library does not exist. Why? I have the same problem: some email subjects are not decoded correctly.

@paulocardozo
Author

I have created a function to decode the subject:

`private static function decodeSubject($subject)
{

    $parts = preg_match_all("/(=\?[^\?]+\?[BQ]\?)([^\?]+)(\?=)[\r\n\t ]*/i", $subject, $m);

    $joined_parts = '';
    if (count($m[1]) > 1 && !empty($m[2])) {
        // Example: GyRCQGlNVTtZRTkhIT4uTlMbKEI=
        $joined_parts = $m[1][0] . implode('', $m[2]) . $m[3][0];

        $subject_decoded = iconv_mime_decode($joined_parts, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, "UTF-8");

        if ($subject_decoded && trim($subject_decoded) != trim(rtrim($joined_parts, '='))) {
            return $subject_decoded;
        }
    }

    // iconv_mime_decode() can't decode:
    // =?iso-2022-jp?B?IBskQiFaSEcyPDpuQC4wTU1qIVs3Mkp2JSIlLyU3JSItahsoQg==?=
    $subject_decoded = iconv_mime_decode($subject, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, "UTF-8");

    // Sometimes iconv_mime_decode() can't decode some parts of the subject:
    // =?iso-2022-jp?B?IBskQiFaSEcyPDpuQC4wTU1qIVs3Mkp2JSIlLyU3JSItahsoQg==?=
    // =?iso-2022-jp?B?GyRCQGlNVTtZRTkhIT4uTlMbKEI=?=
    if (preg_match_all("/=\?[^\?]+\?[BQ]\?/i", $subject_decoded)) {
        $subject_decoded = \imap_utf8($subject);
    }

    if (!$subject_decoded) {
        $subject_decoded = $subject;
    }

    return $subject_decoded;
}`

Then I call it like this:
[image]

@daniel89fg

Wonderful, works perfectly, thank you very much. Let's hope the php-imap library fixes this in the future.

@devlibfer

Hi guys, sometimes the $message->getSubject() method returns text in quoted-printable. How can I detect the format of the email subject?
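One possible way to detect this is to look for RFC 2047 encoded words left in the returned string. The sketch below is only an illustration under that assumption: `findEncodedWords` is a hypothetical helper, not a library function. It reports the charset and the encoding (`B` for Base64, `Q` for the quoted-printable variant) of each encoded word it finds.

```php
<?php
// Hypothetical helper: list the encoded words still present in a subject.
function findEncodedWords(string $subject): array
{
    preg_match_all(
        '/=\?([^?\s]+)\?([BQ])\?[^?\s]*\?=/i',
        $subject,
        $matches,
        PREG_SET_ORDER
    );

    $found = [];
    foreach ($matches as $m) {
        $found[] = [
            'charset'  => strtolower($m[1]),
            'encoding' => strtoupper($m[2]), // "B" = Base64, "Q" = quoted-printable variant
        ];
    }
    return $found;
}

// An empty result suggests the subject is already decoded.
var_dump(findEncodedWords('=?utf-8?Q?Confirmaci=C3=B3n_reserva?='));
```

If the array is not empty, the subject still needs decoding, for example with the decodeSubject() function posted earlier in this thread.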

@blagi

blagi commented Nov 15, 2023

This decodeSubject function by paulocardozo does a pretty good job with my test set of subjects, but it cannot decode this one:

=?UTF-8?B?VGlja2V0IE5vOiBb7aC97bOpMTddIE1haWxib3ggSW5ib3ggLSAoMTcpIEluY29taW5nIGZhaWxlZCBtZXNzYWdlcw==?=

I had an issue with it using some modified Roundcubemail methods (my 10-year-old solution), and I had to make some modifications for this subject. I finally got a solution that decodes it to a valid UTF-8 string, but it's complicated and I need a better one for Webklex/php-imap. Can anybody modify paulocardozo's function above to work with this subject?

@freescout-helpdesk
Contributor

freescout-helpdesk commented Nov 15, 2023

This decodeSubject function by paulocardozo does a pretty good job with my test set of subjects, but it cannot decode this one:

=?UTF-8?B?VGlja2V0IE5vOiBb7aC97bOpMTddIE1haWxib3ggSW5ib3ggLSAoMTcpIEluY29taW5nIGZhaWxlZCBtZXNzYWdlcw==?=

We've just checked this subject with the latest version of decodeSubject() function, it was decoded into:

Ticket No: [??????17] Mailbox Inbox - (17) Incoming failed messages

@blagi

blagi commented Nov 15, 2023

We've just checked this subject with the latest version of decodeSubject() function, it was decoded into:

Ticket No: [??????17] Mailbox Inbox - (17) Incoming failed messages

Great. That's the correct subject. The original encoded subject contains a couple of invalid UTF-8 characters, and the function replaces them with question marks.

Thanks!
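To illustrate where the six question marks come from: the Base64 payload decodes to the bytes ED A0 BD ED B3 A9 in front of "17", which look like two UTF-16 surrogate halves written as UTF-8 and are therefore not valid UTF-8. The sketch below is a simplified illustration, not the decodeSubject() implementation; `isValidUtf8` and `maskSurrogateBytes` are hypothetical helpers, and only this surrogate case is handled.

```php
<?php
function isValidUtf8(string $bytes): bool
{
    // With the /u modifier PCRE validates the subject and fails on bad UTF-8.
    return preg_match('//u', $bytes) === 1;
}

// Replace every byte of a UTF-8-encoded surrogate (ED A0..BF 80..BF) with "?".
function maskSurrogateBytes(string $bytes): string
{
    return preg_replace_callback(
        '/\xED[\xA0-\xBF][\x80-\xBF]/',
        static fn (array $m): string => str_repeat('?', strlen($m[0])),
        $bytes
    ) ?? $bytes;
}

$raw = base64_decode(
    'VGlja2V0IE5vOiBb7aC97bOpMTddIE1haWxib3ggSW5ib3ggLSAoMTcpIEluY29taW5nIGZhaWxlZCBtZXNzYWdlcw=='
);

var_dump(isValidUtf8($raw));           // bool(false)
echo maskSurrogateBytes($raw), PHP_EOL;
// prints: Ticket No: [??????17] Mailbox Inbox - (17) Incoming failed messages
```

Other kinds of invalid byte sequences would need their own handling; this pattern only covers the surrogate range seen in this subject.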

Webklex added a commit that referenced this issue Apr 12, 2024