Slow synchronization with Mailkit for yahoo, hotmail #650
Comments
What makes you think they are downloading full messages? You don't need to download all of the messages to show a message-list in an email client. Use the Fetch() method to batch request the summary information. |
We are implementing a mode where the user can choose to download full messages, not only the summaries; most email clients have this option. We already have synchronization that downloads message bodies on request (when the user tries to open the actual message). For Outlook it is easy to verify that the full message is downloaded: disconnect from the internet and check whether you still have access to the attachments and the email text. |
They download using multiple connections, not just 1. The problem with offering an API to batch download infinite full messages is, well, you don't have infinite memory and most users of my library do not understand this. I've had complaints from users who download 1 message at a time, adding them to a I'm sure you can understand my hesitation with implementing this feature :) |
Multiple connections was my first guess, but I looked at the Outlook connections with TCPView and it was clear there was only one IMAP connection open. I believe I'll manage to add the functionality I need by myself. |
It should be trivial to do. I modified the internal code to make this easier a few weeks ago. |
This seems to be working: ekalchev@af7cd36. I am getting 20-50 times faster synchronization with Yahoo when downloading full messages. |
Yep, looks like your code is functionally correct. I thought about this a bit last night and realized that returning an I also got to thinking that it may be a good idea, when requesting a list of messages by index, that you also provide the caller with the UID for each of those messages since the client will most likely want to cache these messages by the UID (useless to cache by index and somewhat racey if you use the index to do a UID lookup based on a previously cached IMessageSummary list if you aren't correctly listening to all events and/or if the API did return a list of messages rather than using a real-time callback approach). |
Yeah, returning IList is not a good idea. In addition to what you said, that is also a lot of bytes allocated. With streams it is different: if you don't return the streams but raise an event when each stream is available, the user can read the stream (save it to a file or database) and dispose of it, or reuse the same memory with Microsoft.IO.RecyclableMemoryStream. This way you won't waste much memory for a large number of messages, and you still have the flexibility of fetching message parts with a single FETCH command. You'll need to pair those streams with a UID or index somehow. I can give you another use case that cannot be done efficiently with MailKit and that doesn't involve downloading a full message. See the message preview marked in red; most email clients have it. This preview is actually the first N characters of the TEXT body part of the message. To implement this with MailKit I need to pull IMessageSummary for a batch of 30 messages, which is very efficient. However, to obtain the TEXT part of each message there is no interface to pull those 30 TEXT parts as a single batch, so I need to execute 30 separate FETCH index (BODY.PEEK[TEXT]) commands. I have to wait for all 30 fetches, and with Yahoo that takes about 40-60 sec. The user therefore experiences very slow synchronization even though I am not downloading entire messages. I understand that designing an interface for such functionality is challenging, but these are very common use cases that cannot be implemented efficiently with MailKit. I hope you can come up with something to cover this in the future. |
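The 30 separate commands described above could in principle be collapsed into one FETCH over a sequence set. The sketch below is plain Python, not MailKit code: `to_sequence_set`, `preview_fetch_command`, the tag `A001`, and the 256-byte limit are hypothetical names and values chosen for illustration. It only builds the command string, using the partial-fetch syntax `<offset.count>` from RFC 3501, and it assumes 1-based IMAP sequence numbers.

```python
def to_sequence_set(numbers):
    """Compress message numbers into an IMAP sequence set, e.g. 1:3,5,7:8."""
    ordered = sorted(set(numbers))
    if not ordered:
        raise ValueError("at least one message number is required")
    parts = []
    start = prev = ordered[0]
    for n in ordered[1:]:
        if n == prev + 1:
            prev = n  # still inside the current consecutive run
            continue
        parts.append(str(start) if start == prev else f"{start}:{prev}")
        start = prev = n
    parts.append(str(start) if start == prev else f"{start}:{prev}")
    return ",".join(parts)


def preview_fetch_command(tag, numbers, max_bytes):
    """Build one FETCH that peeks at the first max_bytes of each TEXT part."""
    seq_set = to_sequence_set(numbers)
    return f"{tag} FETCH {seq_set} (BODY.PEEK[TEXT]<0.{max_bytes}>)"


# One command for 30 messages instead of 30 round trips.
print(preview_fetch_command("A001", range(1, 31), 256))
```

With a server that takes a second or more per round trip, the saving comes almost entirely from issuing one command instead of thirty; the amount of data transferred is the same.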
You probably don't want to use I've been trying to think of a way to do that and have been somewhat waiting for the IMAP working group to standardize a way to get this. There was talk a few years back about an IMAP extension for this but I have yet to see a draft/rfc for it. Maybe it's time to give up waiting... |
I've just committed a patch to add support for batch requesting message streams using a callback approach. I think that will satisfy your needs. About the message text blurb: how many characters do you actually need? I was thinking about adding a It would be a less-awkward API than providing a GetStreams() method that takes a list of uids & BodyPart specifiers. The problem is that it means I either need a new set of Fetch() methods that now also take a "blurbLength" argument or else I hard-code it to something (with the possibility of making it overridable if you subclass ImapClient or something?). |
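As a rough illustration of why a callback approach keeps memory flat, here is a minimal Python sketch. It is not the MailKit API: `fetch_streams`, `fake_server`, and `save_message` are hypothetical stand-ins, and the "server" is just a dictionary. The point it shows is that each message stream is handed to the caller and closed before the next one is touched, so nothing accumulates in a list.

```python
import io


def fetch_streams(server, uids, on_stream):
    """Hand each message to on_stream as soon as it is 'received'."""
    for uid in uids:
        stream = io.BytesIO(server[uid])
        try:
            on_stream(uid, stream)
        finally:
            stream.close()  # released before the next message is read


# Hypothetical stand-in for server data: UID -> raw message bytes.
fake_server = {
    101: b"Subject: a\r\n\r\nhello",
    102: b"Subject: b\r\n\r\nworld",
}

saved_sizes = {}


def save_message(uid, stream):
    # A real client would copy the stream to a file or database here.
    saved_sizes[uid] = len(stream.read())


fetch_streams(fake_server, [101, 102], save_message)
print(saved_sizes)
```

Because the callback receives the UID together with the stream, the caller can store each message under a stable key without a separate index-to-UID lookup.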
I was thinking of just requesting 1024 bytes, but that seems overkill. Maybe 256 bytes? |
Thanks for addressing this. In my opinion you shouldn't try to plug that into the Fetch methods: it won't be clear that a Fetch method is actually issuing 2 IMAP FETCH commands. Why don't you add a new method, GetBlurbs, that does that? That method could take IMessageSummary and populate it or something like that, which would solve the problem with the index/UID mapping. |
Code was buggy but is now fixed and I've got unit tests as well for it now. |
…ew of the message Fixes #650 (comment)
I've just committed a way to get the "preview text" of a message using the Fetch() API's by passing in |
Great! I am glad you group the FETCH commands for message summaries that share the same part specifier for the TEXT part. That will be very efficient. My concern is the use of UID FETCH. As you know, FETCH is different from UID FETCH (https://tools.ietf.org/html/rfc3501#page-73): "Note: UID FETCH, UID STORE, and UID SEARCH are different ..." Until now, the ImapFolder.Fetch overload that accepts indexes guaranteed that the indexes of existing items would not change after the method call. Now, passing MessageSummaryItems.PreviewText triggers a hidden UID FETCH, which can emit EXPUNGE. Callers will then have to handle EXPUNGE, which is unclear from the method signature and will lead to many issues with existing code that relies on indexes being preserved after a Fetch(index... call. |
Yea, I noticed this myself yesterday. It's something that I will need to fix even w/o this new feature since untagged FETCH responses could come in the middle of existing UID FETCH commands as well. I think what I'll have to do is add an EXPUNGE event listener to the FetchSummaryContext and connect it for the UID FETCH requests. In some ways I wish I had designed the Fetch() methods to take a callback instead of tying them to returning a list of summaries (which I did for simplicity of use). ImapFolder does have a MessageSummaryFetched event which could be used (altho I need to make a slight adjustment to when I emit those to make it immune to the EXPUNGE problem, but that's just 1 line of code). |
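The index problem discussed here can be shown with a toy model. The Python sketch below is only an illustration under simplifying assumptions: `FolderView` is a hypothetical class, not part of MailKit, and it uses 1-based sequence numbers as in the IMAP protocol. It shows that after an untagged EXPUNGE a cached sequence number silently points at a different message, while a cached UID still identifies the original one.

```python
class FolderView:
    """Toy mailbox: position in the list is the sequence number minus 1."""

    def __init__(self, uids):
        self.uids = list(uids)

    def expunge(self, seq):
        """Apply an untagged '* <seq> EXPUNGE'; later messages shift down."""
        del self.uids[seq - 1]

    def uid_at(self, seq):
        return self.uids[seq - 1]


folder = FolderView([10, 11, 12, 13])

cached_seq = 3
cached_uid = folder.uid_at(cached_seq)  # the message the client cached

folder.expunge(2)  # the message with UID 11 is removed mid-session

# The cached sequence number now resolves to a different message.
print(folder.uid_at(cached_seq) == cached_uid)
# The cached UID still refers to a message that exists.
print(cached_uid in folder.uids)
```

This is why code that assumes indexes stay stable across a call has to either handle EXPUNGE notifications or key its cache on UIDs.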
Ok, I think my latest commit should deal with that scenario. I'll write unit tests later tonight when I have some free time. |
MailKit 2.0.2 has been released with these features. |
We notice a significant performance drop when using MailKit IMAP compared to Outlook or other email clients. I believe it is related to the latency between issuing an IMAP command and receiving the response from some servers (Yahoo, Hotmail). For example, with Yahoo IMAP there is a 1-3 sec delay between executing the command and getting back the response (tag FETCH 1 (BODY[])). We are trying to reduce the number of IMAP commands to speed up synchronization. Currently we use the GetStream method to download full messages, and we call it for each message in the mailbox. For a mailbox with 200 emails, that issues 200 IMAP FETCH commands, which adds up to a 200-400 sec delay just waiting for server responses (not counting the actual data transfer).
I can see that commercial email clients like Outlook achieve 10-20 times faster synchronization when they are configured to download full messages (not only message summaries). I believe they download message bodies in batches, like
tag FETCH 1:20 (BODY[])
Some performance figures for Yahoo IMAP using console commands:
tag FETCH 1 (BODY[]) - 6 sec to complete
tag FETCH 1:2 (BODY[]) - 7 sec to complete
tag FETCH 1:20 (BODY[]) - 11 sec to complete
tag FETCH 1:50 (BODY[]) - 25 sec to complete
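To make the batching effect concrete, the timings above can be turned into per-message costs. This is a back-of-the-envelope Python sketch: the four timings are the ones listed above, while the helper names and the assumption that every batch of the same size takes the same time are illustrative simplifications, not measurements.

```python
# Timings reported above for Yahoo IMAP: batch size -> seconds to complete.
measured = {1: 6.0, 2: 7.0, 20: 11.0, 50: 25.0}


def seconds_per_message(batch_size, total_seconds):
    """Average cost of one message when fetched as part of a batch."""
    return total_seconds / batch_size


def estimated_total(message_count, batch_size, batch_seconds):
    """Rough total time, assuming every batch takes the same measured time."""
    batches = -(-message_count // batch_size)  # ceiling division
    return batches * batch_seconds


for size, secs in measured.items():
    per_message = seconds_per_message(size, secs)
    print(f"batch of {size}: {per_message:.2f} sec per message")

# Hypothetical mailbox of 200 messages, fetched one by one vs. 50 at a time.
one_by_one = estimated_total(200, 1, measured[1])
in_batches = estimated_total(200, 50, measured[50])
print(one_by_one, in_batches)
```

Under these assumptions the per-message cost falls from 6 sec to 0.5 sec, and the 200-message estimate falls from 1200 sec to 100 sec. These totals include data transfer time, which is why the one-by-one figure is higher than the latency-only estimate given earlier.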
It is clear that downloading bodies in batches is much faster, but I can't find a way to do this with MailKit. I triple-checked the MailKit methods and can't find an overload that allows me to do that. I would like to know how hard it would be to implement such an overload myself, or any advice on how to speed up synchronization with high-latency servers.