How IMAP works under the hood

238 points| michidk | 11 months ago |blog.lohr.dev

75 comments

rmccue|11 months ago

I started writing a guide to IMAP back when I was working on an email client: https://github.com/rmccue/griffin/tree/master/docs/imap

Pulling large amounts of data for things like threading can be difficult on certain servers; my preferred approach ended up being to pull every ID and thread ID to maintain an in-memory tree. (This was, iirc, partially because Gmail’s implementation was slightly crippled with relation to threading.)
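To illustrate the "pull every ID and thread ID" approach, here is a minimal sketch that groups messages by thread in memory. The `(uid, thread_id)` pair shape and the function name `group_by_thread` are assumptions made for the example, not taken from the Griffin code, and a real client would build a proper tree rather than flat lists.

```python
from collections import defaultdict


def group_by_thread(pairs):
    """Group message IDs by thread ID.

    `pairs` is an iterable of (uid, thread_id) tuples, a hypothetical
    shape for whatever the server returned for each message.
    """
    threads = defaultdict(list)
    for uid, thread_id in pairs:
        threads[thread_id].append(uid)
    # Keep each thread's messages in ascending ID order.
    for uids in threads.values():
        uids.sort()
    return dict(threads)
```

Because only two small values per message are kept, the whole index can stay in memory even for large mailboxes.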

(I never finished the guide because I stopped on the project, alas - if IMAP were easier to work with, I might have finished it! And sadly, no JMAP support on Gmail, and the gateway was broken.)

mjl-|11 months ago

> I started writing a guide to IMAP back when I was working on an email client

I would be very interested in hearing from developers about how they write email clients that need to work with all the servers out there, with the varying levels of IMAP4 (original, rev1, rev2, combinations of at least a dozen extensions) and various levels of buggy behaviour.

I'm assuming a client developer would implement various "profiles": an "advanced" profile for servers that implement things like CONDSTORE/QRESYNC, a "very basic" profile that only does the absolute minimum, and probably a profile or two in between. When you encounter unexpected behaviour (e.g. bad syntax in the protocol), you would downgrade the profile for that server for a while (it may get fixed)? If it works like this, I'm curious about the profiles developers choose. If it's not like this, I wonder how client developers work around compatibility issues (you can't just keep reconnecting and trying the same thing if it results in an error!).
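The "profiles" idea above could be sketched roughly as follows. The profile names, the capability checks that separate them, and the helper names `parse_capabilities`, `choose_profile` and `downgrade` are all hypothetical; this is one possible shape, not a description of how any existing client works.

```python
def parse_capabilities(line):
    """Extract capability names from an untagged CAPABILITY response line."""
    parts = line.split()
    # Expected shape: "* CAPABILITY IMAP4rev1 IDLE ..."
    if len(parts) < 2 or parts[0] != "*" or parts[1].upper() != "CAPABILITY":
        raise ValueError(f"not a CAPABILITY response: {line!r}")
    return {p.upper() for p in parts[2:]}


# Hypothetical profiles, richest first.
PROFILES = ["advanced", "intermediate", "basic"]


def choose_profile(caps):
    """Pick the richest profile the advertised capabilities allow."""
    if "CONDSTORE" in caps or "QRESYNC" in caps:
        return "advanced"
    if "UIDPLUS" in caps or "IDLE" in caps:
        return "intermediate"
    return "basic"


def downgrade(profile):
    """Step down one profile after unexpected server behaviour."""
    i = PROFILES.index(profile)
    return PROFILES[min(i + 1, len(PROFILES) - 1)]
```

A client following this sketch would remember the downgraded profile per server and retry the richer one after some time, in case the server has been fixed.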

jeffbee|11 months ago

> Gmail’s implementation was slightly crippled

Gmail is not "crippled". A tiny but vocal community of old nerds have a petrified mental model of email that they associate with unix IMAP software from the 1990's, but those concepts do not appear in the IMAP standards anywhere.

camgunz|11 months ago

I've been working on some email stuff and I think probably four things are vexing about IMAP:

- The grammar is hard. I built a parser using lpeg and I'm incredibly glad I did--doing it ad hoc won't lead to good results.

- It's an asynchronous protocol. You can send lots of requests to a server and you have to tag them so you can match them up with responses later. You don't generally want to do that in a client; i.e. you don't want to do these transactional things over an async connection and track state across all of it. You want to like, block on deleting things, renaming things, sending things, etc.

- IMAP is multi-user--it's built around the idea of multiple clients accessing the same mailbox at the same time and streaming updates. Another thing you really don't want to deal with when building an email client.

- There's functionality that you basically shouldn't use; the big one is search. Even the specs more or less say "good luck using this".

You can group all this under the heading of "we thought people would use this over telnet", but attachments and non-plain-text email I think made that non-viable.

I think this all means probably every non-web email client treats IMAP like POP and keeps its own store. I haven't done a survey or anything, but I'd be surprised if that weren't true.

mjl-|11 months ago

> You can send lots of requests to a server and you have to tag them so you can match them up with responses later

Yes, you can match the final OK/NO/BAD responses with the original command based on the tag. The annoying thing is that each command typically sends most of its data in "untagged responses". And IMAP servers can send unsolicited untagged responses at any time too. It's quite tricky to handle all those responses correctly in email clients. Indeed, normally you just want to send a command, and get all the data for that command back until the finishing response. IMAP clients typically open multiple connections to isolate commands.
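The tagged/untagged split described above can be sketched as a small dispatcher. The class names and the tag format `A0001` are assumptions for illustration. Attributing untagged data to the only in-flight command is a deliberate simplification: a real client has to inspect each untagged response, since the server may also send unsolicited ones while a command is running.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PendingCommand:
    text: str
    untagged: list = field(default_factory=list)
    status: Optional[str] = None  # "OK", "NO" or "BAD" once completed


class Dispatcher:
    def __init__(self):
        self._counter = 0
        self.pending = {}      # tag -> PendingCommand
        self.unsolicited = []  # untagged data with no obvious owner

    def send(self, command):
        """Allocate a tag and return (tag, bytes-on-the-wire as str)."""
        self._counter += 1
        tag = f"A{self._counter:04d}"
        self.pending[tag] = PendingCommand(command)
        return tag, f"{tag} {command}\r\n"

    def feed(self, line):
        """Process one response line; return the command it completes, if any."""
        head, _, rest = line.rstrip("\r\n").partition(" ")
        if head == "*":
            if len(self.pending) == 1:
                # Simplification: credit the sole in-flight command.
                next(iter(self.pending.values())).untagged.append(rest)
            else:
                self.unsolicited.append(rest)
            return None
        if head == "+":
            return None  # continuation request, ignored in this sketch
        command = self.pending.pop(head, None)
        if command is None:
            raise ValueError(f"response for unknown tag: {head!r}")
        command.status = rest.split(" ", 1)[0].upper()
        return command
```

Keeping one command in flight per connection, as the comment suggests clients do, is what makes the simple attribution rule workable.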

pferde|11 months ago

I've found the server-side search functionality works very well, if you have a good server implementation. Dovecot's, for example.

And as for treating IMAP like POP, yes, there are clients that only pay lip service to "having IMAP support", only so that they can have one more green checkbox in the feature list they present in their marketing. But there are also more serious clients.

userbinator|11 months ago

> You can group all this under the heading of "we thought people would use this over telnet"

No, the grammar is too convoluted for that. It's more like "we thought text-based protocols are better".

POP3 and SMTP are relatively easy to use manually. IMAP is not.

mr_mitm|11 months ago

> I think this all means probably every non-web email client treats IMAP like POP and keeps its own store. I haven't done a survey or anything, but I'd be surprised if that weren't true.

Pretty sure mutt doesn't. It only caches the headers.

jcranmer|11 months ago

> - There's functionality that you basically shouldn't use; the big one is search. Even the specs more or less say "good luck using this".

Message sequence numbers. Every folder in IMAP has its emails numbered from 1-N, with no holes, so if you delete a message, everything after it has its message sequence number decremented to close the hole. Except IMAP is multiclient, and the message can be deleted by other connections than the one you're currently on. But the server is only allowed to tell you about message deletions at certain points, so now the server has to keep essentially per-client message sequence number state and carefully make sure that everyone is kept in sync... and it's a recipe for disasters in practice. Any sane client will instead use UIDs for everything (and any sane server will implement all the UID extensions to let UIDs be used for everything).
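A toy model makes the renumbering problem concrete. This is not protocol code: the `Mailbox` class is a hypothetical stand-in for a folder, holding only a list of UIDs whose position determines the sequence number.

```python
class Mailbox:
    """Toy folder: a message's sequence number is its list position + 1."""

    def __init__(self, uids):
        self.uids = sorted(uids)

    def seq_to_uid(self, seq):
        return self.uids[seq - 1]

    def uid_to_seq(self, uid):
        return self.uids.index(uid) + 1

    def expunge_uid(self, uid):
        """Remove a message; every later message shifts down by one."""
        seq = self.uid_to_seq(uid)
        del self.uids[seq - 1]
        return seq


box = Mailbox([101, 102, 103, 104])
remembered_seq = 3                        # this client means UID 103
remembered_uid = box.seq_to_uid(remembered_seq)

box.expunge_uid(102)                      # done by some other connection

# The remembered sequence number now points at a different message,
# while the remembered UID still identifies the original one.
now_at_seq = box.seq_to_uid(remembered_seq)
still_there = remembered_uid in box.uids
```

After the expunge, `now_at_seq` is 104 rather than 103, which is exactly the kind of silent mismatch that addressing messages by UID avoids.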

The other fun corner case I recall is that IMAP part numbering is a little unclear what happens around body parts of content-type message/rfc822. So I crafted a message that had a multipart/mixed with one leg being a message/rfc822 whose body was a message/rfc822, and tested the output on all 4 IMAP server implementations I had accounts on at the time to see how they handled the part numbering. I got back 4 different results. None of them were considered correct by the IMAP mailing list discussion on the experiment.

> I think this all means probably every non-web email client treats IMAP like POP and keeps its own store.

The distinction I would use is thin client versus thick client. Most clients like Outlook or Thunderbird are thick clients, which need to maintain their own local database for other reasons (like supporting offline mode or having database features not necessarily supported by an IMAP server, like arbitrary tagging). If you've got a local database, it's much saner to use IMAP essentially as a database synchronization protocol rather than trying to build a second implementation of all of your features on top of IMAP's native features, especially given that IMAP server implementation of these features is generally questionable at best.

IMAP was originally designed, it seems to me, to make thin clients easy to write (I can see how something like pine would be a very thin veneer over the protocol itself). But it's been clear over the past few decades that most clients are of the thick variety; things like QRESYNC and CONDSTORE were added to make the database synchronization steps a lot easier.
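The "database synchronization" use that CONDSTORE enables can be modelled in a few lines: each message carries a modification sequence value, and the client asks only for entries that changed after the highest value it has seen. The dictionary layout and the names `changed_since` and `sync` are assumptions for this sketch. It covers flag changes only; learning about removed messages is what QRESYNC adds.

```python
def changed_since(server_messages, last_seen_modseq):
    """Return entries whose modseq is newer than the client's checkpoint.

    `server_messages` maps uid -> (modseq, flags), a hypothetical layout.
    """
    return {
        uid: entry
        for uid, entry in server_messages.items()
        if entry[0] > last_seen_modseq
    }


def sync(local, server_messages, last_seen_modseq):
    """Apply changes to the local store and return the new checkpoint."""
    changes = changed_since(server_messages, last_seen_modseq)
    local.update(changes)
    return max([last_seen_modseq] + [modseq for modseq, _ in changes.values()])
```

The point of the design is that the cost of a resync depends on how much changed, not on the size of the folder.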

therein|11 months ago

Interesting no attempt has been made to make it at least be less heavy on networked bytes. Especially since it is old and was meant to be used on a connection with no compression or encryption.

HasChildren could have been Parent, HasNoChildren could have been Leaf or Child. And so many more things.

lotharcable2|11 months ago

IMAP had its day in the sun, but the advent of big webmail providers (especially Gmail) has killed off the advancement of email clients. Now all major development is focused on trying to recreate Gmail to varying degrees of success. It all ends up internal to one or another corporation, so they are all just endlessly reinventing the wheel, with IMAP relegated to an afterthought front end to some sort of search-based backend.

Actually having email client software running on your machine is extremely niche and is mostly in the realm of self-hosters and legacy holdouts that won't let their clients go.

A more advanced modern approach is to just use POP3 to download your emails to a local Maildir and have them indexed there non-destructively. Then sync between the various machines where you want your email accessible, using some sort of file sync or P2P solution.

I use notmuch for this. It automatically indexes and tags emails and thus enables much more advanced email management solutions than what can be offered over something like IMAP.

The main advantage of this is that 'folders' are managed virtually. There is no shuffling or copying or editing of emails done normally. I only have to worry about backing up my emails and notmuch config as all the rest can be regenerated relatively quickly.

This is more or less replicating what Gmail and other webmail providers do server side.

Whereas the traditional approach of shuffling, moving and deleting emails on some IMAP server is a fairly dangerous and expensive operation. Mistakes can lead to data loss and are often very difficult to reverse.

zaik|11 months ago

Despite people always complaining about their perceived inefficiencies, standard protocols like IMAP or XMPP always seem to work on a crappy connection, when most of the modern web doesn't.

philipwhiuk|11 months ago

The protocol has ossified and been entrenched. In general more efficient usage of IMAP relies on extensions to the protocol.

A modern replacement (JMAP) hasn't been adopted by major providers.

If you really cared about data transfer size you'd use something like Protobuf.

nirui|11 months ago

Probably the wrong context, but the more code I write, the more I like this `Has`+Noun style of naming over just a Noun. Reading `HasChildren` gives you a clearer expectation of what the function would do and return, while `Parent` gives a far weaker indication.

Maybe they thought the same when they were designing the protocol.

Also, in the context of email, given the size of each mail (including headers and body), these "wasted" bytes may be insignificant.

avar|11 months ago

Networked bytes generally don't matter, networked packets do. Would your proposed change move the needle on that?

ocdtrekkie|11 months ago

When every new Gmail client ships with an entire web browser embedded to load their hundred megabytes of JavaScript, I think we've long jumped the shark on caring about brevity in the length of information in the protocol itself.

It might have mattered back then but now it would be less than a rounding error.

shiandow|11 months ago

It could have, but for stuff you only do once per session that seems excessive. Better to have names that need no explanation, especially for stuff that I think is completely optional.

tiahura|11 months ago

I've been looking to migrate from Exchange-Outlook, but there really aren't any options. There just isn't an open source solution to have an integrated email / tasks / events / contacts, with consistent labels across item types and reminders.

With Outlook, I can use a custom view to see every item in a category flagged for follow-up. I can also set a reminder on a contact, or drag an email or contact onto my calendar to create an event.

beagle3|11 months ago

Thunderbird has it all. I don’t like the way thunderbird does it, but I like outlook even less… so thunderbird it is.

superkuh|11 months ago

Of course these days the mega-corp walled garden email providers don't really follow standards like IMAP. IMAP will not work with, say, Google's Gmail or Microsoft Office 365, or AT&T ISP email, etc. They have each implemented their own proprietary out-of-band authentication system that only works over HTTPS, using the OAuth 2.0 toolkit to build it. Any email client that does not explicitly design for each particular OAuth 2.0 implementation (each megacorp's is slightly different) will not be able to connect over IMAP (unless the user logs in via HTTPS using a web browser and sets up "app passwords" for Google, or similar for others).

slightwinder|11 months ago

> IMAP will not work with, say, Google's gmail or Microsoft office365

Except they do, to some degree. It works well enough that my Thunderbird lets me fetch and move mail. Not sure about advanced features like search or server-side filtering, I never tried them, but this seems to be a bit wacky with other clients and services too.

> They have each implemented their own proprietary out-of-band authentication system that only works over HTTPS using the OAuth2.0 toolkit to build it.

True. Gmail at least had application passwords for a long while. I think they changed this only recently? Or are they still a thing?

calvinmorrison|11 months ago

When I worked at Fastmail there was a lot of special fix code. You see it with Firefox and Chrome too: oh, this popular site is breaking, let's put a hardcoded if statement in. I specifically remember magic fixes for iCal.

Avamander|11 months ago

> Of course these days the mega-corp walled garden email providers don't really follow standards like IMAP.

Not really true. It's usually the client implementations that violate the standard in some way or another, like Outlook. But there are way more bespoke rare clients that have poor implementations.

> They have each implemented their own proprietary out-of-band authentication system that only works over HTTPS using the OAuth2.0 toolkit to build it.

Well, no. They have implemented OAuth and that's not proprietary. They do it because plain login has massive downsides.
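For context on what OAuth looks like at the IMAP level: once a client has obtained an access token (the browser-based part), the login itself is a SASL exchange inside the IMAP connection. The sketch below builds the client response for the XOAUTH2 mechanism, which Gmail and Microsoft's hosted mail accept. The function names are made up for the example, and obtaining the token is out of scope here.

```python
import base64


def xoauth2_initial_response(user, access_token):
    """Build the raw XOAUTH2 client response.

    Fields are separated by the control character 0x01 and the
    string ends with two of them.
    """
    return f"user={user}\x01auth=Bearer {access_token}\x01\x01".encode("utf-8")


def xoauth2_base64(user, access_token):
    """Base64 form, as it would appear after `AUTHENTICATE XOAUTH2`."""
    raw = xoauth2_initial_response(user, access_token)
    return base64.b64encode(raw).decode("ascii")
```

With Python's standard `imaplib`, the raw bytes can be returned from the callback passed to `IMAP4.authenticate("XOAUTH2", ...)`, which applies the base64 encoding itself.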

jeffbee|11 months ago

Struggling to think of a way in which "IMAP will not work with gmail". Please explain.

nashashmi|11 months ago

Can IMAP be used as a file server system? Outlook has this functionality, where files can be stored directly outside of emails.

thesuitonym|11 months ago

This is one of those questions where the answer is technically `yes', but for all practical purposes should be considered `no'.

SSLy|11 months ago

most servers would reject entries without any headers as malformed

azhenley|11 months ago

I've been trying to get approval from Google for the sensitive scopes to use IMAP, and they classified us as needing "CASA Tier 3 Security Assessment". It looks like it is going to be a long, tedious, opaque, and expensive process.

isaachinman|11 months ago

What are you building?

accrual|11 months ago

That was a pretty interesting read. I didn't realize one could interact directly with an IMAP server like we can with FTP and telnet + HTTP.

chasil|11 months ago

POP is a little simpler, but IMAP is designed for concurrent access by multiple clients to the same account.

What this article doesn't address is OAuth, which is required more often now.

steeeeeve|11 months ago

This is the kind of thing I would have expected to read in 2600 back in the day. And why I _always_ looked for 2600 at the bookstore.