
Outlook 2016’s New POP3 Bug Deletes Your Emails

227 points | luu | 10 years ago | wp.josh.com

72 comments

[+] dmbaggett|10 years ago|reply
From a client standpoint, POP is not actually trivial.

The main problem with POP is that unless you do something clever, determining the changes in the mailbox from time t0 to time t1 is both conceptually difficult and computationally expensive. This is because generic POP has no concept of a message UID, so there's no principled way to diff between states. In a very real sense, this means that POP is somewhat broken for the most important use case for the client: syncing to server changes.

The UIDL extension adds a "UID" but it's just an MD5 hash of the contents -- meaning that multiple copies of a message appear to be the same message. And you can ask for just the headers -- which means you can get the message-id header -- but this is still very expensive to do repeatedly (say, every 5 minutes) on a 100,000 message POP store. And you can't ask for just the Message-Id header, which would fix the problem.

Even if you have valid UIDs -- which you won't -- you still have to run a diff algorithm. Typical dynamic programming algorithms are O(N^2), which obviously sucks big time for a 100,000 message POP store.

For Inky (http://inky.com) we use a clever linear-time diff algorithm based in part on [Myers 86], "An O(ND) Difference Algorithm and Its Variations". [Burns & Long 97], "A Linear Time, Constant Space Differencing Algorithm", is also a good treatment. But I know Outlook and Thunderbird both use non-linear-time algorithms to diff, so "leave messages on server" gets increasingly (non-linearly) expensive as the mailbox size grows on the server.
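
To make the diffing problem concrete, here is a minimal sketch of comparing two UIDL snapshots in expected linear time. It is not Inky's algorithm, nor the Myers or Burns & Long algorithm; it is a plain multiset comparison that ignores message order, and it assumes the server returns the same UID for the same message on every poll. The name `diff_uidl` is made up for this example. Because UIDs may repeat (as with content-hash UIDs), it counts occurrences instead of using a plain set.

```python
from collections import Counter


def diff_uidl(old_uids, new_uids):
    """Return (added, removed) UID lists between two UIDL snapshots.

    Expected O(len(old_uids) + len(new_uids)) time. Duplicate UIDs are
    handled by counting occurrences rather than assuming uniqueness.
    """
    old_counts = Counter(old_uids)
    new_counts = Counter(new_uids)
    # Counter subtraction keeps only strictly positive counts.
    added = list((new_counts - old_counts).elements())
    removed = list((old_counts - new_counts).elements())
    return added, removed


# Hypothetical snapshots from two successive polls.
added, removed = diff_uidl(["a1", "b2", "b2", "c3"], ["b2", "c3", "d4"])
print(added)    # ['d4']
print(removed)  # ['a1', 'b2']
```

Note that with duplicate UIDs the client can tell that one copy of `b2` went away, but not which one.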

A few other points on comments made in this thread:

- POP is still widely used. In the US, for example, Comcast has finally migrated to IMAP, but Verizon is still POP only.

- POP is, from the server standpoint, a very simple protocol, and it is highly amenable to automated testing, as others here have pointed out. For our own testing we generate both patterned and random mailbox modification sequences, then have the test client cooperate with the test server to ensure that the client has (independently) correctly determined what's happened to the (test) mailbox. This is a perfect example of a situation where investing significant effort into automated testing pays off -- and where a TDD approach to development would also work well.
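
The randomized testing idea above might look roughly like the following sketch. Everything here is hypothetical: `client_diff` stands in for the client logic under test, and the "server" is just an in-memory list of UIDs that receives random adds and deletes. The check is that the client's reconstructed view matches the server state after every step.

```python
import random
from collections import Counter


def client_diff(old_uids, new_uids):
    """Hypothetical client logic under test: multiset diff of two snapshots."""
    old_c, new_c = Counter(old_uids), Counter(new_uids)
    return new_c - old_c, old_c - new_c  # (added, removed)


def random_mailbox_states(rng, steps):
    """Yield successive server mailbox states after random adds/deletes."""
    mailbox, next_id = [], 0
    for _ in range(steps):
        if mailbox and rng.random() < 0.4:
            mailbox.pop(rng.randrange(len(mailbox)))  # server-side delete
        else:
            mailbox.append(f"uid-{next_id}")          # new message arrives
            next_id += 1
        yield list(mailbox)


def run_trial(seed, steps=200):
    """Return True if the client's view tracks the server at every step."""
    rng = random.Random(seed)  # seeded so a failing sequence can be replayed
    client_view = Counter()
    previous = []
    for state in random_mailbox_states(rng, steps):
        added, removed = client_diff(previous, state)
        client_view = client_view + added - removed
        if client_view != Counter(state):
            return False
        previous = state
    return True


print(all(run_trial(seed) for seed in range(20)))
```

Seeding each trial keeps the random sequences reproducible, which matters when a failure needs to be turned into a regression test.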

[+] userbinator|10 years ago|reply
> The UIDL extension adds a "UID" but it's just an MD5 hash of the contents -- meaning that multiple copies of a message appear to be the same message.

That sounds like a server problem; to quote the RFC,

> The server should never reuse an unique-id in a given maildrop, for as long as the entity using the unique-id exists.

Why would you hash the contents? The arrival time should be unique, assuming no two messages could arrive at exactly the same time. That doesn't require any hashing.

I think POP could've been far better designed, without growing into the complexity of IMAP, with just a few little changes like this.
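
As a rough illustration of the arrival-time idea, a server could build each unique-id from the arrival timestamp plus a tie-breaking counter, so that two messages arriving in the same clock tick still get distinct ids. This is only a sketch under assumptions: `UidAllocator` is a made-up name, and persisting the counter across server restarts is out of scope here.

```python
import itertools
import time


class UidAllocator:
    """Hypothetical per-maildrop allocator: arrival time plus a counter."""

    def __init__(self, clock=time.time_ns):
        self._clock = clock
        self._counter = itertools.count()

    def next_uid(self):
        # The counter keeps ids distinct even if two messages share a
        # timestamp. Hex digits and "." are printable ASCII, and the result
        # is far shorter than the 70 characters RFC 1939 allows.
        return f"{self._clock():x}.{next(self._counter):x}"


# Freeze the clock to simulate messages arriving at "exactly the same time".
alloc = UidAllocator(clock=lambda: 1456444800_000_000_000)
uids = [alloc.next_uid() for _ in range(3)]
print(len(set(uids)) == 3)  # True even though the clock never advances
```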

[+] hobarrera|10 years ago|reply
> POP is somewhat broken for the most important use case for the client: syncing to server changes.

That's why we've had IMAP for a couple of decades now (and available anywhere for over a decade). POP simply wasn't designed for this use case.

[+] username223|10 years ago|reply
> UPDATE 02/26/2016: ... TL;DR: Disable automatic updates...

That's just good advice for dealing with modern software, for which fixes, breakage, and feature churn are all mixed in a single awful stream. "Newer" does not mean "better."

[+] AlexTes|10 years ago|reply
I use near weekly updates of a custom Android 6.0 ROM and daily use Arch Linux which is a rolling release. My work is on the web with npm keeping us bleeding edge. Other people often ask why my system does this little extra thing that's useful. Usually the answer is found by --version.

Life with up to date tools can be real good. So unless I'm simply the only one using good tools, updates can be good.

[+] raverbashing|10 years ago|reply
As someone who recently had Android 6.0 available for their phone and decided to upgrade, exactly this

Some genius at Google decided to make the phone vibrate and beep every time there is an open WiFi spot, or you need to sign up to a known one

Really

This kind of crap (not the only one) almost justifies the extra price for iOS

[+] Aoyagi|10 years ago|reply
I'm genuinely surprised Microsoft allows disabling of those automatic updates...
[+] Nutmog|10 years ago|reply
I almost totally agree. If it wasn't for security fixes, there would be no value in updating almost any software. If it's already working OK, just leave it alone.

15 years ago when people had problems with their graphics cards, the standard "fix" was to update the driver. Now we're still updating drivers. Weren't the problems supposed to have been fixed many years ago? It seems they introduce as many new bugs as they fix, making the net effect of updates useless as far as bugs go.

For security related bugs, stop using C++ for internet facing software.

[+] duncan_bayne|10 years ago|reply
Many years ago I worked on the DPOP POP3 server and DList mailing list server, and did a little work on our company's SMTP and IMAP servers too.

My experience was that most clients sucked.

For example, one (Eudora?) would move IMAP messages by copy and delete (which may have been idiomatic, as I say it was a long time ago). But it wouldn't check the success of the copy operation, so a failure to copy would result in a delete, not a move.
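
A minimal sketch of the missing check, assuming a connection object that behaves like Python's `imaplib.IMAP4` (its `uid()` method returns a `(status, data)` tuple). The names `safe_move` and `FakeConn` are made up for illustration, and `FakeConn` is only a test double, not a real server.

```python
def safe_move(conn, uid, dest):
    """Move one message by COPY then delete, only if the COPY succeeded.

    `conn` is assumed to behave like imaplib.IMAP4: conn.uid(command, *args)
    returns (status, data), where status is "OK" on success.
    """
    status, _ = conn.uid("COPY", uid, dest)
    if status != "OK":
        return False  # copy failed: leave the original untouched
    status, _ = conn.uid("STORE", uid, "+FLAGS", r"(\Deleted)")
    return status == "OK"


class FakeConn:
    """Test double that records commands and can simulate a failed COPY."""

    def __init__(self, copy_ok):
        self.copy_ok = copy_ok
        self.commands = []

    def uid(self, command, *args):
        self.commands.append(command)
        if command == "COPY" and not self.copy_ok:
            return "NO", [b"copy failed"]
        return "OK", [b""]


failing = FakeConn(copy_ok=False)
print(safe_move(failing, "42", "Archive"), failing.commands)
```

With a failing copy the function returns `False` and never issues the `STORE`, so the message stays where it was. A real client would still need an expunge after a successful move to actually remove the flagged message.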

Seems they still suck in 2016, for what seem like trivial reasons.

I mean, surely this sort of protocol interaction is very, very amenable to automated testing of some sort. We had a bunch of automated regression tests for our mail servers, written in C, back in the late nineties.

I'm genuinely uncertain whether to blame incompetence or another attempt at the strategies spelled out in the Halloween Documents: http://www.catb.org/esr/halloween/faq.html

[+] singlow|10 years ago|reply
I wasted a couple of hours trying to help one of my clients who had this problem last week. What a pain it was. He had 5 devices connected to his account, 2 on POP and 3 on IMAP. It took 20 minutes just to listen to his explanation of emails appearing and disappearing and re-appearing based on which device saw them first. We spent the next hour turning on each device one at a time until we determined that his Outlook POP client was the one deleting them, even though it was configured to leave them.
[+] Aloha|10 years ago|reply
I was actually more surprised someone still used POP3 for direct client access - not a bad thing really - I just thought the world had migrated to IMAP.
[+] makecheck|10 years ago|reply
I think greater compartmentalization of software is long overdue.

For a program like this, there ought to be a Sacred Core that Does Very Little and has implementations of key protocols that can’t be touched without the blessing of about 5 senior engineers and the personal seal of the CEO or some such.

In other words, it should be unbelievably hard to screw with parts of the program that are crucial, while making changes that shouldn’t have anything to do with it. (Heck, for all we know, they were adding Windows 10 Tiles™ when this screw-up occurred.)

[+] fanf2|10 years ago|reply
Microsoft keep rewriting their mail protocol implementations and fucking them up in new ways.
[+] _lbaq|10 years ago|reply
Man, I waited for a feature like this for at least 15 years, how do I get outlook in Linux ?!
[+] chris_wot|10 years ago|reply
It's POP3. Seriously, one of the easiest to understand mailbox protocols ever. HOW could Microsoft stuff up something like this so badly?

Look, I know that mistakes can be made. But in this case, I just can't think of a single excuse that would be satisfactory.

[+] frik|10 years ago|reply
Look what Microsoft did with *.odt (OpenOffice/LibreOffice) compatibility. Now they show bogus security and compatibility warning dialogs if you open or save such a file.

Look what they do with IMAP support in Outlook 2016. Pre-loading just the "Subject" (a feature supported since the 1990s) got dropped; instead, the whole email including attachments is downloaded.

And now we learn POP3 got crippled too. It's the common Microsoft tactic of locking end users and enterprise customers into proprietary, ever-changing protocols that are only fully supported by their most recent server software: their Exchange and Outlook.com/Office365 cloud ecosystem.

Microsoft wants you to upgrade to Windows 10 on PC, adopt it on smartphones (even though it has just 1.1% market share), on servers, SQL Server 2016, Exchange 2016, Office365 (subscription-based Office 2016 client, cloud-based SharePoint 2016 = OneDrive for Business on Azure), Skype for Business (= rebranded Lync), etc. Oh, and they ask you to install their telemetry services for Office on your clients, to get a "full picture". The telemetry and other phone-home stuff cannot be turned off, except in the expensive LTSB license version. And several IPs and URLs are whitelisted in the kernel-mode network layer, in all versions.

It's up to you to help Microsoft create another monopoly. Or you outsmart them and say no.

[+] kbenson|10 years ago|reply
Well, it has nothing to do with the protocol. Like you said, POP3 is dead simple. This is within logic that sits above the protocol.

That said, yeah, this is something they definitely should be testing. The article was updated yesterday with a new knowledge base article, and the status of the problem says they are looking into it, which means they've been notified but have not yet fixed it. If they haven't stopped that update from rolling out, someone at MS may come back to very unhappy managers on Monday, and have to explain why they thought leaving a data-destroying bug live over the weekend was the right call...

[+] unsignedint|10 years ago|reply
I sometimes wonder whether it comes from a difference in where attention is focused. I don't use Outlook myself any more, but I know some who do, and a lot of the odd behavior in Outlook seems to show up where IMAP/POP3 connections are involved. (And I wouldn't think Microsoft employees are dogfooding those, either.)

Empirically, there seemed to be fewer such problems when connecting to an Exchange server, back when I was forced to use it. (Though at that time I didn't have many connection problems, I had plenty of other issues, some involving broken databases and such, which I wouldn't touch with a ten-foot pole myself...)

[+] sixothree|10 years ago|reply
You should see what outlook did to my IMAP folders.
[+] EugeneOZ|10 years ago|reply
Update for those who dreamed about empty inbox.
[+] nine_k|10 years ago|reply
The widespread use of POP (well, any significant use of POP) is what makes me sad the most.
[+] xpda|10 years ago|reply
This is typical of software development in recent years. The emphasis on newer platforms and technology (in this case IMAP) results in new user interface limitations, as well as bugs, for existing users of older technologies (i.e. desktop, keyboard access, and POP3).

I've found eM Client to be a good alternative to Outlook and WLM.

[+] quattrofan|10 years ago|reply
I loathe Outlook with the power of a 1000 suns. Give me Gmail any day.
[+] slovette|10 years ago|reply
Eh.. What's life without a little salsa.
[+] ommunist|10 years ago|reply
Does this mean testing software before release at MS is second to none and relies on the principle 'if users don't complain, we made it right'?

UPD: I was just about to persuade a client to move to Outlook 2016 from 2011. Glad I did not.