> Viva.com's outgoing verification emails lack a Message-ID header, a requirement that has been part of the Internet Message Format specification (RFC 5322) since 2008

> ...

> `Message-ID` is one of the most basic required headers in email.

Section 3.6. of the RFC in question (https://www.rfc-editor.org/rfc/rfc5322.html) says:

    +----------------+--------+------------+----------------------------+
    | Field          | Min    | Max number | Notes                      |
    |                | number |            |                            |
    +----------------+--------+------------+----------------------------+
    |                |        |            |                            |
    |/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

                             ... bla bla bla ...

     /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/|
    | message-id     | 0*     | 1          | SHOULD be present - see    |
    |                |        |            | 3.6.4                      |
    |/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

                             ... more bla bla ...

     /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/|
    | optional-field | 0      | unlimited  |                            |
    +----------------+--------+------------+----------------------------+

and in section 3.6.4:

    ... every message SHOULD have a "Message-ID:" field.

That says SHOULD, not MUST, so how is it a requirement?

Arnt|17 days ago

SHOULD is a requirement. It means that you have to do it unless you know some specific reason that the requirement doesn't apply in your case. "I don't want to" is not a valid excuse, "I don't see a reason to" isn't either.

IIRC this particular rule is a SHOULD because MUAs often send messages without a Message-ID to their submission server, and the submission server adds one if necessary. https://www.rfc-editor.org/rfc/rfc6409.html#section-8.3 The SHOULD lets those messages be valid. Low-entropy devices that can't generate a good random ID are rare these days, but old devices remain in service, so the workaround is IMO justified.
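To make the submission-server behaviour concrete, here is a minimal sketch of "add a Message-ID only if the client didn't send one", using Python's standard `email` package. The function name `ensure_message_id`, the domain `mail.example` and the sample addresses are made up for illustration; a real submission server would do this inside its own pipeline.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def ensure_message_id(msg: EmailMessage, domain: str = "mail.example") -> EmailMessage:
    """Add a Message-ID only when the submitted message has none."""
    if msg["Message-ID"] is None:
        # make_msgid() returns a value of the form <unique-part@domain>.
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg


# A message as a client might submit it: no Message-ID yet.
submitted = EmailMessage()
submitted["From"] = "sender@example.org"
submitted["To"] = "rcpt@example.net"
submitted["Subject"] = "Verify your address"
submitted.set_content("Hello")

ensure_message_id(submitted)
print(submitted["Message-ID"])
```

Calling the function a second time leaves the existing header untouched, which is the point: the ID chosen by the sender (if any) is preserved.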

BeetleB|17 days ago

> SHOULD is a requirement.

I once had a job where reading standards documents was my bread and butter.

SHOULD is not a requirement. It is a recommendation. For requirements they use SHALL.

My team was writing code that was safety related. Bad bugs could mean lives lost. We happily ignored a lot of SHOULDs and were open about it. We did it not because we had a good reason, but because it was convenient. We never justified it. Before our code could be released, everything was audited by a 3rd party auditor.

It's totally fine to ignore SHOULD.

st_goliath|17 days ago

> "I don't want to" is not a valid excuse

for the client. If you're implementing a server, "the client SHOULD but didn't" isn't a valid excuse to reject a client either.

You can do it anyway, you might even have good reasons for it, but then you sure don't get to point at the RFC and call the client broken.

L_226|17 days ago

As someone who does systems engineering, the only valid requirements include the word "shall".

almosthere|17 days ago

The original email RFC is also completely unaware of how bad spam is. Sure it might mention it but it's not really AWARE of the problem. The truth is, companies like Google, Microsoft and a few others have de-facto adjusted the minimum requirements for an email. Signing, anti-spam-agreements, etc.. are the true standard if you want an email to get from point a to b. (none of which are going to be REQUIRED in the RFC)

5o1ecist|16 days ago

That's not what the word "should" means, though.

"Should" is a lot closer to "better do it this way" than "you must do it this way". While "must" implies a mandatory-ness, "should" does not.

Or take it from perplexity:

"Must" normally expresses strong obligation/necessity: something is required, with little or no choice.

"Should" is softer and usually expresses recommendation, expectation, or what is right/appropriate, not a strict requirement.

Aldipower|17 days ago

It isn't a requirement. SHOULD is conditional. MUST is _not_ conditional.

Sure, you can argue, if you require that the email reach their destination, it is required to set this. ;-)

But I am totally with the OP here. SHOULD was never a requirement, just a recommendation that is maybe better to follow.

croes|16 days ago

>It means that you have to do it unless you know some specific reason that the requirement doesn't apply in your case.

But that means a valid reason could exist and Google would block those mails too.

SecretDreams|17 days ago

Should = internal target

Must = external requirement

I cannot fathom how you think should* would act as a requirement in any sense of the word.

gerdesj|17 days ago

SHOULD is not MUST

ale42|17 days ago

The official definition of SHOULD per RFC2119:

  3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
     may exist valid reasons in particular circumstances to ignore a
     particular item, but the full implications must be understood and
     carefully weighed before choosing a different course.

Not sure how the people at Google interpreted this about the message-id

citrin_ru|17 days ago

You can argue that you are not obligated to use Message-ID, but if you don't use it you have only yourself to blame when your messages are not accepted. In requiring Message-ID I would side with Google (though in general I think their anti-spam is too aggressive and lacks ways to report false positives). Full RFC compliance (as in not only MUST but also SHOULD, unless you have a very good reason) is the easiest part of making sure your emails will be delivered.

eli|17 days ago

You assume that internet standards are prescriptivist; that the document describes how it is to be implemented. In practice it's often descriptivist, with the standards documents playing catch-up with how things are actually going in practice.

Anyway, in general you can expect that doing unusual but technically valid things with email headers will very often get your messages rejected or filtered as spam.

Juliate|17 days ago

For producers, ignoring a SHOULD is riskier because it shifts the burden to every consumer.

For consumers, ignoring a SHOULD mostly affects their own robustness.

But here Google seems to understand it as a MUST... maybe the scale of spam is enough to justify it. Users are stuck between two parties that expect the other to behave.

jacquesm|17 days ago

Google interpreted it that way because it drives more people to use gmail.

ZWoz|17 days ago

My take, as a postmaster for a hosting company who has no sympathy for Gmail (that should be visible from my comment history): Message-ID is an absolute MUST in production e-mails. You can send your test stuff without it, but real messages always have it. Not having Message-IDs causes a lot of fun things. All somewhat competent software is capable of adding Message-IDs, so the lack of one is a good indication of a poorly made custom (usually spamming) solution.

Rspamd and SpamAssassin have a missing-MID check in their default rules; I am sure that most antispam software is the same.
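For readers who have not seen such a check, a rough sketch of the idea in Python follows. This is not the actual Rspamd or SpamAssassin rule, just an illustration with a hypothetical helper name (`missing_message_id`) built on the standard `email` parser.

```python
from email import message_from_string


def missing_message_id(raw: str) -> bool:
    """Return True when the raw message has no usable Message-ID header."""
    msg = message_from_string(raw)
    value = msg.get("Message-ID")  # header lookup is case-insensitive
    # Treat an absent header and an empty one the same way.
    return value is None or not value.strip()


without_id = "From: a@example.org\nTo: b@example.net\nSubject: hi\n\nbody\n"
with_id = "From: a@example.org\nMessage-ID: <1@example.org>\n\nbody\n"

print(missing_message_id(without_id))
print(missing_message_id(with_id))
```

A real filter would typically feed a result like this into a score together with many other signals rather than rejecting on it alone.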

mort96|17 days ago

Your casual use of the word MUST is not the same as a standard document's use of the word MUST. Your real world experience is entirely irrelevant to the conversation about what the standard requires.

stefan_|17 days ago

Why? If I'm writing a mail receiver, and I'm told there is some unique ID generated by the sender in a loosely specified way, the first thing I'm doing is ignoring that value forever. One lesson surely most everyone learns in CS is that unique identifiers are maybe unique to the system generating them, but to rely on foreign generated IDs being unique globally is a terrible idea that will break within the minute.

So at that point the ID has no value to me except being obliged to carry it around with the message, so maybe the originating system can at some point make sense of it. But then there is obviously no reason to ever reject mail without it, it's an ID valid for the sender and the sender didn't care to include one, great, we save on storage.

the_mitsuhiko|17 days ago

Exactly. Message-ID is not required.

An unrelated frustration of mine is that Message-ID really should not be overridden but SES for instance throws away your Message-ID and replaces it with another one :(

dathinab|17 days ago

It is de-facto required and has been for many years.

SHOULD in most RFCs also means "do it as long as you don't have a very good technical reason not to do it". Most of the time it's a "weak must". And in that case the only reason it isn't a MUST is backward compatibility with older mail systems not used for sending automated mails.

And it is documented if you read any larger mail provider's docs about "what to do so that my automated mails don't get misclassified as spam". Spam rejection is a whole additional non-standardized layer on top of the RFCs that anyone working with mail should be aware of. In any decades-old, non-centralized communication system without evergreen standards, having other "industry standard/de facto" but not "standardized" requirements is pretty normal, btw.

elAhmo|17 days ago

I would read this as a requirement for email to be 'legit' and not classified as spam.

Sure, you can send email with whatever headers you want, use weird combos, IP addresses, reply-to, and it might be still a technically valid email, but not something that should land in people's inboxes.

Also, a payment processor not testing their email on the most popular email provider in the world is quite ridiculous.

philipallstar|17 days ago

As indicated in the RFC, it uses another RFC[0] to define those words. Here's the relevant excerpt from that one:

    3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
                may exist valid reasons in particular circumstances to ignore a
                particular item, but the full implications must be understood and
                carefully weighed before choosing a different course.

[0] https://www.rfc-editor.org/rfc/rfc2119

dathinab|17 days ago

yeah, but that RFC isn't the only relevant document

Mail RFCs do not cover spam detection and malicious mail rejection at all, but it's a thing every large mail provider has and you really have to care about it when producing automated mails that all look similar. Large mail providers like Google tend to document what "baseline" of additional requirements they have for accepting (automated) mail. Having a Message-Id is in there, and in pretty much any larger mail provider's documentation about that topic. Tbh. I worked with mail a bunch of years ago, and the need for Message-Id was really no hidden gotcha back then but pretty well known.

And the design space mail provides is larger than any client could reasonably support (it's really a huge mess covering dozens of standards which allow all kinds of nonsense and hypothetical use cases that are practically unsupported), so you have to look at "what everyone does" anyway and only then make sure it's also RFC compatible, instead of starting with the RFC. That was a painful lesson to learn.

In addition there are some de-facto standards not pinned down in any RFC, like e.g.:

- Message-Id being required for any automated mails by many mail providers (though how bad the consequences are if you don't have it varies widely).

- You can't punycode encode the local part of an email address (it would be a different email), and there is no standard way (as far as I remember) to convert non us-ascii local parts to us-ascii. This is based on the fact that iff your mail server allows you to have non us-ascii local parts it should also support "internationalized mail" (SMTPUTF8 and co.). But it's a semi industry standard to also give the user with a Unicode local part the address with the punycode encoding of the local part, so it often "just works" and devs are frequently surprised when it fails to work...

- You can have quoted text in local part, like whitespaces. But most of the industry decided to not give users such mail addresses so you can see it as soft deprecated and using it is asking for trouble.

- Attachments. The MIME encoding allows a lot of different ways to put mails together and doesn't force a specific semantic interpretation by mail clients. As such, if you use it naively you might run into surprises in how/if your attachments or embedded resources are displayed. Though today embedded resources are often either not used or done with data URLs. Again, which ways work well and which don't is somewhat of an industry standard and not in any RFC.

- A lot of different ways to encode Unicode to us-ascii. If you produce any mails you probably should by default use the latest revision (where encoding is often just not needed as things are utf8), but might need a fallback with it often being fully unclear if . But if you are a client you probably have to support older versions. And in some parts of the world/business segments usage of very very old mail servers is a thing which is a major pain if you run into it.
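To illustrate the local-part point from the list above: punycode (IDNA) conversion applies to the domain only. A small sketch, assuming Python's built-in `idna` codec and a made-up helper name (`ascii_domain_only`); the address is an example, not a real mailbox.

```python
def ascii_domain_only(address: str) -> str:
    """IDNA-encode the domain of an address and leave the local part untouched."""
    # rpartition splits on the last "@", so a quoted local part
    # that itself contains "@" is not cut in the wrong place.
    local, _, domain = address.rpartition("@")
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


converted = ascii_domain_only("jörg@münchen.example")
print(converted)
# The local part is still non-ASCII, so delivering to this address
# still needs SMTPUTF8 support along the way.
print(converted.isascii())
```

The domain comes out as an `xn--` label while the local part stays exactly as the user typed it, which is why an address with a Unicode local part cannot simply be "downgraded" to ASCII.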

So pointing out that something isn't strictly required by the mail RFCs is kinda pointless, as even many things that are explicitly allowed won't work well in practice.

As a rule of thumb: if you can afford it, test your system with all widely used mail providers as if you were an external customer. And redo the tests yearly.

Oh, and as a bonus: if your mails look too similar to known phishing mails, they will also just disappear. That seems irrelevant, but e.g. if you use Keycloak with the default mail templates for password reset and co., there is a high chance of your mails ending up in spam or not even being delivered, as scammers have used Keycloak for their means, too. And that isn't just the case for Keycloak but for any "decently widely used open source software producing mails and having default templates". So you pretty much always need to change the default templates (you will do so anyway for branding, but skipping it in the earliest stages of a startup, where branding might still be in flux, isn't that uncommon either).

b00ty4breakfast|17 days ago

I know you're looking for "pedant points" but the specification generally takes a backseat to implementation. If Message-ID is expected out here where the rubber meets the road, then you are the squeaky wheel in this scenario for not including it.

bossyTeacher|17 days ago

> the specification generally takes a backseat to implementation.

And we should be raising hell for it. It should never happen. Using your popularity to violate protocol should not be tolerated.

OJFord|17 days ago

The only messages I receive without one are spam/phishing. I check because they're not recognised by notmuch, so I don't see them otherwise.

s17n|17 days ago

The reason that European tech sucks is that people in Europe are open to such arguments. If an engineer in the US started talking about SHOULD vs MUST, some PM would just give them that "what the fuck did I just listen to" face, spend the next few minutes gently trying to convince them that the customer experience matters more than the spec, and if they fail, escalate and get the decision they want.

For example, why does Google handle this differently for consumer and enterprise accounts? Well it's Google so the answer could always just be "they are disorganized" but there's a good chance that in both cases, it was the pragmatic choice given the slightly different priorities of these types of customers.

youknownothing|17 days ago

Not my PM (in the US). My PM would try to avoid anything that is not absolutely necessary and therefore ask developers not to develop anything that isn't a MUST. I know that we like making fun of Europe for their alleged lack of innovation but this isn't a Europe thing.

shaan7|17 days ago

Well the current US Administration would agree - the law doesn't matter, we need to be "pragmatic" and do what we think is right. Rules be damned.

Once you deviate a bit from the standard, you're down a slippery slope. It's not that difficult to use pragmatism to justify wrongdoing.

patrickmcnamara|17 days ago

Do bugs and bad implementations not exist in US software? If a US company did this, nobody would be bloviating about how it is a cultural issue or whatever.

bmn__|16 days ago

> some PM would just give them that "what the fuck did I just listen to" face

Some people have become way too comfortable taking for granted that it is okay to treat others in an uncivil fashion. To those I say: keep it up, and we shall revert to the fundamentals of society where violence is an option and one day you are copping punches in the face, or getting shackled and thrown in a ditch somewhere for the ravens. Behave like an animal, get treated like one.

mrweasel|17 days ago

Standards are important, and the meanings assigned to words are important, but if your very important email doesn't go through because Google thinks you're wrong, you add the bloody message-id. I really do agree: I don't care about the linguistic/legal/standard/technological reason as to why Google might be wrong; if I can't deliver email to Gsuite customers, I add the message-id.

You're precisely right that customer experience matters, but I wouldn't put it past some conservative European company to go: well, Google is wrong, so they should fix that. Google doesn't care, you can't make them care, you can't even contact them. Just make it work for your customer.

quadrifoliate|17 days ago

Seriously, I hope that none of the posters upthread arguing about SHOULD v/s MUST in some standards body document are in charge of real world money making software.

Google and Microsoft's email practices define a pseudo-RFC in practice. As an engineer, I hate this. As a civic participant I can vote against it. But as a person that sells my software services for a living, I am going to implement the Google/Microsoft standards to the letter, not argue about definitions in an RFC.

hermannj314|17 days ago

You SHOULD follow the wording of the RFC, you MUST follow Google's interpretation of the RFC.

That is the difference.

redeeman|17 days ago

evidently they must not

thatha7777|17 days ago

And the definition of "SHOULD" (from RFC 2119) is "This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course."

Having said that, I regret my original characterization of the Message-ID header as a "requirement" and have updated the blogpost to be fair to all sides.

Thank you for bringing this up.

deepsun|17 days ago

GMail SHOULD handle your messages, not MUST.

jiggawatts|17 days ago

The HTTP User-Agent header is also optional, but if you omit it, something like half of all endpoints will respond with a 500 error code.
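A minimal sketch of setting the header explicitly with Python's standard `urllib`, so the request does not depend on what an endpoint does with a missing User-Agent. The helper name `build_request`, the agent string `example-client/1.0` and the URL are placeholders; no request is actually sent here.

```python
from urllib.request import Request


def build_request(url: str, user_agent: str = "example-client/1.0") -> Request:
    """Build a request object that always carries an explicit User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


req = build_request("https://example.org/")
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```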

dathinab|17 days ago

1. SHOULD means: do it if you can; you have to have a really good reason not to. The only reason it's SHOULD and not MUST is backward compatibility, mostly in the context of personally sent mail, i.e. not automated mail. (For automated mail sending, the expectation that you run somewhat up-to-date software is much higher.)

2. You can't really implement mail stuff just based on RFCs:

- There are dozens of overlapping RFCs which can sometimes influence each other, and many of them obsolete older versions while other still-relevant RFCs reference those older versions. This makes it hard to even know what is actually a requirement and what is a recommendation.

- Then you have a lot of "irrelevant" parts, which were standardized but are hardly supported, if at all. You probably should somewhat support them as a recipient but should never produce them as a sender today (mostly stuff related to the pre-"everything is utf8" days). In general, the ideas of "how mail should probably work" in old RFCs and "how it is done IRL today" are in some aspects _very_ far apart.

- Lastly, RFCs are not sufficient by themselves. They don't cover large parts of the system for "spam detection/suspicious mail rejection". So it's a must to go to the support pages of all large mail providers and read through what they expect of mails. And "automated mails need a message id" is a pretty common requirement. In addition you have to e.g. make sure the domain you use isn't blacklisted (e.g. due to the behavior of a previous user), and that your servers' IP addresses aren't blacklisted (they should never be blacklisted long term, but it happens anyway, and e.g. MS has, based on very questionable excuses, "conveniently" blacklisted smaller local data center competition while also being one of the most widely used providers for commercial mail in that area).

tlogan|17 days ago

SHOULD = You are strongly recommended to do this, but it’s not absolutely required.

- In most cases, you are expected to follow it.

- You can choose not to follow it, but you must have a very good reason.

For example, RFC 7231 says that there should be a Date header, but some embedded devices have no real-time clock, so it is OK not to implement it.

layer8|17 days ago

The reason it’s recommended is that it’s useful for detecting when an email you receive is already in your mailbox, so that you don’t accumulate duplicates. Otherwise one would have to compare the complete email, which probably no MUA does. Another reason is that replies can include a reference to the original message, so that it can be properly threaded by MUAs.

So these are mostly quality-of-life reasons, it’s not a reason to reject an email.
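As a rough illustration of the duplicate-detection use, here is a sketch in Python that keeps one copy per Message-ID. The function name `dedupe_by_message_id` is hypothetical, and real MUAs may well use additional heuristics; messages without an ID are simply kept, since there is nothing cheap to compare them by.

```python
from email import message_from_string


def dedupe_by_message_id(raw_messages):
    """Keep the first message seen for each Message-ID."""
    seen = set()
    unique = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        mid = msg.get("Message-ID")
        if mid is not None:
            mid = mid.strip()
            if mid in seen:
                continue  # already in the mailbox, skip the duplicate
            seen.add(mid)
        # No Message-ID: cannot be matched cheaply, so keep every copy.
        unique.append(msg)
    return unique


with_id = "Message-ID: <1@example.org>\nSubject: hello\n\nbody\n"
no_id = "Subject: no id here\n\nbody\n"
mailbox = dedupe_by_message_id([with_id, with_id, no_id, no_id])
print(len(mailbox))
```

Threading works along the same lines: a reply carries the original's ID in its `In-Reply-To` (and `References`) header, so a client can look up the parent by that value.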

PunchyHamster|17 days ago

> That says SHOULD, not MUST, so how is it a requirement?

The battle against spam has for a long time largely been about trying to algorithmically fingerprint the scam bots and rejecting a message if it looks like it wasn't sent by a "real" mail server/client.

So a lot of things that are optional, like SPF/DKIM, are basically "implement this or else your mail has a good chance of being put into spam automatically".

thatha7777|17 days ago

You're totally right. I've updated the blog to reflect this. Thank you!

zokier|17 days ago

Also email as a protocol (SMTP) predates RFC5322 by 25 years or so.

torlavd|17 days ago

Standard RFC naming, optional field.

zoobab|17 days ago

Avoid SHALL, SHOULD and all other crap, use Elon MUST.

roysting|17 days ago

SHALL has been interpreted/clarified by US courts as not being the fancy MUST or REQUIRED that many people were taught it means. SHOULD still has its purposes, though, e.g. to provide contractual flexibility in development: if a MUST/REQUIRED requirement turns out to be more challenging or complicated and takes up more time/resources than anticipated, SHOULDs can be trimmed as a contingency.

Another example may be a lightweight implementation of a spec in a limited and/or narrow environment, which remains technically compliant with full implementations of a spec but interaction with such a limited/narrow environment comes with awareness about such limitations.