How (not) to sign a JSON object (2019)

codeflo|1 year ago

I have a feeling that idea 2 is a recipe for disaster:

> Add the tag and the exact string you signed to the object, validate the signature and then validate that the JSON object is the same as the one you got.

In cryptographic practice, redundant information usually spells disaster, because inevitably, someone will use the copy that wasn't verified.

But let's dig into it. If I understand this correctly, the suggestion is to have something like this:

    {
        "object": "with",
        "some": "properties",

        "signatureInfo": {
            "signedString": "{\"object\":\"with\",\"some\":\"properties\"}",
            "signature": "... base64-encoded-signature ..."
        }
    }

It's mentioned that "the downside is your messages are about twice the size that they need to be". In my opinion, this scheme is pointless. To verify "the JSON object is the same as the one you got", you have to do what?

1. Parse the outer object as JSON, extract and remove signatureInfo.

2. Verify the signature.

3. Parse the signedString as JSON.

4. Verify that the object you got in step 1 is equal to the object you got in step 3, using some kind of deep equality.
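
To make the four steps concrete, here is a minimal Python sketch. It assumes an HMAC-SHA256 tag and a shared key purely for illustration (the scheme could equally use an asymmetric signature); the key and the function name `verify_redundant` are hypothetical.

```python
import base64
import hashlib
import hmac
import json

KEY = b"demo-shared-key"  # hypothetical key, for illustration only


def verify_redundant(raw: bytes) -> dict:
    # Step 1: parse the outer object, then extract and remove signatureInfo.
    outer = json.loads(raw)
    info = outer.pop("signatureInfo")
    signed = info["signedString"].encode("utf-8")

    # Step 2: verify the tag over the exact signed bytes (constant-time compare).
    expected = hmac.new(KEY, signed, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, base64.b64decode(info["signature"])):
        raise ValueError("bad signature")

    # Step 3: parse the signed string as JSON.
    inner = json.loads(signed)

    # Step 4: deep equality. Python's == on dicts ignores key order; note that
    # json.loads silently keeps the last of any duplicate keys, which is the
    # kind of parser quirk a careless comparison inherits.
    if outer != inner:
        raise ValueError("outer object does not match signed string")
    return inner
```

Even in this small form, step 4 only exists to police the redundant copy.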

First of all, this is error prone, and as underspecified as JSON is, there are potential exploits if the comparison isn't done carefully. But even worse, if you think about it, the outer JSON is entirely useless, since you need to parse the inner JSON anyway -- so why not just use it directly?

It seems to me that this suggestion is strictly worse than just sending the inner part:

    {
        "signedString": "{\"object\":\"with\",\"some\":\"properties\"}",
        "signature": "... base64-encoded-signature ..."
    }

Yes, it's no longer "in-band", but I don't think it was really in-band before, it was just out-of-band with an outer layer of redundant information.
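
A sketch of that simpler wrapper, under the same illustrative assumptions (HMAC-SHA256, a hypothetical shared key, and made-up function names `wrap` and `unwrap`). Nothing inside the payload is parsed until the tag over the exact string has been checked.

```python
import base64
import hashlib
import hmac
import json

KEY = b"demo-shared-key"  # hypothetical key, for illustration only


def wrap(obj) -> str:
    # Serialize once; these exact characters are what gets authenticated.
    signed = json.dumps(obj, separators=(",", ":"))
    tag = hmac.new(KEY, signed.encode("utf-8"), hashlib.sha256).digest()
    return json.dumps({
        "signedString": signed,
        "signature": base64.b64encode(tag).decode("ascii"),
    })


def unwrap(message: str):
    envelope = json.loads(message)
    signed = envelope["signedString"]
    expected = hmac.new(KEY, signed.encode("utf-8"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, base64.b64decode(envelope["signature"])):
        raise ValueError("bad signature")
    # Only now is the inner JSON parsed and handed to the caller.
    return json.loads(signed)
```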

WorldMaker|1 year ago

This sort of suggestion is for the worst case where you already have brownfield consumers that read the outer fields and don't care about the signature, and you need to add the signature without breaking those consumers.

The redundancy is absolutely a recipe for disaster, but so is the part where you have brownfield consumers that you can't break and know that they also don't care about message security.

Unfortunately, it's an all too common brownfield to find yourself stepping into, which is why "inline JSON signatures" (or the equivalent in other document languages like XML) that don't change the outer shape of the document, and so don't break backwards compatibility with dumber consumers, are such a common ask.

Also, unfortunately the most correct answer in cryptographic practice is also often the hardest to sell to those consumers (or to business people prioritizing changes to them): break those consumers and force them to care about security so that a rising tide lifts all boats.

hinkley|1 year ago

Yep. I had a dickens of a time making XML-DSIG secure. I don’t know how they didn’t realize that getElementById returns the first element with an id and doesn’t give a shit if there are multiples. If you choose a different parent element you can get different results. I had to roll my own that threw an error on duplicate IDs and rejected the document.
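
A rough Python sketch of that kind of check (not the original code). The function name `find_unique_id` and the `Id` attribute name are illustrative assumptions, and a real deployment would also need a parser hardened against hostile XML.

```python
import xml.etree.ElementTree as ET


def find_unique_id(xml_text: str, wanted: str, attr: str = "Id") -> ET.Element:
    """Return the element whose id attribute equals `wanted`.

    Rejects the whole document if any id value occurs more than once,
    instead of quietly returning the first match.
    """
    root = ET.fromstring(xml_text)
    seen = {}
    for elem in root.iter():
        elem_id = elem.get(attr)
        if elem_id is None:
            continue
        if elem_id in seen:
            raise ValueError(f"duplicate {attr} {elem_id!r}: rejecting document")
        seen[elem_id] = elem
    if wanted not in seen:
        raise KeyError(wanted)
    return seen[wanted]
```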

If you only expect one signature I would recommend you wrap the signed content instead of treating it as a sibling. And even if you have multiple, maybe have signatures be siblings but put them all in the same wrapper. This means the recipient has to know signatures exist but honestly tough shit. If you’re adding sigs you’re going to end up expecting them and that’s a fact not an opinion. You don’t want any tools that ignore the signature and make decisions without validating them first. That’s a Confused Deputy attack waiting to happen.

Also in XML you have to canonicalize the document first, so that any formatting changes don’t invalidate the signature. So a couple other parts of what you said are true but there are solutions, even if annoying ones.

treyd|1 year ago

This is one of the deeply dissatisfying parts of the Matrix spec. They didn't have any constraints forcing them to embed signatures within the json objects, but they elected to invent their own signing scheme and do it anyways, despite being a greenfield. It also includes support for a special "unsigned" portion for extra data that comes along (which is often used for the server to inject the age of an event).

I don't think the protocol still injects the signature into event structures, though this weird "unsigned" field is still there, judging by the source JSON for a message I sent today. It's also possible the signature is removed after processing and Fluffychat is just stripping it.

hinkley|1 year ago

I got so much shit for building an API that would not answer any queries about the signed documents until the signature had been verified. Trying to speed up processing and routing by making decisions before the authenticity of the data has been verified is a fool’s errand and false economy. You can’t make decisions based on what might be lies, and malicious ones at that. I spent a lot of time making the signature checks faster rather than buckling and making the signatures a joke.
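
The shape of such an API can be sketched in a few lines of Python. This is an illustration under assumed details (HMAC-SHA256, and a hypothetical `SignedDocument` class), not the API described above: every query fails until verification has succeeded, and the body is not even parsed before then.

```python
import hashlib
import hmac
import json


class SignedDocument:
    def __init__(self, body: bytes, tag: bytes):
        self._body = body
        self._tag = tag
        self._parsed = None  # stays None until verify() succeeds

    def verify(self, key: bytes) -> None:
        expected = hmac.new(key, self._body, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, self._tag):
            raise ValueError("bad signature")
        # Parsing happens only after the bytes are known to be authentic.
        self._parsed = json.loads(self._body)

    def get(self, field: str):
        if self._parsed is None:
            raise PermissionError("document has not been verified")
        return self._parsed[field]
```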

DarkUranium|1 year ago

Not sure if I'm just misunderstanding the article or not, but it feels like an overengineered solution, reminiscent of SAML's replacement instructions (just a hardcoded and admittedly way better option, but still in a similar vein of "text replacement hacks").

I know it's not the most elegant thing ever, but if it needs to be JSON at the post-signing level, why not just something like `["75cj8hgmRg+v8AQq3OvTDaf8pEWEOelNHP2x99yiu3Y","{\"foo\":\"bar\"}"]`, in other words, encode the JSON being signed as a string. This would then ensure that, even if the "outer" JSON is parsed and re-encoded, the string is unmodified. It'll even survive weird parsing and re-encoding, which the regex replacement option might not (unless it's tolerant of whitespace changes).

(or, for the extra paranoid: encode the latter to base64 first and then as a string, yielding something like `["75cj8hgmRg+v8AQq3OvTDaf8pEWEOelNHP2x99yiu3Y","eyJmb28iOiJiYXIifQ"]` --- this way, it doesn't look like JSON anymore, for any parsers that try to be too smart)

If the outer needs to be an object (as opposed to array), this is also trivially adapted, of course: `{"hmac":"75cj8hgmRg+v8AQq3OvTDaf8pEWEOelNHP2x99yiu3Y","json":"{\"foo\":\"bar\"}"}`.
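
A quick Python check of the "survives re-encoding" claim, as a sketch only: the key is hypothetical, HMAC-SHA256 is assumed, and the tag is base64-encoded here rather than matching the example strings above.

```python
import base64
import hashlib
import hmac
import json

KEY = b"demo-shared-key"  # hypothetical key, for illustration only

payload = '{"foo":"bar"}'
tag = base64.b64encode(
    hmac.new(KEY, payload.encode("utf-8"), hashlib.sha256).digest()
).decode("ascii")
envelope = json.dumps({"hmac": tag, "json": payload})

# An intermediary parses the outer JSON and re-encodes it with
# different whitespace and sorted keys.
reencoded = json.dumps(json.loads(envelope), indent=4, sort_keys=True)

# The receiver still recovers the inner string unchanged and can verify it.
received = json.loads(reencoded)
inner_intact = received["json"] == payload
check = hmac.new(KEY, received["json"].encode("utf-8"), hashlib.sha256).digest()
tag_ok = hmac.compare_digest(check, base64.b64decode(received["hmac"]))
```

The outer text changes, but the decoded inner string and its tag do not.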

theamk|1 year ago

You can, and this will be simple and reliable, but that's solving a different (and easier) problem than the post. In the post, the author wants to still have parsable JSON _and_ a signature. Think of middleware which can check the signature but cannot alter the contents, followed by a backend expecting nice JSON. Or a logging middleware which looks at individual fields. Or a load balancer which checks the "user" and "project" fields. Or a WAF checking for the right fields. In other words:

> Anyone who cares about validating the signature can, and anyone who cares that the JSON object has a particular structure doesn’t break (because the blob is still JSON and it still has the data it’s supposed to have in all the familiar places).

As the author mentions, you can compromise by having "hmac", "json" and "user" (for routing purposes only), but this will increase the overall size. This is approach 2 in the blog post.

spankalee|1 year ago

That's no different than the suggestion at the beginning of the article to serialize the JSON and sign the string.

Someone|1 year ago

> in other words, encode the JSON being signed as a string. This would then ensure that, even if the "outer" JSON is parsed and re-encoded, the string is unmodified. It'll even survive weird parsing and re-encoding, which the regex replacement option might not (unless it's tolerant of whitespace changes).

Would it be guaranteed to survive even standard parsing?

It wouldn’t surprise me at all, for example, if there are JSON parsers out there that, on reading, map "\u0009" and "\t" to the same string, so that they can only round-trip one of those strings. Similarly, there’s the pair of "\uabcd" and "\uABCD". There probably are others.
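
Python's standard `json` module is one concrete case, shown here as a small experiment rather than a statement about parsers in general:

```python
import json

# Two different byte spellings of the same one-character string (a tab).
tab_from_unicode_escape = json.loads(r'"\u0009"')
tab_from_short_escape = json.loads(r'"\t"')
same_value = tab_from_unicode_escape == tab_from_short_escape == "\t"

# Hex digits in \u escapes are case-insensitive when decoding.
same_case_insensitive = json.loads(r'"\uabcd"') == json.loads(r'"\uABCD"')

# Re-encoding picks one spelling, so the original bytes are not recoverable.
reencoded = json.dumps(tab_from_unicode_escape)
```

So the decoded value of the inner string is stable here, and a tag computed over that decoded value would still verify; what is lost is the exact byte sequence of the outer document.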

mbreese|1 year ago

There are many ways to represent the JSON as binary… and all are equally valid. The easiest case to think about is with and without whitespace. Because what HMAC cares about are the byte[] values, not alphanumeric tokens.
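
A small Python illustration of the whitespace case, assuming HMAC-SHA256 and a hypothetical key: the two encodings parse to the same object but produce different tags.

```python
import hashlib
import hmac
import json

KEY = b"demo-shared-key"  # hypothetical key, for illustration only

obj = {"foo": "bar"}
compact = json.dumps(obj, separators=(",", ":")).encode("utf-8")  # b'{"foo":"bar"}'
spaced = json.dumps(obj).encode("utf-8")                          # b'{"foo": "bar"}'

# The MAC is computed over bytes, so one extra space changes the tag.
tag_compact = hmac.new(KEY, compact, hashlib.sha256).hexdigest()
tag_spaced = hmac.new(KEY, spaced, hashlib.sha256).hexdigest()
```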

Then, if you couple this with sending data through a proxy (maybe invisible to the developers), which may or may not alter that text representation, you end up with a mess. If you base64 encode the JSON, you now lose any benefit you might gain from those intermediate proxies, as they can’t read the payload…

38|1 year ago

JSON encoded as a string is cursed; no one should do that, and people should stop suggesting it. Base64 is fine, or even Ascii85.

askvictor|1 year ago

Another problem with signing JSON: you can have two different json objects that mean the same thing, and will do exactly the same thing in your code e.g. {"a": "foo", "b": "bar"} vs {"b": "bar", "a": "foo"}. Also, whitespace. Are there any standards for normalising json, so that two equivalent, but differently written JSON files will have the same signature?
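
For illustration, the simplest form of such normalisation in Python is to sort keys and strip insignificant whitespace before hashing or signing. The helper name `canonical_bytes` is made up, and this is only a sketch: it does not settle number formatting, duplicate keys, or Unicode normalisation, which a real canonicalisation scheme has to pin down.

```python
import json


def canonical_bytes(obj) -> bytes:
    # Sorted keys, no insignificant whitespace, UTF-8 output.
    return json.dumps(
        obj,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")


first = canonical_bytes(json.loads('{"a": "foo", "b": "bar"}'))
second = canonical_bytes(json.loads('{"b": "bar",   "a": "foo"}'))
```

Both inputs end up as the same bytes, so they would get the same signature.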

busymom0|1 year ago

This is why, in one of my projects, I first stringified the JSON using the built-in JSON.stringify(your_json) function, then signed that string and sent the string, its signature, and the public key to the server. The server verifies the signature against the string and, if it passes, uses JSON.parse(your_string) to get the original JSON.

lxgr|1 year ago

That's canonicalization, and the article does mention it (but unfortunately does not offer much insight other than that it's hard).

afiori|1 year ago

Honestly, it should not really matter. The regex bait-and-switch solution seems like the most practical one; there is some trickery in checking that the magic key does not already appear in the string, but that seems far easier.
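
A sketch of how that could look in Python. Everything here is an assumption made for illustration rather than the article's exact scheme: the magic key `__hmac__`, the all-zero placeholder, HMAC-SHA256 with a hypothetical key, and a verifier that only accepts the compact `"key":"value"` spelling.

```python
import hashlib
import hmac
import json
import re

KEY = b"demo-shared-key"  # hypothetical key, for illustration only
TAG_KEY = "__hmac__"      # hypothetical magic key
PLACEHOLDER = "0" * 64
SLOT = f'"{TAG_KEY}":"{PLACEHOLDER}"'
TAG_RE = re.compile(r'"__hmac__":"([0-9a-f]{64})"')


def sign(obj: dict) -> str:
    if TAG_KEY in obj:
        raise ValueError("magic key already present in payload")
    text = json.dumps({**obj, TAG_KEY: PLACEHOLDER}, separators=(",", ":"))
    # The slot must be unambiguous, e.g. no nested object reusing the key.
    if len(TAG_RE.findall(text)) != 1:
        raise ValueError("magic key already present in payload")
    tag = hmac.new(KEY, text.encode("utf-8"), hashlib.sha256).hexdigest()
    return text.replace(SLOT, f'"{TAG_KEY}":"{tag}"')


def verify(text: str) -> dict:
    tags = TAG_RE.findall(text)
    if len(tags) != 1:
        raise ValueError("expected exactly one tag")
    # Swap the tag back for the placeholder to recover the signed bytes.
    unsigned = TAG_RE.sub(SLOT, text)
    expected = hmac.new(KEY, unsigned.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tags[0]):
        raise ValueError("bad signature")
    obj = json.loads(text)
    del obj[TAG_KEY]
    return obj
```

The signed message is still an ordinary JSON object with one extra field, but it only verifies if nothing re-encodes it in transit.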

lxgr|1 year ago

> Unless you have a good reason why you need an (asymmetric) signature, you want a MAC.

Is "I want the server/validating side to be safe even against server-side attackers with read-only permissions" not a good reason? Because that's one thing that asymmetric signatures provide out of the box compared to MACs.

monocasa|1 year ago

I really wish that XMLDSig wasn't such an awful standard that it turned a good third of the security industry against canonicalization in general.

Saying "sure, there are lots of ways to serialize, but these specific rules get you the same octets and you sign that" is key to sanity in such situations.

For all of ASN.1's many sins, they got that part absolutely right.

saurik|1 year ago

Supposedly, this is the non-code documentation for AWS Version 3 signing.

https://docs.aws.amazon.com/amazonswf/latest/developerguide/...

Zamicol|1 year ago

We addressed these concerns while developing Coze, a cryptographic JSON messaging specification. The specification details how we chose to address these concerns.

https://github.com/Cyphrme/Coze

er4hn|1 year ago

I saw from the main page that you are aware of COSE (RFC 8152), with its super similar name, but I didn't see anything in https://github.com/Cyphrme/CozeX/blob/master/coze_vs.md comparing Coze to it or to CBOR.

Is the improvement COZE has over COSE that the body is human readable by default, whereas with COSE it's in some machine format that needs a reader util?

j-krieger|1 year ago

Very cool! It shares its name with "COSE" (RFC 8152), a signing scheme for CBOR objects :)

dang|1 year ago

Discussed at the time:

How not to sign a JSON object - https://news.ycombinator.com/item?id=20516489 - July 2019 (151 comments)

hughes|1 year ago

What does JSON object signing provide that TLS doesn't?

Does this imply that the application doesn't trust the transport/presentation layers?

hinkley|1 year ago

Caching or other forms of retransmission of the data.

Not all signed content is meant to be confidential. Or two-party confidential. Think about tokens. You have a refresh token that’s private between you and the destination, but you hand out session tokens to your users so they can talk to the destination directly. Or via another server that doesn’t have a cache coherency with the source.

magicalhippo|1 year ago

Our program has to sign XML documents so that the recipient can be certain a specific user signed it, as they're considered legally binding.

The documents are transmitted via a relaying party, as we don't have support for the protocol the recipient requires.

Similar cases could pop up in JSON-land, I imagine.

tgsovlerkhgsel|1 year ago

"it’s OK to sign the exact byte sequence."

Not just "OK". It's the only sane way to do it.

curtisszmania|1 year ago

[deleted]

Muromec|1 year ago

Or just use asn1 like normal people.

42 comments