Representing SHA-256 Hashes as Avatars

[+] nemo1618|5 years ago|reply

The problem with hash avatars in general is that people want to use them for identity verification -- and humans are wired to do so automatically -- but technologically, they cannot provide this. The space of possible avatars (2^256, in this case) is far, far larger than the number of distinct objects that humans can distinguish between. Which means that there will invariably be "collisions:" two avatars that are not identical, but appear identical to humans. As a result, if an attacker can brute-force an avatar that looks very similar to, say, Elon Musk's avatar, they can trivially scam people.

It follows that, since avatars do not provide any proof of identity, there is actually no harm in greatly truncating the hash space when generating them! That is, rather than trying to encode all 256 bits into the avatar, you can use a much more manageable number, like 16. But isn't this too small? Won't there be lots of collisions? Yes -- but that's a feature! If collisions are common, then the average user will be aware that avatar != identity, which makes them less susceptible to scamming. But 16 bits is still enough to meet the real goal of avatars: quickly distinguishing between different people in a conversation (or transaction, or whatever).
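To make the truncation idea concrete, here is a minimal sketch. It assumes the avatar is driven by the top 16 bits of the SHA-256 digest; the function names and the split into hue, shape, and pattern are hypothetical, chosen only for illustration.

```python
import hashlib


def avatar_bits(identifier: str, bits: int = 16) -> int:
    """Return the top `bits` bits of SHA-256(identifier) as an integer."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    return value >> (256 - bits)


def avatar_params(identifier: str) -> dict:
    """Split the 16 retained bits into hypothetical drawing parameters."""
    v = avatar_bits(identifier, 16)
    return {
        "hue": (v >> 8) & 0xFF,   # 8 bits: one of 256 hues
        "shape": (v >> 4) & 0xF,  # 4 bits: one of 16 shapes
        "pattern": v & 0xF,       # 4 bits: one of 16 patterns
    }
```

With only 65,536 possible avatars, collisions show up quickly in any large population, which is the point being argued here.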

(This also shows why making avatars more costly to generate, e.g. with scrypt, can do more harm than good: doing so makes collisions less likely, but still not impossible. Meaning that if a collision does occur, whether accidental or malicious, you are less likely to notice it.)

[+] andromeduck|5 years ago|reply

You might get more mileage if the avatars are unique to the user viewing them rather than identical between users. If the nonce/salt used in generation is itself secure, then it'd be prohibitively difficult for adversaries to force a collision without obvious detection, doubly so in communities.
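One way to sketch per-viewer avatars is to key the hash with a secret that never leaves the viewer's device. This assumes HMAC-SHA256 as the keyed hash, which is a choice made for this example rather than something specified in the thread; the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_viewer_secret() -> bytes:
    """Generate a random 32-byte secret, stored only on the viewer's side."""
    return secrets.token_bytes(32)


def viewer_avatar_seed(viewer_secret: bytes, subject_id: str) -> bytes:
    """Derive the bytes that drive the avatar this viewer sees for subject_id."""
    return hmac.new(viewer_secret, subject_id.encode("utf-8"), hashlib.sha256).digest()
```

An attacker who does not know the viewer's secret cannot predict how any identifier will render for that viewer, so a lookalike cannot be brute-forced offline. The trade-off is that two viewers can no longer compare avatars with each other.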

[+] FqOD4xih7Uq6m9Z|5 years ago|reply

There might not be 2^256 distinguishable objects but maybe someone can come up with 2^16 distinguishable objects and just string 16 of them together. If there is one character off in a string of 40 hexadecimal characters it is hard to notice but that would be easier to detect in a set of 16 symbols.
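As a rough sketch of the "string 16 of them together" idea, the digest can be cut into sixteen 16-bit chunks, each of which would index into a (hypothetical) set of 65,536 distinguishable symbols. Designing that symbol set is the hard part and is not attempted here.

```python
import hashlib


def digest_to_symbols(digest: bytes) -> list[int]:
    """Split a 32-byte digest into 16 symbol indices in the range 0..65535."""
    if len(digest) != 32:
        raise ValueError("expected a 32-byte (256-bit) digest")
    # Each pair of bytes becomes one big-endian 16-bit index.
    return [int.from_bytes(digest[i:i + 2], "big") for i in range(0, 32, 2)]
```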

[+] adzm|5 years ago|reply

On a related note, I've been experimenting with using a simple word list (like the EFF diceware list) to generate strings of words encoding data. Trickiest part is figuring out how to encode padding, and the eventual size of the word list, and how complicated the final solution should be (e.g. using word lists whose size is not a power of two, leftover bits, and all that). The diceware word list is nice since the words are not ambiguous and don't have homophones.

I assumed there would be existing implementations of something similar but have not found one that fits criteria other than some that use very small word lists. Diceware has 7776 words and pushing that to 8192 should be feasible and is a bit easier to work with.
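A minimal sketch of the power-of-two case: with 8192 words, each word carries exactly 13 bits, so a 256-bit hash needs 20 words (260 bits) with 4 zero bits of padding at the end. This version sidesteps the padding question by assuming the decoder already knows the byte length; the function names are hypothetical, and any real word list has to be supplied by the caller.

```python
def encode_words(data: bytes, wordlist: list[str]) -> list[str]:
    """Encode bytes as words; len(wordlist) must be a power of two."""
    bits_per_word = len(wordlist).bit_length() - 1
    if len(wordlist) < 2 or 1 << bits_per_word != len(wordlist):
        raise ValueError("word list size must be a power of two")
    nbits = len(data) * 8
    pad = (-nbits) % bits_per_word            # zero bits appended at the end
    value = int.from_bytes(data, "big") << pad
    nwords = (nbits + pad) // bits_per_word
    mask = len(wordlist) - 1
    return [
        wordlist[(value >> (bits_per_word * (nwords - 1 - i))) & mask]
        for i in range(nwords)
    ]


def decode_words(words: list[str], wordlist: list[str], nbytes: int) -> bytes:
    """Invert encode_words, given the original length in bytes."""
    index = {w: i for i, w in enumerate(wordlist)}
    bits_per_word = len(wordlist).bit_length() - 1
    value = 0
    for w in words:
        value = (value << bits_per_word) | index[w]
    pad = len(words) * bits_per_word - nbytes * 8
    return (value >> pad).to_bytes(nbytes, "big")
```

A list whose size is not a power of two (such as the 7776-word diceware list) would need base conversion instead of bit slicing, which this sketch does not cover.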

[+] timClicks|5 years ago|reply

There's no need to distinguish between every object at every comparison. In most applications, you'll only be comparing a few dozen avatars with each other.

[+] sva_|5 years ago|reply

> The space of possible avatars (2^256, in this case) is far, far larger than the number of distinct objects that humans can distinguish between.

That sounds intriguing to me. Are you aware of any research into this?

[+] raphlinus|5 years ago|reply

I still like snowflakes for this: https://levien.com/snowflake-explain.html is a half-finished blog post explaining the motivation and algorithm I came up with. I never did careful user testing, but suspect that the answer would be that some people can reliably distinguish the patterns, others won't be able to.

In any case, there are a lot of variations on this "visual hash" idea, including the original fractal one, and I heard of more recent work to use the hash to seed StyleGAN face generation.

[+] ianopolous|5 years ago|reply

This is a great idea! Trying random ones, I couldn't find two I thought looked confusable.

[+] kop316|5 years ago|reply

As a warning, this would not be good for colorblind people (such as myself).

In the "Hello, Hacker News!" hash, half of the middle ring looks identical to me, and unless I look carefully, the entire ring looks the same.

[+] franky47|5 years ago|reply

What would you suggest as a solution? I considered swapping Hue for Lightness in order to increase contrast changes. Would you be interested in testing out some variants?

[+] sm4rk0|5 years ago|reply

Strange that neither the article nor the comments mention https://gravatar.com/

It hashes the user's email http://en.gravatar.com/site/implement/hash/ and creates an "identicon" from the hash http://scott.sherrillmix.com/blog/blogger/wp_identicon/ or loads a user-defined image.
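For reference, the classic Gravatar scheme can be sketched as follows: trim and lowercase the email, take its MD5 hex digest, and build the image URL from it. The `d=identicon` and `s` query parameters select the fallback image and the size; the function name is made up for this example.

```python
import hashlib


def gravatar_url(email: str, default: str = "identicon", size: int = 80) -> str:
    """Build a Gravatar image URL from an email address."""
    # Gravatar normalizes the address before hashing: trim, then lowercase.
    normalized = email.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return f"https://www.gravatar.com/avatar/{digest}?d={default}&s={size}"
```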

[+] toastal|5 years ago|reply

I'd recommend the open and compatible Libravatar over Gravatar

https://www.libravatar.org/

[+] leipert|5 years ago|reply

I really like the former article method over the gravatar identicon because the circular shape is not going to end up with „accidental swastikas“

[+] franky47|5 years ago|reply

I discovered robohash from Gravatar actually, but forgot to mention it, thanks for the reminder.

[+] zellyn|5 years ago|reply

Suggestion: shave off two bits, and switch between the variants in the "A bit of fun" section: https://francoisbest.com/posts/2021/hashvatars#a-bit-of-fun
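A hedged reading of "shave off two bits": reserve two bits of the hash to pick one of four layout variants and draw the rest of the avatar from the remaining 254 bits. The variant names below are placeholders, not the ones used in the article.

```python
import hashlib

# Placeholder names; the article's actual variants may be named differently.
VARIANTS = ("variant_a", "variant_b", "variant_c", "variant_d")


def split_variant(digest: bytes) -> tuple[str, int]:
    """Use the lowest 2 bits to choose a variant; return it with the other bits."""
    value = int.from_bytes(digest, "big")
    variant = VARIANTS[value & 0b11]  # 2 bits select one of 4 variants
    remaining = value >> 2            # 254 bits left for colours and segments
    return variant, remaining


variant, rest = split_variant(hashlib.sha256(b"Hello, Hacker News!").digest())
```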

[+] kevincox|5 years ago|reply

+1, this will create much more distinct results than subtle colour variations.

[+] franky47|5 years ago|reply

What do you mean by "shave off 2 bits"?

[+] milkey_mouse|5 years ago|reply

Urbit also developed a solution for turning a number into an avatar, although theirs only have 32 bits of entropy, and to be honest there are many that are difficult to tell apart:

https://urbit.org/blog/creating-sigils/

[+] mvolfik|5 years ago|reply

These are pretty. Do you have any idea if there is a way to use the library with data (hashes) other than the Urbit 'names', or whatever they are?

[+] joshbuddy|5 years ago|reply

You should check out this paper where they tested different representations on humans to see what they could tell apart, and came up with a novel representation called Moji.

https://exascale.info/assets/pdf/students/MSc_Thesis_-_Micha...

[+] geoah|5 years ago|reply

One of the prettiest identicons I've seen.

Since it doesn't seem to be lossy, I was wondering if it could be somehow adapted to something that could be scanned as a QR code. I guess the minor color shifts might be hard to get right, but maybe combined/replaced with some form of symbol inside rings to help, a dot/dash combination?

[+] geoah|5 years ago|reply

I'll also leave here this very nice list of identicon implementations: https://github.com/drhus/awesome-identicons

[+] RcouF1uZ4gsC|5 years ago|reply

It would be a lot more work, but it might work better if you picked something which humans are particularly tuned to notice subtle details such as faces.

[+] franky47|5 years ago|reply

Using the hash as a seed for an AI face generator like thispersondoesnotexist would be pretty powerful. Free idea for anyone who wants to give it a shot.

[+] ilammy|5 years ago|reply

OpenSSH's randomart was too visually indistinctive for me so I've patched it to draw TrueColor images of cats. I wanted to actually seed a GAN to generate consistent images, but that turned out to be too much of a bother so I'm just keeping a local cache on a machine. Works nicely for that use-case as I'm able to associate a particular image with a particular location when working at a particular box. Good enough.

https://github.com/ilammy/homebrew-ssh

[+] InfiniteCode|5 years ago|reply

I did this one some time ago, allows custom visual effects using hash and seeded random: https://www.blankjs.com/
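The general "hash plus seeded random" pattern can be sketched like this: the digest seeds a pseudo-random generator, and every visual parameter is drawn from it, so the same input always renders the same way. This is an illustration of the pattern only, not how blankjs works internally; the function name and the parameter count are arbitrary.

```python
import hashlib
import random


def effect_params(data: bytes, count: int = 8) -> list[float]:
    """Derive `count` repeatable floats in [0, 1) from the SHA-256 of data."""
    digest = hashlib.sha256(data).digest()
    # Seeding with the digest as an integer makes the sequence deterministic.
    rng = random.Random(int.from_bytes(digest, "big"))
    return [rng.random() for _ in range(count)]
```

The same approach could supply a seed or input vector to an image generator, assuming that generator is itself deterministic.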

[+] tosh|5 years ago|reply

Wow, those are beautiful. Starred.

[+] KingMachiavelli|5 years ago|reply

Despite the issue where it would be trivial to brute force similar looking but not identical 'avatars', I think this still has a few good uses for non-identification.

1. Creating at least some default avatar. Not to be used to verify identity but just somewhat better than having a very limited set of default images. Having rate limits on account creation would prevent most brute force methods.
2. Avatar suitable for partial-identification for very small populations. Imagine a matrix/Element room that has <100,000 people. The hash/math could be modified to drastically trim down the space of the hash (e.g. from 2^256) to something similar to the size of the room.

#2 sounds pretty interesting. It could be expanded by making parts of the image/avatar dependent on some input other than the user ID, like the user's role in the chat group. Another segment/ring could be something more short-lived and relative, like just identifying users in recent chat messages.
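A sketch of how #2 might look, under the assumption that each ring is driven by its own truncated hash: the identity ring gets just enough bits to cover the room size, and a separate ring is derived from the user's role. All names, the label prefixes, and the 4-bit role ring are hypothetical.

```python
import hashlib


def bits_for_population(population: int) -> int:
    """Smallest bit count whose value space covers `population` members."""
    if population < 1:
        raise ValueError("population must be at least 1")
    return max(1, (population - 1).bit_length())


def ring_value(label: str, value: str, bits: int) -> int:
    """Hash one input and keep only its top `bits` bits for a single ring."""
    digest = hashlib.sha256(f"{label}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def room_avatar(user_id: str, role: str, room_size: int) -> dict:
    """Combine an identity ring sized to the room with a 4-bit role ring."""
    return {
        "identity": ring_value("user", user_id, bits_for_population(room_size)),
        "role": ring_value("role", role, 4),
    }
```

For a room of 100,000 people this keeps 17 bits for the identity ring, so collisions inside the room remain likely enough that nobody should mistake the avatar for proof of identity.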

[+] petee|5 years ago|reply

I always thought ssh randomart representations were visually unique enough; maybe combine smaller, simpler shapes with color too?

The rings are neat, but I found many to be too similar based on color alone, and the segments are also really hard to pick up on as a pattern or something memorable.

[+] irdc|5 years ago|reply

How hard would it be to instead generate faces with random facial features? Humans are already hardwired to be able to detect subtle differences between faces.

That would obviously not make it suitable for generating avatars to identify humans, but it would make this really useful to eg identify git commits or hash signatures.

[+] toptoppler|5 years ago|reply

Just feed the hash into the thishumandoesnotexist Neural Network? Boom, human avatar.

[+] mvolfik|5 years ago|reply

That would be so creepy. Imagine a random face you've never seen in your life appearing as your avatar

[+] emsign|5 years ago|reply

As a sidenote, your website breaks in Vivaldi with cookies denied and several ad-blockers. It keeps on reloading, making it impossible to close the tab or the browser. Please fix your site.

[+] rumblefrog|5 years ago|reply

Would be more awesome if it could export as an image! Right now I'm just manually inspecting it and copying the entire <g> section.

[+] franky47|5 years ago|reply

Someone actually built exactly that while I was writing the article :)

https://github.com/wzulfikar/hashvatar

[+] tosh|5 years ago|reply

How about using the variants as well so the avatars also structurally look different from each other (and adding even more variants)?

[+] sneak|5 years ago|reply
