top | item 44831704

Cursed Knowledge

518 points | bqmjjx0kac | 6 months ago | immich.app

157 comments


simpaticoder|6 months ago

I loved this the moment I saw it. After looking at an example commit[1], I love it even more. The cursed knowledge entry is committed alongside the fix needed to address it. My first instinct is that every project should have a similar facility. The log is not just cathartic, but turns each frustrating speedbump into a positive learning experience. By making it public, it becomes a tool for both commiseration and prevention.

1 - https://github.com/savely-krasovsky/immich/commit/aeb5368602...

delusional|6 months ago

I agree, I usually put this sort of information in the commit message itself. That way it's right there if anybody ever comes across the line and wonders "why did he write this terrible code, can't you just ___".

treve|6 months ago

The '50 extra packages' one is wild. The author of those packages has racked up a fuckload of downloads. What a total waste of bandwidth and disk space everywhere. I wonder if it's for clout.

bikeshaving|6 months ago

The maintainer who this piece of “cursed knowledge” is referencing is a member of TC39, and has fought and died on many hills in many popular JavaScript projects, consistently providing some of the worst takes on JavaScript and software development imaginable. For this specific polyfill controversy, some people alleged a pecuniary motivation, I think maybe related to GitHub sponsors or Tidelift, but I never verified that claim, and given how little these sources pay I’m more inclined to believe he just really believes in backwards compatibility. I dare not speak his name, lest I incur the wrath of various influential JavaScript figures who are friends with him, and possibly keep him around like that guy who was trained wrong as a joke in Kung Pow: Enter the Fist. In 2025, I’ve moderated my opinion of him; he does do important maintenance work, and it’s nice to have someone who seems to be consistently wrong in the community, I guess.

Centigonal|6 months ago

It's probably a clout thing, or just a weird guy (Hanlon's Razor), but a particularly paranoid interpretation is that this person is setting up for a massive, multi-pronged software supply-chain attack.

fastball|6 months ago

The author is almost certainly ljharb.

smitty1e|6 months ago

It does raise the idea of managed backward compatibility.

Especially if you could control at install time just how far back to go, that might be interesting.

Also an immediately ridiculous graph problem for all but trivial cases.

qdw|6 months ago

One of their line items complains about being unable to bind 65k PostgreSQL placeholders (the linked post calls them "parameters") in a single query. This is a cursed idea to begin with, so I can't fully blame PostgreSQL.

From the linked GitHub issue comments, it looks like they adopted the sensible approach of refactoring their ORM so that it splits the big query into several smaller queries. Anecdotally, I've found 3,000 to 5,000 rows per write query to be a good batch size.
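The batching math can be sketched as follows (a hypothetical helper, not Immich's actual code): PostgreSQL caps a single statement at 65,535 bind parameters, so the rows-per-batch budget is just that limit divided by the column count.

```python
# Sketch (not Immich's code): split a bulk write into batches that stay
# under PostgreSQL's 65,535 bind-parameter limit per statement.
PG_MAX_PARAMS = 65_535

def batch_rows(rows, columns_per_row):
    """Yield slices of `rows` small enough to bind in one statement."""
    rows_per_batch = PG_MAX_PARAMS // columns_per_row
    for i in range(0, len(rows), rows_per_batch):
        yield rows[i : i + rows_per_batch]

# e.g. 100k rows of 10 columns each -> batches of 6,553 rows
# (65,530 parameters per statement, just under the limit)
batches = list(batch_rows(list(range(100_000)), columns_per_row=10))
```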

Someone else suggested first loading the data into a temp table and then joining against that, which would have further improved performance, especially if they wrote it as a COPY … FROM. But the idea was scrapped (also sensibly) for requiring too many app code changes.

Overall, this was quite an illuminating tome of cursed knowledge, all good warnings to have. Nicely done!

e1g|6 months ago

Another strategy is to pass your values as an array param (e.g., text[] or int[] etc) - PG is perfectly happy to handle those. Using ANY() is marginally slower than IN(), but you have a single param with many IDs inside it. Maybe their ORM didn’t support that.
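The difference in parameter counts can be illustrated with a quick sketch (table and column names are made up; the `%s` placeholder style and list-to-array adaptation assume a psycopg-like driver):

```python
# Sketch: one placeholder per value vs. a single array parameter.
ids = list(range(200_000))  # far beyond the 65,535-placeholder limit

# Placeholder-per-value style: 200,000 bind parameters -- PG rejects this.
in_query = (
    "SELECT * FROM assets WHERE id IN ("
    + ", ".join(["%s"] * len(ids))
    + ")"
)

# Array style: a single parameter, however many IDs it contains.
any_query = "SELECT * FROM assets WHERE id = ANY(%s)"
any_params = (ids,)  # e.g. psycopg adapts a Python list to int[]
```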

motorest|6 months ago

> This is a cursed idea to begin with, so I can't fully blame PostgreSQL.

After going through the list, I was left with the impression that the "cursed" list doesn't really refer to gotchas per se but to lessons learned by the developers who committed them. Clearly a couple of lessons are incomplete or still in progress, though. This doesn't take away from their value or significance, but it helps frame the "curses" as personal observations in an engineering log instead of statements of fact.

fdr|6 months ago

that also popped out at me: binding that many parameters is cursed. You really gotta use COPY (in most cases).

I'll give you a real cursed Postgres one: prepared statement names are silently truncated to NAMEDATALEN-1 characters. NAMEDATALEN is 64. This goes back to 2001... or rather, that's when NAMEDATALEN was increased from 32. The truncation behavior itself is older still. It's something ORMs need to know about -- few humans are preparing statement names of sixty-plus characters.
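The collision this causes can be simulated in a few lines (the statement names are invented; in real PostgreSQL the truncation happens server-side, and on bytes rather than characters):

```python
# Sketch of the NAMEDATALEN pitfall: PostgreSQL silently truncates
# identifiers, including prepared-statement names, to NAMEDATALEN-1.
NAMEDATALEN = 64

def pg_identifier(name: str) -> str:
    # What the server actually stores (byte-truncation; ASCII here).
    return name[: NAMEDATALEN - 1]

# Two ORM-generated names that differ only past character 63...
a = pg_identifier("orm_generated_statement_for_assets_table_with_a_very_long_suffix_v1")
b = pg_identifier("orm_generated_statement_for_assets_table_with_a_very_long_suffix_v2")
# ...collapse to the same 63-character identifier, silently.
```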

Terr_|6 months ago

> One of their line items complains about being unable to bind 65k PostgreSQL placeholders (the linked post calls them "parameters") in a single query.

I've actually encountered this one, it involved an ORM upserting lots of records, and how some tables had SQL array-of-T types, where each item being inserted consumes one bind placeholder.

That made it an intermittent/unreliable error: even though two runs might touch the same number of rows and columns, the number of bind variables needed for the array columns fluctuated.

burnt-resistor|6 months ago

Or people who try to send every filename on a system through xargs in a single command process invocation as arguments (argv) without NUL-terminated strings. Just hope there are no odd or corrupt filenames, and plenty of memory. Oopsie. find -print0 with parallel -0/xargs -0 are usually your friends.

Also, sed and grep without LC_ALL=C can result in the fun "invalid multibyte sequence".

Aeolun|6 months ago

I don’t think that makes intuitive sense. Whether I send 50k rows or 10x5k rows should make no difference to the database. But somehow it does. It’s especially annoying with PG, where you just cannot commit a whole lot of small values fast due to this weird limit.

burnt-resistor|6 months ago

- Windows' NTFS Alternate Data Streams (ADS) allows hiding an unlimited number of files in already existing files

- macOS data forks, xattrs, and Spotlight (md) indexing every single removable volume by default adds tons of hidden files and junk to files on said removable volumes. Solution: mdutil -X /Volumes/path/to/vol

- Everything with opt-out telemetry: go, yarn, meilisearch, homebrew, vcpkg, dotnet, Windows, VS Code, Claude Code, macOS, Docker, Splunk, OpenShift, Firefox, Chrome, flutter, and zillions of other corporate abominations

kirici|6 months ago

>opt-out telemetry: go

By default, telemetry data is kept only on the local computer, but users may opt in to uploading an approved subset of telemetry data to https://telemetry.go.dev.

To opt in to uploading telemetry data to the Go team, run:

    go telemetry on
To completely disable telemetry, including local collection, run:

    go telemetry off
https://go.dev/doc/telemetry

TheBicPen|6 months ago

Opt-out telemetry is the only useful kind of telemetry

bigyabai|6 months ago

> Some phones will silently strip GPS data from images when apps without location permission try to access them.

That's no curse, it's a protection hex!

8organicbits|6 months ago

I think this is written unclearly. Looking at the linked issues, the root cause seems to be related to an "all file access" permission, not just fine-grained location access.

It seems great that an app without location access cannot check location via EXIF, but I'm surprised that "all file access" also gates access to the metadata, perhaps one selected using the picker.

https://gitlab.com/CalyxOS/platform_packages_providers_Media...

mynegation|6 months ago

I have no idea what that means but to me it looks like it works as designed.

thorum|6 months ago

> npm scripts make a http call to the npm registry each time they run, which means they are a terrible way to execute a health check.

Is this true? I couldn’t find another source discussing it. That would be insane behavior for a package manager.

dgoldstein0|6 months ago

It might be referring to the check of whether npm is up to date, so it can prompt you to update if it isn't?

skacekamen|6 months ago

probably an update check? It definitely sometimes shows an update banner

godelski|6 months ago

Looks like they're missing one. I'm pretty sure the discussion goes further back[0,1], but this one has been ongoing for years and seems to be the main one[2]

  05/26/23(?) Datetimes in EXIF metadata are cursed
[0] https://github.com/immich-app/immich/discussions/2581

[1] https://github.com/immich-app/immich/issues/6623

[2] https://github.com/immich-app/immich/discussions/12292

a96|6 months ago

Datetimes in general have a tendency to be cursed. Even when they work, something adjacent is going to blow up sooner or later. Especially if they rely on timezones or DST being part of the value.
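One concrete way the EXIF variant bites (a sketch with a made-up timestamp): the standard EXIF DateTimeOriginal format carries no timezone at all, so any parse yields a naive datetime and the application has to guess the zone.

```python
from datetime import datetime

# Sketch: EXIF DateTimeOriginal ("YYYY:MM:DD HH:MM:SS") has no timezone,
# so parsing gives a naive datetime -- local time? camera time? UTC?
exif_value = "2023:05:26 14:30:00"  # hypothetical value
dt = datetime.strptime(exif_value, "%Y:%m:%d %H:%M:%S")
assert dt.tzinfo is None  # the zone is simply not in the data
```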

joshdavham|6 months ago

This is awesome! Does anyone else wanna share some of the cursed knowledge they've picked up?

For me, MacOS file names are cursed:

1. Filenames in MacOS are case-INsensitive, meaning file.txt and FILE.txt are equivalent

2. Filenames in MacOS, when saved in NFC, may be converted to NFD
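Point 2 is easy to demonstrate with Python's unicodedata: the NFC form stores "é" as one code point, while the NFD form macOS filesystems may hand back stores it as "e" plus a combining accent. They render identically but compare unequal.

```python
import unicodedata

# Sketch: the same visible string in two normalization forms.
nfc = "caf\u00e9"    # 4 code points: c a f é
nfd = "cafe\u0301"   # 5 code points: c a f e + COMBINING ACUTE ACCENT

assert nfc != nfd                                 # naive comparison fails
assert unicodedata.normalize("NFC", nfd) == nfc   # normalize before comparing
```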

qingcharles|6 months ago

I created one of the first CDDBs in 1995 when Windows 95 was in beta. It came with a file, IIRC, cdplayer.ini, that contained all the track names you'd typed in from your CDs.

I put out requests across the Net, mostly Usenet at the time, and people sent me their track listings and I would put out a new file every day with the new additions.

Until I hit 64KB which is the max size of an .ini file under Windows, I guess. And that was the end of that project.

archagon|6 months ago

I'm in the process of writing up a blog post on how network shares on macOS are kind of cursed. Highlights:

* Files on SMB shares sometimes show up as "NDH6SA~M" or similar, even though that's not their filename on the actual network drive. This is because there's some character present in the filename that SMB can't work with. No errors or anything, you just have to know about it.

* SMB seems to copy accented characters in filenames as two Unicode code points, not one. Whereas native macOS filenames tend to use single Unicode code point accents.

* SMB seems to munge and un-munge certain special characters in filenames into placeholders, e.g. * <-> . But not always. Maybe this depends on the SMB version used?

* SMB (of a certain version?) stores symlinks as so-called "XSym" binary files, which automatically get converted back to native symlinks when copied from the network share. But if you try to rsync directly from the network drive instead of going through SMB, you'll end up with a bunch of binary XSym files that you can't really do anything with.

I only found out about these issues through integrity checks that showed supposedly missing files. Horrible!

account42|6 months ago

> 1. Filenames in MacOS are case-INsensitive, meaning file.txt and FILE.txt are equivalent

It's much more cursed than that: filenames may or may not be case-sensitive depending on the filesystem.

burnt-resistor|6 months ago

Yep. Create a case-sensitive APFS or HFS+ volume for system or data, and it guarantees problems.

mdaniel|6 months ago

1 is only true by default; both HFS and APFS have case-sensitive options. NTFS also behaves like you described, and I believe the distinction is that the filesystems are case-retentive, so this will work fine:

  $ echo yup > README.txt
  $ cat ReAdMe.TXT
  yup
  $ ls
  README.txt
Maybe the cursed version of the filesystem story is that goddamn Steam refuses to install on the case-sensitive version of the filesystem, although Steam has a Linux version. Asshats

binary132|6 months ago

This would be a fun github repo. Kind of like Awesome X, but Cursed.

maxbond|6 months ago

> Fetch requests in Cloudflare Workers use http by default, even if you explicitly specify https, which can often cause redirect loops.

This is whack as hell but doesn't seem to be the default? This issue was caused by the "Flexible" mode, but the docs say "Automatic" is the default? (Maybe it was the default at the time?)

> Automatic SSL/TLS (default)

https://developers.cloudflare.com/ssl/origin-configuration/s...

motorest|6 months ago

> This is whack as hell but doesn't seem to be the default?

I don't think so. If you read about what Flexible SSL means, you are getting exactly what you are asking for.

https://developers.cloudflare.com/ssl/origin-configuration/s...

Here is a direct quote of the recommendation on how this feature was designed to be used:

> Choose this option when you cannot set up an SSL certificate on your origin or your origin does not support SSL/TLS.

Furthermore, Cloudflare's page on encryption modes provides this description of their flexible mode.

> Flexible : Traffic from browsers to Cloudflare can be encrypted via HTTPS, but traffic from Cloudflare to the origin server is not. This mode is common for origins that do not support TLS, though upgrading the origin configuration is recommended whenever possible.

So, people go out of their way to set an encryption mode that was designed to forward requests to origin servers that do not or cannot support HTTPS connections, and then are surprised those outbound connections to their origin servers are not HTTPS.

bo0tzz|6 months ago

It was indeed the default at the time.

tonyhart7|6 months ago

ok but this one is not cursed tho (https://github.com/immich-app/immich/discussions/11268)

it's valid privacy and security in how mobile OSes handle permissions

account42|6 months ago

It is cursed because now the photo management app needs to ask for the permission to constantly track you instead of only getting location of a limited set of past points where you specifically chose to take a photo. Besides giving malicious photo app developers an excuse for these permissions, it also contributes to permission fatigue by training to give random applications wide permissions.

LeoPanthera|6 months ago

"Some phones will silently strip GPS data from images when apps without location permission try to access them."

Uh... good?

steve_adams_86|6 months ago

I'm torn. Maybe a better approach would be a prompt saying "you're giving access to images with embedded location data. Do you want to keep the location data in the images, or strip the location data in this application?"

I might not want an application to know my current, active location. But it might be useful for it to get location data from images I give it access to.

I do think if we have to choose between stripping nothing or always stripping if there's no location access, this is the correct and safe solution.

a96|6 months ago

Kind of. But that means any file that goes through that mechanism may be silently modified. Which is evil.


zzo38computer|6 months ago

> Zitadel is cursed because its custom scripting feature is executed with a JS engine that doesn't support regex named capture groups.

I think a sufficiently old version of JavaScript will not have it. It does not work on my computer either. (You should, if you have not already, report this to whoever maintains that program, in order to fix this, if you require that feature.)
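For reference, this is the feature the script engine reportedly lacks, sketched here in Python since the thread is about a non-browser JS engine (JS spells it `(?<year>...)`; Python spells it `(?P<year>...)`):

```python
import re

# Named capture groups: refer to matches by name instead of position.
m = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-07")
assert m.group("year") == "2024"

# Fallback for engines without the feature: positional groups.
m2 = re.match(r"(\d{4})-(\d{2})", "2024-07")
assert m2.group(1) == "2024"
```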

> Git can be configured to automatically convert LF to CRLF on checkout and CRLF breaks bash scripts.

Can you tell git that the bash script is a binary file and therefore should not automatically convert the contents of the file?

> Fetch requests in Cloudflare Workers use http by default, even if you explicitly specify https, which can often cause redirect loops.

Is that a bug in Cloudflare? That way of working does not make sense; it should use the protocol you specify. (I also think that HTTP servers should not generally automatically redirect to HTTPS, but that is a different problem. Still, since it does that it means that this bug is more easily found.) (Also, X.509 should be used for authentication, which avoids the problem of accidentally authenticating with an insecure service (or with the wrong service), since that would make it impossible to do.)

> There is a user in the JavaScript community who goes around adding "backwards compatibility" to projects. They do this by adding 50 extra package dependencies to your project, which are maintained by them.

It is a bad idea to add too many dependencies to your project, regardless of that specific case.

> The bcrypt implementation only uses the first 72 bytes of a string. Any characters after that are ignored.

There is a good reason to have a maximum password length (to avoid excessive processing due to a too long password), although the maximum length should still be sufficiently long (maybe 127 bytes is good?), and it should be documented and would be better if it should be known when you try to set the password.

> Some web features like the clipboard API only work in "secure contexts" (ie. https or localhost)

I think that "secure contexts" is a bad idea. I also think that these features should be controlled by user settings instead, to be able to disable and otherwise configure them.

mdaniel|6 months ago

> Can you tell git that the bash script is a binary file and therefore should not automatically convert the contents of the file?

That'd be swatting a fly with a sledgehammer; if you do that, $(git diff) will no longer work which smells important for shell scripts that evolve over time. But I think you were in the right ballpark in that .gitattributes is designed for helping it understand the behavior you wish with eol=lf just for either that file or *.sh *.bash etc https://git-scm.com/docs/gitattributes#Documentation/gitattr...
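A minimal .gitattributes along those lines might look like this (the path patterns are assumptions; adjust to whatever your repo's scripts actually use):

```
# .gitattributes -- always check out shell scripts with LF line endings,
# regardless of core.autocrlf on the developer's machine
*.sh   text eol=lf
*.bash text eol=lf
```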

csours|6 months ago

You can load Java Classes into Oracle DB and run them natively inside the server.

Those classes can call stored procedures or functions.

Those classes can be called BY stored procedures or functions.

You can call stored procedures and functions from server-side Java code.

So you can have a java app call a stored proc call a java class call a stored proc ...

Yes. Yes, this is why they call it Legacy.

physicles|6 months ago

Back in 2011, I wasted an entire afternoon on some string handling code that was behaving very strangely (I don’t remember exactly what the code was).

It wasn’t until I loaded the content into a hex editor that I learned about U+00A0, the non-breaking space. Looks like a space, but isn’t.
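A small sketch of why this one is so sneaky: U+00A0 renders exactly like a space, and Python even agrees it is whitespace, yet it compares unequal to a normal space.

```python
# Sketch: U+00A0 (no-break space) looks like a space but isn't one.
s = "foo\u00a0bar"
assert s != "foo bar"        # comparison with a normal space fails
assert "\u00a0".isspace()    # yet it counts as whitespace

# One blunt fix: normalize it away before comparing.
cleaned = s.replace("\u00a0", " ")
assert cleaned == "foo bar"
```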

mdaniel|6 months ago

Ah, yes, the 90s html was jam packed with &nbsp; (aka &#160;) to make things not wrap, and that was stellar good fun for copy-pasting

The other "2020s" problem is some leading unicode marks which are also invisible. I thought it was BOM but those do seem to show up to cat but just a few weeks ago I had a file from a vendor's site that wouldn't parse but that both cat and vim said was fine, only to find the wtf? via the almighty xxd

qbane|6 months ago

Love to see this concept condensed! This kind of knowledge only emerges after you dive into your project and are surprised to find things don't work as you thought (inevitable if the project is niche enough). I'll keep a list like that for every future project.

phreack|6 months ago

Love this. I seem to find a new one every day maintaining an Android app with millions of users. We like to call them "what will we tell the kids" moments. It's a great idea to write them down, I'll probably start doing it!

mitioshi|6 months ago

What a cool idea to have such page for a project. I wish more open source projects adopted this. It's always interesting to read how people resolved complex problems

stogot|6 months ago

This is the best thing I’ve read on hacker news all year

hiAndrewQuinn|6 months ago

>The bcrypt implementation only uses the first 72 bytes of a string. Any characters after that are ignored.

Is there any good reason for this one in particular?

mras0|6 months ago

bcrypt is based on the Blowfish cipher, which "only" supports keys up to 576 bits (72 bytes) in length (actually only 448 bits as spec'ed). Wikipedia has all the details.
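The arithmetic and its consequence can be sketched without any bcrypt dependency (this only illustrates the truncation itself, not the actual hashing):

```python
# Sketch: 576 bits / 8 = 72 bytes. A silently truncating implementation
# treats any two passwords identical in their first 72 bytes as equal.
BCRYPT_MAX_KEY_BYTES = 576 // 8
assert BCRYPT_MAX_KEY_BYTES == 72

p1 = b"x" * 72 + b"correct horse"
p2 = b"x" * 72 + b"battery staple"
assert p1 != p2
# What a truncating implementation actually feeds the cipher:
assert p1[:BCRYPT_MAX_KEY_BYTES] == p2[:BCRYPT_MAX_KEY_BYTES]
```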

Havoc|6 months ago

One can really sense the pain just reading the headings

Also a crypto library that limits passwords to 72 bytes? That’s wild

AstralStorm|6 months ago

It's written with constant memory allocation in mind. Silly of them to use such a small buffer, though; they could have made it a configuration option.

worik|6 months ago

dd/mm/yyyy date formats are cursed....

Perhaps it is mm/dd/yyyy (really?!?) that is cursed....

javcasas|6 months ago

mm/dd/yyyy is cursed. You parse it naively with momentjs, and sometimes it parses (wrong), other times it doesn't parse.

It's the reason our codebase is filled with momentAmerican, parseDateAmerican and parseDatetimeAmerican.
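The failure mode is easy to sketch in a few lines: a date like 03/04/2025 is valid under both conventions, and a different day under each, so naive parsing silently picks one.

```python
from datetime import datetime

# Sketch: the same string is a valid -- and different -- date
# under the US and international conventions.
us = datetime.strptime("03/04/2025", "%m/%d/%Y")    # March 4
intl = datetime.strptime("03/04/2025", "%d/%m/%Y")  # April 3
assert us != intl
```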

hollerith|6 months ago

mm.dd.yyyy is cursed, too. The not-cursed options are dd.mm.yyyy and mm/dd/yyyy

burnt-resistor|6 months ago

Install an SP3 or TR4 socketed CPU in a dusty, dirty room without ESD precautions and crank the torque on the top plate and heat sink like truck lug nuts until creaking and cracking noises of the PCB delaminating are noticeable. Also be sure to sneeze on the socket's chip contacts and clean it violently with an oily and dusty microfiber cloth to bend every pin.

c. 2004 and random crap on eBay: DL380 G3 standard NICs plus Cisco switches with auto speed negotiation on both sides have built-in chaos monkey duplex flapping.

Google's/Nest mesh Wi-Fi gear really, really enjoys being close together so much that it offers slower speeds than simply 1 device. Not even half speed, like dial-up before 56K on random devices randomly.

g8oz|6 months ago

This is awesome. Disappointing to hear about the Cloudflare fetch issue.

doctorpangloss|6 months ago

The infallibility of Cloudflare is sacrosanct!

motorest|6 months ago

> Disappointing to hear about the Cloudflare fetch issue.

You mean the one where explicitly configuring Cloudflare to forward requests to origin servers as HTTP will actually send requests as HTTP? That is not what I would describe as disappointing.