jbosh|1 year ago
Finally got enough information and realized that the padding of a specific object was wrong (the GC expected 16 bytes, the object was 12 bytes). Fixing it made dozens of other corruption bugs disappear that we hadn't even thought were GC related.
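To make the 12-versus-16-byte mismatch concrete, here is a minimal sketch in C of rounding an object size up to the granularity an allocator or GC expects. The thread does not say which GC was involved or how it computes sizes, so `align_up` is a hypothetical helper for illustration, not that collector's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round `size` up to the next multiple of `align`.
 * `align` must be a non-zero power of two for the bit mask to be valid. */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* Adding align - 1 and clearing the low bits rounds up, and leaves
     * sizes that are already a multiple of `align` unchanged. */
    return (size + align - 1) & ~(align - 1);
}
```

With a 16-byte granularity, `align_up(12, 16)` gives 16, so a 12-byte object needs 4 bytes of padding. If the object is laid out without it, anything that steps through memory in 16-byte units will land inside the neighbouring object.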
stingraycharles|1 year ago
rwmj|1 year ago
Linux/x86-64 expects the stack to always be 16-byte aligned (although the ABI documentation at the time didn't make this requirement very clear). OCaml called into C with a misaligned stack. GCC-generated code, assuming the stack was 16-byte aligned, used some strange Intel AVX instruction that only works on aligned data, unlike every other Intel instruction ever, which can work on any alignment (albeit maybe more slowly).
This manifested itself as rare and totally unreproducible crashes (because stack alignment differed between runs). It was a bit of a nightmare to solve.
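A debugging check for this kind of bug can be sketched in C as below. It is not the fix that went into OCaml or anything GCC emits; `is_aligned` is a hypothetical helper, and the idea of calling it on the address of a local variable at the entry of a C function reached from foreign code is an assumption about how one might hunt for the misalignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return 1 if the address `p` is a multiple of
 * `align`, else 0. `align` must be a non-zero power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* For a power-of-two alignment, the low bits of an aligned
     * address are all zero. */
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

An address that is 8 bytes off a 16-byte boundary fails the check every time, which turns an occasional crash in an aligned load into a failure you can catch deterministically.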
jbosh|1 year ago
taspeotis|1 year ago
packetlost|1 year ago
npalli|1 year ago