I think most AI players rely on high-performance GEMM. But most of them would be satisfied with CUTLASS or cuBLAS, and the rest implement GEMM themselves, though not necessarily with undocumented features?
Using undocumented features is not rare. People reverse-engineered the undocumented AMX instructions on Apple's CPUs, and I know people use undocumented/private extensions for several different kinds of GPUs.
creato|1 year ago