(no title)
sifar | 1 month ago
IIRC, depthwise convolution is memory-bound, so the bar might be lower. Perhaps you can try something with higher compute intensity, like a matrix multiply. I have observed that it trips up on columnar accesses for SIMD.
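To put rough numbers on the memory-bound point, here is a back-of-envelope sketch of arithmetic intensity (FLOPs per byte moved). It assumes an idealized model where each tensor crosses memory exactly once, fp32 elements, and stride 1 with same padding. The function names and the shapes are made up for illustration, not taken from any particular benchmark.

```python
def depthwise_intensity(h, w, c, k, bytes_per_elem=4):
    """FLOPs per byte for a k x k depthwise conv over an h x w x c tensor.

    Assumes stride 1 and same padding, so the output is also h x w x c.
    """
    flops = 2 * h * w * c * k * k  # one multiply and one add per filter tap
    # input + weights + output, each assumed to be moved exactly once
    bytes_moved = bytes_per_elem * (h * w * c + k * k * c + h * w * c)
    return flops / bytes_moved


def matmul_intensity(m, n, k, bytes_per_elem=4):
    """FLOPs per byte for an (m x k) @ (k x n) matrix multiply."""
    flops = 2 * m * n * k
    # A + B + C, each assumed to be moved exactly once
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved


if __name__ == "__main__":
    # Hypothetical shapes, chosen only to show the gap.
    print(f"depthwise 3x3, 56x56x128: {depthwise_intensity(56, 56, 128, 3):.2f} FLOPs/byte")
    print(f"matmul 1024^3:            {matmul_intensity(1024, 1024, 1024):.2f} FLOPs/byte")
```

Under these assumptions the depthwise layer lands a little above 2 FLOPs/byte while the square matmul is around 170, so the matmul leaves much more room for a kernel to be limited by compute rather than bandwidth. Real kernels re-read data and hit caches, so treat these as upper-bound estimates only.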
fotcorn | 1 month ago
It's also important to have a few test cases the agent can quickly check against. It will often generate wrong code, but if that is easily detectable, the agent can fix it and continue quickly.
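One way such a quick check could look: a slow but obviously correct reference plus a handful of tiny seeded cases, so a wrong kernel is flagged in milliseconds. This is a minimal pure-Python sketch; `depthwise_ref`, `make_cases`, and `first_failure` are hypothetical names, and the layout ([C][H][W] nested lists, valid padding, stride 1) is an assumption rather than anything from the thread.

```python
import random


def depthwise_ref(x, w):
    """Naive depthwise conv. x is [C][H][W], w is [C][K][K]; valid padding, stride 1."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K = len(w[0])
    out = [[[0.0] * (W - K + 1) for _ in range(H - K + 1)] for _ in range(C)]
    for c in range(C):
        for i in range(H - K + 1):
            for j in range(W - K + 1):
                out[c][i][j] = sum(
                    x[c][i + di][j + dj] * w[c][di][dj]
                    for di in range(K)
                    for dj in range(K)
                )
    return out


def make_cases(seed=0):
    """Build a few small, deterministic (input, weights) pairs."""
    rng = random.Random(seed)
    shapes = [(1, 3, 3, 3), (2, 5, 4, 3), (3, 4, 6, 1)]  # (C, H, W, K)
    cases = []
    for C, H, W, K in shapes:
        # Values in [0.5, 1.5] keep every output clearly nonzero.
        x = [[[rng.uniform(0.5, 1.5) for _ in range(W)] for _ in range(H)] for _ in range(C)]
        w = [[[rng.uniform(0.5, 1.5) for _ in range(K)] for _ in range(K)] for _ in range(C)]
        cases.append((x, w))
    return cases


def flatten(t):
    """Flatten a [C][H][W] nested list into one flat list of values."""
    return [v for channel in t for row in channel for v in row]


def first_failure(candidate, cases, tol=1e-6):
    """Return the index of the first case where candidate disagrees with the reference, else None."""
    for idx, (x, w) in enumerate(cases):
        want = flatten(depthwise_ref(x, w))
        got = flatten(candidate(x, w))
        if len(got) != len(want) or any(abs(g - e) > tol for g, e in zip(got, want)):
            return idx
    return None
```

The agent's generated kernel would be passed as `candidate`; a return value of `None` means every case matched, and an index tells it which small case to look at first.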