alexandercheema | 1 year ago
This looks like promising research, and I'm looking into reproducing it now. We want to lower the barrier to running large models as much as possible, so if this works, it could be an addition to the exo offering.
tgtweak | 1 year ago
Some of these optimizations might also help optimize distribution across nodes based on the latency and bandwidth between them.