mochomocha | 1 year ago
> DBRX uses only 36 billion parameters at any given time. But the model itself is 132 billion parameters, letting you have your cake and eat it too in terms of speed (tokens/second) vs performance (quality).
hexomancer|1 year ago
sangnoir|1 year ago