(no title)
hyferg | 3 years ago
If that’s not helpful, were you getting at having the model return some rich data about the attention weights that went into generating some token?
jsemrau | 3 years ago