(no title)
tylerekahn | 2 years ago
From the LoRA paper:
>When the pre-trained model is GPT-3 175B, the number of trainable parameters |Θ| can be as small as 0.01% of |Φ0|.
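That figure follows from LoRA's low-rank factorization: a frozen d×k weight matrix W is adapted as W + BA with B of shape d×r and A of shape r×k, so only r(d + k) parameters train instead of d·k. A minimal sketch of the arithmetic for a single adapted matrix (the dimensions are illustrative GPT-3-scale values, not taken from the paper; the paper's 0.01% is a whole-model figure that also depends on which matrices are adapted):

```python
def lora_param_fraction(d: int, k: int, r: int) -> float:
    """Fraction of trainable parameters when a frozen d x k weight
    matrix W is adapted as W + B @ A, with B (d x r) and A (r x k)."""
    full = d * k        # frozen pre-trained parameters for this matrix
    lora = r * (d + k)  # trainable LoRA parameters
    return lora / full

# A square 12288 x 12288 projection at rank r=1 trains only
# 2/12288 of that matrix's parameters, roughly 0.016%:
print(f"{lora_param_fraction(12288, 12288, 1):.6%}")
```

At rank 1 on a square matrix the fraction is simply 2/d, which is how adapting a 175B-parameter model can leave so few weights trainable.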