sqreept | 2 months ago | on: Proxmox "shared" LVM CSI plugin
sqreept's comments
sqreept | 2 years ago | on: Grok
sqreept | 2 years ago | on: Fine tune a 70B language model at home
sqreept | 2 years ago | on: LoRA for fixing Romanian diacritics open sourced
1. A LoRA that adds Romanian diacritics to texts that don't have them: huggingface.co/sqreept/ro_dia… (includes an example of using it)
2. The dataset used to build the above LoRA: huggingface.co/datasets/sqree…
3. The Colab used to fine-tune the above LoRA: colab.research.google.com/drive/1GMg9fS3…
This is what I call open source in machine learning: model weights, dataset, and code. Anything less is not open source.
sqreept | 2 years ago | on: Genie: Generative Interactive Environments
sqreept | 2 years ago | on: Gemma: New Open Models
I dug a bit deeper and it seems that with transformers==4.37.0 everything works fine with other HF-hosted models (like Llama), but you'll rightfully get this when trying to use Gemma:
ImportError: cannot import name 'GemmaForCausalLM' from 'transformers'
After installing transformers==4.38.0, the fine-tuning speed of Llama drops to 25% (?!?) of what it used to be, for a reason that I think HF should fix. Testing Gemma, it seems I'm hitting a hardware limit, as Gemma has a hidden size larger than my available CUDA core count. This seems to make both inference and fine-tuning about 25 times slower than the similarly sized Llama 7B. I guess some operations have to be broken down into multiple round trips to the GPU due to my low CUDA core count.
All in all, even if HF fixes the recently introduced slowdown, Gemma seems to be fine-tuneable in a reasonable amount of time only by the lucky ones with access to an A100/H100.
EDIT: I managed to hack my env to run inference on Gemma with transformers==4.37.0 by keeping the necessary classes loaded in RAM. It runs about 4x faster, but is still very slow. Both the 7B and the 2B versions behave the same way.
EDIT2: I tried the latest transformers from the main branch (4.39.0.dev) and it behaves the same as 4.38.0.
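The version dependency described above can be captured in a small check. This is a hedged sketch, not code from the comment: the function name `supports_gemma` is my own, and the only assumption baked in is the fact reported above, that transformers 4.37.0 lacks `GemmaForCausalLM` while 4.38.0 has it.

```python
def supports_gemma(transformers_version: str) -> bool:
    """Return True if this transformers version ships the Gemma classes.

    Per the ImportError above, GemmaForCausalLM is missing in 4.37.0
    and present from 4.38.0 onward.
    """
    # Compare only major.minor; suffixes like "4.39.0.dev" are ignored.
    major, minor = (int(p) for p in transformers_version.split(".")[:2])
    return (major, minor) >= (4, 38)
```

A quick probe like this avoids the hard ImportError at model-load time when an environment pins an older transformers release.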
sqreept | 2 years ago | on: A programming language coding in a grid
sqreept | 3 years ago | on: An aggressive, stealthy web spider operating from Microsoft IP space
We used a combination of ASN and UA to block them.
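The ASN-plus-UA approach above can be sketched as a simple request filter. This is a hypothetical illustration, not the author's actual setup: the ASN value, UA fragment, and stub IP-to-ASN table are all made up for the example; in practice the IP-to-ASN resolution would come from a GeoIP/whois database such as MaxMind's GeoLite2 ASN.

```python
BLOCKED_ASNS = {8075}                    # AS8075 is Microsoft; example only
BLOCKED_UA_SUBSTRINGS = ("badspider",)   # hypothetical crawler UA fragment

# Stub table standing in for a real IP -> ASN lookup.
IP_TO_ASN = {
    "40.77.167.10": 8075,
    "203.0.113.7": 64496,
}

def should_block(client_ip: str, user_agent: str) -> bool:
    """Block only when BOTH the ASN and the User-Agent match the blocklists."""
    asn = IP_TO_ASN.get(client_ip)
    ua = user_agent.lower()
    bad_ua = any(s in ua for s in BLOCKED_UA_SUBSTRINGS)
    return asn in BLOCKED_ASNS and bad_ua
```

Requiring both signals keeps false positives down: an ASN match alone would block legitimate traffic from the same network, and a UA match alone is trivially spoofed from elsewhere.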
sqreept | 4 years ago | on: Donation page "not compliant with Google Play Policies"
sqreept | 5 years ago | on: The Feedback Loop of Productivity
sqreept | 5 years ago | on: Apple accuses Epic of “Willful, brazen, and unlawful” conduct
sqreept | 5 years ago | on: Apple accuses Epic of “Willful, brazen, and unlawful” conduct
sqreept | 5 years ago | on: MIA – Ubuntu 19.04 Disco Dingo
sqreept | 6 years ago | on: Ask HN: What interesting problems are you working on?
sqreept | 6 years ago | on: Apache Pulsar is an open-source distributed pub-sub messaging system
sqreept | 6 years ago | on: How Can a Star Be Older Than the Universe?
sqreept | 6 years ago | on: Apple Removes HKmap.live from the App Store
My only contribution was to uphold standards, as this is one big area where LLMs struggle, probably because there are so few examples out there.
Hope it helps you!