sqreept's comments

sqreept | 2 months ago | on: Proxmox "shared" LVM CSI plugin

This CSI plugin is fully vibe coded by Claude Sonnet 4.5 & Haiku 4.5 and tested on a real cluster in a loop by claude-code.

My only contribution was to uphold standards as this is one big thing where LLMs struggle probably because there's so few examples out there.

Hope it helps you!

sqreept | 2 years ago | on: Grok

What are the languages supported by it?

sqreept | 2 years ago | on: Fine tune a 70B language model at home

M1, M2, M3 still have very low number of GPU cores. Apple should release some better hardware to take advantage of their recently released MLX library.

sqreept | 2 years ago | on: LoRA for fixing Romanian diacritics open sourced

I just published 3 things:

1. A LoRA that adds Romanian diacritics to texts that don't have them: huggingface.co/sqreept/ro_dia… (includes an example of using it)

2. The dataset used to build the above LoRA: huggingface.co/datasets/sqree…

3. The Colab used to fine-tune the above LoRA: colab.research.google.com/drive/1GMg9fS3…

This is what I call open source in Machine Learning: model weights, dataset and code. Anything less is not open source.

sqreept | 2 years ago | on: Genie: Generative Interactive Environments

I've read twice the announcement and I can't tell what this is good for. Can you please dumb it down for me?

sqreept | 2 years ago | on: Gemma: New Open Models

First of all, I'm using 2 x 4090 for testing. 4090 has 16384 CUDA cores which will become relevant a bit later.

I dug a bit deeper and it seems that with transformers==4.37.0 everything works fine with other HF hosted models (like Llama) but you'll rightfully get this when trying to use Gemma:

ImportError: cannot import name 'GemmaForCausalLM' from 'transformers'

After installing transformers==4.38.0 the fine-tunning speed of Llama drops to 25% (?!?) of what used to be for a reason that I think HF should fix. Testing Gemma it seems I'm hitting a hardware limit as Gemma has a hidden size which is bigger than the available CUDA cores. This seems to make both inference & fine-tunning about 25 times slower than similarly sized Llama 7B. I guess some operations have to be broken down in multiple round trips to the GPU due to my low CUDA core count.

All in all, even if HF fixes the recently introduced slowdown, Gemma seems to be fine-tuneable in reasonable amount of time only by the lucky ones with access to A100/H100.

EDIT: I managed to hack my env to be able to run inference on Gemma with transformers==4.37.0 by keeping the necessary classes in loaded in RAM. It works about 4x faster but still very slow. And both the 7B and the 2B versions behave the same way.

EDIT2: I tried latest transformers from main branch (4.39.0.dev) and behaves the same as 4.38.0.

sqreept | 2 years ago | on: Gemma: New Open Models

Tried inference with the 7B model and without flash attention this is soooooo slow. With flash attention the fine-tunning requires A100 or H100. Also the inference doesn't always stop generating resulting in garbage being added to the response.

sqreept | 2 years ago | on: Gemma: New Open Models

What are the supported languages of these models?

sqreept | 2 years ago | on: A programming language coding in a grid

In the era of AI, naming variables can and should be automated. Without good names, the code is very hard to read, and code should be, before anything else, readable.

sqreept | 3 years ago | on: An aggressive, stealthy web spider operating from Microsoft IP space

I see it stopped crawling our website last night after hammering it for over a week.

We used a combination of ASN and UA to block them.

sqreept | 3 years ago | on: An aggressive, stealthy web spider operating from Microsoft IP space

It is aggressive in what content is trying to access. It looks for security vulnerabilities and normal bots don't do that (with the notable exception of some security testing software). Also it's not spidering, somehow it knows very old URLs which are not even public which were probably obtained from a malicious browser extension.

sqreept | 4 years ago | on: Donation page "not compliant with Google Play Policies"

A duopoly is just a monopoly with two CEOs

sqreept | 5 years ago | on: The Feedback Loop of Productivity

Here's a humble translation of the article in German: https://github.com/adi/deutsch/blob/master/001_rsp.md

sqreept | 5 years ago | on: Apple accuses Epic of “Willful, brazen, and unlawful” conduct

I'm sorry if you find this tasteless and I'll expand a bit more in case there is misunderstanding. My point is that too much power in the hands of anyone is a bad idea that should be opposed. Doesn't matter if it's a country, a political party, a company the size of a country or simply a person with huge funds. Not opposing such behavior amounts to supporting them. I don't like the way Epic does it... but I respect that they do it somehow against their best interest.

sqreept | 5 years ago | on: Apple accuses Epic of “Willful, brazen, and unlawful” conduct

Porque cuando la tiranía es ley, la revolución es orden.

sqreept | 5 years ago | on: MIA – Ubuntu 19.04 Disco Dingo

Not having support is one thing. Breaking apt is what happened here.

sqreept | 6 years ago | on: Ask HN: What interesting problems are you working on?

Leader election in a distributed system of unknown size.

sqreept | 6 years ago | on: Apache Pulsar is an open-source distributed pub-sub messaging system

NATS is a simpler PUB/SUB system that delivers in the UNIX spirit of small composable parts. Apache Pulsar or Apache Kafka deliver the banana, the ape holding it and the rest of the jungle.

sqreept | 6 years ago | on: How Can a Star Be Older Than the Universe?

And and expansion is a lengthening of something within a frame of reference. What is the frame of reference?

sqreept | 6 years ago | on: Apple Removes HKmap.live from the App Store

They should be using Apple Maps to attack Police and then Apple should remove that shady app used obviously for no good.