(no title)
SurceBeats | 3 days ago
Dataset: ~620 Claude-crafted examples, all following the same pattern, a question you'd ask a Ouija board paired with a short, uppercase, cryptic response. Things like "Is anyone there?" "YES.", "Write me a poem" "NO.", "How did you die?" "Ouija: PAIN.". The key was being very very consistent with the output format across all examples.
Method was LoRA fine-tune using HuggingFace Transformers + PEFT. Rank 16, alpha 32, targeting all attention + MLP projections. 3 epochs, lr 2e-4, effective batch size 8. Trained on Apple Silicon (MPS). Loss went from ~3.0 to ~0.17 pretty quickly given how uniform the outputs are.
Baked a system prompt into every training example using Qwen's chat template, basically the rules the "spirit" follows (uppercase only, one-word answers, never elaborate). For deployment I merged the LoRA adapter, quantized to GGUF Q4_K_M via llama.cpp, rruns locally with llama-cpp-python. I'm planning to drop an iOS version too. Honestly the whole thing is more about the dataset design than anything fancy on the training side. 620 consistent examples was enough to completely override the models default chatty behavior.
andsoitis|3 days ago
SurceBeats|3 days ago