The llama.cpp tools and examples download models by default to an OS-specific cache folder [0]. We try to follow the HF standard (as discussed in the linked thread), though the layout of the llama.cpp cache is not the same at the moment. Not sure about the plans for RamaLama, but it might be worth considering.

[0] https://github.com/ggerganov/llama.cpp/issues/7252
sitkack|1 year ago
If there were a contract for how models are laid out on disk, then downloading, managing, and tracking model weights could be handled by a different tool or subsystem.
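
To make the idea concrete, here is a rough sketch of what such a contract could look like, loosely modeled on the Hugging Face hub cache convention (`models--{org}--{name}/snapshots/{revision}/{file}`). The function names, the cache root, and the example repo are made up for illustration; this is not how llama.cpp or RamaLama lay out their caches today.

```python
from pathlib import Path

PREFIX = "models--"  # folder prefix used by the Hugging Face hub cache


def repo_folder_name(repo_id: str) -> str:
    """Map a repo id such as 'org/model' to a folder name such as 'models--org--model'."""
    return PREFIX + repo_id.replace("/", "--")


def snapshot_path(cache_root: Path, repo_id: str, revision: str, filename: str) -> Path:
    """Where a given file of a given revision would live under the assumed layout."""
    return cache_root / repo_folder_name(repo_id) / "snapshots" / revision / filename


def list_cached_repos(cache_root: Path) -> list[str]:
    """List repo ids found under cache_root, relying only on the folder naming rule.

    Simplification: only the first '--' after the prefix is turned back into '/',
    so a repo without an org whose name contains '--' would be misread.
    """
    if not cache_root.is_dir():
        return []
    repos = []
    for entry in sorted(cache_root.iterdir()):
        if entry.is_dir() and entry.name.startswith(PREFIX):
            repos.append(entry.name[len(PREFIX):].replace("--", "/", 1))
    return repos
```

The point is that any tool that knows the naming rule can find, list, or clean up models without asking the tool that downloaded them.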
ecurtin|1 year ago