I'm really interested in using this, but I wonder whether the unique architecture means it can't be converted to GGUF and run with ollama or llama.cpp. I'd certainly understand if the observability features required some custom tweaks, but I'd just like to try it on my local AI server (basically just ollama + tailscale) and see how it works as a regular model.
monocasa|7 days ago