For llama3, just ask him to install Ollama and serve the model. Ollama manages memory automatically: it unloads the model once it has been idle for a while, and reloads it into memory whenever you make a call to the API (do let your friend know before you do this).
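A rough sketch of what such a call could look like from your side, assuming the Ollama server is reachable on its default port 11434 and the model was pulled as `llama3`. The host in `OLLAMA_URL` and the helper names `build_payload` and `generate` are placeholders of mine, not anything Ollama ships. By default Ollama only listens on localhost, so your friend would likely need to set `OLLAMA_HOST` (or tunnel the port) for a remote call to work.

```python
import json
import urllib.request

# Assumption: default Ollama port; replace "localhost" with your friend's host.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt, model="llama3", keep_alive="5m"):
    """Build the JSON body for Ollama's /api/generate endpoint.

    keep_alive asks the server how long to keep the model loaded after
    this request; once that time passes, the model is unloaded.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream
        "keep_alive": keep_alive,
    }


def generate(prompt, url=OLLAMA_URL, **kwargs):
    """Send a prompt and return the generated text.

    The first call after an idle period is slower because the server
    has to load the model back into memory.
    """
    body = json.dumps(build_payload(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.loads(resp.read())["response"]


# Example (needs a running server):
# print(generate("Say hello in one sentence."))
```

A shorter `keep_alive` frees his memory sooner, at the cost of a reload on the next call.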
Not sure whether there is anything similar for SD, though.
ttla|1 year ago