top | item 37148221


r13a | 2 years ago

Could anyone explain how this can be constructed as a private solution?

I'm not familiar with Azure platform.

Is the inference processed on a private instance? I can't imagine how that could be feasible given the hardware required to run GPT-3.5/4.

So the best-case scenario is:

1. A web UI runs on private instances, so any user input (chat or files) is only seen by those instances.

2. Any chat history storage or RAG is also done on those instances.

3. Embedding computation may possibly be done on the private instance.

4. The embeddings are then sent to the Microsoft GPU farm for inference.
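The flow above can be sketched in code. This is a hypothetical outline of the best-case architecture described, not Azure's actual design; every function name and the stand-in embedding are assumptions, and the comments mark where the trust boundary sits:

```python
# Hypothetical sketch of the best-case flow; all names are assumptions.

def store_chat_history(query: str) -> list[str]:
    # Step 2: chat history kept on the private instance
    return [query]

def rag_lookup(query: str, docs: list[str]) -> list[str]:
    # Step 2: naive keyword RAG, also on the private instance
    return [d for d in docs if any(w in d.lower() for w in query.lower().split())]

def compute_embedding(query: str) -> list[float]:
    # Step 3: embedding computed locally (stand-in: character codes)
    return [float(ord(c)) for c in query[:8]]

def send_to_gpu_farm(embedding: list[float], context: list[str]) -> str:
    # Step 4: this call crosses the trust boundary -- data leaves the private network
    return f"answer based on {len(context)} docs, {len(embedding)}-dim embedding"

def handle_user_query(query: str, docs: list[str]) -> str:
    store_chat_history(query)              # stays private
    context = rag_lookup(query, docs)      # stays private
    emb = compute_embedding(query)         # possibly still private
    return send_to_gpu_farm(emb, context)  # leaves the private network

print(handle_user_query("revenue report", ["annual revenue summary", "hr policy"]))
```

Only step 4 has to leave the network in this picture, which is exactly where the privacy question arises.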

So at some point my data has to leave my private network.

The problem is that the data can easily be reverse-engineered from the embeddings.
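To illustrate the concern: if an attacker who sees the embeddings has a set of candidate texts, they can rank candidates by similarity to an intercepted vector and recover (or narrow down) the original content. This toy sketch uses a made-up character-bigram "embedding" in place of a real model; the principle, not the model, is the point:

```python
import math

def toy_embed(text):
    # Stand-in for a real embedding model: character-bigram counts.
    vec = {}
    for i in range(len(text) - 1):
        bigram = text[i:i + 2].lower()
        vec[bigram] = vec.get(bigram, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a.get(k, 0) * b[k] for k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

candidates = [
    "quarterly revenue fell by 12 percent",
    "the meeting is scheduled for friday",
    "patient was prescribed 50mg daily",
]

# An "intercepted" embedding computed from a private document.
intercepted = toy_embed("Quarterly revenue fell by 12 percent")

# The attacker ranks candidate texts by similarity to the intercepted vector.
best = max(candidates, key=lambda c: cosine(toy_embed(c), intercepted))
print(best)  # the matching sentence scores highest
```

Real embedding-inversion attacks are more sophisticated than this nearest-neighbor lookup, but the leak channel is the same: the vector preserves enough of the text to identify it.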

How can this be presented as a private LLM?
