dluc | 2 years ago
Initial tests, though, are showing that summaries affect the quality of answers, so we'll probably remove them from the default flow and use them only for specific data types (e.g. chat logs).
There are a bunch of synthetic-data scenarios we want to leverage LLMs for. Without going too much into detail: sometimes "reading between the lines" of the data, some memory consolidation patterns (e.g. a "dream phase"), etc.
ddematheu | 2 years ago
For the synthetic-data scenarios, are you also thinking about synthetic queries over the data? (Trying to predict which chunks might be used more than others.)
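The idea above can be sketched roughly like this: generate likely questions for each chunk (an LLM would do this in practice), then see which chunks win the retrieval for those questions. Everything here is hypothetical, a minimal sketch with a toy word-overlap retriever standing in for a real vector search and hand-written queries standing in for LLM-generated ones:

```python
from collections import Counter

def score_chunks(chunks, synthetic_queries, top_k=1):
    """Count how often each chunk ranks in the top_k results for the
    synthetic queries. Chunks that 'win' more queries are the ones
    likely to be used more at answer time.

    The retriever here is a toy word-overlap score; a real pipeline
    would use the same embedding search used at query time."""
    hits = Counter()
    for query in synthetic_queries:
        q_words = set(query.lower().split())
        ranked = sorted(
            range(len(chunks)),
            key=lambda i: len(q_words & set(chunks[i].lower().split())),
            reverse=True,
        )
        for i in ranked[:top_k]:
            hits[i] += 1
    return hits

chunks = [
    "Invoices are due within 30 days of receipt.",
    "Support tickets are triaged daily by the on-call engineer.",
]
# In practice these would be LLM-generated ("what questions would a
# user ask about this chunk?"); here they are hand-written.
queries = [
    "when are invoices due",
    "who triages support tickets",
    "how many days to pay an invoice",
]
print(score_chunks(chunks, queries))  # chunk 0 wins two queries, chunk 1 wins one
```

The resulting counts could feed decisions like which chunks to cache, pre-summarize, or index with a more expensive model.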
dluc|2 years ago
For instance, given the user "ask" (which could be any generic message in a copilot), decide how to query one or multiple storages. Ultimately, companies and users have different storages, and only a few can be indexed with vectors (and additional fine-tuned models). But there's a lot of "legacy" structured data accessible only with SQL and similar languages, so a "planner" (in the SK sense of planners) could be useful to query vector indexes, text indexes and knowledge graphs, combining the results.
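A rough sketch of that routing step, with everything hypothetical: the `Storage` type and the keyword heuristic are stand-ins so the example stays runnable, where in the SK sense the routing decision would come from an LLM-backed planner:

```python
from dataclasses import dataclass

# Hypothetical storage handle; a real system would wrap a vector DB,
# a SQL database, and a knowledge graph behind a common interface.
@dataclass
class Storage:
    name: str
    kind: str  # "vector" | "sql" | "graph"

def plan(ask: str, storages: list) -> list:
    """Toy planner: map the user's ask to one query step per relevant
    storage. An LLM planner would decide this; a keyword heuristic
    stands in here. Returns (storage name, query) pairs whose results
    a later step would combine into one answer."""
    steps = []
    lowered = ask.lower()
    for s in storages:
        if s.kind == "vector":
            # Unstructured content: always try a semantic search.
            steps.append((s.name, f"semantic-search: {ask}"))
        elif s.kind == "sql" and any(w in lowered for w in ("total", "count", "how many")):
            # Aggregations live in the structured "legacy" data.
            steps.append((s.name, "SELECT ... FROM ... WHERE ..."))  # placeholder SQL
        elif s.kind == "graph" and "related" in lowered:
            steps.append((s.name, f"graph-traversal for: {ask}"))
    return steps

storages = [Storage("docs", "vector"), Storage("erp", "sql"), Storage("kg", "graph")]
print(plan("What was the total revenue last quarter?", storages))
# The ask hits both the vector index and the SQL storage; the graph is skipped.
```

The interesting part is the combination step that would follow: deduplicating and ranking results that come back in very different shapes (rows vs. passages vs. graph paths).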