alexwatson405 | 2 years ago
Firstly, the concept you're hinting at is not purely traditional ML. In traditional machine learning, we typically prioritize feature extraction and engineering tailored to a given problem space before training.
What you're describing, and what we've been working on at Gretel.ai, is leveraging models like Large Language Models (LLMs) to understand and extrapolate from vast amounts of diverse data without time-consuming feature engineering. Here's a link to our open-source library for synthetic data generation (currently supporting GAN- and RNN-based language models): https://github.com/gretelai/gretel-synthetics — and our recent announcement of a Tabular LLM we're training to help people build with data: https://gretel.ai/tabular-llm
A few areas where we've found tabular or Large Data Models to be really useful:

* Creating privacy-preserving versions of sensitive data

* Creating additional labeled examples for ML training (much less expensive than traditional data collection/ML techniques)

* Augmenting existing datasets with new fields, cleaning data, and filling in missing values
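To make the first use case concrete, here's a toy sketch of the idea behind model-based synthetic data: fit a simple per-column model to a real table, then sample fresh rows from that model instead of sharing the originals. This is a deliberately simplified stand-in (independent per-column distributions), not Gretel's actual GAN/RNN/LLM approach — all function names and the sample table are hypothetical.

```python
# Toy sketch of synthetic tabular data generation: learn per-column
# distributions from real rows, then sample new rows from the model.
# (Real systems model cross-column correlations; this toy does not.)
import random
import statistics


def fit_columns(rows):
    """Fit a simple model per column: (mean, stdev) for numeric
    columns, the observed value list for categorical ones."""
    models = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            models[col] = ("numeric", statistics.mean(values),
                           statistics.stdev(values))
        else:
            models[col] = ("categorical", values)
    return models


def sample_rows(models, n, seed=0):
    """Draw n synthetic rows from the fitted column models; no real
    record is copied wholesale into the output."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        row = {}
        for col, model in models.items():
            if model[0] == "numeric":
                _, mu, sigma = model
                row[col] = rng.gauss(mu, sigma)
            else:
                row[col] = rng.choice(model[1])
        out.append(row)
    return out


# Hypothetical source table standing in for sensitive customer data.
real = [
    {"age": 34, "income": 72000.0, "plan": "pro"},
    {"age": 29, "income": 58000.0, "plan": "free"},
    {"age": 41, "income": 91000.0, "plan": "pro"},
    {"age": 37, "income": 66000.0, "plan": "free"},
]
synthetic = sample_rows(fit_columns(real), n=100)
```

The same sampling loop also illustrates the second bullet: once a model is fit, generating 100 or 100,000 labeled rows costs the same small amount of compute, versus collecting and labeling new real-world records.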
Lots of mentions of RLHF in this thread; one area where I think RLHF will be super helpful is ensuring that LLM data models return diverse and ethically fair results (hopefully better than the data they were trained on). Cheers!