ace_zhao | 1 year ago

For filtering, we currently use a two-step process, similar to how a person skims before reading in full. The first step screens each item by its title and brief description, and anything that clearly meets the criteria goes straight into the digest. Items that aren't clearly suitable get a second step: a full evaluation of the complete data. To save costs, we use a mentor-apprentice model: state-of-the-art models (4o and Sonnet 3.5) handle the initial evaluations, and their outputs are recorded as few-shot examples. Those examples then guide more cost-effective models in subsequent processing.
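
To make the flow concrete, here is a rough Python sketch of the two-step screen plus the mentor-apprentice handoff. It is only an illustration under assumptions, not our actual implementation: `Item`, `Example`, `Filter`, `build_prompt`, and the `warmup` threshold are all hypothetical names, and the two "models" are plain callables standing in for real API calls to an expensive and a cheap model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for a model call: takes a prompt, returns "yes" or "no".
Model = Callable[[str], str]


@dataclass
class Item:
    title: str
    summary: str
    body: str


@dataclass
class Example:
    text: str
    verdict: bool


def build_prompt(text: str, examples: List[Example]) -> str:
    """Prefix the query with recorded mentor verdicts as few-shot examples."""
    shots = [f"Text: {e.text}\nKeep: {'yes' if e.verdict else 'no'}" for e in examples]
    return "\n".join(shots + [f"Text: {text}\nKeep:"])


@dataclass
class Filter:
    mentor: Model          # expensive, state-of-the-art model
    apprentice: Model      # cheaper model guided by the mentor's examples
    warmup: int = 3        # assumption: switch after this many recorded examples
    examples: List[Example] = field(default_factory=list)

    def _judge(self, text: str) -> str:
        if len(self.examples) < self.warmup:
            # Mentor phase: evaluate and record the verdict as a few-shot example.
            answer = self.mentor(build_prompt(text, []))
            self.examples.append(Example(text, answer == "yes"))
            return answer
        # Apprentice phase: the cheap model sees the recorded examples.
        return self.apprentice(build_prompt(text, self.examples))

    def keep(self, item: Item) -> bool:
        # Step 1: screen on title and brief description only.
        if self._judge(f"{item.title}. {item.summary}") == "yes":
            return True
        # Step 2: not clearly suitable, so evaluate the full data.
        return self._judge(item.body) == "yes"
```

The fixed `warmup` count is just the simplest switching rule for the sketch; when to hand off from mentor to apprentice is a design choice this example does not settle.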
