(no title)
merksittich | 7 months ago
- Are you providing reasoning traces, responses or both?
- Are you evaluating reasoning traces, responses or both?
- Has your work shifted towards multi-turn or long-horizon tasks?
- If you also work with chat logs from actual users, do you think they are properly anonymized? Or do you believe you could de-anonymize them without major effort?
- Do you have contact with other evaluators?
- How do you (and your colleagues) feel about the work (e.g., moral qualms about "training your replacement", pride in furthering civilization, or is it just about the money)?
unknown | 7 months ago
[deleted]