phh | 3 months ago

80% is catastrophic though. In a classroom of 30 pupils who are all honest, 6 will get a 0 mark because the software says it's AI?

kelseyfrog|3 months ago

80% accuracy could mean 0 false negatives and 20% false positives.

My point is that accuracy is a terrible metric here; sensitivity and specificity tell us much more about the task at hand. In that formulation, any specificity < 1 produces false positives, and it isn't fair to make those students prove their innocence.
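To make the point above concrete, here is a minimal sketch with made-up numbers (100 essays, half AI-written, half human-written; none of these counts come from any vendor). It shows a detector with zero false negatives and 20 false positives landing at exactly 80% accuracy while its specificity is only 60%.

```python
def metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,     # correct guesses / all guesses
        "sensitivity": tp / (tp + fn),     # share of AI essays that get flagged
        "specificity": tn / (tn + fp),     # share of human essays left alone
    }

# Hypothetical counts: 50 AI essays (all caught), 50 human essays (20 wrongly flagged).
m = metrics(tp=50, fn=0, fp=20, tn=30)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

The headline number looks fine, but under these assumed counts 20 of the 50 honest writers are flagged.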

soVeryTired|3 months ago

That's more like the false positive rate and false negative rate.

If we're being literal, accuracy is (number of correct guesses) / (total number of guesses). Maybe the folks at Turnitin don't actually mean 'accuracy', but if they're selling an AI/ML product they should at least know their metrics.

CaptainNegative|3 months ago

It depends on their test dataset. If the test set was written 80% by AI and 20% by humans, a tool that labels every essay as AI-written would have a reported accuracy of 80%. That's why other metrics such as specificity and sensitivity (among many others) are commonly reported as well.

Just speaking in general here -- I don't know what specific phrasing Turnitin uses.
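The 80/20 test set described above can be sketched in a few lines. This is purely illustrative: the dataset is synthetic and `always_ai` is a hypothetical stand-in for a useless detector, not anything a real product does.

```python
# Synthetic test set: 80 AI-written essays and 20 human-written ones.
labels = ["ai"] * 80 + ["human"] * 20

def always_ai(_essay_label):
    """Hypothetical 'detector' that flags every essay as AI-written."""
    return "ai"

predictions = [always_ai(label) for label in labels]

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Specificity: of the human-written essays, how many were correctly left alone?
human_total = labels.count("human")
human_cleared = sum(p == "human" and y == "human" for p, y in zip(predictions, labels))
specificity = human_cleared / human_total

print(f"accuracy: {accuracy:.2f}, specificity: {specificity:.2f}")
```

The accuracy comes out at 80% even though every single human writer is flagged, which is why a lone accuracy figure says little without the class balance of the test set.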

yoavm|3 months ago

The promise (not saying that it works) is probably that 20% of people who cheated will not get caught. Not that 20% of the work marked as AI is actually written by humans.

v9v|3 months ago

I suppose 80% means you don't give them a 0 mark because the software says it's AI, you only do so if you have other evidence reinforcing the possibility.

respondo2134|3 months ago

no, you multiply their result by .8 to account for the "uncertainty"! /s

j45|3 months ago

I think it means every time AI is used, it will detect it 80% of the time. Not that 20% of the class will be marked as using AI.

respondo2134|3 months ago

you're missing out on the false positives though; catching 80% of cheaters might be acceptable, but a 20% false positive rate (not the same thing as 20% of the class) would not be. AI-generated content and plagiarism are completely different detection problems.
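A rough sketch of why the false positive rate matters separately from the catch rate. The class size of 30 comes from the top comment; the 80% sensitivity, the 20% false positive rate and the number of cheaters are assumptions picked for illustration, and `expected_flags` is a hypothetical helper.

```python
def expected_flags(n_students, n_cheaters, sensitivity, false_positive_rate):
    """Expected number of correct and incorrect flags in one class."""
    n_honest = n_students - n_cheaters
    true_flags = n_cheaters * sensitivity          # cheaters who get caught
    false_flags = n_honest * false_positive_rate   # honest students flagged anyway
    flagged = true_flags + false_flags
    # Precision: of the flagged students, what share actually used AI?
    precision = true_flags / flagged if flagged else float("nan")
    return true_flags, false_flags, precision

# Assumed scenarios: an all-honest class, and a class with 5 cheaters.
for cheaters in (0, 5):
    caught, wrongly_flagged, precision = expected_flags(30, cheaters, 0.8, 0.2)
    print(f"cheaters={cheaters}: caught={caught:.1f}, "
          f"wrongly flagged={wrongly_flagged:.1f}, precision={precision:.2f}")
```

Under these assumptions the all-honest class still gets about 6 flags, and with 5 cheaters fewer than half of the flagged students actually used AI.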