maxspero | 2 years ago

Nice to hear of someone else trying this. Did you find any good ways to reliably trick these?

What do you mean by "it won't work long term"? In my opinion, RLHF and fine-tuning outputs for safety and politeness end up watermarking the output in a way that's pretty reliably detectable. I don't see these going away any time soon, at least for mass-market AI products.

tluyben2|2 years ago

I guess if you are not very sure that the 0.13% false-positive rate is correct, then these tools are going to make things worse rather than better, as people will be wrongly accused of cheating.

Example:

“If we work in a particular matrix basis, then the equation determines the eigenvectors of H. One puts in a particular value of the energy E_j, and looks for the ket-vector |E_j> that solves the equation. It is also an equation that determines the eigenvalues E_j. If you put in an arbitrary value of E_j, in general there will not be a solution for the eigenvector. Let's take a very simple example: Suppose the Hamiltonian is the matrix ħω σ_z. Since σ_z has only two eigenvalues, namely ±1, the Hamiltonian also has only two eigenvalues, ±ħω. If you put any other value on the right hand side of Eq. 4.28, there will not be a solution. Because the operator H represents energy, we often call E_j the energy eigenvalues and |E_j> the energy eigenvectors of the system.”

You say 96% AI; it’s definitely not. It’s from “Quantum Mechanics: The Theoretical Minimum” by Friedman and Susskind.

Even worse:

“If we have some indications that classical wave theory is macroscopically correct, it is nevertheless clear that on the microscopic scale only the corpuscular theory of light is able to account for typical absorption and scattering phenomena such as the photoelectric effect and the Compton effect, respectively. One must still ascertain how the photon hypothesis may be reconciled with the essential wave-like phenomena of interference and diffraction.”

This hits 99.9%, even though it is from Messiah, written 60 years or so ago.

tluyben2|2 years ago

I can generate text with GPT-4 and rewrite it with another model. Asking GPT-4 to write in another style also works.

For instance, the following scores 0% on both of our tools, and it’s GPT-4:

"Well, sit tight folks, I'll tell you. It's like my mother always said, 'Ceilings are generally over our heads.' What I mean is, the material for my jokes come from what's above us, below us - essentially, what's around us. And let me tell you, there's plenty going on.

Just the other day, I was stuck in traffic behind a bloke in a convertible... in the rain... with the top down. Now if that doesn't scream 'commitment issues', I don't know what does.

Well, either that or he's got a very specific car washing technique. In which case, mate, you're doing it all wrong! My car gets a better wash in the British summer rain than that."