hackerknew | 11 months ago

Could we train an AI model on the corpus of physics knowledge up to the year 1905 and then see if we can adjust the prompt to get it to output the theory of relativity?

This would be an interesting experiment for other historical discoveries too. I'm now curious whether anybody has created a model trained only on "old data", such as documents and books from hundreds of years ago, to see if it comes up with the same conclusions as the researchers and scientists of the past.

Would AI have been able to predict the effectiveness of vaccines, insulin, or other medical discoveries?

Garlef|11 months ago

Great idea!

But there might not be enough text.

And there's a problem similar to the one that makes double-blind studies necessary: the questions we pose to such a system would be contaminated by our cultural background. We might be leading the system.

And if the system is autonomous and we wait for something true to appear, how would we know that the final system, trained on current data, produced something worthwhile?

Take maths: producing new proofs and new theorems might not be the issue. Rather, why should we care about these results? Thousands of PhD students produce new mathematics all the time, and most of it is irrelevant.

esafak|11 months ago

That's the ideal, but I think today's models are too crude for that. General relativity is built on differential geometry, which was fairly new at the time. I think inventing that, or even building on it, is beyond today's models; there's an infinitely large space of mathematics that can be invented, and barely a gradient to guide the search. Humans don't coin mathematics by gradient descent. The most I've seen is fitting observations using existing mathematics, a technique known as symbolic regression. The E=mc^2 equation could be curve-fitted like this, but it would afford no insight.

https://en.wikipedia.org/wiki/Symbolic_regression
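
To make the curve-fitting point concrete, here is a minimal sketch of a toy symbolic regression in Python. Everything in it is an assumption made for illustration: the function name `fit_power_law`, the restriction to candidate forms E = k * m^a * c^b with small integer exponents, and the synthetic samples (generated from E = m * c^2, with `c` allowed to vary as a stand-in parameter). Real symbolic regression tools search a far larger space of expressions.

```python
import itertools


def fit_power_law(samples, exponents=range(-3, 4)):
    """Brute-force search over candidate forms E = k * m**a * c**b.

    samples is a list of (m, c, E) tuples with non-zero m and c.
    For each integer pair (a, b), the constant k is set by closed-form
    least squares; the candidate with the smallest squared error wins.
    Returns (error, a, b, k).
    """
    best = None
    for a, b in itertools.product(exponents, repeat=2):
        # Value of the candidate expression (without k) for each sample.
        basis = [m**a * c**b for m, c, _ in samples]
        denom = sum(x * x for x in basis)
        # Least-squares estimate of the single free constant k.
        k = sum(x * e for x, (_, _, e) in zip(basis, samples)) / denom
        err = sum((k * x - e) ** 2 for x, (_, _, e) in zip(basis, samples))
        if best is None or err < best[0]:
            best = (err, a, b, k)
    return best


# Synthetic "observations" (hypothetical data, not real measurements).
samples = [
    (m, c, m * c**2)
    for m in (1.0, 2.0, 3.0)
    for c in (2.0, 3.0, 5.0)
]

err, a, b, k = fit_power_law(samples)
print(f"best fit: E = {k:g} * m^{a} * c^{b} (squared error {err:g})")
```

The search recovers the exponents, but only because the right family of formulas was handed to it up front. Nothing in the procedure explains why mass and energy should be related at all, which is the "no insight" part.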

ilamparithi|11 months ago

Had the same thought some time back about AI discovering the theory of relativity with only the data from before 1905. It would give a definite answer about whether any reasoning is involved in the LLM output.