top | item 44853066


warsheep | 6 months ago

What you're describing is a simplified version of gradient descent (tweaking the weights) and online learning (working on one sample at a time).

This version will not get you far. You will just train a model that solves the last math problem you gave it, and maybe some others, but it will probably forget the first ones.

There are other similar procedures that train better than this, but they've been tried and are currently worse than classical SGD with large batches.
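
To make the forgetting concrete, here is a toy sketch in plain Python. It is not a neural network, just a two-weight linear model, and every name in it (`PROBLEM_A`, `PROBLEM_B`, `sgd_step`, the learning rate, the step counts) is made up for illustration. It trains on one problem until it fits, then on the other, and compares that with gradient steps over a batch containing both.

```python
# Two "problems", each a single (input, target) pair. Hypothetical toy data:
# a 2-weight linear model can fit both at once (w = [2, -2] solves both).
PROBLEM_A = ((1.0, 0.5), 1.0)
PROBLEM_B = ((0.5, 1.0), -1.0)


def predict(w, x):
    """Linear model: dot product of weights and input."""
    return w[0] * x[0] + w[1] * x[1]


def sq_error(w, problem):
    """Squared error of the model on one problem."""
    x, y = problem
    return (predict(w, x) - y) ** 2


def sgd_step(w, batch, lr=0.1):
    """One gradient step on the mean squared error over `batch`."""
    grad = [0.0, 0.0]
    for x, y in batch:
        err = predict(w, x) - y
        grad[0] += 2 * err * x[0] / len(batch)
        grad[1] += 2 * err * x[1] / len(batch)
    return [w[0] - lr * grad[0], w[1] - lr * grad[1]]


def train_one_at_a_time(steps=200):
    """Online style: fit problem A alone, then problem B alone."""
    w = [0.0, 0.0]
    for problem in (PROBLEM_A, PROBLEM_B):
        for _ in range(steps):
            w = sgd_step(w, [problem])
    return w


def train_batched(steps=2000):
    """Every step sees both problems in one batch."""
    w = [0.0, 0.0]
    for _ in range(steps):
        w = sgd_step(w, [PROBLEM_A, PROBLEM_B])
    return w


if __name__ == "__main__":
    for name, w in (("one at a time", train_one_at_a_time()),
                    ("batched", train_batched())):
        print(name, "error on A:", sq_error(w, PROBLEM_A),
              "error on B:", sq_error(w, PROBLEM_B))
```

In this toy setup the one-at-a-time run ends up fitting B and getting A badly wrong, while the batched run fits both. Real networks and real datasets are much messier, so treat this only as an illustration of the mechanism.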


vivzkestrel|6 months ago

What books or courses do you recommend for me to go from the basic neural network to whatever is currently considered cutting edge, or at least standard, in AI?