top | item 43292104

juxtaposicion | 1 year ago

It’s interesting to see how differentiable logic/binary circuits can be made cheap at inference time.

But what about the theoretical expressiveness of logic circuits vs. baselines like MLPs? (And then of course compared to CNNs and other kernels.) Are logic circuits roughly equivalent in terms of the memory and compute used? For my use case, I don’t care about making inference cheaper (e.g., the benefit logic circuits bring). But I do care about the recursion in space and time (the benefit from CAs). Would your experiments work if you still had a CA, but used dumb MLPs?

scarmig | 1 year ago

Well, with all 16 logic gates available, they can express all Boolean circuits (you could get that even with just NAND or NOR gates, of course, if you are working with arbitrary as opposed to fixed connectivity). And so you could have a 32-bit output vector which could be taken as a float (and you could create a circuit that computes any bitwise representation of a real).
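(The NAND-universality claim is the standard one; as a quick illustration, not from this thread, here is XOR built out of four NAND gates:)

```python
# XOR from four NAND gates -- the classic demonstration that NAND
# alone suffices, given freedom in how gates are wired together.
def nand(a, b):
    return 1 - (a & b)

def xor(a, b):
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))

print([xor(a, b) for a, b in ((0, 0), (0, 1), (1, 0), (1, 1))])  # [0, 1, 1, 0]
```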

As for efficiency, it would depend on the problem. If you're trying to learn XOR, a differentiable logic gate network can learn it with a single unit with 16 parameters (actually, 4, but the implementation here uses 16). If you're trying to learn a linear regression, a dumb MLP would very likely be more efficient.
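(A minimal sketch of that XOR claim, written from scratch rather than taken from the paper's code: one differentiable gate unit holds 16 logits, one per candidate gate, each gate relaxed to its multilinear "probabilistic" form on [0, 1] inputs, and gradient descent on the softmax mixture picks out XOR.)

```python
import numpy as np

# The 16 two-input Boolean functions, relaxed to multilinear maps on [0,1]^2.
GATES = [
    lambda a, b: np.zeros_like(a),         # FALSE
    lambda a, b: a * b,                    # AND
    lambda a, b: a - a * b,                # A AND NOT B
    lambda a, b: a,                        # A
    lambda a, b: b - a * b,                # NOT A AND B
    lambda a, b: b,                        # B
    lambda a, b: a + b - 2 * a * b,        # XOR  (index 6)
    lambda a, b: a + b - a * b,            # OR
    lambda a, b: 1 - (a + b - a * b),      # NOR
    lambda a, b: 1 - (a + b - 2 * a * b),  # XNOR
    lambda a, b: 1 - b,                    # NOT B
    lambda a, b: 1 - b + a * b,            # A OR NOT B
    lambda a, b: 1 - a,                    # NOT A
    lambda a, b: 1 - a + a * b,            # NOT A OR B
    lambda a, b: 1 - a * b,                # NAND
    lambda a, b: np.ones_like(a),          # TRUE
]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
w = rng.normal(size=16) * 0.1              # the 16 trainable parameters

A = np.array([0.0, 0.0, 1.0, 1.0])
B = np.array([0.0, 1.0, 0.0, 1.0])
Y = np.array([0.0, 1.0, 1.0, 0.0])         # XOR truth table

lr = 1.0
for _ in range(500):
    p = softmax(w)
    outs = np.stack([g(A, B) for g in GATES])  # (16, 4): each gate on all inputs
    yhat = p @ outs                            # soft mixture of gate outputs
    err = yhat - Y
    dL_dp = outs @ err                         # dL/dp_i for squared-error loss
    grad = p * (dL_dp - p @ dL_dp)             # backprop through the softmax
    w -= lr * grad

best = int(np.argmax(softmax(w)))
print(best)  # index of the most probable gate after training
```

After hardening (taking the argmax gate), the unit computes XOR exactly; at inference time the softmax disappears entirely, which is where the cheapness comes from.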

gwern | 1 year ago

An MLP must be compilable to some arrangement of logic gates, so you could always try a tack like initializing everything as randomly-wired/connected MLPs, perhaps doing some pretraining, and then compiling to the logic-gate version and training the logic gates directly. Or take the MLP random initialization, and imitate its distribution as your logic-gate distribution for initialization.
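(One hypothetical reading of the initialization idea, for the 2-input case only: binarize an MLP neuron to a threshold unit, enumerate its truth table over Boolean inputs, and bias the matching gate's logit. All names here are mine, not from any actual implementation.)

```python
import itertools
import numpy as np

def neuron_truth_table(w, b):
    """Binarized neuron over Boolean inputs: outputs 1 where w.x + b > 0."""
    return tuple(int(np.dot(w, x) + b > 0)
                 for x in itertools.product((0, 1), repeat=len(w)))

def gate_index(table):
    """Encode a truth table on (0,0),(0,1),(1,0),(1,1) as a 4-bit gate index."""
    return sum(bit << i for i, bit in enumerate(table))

w = np.array([1.0, 1.0])
b = -1.5                                  # this neuron fires only on (1,1): AND
idx = gate_index(neuron_truth_table(w, b))

logits = np.zeros(16)
logits[idx] += 3.0                        # start the gate unit biased toward AND
print(idx)                                # 8 == binary 1000: AND's truth table
```

Neurons with wider fan-in would need decomposing into 2-input pieces first, which is where the real compilation work would be.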