12,996 views

What if we could do away with all the complexities of a neuron and model neural networks with logic gates instead? Logic gates are fundamentally not differentiable, but with some modifications we can make them differentiable. We can also let the network learn which logic gate to use via a differentiable categorical distribution. This interesting NeurIPS 2022 paper shows that logic gate networks achieve much faster inference with accuracy similar to that of conventional neural networks. Scaling them up is still an issue, though, and we discuss some approaches that could potentially help in the next phase of improvements.
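The two relaxations discussed in the video can be sketched in a few lines. The forms below are the standard probabilistic relaxations of Boolean gates (product for AND, inclusion-exclusion for OR); this is a minimal illustration, not the paper's exact code:

```python
# A minimal sketch of "real-valued logic" (Relaxation 1 in the video):
# replace hard Boolean gates with relaxations that agree with the
# truth table at 0/1 but are smooth and differentiable in between.
def soft_and(a, b):
    return a * b             # equals 1 only when both inputs are 1

def soft_or(a, b):
    return a + b - a * b     # equals 0 only when both inputs are 0

def soft_xor(a, b):
    return a + b - 2 * a * b

# At the corners these match the hard gates:
print(soft_and(1.0, 1.0), soft_or(0.0, 1.0), soft_xor(1.0, 1.0))
# In between, they give smooth intermediate values:
print(soft_and(0.9, 0.8))  # ≈ 0.72
```

Because each relaxed gate is a polynomial in its inputs, gradients flow through it just like through any other differentiable layer.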

Some references:

Paper: arxiv.org/pdf/2210.08277

DiffLogic Code Implementation: github.com/Felix-Petersen/dif...

Slides: github.com/tanchongmin/Tensor...

De Morgan's Laws: en.wikipedia.org/wiki/De_Morg...

Universal Logic Gates: www.electronics-tutorials.ws/....

Gated Linear Units (GLU): arxiv.org/abs/1908.07442

medium.com/deeplearningmadeea...

~~~~~~~~~~~~~~~~~~~~~~~~~~~

0:00 Introduction

1:48 Perceptron and Logic Gates

16:08 Differences between Perceptron and Logic Gates

20:10 What Logic Gates to model?

23:26 Logic Gates Network Overall Architecture

36:02 Difficulty in training Logic Gates

37:17 Relaxation 1: Real-valued Logics

38:33 Relaxation 2: Distribution over the choice of parameter

43:55 Training Setup

45:05 Configuring output for classification

49:21 Results

59:04 Exponential Growth of Gates

59:44 Limitations

1:01:43 My thoughts on how to model biological neurons

1:08:40 Discussion

AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain them in a simple and relatable way. I am also an avid game creator.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Discord: discord.gg/fXCZCPYs

Online AI blog: delvingintotech.wordpress.com/

LinkedIn: www.linkedin.com/in/chong-min...

Twitch: www.twitch.tv/johncm99

Twitter: johntanchongmin

Try out my games here: simmer.io/@chongmin

44:12 The normal distribution is used not just for neural network weights, but also for the weights of the categorical distribution that chooses which logic gate to use.

28:54 I actually meant to refer to Gated Linear Units (GLU), not GeLU. GLUs are implemented in TabNet (arxiv.org/abs/1908.07442). The idea is to have the input act as a "volume control" via a sigmoid gate, which is multiplied with the neuron's original output to control how much of it flows through to the next layer.
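The "volume control" idea can be sketched directly. The two-projection form below (one linear branch, one sigmoid-gated branch) follows the common GLU formulation; the names `W` and `V` and the shapes are illustrative, not taken from TabNet's code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu(x, W, V):
    """Gated Linear Unit sketch: the sigmoid branch acts as a
    'volume control' on the linear branch, deciding how much of
    the signal flows on to the next layer."""
    return (x @ W) * sigmoid(x @ V)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))    # batch of 4 inputs, 8 features
W = rng.normal(size=(8, 16))   # linear branch
V = rng.normal(size=(8, 16))   # gating branch
out = glu(x, W, V)
print(out.shape)  # (4, 16)
```

Since the sigmoid output lies strictly in (0, 1), the gate can only attenuate the linear branch, never amplify it, which is exactly the volume-control behaviour described above.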

1:03:33 Do note that capping the output at 1 can also lead to vanishing gradients if we are at the saturation point of an activation function like sigmoid or tanh. ReLU was actually designed to help with vanishing gradients, but it can cause exploding gradients instead, as it passes the entire gradient from the next layer down to the earlier layers. Overall, vanishing/exploding gradients are really a result of backpropagation through many layers, though they can be worsened by larger/smaller weights. There is still some merit to limiting the output, to prevent large-magnitude outputs from causing excessive weight changes via backpropagation. The link between capping the output at 1 and solving vanishing/exploding gradients is not as direct as I intended, and would need a fundamental relook at backpropagation.
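The saturation effect mentioned here is easy to check numerically. A small sketch, using the closed-form derivatives of sigmoid and ReLU:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# The sigmoid's gradient is s(x) * (1 - s(x)). Near saturation it
# collapses toward 0, which is how capping outputs with a squashing
# function can still starve earlier layers of gradient.
for x in (0.0, 5.0, 10.0):
    s = sigmoid(x)
    print(f"x = {x:5.1f}   sigmoid' = {s * (1 - s):.6f}")

# ReLU instead passes the upstream gradient through unchanged for
# x > 0: no saturation, but also no damping of large gradients.
def relu_grad(x):
    return 1.0 if x > 0 else 0.0

print(relu_grad(10.0), relu_grad(-1.0))
```

Running this shows the sigmoid gradient peaking at 0.25 at x = 0 and shrinking by orders of magnitude by x = 10, while the ReLU gradient stays at exactly 1 for any positive input.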

The obvious solution is to model the architecture like a biological system and use competing feedback loops to find a tunable equilibrium.

Or find an algorithm that can jump through relative maxima in n-dimensional space to find a lower state on the other side, a la quantum tunneling.

5:10 The universal gates are actually NAND and NOR. Refer to www.electronics-tutorials.ws/logic/universal-gates.html#:~:text=Universal%20Logic%20Gates%20using%20only,it%20a%20universal%20logic%20gate Also, De Morgan's Laws just help to simplify Boolean/logical expressions; they do not guarantee expressivity. It is the universal gates that guarantee any Boolean expression can be built from a single fixed gate type.
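The universality of NAND can be verified directly: every basic gate is expressible in terms of NAND alone. A small sketch over 0/1 integer inputs:

```python
# Every basic gate built from NAND alone, which is why NAND
# (like NOR) is called a universal gate.
def nand(a, b):
    return 1 - (a & b)

def not_(a):
    return nand(a, a)          # NOT via NAND of an input with itself

def and_(a, b):
    return not_(nand(a, b))    # AND = NOT(NAND)

def or_(a, b):
    return nand(not_(a), not_(b))  # De Morgan: A OR B = NOT(NOT A AND NOT B)

# Print the full truth table to check each construction:
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", and_(a, b), or_(a, b))
```

Note that the OR construction is exactly De Morgan's Law at work: negate both inputs, NAND them, and you get OR.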

An interesting topic. Eventually it is about the math that most closely matches fast operations on hardware (like multiply-add). In this problem scenario, it might not be about the gate types and what they can do, but rather the connections themselves, i.e. where an axon gets the most reward (think from the perspective of the connecting lines, not the neurons). One could evolve a network based on the idea of self-connecting axons, maybe guided as in "ant food" simulations (those pixel ants leaving trails showing where to find food), which is a kind of evolutionary road mapping. In such a network, generic neurons of all types could be added, e.g. AND, OR, NAND, XOR, delays, even bit/byte/int array shifters.

I agree. Modifying the strength of connections, rather than changing the gates, makes more sense to me. The ant simulation idea is interesting; I wonder if it could also work for traditional neural network training.

@John Tan Chong Min Well, maybe it can mimic the growing structure of the brain too, though I wouldn't know how to code your novel ideas.

Interesting work, well done. Those who keep interrupting with questions are just annoying. Let the man finish his idea, then ask; most of them are commenting, not asking.

Haha, thanks :) No worries, these people are my friends; they are just very eager to clarify.

You are underrated, sir.

Are neurons accessed the same way a logic gate would be? I assume gates would need to be accessed through the initial source of the chain, whereas neurons seem to be accessed independently. Maybe I misunderstand; English is my second language.

Thanks for the question. A neuron has inputs, just like a logic gate. The only difference is that the neuron has a learnable function mapping inputs to outputs, y = learnable_fn(x), while a logic gate has a fixed function, y = fixed_fn(x). The paper makes the function "learnable" by choosing one of the 16 two-input logic gates, which I feel may be the cause of training instability, because the change between logic gates can be very drastic.
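The "choosing one of the gates" mechanism (Relaxation 2 in the video) can be sketched as a softmax mixture over relaxed gates. This toy version uses 4 of the 16 two-input gates; the variable names are illustrative, not from the paper's code:

```python
import numpy as np

# Real-valued relaxations of a few of the 16 two-input gates.
GATES = [
    lambda a, b: a * b,              # AND
    lambda a, b: a + b - a * b,      # OR
    lambda a, b: a + b - 2 * a * b,  # XOR
    lambda a, b: 1 - a * b,          # NAND
]

def soft_gate(a, b, z):
    """Learnable logits z define a categorical distribution over
    gates; the output is the softmax-weighted mixture, so gradients
    flow into z and training can 'choose' a gate."""
    p = np.exp(z - z.max())
    p /= p.sum()
    return sum(pi * g(a, b) for pi, g in zip(p, GATES))

rng = np.random.default_rng(0)
z = rng.normal(size=len(GATES))     # random init, as in the video

print(soft_gate(1.0, 1.0, z))       # a blend of all four gate outputs
# With one logit dominant, the mixture collapses toward that gate:
print(soft_gate(1.0, 1.0, np.array([10.0, 0, 0, 0])))  # ≈ AND(1,1) = 1
```

After training, the softmax is typically sharp enough that the argmax gate can be hard-wired, which is where the fast inference comes from. The drastic jumps the reply mentions correspond to probability mass shifting between gates with very different truth tables.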

I see! Thank you for the response! This is all very fascinating.

with this information i can now understand minecraft redstone :)

haha how are you planning to do this?

aah quality content

not A and not B = not (A or B)
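The identity above is one of De Morgan's Laws, and it can be brute-force checked over all inputs:

```python
from itertools import product

# Exhaustively verify De Morgan's law quoted above:
#   (not A) and (not B)  ==  not (A or B)
assert all(
    ((not a) and (not b)) == (not (a or b))
    for a, b in product([False, True], repeat=2)
)
print("De Morgan's law holds for all inputs")
```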

Genius
