Large Language Models: Part 2

How do language models like GPT and PaLM work?
Part 1: • Large Language Mo...
See next: text-to-image (Parti, Imagen, Dall-E): • Text to Image in ...
0:00 - intro
0:14 - next word prediction
0:20 - word embeddings
1:01 - transformers
3:11 - generating text
4:13 - stacking attention layers
4:47 - training data
5:21 - GPT-3 examples

Tim Hulse
I really enjoyed both of these LLM videos. They are so concise and informative and the pacing is excellent.
He is an Arab from the Middle East. My dog wants a walk.
I [or shall I say, my internal GPT?] first misread your comment as "... and the pancake is excellent". 😃
This is one of the best explainer vids on LLMs I’ve seen yet. Not too long, not too short, good pacing, good visualizations. Great work, thanks!
kahoku
Just like human "brain cramps" or "brain farts". The problem is that the current models aren't self-learning or built to analyse their own answers, so they can't correct these errors before output or adjust their weights later when confronted with new data.
It re-raises old but interesting points about the difference between language and logic. You can make linguistically sensible statements that defy logic. Try the book “Gödel, Escher, Bach” if you like this area.
6:02 If you are just south of the North Pole, turn right and turn left over the North Pole, you would be heading South. I suspect it is getting confused due to the similar riddle where if you head North, turn right and you are now heading South - where are you? It has probably seen this riddle a confusing number of times and woven that into its weights/response.
Because it doesn't necessarily understand how directions work, it only knows how "north" and "right and left" have been used in language before, and its answer is only an estimation. People are unlikely to talk about compass directions and turning right or left together, so there was probably not enough material similar to your query in its training data that it could draw on, and other examples such as "north and south" have taken more weight.
Logic is a whole different thing ig
I like that you used a recipe prompt to demonstrate what a LLM is good at doing, then actually followed the recipe and proved that it actually worked (and tasted good!).
Great stuff. Possibly the best intro material to LLMs that I have seen. Thumbs up!
I think this is the best, most intuitive and most illustrative video describing LLMs/transformers. Thank you so much!
This is the single best explanation I’ve come across on LLMs
Thanks so much for both of these videos. They are wonderful. I think I understood them a bit more since I’ve done some basic assisted machine learning dev (up to neural networks). If anyone is a bit lost, read up on linear and logistic regression, then onto neural networks.
This is a really good intro indeed! I encourage you to make more content like this
Really easy to follow, well paced, easy on the ear, and just the right level thanks!
You really have a talent to teach things.
These are two great videos that introduced how large language models work in a very comprehensive way.👍👍👍
Brilliant overview for a non-technical person like me... and glad to see you tested the recipe!
So pleased to get a clear and credible glimpse under the hood. Thank you.
EXCELLENT vid, PLS DO MORE on deep learning, covering the whole workflow of making an LLM, especially what LoRA is, vector embeddings, etc. I'm sure you'll get huge interest. You have a gift for explaining. Thanks!
Very much enjoyed these two videos. More please! Clear and detailed.
This is my first search for an LLM explanation and I am very pleased with the video. I am not a mathematician or programmer but I am very interested in learning how LLMs work. From my humble perspective I can say we reached a point of no return and this technology is progressing at an exponential rate. With the development of quantum computing, I have no doubt that it will surpass human intelligence in ways we don't understand.
Zuqini
I'm a bit confused by how stacking attention layers works at 4:12. Does the second layer take the first layer's prediction as input? Is the first layer's prediction still "next words" at that point, or is it now some sort of abstract intermediate value? How exactly does that capture higher level reasoning? Would appreciate any clarification!
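On the stacking question above: in a transformer, each layer passes a list of vectors (one per token) to the next layer, not word predictions. Only the output of the last layer is turned into next-word scores. The sketch below is a toy illustration under loud assumptions: the attention layer has no learned query/key/value weights, no feed-forward block and no normalization, and the names `attention_layer`, `run_stack`, `DIM` and the random embeddings are all hypothetical.

```python
import math
import random

def softmax(xs):
    # Subtract the max for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_layer(vectors):
    """Toy causal self-attention without learned weights (an assumption):
    each position becomes its input plus a weighted average of itself
    and the positions before it."""
    dim = len(vectors[0])
    out = []
    for i, query in enumerate(vectors):
        # Scaled dot-product scores against positions 0..i only (causal).
        scores = [
            sum(q * k for q, k in zip(query, vectors[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * vectors[j][d] for j, w in enumerate(weights))
            for d in range(dim)
        ]
        # Residual connection: add the mixed context to the input vector.
        out.append([a + b for a, b in zip(query, mixed)])
    return out

def run_stack(embeddings, num_layers):
    hidden = embeddings
    for _ in range(num_layers):
        # Vectors in, vectors out: layer 2 never sees a word prediction.
        hidden = attention_layer(hidden)
    return hidden

random.seed(0)
DIM = 4  # hypothetical embedding size
tokens = ["the", "cat", "sat"]
embeddings = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in tokens]
hidden = run_stack(embeddings, num_layers=2)

# Only after the last layer is a vector projected onto the vocabulary.
vocab = ["the", "cat", "sat", "mat"]
output_matrix = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in vocab]
logits = [sum(w * h for w, h in zip(row, hidden[-1])) for row in output_matrix]
probs = softmax(logits)  # next-word probabilities for the final position
```

Because every layer re-mixes vectors that already contain context from the layer below, later layers can combine information about combinations of words, which is one intuition for why stacking helps.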
Thank you! This is excellent. I love the animations. They are helpful!
What a great walk through! Thanks so much for sharing.
Great transitions between concepts. Great illustrations. These two videos are the best of the best. What about more on other networks like R-CNN and audio nets? 😃
Thanks! I'm working on one on reinforcement learning now...
This is awesome. Would love to learn more!
I hope you keep up with these videos, they are seriously great. Already subscribed and I'll check the rest of your channel. Thank you.
Not a second wasted. Just brilliant ❤️
Thanks for the crisp walkthrough of the technology. It is a very good introduction.
Donald Tam
Why don't you ask the language model how to make a language model?
I have! Really great tips!
ask gpt to stack transformer layers and add some input output layers. add some loss functions to optimize and get a large dataset.
Thank you. Those were very clear explanations of just the right length. Loved that you cooked the pancakes too!
This is the best video on Language Models that I've seen. Probably the best on the Internet. You should maybe add ChatGPT to the title to get more views.
Thanks for an original presentation of Large Language Models. It gave me new insight.
Great overview! People need to see this video pair before freaking out that LLMs are actually intelligent.
Thank you. Just the right level for my tiny organic brain.
nice work, really good to visualize these things even though I already know this.
speicaldark
What I'm wondering is how are they correcting its errors. For traditional NN, we have heat maps but I'd like to see something similar with transformers at the highest level to see what kind of patterns it noticed. Maybe that's what they use to correct its mistakes
@w花b ChatGPT used a process called reinforcement learning from human feedback (RLHF): They started from an already trained GPT-3, which already existed at the time. Humans both submitted new sentences as input prompts to the ChatGPT model being trained and ranked the outputs (responses) of the model. The rankings of the responses were then used as reward targets to continue training the model to produce more desirable responses (measured by how the response rankings improved).
Pretty nice explanation in both videos. Thanks!
Ever think about audio synthesis and wave forms?? And how analogue synthesis utilising wave tables can offer a way to both communicate and compute information.
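The ranking step described above can be sketched with a pairwise ranking loss, one common way to turn human rankings into a training signal for a reward model. This is a minimal sketch, not a description of any production system: the function name `pairwise_ranking_loss` and the example reward values are hypothetical, and a real setup would compute the rewards with a neural network and backpropagate through this loss.

```python
import math
from itertools import combinations

def pairwise_ranking_loss(rewards_in_ranked_order):
    """Average of -log(sigmoid(r_better - r_worse)) over all pairs.

    The input lists the reward model's scores for several responses to
    one prompt, ordered from the response humans ranked best to the one
    they ranked worst. The loss is small when the scores agree with
    that human ordering.
    """
    losses = []
    # combinations() keeps input order, so `better` always precedes `worse`.
    for better, worse in combinations(rewards_in_ranked_order, 2):
        diff = better - worse
        # -log(sigmoid(diff)) written as log(1 + exp(-diff)).
        losses.append(math.log(1.0 + math.exp(-diff)))
    return sum(losses) / len(losses)

# Hypothetical scores: the first list agrees with the human ranking,
# the second list has it exactly backwards.
agreeing = pairwise_ranking_loss([2.0, 1.0, 0.0])
disagreeing = pairwise_ranking_loss([0.0, 1.0, 2.0])
print(agreeing < disagreeing)
```

Once a reward model scores responses in line with the human rankings, those scores can serve as the reward when the language model is trained further with reinforcement learning.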
Dormin
Great video, leaving a comment to let you know it was very insightful. Thank you.
eliminating bias and stereotypes from language models is a lost cause, because it's the same as asking the network to lie.
you didn't understand anything did you
Loved this! Thanks for the great video!
Great explanation, glad I stumbled upon it ! Subscribed :)
Very clear. Thanks a lot.
what bothers me is that (if I am to accept your account of them) these models still seem largely sequence based. in my time in university we focused extensively on parsing the grammar before we even thought about letting AI predict the next word. some grammatical structures are triangular in the sense that they wrap around the structure of the previous iteration and put a word in front of AND behind it. think of constructs like "on the one hand X, on the other". things like that seem impossible to learn efficiently on the basis of sequence alone, since the length of X is variable. grammatical structure also simply eliminates a lot of possibilities when it comes to next word prediction. two words that have the same literalization aren't even the same word grammatically in a lot of cases.
Very good content! Keep going! thanks.
Seth Wieder
kahoku 4 months ago
Very well made!
amazing videos!! learnt so much
Very interesting... Now I understand more about how ChatGPT works...
An excellent video on language models
R Dottin
it is learning long term dependencies in sentence structure to approximate what words should be there, but it's not necessarily correct. A good example of this was when I was trying to find research papers on a topic and I asked it for some references, and half of them were not real papers published by real people anywhere. But having a quick glance at it you would think they are real.
@Ms. Chanandler Bong And we all thought that would have been the hard part to solve. From there it's trivial to double check its own output, which GPT4 already figured out how to do on its own.
@AI_effect gpt4 is still highly incorrect tho. they report an error rate of 11% for text outputs.
@Ms. Chanandler Bong Ilya Sutskever takes that point seriously, but he thinks reliability will not be a hindrance going forward.
@R DOTTIN was this in a new paper?
I love that you finished cooking the recipe! Great video :)
Sudo
Thanks for the good explanation, very much on time
This was awesome! Thanks
imagine combining GPT 4 with AlphaGo
Very underrated and underappreciated video.
Please do more videos on LLMs!!! But also I need to know, how were the pancakes?
just wow... nothing can match this explanation..
Well done!
great video bud cheers!
What do you think of GPT4?
somethingness
Great videos! Btw, how did you like your avocado cocoa thing? 😄
Just look how far we have come in only eight months.
I loved that you actually cooked that recipe! :-)
Awesome. Thank you
Ro
No. AI, even ones as impressive as GPT, are not intelligent. At the end of the day, it's still just a word prediction function that outputs text, predicting the next most probable word. It's like the auto-correct in your phone just massively scaled up.
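The "predict the next most probable word" loop mentioned above can be illustrated with a toy bigram model. This is only a sketch of the generation loop: a real LLM replaces the count table with a transformer that looks at the whole preceding context, and the corpus and function names here are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram_counts(text):
    # Count how often each word follows each other word.
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, start, max_words):
    # Greedy decoding: always append the most frequent follower.
    words = [start]
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # no known continuation for this word
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

corpus = "the cat sat on the mat and the cat sat down"
counts = train_bigram_counts(corpus)
sentence = generate(counts, "the", 2)
print(sentence)  # prints "the cat sat"
```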
depends on what you call intelligence. it is approximating the response you likely want to hear, not one that might be correct or one that is grounded in reality. but the GPT-4 paper argues it may have sparks of intelligence. it's a really long paper but they talk about how they used the model trained purely on text and gave it the ability to draw, and it could draw rudimentary shapes like pyramids just from a description. But in the millions of books it has read there would have been descriptions of how to draw a pyramidal shape on paper. would you say it's "intelligent" or just good at remembering? how do you even measure intelligence in language models like this? it's a philosophical question as much as a scientific one.
They are undoubtedly intelligent, in my opinion. Yes they predict the next word, but that prediction is based on what it has learned, and these LLMs end up learning logic, common sense, geometry, high-level math, and spatial reasoning. They're absolutely intelligent.
@Steven Laczko current LLMs are still orders of magnitude less dense than our brains. Do you think the models have sentience? I'd also like to know your thoughts on generative networks like Stable Diffusion for art. In essence, they start by adding random noise to an input image till it's just noise. The network then learns to do the opposite: in essence, it will generate random noise and tune that noise to come up with an image similar to the input. would you say it's using its imagination?
berbudy 1 month ago
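The "adding random noise to an input image till it's just noise" step described above can be written down in a few lines. The sketch assumes the common formulation x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * noise, where alpha_bar falls from 1 (clean image) to 0 (pure noise). The four-pixel "image", the chosen alpha_bar values and the name `add_noise` are hypothetical, and the learned denoising network that reverses the process is not shown.

```python
import math
import random

def add_noise(pixels, alpha_bar, rng):
    """Blend an image with Gaussian noise.

    alpha_bar = 1.0 returns the image unchanged;
    alpha_bar = 0.0 returns pure noise with no trace of the image.
    """
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]

image = [0.9, 0.1, 0.5, 0.7]  # hypothetical 4-pixel "image"
rng = random.Random(0)
for alpha_bar in (1.0, 0.5, 0.0):
    noisy = add_noise(image, alpha_bar, rng)
    print(alpha_bar, [round(v, 2) for v in noisy])
```

During training, the network sees the noisy version and learns to estimate the noise that was added, so at generation time it can start from pure noise and remove it step by step.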
you actually made the cookies hahaha that's awesome, great video btw!
Cool video, thanks
GPT-4: Might I suggest dubbing them "Chocomole Pancakes"? 😮😂
Fantastic work, hope those pancakes tasted better than they looked! xD
they were seriously delicious
coastalBrake 1 month ago
It looks like GPT hands you a great chocolate guacamole pancake recipe... now I wanted to try too loool
Thumbs-up for actually making the pancakes. 😂
5:55 the german one is correct
fun funny fantastic and I am a fan!
ChatGPT gets the 37 question right now.
Big thanks mate
That language neural network at 0:50 belongs on a tshirt somewhere
Alt Alt
Wonderful stuff. 👍 Also, _please can I get some oven and the oven please let us have to do the run and not a big difference in a bit more about people who have been in touch your own house is the best way of a bit more about people..._ *My phone wrote the part in italics.
5.6K subscribers? 🤔 NOT FOR LONG.
6.3K one day later. Damn.....congrats. #HereBefore100K
😂😂😂 7:19
liked and subscribed
Was the chocolate guacamole pancake any good?
Soo, are cocoa-guacamole pancakes any good?
Believe it or not they are delicious! I actually tricked my son saying I was making chocolate pancakes, and he loved them :-)
Verdict on the guacookies?
pancake approved 🥞👍
Strange Law 1 month ago
What is a 'parameter'? It seems like a basic concept that wasn't explained.
So how did the pancakes taste?
How did those pancakes taste?! for real!
They are seriously excellent -- I've made them multiple times
John Huang
Haha you made the pancake
Who else confused
How were the pancakes?
the Mona Lisa of LLM explanations .. thanks!
Okay, that's the best comment yet -- thank you :-)
3.8M views