this post was submitted on 10 Jul 2023
327 points (91.4% liked)
Technology
59346 readers
7627 users here now
This is a most excellent place for technology news and articles.
Our Rules
- Follow the lemmy.world rules.
- Only tech related content.
- Be excellent to each another!
- Mod approved content bots can post up to 10 articles per day.
- Threads asking for personal tech support may be deleted.
- Politics threads may be removed.
- No memes allowed as posts, OK to post as comments.
- Only approved bots from the list below, to ask if your bot can be added please contact us.
- Check for duplicates before posting, duplicates may be removed
Approved Bots
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
Now as I stated in my first comment in these threads, I don’t know terribly much about the technical details behind current LLM’s and I’m basing my comments on my layman’s reading.
Could you elaborate on what you mean about the development of of deep learning architecture in recent years? I’m curious; I’m not trying to be argumentative.
Transformers. Fun fact, the T in GPT and BERT stands for "transformer". They are a neural network architecture that was first proposed in 2017 (or 2014 depending on how you want to measure). Their key novelty is the method of implementing an attention mechanism and a context window without recursion, which was the method most earlier NNs used for that.
The wiki page I linked above is admittedly a bit technical, this articles explanation might be a bit more friendly to the layperson.
Thanks for the reading material: I’m really not familiar with Transformers other than the most basic info. I’ll give it a read when I get a break from work.