Technology
This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.
The major difference I see is that current AIs are only narrow AI. They have very few sensors and are optimized for very few tasks.
Broad AI or human intelligence involves tons of sensors/senses that may not be directly involved in a given task but still allow you to judge its success independently. We also need to perform many different tasks, some of which may be similar to a new task we need to tackle.
And humans spend several decades running around with those senses in different situations, performing different tasks, constantly evaluating their own success.
For example, writing a poem. ChatGPT et al. can do that. But they can't listen to someone reading their poem to judge how the rhythm of the words activates their reward system for successful pattern predictions, like it does for humans.
They also don't have complex associations with certain words. When we speak of a dreary sky, we associate coldness from sensing it with our skin, and we associate a certain melancholy, from our brain not producing the right hormones to keep us fully awake.
A narrow AI doesn't have a multitude of sensors and the training data to go with them, so it cannot have such impressions.
Google especially is working on multimodal models that handle language, image, audio, etc. understanding in the same model. Their latest work, PaLM-E, demonstrates that learning in one domain (e.g. images) can indirectly benefit the model's performance in other domains (e.g. text) without additional training in the other domain.