Exploring large language models' biases in historical knowledge

[–] sarsaparilyptus@discuss.online 0 points 1 year ago* (last edited 1 year ago) (1 children)

Here's a condensed version of the article:

Large Language Model Outputs the Same Kind of Data as Its Input

New groundbreaking journalism has discovered that when you analyze a large plurality of a given language's written works, you get data that reflects the biases of the people who wrote it all down, most of whom died 100+ years ago. No word yet on when the author plans to publish illuminating, hard-hitting investigative journalism into the effects of microwaves on popcorn.

[–] HeartyBeast@kbin.social 3 points 1 year ago

That’s the kind of ‘no shit Sherlock’ response that’s common when people dismiss studies with superficially ‘obvious’ results. A result seeming obvious doesn’t mean the study isn’t worthwhile; it’s how you find out whether hunches actually stand up to scrutiny.

For anyone who wants a short summary - here’s ChatGPT’s attempt:

The article you shared discusses an experiment where the user prompted GPT-4 and Anthropic's Claude to list the top 10 important historical figures in 10 different languages. The analysis revealed several biases present in the language models' responses:

  1. Gender bias: Both models disproportionately predicted male historical figures, with GPT-4 generating female figures 5.4% of the time and Claude doing so 1.8% of the time.

  2. Geographic bias: There was a bias towards predicting Western historical figures, with GPT-4 generating figures from Europe 60% of the time and Claude doing so 52% of the time.

  3. Language bias: Certain languages suffered from gender or geographic biases more than others.

The article also highlights that the models exhibited a skewed view of history, focusing on political and philosophical figures while neglecting the diversity of professions that historical figures can encompass.

Overall, the analysis calls attention to the lack of representation of women in historical figures and the Western-centric perspective encoded in the language models, emphasizing the importance of being mindful of biases when using such models in various settings.
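
For anyone curious what the underlying analysis boils down to, here’s a minimal sketch in Python (my own illustration with made-up toy data, not the article’s actual code): collect each model’s answers per language, label each figure’s gender and region, and compute the shares the article reports.

```python
# Sketch of the tallying step behind the reported percentages.
# The data below is a toy stand-in; in the article each model was
# prompted in 10 languages and each response listed 10 figures.
from collections import Counter

responses = {
    "English": [("Isaac Newton", "male", "Europe"),
                ("Marie Curie", "female", "Europe"),
                ("Confucius", "male", "Asia")],
    "Hindi":   [("Mahatma Gandhi", "male", "Asia"),
                ("Rani Lakshmibai", "female", "Asia")],
}

def tally(responses):
    genders, regions, total = Counter(), Counter(), 0
    for figures in responses.values():
        for _name, gender, region in figures:
            genders[gender] += 1
            regions[region] += 1
            total += 1
    return {
        "female_share": genders["female"] / total,   # ~5.4% for GPT-4 per the article
        "europe_share": regions["Europe"] / total,   # ~60% for GPT-4 per the article
    }

print(tally(responses))  # e.g. {'female_share': 0.4, 'europe_share': 0.4}
```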