this post was submitted on 05 Dec 2023
121 points (87.6% liked)

Technology


It's frustrating when you're not understood — especially when you're trying to speak to Siri, Alexa, or another internet-connected device.

Voice datasets that power voice recognition services are owned by a handful of major companies, and they can wildly underrepresent non-dominant accents and the voices of Black, Indigenous, and other people of color, disabled people, and gender-marginalised people. In fact, for speakers of many other global languages, there may be no datasets at all.

That’s why Mozilla launched Common Voice — the world's largest public voice database, powered by the voices of volunteer contributors. Our goal is to teach machines how real people speak.

Today, we’re asking you to contribute to Common Voice, but we want you to choose how you’ll do it. Will you donate your voice to one of our Common Voice language datasets? Or will you make a $34 donation to Mozilla to support projects like this to reclaim the internet? (Or both!)

I'd be curious about the privacy concerns, but this might help a lot with underrepresented voice data. It might come down to whether someone wants more datasets for their particular voice or language more than they worry about the other concerns.

If your language/accent is already well documented, it might not help as much?

[–] FaceDeer@kbin.social 53 points 11 months ago (1 children)

Mozilla: "We'd like to build a dataset of underrepresented languages and accents so that voice recognition works for everyone. It'll be under an open license."

Most of this thread: "GIVE ME MONEY."

Sigh. As soon as it turned out that AI training data was "worth something" everyone turned into a money-grubbing mercenary.

[–] gedaliyah@lemmy.world 33 points 11 months ago (2 children)

I'm not sure why there is so much anti-Mozilla hate. I know they're far from perfect but they do an awful lot for the open source world. Having an open database for voice training seems like something that the world can use to do some good.

[–] angrymouse@lemmy.world 5 points 11 months ago (1 children)

Because everyone knows better how to do open source, and these people are usually right in principle, but if you apply some of those concepts strictly you can starve to death. Much of what Mozilla does is not ideal, but it is very good, and it is the only option for things we need today, not in 20-30 years.

[–] InstallGentoo@lemmy.zip -1 points 11 months ago

The problem with Mozilla is that they forget about their browser and completely ignore it. It's almost like even they have given up on Firefox...

[–] Deckweiss@lemmy.world -3 points 11 months ago* (last edited 11 months ago) (1 children)

Because they are shady and extremely profitable, and yet they ask for donations as if they were some greater-good open-source nonprofit startup.

https://lunduke.locals.com/post/4387539/firefox-money-investigating-the-bizarre-finances-of-mozilla

[–] gedaliyah@lemmy.world 5 points 11 months ago (1 children)

But when I donate my voice, it's not going to some vault at Mozilla. It becomes part of an open resource that anyone can use to build models, libraries, etc.

Even if it is organized by a company that may or may not have nefarious goals, isn't it still a good thing to have?

[–] Deckweiss@lemmy.world 1 points 11 months ago

Let me completely exaggerate to illustrate the concept:

If Osama bin Laden, Hitler, Mao, a terrorist organization, etc. started a charity to plant more trees, you would feel uncomfortable planting trees for their charity.

If I don't fully trust a company, it discourages me from participating in anything they do, no matter the intention.