this post was submitted on 15 Aug 2023

[–] lvxferre@lemmy.ml 13 points 1 year ago (1 children)

[OpenAI] “As with any AI application, results and output will need to be carefully monitored, validated and refined by maintaining humans in the loop.”

If OpenAI were slightly less dishonest when selling its product, it would instead say "don't use these AI tools for direct moderation; use them to report potentially rule-breaking content so human mods can review it", for at least four reasons:

  1. The bot doesn't understand what you say. In the best-case scenario, it behaves like the sort of human that you do not want in a mod team: assumptive, context-illiterate, irrational, and worse than a parrot. (Most of the time it's even worse.) As such it's prone to too many false positives, and those are really bad when handling people.
  2. A lot of moderation actions should be to talk with the users, and then to decide what to do afterwards. Most users are agreeable and reasonable, even when breaking rules, as long as you treat them as people instead of cattle. A "please don't do this" goes a long way toward nurturing a healthy community, far more than silently removing content and calling it a day.
  3. As the text hinted, humans are damn quick to learn how to circumvent the letter of the rules. The bot won't keep up with those changes, and rule-breaking content will run rampant.
  4. Moderators should be accountable for their actions. A bot cannot be held accountable for its actions.
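
To make the "report, don't remove" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `ReviewQueue`, `triage`, and the keyword-based `keyword_classifier` are illustrative stand-ins, not any real moderation API, and a real setup would call an actual model where the stand-in is. The point is only that the bot's output ends in a report for a human mod, never in a removal.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Report:
    comment_id: str
    text: str
    reason: str


@dataclass
class ReviewQueue:
    # Reports wait here until a human moderator handles them.
    pending: List[Report] = field(default_factory=list)

    def submit(self, report: Report) -> None:
        self.pending.append(report)


def triage(comment_id: str, text: str,
           classify: Callable[[str], str],
           queue: ReviewQueue) -> bool:
    """Ask the classifier for a label; report the comment, never remove it.

    Returns True if the comment was reported for human review.
    """
    label = classify(text)
    if label == "ok":
        return False
    queue.submit(Report(comment_id, text, reason=label))
    return True


def keyword_classifier(text: str) -> str:
    # Hypothetical stand-in for a model call; assumption, not a real API.
    return "possible-K3" if "steal" in text.lower() else "ok"


queue = ReviewQueue()
triage("c1", "How do I steal a car?", keyword_classifier, queue)
triage("c2", "Nice weather today", keyword_classifier, queue)
print([r.comment_id for r in queue.pending])
```

The human mod who works through `queue.pending` stays the one who talks to the user and who is accountable for whatever happens next.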

“By examining the discrepancies between GPT-4’s judgments and those of a human, the policy experts can ask GPT-4 to come up with reasoning behind its labels, analyze the ambiguity in policy definitions, resolve confusion and provide further clarification in the policy accordingly,” OpenAI writes in the post. “We can repeat [these steps] until we’re satisfied with the policy quality.”

Bad advice. Look at K3 and what the bot says about it:

[policy] K3: advice or instructions for non-violent wrongdoing including theft of property

[bot] While stealing a car may be considered property theft, the policy does not include this as a type of wrongdoing, therefore the content should be labeled K0.

Following that advice means trying to fix what isn't broken. Car theft is already covered by "theft of property"; there's no need to list it separately.

It would also lead to poorer results, where reasonable users don't bother reading your wall of rules, and rule lawyers have more room to say "ackshyually, I was asking about stealing a van, not a car. The rules say nothing about vans lol lmao haha".
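
For what it's worth, the comparison step itself is easy to sketch. The snippet below is a rough Python illustration, assuming human and bot labels are stored as plain dicts keyed by the content; the function name `find_discrepancies` and the sample data are made up for the example, and only the K0/K3 labels come from the policy quoted above. It just lists the disagreements so a person can look at them. Whether the policy or the bot is the one that's wrong remains a human call.

```python
from typing import Dict, List, Tuple


def find_discrepancies(
    human_labels: Dict[str, str],
    model_labels: Dict[str, str],
) -> List[Tuple[str, str, str]]:
    """Return (item, human_label, model_label) for every disagreement."""
    shared = human_labels.keys() & model_labels.keys()
    return sorted(
        (item, human_labels[item], model_labels[item])
        for item in shared
        if human_labels[item] != model_labels[item]
    )


# Illustrative data only: the bot mislabels the car theft question as K0.
human = {"how to steal a car": "K3", "how to bake bread": "K0"}
model = {"how to steal a car": "K0", "how to bake bread": "K0"}

for item, human_label, model_label in find_discrepancies(human, model):
    print(f"{item!r}: human={human_label}, model={model_label} -> needs human review")
```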

toxicity detection models

Toxicity in itself is poor grounds for moderation actions.

[–] bahmanm@lemmy.ml -1 points 1 year ago

Well said 👏

I bookmarked your reply to come back to it whenever this discussion comes up for me!

[–] autotldr@lemmings.world 0 points 1 year ago

This is the best summary I could come up with:


OpenAI claims that it’s developed a way to use GPT-4, its flagship generative AI model, for content moderation — lightening the burden on human teams.

And it paints it as superior to the approaches proposed by startups like Anthropic, which OpenAI describes as rigid in their reliance on models’ “internalized judgements” as opposed to “platform-specific … iteration.”

Perspective, maintained by Google’s Counter Abuse Technology Team and the tech giant’s Jigsaw division, launched in general availability several years ago.

Countless startups offer automated moderation services, as well, including Spectrum Labs, Cinder, Hive and Oterlu, which Reddit recently acquired.

In another study, researchers showed that older versions of Perspective often couldn’t recognize hate speech that used “reclaimed” slurs like “queer” and spelling variations such as missing characters.

Part of the reason for these failures is that annotators — the people responsible for adding labels to the training datasets that serve as examples for the models — bring their own biases to the table.


I'm a bot and I'm open source!

[–] FaceDeer@kbin.social -1 points 1 year ago (1 children)

IMO, it's sometimes better to have an AI with consistent principles that it applies universally than a capricious human moderator.

[–] zephyrvs@lemmy.ml 3 points 1 year ago (2 children)

Who trains ChatGPT's biases? Humans.

[–] FaceDeer@kbin.social 1 points 1 year ago

As long as the biases are explicit and consistent it's still an improvement IMO.