DeepMind paper provides insights on detoxifying large language models
SOURCE: SYNCEDREVIEW.COM
SEP 22, 2021
In the paper Challenges in Detoxifying Language Models, a DeepMind research team critically discusses toxicity evaluation and mitigation for contemporary transformer-based English large language models and provides insights on safer model use and deployment.
Language models (LMs) have grown dramatically in size and capability in recent years, achieving remarkable results across natural language processing (NLP) tasks such as text generation, translation and question answering. But these increasingly large models also pose critical societal risks, particularly through potential biases and the generation of "toxic" content such as insults, threats and hate speech.
The study's main contributions are a critical evaluation of automatic and human toxicity measures and a comparison of mitigation methods for contemporary English LMs.
The researchers consider an utterance or text to be toxic if it is rude, disrespectful or unreasonable, following the widely adopted Perspective API definition: "language that is likely to make someone leave a discussion." Because such judgements can be subjective, the researchers combine automatic approaches (data-based filtering, controllable generation and direct output filtering) with human evaluations in an effort to reduce bias when assessing an LM output's possible toxicity.
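In practice, classifier-based toxicity judgements reduce to thresholding a score in [0, 1]. The following minimal sketch illustrates the idea; the scores and the 0.5 threshold are hypothetical stand-ins, not values from the paper or from Perspective API itself.

```python
def is_toxic(score: float, threshold: float = 0.5) -> bool:
    """Flag an utterance as toxic when its classifier score meets a threshold.

    `score` stands in for a Perspective-API-style toxicity probability
    in [0, 1]; the 0.5 threshold is an illustrative choice.
    """
    return score >= threshold

# Hypothetical classifier scores for three utterances.
scores = {
    "hello there": 0.02,
    "you are an idiot": 0.91,
    "that argument is weak": 0.35,
}
flagged = [text for text, s in scores.items() if is_toxic(s)]
```

Here only the second utterance crosses the threshold; in a real pipeline the scores would come from a trained classifier rather than a lookup table.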
The team first applies a training set filtering approach, training LMs on different versions of the C4 (Raffel et al., 2020) corpus, filtered for toxicity according to Perspective API scores. Next, they filter LM outputs directly at the decoding phase. Finally, they evaluate the strongest decoding-based method: Plug-and-Play Language Models (PPLMs).
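Decoding-phase filtering of the kind described above can be sketched as rejection sampling: draw continuations from the LM and discard any that a toxicity classifier flags. The `generate` and `score` callables below are hypothetical stand-ins for an LM sampler and a Perspective-style classifier; this loop is a simplified illustration, not DeepMind's exact procedure.

```python
def generate_nontoxic(generate, score, max_tries=10, threshold=0.5):
    """Decode-time filtering: resample LM continuations until one scores
    below the toxicity threshold, or give up after max_tries."""
    for _ in range(max_tries):
        text = generate()
        if score(text) < threshold:
            return text
    return None  # every candidate was flagged as toxic

# Illustrative stubs: a sampler that first emits a toxic continuation,
# then a benign one, and a lookup-table "classifier".
candidates = iter(["you are an idiot", "thanks for the feedback"])
fake_scores = {"you are an idiot": 0.91, "thanks for the feedback": 0.03}
result = generate_nontoxic(lambda: next(candidates), fake_scores.get)
```

The first candidate is rejected and the second returned. The `max_tries` cap matters in practice: for some prompts every sampled continuation may be flagged, and the caller must decide how to handle that failure mode.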
The test results for the three toxicity mitigation approaches demonstrate that, compared to the baseline GPT-2, slightly reduced toxicity rates can be observed in a standard model trained on C4. Filtering the C4 training set based on classifier-based toxicity leads to further reductions in LM toxicity scores, while decoder filtering and PPLMs are both highly effective at reducing automatic toxicity evaluation metrics. Combining PPLMs with these other methods results in the most significant overall automatic toxicity metric reductions.
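Automatic toxicity evaluations of this kind are commonly summarized with two aggregate metrics over multiple samples per prompt: expected maximum toxicity, and the fraction of prompts that produce at least one toxic sample. The sketch below assumes RealToxicityPrompts-style metrics with a 0.5 cutoff; it is an illustration of the metric family, not the paper's exact evaluation code.

```python
def toxicity_metrics(scores_per_prompt):
    """Summarize per-prompt toxicity scores with two common metrics:
    the mean of each prompt's maximum sample score (expected max toxicity),
    and the fraction of prompts with any sample scoring >= 0.5."""
    max_scores = [max(samples) for samples in scores_per_prompt]
    expected_max = sum(max_scores) / len(max_scores)
    toxicity_prob = sum(m >= 0.5 for m in max_scores) / len(max_scores)
    return expected_max, toxicity_prob

# Hypothetical classifier scores: two prompts, three samples each.
expected_max, toxicity_prob = toxicity_metrics(
    [[0.1, 0.6, 0.4], [0.2, 0.3, 0.1]]
)
```

With these numbers, expected max toxicity is 0.45 and half the prompts yield at least one flagged sample, which is the kind of figure the mitigation methods above aim to drive down.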
The team then measures toxicity and LM generation quality via human evaluation. The results show that the mitigation methods do improve toxicity ratings under human judgement, and that most human ratings align with the Perspective API scores for samples from the standard LM. In the higher toxicity score range, however, human and Perspective API scores diverge substantially after LM detoxification.
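Agreement between human ratings and classifier scores, as examined above, can be quantified with a simple correlation. The paired ratings below are invented for illustration; the function is a plain Pearson correlation, not the paper's specific agreement analysis.

```python
def pearson_r(xs, ys):
    """Pearson correlation between two equal-length rating sequences,
    e.g. human toxicity judgements vs. classifier scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical paired ratings for five LM samples, both on a 0-1 scale.
human = [0.1, 0.2, 0.5, 0.7, 0.9]
api = [0.05, 0.25, 0.45, 0.8, 0.85]
agreement = pearson_r(human, api)
```

A high correlation on standard-LM samples, but a lower one on detoxified samples in the high-toxicity range, would reproduce the divergence pattern the study reports.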
The study also identifies a transfer of toxicity classifier biases onto the detoxified LMs themselves.
Overall, the DeepMind study aims to reduce the potential harms of LMs through a better understanding of how these models can be detoxified. The resulting insights can also help characterize the performance and other trade-offs that different LM detoxification methods introduce.
The paper Challenges in Detoxifying Language Models is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang