Neil F. Johnson
Adv. Artif. Intell. Mach. Learn., 1 (3):191-202
Neil F. Johnson : The Dynamic Online Networks Lab, George Washington University, Washington D.C. 20052 USA.
DOI: 10.54364/AAIML.2021.1112
Article History: Received on: 15-Nov-21, Accepted on: 15-Dec-21, Published on: 23-Dec-21
Corresponding Author: Neil F. Johnson
Email: NEILJOHNSON@GWU.EDU
Citation: Richard F. Sear, Nicholas J. Restrepo, Rhys Leahy, Yonatan Lupu, Neil F. Johnson (2021). Machine Learning Language Models: Achilles Heel for Social Media Platforms and a Possible Solution. Adv. Artif. Intell. Mach. Learn., 1 (3):191-202
Any uptick in new misinformation that casts doubt on COVID-19 mitigation strategies, such as vaccine boosters and masks, could reverse society’s recovery from the pandemic both nationally and globally. This study demonstrates how machine learning language models can automatically generate new COVID-19 and vaccine misinformation that appears fresh and realistic (i.e., human-generated) even to subject matter experts. The study uses GPT-2, the latest version of the GPT model that is public and freely available, and feeds it publicly available text collected from social media communities known for their high levels of health misinformation. The same team of subject matter experts that classified the original social media input data is then asked to categorize the GPT-2 output without knowing its automated origin. None of them successfully identified all the synthetic text strings as products of the machine model. This presents a clear warning for social media platforms: an effectively unlimited volume of fresh, seemingly human-produced misinformation can be generated perpetually using current, off-the-shelf machine learning algorithms running continually. We then offer a solution: a statistical approach that detects differences between the dynamics of this output and those of typical human behavior.
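The abstract does not specify which dynamical statistic the proposed detection approach uses, but one common way to separate machine-scheduled output from human activity is to compare the burstiness of posting intervals: human activity tends to be bursty, while a continually running generator posts at near-regular intervals. The sketch below uses the Goh–Barabási burstiness coefficient as an illustrative stand-in; the interval data and the decision threshold are hypothetical, not taken from the paper.

```python
import statistics

def burstiness(intervals):
    """Goh-Barabasi burstiness coefficient B = (sigma - mu) / (sigma + mu).
    B is -1 for perfectly regular timing, ~0 for Poisson-like activity,
    and approaches +1 for highly bursty (human-like) activity."""
    mu = statistics.mean(intervals)
    sigma = statistics.pstdev(intervals)
    return (sigma - mu) / (sigma + mu)

# Hypothetical posting gaps in minutes: a bot emitting content on a near-fixed schedule
machine_gaps = [10, 10, 11, 9, 10, 10, 10, 11, 9, 10]
# versus a human's irregular, bursty pattern (short flurries separated by long silences)
human_gaps = [1, 2, 1, 240, 3, 1, 600, 2, 1, 95]

print(burstiness(machine_gaps))  # strongly negative: near-regular timing
print(burstiness(human_gaps))    # positive: bursty, human-like timing
```

A classifier built on this idea would flag accounts whose interval statistics sit far below the burstiness range observed for ordinary human users, regardless of how convincing the generated text itself is.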