
Turning GPT-2 into Alex Jones

This post is me riding the coattails of the GPT-2 hype train. For those who don't know, GPT-2 is a machine learning model that is very effective at text generation. It can maintain context over long stretches of text, which makes its output feel more 'human' than that of previous models.

For those who are interested, read this article on transformers and, of course, OpenAI's site.

Now for the memes. To get GPT-2 talking like Alex Jones, we need text, and a lot of it. Unfortunately, Alex Jones was largely purged from social media in 2019, so we'll have to get creative. First, Wikiquote was a decent starting point. I used Python with requests and bs4 to scrape the data (you're smart, do it yourself).
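That said, the gist of the scrape looks something like the sketch below. The URL is real, but the selectors are my guess at Wikiquote's MediaWiki markup and the output filename is made up, so inspect the page yourself before trusting any of it.

#!/usr/bin/python3

import requests
from bs4 import BeautifulSoup

url = "https://en.wikiquote.org/wiki/Alex_Jones"
soup = BeautifulSoup(requests.get(url).text, "html.parser")

# MediaWiki pages keep their body in div.mw-parser-output, and
# Wikiquote usually puts the quotes themselves in <li> elements
content = soup.find("div", class_="mw-parser-output")

quotes = []
for li in content.find_all("li"):
    text = li.get_text(" ", strip=True)
    if len(text) > 40:  # crude filter to drop nav links and citations
        quotes.append(text)

# hypothetical filename; everything got combined into alex_jones.txt later
with open("alex_jones_wikiquote.txt", "w") as f:
    f.write("\n".join(quotes))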

I’ve looked for transcripts of the Alex Jones show to no avail. But what I did find was Alex Jones’s archived twitter on webrecorder.io. Not sure what kind of wizardry those guys are running for this web app, but I couldn’t get the HTML of his twitter feed for the life of me. So instead I just selected all text and wrote it to a file. Genius. There was some manual formatting I did to group them into tweets and then clean up the actual contents, but once that’s done it was somewhat useable for my purposes.

Now that we’ve collected around 900 lines of the good stuff we can train a model. Here’s the code I used (stolen from some medium post, probably):

#!/usr/bin/python3

import gpt_2_simple as gpt2
import os

model_name = "124M"
# the base model is downloaded into the current directory under models/124M/
if not os.path.isdir(os.path.join("models", model_name)):
    gpt2.download_gpt2(model_name=model_name)

file_name = "./src/alex_jones.txt"

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              file_name,              # the training text
              model_name=model_name,
              run_name='run2',
              steps=1000,
              save_every=50,          # checkpoint every 50 steps
              print_every=5,          # log loss every 5 steps
              sample_every=10,        # print a sample every 10 steps
              learning_rate=0.0001)

# run_name must match the finetune run, or generation can't find the checkpoint
gpt2.generate_to_file(sess, run_name='run2')


I’ve let this train in my homelab for a few hours while I slept, and checked back when I thought it was ready.

Here are some of the results:


Organizer: Planned Parenthood Ships Guns To The US  View all other

Open book censorship represents a poison pill for the people and a soulless dollop of
propaganda for the corporate media. #FreeInfowars #1A

These people are the future. They're such zombies that you can  not only read the news,
but look at these people! They're so freaking fastidious they have time organs  in their 
knees that they can hardly walk! They have no awareness of what's going on  and go "OOH 
O_O O_O"O Oh my gosh! That's evil! The Alex Jones Show, "HILLARY CLINTON IS A GOD DAMN 
DEMON", June 2016.

I'm like a gigantic pumpkinhead, sitting in a  park, and I'm thinking "Man, this is gonna
be a really bad day.  It just so happens, is gonna be really bad. I'm thinking ABOUT 
WHATEVER  I'm thinking about today. Yeah, I'm thinking

A vocal minority of Trump hating registered voters has called off an election for fear of
repercussions from his supporters.     #Trump2020 #MAGA #1a #USA

Video: left wing extremists targeting DC political events with machetes this weekend  

The coordinated corporate attack on Infowars has exposed a deceptic by any measure, but
the silencing of the truth is even more apparent.

Now you need to find a way to stop the Chinese from harvesting our blood.

The Rise Of Alex Jones As A Fake News Crikey, Tune in M F 11am 3pm central at:   << 
#RealNews #WednesdayWisdom #1a #ThursdayMotivation   #FreeInfowars

These are just a few of the hundreds of little gems GPT-2 spat out for us. There were a few artifacts in the outputs that I didn't like, so I went back and cleaned up alex_jones.txt a bit more. The rest of the text files are relatively unmodified if you'd like to do the scrubbing yourself. Unlike GPT-2, all the code, output, and fine-tuning text I used can be found here. Have fun. And don't let the globalists win.