GPT-3 Fine-tuning and NLP Basics

dl4c
nlp
gpt
Author

Joe Holmes

Published

November 29, 2022

As I did in my first entry, I’ll combine two lessons into a single post, since I already knew most of the stuff in lesson 3 (basic neural nets) and encountered the new material in 4, which is an intro to NLP. After also spending some time trying to better understand stuff like Word2Vec and transformer architectures the past few weeks, this lecture did a great deal to help me appreciate just how hard this stuff gets when you leave the conceptual examples and get to the state of the art. It’s invigorating and exciting, and the challenge inspires me to keep learning. (Jay Alammar’s blog has helped me a great deal in this regard.)

GPT-3 Fine-tuning and NLP Basics

One thing that struck me was that fine-tuning really seems to be the paradigm for modern NLP AI. I think I was previously under the impression that serious work involved training models from scratch, which for language models is practically impossible unless you have a billion dollars. Seeing the description in the first part of this lecture, in which an LLM is first fine-tuned on a corpus of the kind of text the model will work with, and then fine-tuned again on the exact task it’s designed for, really opened my eyes. First the LLM trained on Wikipedia is fine-tuned on text straight from IMDB, so it can learn how people on the movie database talk; then a specific classification problem using IMDB text is fed to it. The teacher goes on to train a model that does a great job identifying the language used in a variety of real-life US patents.
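To make the two-stage idea concrete, here’s a rough sketch of that workflow in fastai, written from memory of the library’s API rather than copied from the lecture (the encoder name and epoch counts are just placeholders):

from fastai.text.all import *

path = untar_data(URLs.IMDB)

# Stage 1: take the Wikipedia-pretrained language model and fine-tune it
# on raw IMDB text, so it learns how people on the movie database talk.
dls_lm = TextDataLoaders.from_folder(path, is_lm=True, valid_pct=0.1)
learn_lm = language_model_learner(dls_lm, AWD_LSTM, metrics=Perplexity())
learn_lm.fine_tune(1)
learn_lm.save_encoder("imdb_encoder")

# Stage 2: reuse that fine-tuned encoder for the actual task,
# classifying reviews as positive or negative.
dls_clf = TextDataLoaders.from_folder(path, valid="test", text_vocab=dls_lm.vocab)
learn_clf = text_classifier_learner(dls_clf, AWD_LSTM, metrics=accuracy)
learn_clf.load_encoder("imdb_encoder")
learn_clf.fine_tune(4)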

This got me excited to do my own fine-tuning experiment. But first I did get a little more familiar with the basics of NLP.

Tokenization

It was very helpful to finally get a clearer idea of how words are broken down. It’s like creating a Set of unique values in a given text. Though DL4C’s material on the relationship between tokenization of long strings and batch training whizzed over my head, everything else came together pretty simply. The key thing to take away, after understanding the basic idea of creating a vocabulary and encoding it numerically, is that subword tokenization is the more robust and popular choice today.
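Just to make the idea stick, here’s a tiny subword tokenization example using Hugging Face’s AutoTokenizer (the checkpoint choice is arbitrary; any pretrained model would do):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")

# Rare words get split into subword pieces from a fixed vocabulary,
# so the model never hits an out-of-vocabulary token.
print(tok.tokenize("Neuromancer jacked into cyberspace"))

# Each piece is then encoded as an integer index into that vocabulary.
print(tok("Neuromancer jacked into cyberspace")["input_ids"])
print(tok.vocab_size)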

Fine-tuning GPT-3

In the code-along notebook for this lesson, you walk through HuggingFace transformers. It was cool to see, but a little tricky to implement and understand on my own. I tried out a couple of different things: first I figured I’d use the BLOOM model on HuggingFace to experiment with text generation, then I thought maybe I’d use BERT on some simple task. None of this worked well, and eventually I decided to do the fun, sexy thing and fine-tune GPT-3 like I’ve been meaning to for a while.
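For reference, this is roughly what I was attempting with the transformers pipeline API (the small BLOOM checkpoint and the generation settings here are just for illustration):

from transformers import pipeline

# Text generation with a small BLOOM checkpoint
generator = pipeline("text-generation", model="bigscience/bloom-560m")
print(generator("The sky above the port was", max_new_tokens=40)[0]["generated_text"])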

neuromancer.txt

I found the entire text of William Gibson’s seminal cyberpunk novel Neuromancer hiding for free in a GitHub Gist, so that gave me a place to start.

OpenAI specifies you need to feed your fine-tuning data to GPT like this:

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

and I discovered in a Reddit post that for general-purpose text completion you just leave the "prompt" field totally blank. This makes a kind of intuitive sense: you’re training the model to always speak in a certain style of prose without being given any prompt at all.
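If you wanted to build that JSONL by hand instead of relying on OpenAI’s own preprocessing tool (which I use below), it would look something like this; the file names are hypothetical, and OpenAI’s docs recommend starting each completion with a leading space:

import json

with open("neuromancer.txt") as f, open("neuromancer_prepared.jsonl", "w") as out:
    for line in f:
        line = line.strip()
        if line:
            # Empty prompt; completion starts with a space per OpenAI's guidance
            out.write(json.dumps({"prompt": "", "completion": " " + line}) + "\n")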

I wrote a tiny Python script to cut the novel in half.

# Read the novel, keep the first 2000 lines, and write them back out
with open("neuromancer.txt") as f:
    neuro = f.readlines()

trunc = neuro[:2000]

with open("truncated_neuromancer.txt", "w") as f:
    f.write("".join(trunc))

From there, OpenAI makes it easy to preprocess the data.

openai tools fine_tunes.prepare_data -f truncated_neuromancer.txt

And out comes some simply processed data (which, had I been more serious, I would have taken more time to clean):

{"prompt":"","completion":" The sky above the port was the color of television, tuned"}
{"prompt":"","completion":" to a dead channel."}
{"prompt":"","completion":" \"It's not like I'm using,\" Case heard someone say, as he"}
{"prompt":"","completion":" shouldered his way through the crowd around the door of the"}
{"prompt":"","completion":" Chat. \"It's like my body's developed this massive drug defi-"}
{"prompt":"","completion":" ciency.\" It was a Sprawl voice and a Sprawl joke. The Chatsubo"}
{"prompt":"","completion":" was a bar for professional expatriates; you could drink there"}
{"prompt":"","completion":" for a week and never hear two words in Japanese."}

I tried out the truncated model on Ada, one of the cheapest GPT-3 models. It did OK, but didn’t wow me, so I decided to splurge $13.50 and fine-tune the new Davinci model (the most powerful) on the entire book. Four epochs later, we were ready to load the fine-tuned model directly from the GPT-3 Playground.
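For completeness, here’s roughly what those steps look like through the (pre-1.0) openai Python library; the file name and the fine-tuned model id below are placeholders rather than the real ones from my run:

import openai  # legacy pre-1.0 SDK

openai.api_key = "sk-..."  # your API key

# Upload the prepared JSONL and start the fine-tune on davinci
upload = openai.File.create(file=open("neuromancer_prepared.jsonl", "rb"),
                            purpose="fine-tune")
job = openai.FineTune.create(training_file=upload.id, model="davinci", n_epochs=4)

# Once the job finishes, the fine-tuned model id works like any base model
completion = openai.Completion.create(
    model="davinci:ft-personal-2022-11-29",  # placeholder id
    prompt="The following is a cyberpunk flash fiction about eating breakfast.\n\n",
    max_tokens=256,
    temperature=0.9,
)
print(completion.choices[0].text)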

Some results

Here was my prompt:

The following is a cyberpunk flash fiction about eating breakfast.

I woke up with a terrible hangover and the burning desire for scrambled eggs.

And here’s what GPT-3, fine-tuned on Neuromancer, continued with:

I checked the fridge, but all I had was a gallon of soy-curd. I hate soy-curd. It's just not natural, you know? But there was no way I'd be able to get scrambled eggs and coffee in this part of town before the headache killed me. So I decided to suck it up and eat the curd.

Anyway, the thing about curds is that they're grown in vats by these little factory cells, see? And every now and then one of these little factories decides that instead of making more curd, they're just gonna make more factory.... It's kindof a virus thing. Anyway, so after a few days you've got this huge amorphous blob of self-replicating factory tissue with nothing much left over for producing actual food product.... Nowadays it's illegal to even possess them without government clearance...but this guy Freak we knew back in Atlanta used to keep some around as kindof a novelty item.... So anyway he'd pull out one o'clump o'factory and toss it on his countertop along with like ten pounds of other stuff he was selling at the time, throw on some water and let nature take its course for about twenty minutes or so....and when he'd scoop up all that stuff into bowls an hour later there would be all these perfect yellow globi; each one completely identical to its next door neighbor an' sitting at exactly tha same stage o' growth....he called 'em "Sunrise" eggs because thas what they looked like when he first pulled 'em outa th' water.....so anyway we kept them around until we found out how fast they breed.....I think ol Freak still has about ten pounds in Atlanta somewhere--he said once that

It’s so specific that I can’t help but wonder if it was plagiarized. Other experiments showed it has a remarkable ability to call back to your prompt’s subject from several paragraphs earlier, for instance making a reference to scrambled eggs after several paragraphs of unrelated rambling. (My studies in transformer architectures indicate this is one of the special abilities of attention-based models writ large.)

This one turned out much better, with a sillier prompt:

The following is a creative, groundbreaking short story about Joe Brandon, United States President, and his secretive double life as an elite hacker.

President Brandon put his feet up in the Oval Office, logging into cyberspace and getting ready to hack into the system. 

Some of this stuff appears to be direct bits of text from the novel, but it does a pretty awesome job of weaving the prompt back into the neuromantic story:

In a minute he'd be in the matrix, inside the computer's world. In there, it was easy to forget that you were just a kid from nowhere, no money and few friends. You were somebody else then, somebody who had it all. Somebody with a name like Joe. A handle like Ice or Silver. He'd log into his bank account and take out some money, buy some new software for his deck back at the White House.... Maelcum coughed politely from behind him in the yellow room. "Thanks, man," said President Brandon without turning around: "You're looking good." The Zionite grinned through his blond dreads and scratched at one of his many scabs; brown eyes sparkling above high cheekbones as he shuffled across the carpet to pull down one of Garvey's green leather chairs; sitting carefully on its edge with crippled legs crossed before him like sticks of kindling wood.... His name wasn't really Maelcum McNeill; but everybody called him that now because once he'd been part of an Irish navvy team working for an off-shore drilling company near Freeside 's core. Part pirate and part construction worker; but mostly pirate.... President Brandon stood up from Garvey's chair and walked across thick carpet to look out through one of Garvey's wide windows at Washington . He could see Virginia highrises across Rock Creek park-ing lots where red fire trucks danced around overturned fuel tanks.... The air smelled faintly sweet here in Garvey's study, despite slashes of gray ash that drifted down outside every time a fusion bomb went off somewhere over Babylon . The view was incredible: green trees heavy with pink flowers breaking up concrete avenues crowded with small shops beneath tall white office buildings ringed by razor wire fences topped by coils

It definitely sounds like Gibson’s style, too!

I was laying in bed on my ThinkPad, OpenAI playground with a fine-tuned model glistening in front of me. Just one more example prompt and I could publish this blog, I told myself. Just one more job before I'm out of the game.

GPT:

The prompt appeared. I delegated the task to my model and instantly fell asleep, dreaming of the body-modded starlets of the Rio de Janeiro nightclubs. When I woke up, a week had passed and my model had written this blog post for me. Now it's time to turn off my ThinkPad for good. Goodbye, cruel world! Or at least goodbye to you....

Sweet dreams.