familiarcycle

Useful resources for GPT-2 fine-tuning

Feb 06 2020

Here are some resources I've found useful in learning how to fine-tune GPT-2.

These posts by Max Woolf are the best place to start for beginners:

His gpt-2-simple library is a great building block for fine-tuning GPT-2:
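To give a feel for the library, here's a minimal sketch of a fine-tuning run with gpt-2-simple, based on its documented API. The corpus filename and step count are placeholders, not something from this post:

```python
# Sketch of fine-tuning GPT-2 (124M) with gpt-2-simple (pip install gpt-2-simple).
# "corpus.txt" and steps=1000 are placeholder values; adjust for your dataset.
import os


def finetune_gpt2(corpus_path="corpus.txt", steps=1000):
    import gpt_2_simple as gpt2

    model_name = "124M"  # smallest GPT-2 model; easiest to fine-tune on a single GPU
    if not os.path.isdir(os.path.join("models", model_name)):
        gpt2.download_gpt2(model_name=model_name)  # fetches the pretrained weights

    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, corpus_path, model_name=model_name, steps=steps)
    gpt2.generate(sess)  # sample from the fine-tuned model
```

Calling `finetune_gpt2()` downloads the pretrained weights on first run, trains on your text file, and prints a sample of generated text.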

Fine-tuning the 1558M model (GPT-2 XL) can be done for free on Google Colab using TPUs. However, the terrain is a bit rockier.

This blog post by Svilen Todorov is a good starting point:

Shawn Presser has a Colab notebook that's also a good reference.