How to learn LLMs - a subjective roadmap
In this article, I explore a pragmatic approach to getting into the field of Large Language Models from the perspective of an ML practitioner whose prior experience is mostly in tabular data. My ultimate objective was to train a useful model that predicts a salary from a job description, using the state-of-the-art technology available. Below, I present a collection of materials (ordered according to my thought flow) that I gathered during my journey.
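To make that objective concrete, here is a minimal sketch of the setup it implies: a pretrained transformer fine-tuned as a text regressor. Everything in it is illustrative, as the base model, the example text, and the salary figure are placeholders rather than my actual configuration.

```python
# A minimal, illustrative sketch of text-regression fine-tuning with Hugging Face
# transformers. The model name, example text, and salary target are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "distilbert-base-uncased"  # placeholder; any strong encoder works
tokenizer = AutoTokenizer.from_pretrained(model_name)
# num_labels=1 with problem_type="regression" attaches a single-output MSE head
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=1, problem_type="regression"
)

texts = ["Senior ML engineer, remote, equity package"]  # job descriptions
salaries = torch.tensor([[150_000.0]])                  # target salaries (placeholder)

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=salaries)
outputs.loss.backward()  # from here, plug into any standard training loop or Trainer
```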
Summary - the current GPT SOTA & Roadmap
- State of GPT - a talk by Karpathy, May 2023
Winning solutions' writeups in Kaggle NLP competitions 2022-2023
The free market - let the best model win. Thousands of programmers compete for fame on reasonable-scale infrastructure, and the models that win here are the true state of the art (within the limits of a single A100 GPU).
- feedback-prize-2021
- us-patent-phrase-to-phrase-matching
- feedback-prize-effectiveness
- feedback-prize-english-language-learning
- chaii-hindi-and-tamil-question-answering
Papers on the most successful models on Kaggle (the "free market") - working & cheap
Understanding the Foundations of Transformers
- The classic, foundational paper - Attention Is All You Need (a minimal code sketch of its core equation follows this list)
- Intuition behind the Transformer architecture explained by Rachel Thomas and Jeremy Howard
- Key advantages of the Transformer architecture explained in 8 minutes by Karpathy in conversation with Lex Fridman
- An intro to transformers in the form of an illustrative Kaggle notebook
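If the paper feels dense, it helps to see that its central equation, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, fits in a few lines of NumPy. The following is a toy single-head sketch with arbitrary shapes, not an excerpt from any of the materials above:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, the core operation of the Transformer."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))  # 4 tokens, d_k = 8
print(scaled_dot_product_attention(Q, K, V).shape)         # -> (4, 8)
```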
Understanding LLMs
- ChatGPT
- Llama - an open-source model from Meta, free for commercial applications (see the loading sketch after this list)
- Bard - Google's response to ChatGPT, powered by LaMDA
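Because the Llama weights are openly distributed, the model can be run locally. Below is a minimal sketch using the Hugging Face transformers pipeline; the model id is my assumption (the checkpoint is gated behind Meta's license form), and any causal LM id such as "gpt2" can be substituted to test the plumbing:

```python
# A minimal sketch of running an open-weights LLM locally with Hugging Face
# transformers. The model id is an assumption (gated checkpoint); substitute
# any causal LM id such as "gpt2" if you just want to test the code path.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # assumed id; requires accepted license
    device_map="auto",                      # needs the accelerate package installed
)

prompt = "Estimate a fair salary range for a senior data engineer in Berlin."
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```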
Other materials
- Experiments with GPT-4 - Sparks of AGI
- Karpathy - how to recreate ChatGPT
- From ML to autonomous intelligence by LeCun
- Limitations of LLMs cited by Yann LeCun
- Llama 2 available on Azure
- Reddit discussion on the SOTA in LLMs, accompanying the Llama 2 release