1. Building an LLM from scratch

Machine Learning

Overview

I am building an LLM from scratch: no frameworks, no libraries, just math and code.

Approach

The first thing I learned was the structure of an LLM. Most modern LLMs are based on the transformer architecture.

[Figure: Transformer architecture]
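
To make the architecture a little more concrete, here is a minimal sketch of the operation at the core of a transformer: causal scaled dot-product attention, written with only the Python standard library. The helper names (`softmax`, `matmul`, `transpose`, `causal_attention`) and the toy input are assumptions for illustration, not this project's actual code. A real model would also learn separate projection matrices for the queries, keys, and values, and would run several attention heads in parallel.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def causal_attention(q, k, v):
    """Scaled dot-product attention where position i sees only positions <= i."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = []
    for i, row in enumerate(scores):
        # Scale by sqrt(d_k) and mask out future positions with -inf,
        # so they receive zero weight after the softmax.
        scaled = [s / math.sqrt(d_k) if j <= i else float("-inf")
                  for j, s in enumerate(row)]
        weights.append(softmax(scaled))
    return matmul(weights, v), weights

# Toy input (hypothetical): 3 tokens, each a 2-dimensional vector,
# used directly as queries, keys, and values.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = causal_attention(x, x, x)
```

Each row of `weights` sums to 1, and the first token can attend only to itself, so its output is just its own value vector. The causal mask is what lets a decoder-style model be trained to predict the next token without seeing it.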

  1. Step 1: Describe the first major step you took.
  2. Step 2: Describe the second step.
  3. Step 3: And so on...

Challenges

What was the hardest part? What bugs or blockers did you hit?

💡 Key insight: Use this box to highlight an important discovery or design decision.

Results

What did you achieve? Include metrics, screenshots, or output examples.

Reflection

What would you do differently? What did you learn? What are next steps?