
1. Building an LLM from scratch

Machine Learning


Overview

I am building an LLM from scratch, with no frameworks and no libraries: just math and code.

Approach

I started by learning how an LLM is structured. Most modern LLMs are based on the transformer architecture.

Figure: Transformer architecture
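To make the architecture more concrete, here is a minimal sketch of causal scaled dot-product self-attention, the core operation inside a transformer block, written with only the Python standard library. It is an illustration, not the code from this project: the helper names (`softmax`, `matmul`, `transpose`, `causal_self_attention`) are hypothetical, and the sketch assumes the query, key, and value matrices are already given as lists of rows. A full transformer would also include learned projections, multiple heads, residual connections, and feed-forward layers.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) times (k x m) gives (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def causal_self_attention(q, k, v):
    # q, k, v: lists of rows, one row per token (assumed already projected).
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = []
    for i, row in enumerate(scores):
        # Token i may only attend to positions 0..i (causal mask).
        scaled = [s / math.sqrt(d_k) for s in row[: i + 1]]
        weights.append(softmax(scaled) + [0.0] * (len(row) - i - 1))
    # Each output row is a weighted average of the value rows.
    return matmul(weights, v), weights

# Toy input: two tokens with two-dimensional embeddings.
x = [[1.0, 0.0], [0.0, 1.0]]
out, weights = causal_self_attention(x, x, x)
```

In this toy call the first token can only attend to itself, so its output row equals its own value row, while the second token mixes both value rows according to its attention weights.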

Step 1: Describe the first major step you took.
Step 2: Describe the second step.
Step 3: And so on...

Challenges

What was the hardest part? What bugs or blockers did you hit?

💡 Key insight: Use this box to highlight an important discovery or design decision.

Results

What did you achieve? Include metrics, screenshots, or output examples.

Reflection

What would you do differently? What did you learn? What are next steps?

Improvement idea 1
Improvement idea 2