Biao's Blog

ML Systems

Notes on LLM systems, training infrastructure, and the machinery underneath.

2026

2025

2024

2023

How JAX Allocates Memory

Blogs

Technical notes and implementation writeups.

2025

What is Flash Attention?

A visual explanation of Flash Attention and how IO-aware tiling reduces memory traffic for modern attention kernels.

2024

2023

Template for a blog post

A compact typography and Markdown sample for checking how the blog theme renders common writing elements.