Tokenization is a Compression Codec Nobody Uses That WayVector databases store raw UTF-8 text payloads with zero compression. BPE tokenization + entropy coding gives you 5x lossless compression using infrastructure already in every ML stack.Published onMay 12, 2026
Why I'm Obsessed With SearchSearch is everywhere and is one of the hardest problems in CS. Here's why I'm obsessed with it.Published onApril 9, 2026
2025: Year in ReviewReflections on a year of growth, travel, and pursuing masteryPublished onJanuary 16, 2026