Latest

Kumar Shivendu's blog

  • I tried to build fake embeddings that match real ones. Matching all five popular structure metrics was easy — and the result was still a useless fog. The thing that actually worked was copying the geometry locally, which also tells you what a transformer's 24 layers are each doing.
  • High-dimensional random vectors are nearly equidistant — the 'nearest neighbor' is barely nearer than nothing, and a search index has nothing to navigate by. I unpack what that means, what 'structure' is the cure for, and what structure actually buys: mostly compute, not recall.
  • A friend said they expand SPLADE terms into an OpenSearch BM25 text field and it works. We tested it — 0.36 NDCG@10, worse than plain BM25. Here is why, and how to mostly fix it.