Modern search engines combine multiple retrieval techniques: lexical search (BM25), semantic vector search, caching, and ranking.
I wanted to understand how these components interact, so I implemented a miniature search pipeline from scratch.
Key parts:
• Bloom filter to skip zero-result queries
• LSM-tree backed inverted index
• HNSW graph for semantic vector search
• W-TinyLFU admission-aware caching
• Reciprocal Rank Fusion to merge rankings
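
To make the last item concrete, here is a minimal sketch of Reciprocal Rank Fusion in Python. It is not taken from the repo: the function name, the sample document IDs, and the constant k=60 (a commonly used default) are assumptions for illustration. Each document scores the sum of 1 / (k + rank) over every ranking it appears in.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one list (best first).

    k=60 is a commonly used default, assumed here; it is not from the post.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the lexical and vector retrievers.
bm25_hits = ["d1", "d2", "d3"]
vector_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Because RRF only looks at ranks, it needs no score normalization between BM25 and cosine similarity, which is probably why it is a popular choice for hybrid search.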
One interesting optimization was using skip pointers in the posting lists, which reduces intersection cost from O(n*m) for a naive nested-loop comparison to roughly O(n * sqrt(m)), where n and m are the lengths of the two lists.
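
A rough sketch of skip-pointer intersection in Python, under a few assumptions that are mine rather than the repo's: posting lists are sorted lists of unique integer doc IDs, skips are spaced about sqrt(length) apart, and they are computed by index arithmetic instead of being stored as real pointers.

```python
import math


def intersect_with_skips(a, b):
    """Intersect two sorted posting lists of unique doc IDs.

    Skip pointers are simulated: position i has a skip to i + step
    whenever i is a multiple of step (step ~ sqrt(len)).
    """
    step_a = max(1, math.isqrt(len(a)))
    step_b = max(1, math.isqrt(len(b)))
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # Take the skip only if it does not jump past b[j].
            if i % step_a == 0 and i + step_a < len(a) and a[i + step_a] <= b[j]:
                i += step_a
            else:
                i += 1
        else:
            if j % step_b == 0 and j + step_b < len(b) and b[j + step_b] <= a[i]:
                j += step_b
            else:
                j += 1
    return result


evens = list(range(0, 100, 2))
threes = list(range(0, 100, 3))
common = intersect_with_skips(evens, threes)
print(common)
```

Skips pay off most when one list is much longer than the other, since long runs of non-matching IDs can be jumped over in a single step.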
Another was using deterministic N-gram embeddings to avoid external embedding APIs.
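
One way such embeddings could work is the hashing trick over character n-grams, sketched below. This is an assumption about the general approach, not the repo's actual scheme: the function names, dim=256, n=3, and the choice of SHA-256 are all illustrative.

```python
import hashlib
import math


def ngram_embedding(text, dim=256, n=3):
    """Map text to a fixed-size unit vector by hashing character n-grams.

    The same input always yields the same vector, with no model or API call.
    """
    vec = [0.0] * dim
    padded = f" {text.lower()} "  # pad so word boundaries form n-grams too
    for i in range(len(padded) - n + 1):
        gram = padded[i:i + n]
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        bucket = h % dim                      # which dimension to update
        sign = 1.0 if (h >> 63) == 0 else -1.0  # signed hashing softens collisions
        vec[bucket] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(u, v):
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(u, v))


v1 = ngram_embedding("search engine")
v2 = ngram_embedding("search engine")
print(cosine(v1, v2))
```

Vectors like these capture spelling overlap rather than meaning, so they are a stand-in for a learned embedding model, not a replacement for one.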
Full writeup + code: https://github.com/AyushSuri8/nexus-search-engine