10 comments

  • fsckboy 23 hours ago
    these days i find myself yearning to type "Beatles abbey rd" and find only "Beatles abbey rd"
    • storystarling 22 hours ago
      I learned this the hard way on a book platform I'm working on. While semantic search is useful for discovery, we found that prioritizing exact matches is critical. It seems users get pretty frustrated if they type a specific title and get a list of conceptually similar results instead of the actual book. We ended up having to tune the ranking to heavily favor literal string matches over the vector distance to keep people from bouncing.
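A minimal sketch of the ranking idea described above, assuming each candidate already carries a vector distance from some embedding search. The `Candidate` type, the `exact_bonus` weight, and the normalization rule are hypothetical illustrations, not the platform's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    vector_distance: float  # smaller means semantically closer (e.g. cosine distance)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so 'Abbey  Road' equals 'abbey road'."""
    return " ".join(text.lower().split())


def rank(query: str, candidates: list[Candidate], exact_bonus: float = 10.0) -> list[Candidate]:
    """Sort candidates so literal title matches outrank any purely semantic hit.

    exact_bonus is an assumed weight; it only needs to exceed the largest
    possible vector distance for exact matches to always come first.
    """
    q = normalize(query)

    def score(c: Candidate) -> float:
        literal = exact_bonus if normalize(c.title) == q else 0.0
        return literal - c.vector_distance

    return sorted(candidates, key=score, reverse=True)
```

With this weighting, a title that matches the query literally is listed first even when another title is closer in embedding space.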
      • fsckboy 21 hours ago
        everything you are saying rings perfectly true to me but there's an additional problem I encounter. (i'm going to make up my example because i'm too lazy to check but you'll get the idea) say you want to look up "Alexander the Great"...

        ...God help you if Brad Pitt and/or the Jonas Brothers ever played a role with exactly that name-match. The web and search (and the culture?) have become super biased toward video, especially commercial offerings, and the sorting ranked by popularity means pages and pages of virtually identical content about that which you are not interested in.

      • drsalt 17 hours ago
        why did you have to learn this the hard way?
    • qingcharles 15 hours ago
      I remember eBay 30 years ago, when it showed you whatever you typed in. Compare that to 2026, when it shows you everything except the thing you typed in.
    • Manfred 23 hours ago
      Especially with small datasets it’s more important to be exact at the expense of a user having to fix a typo.
  • gingerlime 23 hours ago
    Great post. Explains the concepts just enough that they click without going too deep, shows practical implementation examples, how it fits together. Simple, clear and ultimately useful. (to me at least)
  • pinkmuffinere 23 hours ago
    The rewritten title is confusing imo. Can I propose:

    “Finding ‘Abbey Road’ given ‘beatles abbey rd’ search with Postgres”

    • pinkmuffinere 23 hours ago
      (The missing close-apostrophe, and the use of “type” are what really confuse me in the original submission)
  • augusteo 17 hours ago
    On the API vs local model question:

    We went with API embeddings for a similar use case. The cold-start latency of local models across multiple workers ate more money in compute than just paying per-token. Plus you avoid the operational overhead of model updates.

    The hybrid approach in this article is smart. Fuzzy matching catches 80% of cases instantly, embeddings handle the rest. No need to run expensive vector search on every query.
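A rough sketch of that "fuzzy first, embeddings only as a fallback" flow in plain Python. The `trigrams` helper only approximates how pg_trgm builds trigrams (it splits on whitespace rather than on all non-alphanumeric characters), the 0.3 threshold is an assumption mirroring pg_trgm's default similarity threshold, and `semantic_search` is a hypothetical callable standing in for whatever vector search is in use.

```python
from typing import Callable


def trigrams(text: str) -> set[str]:
    """Approximate pg_trgm: lowercase, pad each word with two leading and one trailing space."""
    grams: set[str] = set()
    for word in text.lower().split():
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (Jaccard-style)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def hybrid_search(
    query: str,
    titles: list[str],
    semantic_search: Callable[[str], list[str]],  # hypothetical embedding-backed fallback
    threshold: float = 0.3,  # assumed cutoff
) -> list[str]:
    """Return fuzzy matches if any clear the threshold; otherwise call the fallback."""
    scored = [(similarity(query, title), title) for title in titles]
    hits = sorted([pair for pair in scored if pair[0] >= threshold], reverse=True)
    if hits:
        return [title for _, title in hits]
    # Only pay for the embedding lookup when trigram matching found nothing.
    return semantic_search(query)
```

A query like "abbey rd" shares enough trigrams with "Abbey Road" to be answered by the fuzzy step alone, while a descriptive query with no overlapping trigrams falls through to `semantic_search`.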

    • TurdF3rguson 16 hours ago
      Those text embeddings are dirt cheap. You can do around 1M titles on the cloudflare embedding model I used last time without exceeding daily free tier.
      • augusteo 16 hours ago
        yeah exactly. even OpenAI/Gemini are really cheap too
  • timlod 20 hours ago
    FWIW, the performance considerations section is a little simplistic, and probably assumes that exact dataset/problem.

    For GIN, for example, performance depends a lot on the size of the search input (the fewer characters, the more rows to compare) as well as the number of rows/size of the index.

    It also mentions GiST (another type of index which isn't mentioned anywhere else in the article).
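To illustrate the point about input size, here is a toy inverted index over character trigrams. It is a simplified model for intuition only, not how Postgres implements GIN; the function names and sample rows are made up. The idea it shows: each trigram in the query narrows the candidate set, so a very short query leaves many (or all) rows to recheck.

```python
from collections import defaultdict


def char_trigrams(text: str) -> set[str]:
    """All 3-character substrings of the lowercased text (empty if shorter than 3)."""
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def build_index(rows: list[str]) -> dict[str, set[int]]:
    """Map each trigram to the set of row ids that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for row_id, text in enumerate(rows):
        for gram in char_trigrams(text):
            index[gram].add(row_id)
    return index


def candidate_rows(query: str, index: dict[str, set[int]], row_count: int) -> set[int]:
    """Rows that contain every trigram of the query and so still need a recheck."""
    grams = char_trigrams(query)
    if not grams:
        # No trigrams to filter on: every row remains a candidate.
        return set(range(row_count))
    return set.intersection(*[index.get(gram, set()) for gram in grams])


rows = ["Abbey Road", "Abbey Road Live", "Road to Nowhere", "OK Computer"]
index = build_index(rows)
for query in ["ro", "road", "abbey road"]:
    print(query, "->", len(candidate_rows(query, index, len(rows))), "candidate rows")
```

In this toy data the two-character query filters nothing, while the longer queries cut the candidate set down before any row is compared in full.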

  • lbrito 23 hours ago
    I was just starting to learn about embeddings for a very similar use on my project. Newbie question: what are the pros/cons of using an API like gpt Ada to calculate the embeddings, compared to importing some model in Python and running it locally like in this article?
    • storystarling 23 hours ago
      The main trade-off I found is the RAM footprint on your backend workers. If you run the model locally, every Celery worker needs to load it into memory, so you end up needing much larger instances just to handle the overhead.

      With Ada your workers stay lightweight. For a bootstrapped project, I found it easier to pay the small API cost than to manage the infrastructure complexity of fat worker nodes.

    • alright2565 23 hours ago
      Do you want it to run on your CPU, or someone else's GPU?

      Is the local model's quality sufficient for your use case, or do you need something higher quality?

  • TeamDman 23 hours ago
    for 50,000 rows I'd much rather just use fzf/nucleo/tv against json files instead of dealing with database schemas. When it comes to dealing with embedding vectors rather than plaintext, it gets slightly more annoying, but it still feels like such a pain in the ass to go full database when really it could still be a bunch of flat open files.

    More of a perspective from just trying to index crap on my own machine vs building a SaaS

  • danielfalbo 23 hours ago
    > Abbey Road

    > The Dark Side of the Moon

    > OK Computer

    Those are my 3 personal records ever. I feel so average now...

    • tialaramex 19 hours ago
      The other two are popular but "Dark Side of the Moon" in particular was extremely popular. Like, top 10 albums ever level popular.
  • cess11 23 hours ago
    I found fuzzy search in Manticore to be straightforward and pretty good. Might be a decent alternative if one perceives the ceremony in TFA as a bit much.
  • esafak 22 hours ago
    tl;dr: A demo of pg_trgm (fuzzy matcher) + pgvector (vector search).
    • TurdF3rguson 16 hours ago
      Sounds nice but I'm not sure that trigram brings anything to the table that vector didn't already bring.