How does this work? I thought it was probably powered by embeddings and maybe some more traditional search code, but I checked out the linked GitHub repo and didn't see any model/inference code. Is the public code a wrapper that communicates with your commercial API?
Some searches work like magic and others seem to veer off target a lot. For example, "sculpture" and "watercolor" worked just about how I'd expect. "Lamb" showed lambs and sheep. But "otter" showed a random selection of animals.
It is powered by Mixedbread Search, which is powered by our model Omni. Omni is multimodal (text, video, audio, images) and multi-vector, which helps us capture more information.
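For readers curious what "multi-vector" buys you: instead of collapsing each item into one embedding, you keep several vectors per item and score with late interaction. A minimal sketch of the common ColBERT-style MaxSim scoring (an assumption for illustration; Omni's actual scoring is not public):

    import numpy as np

    def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
        """For each query vector, take its best match among the document's
        vectors, then sum those maxima (late interaction)."""
        sims = query_vecs @ doc_vecs.T        # (n_query, n_doc) pairwise dots
        return float(sims.max(axis=1).sum())

    # Toy usage: 3 query vectors vs. 5 document vectors, 8 dims, unit-norm
    rng = np.random.default_rng(0)
    q = rng.normal(size=(3, 8)); q /= np.linalg.norm(q, axis=1, keepdims=True)
    d = rng.normal(size=(5, 8)); d /= np.linalg.norm(d, axis=1, keepdims=True)
    print(maxsim_score(q, d))  # higher = better match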
The search is in beta and we are improving the model. Thank you for reporting the queries that are not working well.
Edit: Re the otter, I just checked and I did not find otters in the dataset. We should not return any results when the model is not sure, to reduce confusion.
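One simple way to implement that "no results when unsure" behavior is to threshold on the top score; a hypothetical sketch (the 0.30 cutoff and the score scale are assumptions, not Mixedbread's actual logic):

    def filter_low_confidence(results, min_score=0.30):
        """results: list of {"id", "score"} dicts sorted by score, descending.
        If even the best hit is weak, return nothing instead of noise."""
        if not results or results[0]["score"] < min_score:
            return []
        return [r for r in results if r["score"] >= min_score]

    print(filter_low_confidence([{"id": "plate-40", "score": 0.21}]))  # -> []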
There's at least a little bit of otter in the data. The one relevant result I saw was "Plate 40: Two Otters and a Beaver" by Joris Hoefnagel.
I also expected semantic search to return similar results for "fireworks" and "pyrotechnics," since the latter is a less common synonym for the former. But I got many results for fireworks and just one result for pyrotechnics.
This is still impressive. My impulse is to poke at it with harder cases to try to reason about how it could be implemented. Thanks for your Show HN and for replying to me!
If you find more such cases, please feel free to send them over to aamir at domain name of the Show HN. I would love to look at those cases and see how we can improve on them. Thank you so much for the feedback.
This is neat. I'm not sure how to report queries that are working poorly, as you mentioned, but when I search "Waltz" I am presented with kitchen utensils and only one piece of dancing folks. Presumably this is due to the artist's name being 'Walton'.
We will add a feedback form tomorrow morning. For now, please feel free to write to aamir at domain name of the page. Thank you so much! This helps us a lot.
The results for "Mark Rothko", "Paintings by Mark Rothko", "Paintings similar to mark rothko", etc. do not bring up anything I was expecting. The NGA has a large collection of Rothko paintings, but none of them come up.
We are currently not including the artist name; that will be added in the next iteration of the model (next week). Right now the search is based only on what the model can "see", and it seems the model does not understand the art of Mark Rothko.
The next version can see the image and read the metadata.
A bit more context: We include everything in the latent space (embeddings) without trying to maintain multiple indexes and hack around things. There is still a huge mountain to climb, but this approach seems really promising.
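One plausible reading of "everything in the latent space" is that each artwork becomes a single multimodal document, with its metadata serialized into the same embedding as the image, instead of a separate keyword index for artist names. A hypothetical sketch with a generic encoder stub (not the Mixedbread API):

    import numpy as np

    def embed_artwork(image_vec, metadata, embed_text):
        """Fuse image and metadata embeddings so a query like "Mark Rothko"
        can match on the artist field as well as on visual content."""
        text = " | ".join(f"{k}: {v}" for k, v in metadata.items())
        fused = np.concatenate([image_vec, embed_text(text)])
        return fused / np.linalg.norm(fused)

    # Dummy text encoder, for illustration only
    dummy_embed = lambda s: np.full(4, float(len(s)))
    print(embed_artwork(np.ones(4),
                        {"artist": "Mark Rothko", "title": "Untitled"},
                        dummy_embed))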
A search for "character studies of old farmers" yielded good results.
The results are drawings and engravings, which may reflect the balance of the collection; perhaps this subject appears more often in practice studies than in marketable oil paintings.
Since this is a semantic search using vector embeddings, it handles meanings better than a text search would, while a text search handles names better.
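A common compromise is hybrid retrieval: blend the embedding similarity with a lexical signal so exact names still hit even when the vision model can't "see" them. A sketch under the assumption of unit-norm embeddings (the 0.7 weight is made up):

    import numpy as np

    def hybrid_score(query, query_vec, doc_vec, doc_text, alpha=0.7):
        """Weighted blend of semantic similarity (cosine, assuming unit-norm
        vectors) and a crude exact-substring lexical match."""
        semantic = float(np.dot(query_vec, doc_vec))
        lexical = 1.0 if query.lower() in doc_text.lower() else 0.0
        return alpha * semantic + (1 - alpha) * lexical

    # "mark rothko" boosts items whose metadata actually contains the name
    v = np.ones(3) / np.sqrt(3)
    print(hybrid_score("mark rothko", v, v, "Mark Rothko, Untitled, 1969"))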
Would be interesting to know how relevant that approach is now.
ColNomic and NVIDIA models are great for embedding images, and MUVERA can transform those multi-vector embeddings into single fixed-dimensional vectors.
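For context, MUVERA's fixed-dimensional encoding roughly works by SimHash-partitioning each vector into a bucket and aggregating per bucket, so a variable-size set of vectors becomes one flat vector you can drop into an ordinary ANN index. A simplified sketch of the idea (the real method adds repetitions, empty-cluster filling, and projections):

    import numpy as np

    def fde(vectors, planes, reduce=np.sum):
        """MUVERA-style fixed-dimensional encoding, simplified: SimHash each
        vector into one of 2^k buckets, aggregate per bucket, concatenate."""
        k = planes.shape[0]
        bits = (vectors @ planes.T) > 0                    # (n, k) sign bits
        buckets = bits.astype(int) @ (1 << np.arange(k))   # bucket id per vector
        out = np.zeros((2 ** k, vectors.shape[1]))
        for b in range(2 ** k):
            members = vectors[buckets == b]
            if len(members):
                out[b] = reduce(members, axis=0)
        return out.ravel()

    rng = np.random.default_rng(0)
    planes = rng.normal(size=(3, 8))                    # k=3 -> 8 buckets
    q = fde(rng.normal(size=(5, 8)), planes, np.sum)    # query side: sum
    d = fde(rng.normal(size=(20, 8)), planes, np.mean)  # doc side: mean
    print(float(q @ d))  # dot product approximates the multi-vector score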
“the pipeline” - seems like this is just a personal hackathon project?
Why these models vs other multimodal models? Which "NVIDIA models"?
"Images of german shepherds" never fails to provide some humor.
This NGA link returns over a thousand pieces by Rothko: https://www.nga.gov/artists/1839-mark-rothko/artworks