Image search with user-provided embeddings
This article shows you the main steps for performing a multimodal search: using text queries to search through a database of images that has no associated metadata.
Requirements
- A database of images
- A Meilisearch project
- An embedding generation provider you can install locally
Configure your local embedding generation pipeline
First, set up a system that sends your images to your chosen embedding generation provider, then integrates the returned embeddings into your dataset.
The exact procedure depends heavily on your specific setup, but should include these main steps:
- Choose a provider you can run locally
- Choose a model that supports both image and text input
- Send your images to the embedding generation provider
- Add the returned embeddings to the _vectors field of each image in your database
In most cases, your system should run these steps periodically or whenever you update your database, as in the sketch below.
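As an illustration, here is a minimal shell sketch of such a pipeline. It assumes a hypothetical local embedding server listening on localhost:8000 whose /embed endpoint accepts raw image bytes and returns a JSON array of floats; the server, the endpoint, the images directory, and the jq dependency are all assumptions you should adapt to your own provider.
for IMAGE in ./images/*.jpg; do
  # Hypothetical local server: POST raw image bytes, receive a JSON float array
  EMBEDDING=$(curl -s -X POST 'localhost:8000/embed' \
    -H 'Content-Type: application/octet-stream' \
    --data-binary @"$IMAGE")
  # Wrap the embedding in a Meilisearch document with a _vectors field
  jq -cn --arg id "$(basename "$IMAGE" .jpg)" --argjson vec "$EMBEDDING" \
    '{"id": $id, "_vectors": {"EMBEDDER_NAME": $vec}}'
done > documents.ndjson
Each line of documents.ndjson is one document whose _vectors.EMBEDDER_NAME field holds the corresponding image's embedding.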
Configure a user-provided embedder
Configure the embedders index setting, setting its source to userProvided:
curl \
  -X PATCH 'MEILISEARCH_URL/indexes/images/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "embedders": {
      "EMBEDDER_NAME": {
        "source": "userProvided",
        "dimensions": MODEL_DIMENSIONS
      }
    }
  }'
Replace EMBEDDER_NAME with the name you wish to give your embedder. Replace MODEL_DIMENSIONS with the number of dimensions of your chosen model. For example, CLIP ViT-B/32 outputs 512-dimensional vectors, so its dimensions would be 512.
Add documents to Meilisearch
Next, use the /documents endpoint to upload the vectorized images. In most cases, you should automate this step so Meilisearch stays up to date with your primary database.
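For example, if your pipeline produced an NDJSON file like the one sketched earlier, the upload could look like this (the images index name is an assumption):
curl \
  -X POST 'MEILISEARCH_URL/indexes/images/documents' \
  -H 'Content-Type: application/x-ndjson' \
  --data-binary @documents.ndjson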
Set up a pipeline for vectorizing queries
Since you are using a userProvided embedder, you must also generate the embeddings for the search query. This process should be similar to generating embeddings for your images:
- Receive the user's query from your front end
- Send the query to your local embedding generation provider, as in the sketch after this list
- Perform the search using the returned query embedding
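Continuing with the same hypothetical embedding server, and assuming it also accepts plain text on its /embed endpoint, vectorizing a query might look like this:
# Send the raw query text to the hypothetical /embed endpoint
QUERY='a photograph of a red bicycle'
VECTORIZED_QUERY=$(curl -s -X POST 'localhost:8000/embed' \
  -H 'Content-Type: text/plain' \
  --data-binary "$QUERY")
You can then interpolate $VECTORIZED_QUERY into the search request shown below.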
Vector search with user-provided embeddings
Once you have the query's vector, pass it to the vector search parameter to perform a semantic AI-powered search:
curl -X POST -H 'Content-Type: application/json' \
  'MEILISEARCH_URL/indexes/images/search' \
  --data-binary '{
    "vector": VECTORIZED_QUERY,
    "hybrid": {
      "embedder": "EMBEDDER_NAME"
    }
  }'
Replace VECTORIZED_QUERY with the embedding generated by your provider and EMBEDDER_NAME with the name of your embedder.
If your images have any associated metadata, you may perform a hybrid search by including the original text query in q:
curl -X POST -H 'Content-Type: application/json' \
  'MEILISEARCH_URL/indexes/images/search' \
  --data-binary '{
    "vector": VECTORIZED_QUERY,
    "hybrid": {
      "embedder": "EMBEDDER_NAME"
    },
    "q": "QUERY"
  }'
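The optional semanticRatio field of hybrid accepts a value between 0 and 1 (defaulting to 0.5) and controls the balance between semantic and full-text results: values closer to 1 favor the semantic side.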
Conclusion
You have seen the main steps for implementing image search with Meilisearch:
- Prepare a pipeline that converts your images into vectors
- Index the vectorized images with Meilisearch
- Prepare a pipeline that converts your users' queries into vectors
- Perform searches using the converted queries