Roadmap roundup: where Meilisearch is heading

We’ve been thinking a lot about where Meilisearch needs to go next. Not incremental improvements, but fundamental shifts in what Meilisearch can do, who it can serve, and how it fits into the modern AI stack.
Today, we’re sharing our updated roadmap. It’s organized around four themes:
- **Meilisearch for any workload**
- **Hybrid by default**
- **From API to platform**
- **RAG at scale**
Here’s the full picture:
Meilisearch for any workload
Meilisearch works great for most use cases today. But there’s a growing segment we want to serve better: large SaaS platforms with millions of tenants that need per-tenant search isolation. The economics simply don’t work when every tenant, active or not, requires dedicated compute.
We’re solving this with two complementary efforts:
1. Distributed search
We're well into this one already. Sharding lets Meilisearch distribute indexes across multiple nodes, breaking through the single-machine ceiling that currently limits dataset size. Replication ensures high availability: if a node goes down, search continues to work.
For our customers, this means Meilisearch can handle datasets that were previously too large. For SaaS companies, it means they can finally trust Meilisearch as the backbone for search at real scale.
2. Serverless indexes
This is the bigger bet. The idea: indexes that aren’t being queried should cost almost nothing.
Today, whether an index serves 10,000 queries per second or zero, it allocates the same compute and memory. That’s wasteful. With serverless, inactive indexes get moved to cheap object storage (like S3) and only spin back up when a query arrives. Active indexes behave exactly as they do today, no performance compromise.
We're targeting Q3 2026 for the first release of serverless indexes, starting with Cloud.
What this unlocks is significant. A SaaS with 1 million tenants, of which only 5% are active at any given time, would pay for 50,000 warm indexes instead of 1 million. The other 950,000 sit on S3 for fractions of a cent. Per-tenant search isolation becomes affordable at any scale.
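As a back-of-the-envelope sketch of those economics (the per-index prices below are made-up placeholders, not real Meilisearch pricing):

```python
# Hypothetical monthly unit costs -- placeholders, not actual pricing.
WARM_INDEX_COST = 10.0    # dedicated compute per active (warm) index
COLD_INDEX_COST = 0.001   # object storage (e.g. S3) per dormant index

def monthly_cost(tenants: int, active_ratio: float) -> float:
    """Cost of serving `tenants` indexes when only `active_ratio` are warm."""
    warm = int(tenants * active_ratio)
    cold = tenants - warm
    return warm * WARM_INDEX_COST + cold * COLD_INDEX_COST

# 1 million tenants, 5% active: pay for 50,000 warm indexes while the
# other 950,000 sit in object storage for fractions of a cent each.
serverless = monthly_cost(1_000_000, 0.05)
all_warm = monthly_cost(1_000_000, 1.0)
```

With these placeholder numbers, the serverless bill is dominated by the 50,000 warm indexes; the 950,000 cold ones add a rounding error.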
It also unlocks something we’ve long wanted: a real free tier. Today, offering free plans means provisioning compute for users who may never come back. With serverless, a dormant free user’s index costs us almost nothing in storage. When they return, it wakes up and works. We can finally offer a generous free plan without burning money on idle infrastructure.
Serverless also compounds beautifully with geo-replication. Without it, replicating across five regions would cost 5x as much. With serverless, replicas in low-traffic regions scale to zero and only wake when needed. Global distribution becomes affordable instead of multiplicatively expensive.
Hybrid by default
Meilisearch already supports hybrid search, combining keyword matching with semantic understanding powered by AI embeddings. But setting it up requires too many decisions, too much configuration, and too much reliance on third-party providers. We want to change that.
This initiative is about making hybrid search the default, effortless experience, and building a genuine quality moat around it.
Simpler setup for embeddings and reranking
Today, getting semantic search working requires choosing a provider, picking a model, understanding dimensions, and writing a document template. Most developers just want to “make my search smarter.”
We’re building a guided setup in the Cloud dashboard that walks you through enabling semantic search in under two minutes. Sensible defaults at every step. We’re also adding AI-generated document templates: Meilisearch samples your documents, analyzes them, and automatically generates the optimal template. No more guessing what text to embed.
On the reranking side, we’re opening support for more providers beyond Cohere (Jina, Voyage, and a generic REST option), and making the composite embedder pattern (a fast local model for queries, a high-quality remote model for indexing) a simple toggle rather than a manual configuration.
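For a sense of what the composite pattern configures under the hood, here is a sketch of an embedders settings payload. The field shape follows the current `settings/embedders` API, but the model names are illustrative placeholders, and in practice both sides of a composite embedder must produce vectors of the same dimensionality:

```python
# Sketch of a composite embedder configuration -- model names are
# placeholders; check the embedders documentation for exact field names.
embedder_settings = {
    "default": {
        "source": "composite",
        # Fast local model used at search time (low query latency).
        "searchEmbedder": {
            "source": "huggingFace",
            "model": "some-small-local-model",        # placeholder
        },
        # Higher-quality remote model used at indexing time.
        # NOTE: must output vectors of the same dimensions as above.
        "indexingEmbedder": {
            "source": "openAi",
            "model": "some-remote-model",             # placeholder
            "documentTemplate": "A product named {{doc.title}}",
        },
    }
}
# Applied via: PATCH /indexes/<uid>/settings/embedders
```

The toggle we are building would generate a payload like this for you, rather than asking you to write it by hand.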
A unified AI Gateway
Every embedding and reranking call currently goes directly to a third-party provider. Meilisearch has zero visibility: no cost tracking, no caching, no fallback when a provider goes down.
We’re building an AI gateway that sits between Meilisearch and all AI providers. It handles provider translation, retry with fallback, authentication, metering, and most importantly, aggressive caching. Embedding the same text always produces the same vector, so the gateway caches duplicate requests instead of paying for them again. This is especially valuable during re-indexing.
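The caching idea can be sketched in a few lines. This is a toy in-memory model of the dedup behavior, not the real gateway (which is a Rust service); the point is that identical (model, text) pairs never hit the provider twice:

```python
import hashlib

class EmbeddingCache:
    """Toy sketch of the gateway's dedup cache: embedding the same text
    always yields the same vector, so duplicate requests are served from
    the cache instead of being paid for again."""

    def __init__(self, embed_fn):
        self._embed = embed_fn    # call to the upstream provider (stand-in)
        self._store = {}
        self.provider_calls = 0

    def embed(self, model: str, text: str):
        key = hashlib.sha256(f"{model}\x00{text}".encode()).hexdigest()
        if key not in self._store:
            self.provider_calls += 1            # cache miss: pay the provider
            self._store[key] = self._embed(text)
        return self._store[key]                 # cache hit: free

# Stand-in provider that "embeds" text as a one-element vector.
cache = EmbeddingCache(lambda text: [float(len(text))])
cache.embed("small", "red running shoes")
cache.embed("small", "red running shoes")  # re-indexing the same document
```

During a full re-index, every unchanged document takes the cache-hit path, which is where the savings come from.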
For Cloud users, this becomes the default. You enable semantic search, pick "meilisearch" as a source, and everything just works. No provider API keys needed. For self-hosted users, the gateway is available as a paid API with the same benefits. Switching providers becomes a configuration change; no re-indexing is required.
Our own models
This is the long-term play. We’re training and hosting our own embedding and reranking models, purpose-built for search.
General-purpose models from OpenAI or Cohere are trained for broad similarity tasks. They often duplicate what keyword search already does well, instead of filling in what keywords miss. Our models will be trained specifically to complement Meilisearch’s keyword engine, using co-optimization techniques that teach the model to capture semantic meaning where keyword search falls short.
We’re planning multiple tiers (small, large, multilingual) and a local reranking model that runs inside the Meilisearch binary itself. No external API calls, no network latency.
The quality bar: match established providers on standard benchmarks within 5%, and outperform all general-purpose models on Meilisearch’s hybrid search quality. We’ll publish these benchmarks openly. This creates a real moat: models co-optimized with Meilisearch’s ranking pipeline deliver results that competitors can’t replicate by just calling the same API.
From API to platform
Meilisearch has grown far beyond a simple search engine. It supports sharding, index swaps, webhooks, document functions, detailed performance profiling, and more. But many of these capabilities are invisible in the Cloud dashboard. A feature that can’t be found might as well not exist.
This initiative is about closing the gap between what the engine can do and what customers can actually access, configure, and debug from the UI.
Full engine access from the dashboard
We’re surfacing the full power of the Meilisearch engine in the Cloud dashboard. Many features exist in the engine today but have no UI: shard management, index swaps for zero-downtime reindexing, webhook configuration, data transfer between projects, document functions (in-place transforms via code), search performance breakdowns, and proper index management with disk usage and compaction.
All of these will get first-class dashboard interfaces. The goal is that everything you can do via the API can also be done from the Cloud UI, with visual feedback and safety checks built in.
Built-in debugging and observability
Today, when something goes wrong, most customers file a support ticket. We want to change that by giving them the same tools our internal team uses.
The biggest one: we’re opening up our internal “top 50 slowest requests” dashboard to customers. It’s the same tool our team reaches for first when something feels slow. It will come with pattern detection and optimization suggestions for each slow query.
This builds on the per-request performance tracing we already shipped in v1.35.0, where adding `showPerformanceDetails: true` to any search query returns a full breakdown of where time is spent: tokenization, keyword search, semantic search, formatting, and more. If you've used the search preview in the Cloud dashboard, you've already seen this in action.
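Enabling the trace is a one-line addition to an ordinary search request (the response field names in the comment are illustrative):

```python
# Sketch of a search request with per-request tracing enabled.
# POST /indexes/products/search
search_request = {
    "q": "running shoes",
    "showPerformanceDetails": True,   # opt in to the timing breakdown
}
# The response then carries a breakdown of where time was spent,
# roughly: tokenization, keyword search, semantic search, formatting.
```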
Beyond that, we’re building a better batch view (structured debugging instead of raw JSON), full engine telemetry with configurable alerts (latency percentiles, throughput, resource usage), an AI diagnostic helper that can analyze a slow request and explain what’s going on in plain language, an auto-optimize tool that suggests settings changes based on your data and usage patterns, and snapshot management for backups and point-in-time restore.
Together, these tools transform the Cloud dashboard from a basic management interface into a full observability and operations platform.
RAG at scale
Since launching Meilisearch Chat to general availability last year, we've seen strong adoption and learned a lot about what teams actually need when putting conversational search in front of real users. This next phase is squarely focused on experience: making Chat faster to configure, cutting the setup steps that still require touching code, and giving teams more control over how their chat interface looks and behaves, all directly from the Cloud dashboard.
A smarter chat engine
The engine-level changes make the chat feature fundamentally more capable.
The biggest win is parallel multi-search. Today, the LLM must make sequential tool calls, one round-trip at a time. For a question like “compare Nike and Adidas running shoes under $150,” that means search, wait, search again, wait again. We’re replacing this with a single tool call that fires multiple searches in parallel. All results come back at once.
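Concretely, that comparison question decomposes into two searches fired in one round-trip. The payload below borrows the shape of the existing `POST /multi-search` endpoint to show what the single tool call produces; the index name and filter syntax are illustrative:

```python
# One tool call, multiple searches executed in parallel -- instead of
# two sequential LLM round-trips. Index name and filter are placeholders.
tool_call = {
    "queries": [
        {"indexUid": "products", "q": "Nike running shoes",
         "filter": "price < 150"},
        {"indexUid": "products", "q": "Adidas running shoes",
         "filter": "price < 150"},
    ]
}
```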
We’re also simplifying configuration dramatically. Today, setting up a chat agent requires N+1 API calls across two separate levels. We’re merging everything into a single agent configuration: provider, system prompt, guardrails, few-shot examples, and per-index search parameters in one call.
Other key improvements include dynamic facet discovery (the LLM explores filterable attributes on demand instead of relying on static data baked into the prompt), external tool calling (pass your own tools alongside Meilisearch search for complex agent workflows), conversational memory (the LLM stores and retrieves user preferences across sessions, scoped by tenant token), and better source attribution with real-time citation events streamed alongside the answer.
From API to product
The engine improvements above mean nothing if developers can’t discover and use them.
We’re building a chat playground in the Cloud dashboard where you can test your agent, see which documents were retrieved, and watch the LLM’s internal tool calls in real-time. No code required to start experimenting.
The second big piece is auto-generation. Meilisearch already knows your data. We’ll use AI to analyze sample documents and automatically generate optimal configurations: system prompts, index descriptions, document templates, search parameters, guardrails, and few-shot examples. A developer should go from “I have indexed data” to “I have a working chat agent” in under five minutes.
We’re also building a Chat UI starter kit: a code generator that produces a complete, working chat component for your framework of choice (React, Vue, Next.js, Vanilla JS, Svelte), pre-wired to your Meilisearch endpoint. Copy, paste, ship. And comprehensive documentation covering the full RAG pipeline, prompt design, and integration patterns.
Multi-provider LLM infrastructure
Chat V1 relies on a single hardcoded LLM provider. Any outage means chat is down, any price increase hits margins directly, and customers with specific compliance requirements can’t be served.
We’re building an AI gateway, an internal Rust service deployed as a sidecar on each Cloud region, that sits between Meilisearch and all LLM providers. One unified OpenAI-compatible interface with support for OpenAI, Anthropic, Mistral, Cohere, Google Vertex, AWS Bedrock, and self-hosted models via Ollama.
The gateway provides fallback chains (if one provider fails, the next picks up transparently), per-tenant billing via metering, Bring Your Own Key for enterprise customers, per-tenant cost controls, and prompt optimization to reduce token usage. It’s invisible infrastructure: users interact with Meilisearch’s Chat API, and the gateway handles everything behind the scenes.
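The fallback-chain behavior can be sketched as a simple loop. This is a toy Python model of what the Rust sidecar does, with stand-in provider functions; the real gateway also handles retries, metering, and authentication:

```python
def complete_with_fallback(providers, prompt):
    """Try each (name, call) provider in order; return the first success.
    A provider outage transparently falls through to the next in the chain."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:        # outage, rate limit, timeout, ...
            errors.append((name, exc))  # record and move on to the next
    raise RuntimeError(f"all providers failed: {errors}")

# Stand-in providers: the first is "down", the second answers.
def down(prompt): raise ConnectionError("provider outage")
def up(prompt): return f"answer to: {prompt}"

used, answer = complete_with_fallback(
    [("primary", down), ("fallback", up)], "hello"
)
assert used == "fallback"   # the chain recovered transparently
```

From the caller's point of view nothing changed: the Chat API returned an answer, and the gateway absorbed the outage.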
The bigger picture
These four themes aren’t isolated. They reinforce each other.
Serverless indexes make the AI gateway economically viable at scale: millions of tenants can each have their own search and chat agents without burning compute on idle infrastructure. The AI gateway becomes the distribution channel for our own models, serving embeddings, reranking, and LLM access. Our models make hybrid search a genuine differentiator: purpose-built models that outperform generic alternatives. And the platform improvements make everything accessible: debugging tools, observability, and a dashboard that surfaces the full power of the engine.
Together, they transform Meilisearch from a search engine into a complete information retrieval platform: search, hybrid AI, RAG-powered chat, and the infrastructure to run it all at any scale.
We’re incredibly excited about what’s ahead. As always, Meilisearch remains open source at its core, and we’ll share progress as we build. If any of this particularly interests you, or if you’re facing a problem that it would solve, we’d love to hear from you.
Stay tuned.
Want to stay up to date on our priorities and direction? Check out our public roadmap or sign up for the Meilisearch newsletter.


