
4. Embeddings, Storage, and Search

Embed sentences, store them in a vector DB, and query by meaning. Implement search(query) over a small corpus — results should rank by semantic similarity, not keyword overlap.
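A minimal search(query) harness can be sketched with plain cosine similarity. The embed() below is a deterministic bag-of-words stand-in (hashed into 64 buckets), not a real embedding model — it only captures token overlap, so the whole point of the exercise is to swap it for a provider from the next section and watch the rankings become semantic. All names here are illustrative.

```python
import math
import zlib
from collections import Counter

DIM = 64  # toy dimensionality; real models use 384-3072 dims


def embed(text: str) -> list:
    """Stand-in embedder: bag-of-words hashed into DIM buckets.

    Swap this for a real model (e.g. OpenAI text-embedding-3-small or
    sentence-transformers all-MiniLM-L6-v2) -- this toy version ranks
    by token overlap, not meaning.
    """
    vec = [0.0] * DIM
    for token, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(token.encode()) % DIM] += count
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


CORPUS = [
    "How to reset a forgotten password",
    "Best hiking trails near Lisbon",
    "Troubleshooting login and password issues",
]
CORPUS_VECS = [embed(doc) for doc in CORPUS]  # embed once, at index time


def search(query: str, k: int = 2) -> list:
    """Return the top-k corpus documents by cosine similarity to the query."""
    qv = embed(query)
    ranked = sorted(
        zip(CORPUS, CORPUS_VECS),
        key=lambda pair: cosine(qv, pair[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:k]]
```

Only embed() needs to change when you plug in a real provider; the index-once, rank-by-cosine shape stays the same.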

Embedding Providers

Pick an embedding provider — they’re not interchangeable. OpenAI text-embedding-3-small is the default. Cohere Embed v3 handles multilingual well. Jina and Voyage offer specialised models. For open-source: Sentence Transformers (all-MiniLM-L6-v2 for speed, BGE/GTE for quality) run locally with no API cost. Try at least two — embedding quality is the single biggest lever in retrieval.

Vector Databases

Pick a vector DB. Start with one:

| Database | Best for |
| --- | --- |
| pgvector | Already have Postgres — add a column. Zero new infrastructure. |
| Chroma | Embeds in-process, good for prototypes |
| FAISS | Meta’s library, fastest local similarity search |
| Pinecone / Weaviate / Qdrant | Managed services with filtering and metadata |
| LanceDB | Embedded, columnar, good for multimodal |
| Supabase / MongoDB Atlas | Vector search added to databases you may already use |
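Whichever DB you pick, the core contract is the same: add a vector with an id and metadata, then query for the nearest k. A toy in-memory store makes that contract concrete — roughly what a flat (brute-force) FAISS index or an embedded Chroma collection does, minus persistence and ANN indexing. The class and method names are illustrative, not any library's actual API.

```python
import math


class TinyVectorStore:
    """Toy in-memory vector store: exact brute-force cosine top-k.

    Sketches the add/query contract shared by pgvector, Chroma, FAISS,
    etc. Real stores add persistence, approximate indexes, and richer
    metadata filtering on top of this shape.
    """

    def __init__(self):
        self._rows = []  # list of (doc_id, vector, metadata) tuples

    def add(self, doc_id, vector, metadata=None):
        self._rows.append((doc_id, vector, metadata or {}))

    def query(self, vector, k=3, where=None):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb) if na and nb else 0.0

        rows = self._rows
        if where:  # naive equality filter, in the spirit of Chroma's where= or a SQL WHERE
            rows = [r for r in rows if all(r[2].get(f) == v for f, v in where.items())]
        ranked = sorted(rows, key=lambda r: cosine(vector, r[1]), reverse=True)
        return [(doc_id, meta) for doc_id, _, meta in ranked[:k]]


# Usage: two English docs and one German doc in a 2-d toy space.
store = TinyVectorStore()
store.add("a", [1.0, 0.0], {"lang": "en"})
store.add("b", [0.0, 1.0], {"lang": "en"})
store.add("c", [0.9, 0.1], {"lang": "de"})
```

Metadata filtering is worth noticing here: it is the feature that separates the managed services in the table from a bare similarity library.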

Pure semantic search often fails on short queries and proper nouns. Hybrid search (dense vectors + BM25 keyword scoring) is the production standard. Know it exists even if you start semantic-only.
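The hybrid idea can be sketched end to end: score the corpus twice — BM25 for keywords, cosine over embeddings for meaning — and merge the two rankings with reciprocal rank fusion (RRF), one common fusion rule. The dense scores below are a stand-in for real embedding similarities; the BM25 parameters k1=1.5, b=0.75 and the RRF constant k=60 are conventional defaults, not values from this text.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Classic BM25 over whitespace tokens; returns one score per doc."""
    toks = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks)
    n = len(docs)
    q = query.lower().split()
    df = {t: sum(1 for d in toks if t in d) for t in q}  # document frequency
    scores = []
    for d in toks:
        tf = Counter(d)
        s = 0.0
        for t in q:
            if tf[t] == 0:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1)
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores


def rrf(rankings, k=60):
    """Reciprocal rank fusion: merge ranked lists of doc indices."""
    fused = Counter()
    for ranking in rankings:
        for rank, idx in enumerate(ranking):
            fused[idx] += 1.0 / (k + rank + 1)
    return [idx for idx, _ in fused.most_common()]


def hybrid_search(query, docs, dense_scores, k=3):
    """Fuse a BM25 ranking with a dense ranking via RRF.

    dense_scores: cosine similarities from any embedding model
    (supplied directly here as a stand-in for a real embedder).
    """
    bm = bm25_scores(query, docs)
    bm25_rank = sorted(range(len(docs)), key=lambda i: bm[i], reverse=True)
    dense_rank = sorted(range(len(docs)), key=lambda i: dense_scores[i], reverse=True)
    return [docs[i] for i in rrf([bm25_rank, dense_rank])[:k]]


DOCS = [
    "error code 504 gateway timeout",
    "gateway timeout troubleshooting guide",
    "recipe for tomato soup",
]
# Pretend an embedding model rated doc 1 most similar to the query.
RESULTS = hybrid_search("504 timeout", DOCS, dense_scores=[0.2, 0.9, 0.1])
```

Note how the two signals complement each other: BM25 catches the exact token "504" that an embedding may blur, while the dense score surfaces the troubleshooting guide that shares meaning but fewer tokens.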

Resources

OpenAI embeddings · Cohere Embed · Sentence Transformers · Jina Embeddings · pgvector · Chroma · FAISS · Pinecone · The Illustrated Word2Vec