Mythosia.VectorDb.Postgres
PostgreSQL (pgvector) implementation of IVectorStore.
Single-table design with a collection column for logical isolation.
Prerequisites
- PostgreSQL 12+
- pgvector extension installed:
CREATE EXTENSION IF NOT EXISTS vector;
Quick Start
using Mythosia.VectorDb.Postgres;

var store = new PostgresVectorStore(new PostgresVectorStoreOptions
{
    ConnectionString = "Host=localhost;Database=mydb;Username=postgres;Password=secret",
    Dimension = 1536,
    EnsureSchema = true, // auto-creates table + indexes
    Index = new HnswIndexOptions { M = 16, EfConstruction = 64, EfSearch = 40 }
});
ERD
erDiagram
vectors {
text collection PK "NOT NULL — logical collection name"
text id PK "NOT NULL — unique record ID within collection"
text namespace "NULL — optional tenant/scope isolation"
text content "NULL — original text content"
jsonb metadata "NOT NULL DEFAULT '{}' — arbitrary key-value pairs"
vector embedding "NOT NULL — vector(dimension) for similarity search"
timestamptz created_at "NOT NULL DEFAULT now()"
timestamptz updated_at "NOT NULL DEFAULT now()"
}
Single-table design: All collections share one table. The composite primary key (collection, id) ensures uniqueness per collection.
Indexes
| Index | Type | Target | Purpose |
|---|---|---|---|
| PK | btree | (collection, id) | Primary key / upsert conflict |
| idx_*_embedding | hnsw / ivfflat | embedding vector_*_ops | ANN similarity search (distance strategy dependent) |
| idx_*_metadata | gin | metadata | jsonb containment filter (@>) |
| idx_*_collection_ns | btree | (collection, namespace) | Namespace-scoped queries |
Schema
When EnsureSchema = true, the following is created automatically:
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS "public"."vectors" (
    collection  text        NOT NULL,
    id          text        NOT NULL,
    namespace   text        NULL,
    content     text        NULL,
    metadata    jsonb       NOT NULL DEFAULT '{}'::jsonb,
    embedding   vector(1536) NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    updated_at  timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (collection, id)
);
-- Indexes
CREATE INDEX IF NOT EXISTS idx_vectors_metadata
ON "public"."vectors" USING gin (metadata);
CREATE INDEX IF NOT EXISTS idx_vectors_collection_ns
ON "public"."vectors" (collection, namespace);
-- vector index (default: HNSW)
CREATE INDEX IF NOT EXISTS idx_vectors_embedding
ON "public"."vectors" USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
Notes:
- The vector index SQL changes by Index type (HnswIndexOptions / IvfFlatIndexOptions / NoIndexOptions).
- The operator class changes by DistanceStrategy:
  - Cosine → vector_cosine_ops
  - Euclidean → vector_l2_ops
  - InnerProduct → vector_ip_ops
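The mapping in the notes above can be sketched as a small standalone program. This is only an illustration of how the operator class feeds into the index statement, not the library's internal code: the helper names OperatorClass and HnswIndexSql are hypothetical, and the strategies are passed as plain strings to keep the sketch self-contained.

```csharp
using System;

// Operator class per distance strategy, as listed in the notes above.
static string OperatorClass(string distanceStrategy) => distanceStrategy switch
{
    "Cosine" => "vector_cosine_ops",
    "Euclidean" => "vector_l2_ops",
    "InnerProduct" => "vector_ip_ops",
    _ => throw new ArgumentOutOfRangeException(nameof(distanceStrategy))
};

// Hypothetical helper: builds an HNSW index statement for the default
// "public"."vectors" table, in the same shape as the generated schema above.
static string HnswIndexSql(string distanceStrategy, int m, int efConstruction) =>
    "CREATE INDEX IF NOT EXISTS idx_vectors_embedding ON \"public\".\"vectors\" " +
    $"USING hnsw (embedding {OperatorClass(distanceStrategy)}) " +
    $"WITH (m = {m}, ef_construction = {efConstruction});";

// Print the statement that a Euclidean strategy with default HNSW params would need.
Console.WriteLine(HnswIndexSql("Euclidean", 16, 64));
```

Switching the strategy after the index exists would require rebuilding the index with the matching operator class, since an index built for one operator is not used for another.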
When EnsureSchema = false (recommended for production), the table must already exist.
An InvalidOperationException is thrown with a clear message if the table is missing.
Manual Schema Setup (Production)
For production deployments, create the schema manually before starting the application:
-- 1. Enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;
-- 2. Create table (adjust dimension as needed)
CREATE TABLE public.vectors (
    collection  text        NOT NULL,
    id          text        NOT NULL,
    namespace   text        NULL,
    content     text        NULL,
    metadata    jsonb       NOT NULL DEFAULT '{}'::jsonb,
    embedding   vector(1536) NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    updated_at  timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (collection, id)
);
-- 3. Indexes
CREATE INDEX idx_vectors_metadata
ON public.vectors USING gin (metadata);
CREATE INDEX idx_vectors_collection_ns
ON public.vectors (collection, namespace);
-- 4-A. Option A (recommended default): HNSW
CREATE INDEX idx_vectors_embedding
ON public.vectors USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
-- 4-B. Option B: IVFFlat (create after loading data)
-- ivfflat requires rows to exist for training
-- CREATE INDEX idx_vectors_embedding
-- ON public.vectors USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
-- 5. Analyze for query planner (recommended)
ANALYZE public.vectors;
Options
| Option | Default | Description |
|---|---|---|
| ConnectionString | (required) | PostgreSQL connection string |
| Dimension | (required) | Embedding vector dimension (e.g., 1536 for OpenAI) |
| SchemaName | "public" | Database schema |
| TableName | "vectors" | Table name |
| EnsureSchema | false | Auto-create extension/table/indexes |
| DistanceStrategy | Cosine | Similarity metric (Cosine, Euclidean, InnerProduct) |
| Index | new HnswIndexOptions() | Vector index settings object (HnswIndexOptions, IvfFlatIndexOptions, NoIndexOptions) |
| HnswIndexOptions.M | 16 | HNSW build param (m), typical range 8-64 |
| HnswIndexOptions.EfConstruction | 64 | HNSW build param (ef_construction), typical range 32-400 |
| HnswIndexOptions.EfSearch | 40 | HNSW runtime ef_search default |
| IvfFlatIndexOptions.Lists | 100 | Number of IVF lists for the ivfflat index |
| IvfFlatIndexOptions.Probes | 10 | IVFFlat runtime probes default |
| FailFastOnIndexCreationFailure | true | Throw when vector index creation fails (recommended for production) |
Runtime Tuning Guide (DX)
- IvfFlatSearchRuntimeOptions.Probes: increase for better recall, decrease for lower latency.
- HnswSearchRuntimeOptions.EfSearch: increase for better recall, decrease for lower latency.
- IvfFlatIndexOptions.Lists: start around sqrt(total_rows) and tune from there.
Use runtime options matching your index settings:
- Index = new HnswIndexOptions(...) → HnswSearchRuntimeOptions
- Index = new IvfFlatIndexOptions(...) → IvfFlatSearchRuntimeOptions
Recommended starting points:
| Goal | IvfFlatSearchRuntimeOptions.Probes | HnswSearchRuntimeOptions.EfSearch |
|---|---|---|
| Fast | 4 | 16 |
| Balanced | 10 | 40 |
| HighRecall | 32 | 120 |
These are practical ranges, not strict hard limits. Final values should be chosen from production latency/recall measurements.
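As a rough illustration of the guidance above, the sqrt(total_rows) rule and the starting-point table can be written as two small helpers. SuggestLists and StartingPoint are hypothetical names made up for this sketch, not part of the package; the numbers come directly from this guide.

```csharp
using System;

// Rule of thumb from this guide: start IVFFlat lists near sqrt(total_rows).
static int SuggestLists(long totalRows) =>
    Math.Max(1, (int)Math.Round(Math.Sqrt(totalRows)));

// Starting points from the table above, returned as (probes, efSearch).
static (int Probes, int EfSearch) StartingPoint(string goal) => goal switch
{
    "Fast" => (4, 16),
    "Balanced" => (10, 40),
    "HighRecall" => (32, 120),
    _ => throw new ArgumentOutOfRangeException(nameof(goal))
};

Console.WriteLine(SuggestLists(10_000));     // 100
Console.WriteLine(SuggestLists(1_000_000));  // 1000

var (probes, efSearch) = StartingPoint("Balanced");
Console.WriteLine($"probes={probes}, ef_search={efSearch}");
```

The values are only a first guess; measure recall and latency on real data before settling on them.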
Collection & Filter Behavior
- Collections are stored as a collection column in a single shared table (not separate tables).
- CreateCollectionAsync is a no-op; collections are implicitly created on upsert.
- DeleteCollectionAsync deletes all rows matching the collection.
- Namespace filter: WHERE namespace = @ns
- Metadata filter: WHERE metadata @> @jsonb (jsonb containment, AND logic)
- MinScore filter (distance-strategy dependent):
  - Cosine: 1 - (embedding <=> @q::vector) >= @minScore
  - Euclidean: 1 / (1 + (embedding <-> @q::vector)) >= @minScore
  - InnerProduct: -(embedding <#> @q::vector) >= @minScore
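The MinScore expressions above convert a pgvector distance into a score where higher means more similar. The following sketch reproduces that arithmetic outside the database so the thresholds are easier to reason about. The function names are hypothetical; the formulas are the ones listed above, and <#> is pgvector's negative inner product operator.

```csharp
using System;

// pgvector operators return distances; these mirror the MinScore expressions above.
static double CosineScore(double cosineDistance) => 1.0 - cosineDistance;               // <=>
static double EuclideanScore(double l2Distance) => 1.0 / (1.0 + l2Distance);             // <->
static double InnerProductScore(double negativeInnerProduct) => -negativeInnerProduct;   // <#>

double minScore = 0.75;

// Cosine distance 0.2 gives score 0.8, which passes the threshold.
Console.WriteLine(CosineScore(0.2) >= minScore);        // True
// L2 distance 1.0 gives score 0.5, which is filtered out.
Console.WriteLine(EuclideanScore(1.0) >= minScore);     // False
// <#> value -0.9 corresponds to inner product 0.9, which passes.
Console.WriteLine(InnerProductScore(-0.9) >= minScore); // True
```

Note that the three scores live on different scales: the Euclidean score is always in (0, 1], while the inner product score is unbounded unless the embeddings are normalized, so a MinScore chosen for one strategy does not transfer to another.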
RAG Integration
var store = await RagStore.BuildAsync(config => config
    .AddText("Your document text here", id: "doc-1")
    .UseLocalEmbedding(512)
    .UseVectorStore(new PostgresVectorStore(new PostgresVectorStoreOptions
    {
        ConnectionString = Environment.GetEnvironmentVariable("MYTHOSIA_PG_CONN")!,
        Dimension = 512,
        EnsureSchema = true,
        Index = new HnswIndexOptions()
    }))
    .WithTopK(5)
);
Performance Tips
- ivfflat lists: Rule of thumb: lists = sqrt(total_rows). Default 100 is good for up to ~10K rows.
- Run ANALYZE vectors; after bulk inserts for optimal query plans.
- For large datasets (1M+ rows), consider HNSW index (CREATE INDEX ... USING hnsw) instead of ivfflat.
- Use connection pooling (e.g., Npgsql connection string Pooling=true;Maximum Pool Size=20).
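For the pooling tip, one possible way to avoid hand-editing the connection string is Npgsql's NpgsqlConnectionStringBuilder, which exposes the pooling settings as typed properties. This is a sketch that assumes the Npgsql package is referenced; the host, database, and credentials are the placeholder values from the Quick Start.

```csharp
using System;
using Npgsql;

// Start from a base connection string and set pooling explicitly.
var builder = new NpgsqlConnectionStringBuilder(
    "Host=localhost;Database=mydb;Username=postgres;Password=secret")
{
    Pooling = true,    // Npgsql pools connections by default; stated here for clarity
    MaxPoolSize = 20   // corresponds to "Maximum Pool Size=20"
};

// The resulting string can be assigned to PostgresVectorStoreOptions.ConnectionString.
Console.WriteLine(builder.ConnectionString);
```

A pool size of 20 is the value suggested above, not a universal recommendation; size it against the database's max_connections and the number of application instances.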
EnsureSchema Guidance
- EnsureSchema = true: Development, testing, local Docker. Auto-provisions everything.
- EnsureSchema = false: Production. Schema managed by DBA/migration tools; fails fast with a clear error if missing.
- For ivfflat, index creation can fail on empty tables (PostgreSQL/pgvector behavior). In that case, use Hnsw or create ivfflat after loading data.
- FailFastOnIndexCreationFailure = true (default): throws immediately if vector index creation fails.
- FailFastOnIndexCreationFailure = false: startup continues even if vector index creation fails.
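One way to apply this guidance is to derive EnsureSchema from the hosting environment, so that only development runs auto-provision the schema. The sketch below uses only the option names documented above. Reading DOTNET_ENVIRONMENT is an assumption about how the application distinguishes environments, and MYTHOSIA_PG_CONN is the variable name used in the RAG example.

```csharp
using System;
using Mythosia.VectorDb.Postgres;

// Assumption: the app marks development runs via DOTNET_ENVIRONMENT=Development.
bool isDevelopment =
    Environment.GetEnvironmentVariable("DOTNET_ENVIRONMENT") == "Development";

var options = new PostgresVectorStoreOptions
{
    ConnectionString = Environment.GetEnvironmentVariable("MYTHOSIA_PG_CONN")!,
    Dimension = 1536,
    // Auto-provision only in development; production expects a pre-created schema.
    EnsureSchema = isDevelopment,
    Index = new HnswIndexOptions(),
    // Keep the default so a failed vector index build surfaces at startup.
    FailFastOnIndexCreationFailure = true
};

var store = new PostgresVectorStore(options);
```

With this setup, a production deployment that is missing the table fails with the InvalidOperationException described in the Schema section rather than silently creating it.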
Target framework: net10.0
Dependencies
- Mythosia.VectorDb.Abstractions (>= 1.0.0)
- Npgsql (>= 10.0.1)
v10.0.0: Initial release. Fixed SearchAsync/ApplySearchRuntimeSettingsAsync command lifecycle — block-scoped disposal prevents Npgsql concurrency errors. SET LOCAL now uses interpolation (PostgreSQL does not support parameterized SET LOCAL).