Mythosia.VectorDb.Postgres
10.1.0
dotnet add package Mythosia.VectorDb.Postgres --version 10.1.0
PostgreSQL (pgvector) implementation of IVectorStore.
Single-table design with namespace column for logical isolation.
Migration from v10.0.0
If upgrading from v10.0.0, run the following SQL migration before deploying:
-- 1. Rename columns (order matters: rename 'namespace' first to avoid conflict)
ALTER TABLE "public"."vectors" RENAME COLUMN namespace TO scope;
ALTER TABLE "public"."vectors" RENAME COLUMN collection TO namespace;
-- 2. Recreate composite index
DROP INDEX IF EXISTS idx_vectors_collection_ns;
CREATE INDEX idx_vectors_ns_scope ON "public"."vectors" (namespace, scope);
-- 3. Recreate primary key
ALTER TABLE "public"."vectors" DROP CONSTRAINT vectors_pkey;
ALTER TABLE "public"."vectors" ADD PRIMARY KEY (namespace, id);
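Because PostgreSQL DDL is transactional, the statements above can be applied as a single unit so that a failure partway through leaves the table unchanged. The following is a minimal sketch using Npgsql directly, not a tool shipped with this package. It assumes the default "public"."vectors" table and a connection string in the MYTHOSIA_PG_CONN environment variable (the variable name is borrowed from the RAG example later in this README).

```csharp
using System;
using Npgsql;

// Assumption: the connection string is supplied via an environment variable.
var connectionString = Environment.GetEnvironmentVariable("MYTHOSIA_PG_CONN")
    ?? throw new InvalidOperationException("MYTHOSIA_PG_CONN is not set.");

// The v10.0.0 -> v10.1.0 statements from above, in the required order.
const string migrationSql = """
    ALTER TABLE "public"."vectors" RENAME COLUMN namespace TO scope;
    ALTER TABLE "public"."vectors" RENAME COLUMN collection TO namespace;
    DROP INDEX IF EXISTS idx_vectors_collection_ns;
    CREATE INDEX idx_vectors_ns_scope ON "public"."vectors" (namespace, scope);
    ALTER TABLE "public"."vectors" DROP CONSTRAINT vectors_pkey;
    ALTER TABLE "public"."vectors" ADD PRIMARY KEY (namespace, id);
    """;

await using var connection = new NpgsqlConnection(connectionString);
await connection.OpenAsync();

// If any statement fails, the transaction is rolled back on dispose
// and none of the changes are applied.
await using var transaction = await connection.BeginTransactionAsync();
await using (var command = new NpgsqlCommand(migrationSql, connection, transaction))
{
    await command.ExecuteNonQueryAsync();
}
await transaction.CommitAsync();

Console.WriteLine("Migration applied.");
```

If the table uses a custom SchemaName or TableName, the identifiers in the SQL text need to be adjusted accordingly.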
Prerequisites
- PostgreSQL 12+
- pgvector extension installed:
CREATE EXTENSION IF NOT EXISTS vector;
Quick Start
using Mythosia.VectorDb;
using Mythosia.VectorDb.Postgres;
var store = new PostgresStore(new PostgresOptions
{
    ConnectionString = "Host=localhost;Database=mydb;Username=postgres;Password=secret",
    Dimension = 1536,
    EnsureSchema = true, // auto-creates table + indexes
    Index = new HnswIndexOptions { M = 16, EfConstruction = 64, EfSearch = 40 }
});
// Fluent API (recommended)
var ns = store.InNamespace("my-namespace");
await ns.UpsertAsync(record);
var results = await ns.SearchAsync(queryVector, topK: 5);
// With scope
var scoped = ns.InScope("tenant-1");
await scoped.UpsertAsync(record); // record.Scope set automatically
var scopedResults = await scoped.SearchAsync(queryVector);
ERD
erDiagram
vectors {
text namespace PK "NOT NULL — logical namespace"
text id PK "NOT NULL — unique record ID within namespace"
text scope "NULL — optional sub-namespace isolation"
text content "NULL — original text content"
jsonb metadata "NOT NULL DEFAULT '{}' — arbitrary key-value pairs"
vector embedding "NOT NULL — vector(dimension) for similarity search"
timestamptz created_at "NOT NULL DEFAULT now()"
timestamptz updated_at "NOT NULL DEFAULT now()"
}
Single-table design: All namespaces share one table. The composite primary key (namespace, id) ensures uniqueness per namespace.
Indexes
| Index | Type | Target | Purpose |
|---|---|---|---|
| PK | btree | (namespace, id) | Primary key / upsert conflict |
| idx_*_embedding | hnsw / ivfflat | embedding vector_*_ops | ANN similarity search (distance strategy dependent) |
| idx_*_metadata | gin | metadata | jsonb containment filter (@>) |
| idx_*_ns_scope | btree | (namespace, scope) | Scope-scoped queries |
Schema
When EnsureSchema = true, the following is created automatically:
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS "public"."vectors" (
    namespace text NOT NULL,
    id text NOT NULL,
    scope text NULL,
    content text NULL,
    metadata jsonb NOT NULL DEFAULT '{}'::jsonb,
    embedding vector(1536) NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    updated_at timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (namespace, id)
);
-- Indexes
CREATE INDEX IF NOT EXISTS idx_vectors_metadata
ON "public"."vectors" USING gin (metadata);
CREATE INDEX IF NOT EXISTS idx_vectors_ns_scope
ON "public"."vectors" (namespace, scope);
-- vector index (default: HNSW)
CREATE INDEX IF NOT EXISTS idx_vectors_embedding
ON "public"."vectors" USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
Notes:
- The vector index SQL changes by Index type (HnswIndexOptions/IvfFlatIndexOptions/NoIndexOptions).
- The operator class changes by DistanceStrategy:
  - Cosine → vector_cosine_ops
  - Euclidean → vector_l2_ops
  - InnerProduct → vector_ip_ops
When EnsureSchema = false (recommended for production), the table must already exist.
An InvalidOperationException is thrown with a clear message if the table is missing.
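To make the operator-class mapping in the notes concrete, here is a small sketch that turns a distance strategy name into the matching HNSW index statement. The helpers OperatorClassFor and BuildHnswIndexSql are hypothetical and only illustrate the mapping; they are not the package's internal SQL generator, and strategy names are passed as plain strings rather than the package's own enum.

```csharp
using System;

// Hypothetical helper: maps a DistanceStrategy name to the pgvector
// operator class listed in the notes above.
string OperatorClassFor(string distanceStrategy) => distanceStrategy switch
{
    "Cosine" => "vector_cosine_ops",
    "Euclidean" => "vector_l2_ops",
    "InnerProduct" => "vector_ip_ops",
    _ => throw new ArgumentOutOfRangeException(
        nameof(distanceStrategy), distanceStrategy, "Unknown distance strategy.")
};

// Builds the HNSW index statement for the default "public"."vectors" table,
// using the documented defaults m = 16 and ef_construction = 64.
string BuildHnswIndexSql(string distanceStrategy, int m = 16, int efConstruction = 64) =>
    "CREATE INDEX IF NOT EXISTS idx_vectors_embedding ON \"public\".\"vectors\" " +
    $"USING hnsw (embedding {OperatorClassFor(distanceStrategy)}) " +
    $"WITH (m = {m}, ef_construction = {efConstruction});";

Console.WriteLine(BuildHnswIndexSql("Euclidean"));
```

The operator class must match the operator used at query time, otherwise PostgreSQL cannot use the index for the similarity search.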
Manual Schema Setup (Production)
For production deployments, create the schema manually before starting the application:
-- 1. Enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;
-- 2. Create table (adjust dimension as needed)
CREATE TABLE public.vectors (
    namespace text NOT NULL,
    id text NOT NULL,
    scope text NULL,
    content text NULL,
    metadata jsonb NOT NULL DEFAULT '{}'::jsonb,
    embedding vector(1536) NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    updated_at timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (namespace, id)
);
-- 3. Indexes
CREATE INDEX idx_vectors_metadata
ON public.vectors USING gin (metadata);
CREATE INDEX idx_vectors_ns_scope
ON public.vectors (namespace, scope);
-- 4-A. Option A (recommended default): HNSW
CREATE INDEX idx_vectors_embedding
ON public.vectors USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
-- 4-B. Option B: IVFFlat (create after loading data)
-- ivfflat requires rows to exist for training
-- CREATE INDEX idx_vectors_embedding
-- ON public.vectors USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
-- 5. Analyze for query planner (recommended)
ANALYZE public.vectors;
Options
| Option | Default | Description |
|---|---|---|
| ConnectionString | (required) | PostgreSQL connection string |
| Dimension | (required) | Embedding vector dimension (e.g., 1536 for OpenAI) |
| SchemaName | "public" | Database schema |
| TableName | "vectors" | Table name |
| EnsureSchema | false | Auto-create extension/table/indexes |
| DistanceStrategy | Cosine | Similarity metric (Cosine, Euclidean, InnerProduct) |
| Index | new HnswIndexOptions() | Vector index settings object (HnswIndexOptions, IvfFlatIndexOptions, NoIndexOptions) |
| HnswIndexOptions.M | 16 | HNSW build param (m), typical range 8-64 |
| HnswIndexOptions.EfConstruction | 64 | HNSW build param (ef_construction), typical range 32-400 |
| HnswIndexOptions.EfSearch | 40 | HNSW runtime ef_search default |
| IvfFlatIndexOptions.Lists | 100 | Number of IVF lists for the ivfflat index |
| IvfFlatIndexOptions.Probes | 10 | IVFFlat runtime probes default |
| FailFastOnIndexCreationFailure | true | Throw when vector index creation fails (recommended for production) |
Runtime Tuning Guide (DX)
- IvfFlatSearchRuntimeOptions.Probes: increase for better recall, decrease for lower latency.
- HnswSearchRuntimeOptions.EfSearch: increase for better recall, decrease for lower latency.
- IvfFlatIndexOptions.Lists: start around sqrt(total_rows) and tune from there.
Use runtime options matching your index settings:
- Index = new HnswIndexOptions(...) → HnswSearchRuntimeOptions
- Index = new IvfFlatIndexOptions(...) → IvfFlatSearchRuntimeOptions
Recommended starting points:
| Goal | IvfFlatSearchRuntimeOptions.Probes | HnswSearchRuntimeOptions.EfSearch |
|---|---|---|
| Fast | 4 | 16 |
| Balanced | 10 | 40 |
| HighRecall | 32 | 120 |
These are practical ranges, not strict hard limits. Final values should be chosen from production latency/recall measurements.
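At the database level, these two knobs correspond to the pgvector settings ivfflat.probes and hnsw.ef_search. The sketch below encodes the starting points from the table and renders the matching SET LOCAL statement. PresetFor and RuntimeSettingSql are hypothetical helpers for illustration; how this package applies its runtime options internally is not described here.

```csharp
using System;

// Hypothetical helper: returns the starting-point values from the table above
// as (Probes, EfSearch) for a named goal.
(int Probes, int EfSearch) PresetFor(string goal) => goal switch
{
    "Fast" => (4, 16),
    "Balanced" => (10, 40),
    "HighRecall" => (32, 120),
    _ => throw new ArgumentOutOfRangeException(nameof(goal), goal, "Unknown goal.")
};

// Renders the pgvector setting for the given index kind.
// SET LOCAL limits the change to the current transaction.
string RuntimeSettingSql(string indexKind, string goal)
{
    var preset = PresetFor(goal);
    return indexKind switch
    {
        "hnsw" => $"SET LOCAL hnsw.ef_search = {preset.EfSearch};",
        "ivfflat" => $"SET LOCAL ivfflat.probes = {preset.Probes};",
        _ => throw new ArgumentOutOfRangeException(
            nameof(indexKind), indexKind, "Unknown index kind.")
    };
}

Console.WriteLine(RuntimeSettingSql("hnsw", "Balanced"));
Console.WriteLine(RuntimeSettingSql("ivfflat", "HighRecall"));
```

Keeping the presets in one place makes it easier to compare latency and recall across goals when measuring against real data.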
Namespace & Filter Behavior
- Namespaces are stored as a namespace column in a single shared table (not separate tables).
- There is no explicit namespace-create API. Namespaces are implicitly created on first upsert.
- Delete all rows in a namespace via store.InNamespace("your-ns").DeleteAllAsync().
- Scope filter: WHERE scope = @scope
- Metadata filter: WHERE metadata @> @jsonb (jsonb containment, AND logic)
- MinScore filter (distance-strategy dependent):
  - Cosine: 1 - (embedding <=> @q::vector) >= @minScore
  - Euclidean: 1 / (1 + (embedding <-> @q::vector)) >= @minScore
  - InnerProduct: -(embedding <#> @q::vector) >= @minScore
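The three MinScore expressions can be checked by hand. The sketch below reproduces the pgvector operators in plain C# (<=> is cosine distance, <-> is Euclidean distance, <#> is the negative inner product) and applies the score formulas from the list. All function names are illustrative and not part of the package.

```csharp
using System;
using System.Linq;

double Dot(double[] a, double[] b) => a.Zip(b, (x, y) => x * y).Sum();
double Norm(double[] a) => Math.Sqrt(Dot(a, a));

// pgvector distance operators reproduced in plain C#.
double CosineDistance(double[] a, double[] b) =>
    1 - Dot(a, b) / (Norm(a) * Norm(b));                       // <=>
double L2Distance(double[] a, double[] b) =>
    Math.Sqrt(a.Zip(b, (x, y) => (x - y) * (x - y)).Sum());    // <->
double NegativeInnerProduct(double[] a, double[] b) =>
    -Dot(a, b);                                                // <#>

// Score expressions from the MinScore list above.
double CosineScore(double[] a, double[] b) => 1 - CosineDistance(a, b);
double EuclideanScore(double[] a, double[] b) => 1 / (1 + L2Distance(a, b));
double InnerProductScore(double[] a, double[] b) => -NegativeInnerProduct(a, b);

var query = new[] { 1.0, 0.0 };
var row = new[] { 1.0, 1.0 };

Console.WriteLine($"Cosine: {CosineScore(query, row):F4}");
Console.WriteLine($"Euclidean: {EuclideanScore(query, row):F4}");
Console.WriteLine($"InnerProduct: {InnerProductScore(query, row):F4}");
```

Note that the scales differ: the Euclidean score always falls in (0, 1], the cosine score in [-1, 1], and the inner-product score is unbounded for non-normalized vectors, so a MinScore threshold chosen for one strategy does not carry over to another.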
RAG Integration
var store = await RagStore.BuildAsync(config => config
    .AddText("Your document text here", id: "doc-1")
    .UseLocalEmbedding(512)
    .UseVectorStore(new PostgresStore(new PostgresOptions
    {
        ConnectionString = Environment.GetEnvironmentVariable("MYTHOSIA_PG_CONN")!,
        Dimension = 512,
        EnsureSchema = true,
        Index = new HnswIndexOptions()
    }))
    .WithTopK(5)
);
Performance Tips
- ivfflat lists: Rule of thumb: lists = sqrt(total_rows). Default 100 is good for up to ~10K rows.
- Run ANALYZE vectors; after bulk inserts for optimal query plans.
- For large datasets (1M+ rows), consider HNSW index (CREATE INDEX ... USING hnsw) instead of ivfflat.
- Use connection pooling (e.g., Npgsql connection string Pooling=true;Maximum Pool Size=20).
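As a worked example of the sqrt(total_rows) rule of thumb, the sketch below computes a starting value for IvfFlatIndexOptions.Lists from a row count. SuggestIvfFlatLists is a hypothetical helper, and the result is only a starting point to tune from.

```csharp
using System;

// Hypothetical helper: applies the sqrt(total_rows) rule of thumb,
// rounding to the nearest integer and never returning less than 1.
int SuggestIvfFlatLists(long totalRows)
{
    if (totalRows < 0)
    {
        throw new ArgumentOutOfRangeException(
            nameof(totalRows), totalRows, "Row count cannot be negative.");
    }

    return Math.Max(1, (int)Math.Round(Math.Sqrt(totalRows)));
}

foreach (var rows in new long[] { 10_000, 250_000, 1_000_000 })
{
    Console.WriteLine($"{rows} rows -> lists = {SuggestIvfFlatLists(rows)}");
}
```

For 10,000 rows this gives 100, which matches the documented default.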
EnsureSchema Guidance
- EnsureSchema = true: Development, testing, local Docker. Auto-provisions everything.
- EnsureSchema = false: Production. Schema managed by DBA/migration tools; fails fast with clear error if missing.
- For ivfflat, index creation can fail on empty tables (PostgreSQL/pgvector behavior). In that case, use Hnsw or create ivfflat after loading data.
- FailFastOnIndexCreationFailure = true (default): throws immediately if vector index creation fails.
- FailFastOnIndexCreationFailure = false: startup continues even if vector index creation fails.
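One way to follow this guidance is to derive the flag from the hosting environment instead of hard-coding it. The sketch below is an assumption about how an application might do that: ShouldEnsureSchema is a hypothetical helper, and DOTNET_ENVIRONMENT is the variable read by the .NET generic host.

```csharp
#nullable enable
using System;

// Hypothetical policy helper: auto-provision only when the environment is
// explicitly "Development". Anything else, including an unset variable,
// is treated like production and leaves the schema to migration tooling.
bool ShouldEnsureSchema(string? environmentName) =>
    string.Equals(environmentName, "Development", StringComparison.OrdinalIgnoreCase);

var environmentName = Environment.GetEnvironmentVariable("DOTNET_ENVIRONMENT");
var ensureSchema = ShouldEnsureSchema(environmentName);

// The result would then be assigned to PostgresOptions.EnsureSchema.
Console.WriteLine($"EnsureSchema = {ensureSchema}");
```

Defaulting to false when the environment is unknown keeps an accidental misconfiguration from creating tables or indexes in a production database.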
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net10.0
  - Mythosia.VectorDb.Abstractions (>= 2.0.0)
  - Npgsql (>= 10.0.1)
v10.1.0: Breaking — namespace now optional (VectorRecord/VectorFilter property). Removed NamespaceExistsAsync/CreateNamespaceAsync/DeleteNamespaceAsync. See RELEASE_NOTES.md for details.