Mythosia.VectorDb.Postgres 10.0.0

A PostgreSQL (pgvector) implementation of IVectorStore.
It uses a single-table design in which a collection column provides logical isolation.

Prerequisites

  • PostgreSQL 12+
  • pgvector extension installed:
CREATE EXTENSION IF NOT EXISTS vector;

Quick Start

using Mythosia.VectorDb.Postgres;

var store = new PostgresVectorStore(new PostgresVectorStoreOptions
{
    ConnectionString = "Host=localhost;Database=mydb;Username=postgres;Password=secret",
    Dimension = 1536,
    EnsureSchema = true,  // auto-creates the extension, table, and indexes
    Index = new HnswIndexOptions { M = 16, EfConstruction = 64, EfSearch = 40 }
});

ERD

erDiagram
    vectors {
        text collection PK "NOT NULL — logical collection name"
        text id PK "NOT NULL — unique record ID within collection"
        text namespace "NULL — optional tenant/scope isolation"
        text content "NULL — original text content"
        jsonb metadata "NOT NULL DEFAULT '{}' — arbitrary key-value pairs"
        vector embedding "NOT NULL — vector(dimension) for similarity search"
        timestamptz created_at "NOT NULL DEFAULT now()"
        timestamptz updated_at "NOT NULL DEFAULT now()"
    }

Single-table design: All collections share one table. The composite primary key (collection, id) ensures uniqueness per collection.

Indexes

| Index | Type | Target | Purpose |
|---|---|---|---|
| PK | btree | (collection, id) | Primary key / upsert conflict |
| idx_*_embedding | hnsw / ivfflat | embedding vector_*_ops | ANN similarity search (distance strategy dependent) |
| idx_*_metadata | gin | metadata | jsonb containment filter (@>) |
| idx_*_collection_ns | btree | (collection, namespace) | Namespace-scoped queries |

Schema

When EnsureSchema = true, the following is created automatically:

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS "public"."vectors" (
    collection  text        NOT NULL,
    id          text        NOT NULL,
    namespace   text        NULL,
    content     text        NULL,
    metadata    jsonb       NOT NULL DEFAULT '{}'::jsonb,
    embedding   vector(1536) NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    updated_at  timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (collection, id)
);

-- Indexes
CREATE INDEX IF NOT EXISTS idx_vectors_metadata
    ON "public"."vectors" USING gin (metadata);

CREATE INDEX IF NOT EXISTS idx_vectors_collection_ns
    ON "public"."vectors" (collection, namespace);

-- vector index (default: HNSW)
CREATE INDEX IF NOT EXISTS idx_vectors_embedding
    ON "public"."vectors" USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

Notes:

  • The vector index SQL changes by Index type (HnswIndexOptions / IvfFlatIndexOptions / NoIndexOptions).
  • The operator class changes by DistanceStrategy:
    • Cosine → vector_cosine_ops
    • Euclidean → vector_l2_ops
    • InnerProduct → vector_ip_ops

When EnsureSchema = false (recommended for production), the table must already exist.
An InvalidOperationException is thrown with a clear message if the table is missing.
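
The operator-class mapping in the notes above can be sketched in plain C#. This is only an illustration: the `Distance` enum and the `VectorIndexSql` helper are hypothetical names, not types from the package, and how the package builds its DDL internally is an assumption.

```csharp
using System;
using System.Globalization;

// Prints HNSW DDL in the same shape as the schema shown above.
Console.WriteLine(VectorIndexSql.HnswIndex("vectors", Distance.Cosine, 16, 64));

// Hypothetical stand-in for the package's distance strategy setting.
public enum Distance { Cosine, Euclidean, InnerProduct }

public static class VectorIndexSql
{
    // Maps a distance strategy to the pgvector operator class listed in the notes.
    public static string OperatorClass(Distance d) => d switch
    {
        Distance.Cosine => "vector_cosine_ops",
        Distance.Euclidean => "vector_l2_ops",
        Distance.InnerProduct => "vector_ip_ops",
        _ => throw new ArgumentOutOfRangeException(nameof(d))
    };

    // Builds the CREATE INDEX statement. The table name is formatted into the
    // SQL text, so it must be a trusted identifier, never user input.
    public static string HnswIndex(string table, Distance d, int m, int efConstruction) =>
        string.Create(CultureInfo.InvariantCulture,
            $"CREATE INDEX IF NOT EXISTS idx_{table}_embedding ON \"public\".\"{table}\" " +
            $"USING hnsw (embedding {OperatorClass(d)}) WITH (m = {m}, ef_construction = {efConstruction});");
}
```

Keeping the mapping in one switch expression means an unsupported strategy fails loudly instead of silently producing an index with the wrong operator class.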

Manual Schema Setup (Production)

For production deployments, create the schema manually before starting the application:

-- 1. Enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;

-- 2. Create table (adjust dimension as needed)
CREATE TABLE public.vectors (
    collection  text        NOT NULL,
    id          text        NOT NULL,
    namespace   text        NULL,
    content     text        NULL,
    metadata    jsonb       NOT NULL DEFAULT '{}'::jsonb,
    embedding   vector(1536) NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    updated_at  timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (collection, id)
);

-- 3. Indexes
CREATE INDEX idx_vectors_metadata
    ON public.vectors USING gin (metadata);

CREATE INDEX idx_vectors_collection_ns
    ON public.vectors (collection, namespace);

-- 4-A. Option A (recommended default): HNSW
CREATE INDEX idx_vectors_embedding
    ON public.vectors USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

-- 4-B. Option B: IVFFlat (create after loading data)
--      ivfflat requires rows to exist for training
-- CREATE INDEX idx_vectors_embedding
--     ON public.vectors USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- 5. Analyze for query planner (recommended)
ANALYZE public.vectors;

Options

| Option | Default | Description |
|---|---|---|
| ConnectionString | (required) | PostgreSQL connection string |
| Dimension | (required) | Embedding vector dimension (e.g., 1536 for OpenAI) |
| SchemaName | "public" | Database schema |
| TableName | "vectors" | Table name |
| EnsureSchema | false | Auto-create extension/table/indexes |
| DistanceStrategy | Cosine | Similarity metric (Cosine, Euclidean, InnerProduct) |
| Index | new HnswIndexOptions() | Vector index settings object (HnswIndexOptions, IvfFlatIndexOptions, NoIndexOptions) |
| HnswIndexOptions.M | 16 | HNSW build param (m), typical range 8-64 |
| HnswIndexOptions.EfConstruction | 64 | HNSW build param (ef_construction), typical range 32-400 |
| HnswIndexOptions.EfSearch | 40 | HNSW runtime ef_search default |
| IvfFlatIndexOptions.Lists | 100 | Number of IVF lists for the ivfflat index |
| IvfFlatIndexOptions.Probes | 10 | IVFFlat runtime probes default |
| FailFastOnIndexCreationFailure | true | Throw when vector index creation fails (recommended for production) |

Runtime Tuning Guide (DX)

  • IvfFlatSearchRuntimeOptions.Probes: increase for better recall, decrease for lower latency.
  • HnswSearchRuntimeOptions.EfSearch: increase for better recall, decrease for lower latency.
  • IvfFlatIndexOptions.Lists: start around sqrt(total_rows) and tune from there.

Use runtime options matching your index settings:

  • Index = new HnswIndexOptions(...) → HnswSearchRuntimeOptions
  • Index = new IvfFlatIndexOptions(...) → IvfFlatSearchRuntimeOptions

Recommended starting points:

| Goal | IvfFlatSearchRuntimeOptions.Probes | HnswSearchRuntimeOptions.EfSearch |
|---|---|---|
| Fast | 4 | 16 |
| Balanced | 10 | 40 |
| HighRecall | 32 | 120 |

These are practical starting points, not hard limits. Choose final values from production latency and recall measurements.
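
As a rough sketch of how the starting points above could be turned into session settings, the helper below maps a goal to the two values and formats a SET LOCAL statement. `RecallGoal` and `RuntimeTuning` are hypothetical names, not part of the package; `hnsw.ef_search` and `ivfflat.probes` are the pgvector settings these options presumably control, which is an assumption about the package's internals.

```csharp
using System;
using System.Globalization;

var (probes, efSearch) = RuntimeTuning.Preset(RecallGoal.Balanced);
Console.WriteLine(RuntimeTuning.SetLocal("ivfflat.probes", probes));   // SET LOCAL ivfflat.probes = 10
Console.WriteLine(RuntimeTuning.SetLocal("hnsw.ef_search", efSearch)); // SET LOCAL hnsw.ef_search = 40

// Hypothetical goal names mirroring the table above.
public enum RecallGoal { Fast, Balanced, HighRecall }

public static class RuntimeTuning
{
    // Starting points copied from the table above: (probes, ef_search).
    public static (int Probes, int EfSearch) Preset(RecallGoal goal) => goal switch
    {
        RecallGoal.Fast => (4, 16),
        RecallGoal.Balanced => (10, 40),
        RecallGoal.HighRecall => (32, 120),
        _ => throw new ArgumentOutOfRangeException(nameof(goal))
    };

    // SET LOCAL cannot take bind parameters, so the value is formatted into
    // the statement text. Only an allow-listed setting name and a positive
    // integer are accepted, which keeps the interpolation safe.
    public static string SetLocal(string setting, int value)
    {
        if (setting != "hnsw.ef_search" && setting != "ivfflat.probes")
            throw new ArgumentException("Unsupported setting.", nameof(setting));
        if (value < 1)
            throw new ArgumentOutOfRangeException(nameof(value));
        return "SET LOCAL " + setting + " = " + value.ToString(CultureInfo.InvariantCulture);
    }
}
```

SET LOCAL only takes effect inside a transaction and is reset when that transaction ends, so the statement has to run on the same connection and in the same transaction as the search query.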

Collection & Filter Behavior

  • Collections are stored as a collection column in a single shared table (not separate tables).
  • CreateCollectionAsync is a no-op — collections are implicitly created on upsert.
  • DeleteCollectionAsync deletes all rows matching the collection.
  • Namespace filter: WHERE namespace = @ns
  • Metadata filter: WHERE metadata @> @jsonb (jsonb containment, AND logic)
  • MinScore filter (distance-strategy dependent):
    • Cosine: 1 - (embedding <=> @q::vector) >= @minScore
    • Euclidean: 1 / (1 + (embedding <-> @q::vector)) >= @minScore
    • InnerProduct: -(embedding <#> @q::vector) >= @minScore
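
The three MinScore expressions above can be checked outside the database with a small C# sketch. It recomputes what the pgvector operators return (`<=>` cosine distance, `<->` Euclidean distance, `<#>` negative inner product) and applies the same score mapping. The `Scores` class is a hypothetical helper for illustration, not part of the package.

```csharp
using System;

// Identical unit vectors have cosine distance 0, so the score is 1.
Console.WriteLine(Scores.Cosine(new[] { 1f, 0f }, new[] { 1f, 0f }));       // 1
// A 3-4-5 triangle gives Euclidean distance 5, so the score is 1 / (1 + 5).
Console.WriteLine(Scores.Euclidean(new[] { 0f, 0f }, new[] { 3f, 4f }));
// 1*3 + 2*4 = 11
Console.WriteLine(Scores.InnerProduct(new[] { 1f, 2f }, new[] { 3f, 4f })); // 11

public static class Scores
{
    private static double Dot(float[] a, float[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Dimension mismatch.");
        double sum = 0;
        for (int i = 0; i < a.Length; i++) sum += (double)a[i] * b[i];
        return sum;
    }

    // Cosine: 1 - (embedding <=> q), where <=> is 1 - cosine similarity.
    public static double Cosine(float[] a, float[] b)
    {
        double distance = 1 - Dot(a, b) / (Math.Sqrt(Dot(a, a)) * Math.Sqrt(Dot(b, b)));
        return 1 - distance;
    }

    // Euclidean: 1 / (1 + (embedding <-> q)), which maps distance 0 to score 1.
    public static double Euclidean(float[] a, float[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Dimension mismatch.");
        double sumSquares = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double diff = (double)a[i] - b[i];
            sumSquares += diff * diff;
        }
        return 1 / (1 + Math.Sqrt(sumSquares));
    }

    // InnerProduct: -(embedding <#> q), since <#> returns the negated inner product.
    public static double InnerProduct(float[] a, float[] b)
    {
        double negativeInnerProduct = -Dot(a, b);
        return -negativeInnerProduct;
    }
}
```

Note that the three scores live on different scales: the Euclidean score falls in (0, 1], the cosine score in [-1, 1], and the inner-product score is unbounded, so a MinScore threshold does not carry over between strategies.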

RAG Integration

var store = await RagStore.BuildAsync(config => config
    .AddText("Your document text here", id: "doc-1")
    .UseLocalEmbedding(512)
    .UseVectorStore(new PostgresVectorStore(new PostgresVectorStoreOptions
    {
        ConnectionString = Environment.GetEnvironmentVariable("MYTHOSIA_PG_CONN")!,
        Dimension = 512,
        EnsureSchema = true,
        Index = new HnswIndexOptions()
    }))
    .WithTopK(5)
);

Performance Tips

  • ivfflat lists: as a rule of thumb, use lists = sqrt(total_rows). The default of 100 suits tables of up to ~10K rows.
  • Run ANALYZE vectors; after bulk inserts for optimal query plans.
  • For large datasets (1M+ rows), prefer an HNSW index (the default; CREATE INDEX ... USING hnsw) over ivfflat.
  • Use connection pooling (e.g., Npgsql connection string Pooling=true;Maximum Pool Size=20).
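
The lists rule of thumb above is simple arithmetic; a minimal sketch follows. `IvfFlatSizing` is a hypothetical helper, and the result is only a starting value to tune from measurements.

```csharp
using System;

Console.WriteLine(IvfFlatSizing.SuggestLists(10_000));    // 100
Console.WriteLine(IvfFlatSizing.SuggestLists(1_000_000)); // 1000

public static class IvfFlatSizing
{
    // lists = sqrt(total_rows), rounded to the nearest integer and never below 1.
    public static int SuggestLists(long totalRows)
    {
        if (totalRows < 0) throw new ArgumentOutOfRangeException(nameof(totalRows));
        return Math.Max(1, (int)Math.Round(Math.Sqrt(totalRows)));
    }
}
```

At 10,000 rows the rule yields exactly the default of 100, which is why the default stops being a good fit as the table grows past that size.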

EnsureSchema Guidance

  • EnsureSchema = true: Development, testing, local Docker — auto-provisions everything.
  • EnsureSchema = false: Production — schema managed by DBA/migration tools; fails fast with clear error if missing.
  • For ivfflat, index creation can fail on empty tables (PostgreSQL/pgvector behavior). In that case, use Hnsw or create ivfflat after loading data.
  • FailFastOnIndexCreationFailure = true (default): throws immediately if vector index creation fails.
  • FailFastOnIndexCreationFailure = false: startup continues even if vector index creation fails.

Target Frameworks

net10.0 is compatible. Computed: net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows.


| Version | Downloads | Last Updated |
|---|---|---|
| 10.1.0 | 30 | 3/6/2026 |
| 10.0.0 | 38 | 3/5/2026 |

v10.0.0: Initial release. Fixed the command lifecycle in SearchAsync/ApplySearchRuntimeSettingsAsync: commands are now disposed in block scope, which prevents Npgsql concurrency errors. SET LOCAL now uses string interpolation, because PostgreSQL does not support parameterized SET LOCAL.