WebFlux 0.1.3
WebFlux
A .NET SDK for preprocessing web content for RAG (Retrieval-Augmented Generation) systems.
Overview
WebFlux processes web content into chunks optimized for RAG systems. It handles web crawling, content extraction, and intelligent chunking with support for multiple content formats.
Installation
dotnet add package WebFlux
Quick Start
using WebFlux;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// Register your AI service implementations
services.AddScoped<ITextEmbeddingService, YourEmbeddingService>();
services.AddScoped<ITextCompletionService, YourLLMService>(); // Optional

// Add WebFlux
services.AddWebFlux();

var provider = services.BuildServiceProvider();
var processor = provider.GetRequiredService<IWebContentProcessor>();

// Process a website
await foreach (var result in processor.ProcessWithProgressAsync("https://example.com"))
{
    if (result.IsSuccess && result.Result != null)
    {
        foreach (var chunk in result.Result)
        {
            Console.WriteLine($"Chunk {chunk.ChunkIndex}: {chunk.Content}");
        }
    }
}
Features
- Interface-Based Design: Bring your own AI services (OpenAI, Anthropic, Azure, local models)
- Multiple Chunking Strategies: Auto, Smart, Semantic, Intelligent, MemoryOptimized, Paragraph, FixedSize
- Content Formats: HTML, Markdown, JSON, XML, PDF
- Web Standards: robots.txt, sitemap.xml, ai.txt, llms.txt, manifest.json
- Streaming: Process large websites incrementally via IAsyncEnumerable (consumed with await foreach)
- Parallel Processing: Concurrent crawling and processing
Chunking Strategies
| Strategy | Use Case |
|---|---|
| Auto | Automatically selects the best strategy based on the content |
| Smart | Structured HTML documentation |
| Semantic | General web pages and articles |
| Intelligent | Blogs and knowledge bases |
| MemoryOptimized | Large documents with memory constraints |
| Paragraph | Markdown with natural boundaries |
| FixedSize | Uniform chunks for testing |
Core Interfaces
You provide implementations for AI services:
public interface ITextEmbeddingService
{
    Task<double[]> GetEmbeddingAsync(string text, CancellationToken cancellationToken = default);
}

public interface ITextCompletionService // Optional, for content reconstruction
{
    Task<string> CompleteAsync(string prompt, CancellationToken cancellationToken = default);
}
WebFlux handles the rest: crawling, extraction, analysis, and chunking.
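For local testing without a real provider, a stand-in for ITextEmbeddingService might look like the sketch below. The interface is redeclared from the definition above so the snippet compiles on its own; `HashingEmbeddingService` and its 64-dimension default are hypothetical and not part of WebFlux, and the vectors it produces carry no semantic meaning.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Usage: embed a short string and inspect the vector length.
var service = new HashingEmbeddingService(dimensions: 64);
double[] vector = await service.GetEmbeddingAsync("hello web flux");
Console.WriteLine(vector.Length); // 64

// Interface as declared in this README (redeclared so the sketch is self-contained).
public interface ITextEmbeddingService
{
    Task<double[]> GetEmbeddingAsync(string text, CancellationToken cancellationToken = default);
}

// Hypothetical test double: hashes each word into a bucket, then L2-normalizes the counts.
public sealed class HashingEmbeddingService : ITextEmbeddingService
{
    private readonly int _dimensions;

    public HashingEmbeddingService(int dimensions = 64)
    {
        if (dimensions <= 0) throw new ArgumentOutOfRangeException(nameof(dimensions));
        _dimensions = dimensions;
    }

    public Task<double[]> GetEmbeddingAsync(string text, CancellationToken cancellationToken = default)
    {
        ArgumentNullException.ThrowIfNull(text);
        cancellationToken.ThrowIfCancellationRequested();

        var vector = new double[_dimensions];
        foreach (var word in text.Split(' ', StringSplitOptions.RemoveEmptyEntries))
        {
            // FNV-1a hash: stable across processes, unlike string.GetHashCode.
            uint hash = 2166136261;
            foreach (char c in word.ToLowerInvariant())
            {
                unchecked { hash = (hash ^ c) * 16777619; }
            }
            vector[hash % (uint)_dimensions] += 1.0;
        }

        // L2-normalize so every non-empty input yields a unit-length vector.
        double sumOfSquares = 0;
        foreach (var v in vector) sumOfSquares += v * v;
        double norm = Math.Sqrt(sumOfSquares);
        if (norm > 0)
        {
            for (int i = 0; i < vector.Length; i++) vector[i] /= norm;
        }

        return Task.FromResult(vector);
    }
}
```

A class like this could be registered in place of YourEmbeddingService in the Quick Start while wiring up the pipeline, then swapped for a real provider later.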
Configuration
// Reuses the processor resolved in Quick Start
var url = "https://example.com";

var options = new CrawlOptions
{
    MaxDepth = 3,
    MaxPages = 100,
    RespectRobotsTxt = true,
    UserAgent = "MyBot/1.0"
};

var chunkOptions = new ChunkingOptions
{
    Strategy = "Auto",
    MaxChunkSize = 512,
    OverlapSize = 64
};

await foreach (var result in processor.ProcessWithProgressAsync(url, options, chunkOptions))
{
    // Handle results
}
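To make the interplay of MaxChunkSize and OverlapSize concrete, here is a sketch of how fixed-size chunking with overlap is commonly implemented. It assumes sizes are measured in characters; WebFlux may count tokens instead, and its actual FixedSize strategy may behave differently. `FixedSizeChunker` is a hypothetical helper, not a WebFlux type.

```csharp
using System;
using System.Collections.Generic;

// Usage: 1200 characters, 512-character windows, 64-character overlap.
var text = new string('a', 1200);
foreach (var chunk in FixedSizeChunker.Split(text, maxChunkSize: 512, overlapSize: 64))
{
    Console.WriteLine(chunk.Length);
}

// Hypothetical helper illustrating a sliding window with overlap.
public static class FixedSizeChunker
{
    public static IEnumerable<string> Split(string text, int maxChunkSize, int overlapSize)
    {
        // Validate eagerly, before the lazy iterator starts.
        ArgumentNullException.ThrowIfNull(text);
        if (maxChunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunkSize));
        if (overlapSize < 0 || overlapSize >= maxChunkSize)
            throw new ArgumentOutOfRangeException(nameof(overlapSize));
        return SplitCore(text, maxChunkSize, overlapSize);
    }

    private static IEnumerable<string> SplitCore(string text, int maxChunkSize, int overlapSize)
    {
        // Each window starts (maxChunkSize - overlapSize) characters after the previous one.
        int step = maxChunkSize - overlapSize;
        for (int start = 0; start < text.Length; start += step)
        {
            int length = Math.Min(maxChunkSize, text.Length - start);
            yield return text.Substring(start, length);

            // Stop once a window reaches the end, so no chunk is pure overlap.
            if (start + length >= text.Length) yield break;
        }
    }
}
```

With these inputs the windows start at offsets 0, 448, and 896, so the chunk lengths are 512, 512, and 304. A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed.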
Documentation
- Tutorial - Step-by-step guide with practical examples
- Architecture - System design and pipeline
- Interfaces - API contracts and implementation guide
- Chunking Strategies - Detailed strategy guide
License
MIT License - see LICENSE file for details.
Support
- Issues: GitHub Issues
- Package: NuGet
| Framework | Status | Computed platform targets |
|---|---|---|
| net8.0 | Compatible | android, browser, ios, maccatalyst, macos, tvos, windows |
| net9.0 | Compatible | android, browser, ios, maccatalyst, macos, tvos, windows |
| net10.0 | Computed | android, browser, ios, maccatalyst, macos, tvos, windows |
Dependencies (identical for net8.0 and net9.0):
- HtmlAgilityPack (>= 1.12.4)
- Markdig (>= 0.42.0)
- Microsoft.Extensions.Caching.Abstractions (>= 9.0.9)
- Microsoft.Extensions.Caching.Memory (>= 9.0.9)
- Microsoft.Extensions.Configuration (>= 9.0.9)
- Microsoft.Extensions.Configuration.Abstractions (>= 9.0.9)
- Microsoft.Extensions.Configuration.Binder (>= 9.0.9)
- Microsoft.Extensions.DependencyInjection (>= 9.0.9)
- Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.9)
- Microsoft.Extensions.Http (>= 9.0.9)
- Microsoft.Extensions.Logging (>= 9.0.9)
- Microsoft.Extensions.Logging.Abstractions (>= 9.0.9)
- Microsoft.Playwright (>= 1.55.0)
- Polly (>= 8.6.4)
- Polly.Extensions.Http (>= 3.0.0)
- System.Text.Json (>= 9.0.9)
- System.Threading.Channels (>= 9.0.9)
- YamlDotNet (>= 16.3.0)
NuGet packages that depend on WebFlux (1):
| Package | Description |
|---|---|
| FluxIndex.SDK | FluxIndex SDK - Complete RAG infrastructure with FileFlux integration, FluxCurator preprocessing, and FluxImprover quality enhancement. AI providers are externally injectable. |