DeepSeek.OCR2.Core 0.3.2

There is a newer version of this package available.
See the version list below for details.

.NET CLI:
dotnet add package DeepSeek.OCR2.Core --version 0.3.2

Package Manager Console in Visual Studio (this command uses the NuGet module's version of Install-Package):
NuGet\Install-Package DeepSeek.OCR2.Core -Version 0.3.2

PackageReference (for projects that support PackageReference, copy this XML node into the project file to reference the package):
<PackageReference Include="DeepSeek.OCR2.Core" Version="0.3.2" />

Central Package Management (for projects that support CPM, copy the first XML node into the solution Directory.Packages.props file to version the package, and the second into the project file):
Directory.Packages.props:
<PackageVersion Include="DeepSeek.OCR2.Core" Version="0.3.2" />
Project file:
<PackageReference Include="DeepSeek.OCR2.Core" />

Paket:
paket add DeepSeek.OCR2.Core --version 0.3.2

#r directive (can be used in F# Interactive and Polyglot Notebooks; copy this into the interactive tool or the source code of the script to reference the package):
#r "nuget: DeepSeek.OCR2.Core, 0.3.2"

#:package directive (can be used in C# file-based apps starting in .NET 10 preview 4; copy this into a .cs file before any lines of code to reference the package):
#:package DeepSeek.OCR2.Core@0.3.2

Install as a Cake Addin:
#addin nuget:?package=DeepSeek.OCR2.Core&version=0.3.2

Install as a Cake Tool:
#tool nuget:?package=DeepSeek.OCR2.Core&version=0.3.2

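DeepSeek.OCR2 (.NET / NuGet) Wrapper Guide

This NuGet package provides two layers of wrapping:

  1. DeepSeekOcr2LocalServer: extracts a lightweight Python HTTP server script from the package and starts it as a child process (the model is loaded only once, which makes repeated calls cheaper).
  2. DeepSeekOcr2Client: runs OCR inference by calling POST /ocr over HTTP.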

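Prerequisites

  • Windows: Python does not need to be preinstalled. By default, the package downloads a portable Python on first run (3.10.11 by default) and bootstraps pip/venv.
  • Linux/macOS: preinstalling Python 3.10+ is recommended (or set PythonExecutablePath manually).
  • Inference dependencies (torch, transformers, and so on): by default, a venv is created and the dependencies are installed automatically on first run (CPU preset); offline wheels are also supported (see below).
  • For faster inference, install flash-attn following the upstream instructions (otherwise the server falls back automatically).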

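What bundling the dependencies can do

  • Default mode: on first run, a venv is created automatically and the dependencies are installed through pip (on Windows, a portable Python can be downloaded automatically).
  • Fully offline mode: the Python runtime, wheels/torch, and model weights are all packed into a single .nupkg (the package becomes very large and can usually only be published to a private NuGet feed).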

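License and attribution

  • The upstream repository DeepSeek-OCR-2 is licensed under the Apache License 2.0 (see LICENSE.txt in the repository root).
  • The default package (DeepSeek.OCR2.Core) only provides the .NET call wrapper and distributes the startup script; it does not contain model weights. For an offline model, use DeepSeek.OCR2.Assets.Model.* or DeepSeek.OCR2.Bundled.
  • Repository for this wrapper: https://github.com/ichichchch/DeepSeekOCR2.NET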

Minimal usage (start the local server and call OCR)

using System;
using DeepSeek.OCR2;

// Starts the local server and runs OCR on a single image file.
var result = await DeepSeekOcr2.RecognizeFileAsync(@"D:\test.jpg");
Console.WriteLine(result.Text);

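Note: on the first call to /ocr, the Python side may need to download the model from HuggingFace and finish initializing, which can take longer than the default HttpClient timeout of 100 seconds. You can raise the timeout through DeepSeekOcr2LocalServerOptions.OcrRequestTimeout (or set it to Timeout.InfiniteTimeSpan).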

using DeepSeek.OCR2;
using System;

var options = new DeepSeekOcr2LocalServerOptions
{
    OcrRequestTimeout = TimeSpan.FromMinutes(30),
    BootstrapDownloadTimeout = TimeSpan.FromMinutes(30),
};

var result = await DeepSeekOcr2.RecognizeFileAsync(@"D:\test.jpg", serverOptions: options);
Console.WriteLine(result.Text);

To reuse the same model process (faster across multiple calls):

using System;
using DeepSeek.OCR2;

// The session keeps one model process alive until it is disposed.
await using var session = await DeepSeekOcr2.CreateSessionAsync();

// The prompt must contain the <image> placeholder.
var request = DeepSeekOcr2Request.FromFile(@"D:\test.jpg") with { Prompt = "<image>\nFree OCR." };
var result = await session.Client.RecognizeAsync(request);
Console.WriteLine(result.Text);

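Torch auto-install options

  • TorchInstallPreset
    • Cpu: install from the official PyTorch CPU index (default)
    • None: do not install torch (suitable when you manage the Python environment yourself)
    • Cuda118: install from the official PyTorch cu118 index (matches the README example in this repository)
  • OfflineWheelDirectory: the offline wheel directory (passed to pip as --find-links <dir>)
  • PreferOfflineWheels: when true, --no-index is added as well (forces pip to look only in the offline directory)
  • TorchVersion/TorchVisionVersion/TorchAudioVersion: default to 2.6.0/0.21.0/2.6.0 and can be changed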
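
One way to picture how these options could map onto pip arguments is the sketch below. It is an illustration, not the package's actual implementation: the BuildPipArgs helper and its parameters are hypothetical, and the use of --index-url with the public PyTorch wheel indexes is an assumption. Only --find-links, --no-index, and the default versions come from the list above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper, NOT part of DeepSeek.OCR2.Core. It returns the arguments
// that would follow "python -m pip" for a given combination of options.
static List<string> BuildPipArgs(
    string preset,                 // "Cpu", "Cuda118" or "None" (mirrors TorchInstallPreset)
    string? offlineWheelDirectory, // mirrors OfflineWheelDirectory
    bool preferOfflineWheels)      // mirrors PreferOfflineWheels
{
    var pipArgs = new List<string>();
    if (preset == "None")
        return pipArgs; // torch is managed by the user, so nothing is installed

    pipArgs.AddRange(new[] { "install", "torch==2.6.0", "torchvision==0.21.0", "torchaudio==2.6.0" });

    if (offlineWheelDirectory != null)
    {
        pipArgs.Add("--find-links");
        pipArgs.Add(offlineWheelDirectory);
    }

    // Strict offline mode only makes sense when an offline directory is given.
    if (preferOfflineWheels && offlineWheelDirectory != null)
    {
        pipArgs.Add("--no-index");
    }
    else
    {
        // Assumption: the preset selects one of the public PyTorch wheel indexes.
        pipArgs.Add("--index-url");
        pipArgs.Add(preset == "Cuda118"
            ? "https://download.pytorch.org/whl/cu118"
            : "https://download.pytorch.org/whl/cpu");
    }
    return pipArgs;
}

Console.WriteLine(string.Join(" ", BuildPipArgs("Cpu", "wheels", preferOfflineWheels: true)));
```

With the arguments shown, the sketch prints `install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --find-links wheels --no-index`, which is the strictly offline case.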

Target frameworks

  • DeepSeek.OCR2: netstandard2.0 / net6.0 / net8.0 / net10.0

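Publishing to nuget.org (recommendations)

  • Owners: the package owners on nuget.org are ultimately determined by the account or organization used for the upload. We recommend using your organization account as the owner and adding or removing owners through the nuget.org admin pages. The Owners field in the project is shown only as metadata (some sites may ignore it).
  • RepositoryBranch/Commit: in CI (GitHub Actions), the package automatically reads GITHUB_REF_NAME / GITHUB_SHA and writes them into the package metadata. They can also be overridden explicitly in the pack command:
    • dotnet pack -p:RepositoryBranch=main -p:RepositoryCommit=<commitSha>
  • Automated publishing (GitHub Actions)
    • Add NUGET_API_KEY (an API key generated on nuget.org) to the repository Secrets
    • Pushing a tag matching v* (for example v0.1.7) triggers the nuget-publish release workflow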

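Local packing and pushing

  • Pack (meta-package plus dependency packages): pwsh .\pack.ps1 (outputs to dotnet/artifacts/)
  • Push to nuget.org: pwsh .\push.ps1 -ApiKey <key> (pushes the DeepSeek.OCR2* packages by default; add -IncludeBundled:$false to skip the very large package)
  • To push other packages: specify them explicitly with -PackageGlob
  • The nuget-publish GitHub Actions workflow in this repository packs and pushes DeepSeek.OCR2 (the meta-package) and its dependency packages (DeepSeek.OCR2.Core, DeepSeek.OCR2.Assets.*) when a v* tag is pushed.
  • It also packs DeepSeek.OCR2.Bundled (the single-package asset option). If you have prepared real bundled assets (model/torch/python), the package can be very large, so publishing it to a private NuGet feed is recommended.
  • After publishing, DeepSeek.OCR2.Core and DeepSeek.OCR2.Assets.* are automatically set to Unlisted (searches show only DeepSeek.OCR2, but the dependencies still restore normally).
  • NUGET_API_KEY must be configured in the repository Secrets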

You can also publish locally (this packs, pushes, and optionally unlists the internal packages):

pwsh .\publish-nuget.ps1 -Version 0.3.0

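Fully offline packages (model + torch + wheels + Python runtime)

A fully offline setup can be split across multiple NuGet packages. The advantage is that each package stays a manageable size and can be chosen as needed (for example, per platform or per torch version). The drawback is that publishing and version management become more complex, and there are more packages to download.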

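Recommended references (win-x64 / split asset packages, C1):

  • Reference a single package: DeepSeek.OCR2 (the meta-package, which automatically pulls in DeepSeek.OCR2.Core plus the Python/wheels/model asset packages)
  • DeepSeek.OCR2.Full.win-x64 is equivalent to DeepSeek.OCR2 and no longer receives new versions
  • For online installation only (no offline assets): reference DeepSeek.OCR2.Core
  • To compose packages manually: reference DeepSeek.OCR2.Core + DeepSeek.OCR2.Assets.* directly (choose platform/model packages as needed)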

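Recommended reference (single-package assets, C2):

  • Reference DeepSeek.OCR2.Bundled: one package containing the Python runtime, wheels, and a model snapshot (it still depends on DeepSeek.OCR2.Core). Suited to private NuGet feeds and offline distribution (the package can be very large).

Example (C2):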

<PackageReference Include="DeepSeek.OCR2.Bundled" Version="x.y.z" />

Directory layout (packed into the .nupkg and automatically copied to DeepSeek.OCR2/bundled/ in the consuming project's output directory):

  • dotnet/src/DeepSeek.OCR2/Bundled/python/<rid>/<version>/...
  • dotnet/src/DeepSeek.OCR2/Bundled/wheels/<rid>/*.whl
  • dotnet/src/DeepSeek.OCR2/Bundled/models/DeepSeek-OCR-2/...

Prepare the assets (this downloads a large amount of content):

pwsh .\bundle\prepare-bundled-assets.ps1 -TorchPreset cpu -ModelId deepseek-ai/DeepSeek-OCR-2

Then pack:

pwsh .\pack.ps1

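Runtime behavior:

  • If DeepSeek.OCR2/bundled/python/.../python.exe is detected: the bundled Python is used in preference, and nothing is downloaded.
  • If DeepSeek.OCR2/bundled/wheels/<rid> is detected and contains .whl files: it is used by default as the offline wheel source (--find-links). If PreferOfflineWheels=true is set explicitly, --no-index is added as well (strictly offline).
  • If ModelName is still the default deepseek-ai/DeepSeek-OCR-2 and DeepSeek.OCR2/bundled/models/DeepSeek-OCR-2 exists: the offline model directory is loaded in preference.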
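
The three checks above can be pictured with the following sketch. It is a rough approximation under stated assumptions, not the package's code: the helper names are hypothetical, the base directory is assumed to be the application's output directory, and the fallback descriptions are only placeholders for whatever the package does when nothing is bundled.

```csharp
#nullable enable
using System;
using System.IO;
using System.Linq;

// Hypothetical helpers that mirror the documented checks.

// Returns the first python.exe found under <baseDir>/DeepSeek.OCR2/bundled/python, or null.
static string? FindBundledPython(string baseDir)
{
    var root = Path.Combine(baseDir, "DeepSeek.OCR2", "bundled", "python");
    return Directory.Exists(root)
        ? Directory.EnumerateFiles(root, "python.exe", SearchOption.AllDirectories).FirstOrDefault()
        : null;
}

// Returns the wheels directory for the given RID only if it holds at least one .whl file.
static string? FindBundledWheels(string baseDir, string rid)
{
    var dir = Path.Combine(baseDir, "DeepSeek.OCR2", "bundled", "wheels", rid);
    return Directory.Exists(dir) && Directory.EnumerateFiles(dir, "*.whl").Any() ? dir : null;
}

// Returns the bundled model directory only when the default model name is in use.
static string? FindBundledModel(string baseDir, string modelName)
{
    var dir = Path.Combine(baseDir, "DeepSeek.OCR2", "bundled", "models", "DeepSeek-OCR-2");
    return modelName == "deepseek-ai/DeepSeek-OCR-2" && Directory.Exists(dir) ? dir : null;
}

var baseDir = AppContext.BaseDirectory;
var python = FindBundledPython(baseDir) ?? "(not bundled)";
var wheels = FindBundledWheels(baseDir, "win-x64") ?? "(not bundled)";
var model = FindBundledModel(baseDir, "deepseek-ai/DeepSeek-OCR-2") ?? "(not bundled)";
Console.WriteLine("python: " + python);
Console.WriteLine("wheels: " + wheels);
Console.WriteLine("model:  " + model);
```

Requiring at least one .whl file, rather than just an existing directory, keeps an empty or placeholder-only wheels folder from being treated as an offline source.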

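HTTP protocol

  • GET /health: health check (returns { "ok": true })
  • POST /ocr: JSON request body (key fields)
    • image_base64: image content (Base64)
    • prompt: the prompt (must contain <image>)
    • output_dir: optional, output directory
    • base_size / image_size / crop_mode / save_results: same as the parameters of the official model.infer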
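
For callers that want to talk to the server without DeepSeekOcr2Client, the protocol above could be exercised with plain HttpClient roughly as follows. This is a sketch under assumptions: the server address is supplied by the caller because the listening port is not stated here, the helper names are hypothetical, and the response is returned as raw text because its schema is not described above.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Builds the JSON body for POST /ocr using only field names listed above.
static string BuildOcrBody(byte[] imageBytes, string prompt)
{
    if (!prompt.Contains("<image>"))
        throw new ArgumentException("The prompt must contain <image>.", nameof(prompt));

    return JsonSerializer.Serialize(new
    {
        image_base64 = Convert.ToBase64String(imageBytes),
        prompt
    });
}

// Checks /health, then posts one image to /ocr and returns the raw response text.
static async Task<string> CallOcrAsync(Uri baseUri, string imagePath)
{
    // The first /ocr call may exceed HttpClient's default 100 second timeout.
    using var http = new HttpClient { BaseAddress = baseUri, Timeout = Timeout.InfiniteTimeSpan };

    using var health = await http.GetAsync("/health");
    health.EnsureSuccessStatusCode();

    var body = BuildOcrBody(File.ReadAllBytes(imagePath), "<image>\nFree OCR.");
    using var content = new StringContent(body, Encoding.UTF8, "application/json");
    using var response = await http.PostAsync("/ocr", content);
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsStringAsync();
}

// Usage: <program> <server base URL> <image path>
if (args.Length == 2)
    Console.WriteLine(await CallOcrAsync(new Uri(args[0]), args[1]));
else
    Console.WriteLine("usage: <server base URL> <image path>");
```

The optional fields (output_dir, base_size, image_size, crop_mode, save_results) are left out of the sketch. They could be added to the anonymous object in the same way once their expected types are confirmed against the server script.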

Target frameworks and dependencies

Target frameworks included in the package: netstandard2.0, net6.0, net8.0, net10.0. Compatibility with other frameworks is computed by NuGet from these.

  • .NETStandard 2.0
  • net6.0
    • No dependencies.
  • net8.0
    • No dependencies.
  • net10.0
    • No dependencies.

NuGet packages (5)

Showing the top 5 NuGet packages that depend on DeepSeek.OCR2.Core:

  • DeepSeek.OCR2: Meta-package that pulls DeepSeek.OCR2.Core plus offline assets packages.
  • DeepSeek.OCR2.Assets.Wheels.win-x64: Bundled offline wheels/torch for DeepSeek.OCR2 (win-x64). Copy-to-output via buildTransitive.
  • DeepSeek.OCR2.Assets.Python.win-x64: Bundled Python runtime for DeepSeek.OCR2 (win-x64). Copy-to-output via buildTransitive.
  • DeepSeek.OCR2.Bundled: Bundled python runtime + wheels/torch + model snapshot for DeepSeek.OCR2 (offline distribution).
  • DeepSeek.OCR2.Assets.Model: Bundled DeepSeek-OCR-2 model snapshot for DeepSeek.OCR2. Copy-to-output via buildTransitive.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
0.3.9 2,351 2/21/2026
0.3.8 326 2/20/2026
0.3.7 349 1/30/2026
0.3.6 300 1/30/2026
0.3.4 261 1/29/2026
0.3.3 264 1/29/2026
0.3.2 217 1/28/2026
0.3.1 228 1/28/2026
0.3.0 216 1/28/2026

- Fix offline wheels auto-detection (avoid --no-index when only placeholder exists)
- Add one-click API (DeepSeekOcr2/DeepSeekOcr2Session)
- Add netstandard2.0 target framework
- Add Windows auto Python+pip bootstrap (portable Python download)
- Add config template as contentFiles with copy-to-output