DeepSeek.OCR2.Core
0.3.2
.NET CLI: dotnet add package DeepSeek.OCR2.Core --version 0.3.2
Package Manager: NuGet\Install-Package DeepSeek.OCR2.Core -Version 0.3.2
PackageReference: <PackageReference Include="DeepSeek.OCR2.Core" Version="0.3.2" />
Central Package Management (Directory.Packages.props): <PackageVersion Include="DeepSeek.OCR2.Core" Version="0.3.2" />
Central Package Management (project file): <PackageReference Include="DeepSeek.OCR2.Core" />
Paket CLI: paket add DeepSeek.OCR2.Core --version 0.3.2
Script & Interactive: #r "nuget: DeepSeek.OCR2.Core, 0.3.2"
File-based apps: #:package DeepSeek.OCR2.Core@0.3.2
Cake Addin: #addin nuget:?package=DeepSeek.OCR2.Core&version=0.3.2
Cake Tool: #tool nuget:?package=DeepSeek.OCR2.Core&version=0.3.2
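DeepSeek.OCR2 (.NET / NuGet) wrapper notes
This NuGet package provides two layers of wrapping:
- DeepSeekOcr2LocalServer: extracts a lightweight Python HTTP server script from the package and starts it as a child process (the model is loaded only once, which makes repeated calls cheaper).
- DeepSeekOcr2Client: calls POST /ocr over HTTP to run OCR inference.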
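Prerequisites
- Windows: no preinstalled Python is required. By default the package downloads a portable Python (3.10.11 by default) on first run and bootstraps pip/venv.
- Linux/macOS: preinstalling Python 3.10+ is recommended (or set PythonExecutablePath manually).
- Inference dependencies (torch/transformers, etc.): by default a venv is created and the dependencies are installed on first run (CPU preset); offline wheels are also supported (see below).
- For faster inference, install flash-attn following the upstream instructions (otherwise the server falls back automatically).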
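What bundling dependencies can provide
- Default mode: on first run a venv is created automatically and dependencies are installed via pip (on Windows a portable Python can be downloaded automatically).
- Fully offline mode: pack the Python runtime + wheels/torch + model weights into the same .nupkg (the package becomes very large and can usually only be published to a private NuGet feed).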
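License and attribution
- The upstream repository DeepSeek-OCR-2 is licensed under the Apache License 2.0 (see LICENSE.txt in the repository root).
- The default package (DeepSeek.OCR2.Core) only provides the .NET wrapper and distributes the launch script; it does not include model weights. For an offline model, use DeepSeek.OCR2.Assets.Model.* or DeepSeek.OCR2.Bundled.
- Repository for this wrapper: https://github.com/ichichchch/DeepSeekOCR2.NET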
Minimal usage (start the local server + run OCR)
using DeepSeek.OCR2;
var result = await DeepSeekOcr2.RecognizeFileAsync(@"D:\test.jpg");
Console.WriteLine(result.Text);
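Note: on the first call to /ocr, the Python side may need to download the model from HuggingFace and finish initialization, which can take longer than HttpClient's default 100-second timeout. Increase the timeout via DeepSeekOcr2LocalServerOptions.OcrRequestTimeout (or set it to Timeout.InfiniteTimeSpan).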
using DeepSeek.OCR2;
using System;
var options = new DeepSeekOcr2LocalServerOptions
{
OcrRequestTimeout = TimeSpan.FromMinutes(30),
BootstrapDownloadTimeout = TimeSpan.FromMinutes(30),
};
var result = await DeepSeekOcr2.RecognizeFileAsync(@"D:\test.jpg", serverOptions: options);
Console.WriteLine(result.Text);
To reuse the same model process (faster across multiple calls):
using DeepSeek.OCR2;
await using var session = await DeepSeekOcr2.CreateSessionAsync();
var request = DeepSeekOcr2Request.FromFile(@"D:\test.jpg") with { Prompt = "<image>\nFree OCR." };
var result = await session.Client.RecognizeAsync(request);
Console.WriteLine(result.Text);
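Torch auto-install options
- TorchInstallPreset
  - Cpu: install from the official PyTorch CPU index (default)
  - None: do not install torch (suitable if you manage the Python environment yourself)
  - Cuda118: install from the official PyTorch cu118 index (matches the example in this repository's README)
- OfflineWheelDirectory: offline wheel directory (passed to pip as --find-links <dir>)
- PreferOfflineWheels: when true, --no-index is added as well (pip is forced to look only in the offline directory)
- TorchVersion/TorchVisionVersion/TorchAudioVersion: default to 2.6.0/0.21.0/2.6.0 and can be changed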
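To make the effect of OfflineWheelDirectory and PreferOfflineWheels on pip concrete, here is a minimal sketch of how those two options could translate into pip arguments. PipArgsSketch and Build are hypothetical names used only for illustration; they are not part of DeepSeek.OCR2.Core, and the package's actual command line may differ. The sketch also assumes that --no-index is only added when an offline directory is set.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, NOT part of DeepSeek.OCR2.Core. It only illustrates the
// documented mapping: OfflineWheelDirectory -> --find-links <dir>,
// PreferOfflineWheels -> additional --no-index.
public static class PipArgsSketch
{
    public static List<string> Build(
        IEnumerable<string> packages,
        string offlineWheelDirectory, // null or empty means "not set"
        bool preferOfflineWheels)
    {
        var args = new List<string> { "install" };

        if (!string.IsNullOrEmpty(offlineWheelDirectory))
        {
            // Let pip look for wheels in the offline directory.
            args.Add("--find-links");
            args.Add(offlineWheelDirectory);

            // Assumption: strict offline mode only makes sense when a
            // directory is available, so --no-index is added only here.
            if (preferOfflineWheels)
            {
                args.Add("--no-index");
            }
        }

        args.AddRange(packages);
        return args;
    }
}
```

Calling PipArgsSketch.Build with the default pins torch==2.6.0, torchvision==0.21.0 and torchaudio==2.6.0, a wheel directory and preferOfflineWheels set to true yields an argument list that starts with install --find-links <dir> --no-index, followed by the three pinned packages.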
Target frameworks
- DeepSeek.OCR2: netstandard2.0 / net6.0 / net8.0 / net10.0
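Publishing to nuget.org (recommendations)
- Owners: on nuget.org, package owners are ultimately determined by the account/organization used for the upload. Using your organization account as the owner is recommended, with owners added or removed through the nuget.org admin pages. The Owners field in the project is shown as metadata only (some sites may ignore it).
- RepositoryBranch/Commit: in CI (GitHub Actions) the package automatically reads GITHUB_REF_NAME/GITHUB_SHA and writes them into the package metadata; they can also be overridden explicitly in the pack command: dotnet pack -p:RepositoryBranch=main -p:RepositoryCommit=<commitSha>
- Automated publishing (GitHub Actions):
  - Add NUGET_API_KEY (an API key generated on nuget.org) to the repository Secrets
  - Pushing a tag v* (for example v0.1.7) triggers the nuget-publish workflow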
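Local packing/pushing
- Pack (meta-package + dependency packages): pwsh .\pack.ps1 (output goes to dotnet/artifacts/)
- Push to nuget.org: pwsh .\push.ps1 -ApiKey <key> (pushes the DeepSeek.OCR2* packages by default; add -IncludeBundled:$false to skip the very large package)
- To push other packages: specify them explicitly with -PackageGlob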
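Publishing to nuget.org (NuGet Gallery)
- This repository's GitHub Actions workflow nuget-publish packs and pushes DeepSeek.OCR2 (the meta-package) and its dependency packages (DeepSeek.OCR2.Core, DeepSeek.OCR2.Assets.*) when a tag v* is pushed.
- It also packs DeepSeek.OCR2.Bundled (the single-package asset option). If you have prepared real bundled assets (model/torch/python), the package can be very large; publishing it to a private NuGet feed is recommended.
- After publishing, DeepSeek.OCR2.Core and DeepSeek.OCR2.Assets.* are automatically set to Unlisted (users searching see only DeepSeek.OCR2, but the dependencies still restore normally).
- NUGET_API_KEY must be configured in the repository Secrets.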
You can also publish locally (this packs, pushes, and optionally unlists the internal packages):
pwsh .\publish-nuget.ps1 -Version 0.3.0
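Fully offline package (model + torch + wheels + Python runtime)
Full offline support can be split across multiple NuGet packages. Advantage: each package stays a manageable size and can be chosen as needed (for example per platform or per torch version). Drawback: publishing and version management become more complex, and more packages have to be downloaded.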
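Recommended reference (win-x64 / split asset packages, C1):
- Reference a single package: DeepSeek.OCR2 (meta-package; it automatically pulls in DeepSeek.OCR2.Core + the Python/wheels/model asset packages)
- DeepSeek.OCR2.Full.win-x64 is equivalent to DeepSeek.OCR2 and no longer receives new versions
- For online installation only (no offline assets): reference DeepSeek.OCR2.Core
- To combine packages manually: reference DeepSeek.OCR2.Core + DeepSeek.OCR2.Assets.* directly (choose the platform/model packages as needed)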
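Recommended reference (single-package assets, C2):
- Reference DeepSeek.OCR2.Bundled: a single package containing the Python runtime + wheels + model snapshot (it still depends on DeepSeek.OCR2.Core). Suited to private NuGet feeds and offline distribution (the package can be very large).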
Example (C2):
<PackageReference Include="DeepSeek.OCR2.Bundled" Version="x.y.z" />
Directory layout (packed into the .nupkg and automatically copied to DeepSeek.OCR2/bundled/ in the consuming project's output directory):
- dotnet/src/DeepSeek.OCR2/Bundled/python/<rid>/<version>/...
- dotnet/src/DeepSeek.OCR2/Bundled/wheels/<rid>/*.whl
- dotnet/src/DeepSeek.OCR2/Bundled/models/DeepSeek-OCR-2/...
Prepare the assets (this downloads a large amount of content):
pwsh .\bundle\prepare-bundled-assets.ps1 -TorchPreset cpu -ModelId deepseek-ai/DeepSeek-OCR-2
Then pack:
pwsh .\pack.ps1
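Runtime behavior:
- If DeepSeek.OCR2/bundled/python/.../python.exe is detected: the bundled Python is preferred and nothing is downloaded.
- If DeepSeek.OCR2/bundled/wheels/<rid> is detected and contains .whl files: it is used as the offline wheel source by default (--find-links); explicitly setting PreferOfflineWheels=true additionally adds --no-index (strictly offline).
- If ModelName is still the default deepseek-ai/DeepSeek-OCR-2 and DeepSeek.OCR2/bundled/models/DeepSeek-OCR-2 exists: the offline model directory is loaded in preference.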
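The three detection rules above can be sketched as a small probe over the output directory. This is an illustration under assumptions, not the package's actual implementation: BundledAssetProbe and Probe are hypothetical names, and because the path below bundled/python is elided in the text, the sketch simply searches that folder recursively for python.exe.

```csharp
#nullable enable
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, NOT part of DeepSeek.OCR2.Core. It mirrors the three
// documented checks; each tuple element is null when the asset is absent.
public static class BundledAssetProbe
{
    public const string DefaultModelName = "deepseek-ai/DeepSeek-OCR-2";

    public static (string? PythonExe, string? WheelDir, string? ModelDir) Probe(
        string outputDir, string rid, string modelName)
    {
        string bundled = Path.Combine(outputDir, "DeepSeek.OCR2", "bundled");

        // 1) Bundled Python: any python.exe below bundled/python
        //    (assumption: a recursive search stands in for the elided path).
        string pythonRoot = Path.Combine(bundled, "python");
        string? pythonExe = Directory.Exists(pythonRoot)
            ? Directory.EnumerateFiles(pythonRoot, "python.exe", SearchOption.AllDirectories)
                       .FirstOrDefault()
            : null;

        // 2) Offline wheels: the directory must exist AND contain a .whl file.
        string wheels = Path.Combine(bundled, "wheels", rid);
        string? wheelDir = Directory.Exists(wheels) && Directory.EnumerateFiles(wheels, "*.whl").Any()
            ? wheels
            : null;

        // 3) Offline model: only used while ModelName is still the default.
        string model = Path.Combine(bundled, "models", "DeepSeek-OCR-2");
        string? modelDir = modelName == DefaultModelName && Directory.Exists(model)
            ? model
            : null;

        return (pythonExe, wheelDir, modelDir);
    }
}
```

A caller would pass its own output directory (for example AppContext.BaseDirectory) and a runtime identifier such as win-x64, then fall back to the download/online path for every element that comes back null.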
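HTTP protocol
- GET /health: health check (returns { "ok": true })
- POST /ocr: JSON request body (key fields)
  - image_base64: image content (Base64)
  - prompt: prompt text (must contain <image>)
  - output_dir: optional, output directory
  - base_size/image_size/crop_mode/save_results: same as the parameters of the official model.infer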
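For callers who want to talk to the server without DeepSeekOcr2Client, the protocol above can be exercised with plain HttpClient. The following is a minimal sketch: OcrHttpSketch and its methods are hypothetical names, the server's base address is an assumption supplied by the caller, and the response is returned as a raw string because the response schema is not described here. Only the two required-looking fields (image_base64, prompt) are sent.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper, NOT part of DeepSeek.OCR2.Core.
public static class OcrHttpSketch
{
    // Builds the JSON body for POST /ocr from the documented key fields.
    public static string BuildOcrBody(byte[] imageBytes, string prompt)
    {
        if (!prompt.Contains("<image>"))
        {
            throw new ArgumentException("The prompt must contain <image>.", nameof(prompt));
        }

        var body = new Dictionary<string, string>
        {
            ["image_base64"] = Convert.ToBase64String(imageBytes),
            ["prompt"] = prompt,
        };
        return JsonSerializer.Serialize(body);
    }

    // Interprets the GET /health payload, documented as { "ok": true }.
    public static bool IsHealthy(string healthJson)
    {
        using var doc = JsonDocument.Parse(healthJson);
        return doc.RootElement.TryGetProperty("ok", out var ok)
            && ok.ValueKind == JsonValueKind.True;
    }

    // Sends the body to <baseAddress>/ocr and returns the raw response text.
    public static async Task<string> PostOcrAsync(HttpClient http, Uri baseAddress, string body)
    {
        using var content = new StringContent(body, Encoding.UTF8, "application/json");
        using var response = await http.PostAsync(new Uri(baseAddress, "/ocr"), content);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

Because the first request may trigger a model download, a caller using this sketch would likely want to raise HttpClient.Timeout above its 100-second default before calling PostOcrAsync.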
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
| .NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
| .NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen40 was computed. tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.0
  - System.Text.Json (>= 8.0.5)
- net10.0
  - No dependencies.
- net6.0
  - No dependencies.
- net8.0
  - No dependencies.
NuGet packages (5)
Showing the top 5 NuGet packages that depend on DeepSeek.OCR2.Core:
| Package | Description |
|---|---|
| DeepSeek.OCR2 | Meta-package that pulls DeepSeek.OCR2.Core plus offline assets packages. |
| DeepSeek.OCR2.Assets.Wheels.win-x64 | Bundled offline wheels/torch for DeepSeek.OCR2 (win-x64). Copy-to-output via buildTransitive. |
| DeepSeek.OCR2.Assets.Python.win-x64 | Bundled Python runtime for DeepSeek.OCR2 (win-x64). Copy-to-output via buildTransitive. |
| DeepSeek.OCR2.Bundled | Bundled python runtime + wheels/torch + model snapshot for DeepSeek.OCR2 (offline distribution). |
| DeepSeek.OCR2.Assets.Model | Bundled DeepSeek-OCR-2 model snapshot for DeepSeek.OCR2. Copy-to-output via buildTransitive. |
GitHub repositories
This package is not used by any popular GitHub repositories.
- Fix offline wheels auto-detection (avoid --no-index when only placeholder exists)
- Add one-click API (DeepSeekOcr2/DeepSeekOcr2Session)
- Add netstandard2.0 target framework
- Add Windows auto Python+pip bootstrap (portable python download)
- Add config template as contentFiles with copy-to-output