Moonbox.PdfTextExtractor
1.1.2
There is a newer version of this package available.
See the version list below for details.
dotnet add package Moonbox.PdfTextExtractor --version 1.1.2
NuGet\Install-Package Moonbox.PdfTextExtractor -Version 1.1.2
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="Moonbox.PdfTextExtractor" Version="1.1.2" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="Moonbox.PdfTextExtractor" Version="1.1.2" />
<PackageReference Include="Moonbox.PdfTextExtractor" />
For projects that support Central Package Management (CPM), copy the PackageVersion node into the solution's Directory.Packages.props file to version the package, and the PackageReference node into the project file.
paket add Moonbox.PdfTextExtractor --version 1.1.2
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
#r "nuget: Moonbox.PdfTextExtractor, 1.1.2"
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package Moonbox.PdfTextExtractor@1.1.2
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=Moonbox.PdfTextExtractor&version=1.1.2
#tool nuget:?package=Moonbox.PdfTextExtractor&version=1.1.2
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
Moonbox.PdfTextExtractor
Moonbox.PdfTextExtractor is a .NET library designed to extract text from PDF documents using a combination of PDF parsing and OCR (Optical Character Recognition).
Package Information
- ID: Moonbox.PdfTextExtractor
- Version: 1.1.1
- Authors: Moonbox.PdfTextExtractor
Features
- Extracts text from native PDFs using PdfPig.
- Performs OCR on scanned PDFs with Tesseract.
- Uses Magick.NET for image preprocessing.
- Supports .NET 9.0.
Dependencies
This package relies on the following libraries:
- Magick.NET-Q16-AnyCPU (14.7.0)
- PdfPig (0.1.11)
- System.Drawing.Common (9.0.8)
- Tesseract (5.2.0)
- Tesseract.Drawing (5.2.0)
Installation
Install via NuGet Package Manager:
dotnet add package Moonbox.PdfTextExtractor --version 1.1.1
Or add to your .csproj file:
<PackageReference Include="Moonbox.PdfTextExtractor" Version="1.1.1" />
Usage Example
using System;
using Moonbox.PdfTextExtractor;

class Program
{
    static void Main()
    {
        // Create the extractor and read all text from the PDF.
        var extractor = new PdfTextExtractor();
        string text = extractor.ExtractText("sample.pdf");

        Console.WriteLine(text);
    }
}
Notes
- For scanned documents, ensure tessdata (Tesseract language files) is available in your project.
- Works best with high-resolution PDFs for accurate OCR.
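One common way to make tessdata available at run time is to copy it to the build output. The fragment below is a minimal sketch for an SDK-style .csproj. It assumes a tessdata folder at the project root that holds the .traineddata files. Whether this library looks for tessdata next to the executable is also an assumption, since the README does not say where the folder is expected.

```xml
<!-- Assumption: a "tessdata" folder at the project root holds the .traineddata files. -->
<!-- Assumption: the library resolves tessdata relative to the output directory. -->
<ItemGroup>
  <None Update="tessdata\**\*">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </None>
</ItemGroup>
```

With this in place, a build copies the folder alongside the compiled application, so a relative tessdata path resolves the same way in development and after publishing.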
License
This project's license details were not included in the package. Please check the repository for license information.
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net9.0
  - Magick.NET-Q16-AnyCPU (>= 14.10.0)
  - PdfPig (>= 0.1.11)
  - System.Drawing.Common (>= 9.0.8)
  - Tesseract (>= 5.2.0)
  - Tesseract.Drawing (>= 5.2.0)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.