Camelot.Sharp.Cmoski
0.0.3
.NET CLI:
    dotnet add package Camelot.Sharp.Cmoski --version 0.0.3

Package Manager:
    NuGet\Install-Package Camelot.Sharp.Cmoski -Version 0.0.3
This command is intended for the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference:
    <PackageReference Include="Camelot.Sharp.Cmoski" Version="0.0.3" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Central Package Management:
    <PackageVersion Include="Camelot.Sharp.Cmoski" Version="0.0.3" />
    <PackageReference Include="Camelot.Sharp.Cmoski" />
For projects that support Central Package Management (CPM), copy the PackageVersion node into the solution's Directory.Packages.props file to set the package version, and the version-less PackageReference node into the project file to reference the package.

Paket CLI:
    paket add Camelot.Sharp.Cmoski --version 0.0.3
The NuGet Team does not provide support for this client. Please contact its maintainers for support.

Script & Interactive:
    #r "nuget: Camelot.Sharp.Cmoski, 0.0.3"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy it into the interactive tool or the source code of the script to reference the package.

File-based Apps:
    #:package Camelot.Sharp.Cmoski@0.0.3
The #:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy it into a .cs file before any lines of code to reference the package.

Cake:
    #addin nuget:?package=Camelot.Sharp.Cmoski&version=0.0.3
    #tool nuget:?package=Camelot.Sharp.Cmoski&version=0.0.3
The first line installs the package as a Cake addin, the second as a Cake tool. The NuGet Team does not provide support for this client. Please contact its maintainers for support.
Camelot.Sharp
This is a maintained fork of BobLd's camelot-sharp with bug fixes and improvements.
Original author: BobLd
Maintained by: cmoski
Recent Changes
- Fixed Lattice parser text splitting for multi-column tables
- Improved vertical proximity detection for text grouping
[Rest of original README...]
Usage
Stream mode
    using (PdfDocument doc = PdfDocument.Open(@"Files\foo.pdf", new ParsingOptions() { ClipPaths = true }))
    {
        // Extract tables from the first page with the Stream parser.
        Stream stream = new Stream();
        var tables = stream.ExtractTables(doc.GetPage(1));
        Assert.Single(tables);

        // Page size in PDF points (612 x 792 is US Letter).
        Assert.Equal((612, 792), stream.Dimensions);
        Assert.Equal(612, stream.PdfWidth);
        Assert.Equal(792, stream.PdfHeight);
        //Assert.Equal(84, stream.HorizontalText.Count);

        var parsingReport = tables[0].ParsingReport();
        // parsing_report = {"accuracy": 99.02, "whitespace": 12.24, "order": 1, "page": 1}
        // Note: the two lines below assign values to the report; they do not assert them.
        parsingReport["order"] = 1;
        parsingReport["page"] = 1;
    }
Lattice mode
    using (var doc = PdfDocument.Open(@"Files\column_span_2.pdf", new ParsingOptions() { ClipPaths = true }))
    {
        var page = doc.GetPage(1);

        // The Lattice parser needs an image processor and a drawing processor.
        Lattice lattice = new Lattice(new OpenCvImageProcesser(), new BasicSystemDrawingProcessor(), line_scale: 40);

        // Pass layout analysis options for grouping text into blocks.
        var tables = lattice.ExtractTables(page,
            layout_kwargs: new DlaOptions[]
            {
                new DocstrumBoundingBoxes.DocstrumBoundingBoxesOptions()
                {
                    WithinLineMultiplier = 2
                }
            });

        Assert.Single(tables);
        Assert.Equal(DataLatticeShiftTextLeftTop.Length, tables[0].Cells.Count);
        Assert.Equal(DataLatticeShiftTextLeftTop, tables[0].Data().Select(r => r.Select(c => c).ToArray()).ToArray());
    }
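Both examples above end with a list of tables whose `Data()` method yields rows of cell text. As a minimal sketch of what one might do with those rows outside a test, the helper below turns them into CSV text. `TableCsv`, `EscapeCell`, and `ToCsv` are hypothetical names introduced here for illustration; they are not part of Camelot.Sharp. The sketch also assumes that `Data()` returns rows of strings, which the comparison in the Lattice example suggests but does not confirm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Camelot.Sharp.
public static class TableCsv
{
    // Quote a cell when it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled, as in RFC 4180.
    public static string EscapeCell(string cell)
    {
        cell = cell ?? string.Empty;
        bool needsQuotes = cell.IndexOfAny(new[] { ',', '"', '\n', '\r' }) >= 0;
        return needsQuotes ? "\"" + cell.Replace("\"", "\"\"") + "\"" : cell;
    }

    // Join each row's cells with commas and the rows with line feeds.
    public static string ToCsv(IEnumerable<IEnumerable<string>> rows)
    {
        return string.Join("\n", rows.Select(r => string.Join(",", r.Select(EscapeCell))));
    }
}
```

Inside either `using` block, a call such as `TableCsv.ToCsv(tables[0].Data())` would then produce the CSV text, provided the assumption about the return type of `Data()` holds. Keeping the helper independent of the library means it can be tried on plain string arrays first.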
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
| .NET Core | netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.1 is compatible. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
Dependencies
.NETStandard 2.1
- PdfPig (>= 0.1.11)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on Camelot.Sharp.Cmoski:
| Package | Downloads |
|---|---|
| Camelot.Sharp.OpenCvSharp4.Cmoski: A C# library to extract tabular data from PDFs (port of camelot Python version using PdfPig). Contains OpenCvSharp4 for image processing used in the Lattice parser. Maintained fork with fixes for multi-column text splitting. | |
GitHub repositories
This package is not used by any popular GitHub repositories.