Eliassen.Apache.Tika 0.1.85

This package has a SemVer 2.0.0 package version: 0.1.85+2.
.NET CLI:

```shell
dotnet add package Eliassen.Apache.Tika --version 0.1.85
```

Package Manager Console in Visual Studio (this command uses the NuGet module's version of `Install-Package`):

```powershell
NuGet\Install-Package Eliassen.Apache.Tika -Version 0.1.85
```

PackageReference, for projects that support it (copy this XML node into the project file):

```xml
<PackageReference Include="Eliassen.Apache.Tika" Version="0.1.85" />
```

Paket CLI:

```shell
paket add Eliassen.Apache.Tika --version 0.1.85
```

Script and interactive use. The `#r` directive works in F# Interactive and Polyglot Notebooks; copy it into the interactive tool or the source code of the script:

```fsharp
#r "nuget: Eliassen.Apache.Tika, 0.1.85"
```

Cake:

```csharp
// Install Eliassen.Apache.Tika as a Cake Addin
#addin nuget:?package=Eliassen.Apache.Tika&version=0.1.85

// Install Eliassen.Apache.Tika as a Cake Tool
#tool nuget:?package=Eliassen.Apache.Tika&version=0.1.85
```

# Eliassen.Apache.Tika

## Summary

The Eliassen.Apache.Tika library provides functionality for content type detection and document conversion using Apache Tika. It offers a set of classes and methods for integrating Apache Tika services with .NET applications.

## Installation

To use this library in your .NET project, add a reference to the Eliassen.Apache.Tika NuGet package.

## Usage

### Content Type Detection

The TikaContentTypeDetector class provides methods for asynchronously detecting the content type of a stream using Apache Tika.

```csharp
using Eliassen.Apache.Tika;

// Detect content type asynchronously.
// `stream` is an open, readable System.IO.Stream supplied by the caller.
string contentType = await TikaContentTypeDetector.DetectContentTypeAsync(stream);
```


## Document Conversion

The library includes several conversion handlers for converting documents to HTML format using Apache 
Tika. Each handler supports specific document formats.
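
Because each handler targets one source format, a caller usually has to choose a handler from the detected content type. The following is a minimal, self-contained sketch of that selection step. The lookup table and the `ResolveHandlerName` helper are hypothetical and not part of the library. The MIME strings are the standard registered types for these formats, but whether each handler checks for exactly these values is an assumption. Handler names are kept as plain strings so the sketch runs without the package.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lookup table: MIME type -> name of a handler class from this README.
// Assumption: these standard MIME types are the ones each handler accepts.
var handlerByContentType = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["application/vnd.openxmlformats-officedocument.wordprocessingml.document"] = "TikaDocxToHtmlConversionHandler",
    ["application/pdf"] = "TikaPdfToHtmlConversionHandler",
    ["application/vnd.oasis.opendocument.text"] = "TikaOdtToHtmlConversionHandler",
    ["application/rtf"] = "TikaRtfToHtmlConversionHandler",
};

// Hypothetical helper: returns the handler name for a content type,
// or "(no handler)" when the type is not in the table.
string ResolveHandlerName(string contentType) =>
    handlerByContentType.TryGetValue(contentType, out var name) ? name : "(no handler)";

Console.WriteLine(ResolveHandlerName("application/pdf")); // prints "TikaPdfToHtmlConversionHandler"
Console.WriteLine(ResolveHandlerName("image/png"));       // prints "(no handler)"
```

In a real application the table would more likely map to handler instances than to names, and the key would be the content type returned by the detection step.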

### Word Documents (DOCX)

```csharp
using Eliassen.Apache.Tika;

// Convert DOCX document to HTML
TikaDocxToHtmlConversionHandler handler = new TikaDocxToHtmlConversionHandler();
await handler.ConvertAsync(sourceStream, sourceContentType, destinationStream, destinationContentType);
```

### PDF Documents

```csharp
using Eliassen.Apache.Tika;

// Convert PDF document to HTML
TikaPdfToHtmlConversionHandler handler = new TikaPdfToHtmlConversionHandler();
await handler.ConvertAsync(sourceStream, sourceContentType, destinationStream, destinationContentType);
```

### OpenDocument Text (ODT) Documents

```csharp
using Eliassen.Apache.Tika;

// Convert ODT document to HTML
TikaOdtToHtmlConversionHandler handler = new TikaOdtToHtmlConversionHandler();
await handler.ConvertAsync(sourceStream, sourceContentType, destinationStream, destinationContentType);
```

### Rich Text Format (RTF) Documents

```csharp
using Eliassen.Apache.Tika;

// Convert RTF document to HTML
TikaRtfToHtmlConversionHandler handler = new TikaRtfToHtmlConversionHandler();
await handler.ConvertAsync(sourceStream, sourceContentType, destinationStream, destinationContentType);
```

## Extensibility

Developers can support custom document conversion requirements by inheriting from the `TikaToHtmlConversionBaseHandler` or `TikaConversionHandlerBase` class.

| Product | Compatible and additional computed target framework versions |
| --- | --- |
| .NET | net8.0 is compatible. net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, and net8.0-windows were computed. |

NuGet packages (1)

Showing the top 1 NuGet packages that depend on Eliassen.Apache.Tika:

- Eliassen.Common.Extensions

GitHub repositories

This package is not used by any popular GitHub repositories.

| Version | Downloads | Last updated |
| --- | --- | --- |
| 0.1.85 | 55 | 10/10/2024 |
| 0.1.84 | 49 | 10/10/2024 |
| 0.1.83 | 54 | 9/27/2024 |
| 0.1.82 | 151 | 8/23/2024 |
| 0.1.81 | 122 | 8/1/2024 |
| 0.1.81-dev-gh-pipline.3 | 50 | 8/1/2024 |