Html2Xhtml is a .NET 4.0 library for converting HTML to XHTML, licensed under GPLv2 or later.

I tested Html2Xhtml in the local reconstruction of a large online database of the European Union. Tidy and Tidy.NET failed to produce valid output most of the time, and Chilkat's HTML-to-XML was somewhat slow and produced strange results (misplaced, missing, or unexplainable elements). I created this library in an attempt to find a free, fast, and reliable conversion tool. It converts 2 to 4 times faster than the other libraries I tested.

Html2Xhtml, combined with the power of LINQ to XML, is an excellent tool for all large-scale data extraction and web crawling scenarios.
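As a rough illustration of the LINQ to XML side of that workflow, the sketch below queries a small XHTML string for link targets. It does not call Html2Xhtml itself, since the library's API is not described here; the hard-coded string is a hypothetical stand-in for converter output, and the element and attribute names are chosen only for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class LinkExtractionSketch
{
    static void Main()
    {
        // Assumption: this string stands in for XHTML produced by a converter.
        // It is hard-coded here and is not actual Html2Xhtml output.
        string xhtml =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>" +
            "<a href=\"/a\">A</a><a>no target</a><a href=\"/b\">B</a>" +
            "</body></html>";

        // Well-formed XHTML can be loaded directly by LINQ to XML.
        XDocument doc = XDocument.Parse(xhtml);

        // XHTML elements live in this namespace, so queries must qualify names.
        XNamespace ns = "http://www.w3.org/1999/xhtml";

        // Collect the href of every <a> element, skipping anchors without one.
        var links = doc.Descendants(ns + "a")
                       .Select(a => (string)a.Attribute("href"))
                       .Where(href => href != null)
                       .ToList();

        foreach (string link in links)
        {
            Console.WriteLine(link);
        }
    }
}
```

The namespace qualification is the detail most often missed: querying for a plain "a" returns nothing when the document declares the XHTML namespace.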

Package Manager: Install-Package Html2Xhtml -Version
.NET CLI: dotnet add package Html2Xhtml --version
PackageReference: <PackageReference Include="Html2Xhtml" Version="" />
(For projects that support PackageReference, copy this XML node into the project file to reference the package.)
Paket CLI: paket add Html2Xhtml --version


This package has no dependencies.

Version History

Downloads: 7,238
Last updated: 6/4/2011