32 packages returned for Tags:"scraping"

IronWebScraper - The C# Web Scraping Library
IronWebScraper is a C# web scraping library that lets developers simulate & automate human browsing behavior to extract content, files & images from web applications as native .NET objects. IronWebScraper manages politeness & multithreading in the background, leaving a developer’s own...
Scraping framework containing a web client able to simulate a web browser and an HtmlAgilityPack extension to select elements using CSS selectors (like jQuery).
dcsoup HTML Parser
dcsoup is a .NET library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods. This library is basically a port of jsoup, a Java HTML parser library. See also: http://jsoup.org/ API reference is...
Turn unstructured HTML pages into structured data. The OpenScraping library can extract information from HTML pages using a JSON config file with XPath rules. It can scrape even complex multi-level objects such as tables and forum posts.
SgmlReader for Portable Library. SgmlReader is an SGML markup language parser derived from System.Xml.XmlReader in the .NET CLR, but its most popular use is as an HTML parser (in other words, a scraper). /* Use SgmlReader in HTML parse mode. */ XDocument document = SgmlReader.Parse(stream); Done!
A .NET Standard library to extract the main content of a web page, based on a port of the Readability library by Mozilla. It also determines and gathers metadata about the content, such as language, author, main image, etc.