GroupDocs.Search 24.9.0

.NET CLI:

    dotnet add package GroupDocs.Search --version 24.9.0

Package Manager (intended for the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package):

    NuGet\Install-Package GroupDocs.Search -Version 24.9.0

PackageReference (for projects that support PackageReference, copy this XML node into the project file):

    <PackageReference Include="GroupDocs.Search" Version="24.9.0" />

Paket CLI:

    paket add GroupDocs.Search --version 24.9.0

Script & Interactive (the #r directive can be used in F# Interactive and Polyglot Notebooks; copy it into the interactive tool or the source code of the script):

    #r "nuget: GroupDocs.Search, 24.9.0"

Cake:

    // Install GroupDocs.Search as a Cake Addin
    #addin nuget:?package=GroupDocs.Search&version=24.9.0

    // Install GroupDocs.Search as a Cake Tool
    #tool nuget:?package=GroupDocs.Search&version=24.9.0

Advanced Document Search & Indexing .NET API


Product Page Docs API Ref Examples Blog Releases Support License


GroupDocs.Search for .NET is a comprehensive library enabling developers to build advanced search and indexing capabilities into their .NET applications. It supports a wide range of document formats and provides features such as semantic search, entity recognition, sentiment analysis, and custom entity extraction. With its flexible API and support for various search types, developers can easily implement powerful search functionalities, enhance data analysis, and gain valuable insights from their documents.

Creating an Index

  • Index Directory: Specify a directory where index data will be stored.
  • Memory-Based Indexing: Option to create an index in memory for faster search (but not persistent).
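
The two options above can be sketched in C# roughly as follows. This is a minimal sketch that assumes the GroupDocs.Search package is referenced; the folder path and the class name are placeholders chosen for illustration.

```csharp
using Index = GroupDocs.Search.Index; // alias avoids ambiguity with System.Index

class CreateIndexExample
{
    static void Main()
    {
        // Disk-based index: index data is stored in the given folder (placeholder path)
        Index diskIndex = new Index(@"c:\MyIndex");

        // In-memory index: the parameterless constructor keeps all index data in memory,
        // so nothing is persisted once the process exits
        Index memoryIndex = new Index();
    }
}
```

An in-memory index may suit short-lived tasks or tests, while a folder-based index is the natural choice when the index has to survive application restarts.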

Adding Documents to an Index

  • Individual Files: Add files one by one using their paths.
  • Directory Scanning: Add all supported documents within a specified directory and its subdirectories.
  • File Streams: Index documents directly from streams for flexibility.
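
As a rough illustration of the three ways of adding documents listed above, the sketch below assumes the GroupDocs.Search package is referenced. All file and folder paths are placeholders, and the document key "Notes.docx" is an arbitrary identifier chosen for this example.

```csharp
using System;
using System.IO;
using GroupDocs.Search.Common;
using GroupDocs.Search.Options;
using Index = GroupDocs.Search.Index; // alias avoids ambiguity with System.Index

class AddDocumentsExample
{
    static void Main()
    {
        Index index = new Index(@"c:\MyIndex");

        // Individual file: pass the path of a single document (placeholder path)
        index.Add(@"c:\MyDocuments\Report.pdf");

        // Directory scanning: pass a folder path to index the documents it contains
        index.Add(@"c:\MyDocuments\Archive\");

        // File stream: wrap an open stream in a Document and add it with indexing options
        using (Stream stream = File.OpenRead(@"c:\MyDocuments\Notes.docx"))
        {
            Document document = Document.CreateFromStream(
                "Notes.docx",   // document key (arbitrary identifier for this example)
                DateTime.Now,   // modification date reported for the document
                ".docx",        // extension used to select the parser
                stream);
            index.Add(new Document[] { document }, new IndexingOptions());
        }
    }
}
```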

Updating and Maintaining an Index

  • Incremental Updates: Add or remove individual documents without rebuilding the entire index.
  • Index Optimization: Improve search performance by optimizing the index structure periodically.
  • Index Backup and Restore: Create backups of the index for safety and restore them if needed.
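
A minimal sketch of the update and optimization steps is shown below, assuming the GroupDocs.Search package is referenced and that the placeholder folders exist. Backup and restore are not shown here.

```csharp
using Index = GroupDocs.Search.Index; // alias avoids ambiguity with System.Index

class MaintainIndexExample
{
    static void Main()
    {
        // Placeholder paths
        Index index = new Index(@"c:\MyIndex");
        index.Add(@"c:\MyDocuments\");

        // ... files in c:\MyDocuments\ are changed, added, or deleted here ...

        // Bring the index in line with the current state of the indexed folder
        // without rebuilding it from scratch
        index.Update();

        // Merge index segments to keep search performance from degrading
        index.Optimize();
    }
}
```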

Basic Search Functionality

  • Full-Text Search: Index and search within the body text of documents.
  • Metadata Search: Search based on document metadata (author, title, keywords, etc.).
  • Supported File Formats: Search across various file types (DOCX, PDF, HTML, etc.).
  • Simple Query Syntax: Use basic search terms and phrases.

Advanced Search Options

  • Boolean Operators: Combine search terms using AND, OR, NOT for precise queries.
  • Wildcards: Use * to match any character sequence, ? to match a single character.
  • Regular Expressions: Employ powerful pattern matching for complex searches.
  • Fuzzy Search: Find near matches even with typos or slight variations.
  • Proximity Search: Search for words within a specified distance of each other.
  • Field Search: Search within specific document fields or properties.
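
Two of the options above, Boolean operators and fuzzy search, might look like this in code. This is a sketch under the assumption that the GroupDocs.Search package is referenced; the paths and query words are placeholders, and the limit of one mistake per word is an arbitrary choice for the example.

```csharp
using System;
using GroupDocs.Search.Options;
using GroupDocs.Search.Results;
using Index = GroupDocs.Search.Index; // alias avoids ambiguity with System.Index

class AdvancedSearchExample
{
    static void Main()
    {
        // Placeholder paths
        Index index = new Index(@"c:\MyIndex");
        index.Add(@"c:\MyDocuments\");

        // Boolean search in text form: both words must occur in a document
        SearchResult booleanResult = index.Search("theory AND relativity");
        Console.WriteLine("Boolean search, documents found: " + booleanResult.DocumentCount);

        // Fuzzy search: tolerate up to one mistake in each query word
        SearchOptions options = new SearchOptions();
        options.FuzzySearch.Enabled = true;
        options.FuzzySearch.FuzzyAlgorithm = new TableDiscreteFunction(1);
        SearchResult fuzzyResult = index.Search("relativity", options);
        Console.WriteLine("Fuzzy search, documents found: " + fuzzyResult.DocumentCount);
    }
}
```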

Filtering and Sorting Search Results

  • Filter by Metadata: Narrow down results based on metadata values.
  • Filter by Date Range: Limit results to documents within a specific time frame.
  • Sort by Relevance: Order results based on their relevance to the search query.
  • Sort by Date: Order results by document creation or modification date.
  • Custom Sorting: Implement custom sorting logic based on specific criteria.

Working with Metadata

  • Metadata Extraction: Automatically extract and index metadata during indexing.
  • Metadata Search: Search and filter results based on extracted metadata.
  • Metadata Display: Include metadata in search results for additional context.

Highlighting and Snippets

  • Search Term Highlighting: Visually highlight search terms within the retrieved documents.
  • Search Result Snippets: Display a short excerpt of text surrounding the search terms to provide context.

Semantic Search and Entity Recognition

  • Semantic Search: Understand the meaning behind search queries and retrieve conceptually relevant results, going beyond keyword matching.
  • Entity Recognition: Identify and extract named entities like people, organizations, locations, dates, and more from text.

Sentiment Analysis

  • Sentiment Determination: Analyze text to determine the overall sentiment expressed, classifying it as positive, negative, or neutral.
  • Sentiment Scoring: Assign sentiment scores to text, indicating the degree of positivity or negativity.

Document Classification

  • Text Classification: Categorize documents into predefined classes or categories based on their content.
  • Custom Classification: Train custom classification models to categorize documents according to specific criteria or taxonomies.

Custom Entity Extraction

  • Custom Entity Definition: Define custom entity types and extraction rules based on specific requirements.
  • Custom Entity Extraction: Extract user-defined entities from text using the defined rules and patterns.

Advanced Search API Features

  • Creating Custom Analyzers: Tailor text processing during indexing and searching. Define tokenization rules, stemming algorithms, stop words, and more.
  • Configuring Indexing Options: Control which file types and parts of documents are indexed. Set indexing depth and update frequency.
  • Implementing Search Result Ranking: Customize the scoring algorithm to prioritize certain results. Factor in metadata, document structure, or other custom criteria.
  • Semantic Search and Entity Recognition: Understand the meaning behind search queries. Identify named entities (people, organizations, locations, etc.) within text.
  • Sentiment Analysis: Determine the overall sentiment (positive, negative, neutral) expressed in text.
  • Document Classification: Categorize documents based on their content.
  • Custom Entity Extraction: Define and extract custom entities from text.

Supported Document Formats

Document Type | Description | Searchable Data | Supported Versions | Notes
Word Processing
DOC | Microsoft Word® Document | Content and metadata | Microsoft Word® 97+
DOT | Microsoft Word® Document Template | Content and metadata | Microsoft Word® 97+
DOCX | Office Open XML Document | Content and metadata
DOCM | Office Open XML Document [Macro-enabled] | Content and metadata
DOTX | Office Open XML Document Template | Content and metadata
DOTM | Office Open XML Document Template [Macro-enabled] | Content and metadata
TXT | Plain text | Content and metadata
ODT | Open Document Text | Content and metadata
OTT | Open Document Text Template | Content and metadata
RTF | Rich Text Format | Content and metadata | 1.9
PDF
PDF | Portable Document Format File | Content and metadata
Markup
HTML | Hypertext Markup Language File | Content and metadata
XHTML | Extensible Hypertext Markup Language File | Content and metadata
MHTML | MIME HTML File | Content and metadata | - | Not supported by .NET Core version in Linux
MD | Markdown | Content and metadata
XML | XML File | Content and metadata
Ebooks
CHM | Compiled HTML Help File | Content and metadata | 1.4
EPUB | Open eBook File | Content and metadata | 2.0, 3.0, 3.1
FB2 | FictionBook 2.0 File | Content and metadata | 2.0
Spreadsheets
XLS | Microsoft Excel® Spreadsheet | Content and metadata | Microsoft Excel® 97+
XLT | Microsoft Excel® Spreadsheet Template | Content and metadata | Microsoft Excel® 97+
XLSX | Office Open XML Spreadsheet | Content and metadata
XLSM | Office Open XML Spreadsheet [Macro-enabled] | Content and metadata
XLSB | Office Open XML Spreadsheet [Binary] | Content and metadata
XLTX | Office Open XML Spreadsheet Template | Content and metadata
XLTM | Office Open XML Spreadsheet Template [Macro-enabled] | Content and metadata
XLA | Microsoft Excel® 97-2003 Add-In | Content and metadata
XLAM | Microsoft Excel® Open XML Add-In | Content and metadata
ODS | Open Document Spreadsheet | Content and metadata
OTS | Open Document Spreadsheet Template | Content and metadata
CSV | Comma Separated Values | Content and metadata
TSV | Tab Separated Values | Content and metadata
XML | SpreadsheetML | Content and metadata
Presentations
PPT | PowerPoint® Presentation | Content and metadata | Microsoft PowerPoint® 97+
PPS | PowerPoint® Slideshow | Content and metadata | Microsoft PowerPoint® 97+
POT | PowerPoint® Template | Content and metadata | Microsoft PowerPoint® 97+
PPTX | Office Open XML Presentation | Content and metadata
PPTM | Office Open XML Presentation [Macro-enabled] | Content and metadata
POTX | Office Open XML Presentation Template | Content and metadata
POTM | Office Open XML Presentation Template [Macro-enabled] | Content and metadata
PPSX | Office Open XML Presentation Slideshow | Content and metadata
PPSM | Office Open XML Presentation Slideshow [Macro-enabled] | Content and metadata
ODP | Open Document Presentation | Content and metadata
Emails
PST | Outlook Personal Information Store File | Content and metadata
OST | Outlook Offline Data File | Content and metadata
EML | E-Mail Message | Content and metadata
EMLX | Apple Mail Message | Content and metadata
MSG | Outlook Mail Message | Content and metadata
Notes
OneNote® | OneNote® Document | Content and metadata | Local files of Microsoft OneNote® 2010-2016 | Not supported by .NET Core version in Linux
Archives
ZIP | Zipped File | Content and metadata
Audio
MP3 | MPEG-2 Audio Layer III | Metadata only
WAV | Waveform Audio File Format | Metadata only
Images
BMP | Bitmap Picture | Content and metadata
GIF | Graphical Interchange Format File | Content and metadata
JP2 | JPEG 2000 Core Image File | Content and metadata
PNG | Portable Network Graphics | Content and metadata
WEBP | WebP Image Format File | Content and metadata
TIFF | Tagged Image File Format | Content and metadata
EMF | Enhanced Windows Metafile | Content and metadata
WMF | Windows Metafile | Content and metadata
JPG | JPEG Image | Content and metadata
PSD | Adobe Photoshop Document | Content and metadata
DJVU | DjVu Image | Content and metadata
Project Management
MPP | Microsoft Project File | Metadata only
Torrents
TORRENT | BitTorrent File | Metadata only
Diagrams
VSD | Visio® Drawing File | Metadata only
VSS | Visio® Stencil File | Metadata only
Medicine
DCM | DICOM Image | Metadata only
DICOM | DICOM Image | Metadata only
Videos
AVI | Audio Video Interleave File | Metadata only
MOV | Apple QuickTime Movie | Metadata only
QT | Apple QuickTime Movie | Metadata only
FLV | Animate Video File | Metadata only
ASF | Advanced Systems Format File | Metadata only

Supported Search Types

  • Simple word search: Searches for the exact occurrence of a word in the indexed documents.
  • Boolean search: Combines multiple search terms using logical operators like AND, OR, and NOT.
  • Regular expression search: Uses patterns and expressions to search for complex text structures.
  • Faceted search: Filters search results based on specific categories or fields.
  • Case sensitive search: Differentiates between uppercase and lowercase characters in the search query.
  • Flexible fuzzy search: Finds words with similar spelling, allowing for minor typing or spelling errors.
  • Synonym search: Searches for words and their synonyms to expand search results.
  • Homophone search: Finds words that sound the same but have different spellings.
  • Wildcard search: Uses placeholders like * or ? to match varying characters or word fragments.
  • Phrase search with wildcards: Searches for a specific phrase while allowing variations with wildcards.
  • Search for different word forms: Matches different grammatical forms of a word, such as plural or tense variations.
  • Date range search: Filters documents based on a specific date or a range of dates.
  • Numeric range search: Finds data within a specified numeric range.
  • Search by chunks (pages): Searches within specific sections or pages of a document.
  • Search for different object types: Searches across various data types, such as text, numbers, dates, file names, and metadata.
  • Combining different types of search into one search query: Mixes multiple search types, such as combining Boolean and wildcard searches in one query.
  • Alias substitution in search queries: Replaces defined aliases with their full meanings during the search.
  • Spell check during search: Automatically corrects minor spelling mistakes in the search query.
  • Keyboard layout correction during search: Adjusts the search query for different keyboard layouts or language settings.
  • Search queries in text or flexible object form: Accepts both textual and structured object-based search queries.
  • Highlighting search results: Highlights the found terms or phrases directly in the document.
  • Multiple simultaneous thread-safe searches: Allows multiple searches to run concurrently without conflicts.
  • Thread-safe search during indexing, updating, or merging operations: Ensures safe searching while the index is being modified.
  • Search over several indexes simultaneously: Performs searches across multiple indexes in a single operation.
  • Reverse image search: Finds images based on similarity or matching image characteristics rather than text.

System Requirements

Supported Platforms/Versions

Supported Operating Systems
  • Windows: Microsoft Windows 2003 Server (x64, x86), Microsoft Windows 2008 Server (x64, x86), Microsoft Windows 2012 Server (x64, x86), Microsoft Windows 2012 R2 Server (x64, x86), Microsoft Windows 2016 Server (x64, x86), Microsoft Windows 2019 Server (x64, x86), Microsoft Windows Vista (x64, x86), Microsoft Windows XP (x64, x86), Microsoft Windows 7 (x64, x86), Microsoft Windows 8, 8.1 (x64, x86), Microsoft Windows 10 (x64, x86)
  • Linux: Linux (Ubuntu, OpenSUSE, CentOS, and others)

Supported Frameworks
  • .NET Frameworks: .NET Framework 4.5, 4.5.1, 4.5.2, 4.6, 4.6.1, 4.6.2, 4.7, 4.7.1, 4.7.2, 4.8, .NET Standard 2.1, .NET Core 3.0, .NET Core 3.1, .NET 5.0, .NET 6.0

Development Environments
  • Visual Studio Versions: Microsoft Visual Studio 2012, 2013, 2015, 2017, 2019, 2022

Install via NuGet

Using Package Manager GUI

  1. Open your solution in Visual Studio.
  2. Go to Tools > NuGet Package Manager > Manage NuGet Packages for Solution.
  3. In the Browse tab, search for GroupDocs.Search.
  4. Click Install to add it to your project.

Using Package Manager Console

  1. Open your solution in Visual Studio.
  2. Go to Tools > NuGet Package Manager > Package Manager Console.
  3. Run the command:
    Install-Package GroupDocs.Search
    
  4. GroupDocs.Search will be referenced in your project.

Install from Official Website

  1. Download and unpack the ZIP or use the MSI installer from the official website.
  2. In Visual Studio, right-click References and select Add Reference.
  3. Browse and select GroupDocs.Search.dll, or choose from the installed components.
  4. Click OK to complete the reference.

Indexing Documents from URL Using GroupDocs.Search for .NET

Learn how to index a document from a URL using GroupDocs.Search for .NET. This example demonstrates lazy initialization of documents from a URL and indexing them for efficient search in .NET applications.

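    // Namespaces used by this example
    using System;
    using System.IO;
    using System.Net;
    using GroupDocs.Search.Common;
    using GroupDocs.Search.Options;
    using Index = GroupDocs.Search.Index; // alias avoids ambiguity with System.Index

    // Class to load a document from a URL with lazy initialization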
    private class DocumentLoaderFromUrl : IDocumentLoader
    {
        private readonly string documentKey; // Document identifier (URL in this case)
        private readonly string url; // The URL to fetch the document from
        private readonly string extension; // The file extension of the document
        
        // Constructor to initialize document properties
        public DocumentLoaderFromUrl(string documentKey, string url, string extension)
        {
            this.documentKey = documentKey;
            this.url = url;
            this.extension = extension;
        }

        // Method to load the document from the URL stream
        public Document LoadDocument()
        {
            // Configure security protocols for web requests
            // (SSL 3.0, TLS 1.0, and TLS 1.1 are deprecated, so only TLS 1.2 is enabled)
            ServicePointManager.Expect100Continue = true;
            ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

            // Create a web request to access the URL
            WebRequest request = WebRequest.Create(url);
            using (WebResponse response = request.GetResponse())
            using (Stream stream = response.GetResponseStream())
            {
                // Copy the stream into memory
                MemoryStream memoryStream = new MemoryStream();
                stream.CopyTo(memoryStream);
                memoryStream.Position = 0;

                // Create a Document object from the memory stream
                Document document = Document.CreateFromStream(documentKey, DateTime.Now, extension, memoryStream);
                return document;
            }
        }

        // Method to close the document (empty in this case)
        public void CloseDocument()
        {
        }
    }

        // Define the index folder path where the index will be stored
        string indexFolder = @"c:\MyIndex";

        // Define the URL of the document to be indexed
        string url = "http://example.com/ExampleDocument.pdf";

        // Creating an index in the specified folder
        Index index = new Index(indexFolder);

        // Creating a document loader object to fetch the document from the URL
        string documentKey = url;
        IDocumentLoader documentLoader = new DocumentLoaderFromUrl(documentKey, url, ".pdf");

        // Creating a lazy-initialized document object
        Document document = Document.CreateLazy(DocumentSourceKind.Stream, documentKey, documentLoader);

        // Prepare an array of documents for indexing
        Document[] documents = new Document[] { document };

        // Indexing options (default options in this case)
        IndexingOptions options = new IndexingOptions();

        // Add the lazy-loaded document to the index
        index.Add(documents, options);


Homophone Search Using GroupDocs.Search for .NET

Learn how to perform homophone search using GroupDocs.Search for .NET. This code example demonstrates enabling homophone search to find similar-sounding words like "coal," "cole," and "kohl" in indexed documents.

// Specify the index folder path where the index will be created
string indexFolder = @"c:\MyIndex\";

// Specify the folder path containing documents to be indexed
string documentsFolder = @"c:\MyDocuments\";

// Creating an index in the specified folder
Index index = new Index(indexFolder);

// Adding documents to the index from the specified folder
index.Add(documentsFolder);

// Creating search options to enable homophone search
SearchOptions options = new SearchOptions();
options.UseHomophoneSearch = true; // Enabling homophone search

// Search for the word 'coal' in the indexed documents
// Homophone search will also find words that sound like 'coal', such as 'cole' and 'kohl'
SearchResult result = index.Search("coal", options);
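
To inspect what the search returned, the result can be walked document by document. The helper below is a hypothetical addition (the class and method names are not part of the library) and assumes the GroupDocs.Search package is referenced; it could be called as SearchResultPrinter.Print(result) after the search above.

```csharp
using System;
using GroupDocs.Search.Results;

static class SearchResultPrinter
{
    // Hypothetical helper: writes a short summary of a search result to the console
    public static void Print(SearchResult result)
    {
        Console.WriteLine("Documents found: " + result.DocumentCount);
        Console.WriteLine("Total occurrences: " + result.OccurrenceCount);

        for (int i = 0; i < result.DocumentCount; i++)
        {
            // Each found document exposes its path and its own occurrence count
            FoundDocument document = result.GetFoundDocument(i);
            Console.WriteLine(document.DocumentInfo.FilePath + ": " + document.OccurrenceCount);
        }
    }
}
```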


Perform Shard Optimization using GroupDocs.Search for .NET

Learn how to optimize shards in a search network using GroupDocs.Search for .NET. This example demonstrates how to improve search performance by minimizing the number of index segments on each shard through the optimization process.

// Inform the user that the optimization process is starting
Console.WriteLine("Optimizing shards");

// Access the Indexer class for the current search network node
Indexer indexer = node.Indexer; // Assuming 'node' is defined elsewhere in your search network

// Create optimization options
OptimizeOptions options = new OptimizeOptions();

// Perform the optimization process on all shards
indexer.Optimize(options); // This reduces the number of index segments on each shard


Tags

Aspose | GroupDocs | Advanced Document Search | Indexing API | .NET Search Library | Semantic Search API | Boolean Search | Fuzzy Search | Metadata Search | Entity Recognition API | Sentiment Analysis | Custom Entity Extraction | Document Classification API | Full-Text Search | Field Search | Regular Expressions Search | Proximity Search | Custom Search Ranking | Indexing Optimization | Distributed Search Network | Reverse Image Search | Search API | .NET Document Search | Document Indexing API | GroupDocs.Search for .NET | Text Search API | Search Results Highlighting | Document Metadata Search | Snippets Extraction | Wildcard Search | Search API for .NET

Product | Compatible and additional computed target framework versions
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed.
.NET Core | netcoreapp3.0 was computed. netcoreapp3.1 was computed.
.NET Standard | netstandard2.1 is compatible.
.NET Framework | net45 is compatible. net451 was computed. net452 was computed. net46 was computed. net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed.
MonoAndroid | monoandroid was computed.
MonoMac | monomac was computed.
MonoTouch | monotouch was computed.
Tizen | tizen60 was computed.
Xamarin.iOS | xamarinios was computed.
Xamarin.Mac | xamarinmac was computed.
Xamarin.TVOS | xamarintvos was computed.
Xamarin.WatchOS | xamarinwatchos was computed.

Version Downloads Last updated
24.9.0 709 9/11/2024
24.8.0 1,035 8/22/2024
24.6.0 1,011 6/20/2024
24.5.0 636 5/14/2024
24.4.0 1,316 4/18/2024
24.3.0 2,570 3/20/2024
24.2.1 886 2/29/2024
24.2.0 694 2/28/2024
24.1.0 538 1/26/2024
23.12.0 2,000 12/4/2023
23.11.0 3,404 11/23/2023
23.10.1 1,999 10/20/2023
23.10.0 1,461 10/2/2023
23.6.0 4,440 6/15/2023
23.2.0 4,293 2/28/2023
22.11.0 2,347 11/24/2022
22.10.1 2,359 10/12/2022
22.10.0 1,438 10/7/2022
21.8.1 38,928 8/23/2021
21.8.0 1,548 8/18/2021
21.3.0 40,929 3/18/2021
21.2.0 26,436 2/18/2021
20.11.0 35,420 11/19/2020
20.8.0 68,126 8/17/2020
20.6.0 62,544 6/23/2020
20.4.0 65,265 4/15/2020
20.1.0 52,476 1/31/2020
19.10.1 56,907 11/6/2019
19.10.0 1,004 10/2/2019
19.5.1 842 7/15/2019
19.5.0 803 5/31/2019
19.3.0 840 3/6/2019
19.2.0 921 2/5/2019
18.12.0 1,086 12/11/2018
18.9.0 1,125 9/6/2018
18.8.0 1,302 8/8/2018
18.7.0 1,177 7/14/2018
18.6.0 1,243 6/14/2018
18.5.0 1,139 5/16/2018
18.4.0 1,287 4/9/2018
18.2.0 1,254 2/8/2018
18.1.0 1,252 1/9/2018
17.12.0 1,472 12/7/2017
17.11.0 1,239 11/9/2017
17.10.0 1,123 10/3/2017