PercolatorMatching 1.1.0
dotnet add package PercolatorMatching --version 1.1.0
NuGet\Install-Package PercolatorMatching -Version 1.1.0
<PackageReference Include="PercolatorMatching" Version="1.1.0" />
paket add PercolatorMatching --version 1.1.0
#r "nuget: PercolatorMatching, 1.1.0"
// Install PercolatorMatching as a Cake Addin
#addin nuget:?package=PercolatorMatching&version=1.1.0
// Install PercolatorMatching as a Cake Tool
#tool nuget:?package=PercolatorMatching&version=1.1.0
A simple DLL containing a matching class that compares two strings and calculates a similarity score between them using the Ratcliff/Obershelp algorithm.
Product | Versions |
---|---|
.NET Framework | net is compatible. |
This package has no dependencies.
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
1.1.0 | 19,224 | 4/7/2015 |
I originally built this when I found out that the Fuzzy Lookup and Fuzzy Grouping components of SSIS were only available in the Enterprise edition of SQL Server. I've used it to scan database tables for possible duplicate entries and write the results to another table for a user to review later, and in other applications as well.
Reference the DLL and import the namespace "Percolator.Matching", then create a new instance of "Fuzzylator".
The "ThresholdPercentage" is the threshold that the score of the two strings must meet for them to be deemed similar. It can be set when creating the object or changed later. If no threshold is set, it defaults to "Zero" percent.
There are several overloads of the "IsSimilar" method to accommodate a couple of different scenarios.
-- During every check a score is calculated. The optional out parameter returns that score, so the caller can reuse it rather than calculating the same score again later.
-- An optional ThresholdPercentage can be passed to a single method call to use that percentage, rather than the one set on the instance, for that call only.
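The overload pattern described above (an instance-level threshold, an optional out score, and a per-call threshold override) could be sketched roughly as follows. This is not the library's source: `ThresholdSketch`, `SimilarityCheckerSketch`, the enum's numeric values, and the assumption that scores run from 0.0 to 1.0 are all hypothetical stand-ins for illustration.

```csharp
using System;

// Hypothetical stand-in for ThresholdPercentage; the real enum's members and values may differ.
public enum ThresholdSketch { Zero = 0, Eighty = 80, Ninety = 90 }

// Hypothetical stand-in for Fuzzylator, showing only the overload pattern.
public sealed class SimilarityCheckerSketch
{
    // Assumed to return a score between 0.0 and 1.0.
    private readonly Func<string, string, double> _scorer;

    // Instance-level threshold; can be set in the constructor or changed later.
    public ThresholdSketch Threshold { get; set; }

    public SimilarityCheckerSketch(Func<string, string, double> scorer,
                                   ThresholdSketch threshold = ThresholdSketch.Zero)
    {
        _scorer = scorer ?? throw new ArgumentNullException(nameof(scorer));
        Threshold = threshold;
    }

    // Uses the instance threshold and discards the score.
    public bool IsSimilar(string a, string b) => IsSimilar(a, b, Threshold, out _);

    // Uses the instance threshold and hands the score back to the caller.
    public bool IsSimilar(string a, string b, out double score) => IsSimilar(a, b, Threshold, out score);

    // Uses a one-off threshold for this call only.
    public bool IsSimilar(string a, string b, ThresholdSketch threshold) => IsSimilar(a, b, threshold, out _);

    // All overloads funnel into this one, so the score is computed exactly once per check.
    public bool IsSimilar(string a, string b, ThresholdSketch threshold, out double score)
    {
        score = _scorer(a, b);
        return score * 100.0 >= (int)threshold;
    }
}
```

Routing every overload through one method keeps the comparison rule in a single place, which is one plausible reason to expose the score through an out parameter instead of a second call.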
The "IsUPCSimilar" method is a specialized check that is streamlined specifically for UPC strings. It does not calculate longest common subsequences; it just compares each digit in order and returns the score.
"GetScore" returns the score between the two strings, using the Ratcliff/Obershelp algorithm.
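For readers unfamiliar with Ratcliff/Obershelp, here is a minimal, self-contained sketch of the textbook algorithm: find the longest common substring, recurse on the pieces to its left and right, and return twice the number of matched characters divided by the combined length. The class `RatcliffObershelpSketch` is hypothetical and is not the package's implementation; the real "GetScore" may differ in scale, tie-breaking, and case handling.

```csharp
using System;

// Hypothetical helper for illustration; not part of PercolatorMatching.
public static class RatcliffObershelpSketch
{
    // Returns a similarity between 0.0 and 1.0: 2 * matched characters / combined length.
    public static double Score(string a, string b, bool ignoreCase = false)
    {
        if (a == null) throw new ArgumentNullException(nameof(a));
        if (b == null) throw new ArgumentNullException(nameof(b));
        if (ignoreCase)
        {
            a = a.ToUpperInvariant();
            b = b.ToUpperInvariant();
        }
        if (a.Length + b.Length == 0) return 1.0; // two empty strings are treated as identical
        int matched = CountMatches(a, 0, a.Length, b, 0, b.Length);
        return 2.0 * matched / (a.Length + b.Length);
    }

    // Counts matching characters between a[aStart..aEnd) and b[bStart..bEnd).
    private static int CountMatches(string a, int aStart, int aEnd, string b, int bStart, int bEnd)
    {
        if (aStart >= aEnd || bStart >= bEnd) return 0;

        // Longest common substring via a rolling dynamic-programming row.
        int bestLen = 0, bestA = aStart, bestB = bStart;
        int width = bEnd - bStart;
        int[] prev = new int[width + 1];
        for (int i = aStart; i < aEnd; i++)
        {
            int[] curr = new int[width + 1];
            for (int j = bStart; j < bEnd; j++)
            {
                if (a[i] != b[j]) continue;
                int len = prev[j - bStart] + 1; // extend the match ending at (i - 1, j - 1)
                curr[j - bStart + 1] = len;
                if (len > bestLen)
                {
                    bestLen = len;
                    bestA = i - len + 1;
                    bestB = j - len + 1;
                }
            }
            prev = curr;
        }
        if (bestLen == 0) return 0;

        // Recurse on the unmatched text to the left and right of the anchor.
        return bestLen
            + CountMatches(a, aStart, bestA, b, bStart, bestB)
            + CountMatches(a, bestA + bestLen, aEnd, b, bestB + bestLen, bEnd);
    }
}
```

Under this sketch, "WIKIMEDIA" and "WIKIMANIA" match on "WIKIM" and then "IA", giving 2 * 7 / 18, or about 0.78.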
"GetUPCScore" is likewise a streamlined algorithm specifically for UPC strings.
Examples =>
Using the similarity bools:
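A position-by-position comparison of the kind described for UPC strings might look like the sketch below. It is only a guess at the general idea: `UpcScoreSketch` is a hypothetical name, and dividing the number of matching positions by the longer length is an assumption, not the documented behavior of "GetUPCScore".

```csharp
using System;

// Hypothetical helper for illustration; not part of PercolatorMatching.
public static class UpcScoreSketch
{
    // Compares characters at the same index and returns
    // matching positions / length of the longer string (0.0 to 1.0).
    public static double Score(string upc1, string upc2)
    {
        if (upc1 == null) throw new ArgumentNullException(nameof(upc1));
        if (upc2 == null) throw new ArgumentNullException(nameof(upc2));

        int longer = Math.Max(upc1.Length, upc2.Length);
        if (longer == 0) return 1.0; // two empty strings are treated as identical

        int shorter = Math.Min(upc1.Length, upc2.Length);
        int same = 0;
        for (int i = 0; i < shorter; i++)
        {
            if (upc1[i] == upc2[i]) same++;
        }
        return (double)same / longer;
    }
}
```

Because it makes a single pass with no substring search, a check like this runs in linear time, which fits the "streamlined" description for fixed-format codes.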
var fuz = new Fuzzylator(ThresholdPercentage.Eighty);
string str1 = "Test String";
string str2 = "A Test String";
if (fuz.IsSimilar(str1, str2))
{
//Do something
}
double score;
if (fuz.IsSimilar(str1, str2, out score))
{
//Do something
Console.WriteLine(score); //score now contains the score of the two strings
}
if (fuz.IsSimilar(str1, str2, ThresholdPercentage.Ninety))
{
//Do something
//The IsSimilar check uses a Ninety percent threshold for this one time.
}
//GetScore returns the score between str1 and str2; passing true ignores case.
//A new variable name is used because "score" is already declared above.
double ignoreCaseScore = fuz.GetScore(str1, str2, true);