

String Similarity Algorithms (Matching Percentage)

by Internal Labs

1

Activity

Downloads: 479


Summary

This group of custom activities (String Matching Algorithms) reduces development time for users who want to implement one or more string-matching algorithms in their project.

Overview

The following activities are included in this package:

Edit / Distance Based

  • Levenshtein
  • Damerau Levenshtein (Restricted and Default)
  • Weighted Levenshtein
  • Euclidean Distance
  • Hamming Distance
  • Jaro Distance
  • Jaro Winkler
  • Chapman Mean Length
  • Chapman Length Deviation
  • Minkowski Distance
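As an illustration of how an edit distance becomes a matching percentage, here is a minimal Python sketch of Levenshtein distance. It is not the package's implementation: the function names `levenshtein` and `matching_percentage` are hypothetical, and normalizing by the length of the longer string is an assumed convention that the activities may or may not follow.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def matching_percentage(a: str, b: str) -> float:
    """Assumed normalization: 100 * (1 - distance / length of longer string)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / longest)
```

For example, "kitten" and "sitting" are three edits apart, so under this normalization they match at roughly 57%.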

 

Sequence / Alignment Based

  • Smith Waterman
  • Smith Waterman Gotoh
  • Smith Waterman Gotoh Windowed Affine
  • Needleman–Wunsch
  • Ratcliff Obershelp
  • Longest Common Subsequence (LCS)

 

Token Based

  • Jaccard Index
  • Sørensen–Dice Coefficient
  • Block Distance
  • Tversky Index
  • Overlap Coefficient
  • Matching Coefficient
  • Cosine Similarity
  • Q-Grams
  • N-Grams
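Token-based measures compare sets of tokens rather than individual edits. The Python sketch below computes the Jaccard index and the Sørensen–Dice coefficient over character q-grams. The helper names (`qgrams`, `jaccard`, `dice`), the default `q = 2`, and the lowercasing step are assumptions made for illustration; the package's activities may tokenize differently.

```python
def qgrams(text: str, q: int = 2) -> set:
    """Return the set of overlapping character q-grams of the lowercased text."""
    text = text.lower()
    if len(text) < q:
        return {text} if text else set()
    return {text[i:i + q] for i in range(len(text) - q + 1)}


def jaccard(a: str, b: str, q: int = 2) -> float:
    """|X ∩ Y| / |X ∪ Y| over the two q-gram sets."""
    x, y = qgrams(a, q), qgrams(b, q)
    if not x and not y:
        return 1.0  # two empty inputs are treated as identical
    return len(x & y) / len(x | y)


def dice(a: str, b: str, q: int = 2) -> float:
    """2 * |X ∩ Y| / (|X| + |Y|) over the two q-gram sets."""
    x, y = qgrams(a, q), qgrams(b, q)
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))
```

With bigrams, "night" and "nacht" share only "ht", giving a Jaccard index of 1/7 and a Dice coefficient of 0.25. Multiplying either value by 100 gives a percentage.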

 

Hybrid Algorithms

  • SIFT4
  • Monge Elkan
  • Generalized Compression Distance

Phonetic / Sound Based

  • Soundex
  • Metaphone
  • Caverphone
  • Double Metaphone
  • Match Rating Approach

 

String Hash Based

  • SimHash
  • MinHash

 

Fuzzy Wuzzy Algorithms

  • Fuzzy Ratio
      1. Simple Ratio
      2. Partial Ratio
      3. Token Sort Ratio
      4. Token Set Ratio
      5. Token Initialism Ratio
      6. Weighted Ratio
  • Fuzzy Token Abbreviation Ratio
  • Fuzzy Process Extraction
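To show why the token-based ratios exist alongside the simple ratio, here is a rough Python sketch of a token sort ratio. It uses the standard library's `difflib.SequenceMatcher` as a stand-in scorer; the function names and the normalization (lowercase, keep word characters, sort tokens) are assumptions and are not taken from the package.

```python
import re
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> float:
    """Similarity of the raw strings as a percentage (difflib-based stand-in)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def token_sort_ratio(a: str, b: str) -> float:
    """Sort the word tokens of each string before comparing them."""
    def normalize(text: str) -> str:
        tokens = re.findall(r"\w+", text.lower())
        return " ".join(sorted(tokens))

    return simple_ratio(normalize(a), normalize(b))
```

Under these assumptions, "New York Mets" and "Mets, New York" score 100 with `token_sort_ratio` but lower with `simple_ratio`, because sorting removes the effect of word order and punctuation.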

 

Extension Activities

  • Get Similarities
  • Get Minimum Similarity
  • Get Minimum Similarities
  • Get Maximum Similarity
  • Get Maximum Similarities
  • Get Threshold Matches (Contains Fuzzy)
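The extension activities operate on one input string compared against a list of candidates. The following Python sketch shows the general idea only: the function names mirror the activity names for readability but are hypothetical, the scorer is `difflib.SequenceMatcher` from the standard library rather than any algorithm from this package, and the default threshold of 80 is an arbitrary assumption.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity as a percentage (stand-in scorer)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def get_similarities(query: str, candidates: list) -> list:
    """Score every candidate against the query."""
    return [(c, similarity(query, c)) for c in candidates]


def get_maximum_similarity(query: str, candidates: list) -> tuple:
    """Return the best-scoring (candidate, score) pair."""
    return max(get_similarities(query, candidates), key=lambda pair: pair[1])


def get_threshold_matches(query: str, candidates: list, threshold: float = 80.0) -> list:
    """Keep only candidates whose score reaches the threshold."""
    return [pair for pair in get_similarities(query, candidates) if pair[1] >= threshold]
```

A usage example: with the query "acme corp" and candidates "Acme Corp", "Acme Corporation", and "Apex Ltd", only the first candidate passes a threshold of 80, while lowering the threshold to 70 also admits "Acme Corporation".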

 

String Cleanup Helpers

  • String Cleanup
  • HTML Tag Cleanup
  • Unicode Normalize
  • Clean Accent Characters
  • Graphics Cleanup
  • Whitespace Cleanup
  • Other Cleanup

Note: The dependencies this package needs are bundled with it, so you do not need to install any package other than the one downloaded from Marketplace.

Features

The use cases for the string-matching algorithms are varied and depend on specific requirements:

  • Introduce similarity recognition between strings.
  • Compare texts and get the percentage of errors.
  • Use the logic to ascertain deviation in similarity.
  • Use more than one logic (algorithm) to exploit different capabilities.
  • Use a combination of algorithms to achieve increased accuracy.
  • Run quick proofs of concept (POCs) to check the viability of an algorithmic approach, or to decide whether to move to RPA++ (ML models, etc.).

Additional Information

Dependencies

No custom installations are required in Studio or in the feed. The package requires C# 8.0 or greater and works on any .NET Framework version greater than 4.8 and on .NET Core 6.0 or greater. The required compression and encryption packages are bundled with the download.

Code Language

C#, Visual Basic

Runtime

Windows (.NET 5.0 or higher)
