Jaccard - String Matching Algorithm

by Internal Labs

Snippet

Downloads

<100

Version: 1.0.0
Release Date: January 27, 2025

The Jaccard string matching algorithm has been integrated into our RPA process to improve string comparison accuracy where similarity checks are required. It scores the similarity between two strings (words) as the number of unique tokens they share divided by the number of unique tokens across both, which makes it particularly useful for text-based data processing, name matching, and record linkage.
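As a rough illustration of the idea, the index for two token sets A and B is J(A, B) = |A ∩ B| / |A ∪ B|. The sketch below is not the snippet's own workflow: the function names, the lowercase whitespace tokenization, and the choice to score two empty inputs as 1.0 are all assumptions made for this example.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase and split on whitespace (an assumed tokenization)."""
    return set(text.lower().split())


def jaccard_index(a: set[str], b: set[str]) -> float:
    """Return |a ∩ b| / |a ∪ b| for two token sets."""
    union = a | b
    if not union:
        # Both sets are empty: treated as identical here (assumed convention).
        return 1.0
    return len(a & b) / len(union)


def jaccard_similarity(left: str, right: str) -> float:
    """Compare two strings by the overlap of their unique tokens."""
    return jaccard_index(tokenize(left), tokenize(right))


# "john" and "smith" are shared; "a" appears on one side only.
print(jaccard_similarity("John A Smith", "john smith"))
```

With this tokenization the two names share 2 of 3 unique tokens, so the score is about 0.67. A different tokenizer (character n-grams, for example) would give different scores for the same inputs.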

Test sets:

Because the Jaccard Index measures the similarity between two sets as the size of their intersection divided by the size of their union, the snippet can be validated with sets whose overlap is known in advance:

High Overlap

Set A: {"apple", "banana", "cherry", "date"}

Set B: {"apple", "banana", "cherry", "fig"}

Expected Jaccard Index: 0.6 (3 shared out of 5 unique elements)

Partial Overlap

Set C: {"dog", "cat", "rabbit", "horse"}

Set D: {"cat", "rabbit", "hamster", "turtle"}

Expected Jaccard Index: ~0.33 (2 shared out of 6 unique elements)

No Overlap

Set E: {"car", "bus", "train"}

Set F: {"plane", "boat", "bicycle"}

Expected Jaccard Index: 0.0 (no shared elements)

Identical Sets

Set G: {"sun", "moon", "stars"}

Set H: {"sun", "moon", "stars"}

Expected Jaccard Index: 1.0 (3 shared out of 3 unique elements)

Large Set with Small Overlap

Set I: { "apple", "banana", "cherry", "date", "fig", "grape", "honeydew" }

Set J: { "fig", "grape", "kiwi", "lemon" }

Expected Jaccard Index: ~0.22 (2 shared out of 9 unique elements)

Small Set in Large Set

Set K: { "red", "green", "blue", "yellow", "purple" }

Set L: { "red", "green" }

Expected Jaccard Index: 0.4 (2 shared out of 5 unique elements)
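One way to check the expected values above is to recompute the shared count, the unique count, and the index for every pair. The following is a minimal sketch in Python rather than the snippet's Studio workflow; `jaccard_index`, `summarize`, and `TEST_SETS` are hypothetical names introduced for this example, and the sets are copied from the cases listed above.

```python
def jaccard_index(a: set, b: set) -> float:
    """Return |a ∩ b| / |a ∪ b|; two empty sets score 1.0 (assumed convention)."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# The six test cases from the listing, keyed by their case names.
TEST_SETS = {
    "High Overlap": ({"apple", "banana", "cherry", "date"},
                     {"apple", "banana", "cherry", "fig"}),
    "Partial Overlap": ({"dog", "cat", "rabbit", "horse"},
                        {"cat", "rabbit", "hamster", "turtle"}),
    "No Overlap": ({"car", "bus", "train"},
                   {"plane", "boat", "bicycle"}),
    "Identical Sets": ({"sun", "moon", "stars"},
                       {"sun", "moon", "stars"}),
    "Large Set with Small Overlap": (
        {"apple", "banana", "cherry", "date", "fig", "grape", "honeydew"},
        {"fig", "grape", "kiwi", "lemon"}),
    "Small Set in Large Set": ({"red", "green", "blue", "yellow", "purple"},
                               {"red", "green"}),
}


def summarize(a: set, b: set) -> tuple:
    """Return (shared count, unique count, Jaccard index) for one pair."""
    return len(a & b), len(a | b), jaccard_index(a, b)


for name, (a, b) in TEST_SETS.items():
    shared, unique, index = summarize(a, b)
    print(f"{name}: {shared} shared out of {unique} unique = {index:.2f}")
```

Note that the denominator is the size of the union, so an element that appears in both sets is counted once, not twice.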

Publisher

Internal Labs

Technical

Version

1.0.0

Updated

January 27, 2025

Works with

Studio: 22.10.12 - 24.10.5

Certification

Silver Certified

Support

UiPath Community Support