Text Analysis Activities

by Pawel Wesolowski

Activity

Downloads

1.3k

Summary

Create structured data out of free text content using Text Analysis

Overview

Text Analysis (or Text Mining) is the automated process of obtaining information from text. The goal is to create structured data out of free text content, which a computer can interpret more easily.

It can be used for categorizing press articles, user reviews, incoming e-mails, and tickets, for monitoring comments or social media, and for many other tasks.

The first release contains several basic but powerful methods and techniques:

  • Detect Language: classifies text according to its language. Supports 16 languages. It is helpful for configuring the OCR engine or for routing e-mails and documents to specific people, depending on the detected language.
  • Find Collocations: identifies words that commonly co-occur. Returns bigrams, most commonly two adjacent words, e.g. "invoice date" in invoice documents.
  • Prepare String to Analyze: the basic preparation activity for text analysis. Lowercases the provided string and removes special characters, numbers, and single letters. The returned text contains only words separated by spaces.
  • Remove Stopwords: removes words that are very frequent but carry little or no semantic information (known as stopwords), which makes automated analysis of the text more accurate. Supports 16 languages.
  • Word Frequency Analyze: finds the most frequently used words in a document. This is useful for identifying a document type by comparing the results with a defined pattern.
  • Word Position Analyze: helps identify the type of a document by analyzing the position of specific words. For example, the keyword "invoice" at the top of the document, in the header, makes it more likely that the document is an invoice than if the keyword is mentioned somewhere on the 5th page.
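
To make the techniques listed above more concrete, here is a minimal Python sketch of the same ideas. It is not the package's implementation (the activities themselves are written in Visual Basic and used from Studio). All function names, the tiny English stopword list, and the "relative position" measure are assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical, tiny English stopword list for illustration only;
# the package's own per-language lists are not shown in this listing.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "for", "on"}

def prepare_string(text):
    """Lowercase the text, keep only runs of letters, drop single letters."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return " ".join(w for w in words if len(w) > 1)

def remove_stopwords(text, stopwords=STOPWORDS):
    """Drop very frequent words that carry little meaning."""
    return " ".join(w for w in text.split() if w not in stopwords)

def word_frequency(text, top=3):
    """Return the `top` most frequent words with their counts."""
    return Counter(text.split()).most_common(top)

def find_collocations(text, top=3):
    """Return the `top` most frequent bigrams (pairs of adjacent words)."""
    words = text.split()
    return Counter(zip(words, words[1:])).most_common(top)

def word_position(text, keyword):
    """Relative position of the first occurrence (0.0 = start), or None."""
    words = text.split()
    if keyword not in words:
        return None
    return words.index(keyword) / len(words)

cleaned = remove_stopwords(prepare_string("Invoice #42: the Invoice Date is 5 May"))
print(cleaned)                     # invoice invoice date may
print(word_frequency(cleaned, 1))  # [('invoice', 2)]
```

In this sketch, a small `word_position` value for a keyword such as "invoice" would correspond to the keyword appearing near the top of the document, which is the signal the Word Position Analyze description refers to.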

Features

Easy to use. Gives your Robots the ability to analyze text: recognize language, categorize text, and more.

Additional Information

Dependencies

No dependencies.

Code Language

Visual Basic

Runtime

Windows Legacy (.NET Framework 4.6.1)

Publisher

Pawel Wesolowski

License & Privacy

MIT

Technical

Version

1.0.2

Updated

February 18, 2020

Works with

Studio: 21.10 - 22.10

Certification

Silver Certified

Support

UiPath Community Support
