Create your first automation in just a few minutes.Try Studio Web →
by Internal Labs
9
Activity
60.3k
Summary
Summary
An activity pack used to read and process extra large Excel files.
Overview
Overview
Objectives:
Observation and limitations:
The activity will not keep track of the formats used to display values in the Excel application. For example, a cell with a value of 43006 formatted as date will be displayed as 9/28/2017 in Excel, but this activity will still retrieve it as 43006.
Files in which the row starts with the index 0 (this index can be found in the XML mark-up of the row) will not be processed. The row number also needs to be under 1048576 (the maximum number of rows in an Excel Sheet). If a row number outside of this interval is found, the application will throw an exception. This limitation is also valid for the Read Range in the Workbook activities package.
This package is not compatible with .xls files or .csv files. Only .xlsx files can be read with this activity. Reading Large XLS files can be done in conjunction with the Conversions Package, which will convert the file to an .xlsx file, thus making it compatible with this package.
Some very minor differences when deciding where a range ends between the standard Excel processing activities and this one.
Technical Approach:
The files will be processed in the background using the Open XML SDK library and the SAX (Simple API for XML) approach which instead of reading the entire document and exposing its DOM, will independently read the XML files that comprise the Excel document.
Read Range
Principles
This custom activity reads a Range of cells from an Excel Sheet. It achieves its goal by using the Open XML SDK framework and the OpenXmlReader class to parse the File in a SAX like approach. To get the number of rows, we simply parse the file and count the rows.
The custom activity is created by simply inheriting the CodeActivity class from the Windows Workflow Foundation framework. We define the logic of the function by implementing the Execute method and we define the input and output of the activity by fields of the InArgument and OutArgument types.
This activity can only be placed inside a XL Excel Application Scope activity (not necessarily directly inside, but one of its parents must be the XL Excel Application Scope).
Features
Features
Using SAX, the XML content of the file is loaded gradually, this means that the content is retrieved from the file without overloading the RAM memory or needing to have Microsoft Office installed. XLS files can be processed too, but they need to be converted to .xlsx first.
The package has been successfully tested with files sizes of +200MB and over 1 million rows. There's no reason this activity could not read even bigger files, however, for this it would require to read the file in chunks.
Additional Information
Additional Information
Dependencies
DocumentFormat.OpenXML (> = 2.5.0) Microsoft.Office.Interop.Excel (> = 15.0.4795.1000)
Code Language
C#, Visual Basic
Runtime
Windows Legacy (.Net Framework 4.6.1)
Technical
Version
1.0.68Updated
June 3, 2023Works with
Studio: 21.10 - 22.10
Certification
Silver Certified
Collections
Support
UiPath Community Support