VI Squash Process Object
The Visual Integrator (VI) Squash process object combines rows that are identical for a set of dimension columns.
Optionally, the Squash process object summarizes numeric columns by calculating the total, minimum, maximum, average, or standard deviation. It is equivalent to the Builder squash attribute or an SQL GROUP BY statement.
NOTE:
- There must be enough available memory to contain all the summarized rows at the same time.
- For large amounts of data that require sorting, use a Sort process object before the Squash process object. For more information, see VI Sort Process Object.
The Squash process object has three panes where you set attributes.
You set attributes for the Squash process object in the object attributes pane.
Attribute | Description |
---|---|
Input |
Defines the object from which the data flow arrives. Use one of the following methods to attach the Squash object to an input data flow object:
|
Adjacent |
This optional attribute determines whether or not the Squash object is limited to adjacent rows.
|
Auto_Sort |
This optional attribute determines whether or not the Squash object automatically sorts the input flow according to the squash dimensions, which is needed to squash adjacent rows. This attribute is only recognized if the Adjacent attribute is set to true.
|
Count_Column | Defines an output column that will contain the count of rows with identical dimension values that have been squashed into a single output row. This option is equivalent to totaling a column containing the number 1. |
Warn_Limit |
This optional attribute controls how many data integrity warnings are displayed in the log. By default, this attribute is not set. If a limit is set and reached, the log displays the following message: Further data integrity warnings suppressed for Squash object "<object_name>". Increase the warn_limit attribute to see more. |
Keep_All_Other_Columns |
Determines whether or not all additional columns (beyond those that have check marks in the Dimension or Summary check boxes) are kept and passed to the output data flow.
|
Each object has an area where you can enter comments.
It is a DI best practice to enter a note for every object in a VI script. You can set a VI preference to give a warning for each object without a note.
The Squash column grid displays the input columns and the squash attributes, which you can set.
Attribute | Description |
---|---|
Input Column | Displays the name of each input column. This attribute is read-only. |
Source Object | Displays the name and object type of the source object. Double-click the Source Object for a column to change the task flow focus to that object. |
Dimension |
Defines the input columns used to combine rows. Rows that have identical values for all the dimension columns are combined. Select the Dimension check box for columns that you want to set as dimensions. At least one input column must be marked as a dimension. A number is automatically entered in the row's Dim Order. |
Dim Order | Defines the dimension order for input columns with a Dimension check mark. The Dim Order is automatically set in the order that you select the Dimension check box for input columns. You can edit these numbers to change the order. If you assign a Dim Order value for an input column not already marked as a dimension, a check mark is added to the Dimension check box. |
Summary |
Defines input columns that are totaled. These columns are summed into summarized output rows in the data flow. Select the Summary check box for numeric columns that you want summarized. These columns must be numeric or blank (null). Null values are treated as zeros and do not effect the summary calculation, but return null if all values are null. |
Other |
Defines input columns that are preserved and passed to the output data flow. Select the Other check box for input columns that you want to pass to the output data flow. This optional attribute is useful for keeping data intact. The Other and Remove check boxes are mutually exclusive. If a column has the Remove check mark, selecting the Other check box removes the Remove check mark. |
Remove |
Manages which columns are removed from the output data flow. Select the Remove check box for columns that you want to explicitly suppress from the output data flow. NOTE: Use the Remove check boxes only when no Keep check boxes are checked. The Remove and Other check boxes are mutually exclusive. If a column has the Other check mark, selecting the Remove check box removes the Other check mark. |
Average | Defines input columns that are combined by calculating the average value of the squashed rows. Select the Average check box for numeric columns that you want to calculate the average values for. The data flow output column is named Average <column_name>. These columns must be numeric or blank (null). Null values are treated as having no data and do not effect the average calculation. |
Max | Defines input columns that are combined by calculating the maximum value of the squashed rows. Select the Max check box for numeric columns where you want to calculate the maximum value. The data flow output column is named Max <column_name>. These columns must be numeric or blank (null). Null values are treated as having no data and do not effect the maximum calculation. |
Min | Defines input columns that are combined by calculating the minimum value of the squashed rows. Select the Min check box for numeric columns where you want to calculate the minimum value. The data flow output column is named Min <column_name>. These columns must be numeric or blank (null). Null values are treated as having no data and do not effect the minimum calculation. |
StdDev | Defines input columns that are combined by calculating the standard deviation of the column for the squashed rows. Select the StdDev check box for numeric columns where you want to calculate the standard deviation. The data flow output column is named Std Dev <column_name>. These columns must be numeric or blank (null). Null values are treated as having no data and do not effect the standard deviation calculation. |
NOTE: Keep in mind the following while setting the check boxes in the Squash process object column grid:
- Input columns can be set as one of the following: Dimension, Summary, or Other. At least one dimension is required.
- Only input columns set for Dimension, Summary, Other, Average, Max, Min, or StdDev are passed into the output data flow.
- Summary, Average, Max, Min, and StdDev can be set only for numeric columns (Summary check mark).
- The Other and Remove check boxes are mutually exclusive.