VI Join Process Object
The Visual Integrator (VI) Join process object performs SQL-style joins on inputs to the data flow.
The Join object discards records from one or both flows if they do not join any records from the other flow and duplicates records when both flows contain multiple records. The Join object expands the ability of the Lookup process object in that it implements standard SQL join types (inner, outer, left outer, right outer, and merge) and performs Cartesian products for duplicate records.
Unlike the Lookup object, the Join does not require the lookup flow to completely fit into memory, but the join might be slower if the lookup flow is not read into memory. It does require that both flows are sorted according to the join columns. If the flows are not sorted properly, a warning message displays, and the results of the join might be incorrect. Use the autosort attribute to automatically sort one or both of the Join inputs.
The Join process object has three panes where you set attributes.
You set attributes for the Join process object in the object attributes pane.
Attribute | Description |
---|---|
Input |
Defines the primary input for the Join object. This primary input and the Secondary_Input are required attributes. Defining the primary input
|
Secondary_Input |
Defines the secondary input for the Join object. This secondary input and the Input are required attributes. Defining the secondary input
|
Join_Type |
Defines the join type to perform. These types closely match classical relational database logic. Select one of the following types:
|
Auto_Sort |
Turns on the automatic sort. This sorts an input flow by creating internal Sort object functionality. This sort uses default values for Sort attributes, such as Temp_Directory and Sort_Size. Use an explicit Sort object if you need to set these values. For more information, see VI Sort Process Object. Select one of the following options:
|
Case_Sensitive |
Controls whether comparisons are case-sensitive.
|
Locale |
Controls whether or not comparisons for join columns are performed in a locale-specific manner.
This attribute is similar to the Locale attribute in the Sort process object. For more information, see VI Sort Process Object. |
Each object has an area where you can enter comments.
It is a DI best practice to enter a note for every object in a VI script. You can set a VI preference to give a warning for each object without a note.
The Join column grid displays the primary input on the left, the secondary input on the right, and the join columns that you specify in the middle.
The Join column grid has three parts:
- Primary Input Grid (left)—Lists the columns defined in the primary input object where you can select the columns to join. The Source Object column displays the object name where the columns entered the flow. Double-click the Source Object to change the task flow focus to that source object.
- Join Grid (middle)—Displays the columns that you have designated for the join. See the following section about joining columns.
- Secondary Input Grid (right)—Lists the columns defined in the secondary input object where you can select the columns to join with. The Source Object column displays the object name where the columns entered the flow. Double-click the Source Object to change the task flow focus to that source object.
To set the columns for the join:
-
From the Primary Input grid on the left, select the column, and click the arrow between the left and middle grids.
The column is added to the Left Column column of the middle grid.
-
From the Secondary Input grid on the right, select the column you want to join, and click the arrow between the right and middle grids.
The column is added to the Join With column in the middle grid.
-
Repeat steps 1 and 2 for any additional joins.
-
Edit the Join Order column if necessary.
NOTE: To treat the columns as numeric values, select the check box under the Numeric column.
Given the following two input flows:
Create the following Join object settings where a merge join is performed with Product from the Invoice input and Customer ID from the Customer input.
You get the following merged output data: