VI Filein Input Object

The Visual Integrator (VI) Filein input object brings raw data into the script.

The Filein object accepts input from one or more text files. These text files are described by column headers within the file or by an external dictionary file.

The Filein object has three panes where you set attributes.

VI Filein All Panes

Object Attributes

You set attributes for the Filein object in the object attributes pane.

VI Filein Object Attributes

Attribute	Description
Filename(s) or Starname(s) (required)	Defines one or more input files as follows: Click the browse button for a row in the Filename(s) or Starname(s) section. The Select File dialog box displays. Browse to the file, or to the folder if using a starname. Select the input file or type a wildcard character matching string in the File Name box. Click Select. That Filename(s) or Starname(s) row is now populated. Repeat to add additional rows in the Filename(s) or Starname(s) section. NOTE: Alternatively, you can type the path and file name or wildcard character matching string in each row of the Filename(s) or Starname(s) section. For starname, you can use the wildcard characters question mark (?) or asterisk (*). ? (question mark)—Matches any single character. * (asterisk)—Matches a sequence of characters. Wildcard character matching is case-insensitive and is limited to one directory only. To match in more than one directory, add additional rows to the section. TIP: If working with case-sensitive file names, try using a Directory input object. You can filter on the filename metadata and feed it into a Filein object using the File_List_Input attribute. With the starname attribute, files are returned in the order that they are found in the directory; they are not necessarily sorted alphabetically. The order can vary across systems, and is often related to how files are deleted and added in the directory. Programs should not rely on the order of starnames. NOTE: At least one input file must be indicated. You cannot use Filein Filename(s) or Starname(s) input with a File_List_Input because they are mutually exclusive.
Starname or Filename	Automatically sets based on your selections in the Filename(s) or Starname(s) input. Choices are: starnames—Use to specify multiple wildcard character matching strings filenames—Use to specify multiple file name inputs starname—Use to specify a single wildcard character matching string filename—Use to specify a single file name input
File_Type (required)	Specifies how the columns are named in the input files. Choices are: standard—File is a fixed- or variable-format text file described in a DI dictionary. This type requires either a Dict_File or Dict_Obj attribute. column_headers (default)—File is a variable-format text file with the first line in the file naming the columns. This type requires a Delimiter attribute. ignore_column_headers—First line in the file is ignored and instead uses a DI dictionary to describe the file. This type requires either a Dict_File or Dict_Obj attribute. DBF—File is a standard DBase-2 DBF file. Column names come from the DBF header. TIP: The file type ignore_column_headers allows you to use a dictionary with trim=false to keep leading spaces in the input data.
Delimiter (required for column_headers file type)	Specifies the delimiter that is used to separate columns for variable format files. If not specified, ASCII tab is used. Choices are: space, tab (\t), comma (,), semicolon (;), and pipe (|).
Dict_File	Specifies the file name for the dictionary that describes the columns. This attribute is used with legacy format dictionaries, which list both the column names and the data categories. Use the browse button to navigate to the dictionary file. Dict_File and Dict_Obj attributes are mutually exclusive.
Dict_Obj	Specifies a Dictionary input object that lists the columns. The Dict_Obj object must be on the task flow. See VI Dict Input Object. Either type the name of the object, or click the browse button and point to the object on the task flow. When you click the browse button and move the pointer onto the task flow, a dotted connecting arrow appears that you attach to the Dict_Obj. Dict_Obj and Dict_File are mutually exclusive.
File_List_Input	Specifies a separate input flow object to generate a list of file names for the Filein object. This input flow object uses a single-column text file for input. The column name must be filename and the object must contain a list of files. This attribute allows programmatic control over the input. For example, this text file can be automatically generated from another VI script. The Filein object uses the values in the filename column as a list of files to open as its input. The input is formed by concatenating all files in the file list. Multiple files listed within the file list must have the same format. If the file formats are different, consider using the Concat process object to prepare them. For more information about concatenating files, see VI Concat Process Object. Do not use a Filename(s) or Starname(s) attribute when using the File_List_Input attribute, because they are mutually exclusive. Two Filein input objects are required for the File_List_Input attribute: one for the list of files, and one for the actual Filein object that uses the File_List_Input attribute to bring those files into the data flow. To set up the two input objects: (1) Place two Filein objects on the task flow and name them: one for the File_List_Input and one for the actual Filein object. For example: File_List_Input contains the list of files and is used by the Filein object for a File_List_Input attribute; From List is used to define the actual Filein object. (2) In the File_List_Input Filein object, add the file that has the file list in the Filename(s) or Starname(s) attribute box. The example shows a file named For_file-list-input.txt. Set any other attributes as needed. (3) In the From List Filein object, click in the File_List_Input attribute. Click the browse button that appears, and then move the pointer onto the task flow. A dotted connector arrow appears connected to the actual Filein (From List). (4) Connect this arrow to the other Filein object that contains the text file with the list of files as its input (File_List_Input). The File_List_Input attribute is set for the From List Filein object, and the files listed in the text file are used for the actual input.
Filename_Column	Specifies a name for a new column added to the output flow, for example, where-from. The values in this column contain the path and file name indicating where each input row of data originated. This attribute is useful for troubleshooting when using multiple input files.
First	Specifies a number to limit how many records are read from each input file. This attribute is useful for script testing on a small number of records. If not used, all rows are read.
Newline	Specifies a newline character as a string containing exactly one character from the input file. The newline character is replaced with a line break. For example: With ~ set as the newline character: 483574387548~4434839~4782939029~ becomes in the output flow: 483574387548 4434839 4782939029 If not specified, the default newline character is a carriage return (ASCII 13), a line feed (ASCII 10), or a carriage return line feed. This attribute cannot be used if the Require_CRLF attribute is set to true.
Require_CRLF	Determines whether or not a carriage return (CR) followed by a line feed (LF) is required to indicate the end of a line. This attribute is useful for handling the output from certain Microsoft software that exports single line feeds without a carriage return from internal fields. true—Both the carriage return and line feed characters must be present to indicate the end of a line. false (default)—Either a carriage return, a line feed, or the combination of both indicates the end of a line. If set to true, the Newline attribute cannot be used. NOTE: These internal line feeds should be replaced using the translate calc function before they are output. For more information, see String Functions in Integrator. For example: translate (MyColumn, concat (chr(10),chr(13)), "|%") Where: MyColumn is the name of the column that contains the line endings; chr(10) is the character for line feed (LF); chr(13) is the character for carriage return (CR). The line feed is replaced with a pipe (|) and the carriage return is replaced with a percent sign (%). You can substitute other characters for | and % if you prefer.
Prefix	Defines a prefix that is added to all column names in the flow. If you want a space between the prefix and the column name, include that space in the prefix string definition. Any columns assigned an alias do not use the prefix; instead, the columns use the alias name.
Rename_Duplicates	Renames duplicate columns so that each column in the output data flow has a unique name. The duplicate naming process occurs before attributes defining aliases, prefixes, or the columns to keep are applied, so these generated column names can be aliased to another name. true—Creates new, unique column names for duplicate columns. Subsequent columns with the same name are given the names name_2 ... name_(n) based on the positional order in the input. If a column in the input flow already has this name, that number is skipped. For example, if the input flow already has a column named DESC_2, the object names the duplicate column DESC as DESC_3. false (default)—Duplicate column names are not renamed and the duplicate columns appear in the output data flow.
Ignore_Extra_Columns	Ignores extra columns in the input following the last column described by the column headers or a DI dictionary. This attribute helps control processing when there are too many data items in a row of the input. true—Ignores extra columns false (default)—Any line that contains extra columns gets reported as a parse error and the line is not processed
Ignore_Line_End	Specifies whether or not parse errors dealing with the end of an input line are ignored, for example, "Fixed field occurs past end of line" or "Too few fields". This attribute helps control processing when there are too few data items in an input row. true—Processes the line. false (default)—Prints out a warning message and skips the line.
Ignore_Quotes	Specifies whether or not the beginning and ending quotation marks are ignored while keeping embedded quotes. This optional attribute is intended to be used in special cases only. true—Double quotation marks are passed in for processing, and they appear in the output flow. false (default)—Double quotation marks at the beginning of a field are stripped away, along with any trailing double quotation marks. If the delimiter value is a comma and is within a pair of double quotation marks, it is kept as part of the column value.
Strict_Quotes	Specifies how to handle quotes with delimiters. This optional attribute is intended to be used in special cases only. true—Delimiters found within a quoted string are always treated as part of the quoted string, as opposed to delimiting a new field in a variable format file. This behavior is always the case when the delimiter is a space, a comma (,) or a semi-colon (;). false (default)—Other delimiters (like tabs) are treated as a hard stop for the field, with the expectation that quoting in fields is possibly incorrect.
Union	Combines multiple input files that have different columns, and produces the union of their input columns. This attribute works only when reading files with column headers. The result is similar to using the Concat process object that combines multiple input flows. If a column is requested in the output flow that does not appear in all files, the value of that output column is blank in the appropriate rows. This attribute allows the Filein object to read multiple files that have columns added over time without having to add the columns to earlier files, that is, columns and column order can change over time. true (default)—Files are combined false—Files are not combined Combining files is recommended when using a series of input files.
Encoding	Defines how files are read and interpreted in terms of character encoding. Values include: auto—The input object sets the encoding based on the file signature and the Unicode state of other objects in the same task. ascii—The characters in the file are interpreted as ISO-8859-1 or Latin1 characters. gb18030—The file is interpreted as Chinese National Standard 18030-2000 characters. The gb18030 encoding option is supported on Windows platforms only. latin1—The characters in the file are interpreted as ISO-8859-1 or Latin1 characters. utf-8—The file is interpreted as UTF-8 Unicode characters. unicode—The file is interpreted as 2-byte Unicode characters (UCS-2) with native byte swapping, unless overridden by a UCS-2 file signature. unicode-be—The file is interpreted as UCS-2 characters in a big-endian fashion. unicode-le—The file is interpreted as UCS-2 characters in a little-endian fashion. UCS-2 and UTF-8 files can include a Byte Order Mark (BOM) at the beginning of the file to denote the file encoding. These file signatures are defined as follows: UCS-2 Big Endian—`FE FF` UCS-2 Little Endian—`FF FE` UTF-8—`EF BB BF` File signatures are common for Unicode files on Windows operating systems. If the file input object reads multiple files, the signature of each file determines its encoding. If the encoding attribute is auto and no signature is found, the encoding is assumed to be latin1 if no other object in the task handles Unicode data and the VI file is not encoded as utf-8 (using the charset 1208 directive). Otherwise, the encoding is assumed to be utf-8. See also Integrator Unicode Data Support.
Alias_Lines	Aliases can be set and edited in both the Alias_Lines attribute and the column grid. The column grid allows for graphical editing, while the Alias_Lines attribute is set at the code level in the following format: `OldColumnName=NewColumnName` For example: `inv_nbr=Invoice Number` Where Invoice Number is the alias for inv_nbr. The Alias_Lines method is useful for working with array parameters used as aliases. VI cannot process these array parameters because they are not in a format that VI can interpret. VI considers these array parameters to be malformed aliases and displays a warning message in the Logs tab. For example: Alias definition "$(ExternalAliases)" in object "From List" is not formatted as "OldColumn=NewColumn". When the line contains a parameter Integrator is most likely able to resolve it when the parameter is properly defined. To edit this alias definition use the "Alias_Lines" property. This message indicates that there is a malformed alias named $(ExternalAliases) in the object named From List. The array parameter displays as $(ExternalAliases) in the script. For VI to interpret this array parameter, you must assign an alias in the VI format. To assign an alias and resolve the error: (1) Select the object in the task flow that is mentioned in the warning message. In this example, that object is From List. (2) Click in the Alias_Lines field, and then click the browse button that appears. The Alias lines for <object_name> dialog box displays. (3) Edit the alias in the form of: `OldColumnName=NewColumnName` In this example: `$(ExternalAliases)=External Aliases` (4) Click OK. The array parameter has an alias, and the error message no longer appears.
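
The Rename_Duplicates naming rule in the table above can be illustrated with a short Python sketch. This is not Integrator code: the function name rename_duplicates is hypothetical, and the sketch assumes that a generated name is skipped whenever it matches any column name in the input or a name that was already generated.

```python
def rename_duplicates(columns):
    """Return the column names made unique, in positional order.

    Hypothetical illustration of the documented rule: later duplicates of
    `name` become name_2, name_3, ..., skipping numbers already in use.
    """
    present = set(columns)   # every name that appears in the input
    emitted = set()          # names already written to the result
    result = []
    for name in columns:
        if name not in emitted:
            # First occurrence keeps its original name.
            emitted.add(name)
            result.append(name)
            continue
        # Duplicate: find the first free numbered name (assumed behavior).
        n = 2
        candidate = f"{name}_{n}"
        while candidate in present or candidate in emitted:
            n += 1
            candidate = f"{name}_{n}"
        emitted.add(candidate)
        result.append(candidate)
    return result


print(rename_duplicates(["DESC", "DESC_2", "DESC"]))
```

With the input from the table's example, the second DESC column is named DESC_3 because DESC_2 is already present.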

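The auto setting of the Encoding attribute can be pictured as a signature check followed by a fallback. The following Python sketch is only an illustration of the rules listed in the table above, not the product's implementation. The function name guess_encoding and its two flag parameters are hypothetical names for the conditions the table describes.

```python
# File signatures (byte order marks) as listed in the Encoding row.
SIGNATURES = [
    (b"\xef\xbb\xbf", "utf-8"),      # EF BB BF
    (b"\xfe\xff", "unicode-be"),     # FE FF, UCS-2 big endian
    (b"\xff\xfe", "unicode-le"),     # FF FE, UCS-2 little endian
]


def guess_encoding(first_bytes, other_object_handles_unicode=False,
                   script_is_utf8=False):
    """Pick an encoding name for one input file (hypothetical helper).

    A signature at the start of the file wins. Without a signature, the
    documented fallback is latin1 unless Unicode is in play elsewhere.
    """
    for signature, name in SIGNATURES:
        if first_bytes.startswith(signature):
            return name
    if not other_object_handles_unicode and not script_is_utf8:
        return "latin1"
    return "utf-8"


print(guess_encoding(b"\xff\xfea\x00"))
print(guess_encoding(b"plain text"))
```

Because the check runs per file, each file in a multi-file input can resolve to a different encoding, which matches the statement that the signature of each file determines its encoding.
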
Column Grid

The Filein column grid displays the columns from the input files.

Attribute	Description
Name	Displays the name of each input column. This attribute is read-only.
Alias	Defines alternate names for any of the input columns. Spaces before or after an alias column name are ignored. Spaces within an alias column name are acceptable.
Keep Order	Manages the order that columns display in the output data flow. By default, columns that are passed to the next object in the data flow are displayed in the order that they appear in the Name column. You can change this order by typing a number in the Keep Order column. When you assign a Keep Order number, the Keep column is checked automatically. The Keep Order numbers might reorder to accommodate any changes you make.
Keep	Manages which columns are kept in the output data flow. If no columns have a Keep check mark, all columns are kept in the output data flow, except for any explicitly marked Remove. Select the Keep check box for columns you want to explicitly keep in the output data flow. A number is automatically added in the Keep Order column when you select its Keep check box. After marking any column with a Keep check mark, only those marked Keep are kept in the output data flow. NOTE: After any Keep check boxes are checked, do not use the Remove check boxes, because clicking a Remove check box clears all Keep check boxes.
Remove	Manages which columns are removed from the output data flow. Select the Remove check box for columns that you want to explicitly suppress from the output data flow. NOTE: Use the Remove check boxes only when no Keep check boxes are checked.
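
The interaction of Keep, Keep Order, and Remove can be summarized in a small Python sketch. It models the column grid as a list of dictionaries, which is an assumption made for illustration only. The function name output_columns and the keys name, keep_order, and remove are hypothetical and do not correspond to an Integrator API.

```python
def output_columns(grid):
    """Return the output column names for a column grid (hypothetical model).

    Each grid row is a dict with 'name', an optional integer 'keep_order'
    (present when Keep is checked), and an optional boolean 'remove'.
    """
    kept = [row for row in grid if row.get("keep_order") is not None]
    if kept:
        # Once any column is marked Keep, only kept columns pass through,
        # ordered by their Keep Order number.
        ordered = sorted(kept, key=lambda row: row["keep_order"])
        return [row["name"] for row in ordered]
    # No Keep marks: every column passes through in Name order,
    # except the columns marked Remove.
    return [row["name"] for row in grid if not row.get("remove", False)]


grid = [
    {"name": "inv_nbr", "keep_order": 2},
    {"name": "notes"},
    {"name": "inv_date", "keep_order": 1},
]
print(output_columns(grid))
```

In this sketch the Keep marks take precedence, which mirrors the note that Remove check boxes should be used only when no Keep check boxes are checked.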