Dictionary Input Object

The Integrator Dictionary input object (also used by the classic Builder) describes the format of an input or output file. It contains the attributes needed to read and interpret a file or to write one. A Dictionary input object can be defined in a text file, which is given a file extension of dic or dict. A Dictionary text file can be generated by a Fileout Output Object.

If defined internally in the process flow, the Dictionary input object (labeled DICT in the script) describes the format of an input or output file.

Dictionary Attributes

Attribute Type Description
columns Array of Objects Contains an array of sub-objects (described in the table following), one for each column in the input file
delimiter String

Specifies the delimiter that is used to separate columns for variable format files. If not specified, ASCII tab is used. Choices are:

  • space
  • tab \t
  • comma ,
  • semicolon ;
  • pipe |

The delimiter is the first character of this string.

encoding String

Defines how files are read and interpreted in terms of character encoding. Values include:

  • auto—The input object sets the encoding based on the file signature and the Unicode state of other objects in the same task.
  • ascii—The characters in the file are interpreted as ISO-8859-1 or Latin1 characters.
  • gb18030—The file is interpreted as Chinese National Standard 18030-2000 characters. The gb18030 encoding option is supported on Windows platforms only.
  • latin1—The characters in the file are interpreted as ISO-8859-1 or Latin1 characters.
  • utf-8—The file is interpreted as UTF-8 Unicode characters.
  • unicode—The file is interpreted as 2-byte Unicode characters (UCS-2) with native byte swapping, unless overridden by a UCS-2 file signature.
  • unicode-be—The file is interpreted as UCS-2 characters in a big-endian fashion.
  • unicode-le—The file is interpreted as UCS-2 characters in a little-endian fashion.

UCS-2 and UTF-8 files can include a Byte Order Mark (BOM) at the beginning of the file to denote the file encoding. These file signatures are defined as follows:

  • UCS-2 Big Endian: FE FF
  • UCS-2 Little Endian: FF FE
  • UTF-8: EF BB BF

File signatures are common for Unicode files on Windows operating systems. If the file input object reads multiple files, the signature of each file determines its encoding.

If the encoding attribute is auto and no signature is found, the encoding is assumed to be latin1, provided that no other object in the task handles Unicode data and the VI file is not encoded as utf-8 (using the charset 1208 directive). Otherwise, the encoding is assumed to be utf-8.

See also Integrator Unicode Data Support.
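The auto rules described above can be sketched in Python. This is only an illustration of the documented decision order, not Integrator's own code: the function name resolve_encoding and the two boolean parameters are hypothetical stand-ins for "another object in the task handles Unicode data" and "the VI file is encoded as utf-8".

```python
import codecs

# File signatures (BOMs) listed above, mapped to encoding attribute values.
SIGNATURES = [
    (codecs.BOM_UTF8, "utf-8"),            # EF BB BF
    (codecs.BOM_UTF16_BE, "unicode-be"),   # FE FF
    (codecs.BOM_UTF16_LE, "unicode-le"),   # FF FE
]


def resolve_encoding(head, task_handles_unicode=False, vi_file_is_utf8=False):
    """Return the effective encoding for a file read with encoding set to auto.

    head is the first few bytes of the file. The two flags are hypothetical
    inputs that model the task-level conditions described in the text.
    """
    # A file signature, when present, decides the encoding for that file.
    for bom, name in SIGNATURES:
        if head.startswith(bom):
            return name
    # No signature: latin1 only if nothing in the task involves Unicode.
    if not task_handles_unicode and not vi_file_is_utf8:
        return "latin1"
    return "utf-8"
```

For example, resolve_encoding(b"\xff\xfea\x00") yields "unicode-le", while an unsigned file in a task with no Unicode data resolves to "latin1".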

strict_quotes Boolean

If this attribute is set, delimiters found within a quoted string are always treated as part of the quoted string, rather than as the start of a new field in a variable format file. This behavior always applies when the delimiter is a space, a comma (,), or a semicolon (;). By default, Integrator treats other delimiters (such as tabs) as a hard stop for the field, on the expectation that quoting in fields is possibly incorrect.

record_size Integer Contains the width of a record in a fixed-format file. If set, Spectre Build, Builder, or Integrator reads exactly this many bytes per record, regardless of binary data such as newlines or null bytes that may appear in the record.
type String

Describes the type of the input file. It can be "fixed" or "variable". The "fixed" type is only relevant for Filein objects.

NOTE: This attribute is labeled Dict Type in Visual Integrator.
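As a rough illustration of how the delimiter and record_size attributes affect record handling, the following Python sketch reads fixed records by byte count and splits variable records on the first character of the delimiter string. The function names are hypothetical, quoting (strict_quotes) is deliberately ignored, and dropping a short trailing record is an assumption rather than documented behavior.

```python
import io


def read_fixed_records(stream, record_size):
    """Yield records of exactly record_size bytes from a binary stream.

    Newlines and null bytes are ordinary data here, not record separators.
    """
    while True:
        record = stream.read(record_size)
        if len(record) < record_size:
            break  # assumption: an incomplete trailing record is discarded
        yield record


def split_variable_record(line, delimiter=None):
    """Split one line of a variable format file into column values.

    Only the first character of the delimiter string is used; an ASCII tab
    is the default when no delimiter is specified. Quoting is not handled.
    """
    sep = delimiter[0] if delimiter else "\t"
    return line.split(sep)
```

With a record_size of 4, the bytes b"AB\nCDEF\x00" produce the two records b"AB\nC" and b"DEF\x00", which shows the embedded newline being kept as data.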

Columns Attributes (Sub-Objects)

Attribute Type Description
name String Defines the name of the column. It should start with a letter.
start Integer Defines the start column position for fixed format files.
end Integer Defines the ending column position for fixed format files. If neither this attribute nor the length attribute is defined, the column is assumed to be a single character column.
length Integer Defines the length of the column for fixed format files. This attribute overrides the end attribute if both are defined.
packed Boolean If this attribute is "true" and the file format is fixed, the column is interpreted as a number in packed decimal format.
implied or implied_dec Integer Indicates that the column contains an implied decimal point. The defined value determines the number of decimal places in the resulting value.
trim Boolean Controls whether input data is trimmed of leading and trailing white space (including tabs). If this attribute is not present or is "true", leading and trailing spaces are trimmed before the data is processed. If this attribute is "false", the column data is passed to the program in its original format.
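
The column attributes for fixed format files can be illustrated with a short Python sketch. Several details are assumptions, because the table does not state them: start and end are treated as 1-based, inclusive positions, and packed decimal is taken to be the common layout of two digits per byte with a trailing sign nibble (hex D for negative). The function names are hypothetical and do not correspond to anything in Integrator.

```python
from decimal import Decimal


def column_slice(record, col):
    """Return the text of one column from a fixed format record (a str)."""
    start = col["start"]                    # assumed to be 1-based
    if "length" in col:
        width = col["length"]               # length overrides end
    elif "end" in col:
        width = col["end"] - start + 1      # assumed to be inclusive
    else:
        width = 1                           # single character column
    text = record[start - 1:start - 1 + width]
    if col.get("trim", True):               # trimmed unless trim is false
        text = text.strip()
    return text


def apply_implied(text, places):
    """Interpret digits as a number with `places` implied decimal places."""
    return Decimal(text) / (Decimal(10) ** places)


def unpack_decimal(raw):
    """Decode packed decimal bytes: two digits per byte, last nibble is sign."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()
    value = int("".join(str(n) for n in nibbles))
    return -value if sign == 0x0D else value
```

Under these assumptions, a column defined with start 11 and length 7 over the record "ACME      0012345 X" yields "0012345", and applying two implied decimal places gives 123.45.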