String Functions in Integrator

ascii

Returns the integer value of the first character in the string char; inverse of the chr function. On EBCDIC platforms, returns the integer value in the EBCDIC collating sequence.

ascii(char : string) : numeric

For example, assuming an ASCII platform:

ascii("A") returns 65

ascii("Foo") returns 69

ascii(" ") returns 32

capitalize

Capitalizes the first letter of every word or token in the string to uppercase, and changes the remaining characters in each word to lowercase. Single quote characters (’) appearing in a word will be interpreted as an apostrophe, and will not cause the following character to be changed to uppercase.

capitalize(string : string) : string

For example:

capitalize("friends") returns Friends

capitalize("john f. kennedy") returns John F. Kennedy

capitalize("let’s get going") returns Let’s Get Going

capitalize("PLEASE STOP SHOUTING") returns Please Stop Shouting

capitalize("pre-sales") returns Pre-Sales

chr

Returns a single-character string with the character represented by the number value in the current character set (ASCII or EBCDIC).

chr(value : numeric) : string

For example, assuming an ASCII platform:

chr(65) returns A

chr(32) returns space

chr(92) returns \

concat

Concatenates strings together. It takes a variable number of arguments, and concatenates them together. Numbers will be interpreted as strings.

NOTE: In the first example, note the use of the \", \" to insert a comma followed by a blank space between the city and state (e.g. Allentown, PA).

concat(string1 : string, string2 : string, ...) : string

This example takes the Customer City and Customer State and concatenates them together into a new column called Cust CityState, with a comma and space between them.

Copy

object 'PROC' "city" {
    process_type = "calc",
    input = "sales",
    calcs = {
            name = "Cust CityState",
            value = "concat(Customer City, \", \",Customer State)"
        }
    }
};

For some applications, the escape sequence may be quote-quote. For example:

Copy

concat("<a href=""",calc[URL],""">",Info[Summary],"</a>")

equals

Compares two strings and returns an integer. If string1 is equal to string2, this function returns 1. Otherwise, it returns 0. The comparison is case-sensitive.

equals(string1 : string, string2 : string) : numeric

For example:

equals("0.0", "0") returns 1

equals(string("0.0","0")) returns 0

equals(string("0.0"),string("0")) returns 0

equals_ignore_case

Functions like the equals function above, except that string comparison is not case-sensitive.

NOTE: Both equals and equals_ignore_case work for numeric values. In the case of calculated numbers, to allow for inexact floating point calculations, the tolerance is 0.0000001. That is, equals and equals_ignore_case return true (1) if the two numbers are within 0.0000001 of each other.

equals_ignore_case(string1 : string, string2 : string) : string

index

Returns the starting position (index) of string2 in string string1. If string1 does not contain string2, it returns 0. If string2 is null, it returns 1. This function is case sensitive.

index(string1 : string, string2 : string) : numeric

Following are examples of index and the position returned:

The index function can be used with the substr function on an inexact string (or scan).

substr("Boston, MA",1,(index("Boston, MA",",") - 1)) returns Boston

The index function, in conjunction with the sign function, can be used as a substitute for an "IN_String" function.

Sign(Index("ME,MA,CT,VT,NH,RI", State)) if State is TX, 0 is returned; if State is MA, 1 is returned

Function	Result
index("New York, NY","n")	0
index("New York, NY","N")	1
index("New York, NY",",")	9
index("New York, NY", "NY")	11
index("New York, NY", "NV")	0
index("New York, NY","")	1

length

Determines the length of the string string. Returns an integer.

length(string : string) : numeric

For example:

length("four") returns 4

lower

Returns the string string with all uppercase characters converted to lowercase.

lower(string : string) : string

For example:

lower("PLEASE STOP SHOUTING") returns please stop shouting

lpad

Pads the beginning of the given string repeatedly with pad_string until the result is length characters long. If pad_string is omitted, spaces are assumed.

lpad(string : string, length : numeric, pad_string : string) : string

For example:

lpad("foo", 5, "x") returns xxfoo

lpad("James", 10, "+-") returns -+-+-James

lpad("Long string", 5, "x") returns Long string

lpad("bar", 5) returns bar (2 spaces precede bar)

ltrim

Returns string stripped of any characters in trim_chars that appear at the beginning of the string. If trim_chars is omitted, it defaults to spaces.

ltrim(string : string, trim_chars : string) : string

For example:

ltrim(" test") returns test

ltrim("xxxab", "x") returns ab

ltrim("* * **Hello", " *") returns Hello

ltrim("xxxab", "xab") returns (empty string)

map_filename

Maps (i.e. changes) special characters to underscores (_). Any character that is not numeric, alphabetic, ’-’, ’.’,’~’, or ’_’ will be mapped to ’_’. This prevents creating invalid filenames and paths. This function can also be used to strip a field of any special characters.

TIP: Properly handles UTF-8 encoded strings.

map_filename(string : string) : string

For example:

map_filename("mth**01") returns mth__01

map_filename("ma*_*1*") returns ma___1_

map_filename("*bos*2*") returns _bos_2_

match

Checks whether a string appears in a set of other strings, and returns a positive number based on which string it matches. If match_string is equal to string1, this function returns 1; if it is equal to string2, it returns 2. If match_string does not match any string, it returns 0. This function can be used to simplify boolean expressions by matching several strings simultaneously.

match(match_string : string, string1 : string, string2 : string, ...) : numeric

For example:

match("alice", "alice", "bob", "charlie") returns 1

match("fred", "alice", "bob", "charlie") returns 0

match("charlie", "alice", "bob", "charlie") returns 3

if(match(month, "January", "February", "March"), "keep", "discard") if month is March, returns keep

regexp

Performs a search of string based on regular expression, regexp. This function returns 1 (true) if a match is found, and 0 (false) if no match is found or the regular expression is invalid.

regexp(string : string, regexp : string, options : string) : numeric

For example:

regexp("Joe Smo","^jo") returns 0

regexp("Joe Smo","^jo","ignore_case") returns 1

See Regular Expressions in Integrator for more information.

regexp_value

Performs a search of string based on regular expression, regexp. This function returns the portion of the string that matches the given regular expression.

regexp_value(string : string, regexp : string, options : string) : string

For example:

If the Dimension "Company" looks like:

Company
Wholesale, Inc. (182)
Lloyd and Louis Glasses, LLC. (299)
Neverland 123 (0321)
Dumbells & Company ()

regexp_value(Company, "([0-9][0-9]*)") returns

Result
(182)
(299)
(0321)
NULL

The fourth item is null since at least one digit is required for a match.

See Regular Expressions in Integrator for more information.

replace

Replaces occurrences of src_string in string with dest_string; will add or delete characters as necessary.

replace(string : string, src_string : string, dest_string : string) : string

For example:

replace("We had an bogus time", "bogus", "excellent") returns We had an excellent time

replace("A|B|C", "|", "-") returns A-B-C

replace("We can merge words", " ", "") returns Wecanmergewords

replace(text, chr(10), \" \") turns the newlines in the text column into spaces

replace(text, chr(13), \" \") turns the carriage returns in the text column into spaces

reverse

Returns the reversed contents of string.

reverse(string : string) : string

For example:

reverse("abc") returns cba

rpad

Pads the end of the given string repeatedly with pad_string until the result is length characters long. If pad_string is omitted, spaces are assumed. If too short, no padding occurs.

NOTE: The command-line interpreter on UNIX (the shell) treats multiple spaces as a single space. To preserve multiple spaces, the argument needs to be quoted.

rpad(string : string, length L numeric, pad_string : string) : string

For example:

rpad("foo", 5, "x") returns fooxx

rpad("James", 10, "+-") returns James+-+-+

rpad("Long string", 5, "x") returns Long string

rpad("bar", 5) returns bar (bar followed by 2 spaces)

rtrim

Returns string stripped of any characters in trim_chars that appear at the end of the string. If trim_chars is omitted, it defaults to spaces.

rtrim(string : string, trim_chars : string) : string

For example:

rtrim("test ") returns test

rtrim("abxxx", "x") returns ab

rtrim("Hello * -*", " -*") returns Hello

rtrim("xxxab", "xab") returns (empty string)

scan

Parses the string into separate segments based on the set of characters in delimiters, and returns the segment indicated by token_number, with the first segment identified by 1. It will also accept negative segment numbers, and interpret them as segments from the end of the string: -1 would indicate the last segment, -2 would indicate the next-to-last segment. If the number is so negative that it identifies a segment before the first segment, a null string is returned. If delimiters is omitted, it defaults to spaces.

scan(string : string, token_number : numeric, delimiters : string) : string

For example:

scan("first, second, third", 2,",") returns second

scan("first, second,,fourth",4,",") returns fourth

scan("foo bar",4) returns (null string)

scan("foo bar",1) returns foo

scan("foo|bar/baz",3,"|/") returns baz

scan("c:\models\data\sales.mdl", -1, "\") returns sales.mdl

scan("c:\models\data\sales.mdl", -2, "\") returns data

scan("c:\models\data\sales.mdl", -4, "\") returns c:

scan("c:\models\data\sales.mdl", -5, "\") returns (null)

scan("c:/models/data\sales.mdl", -2, "/\") returns data

string

Converts value to a string (if necessary), and makes the return value be evaluated as a string in any subsequent operation. This function controls whether certain functions, like comparison functions, treat their arguments as strings or numbers. Normally, a comparison is done by first performing a numeric comparison by trying to convert both sides into numbers. If that fails, a string comparison is then performed. The string function provides a way to skip the numeric comparison.

string(value : string) : numeric

For example:

equals(string("0.0"),string("0")) returns 0

sin(string("0")) returns a null value (the sin of any string is returns a null value)

string("23") returns 23

string("19") < "2" returns 1

"19" < "2" returns 0

substr

Allows separation of a portion of a string from the entire string. It accepts two or three arguments; start_pos specifies which character in the string to begin at (the first character is 1), and num_chars specifies how many characters from the starting position (this argument is optional). If num_chars is given, up to that number of characters is returned. If it is not included, substring will return all characters in string after the starting position. (A negative starting position is assumed to be 1.)

substr(string : string, start_pos : numeric, num_chars : numeric) : string

For example:

substr("this is a test",2,5) returns his i

substr("this is a test",3) returns is a test

This object takes a substring of the month column (e.g. "M01 January") and puts the first 3 characters of the month name into a new column called "Month Abbr."

Copy

object 'PROC' "month" {
    process_type = "calc",
    input = "sales",
    calcs = {
            {Name = "Month Abbr",Value = "substr(Month,5,3)"}
    }
};

token_count

Counts the number of tokens in string based on the set of characters in delimiters, corresponding to the tokens parsed by the scan() function. If delimiters is omitted, it defaults to spaces.

This function allows the Integrator to find the number of words or tokens in a string. This function, along with the scan() function, allows the Integrator to parse a string, for example, "ak,ny,ma", into separate rows for each state.

token_count(string : string, delimiters : string) : numeric

For example:

token_count("first, second, third", ",") returns 3

token_count("foo|bar/baz", "|/") returns 3

token_count("foo bar") returns 2

token_count(",,,", “,”) returns 4

translate

Translates characters in string that appear in src_chars with the character in the same position in dst_chars; will not remove characters from a string.

translate(string : string, src_chars : string, dst_chars : string) : string

For example:

translate("Mary had a little lamb", " ", "x") returns Maryxhadxaxlittlexlamb

translate("able", "abcde", "bcdea") returns bcla

translate("Dimensional Insight", "DI", "d") returns dimensional Insight

upper

Returns the string string with all case characters converted to uppercase.

upper(string : string) : string

For example:

upper("New York") returns NEW YORK

Regular Expressions in Integrator

Regular Expressions Examples

Character	Description
^$$	will match an empty string.
^.*$$	will match any string.
^This	will match any string beginning with 'This'.
that	will match the string 'that'.
^..$$	will match any string containing exactly two characters.
end$$	will match any string ending with 'end'.
^[0-9]*$$	will match any string only consisting of numbers (including an empty string).
^[0-9][0-9]*$$	will match any string only consisting of numbers (not including an empty string).
\^^^	will match a string that contains '^^^'
^[aeiou]	will match any string starting with a vowel.
\$$[0–9]	will match a numeric string with a leading dollar sign character.
concat(chr(36), $[0–9]	will match a numeric string with a leading dollar sign character.
\$$	will match a dollar sign character.
\0$$	will match 0 at the end of a string.