Classifying Data
Visual Builder Description files end with a Builder output object. The Model Structure tab of the Builder object is where you set the information types. Input columns that do not have an information type setting are ignored and are not included in the model output.
This topic describes the three main information types. Most of the additional object attribute settings are found in the attributes section of the properties pane. This example shows attributes for a Builder object.
For information about additional column settings, see Setting Column Properties and the Builder Output Object Integrator topic.
All data elements that you want in the final model output must be classified as one of the following types of model fields: dimension, summary, or info field.
Any field classified as a dimension becomes one of the dive fields in the ProDiver console. Each dimension represents a different approach to organizing and looking at your data. Some types of data that might qualify as dimensions are Name, Company, Part Number, Month, Hospital, Attending Physician, and Diagnosis Group. Keep the following points in mind when classifying dimensions:
- Builder has a limit of 32 core dimensions per model. A model with many dimensions might take a long time to build and be very large. Dimensional Insight recommends no more than 10 to 15 core dimensions per model when the record count exceeds 256,000.
- After a model is built, the ProDiver client can go beyond the 32-dimension limit by promoting info fields to dynamic dimensions using the Edit > Add/Edit Dimensions command in ProDiver. DiveMaster also allows you to save dynamic dimensions in a DivePlan. See About Promoting Info Fields to Dynamic Dimensions.
- Dynamic dimensions use the indexing of their associated core dimensions. If the lookup for the associated core dimension value is fast, then the dynamic dimension lookup is just as fast. See Core versus Dynamic Dimensions.
- The maximum field length for a single piece of data is 32,767 characters. Keep in mind that ProDiver displays only 512 characters for a dimension value.
- When you build a memory model, core dimensions are not indexed in the same way as when the model is built with the command-line Builder. They are built within the application memory area and are effectively dynamic, so you can define up to 200 dimensions.
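The limits above can be checked against a planned model before you build it. The following is a minimal Python sketch, not part of Builder or ProDiver: the function name, its parameters, and the warning text are hypothetical, and only the numeric limits come from this topic.

```python
# Limits quoted in this topic. Everything else here is a hypothetical helper,
# not a Builder or ProDiver API.
MAX_CORE_DIMENSIONS = 32          # hard limit per model
RECOMMENDED_CORE_DIMENSIONS = 15  # upper end of the 10 to 15 recommendation
LARGE_MODEL_RECORDS = 256_000     # record count at which the recommendation applies
MAX_FIELD_LENGTH = 32_767         # maximum characters in a single piece of data
DISPLAYED_LENGTH = 512            # characters ProDiver displays for a dimension value


def check_dimension_plan(core_dimensions, record_count, longest_value=0):
    """Return a list of warnings for a planned set of core dimensions."""
    warnings = []
    count = len(core_dimensions)

    if count > MAX_CORE_DIMENSIONS:
        warnings.append(
            f"{count} core dimensions exceeds the limit of {MAX_CORE_DIMENSIONS}"
        )
    elif record_count > LARGE_MODEL_RECORDS and count > RECOMMENDED_CORE_DIMENSIONS:
        warnings.append(
            f"{count} core dimensions is above the recommended maximum of "
            f"{RECOMMENDED_CORE_DIMENSIONS} for more than {LARGE_MODEL_RECORDS} records"
        )

    if longest_value > MAX_FIELD_LENGTH:
        warnings.append(
            f"a value of {longest_value} characters exceeds {MAX_FIELD_LENGTH}"
        )
    elif longest_value > DISPLAYED_LENGTH:
        warnings.append(
            f"only the first {DISPLAYED_LENGTH} characters of a "
            f"{longest_value}-character value are displayed"
        )

    return warnings
```

A plan with 20 core dimensions and 300,000 records produces one warning from this sketch, while the same 20 dimensions with 1,000 records produce none.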
Any field classified as a summary field becomes a data element heading in ProDiver, that is, a numeric column in the default tabular. Some types of data that might qualify as a summary are Cost, Revenue, Quantity Sold, Discharges, Surgical Days, and ER Cases. Keep the following points in mind when classifying fields as summaries:
- Mathematical calculations can be performed on these fields because they are numeric.
- These fields may include leading and trailing spaces and plus (+) or minus (-) signs before or after the digits.
- There is no limit to the number of summaries you can define.
- Each summary must be relevant to all dimensions in the model, because summaries are summed over all rows of data and the resulting total must make sense.
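To illustrate the accepted number formats, the following Python sketch normalizes a summary value that has surrounding spaces and a sign either before or after the digits. It is an assumption about how such values can be interpreted when you prepare or inspect input data, not a description of Builder's own parser; the function name is hypothetical.

```python
def parse_summary_value(text):
    """Convert a summary string such as ' 125-' or '+42.5 ' to a float.

    Hypothetical helper: accepts leading/trailing spaces and a single
    plus or minus sign either before or after the digits.
    """
    s = text.strip()
    if not s:
        raise ValueError("empty summary value")

    sign = 1.0
    if s[0] in "+-":
        # Sign before the digits, for example '-7'.
        sign = -1.0 if s[0] == "-" else 1.0
        s = s[1:].strip()
    elif s[-1] in "+-":
        # Sign after the digits, for example '125-'.
        sign = -1.0 if s[-1] == "-" else 1.0
        s = s[:-1].strip()

    # Reject a second sign or a value with no digits left.
    if not s or s[0] in "+-" or s[-1] in "+-":
        raise ValueError(f"not a valid summary value: {text!r}")

    return sign * float(s)
```

For example, `parse_summary_value(" 125-")` returns `-125.0`, and summing the parsed values of `" 10"`, `"5-"`, and `"+2.5 "` gives `7.5`.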
Fields containing extra information related to a dimension should be defined as info fields. Codes or descriptions are typical types of info fields. If an info field is numeric, it can be used in calculations. Keep the following in mind when classifying info fields:
- Info fields can attach to only one dimension. To function correctly, info fields must be unique to their corresponding dimension value.
- For example, in a sales information model, the field Branch Manager can be defined as an info field attached to Branch because there is only one Branch Manager per Branch. By contrast, it would not be acceptable to use Zip Code as an info field attached to a City dimension because many cities have more than one Zip Code. Zip Code could be used as an info field only if it were unique to each value of the dimension, meaning that there was only one per City. Some other types of data that might qualify as info fields are Account Number based on Customer, Description based on Part Number, or Birth Place based on Patient.
- If the data is not as expected and a one-to-one relationship does not exist, Builder makes note of the situation in the build journal, but proceeds to build the model using the first encountered value.
- There is no limit to the number of info fields associated with a dimension. However, the total size of all info fields based on a single dimension should not exceed 32,767 characters; otherwise, an error occurs. Keep in mind that ProDiver displays only 512 characters for an info field.
- Info fields in the model are potential dynamic dimensions. The ProDiver client can go beyond the 32-dimension limit by promoting info fields to dynamic dimensions using the Edit > Add/Edit Dimensions command in ProDiver. DiveMaster also allows you to save dynamic dimensions in a DivePlan. See About Promoting Info Fields to Dynamic Dimensions.
- See Core versus Dynamic Dimensions for help deciding which fields are best as core or dynamic dimensions.
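Before you attach an info field to a dimension, you can check the source data for the one-to-one relationship described above. The following Python sketch is one way to do that outside of Builder. The function name and the sample rows are made up for illustration; the sketch keeps the first value it encounters for each dimension value, which mirrors the behavior this topic describes, and reports any later values that differ.

```python
def find_info_conflicts(rows, dimension, info_field):
    """Check whether info_field has exactly one value per dimension value.

    Hypothetical helper, not a Builder API. Returns a pair:
      first_values: dimension value -> first info value encountered
      conflicts:    dimension value -> set of later, differing info values
    """
    first_values = {}
    conflicts = {}
    for row in rows:
        key = row[dimension]
        value = row[info_field]
        if key not in first_values:
            first_values[key] = value
        elif value != first_values[key]:
            conflicts.setdefault(key, set()).add(value)
    return first_values, conflicts


# Made-up sample rows for illustration only.
rows = [
    {"Branch": "Boston", "Branch Manager": "Lee", "City": "Boston", "Zip Code": "02108"},
    {"Branch": "Boston", "Branch Manager": "Lee", "City": "Boston", "Zip Code": "02116"},
    {"Branch": "Albany", "Branch Manager": "Kim", "City": "Albany", "Zip Code": "12203"},
]

_, manager_conflicts = find_info_conflicts(rows, "Branch", "Branch Manager")
_, zip_conflicts = find_info_conflicts(rows, "City", "Zip Code")

print(manager_conflicts)  # {} : one manager per branch, so this works as an info field
print(zip_conflicts)      # {'Boston': {'02116'}} : more than one Zip Code for a City
```

An empty conflicts result suggests that the field is a reasonable info field candidate for that dimension. A non-empty result points to the values that the build journal would be expected to flag.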