Source Type Details for the Driver
This section explains how each source type is represented as a database in the driver.
Spectre cBases are basically tables. For ODBC they are simply represented as a single table named data.
Note that cBases cannot be used when connecting to a version 7.0 server. As a workaround, you can wrap the cBase in a cPlan and use that instead—you will be able to perform aggregated queries (using GROUP BY), but due to limitations in 7.0 you cannot fetch non-aggregated data from the cBase.
The CALC() SQL function can be used as an aggregate. It uses the default summary type of the column. Numeric columns have the default summary type sum, so for instance if there is an integer column X, and the summary type has not been overridden, then CALC(X) is equivalent to SUM(X).
Spectre Dive scripts describe a process for generating a window table. That window table is the output of the Dive script, and for ODBC that window table is exposed as a table named data. You can still do normal SQL operations on that table (for example, GROUP BY) but keep in mind that those operations see only the window table of the Dive, they do not have access to calcs in the Dive's cPlan, or to columns in the Dive's cBase.
The CALC() function can still be used as an aggregate, but it does not refer to calcs in the Dive's cPlan. Instead it will use the default summary type for the window column. For instance, if a Dive uses the Spectre average() function to compute a column X, then CALC(X) will be equivalent to SUM(X) (because the default summary type for numeric columns is sum)—so it will sum up the averages in column X.
Spectre cPlans can have multiple inputs, and they can define new dimensions and calcs. If a cPlan has multiple inputs, then it does not behave like a single table. Even though queries against a cPlan do not map directly to the traditional database model, it is still useful to be able to perform them using SQL in the ODBC driver. For extract purposes it is also useful to be able to fetch raw data from each of the cPlans inputs.
A cPlan is represented as a database with one or more tables. There will always be a data table, which represents the whole cPlan.
For cPlans, each named input in the cPlan has a corresponding table starting with input_. The input_ tables represent the raw data in its input. A named input is one which has the name= attribute set in the cPlan. For instance, if the cPlan has cbase-input "my.cbase" name="main" then DI-ODBC will expose a table named input_main. Inputs that are not named do not appear as tables in DI-ODBC. When connecting to a version 7.0 server, the input_ tables will not be available.
Queries against input_ tables are processed like queries against a cBase. The cPlan input is treated as a cBase, regardless of its input type (for example, odbc-input).
The CALC() function can still be used as an aggregate, but it does not refer to calcs in the cPlan. Instead it will use the default summary type for the column of the cPlan input.
With a cPlan, the data table represents the cPlan as a whole, and it automatically joins data from all cPlan inputs.
When using GROUP BY, queries against the data table work somewhat like windows in ProDiver: the group columns are like dimensions and aggregate expressions are like summaries. If one of the dimensions is not present in one of the inputs, then you will likely see duplicated values (just as in ProDiver). Also, if a dimension value is not present in one of the inputs, then summaries from that input will be NULL.
With a cPlan, queries that do not have GROUP BY are treated in a special way. In that case, there is no single table to use for the response. But client applications may use queries like SELECT * FROM data LIMIT 1 to get general information about the data. DI-ODBC answers such queries as if the source table has one row, has all columns from all inputs, and the values are the grand total, using the default summary type of those columns. Since the default summary type for string columns is info, this will likely mean that all string columns will just report NULL. This means that non-group queries are not likely to be useful for data analysis, but client applications will still be able to use them for column type and other metadata.
The CALC() function can be used to refer to any calc defined in the cPlan. This includes summary expressions as well as row-level expressions.