SQL Arrays in DB2 for i 7.1
This article covers the process of defining array types, creating arrays both in SQL PL and Java, and manipulating array variables with the new SQL functions.
By Tom Blamer07/01/2010
As business applications get more complex, so do the database schema and the data access code. It's not uncommon that applications need to exchange long lists of values with the database. Imagine lists of account IDs, part numbers, date ranges or any other data type that might be the bread and butter of your application.
Historically, to pass a list of values to or from an SQL procedure, you'd had a few options:
»Use a long list of input and output parameters. Over time, the procedure's parameter list can become unbearably long and difficult to understand.
»Concatenate all the values together into one ad hoc "string" parameter. This requires more code to convert non-character types to character and back, and to construct and deconstruct the string. Unless the data is extremely simple, this approach is vulnerable to unexpected errors cropping up.
»Create a temporary table (perhaps via the DECLARE GLOBAL TEMPORARY TABLE statement) to be used by the procedure, or return the data in a result set. Although this can work well, it's the weakest approach from a performance standpoint.
Array support in DB2 for i 7.1 gives you more options in your procedures and data access code. Arrays can be defined as parameters and variables for SQL procedures and external procedures written in Java. Arrays can be passed from one procedure to another as IN and OUT parameters. DB2 for i 7.1 supports new functions, specific to SQL procedures, which let you easily transform arrays to tables and tables to arrays. By integrating arrays into the relational model, DB2 offers both performance and ease of use for array types.
Finally, release 7.1 will be good news for vendors and developers on other database systems that already support arrays. The array support in DB2 for i 7.1 is compatible with the support offered in DB2 LUW 9.5. The IBM Toolbox for Java JDBC drivers have been updated for IBM i 7.1 to support array parameters through the standard JDBC APIs. The Database Monitor and Visual Explain tools also received enhancements to support application development with arrays.
Creating Array Types
An array data type is a user-defined type (UDT), defined as an array of a built-in data type. Array data types are defined by the CREATE TYPE statement, like other UDTs. Array variables are always associated with an array UDT.
Array data types are defined with a maximum cardinality. The maximum cardinality of an array type defines the maximum number of elements in an array.
CREATE TYPE testArray AS INTEGER ARRAY;
CREATE TYPE countries AS VARCHAR(100) ARRAY;
CREATE TYPE messages AS XML ARRAY;
The overall size of an array is limited to 4 gigabytes of storage. When the maximum cardinality value is omitted in CREATE TYPE, the database will automatically assign one based on the element data type size. Code Sample 1 demonstrates this behavior, and you can run it on your own system:
Unlike arrays in C or similar languages, the maximum cardinality doesn't necessarily define how much memory will be allocated at runtime. Generally, DB2 for i tries to allocate enough memory to store the array elements that have been assigned, and extends that allocation as needed. The maximum cardinality of an array type is enforced when the procedure is run.
DECLARE myArray testArray;
SET myArray = 1;
Trying to add array elements beyond the maximum cardinality will cause DB2 to throw an error with SQLSTATE 2202E:
SQL State: 2202E
Vendor Code: -20439
Message: [SQ20439] Value 999 not valid for array subscript. Cause . . . . . : Value 999 was specified as the subscript of an array but the value is negative or 0, the value is greater than the maximum cardinality of the array, or the value is being used to reference an array element greater than the current cardinality. If the value is -1, the NULL value may have been supplied for the subscript. Recovery . . . : Change the subscript to a valid value. Try the request again.
Using Array Parameters
Arrays are supported as IN, OUT and INOUT parameters. Here's some code for an example procedure that takes an array as input and returns an array for output. The procedure takes as input an array of DATE values and returns a subset of the input array with only the values that fall on a Saturday or Sunday.
For example, if the input dates are a Saturday, a Friday and a Sunday, the procedure will return only the dates that fall on Saturday and Sunday:
Example input: ['2010-04-24', '2010-02-12', '2010-03-14']
Example output: ['2010-04-24', '2010-03-14']
First, an array type needs to be created:
CREATE TYPE dateArray AS DATE ARRAY;
Once the array type is created, any procedures referring to it will need to have the type in the CURRENT PATH, or qualify it with the schema name.
create procedure getWeekends(in myDates dateArray, out weekends dateArray)
-- Array index variables
declare dateIndex, weekendIndex int default 1;
-- Variable to store the array length of myDates,
-- initialized using the CARDINALITY function.
declare datesCount int;
set datesCount = CARDINALITY(myDates);
-- For each date in myDates, if the date is a Sunday or Saturday,
-- add it to the output array named "weekends"
while dateIndex <= datesCount do
if DAYOFWEEK(myDates[dateIndex]) in (1, 7) then
set weekends[weekendIndex] = myDates[dateIndex];
set weekendIndex = weekendIndex + 1;
set dateIndex = dateIndex + 1;
Adding Elements to an Array
You can add elements to arrays using assignment statements:
SET myArray = 100;
SET myArray = null;
SET products[i] = (select sum(pcount) from inventory where code = codes[i]);
The valid range of subindex values is 1 to the maximum cardinality of the array type. The subindex on the left side can be any arbitrary numeric scalar expression.
When an array element at a given subindex is assigned a value, any previous elements in the array that have not yet been assigned are automatically initialized to null.
SET myArray = null;
SET myArray = 102;
-- myInteger will be set to NULL
SET myInteger = myArray;
Unlike other SET <variable> statement types, array elements have to be assigned one by one. Code Sample 2 illustrates the syntax.
All assignments to arrays use SQL storage assignment rules. Any type of assignment error to an array element, such as truncation or rounding, will result in an error being returned to the application instead of a warning.
Cardinality and Maximum Cardinality
The CARDINALITY function takes one array parameter or variable as an argument, and returns a BIGINT value of the count of elements that have been assigned. In other words, the cardinality is the current length of the array.
SET myArray = NULL; -- CARDINALITY(myArray) is NULL
SET myArray = 1; -- CARDINALITY(myArray) is 1
SET myArray = 99; -- CARDINALITY(myArray) is 10
SET myArray = -100; -- CARDINALITY(myArray) is still 10
The MAX_CARDINALITY function returns a BIGINT value of the maximum cardinality of an array, defined by the CREATE TYPE statement.
CREATE TYPE testArray AS INTEGER array;
CREATE PROCEDURE get_max_card BEGIN
DECLARE myArray testArray;
DECLARE max_card int;
-- max_card will be set to 10
SET max_card = MAX_CARDINALITY(myArray);
Array Constructors and TRIM_ARRAY
The array constructor syntax allows you to create an array using a list of expressions or with the results of a single column query. Array constructors can only be used on the right side of an assignment statement.
SET myArray = ARRAY[1, 2, 3, null];
SET weather_data = ARRAY[min_temp, max_temp, avg_temp];
SET emptyArray = ARRAY;
SET employeeList = ARRAY[select empid from employee where dept = 'A00' order by lastname];
It's also worth mentioning the TRIM_ARRAY function, which removes elements from the end of an array. TRIM_ARRAY(a, n) returns a copy of the array "a" with the last "n" elements removed. The value of the second argument must be between 1 and the cardinality of the array. Again, TRIM_ARRAY can only be used on the right side of an assignment statement.
-- remove the last message
SET messages = TRIM_ARRAY(messages, 1);
-- make a copy of the array and remove the last 10 elements
SET sampleCopy = TRIM_ARRAY(samples, 10);
-- myArray becomes an empty array
SET myArray = TRIM_ARRAY(myArray, CARDINALITY(myArray));
Manipulating Array Variables
A few words about how subindexing and null values work:
»Arrays are always one-based. In other words, the range of elements for an array is myArray...myArray[cardinality(myArray)].
»Arrays may be empty or null. An empty array variable has a cardinality of zero. A null array variable has a cardinality of NULL.
SET myArray = null; -- CARDINALITY(myArray) is NULL
SET myArray = ARRAY; -- CARDINALITY(myArray) is 0
»An uninitialized array variable or parameter is equivalent to NULL. »Referencing an array with a null subindex always returns NULL. »Referencing a null array with any subindex expression always returns NULL, even if the subindex value is greater than the array type's maximum cardinality. »In DB2 for i 7.1, subindex values can be an integer type or any compatible type provided it has 0 scale.
Convert Arrays to Tables With UNNEST
The UNNEST function accepts one or more array variables or parameters as input arguments and returns a table with one row for each assigned array element. The count of rows in the table is equal to the greatest cardinality between all the input arguments. Null arrays are treated the same as empty arrays.
If you're familiar with table functions in SQL, UNNEST is somewhere between a VALUES clause and a user-defined table function (UDTF). The database engine doesn't need to actually create a temporary table; it simply uses an abstraction to fetch the array elements as though they were rows of a table.
The short procedure in Code Sample 3 includes an example of using the UNNEST function to return an array as an SQL result set. It also demonstrates a simple array constructor and passing arrays to another procedure.
Calling the procedure will run the UNNEST query and return a result set with the contents of array variable weekends:
The UNNEST function includes an optional WITH ORDINALITY clause. When the WITH ORDINALITY clause is included, the table includes an additional column with the value of the current subindex (ordinality) for each row. Code Sample 4 demonstrates the behavior of UNNEST using an ordinality column and multiple input arrays.
Calling the procedure will return an SQL result set with all the array data, including the ordinality column. If the input arguments to UNNEST don't have equal cardinalities, the shorter array(s) will be "padded out" with null elements.
Convert Tables to Arrays with ARRAY_AGG
Procedures that need to convert query results to array output can make use of the ARRAY_AGG aggregate function. An ARRAY_AGG function can appear in the select list of a scalar subquery on the right side of an assignment statement:
SET bonusList = SELECT ARRAY_AGG(salary * 0.18) FROM EMPLOYEE;
ARRAY_AGG can also be used in the select list of a SELECT INTO statement:
SELECT ARRAY_AGG(salary ORDER BY salary) INTO salaryList FROM EMPLOYEE;
When using ARRAY_AGG, the count of rows in the selection can't be greater than the maximum cardinality of the array type being assigned. This is enforced when the procedure is run.
To put it all together, here's an example of the getWeekends procedure from the beginning of this article, rewritten into a single SQL statement. The input array is queried using an UNNEST and the resulting column is aggregated into an array with the ARRAY_AGG function.
Example input: ['2010-04-24', '2010-02-12', '2010-03-14']
Example output: ['2010-04-24', '2010-03-14']
drop procedure getWeekends2;
create procedure getWeekends2(in myDates dateArray, out weekends dateArray)
select ARRAY_AGG(dateColumn) into weekends
from UNNEST(myDates) as dateTable(dateColumn)
where DAYOFWEEK(dateColumn) in (1, 7);
Using Array Parameters in Java
Array parameters are supported when you use the CallableStatement interface to call a procedure. Arrays of Java objects can be used as input parameters, and output parameters can be converted to arrays of Java objects.
Code Sample 5 shows a client Java program that calls the getWeekends procedure. This example uses classes specific to the IBM Toolbox for Java JDBC driver, and requires jt400.jar to be in the Java classpath.
This doesn't demonstrate all of the new supported functions, but it should be enough to give a starting point. Java can also be used to implement external procedures with array parameters, but an example is out of the scope of this article. A more thorough description of JDBC enhancements in 7.1, including array type enhancements, can be found under the InfoCenter topic "Enhancements to IBM Toolbox for Java JDBC support for IBM i 7.1".
Array type information is available in the QSYS2 database catalog. The QSYS2.SYSTYPES table contains one row for each array type on the system. The METATYPE column contains the value 'A' for array types, 'U' for user-defined distinct types or 'S' for built-in system types.
The following query will select all the SYSTYPES information for all array types on the system:
select * from qsys2.systypes where metatype = 'A'
This query provides a summary of the currently defined array types:
length, ccsid, maximum_cardinality
from qsys2.systypes where metatype = 'A'
More information about the catalog tables can be found under "DB2 for i catalog views" in the IBM i SQL Reference.
Notes for IBM i developers
» There's a new option in the DB2 for i 7.1 QAQQINI parameters. By default, when DB2 for i encounters array references in a query, the query optimizer makes a temporary copy of the array. This was implemented to enforce predictable results for queries. You can imagine if a cursor was opened with an array reference (e.g. WHERE Column1 = myArray), and the array value was changed between two cursor fetches (SET myArray = 'abc'), the query may produce unpredictable results.
The ALLOW_ARRAY_VALUE_CHANGES QAQQINI parameter offers a switch to control this behavior. The default value is '*NO' (i.e., don't allow the array to change and always create a temporary copy).
If your implementation of arrays populates the array elements (either using the SET statement or the ARRAY_AGG aggregate function) and once set, the array elements never change, ALLOW_ARRAY_VALUE_CHANGES can safely be set to *YES. However, the optimizer may still decide to make temporary copies of small arrays.
» The maximum length per element for CLOB, BLOB, DBCLOB and XML arrays in DB2 for i is limited to 1MB.
» Array values can be displayed in debug for SQL procedures, although special rules apply to using arrays in debug commands. More information is available in the SQL Programming manual.
» Database Monitor files include a new type of record to describe array variables in an SQL query. The new record ID is 3011. If you use the STRDBMON command to create a database monitor file and your monitor captures procedures using arrays, you can filter queries with (QQRID = 3011) to see the array information.
This example query highlights some of the new monitor fields for arrays. More detailed information is available under "Database monitor formats" in the IBM i Database performance and query optimization manual.
QQC101 AS array_name, QQI1 AS max_cardinality, QQI2 AS cardinality, QQI3 AS
VARCHAR(QQDBCLOB1, 100) as array_values
from userlib.mydbmon where QQRID =
In the example monitor, the array had been used in an UNNEST, so the index position value (QQI3) is set to zero.
|Array_1||100||4||0||['A00', 'D11', 'D21', 'E11']|
» The SQL Performance Monitor support in System i Navigator is an easy way to analyze array usage with Visual Explain.
Visual Explain in System i Navigator 7.1 uses the new Database Monitor information to provide detailed information about UNNEST and ARRAY_AGG. Figure 1 shows a new icon in VE to explain UNNEST relations.
See the "Navigator monitors" InfoCenter topic to get started with SQL Performance Monitors.
If you've made it this far, you have the information you need to start exploiting array types in DB2 for i 7.1. Array types give you a new tool to implement procedure interfaces for your applications. Array support is just one of several new major SQL features in IBM i 7.1 to provide compatibility with modern database standards.
This article covered the process of defining array types, creating arrays both in SQL PL and Java, and manipulating array variables with the new SQL functions. Arrays can be easily converted to tables using the UNNEST table function, so they can be queried and joined with other tables. The ARRAY_AGG function is a new aggregate function that allows arrays to be created as part of the select list of a query.
The examples in this article cover some typical uses and syntax variations, but as always, the authoritative source is the IBM i Information Center. Whether or not array support is on your list of reasons to upgrade to 7.1, the IBM i integrated database system continues to grow with new enhancements.
Author's Note: Acknowledgements go to John Broich, John Eberhard, Theresa Euler, Scott Forstie and Gina Whitney for reviewing this article. Scott also provided several code examples.
Tom Blamer is a software developer at IBM Rochester. He has worked on the SQL Query Engine for DB2 for i since joining IBM in 2007. More →
Post a Comment
Note: Comments are moderated and will not appear until approvedcomments powered by Disqus