Memories Are Made of This
A quick quiz on RPG data types and storage
For this month’s column, we decided to look at memory. Not memory in the context of Jon forgetting where he’s put the car keys, but memory in terms of how variables are stored within your RPG programs. Why this subject? It’s just that we’ve recently seen several questions on Internet forums that clearly showed a misunderstanding of how RPG stores data within a program. Additionally, we noticed when working with the Open Access for RPG templates supplied by IBM that, among other techniques such as nested Data Structures (DS), they used the ALIGN keyword. Since this is rarely used in RPG, we thought some exploration might be in order. So here’s our quick quiz on RPG data types and storage.
Q: Within a program, where are a file’s fields stored?
A: If your background is in COBOL or perhaps C programming, your answer might be "in the record buffer" and for both of those languages, you’d be correct. But for RPG, the answer is "anywhere the compiler feels like putting it." At least that’s the default behavior.
If you think about it, the notion of a record storage area can’t work for RPG because RPG has traditionally associated any given field name with one specific piece of storage, regardless of how many files that field may be defined in. That’s why RPG is able to let you specify the same field name in a display file and a database (and a print file as well for that matter) and display the correct data on the screen without you needing to explicitly move the data physically from one place to the other.
Under the covers, RPG allocates storage for the field and does the equivalent of an EVAL to copy the data between the record buffer and that storage. The storage allocated for this purpose isn’t neatly arranged in a "record," or DS, or anywhere else that you can directly access it, other than as an individual field. It’s actually extremely dangerous to assume that the storage for FieldB follows that of FieldA simply because they were defined consecutively in the database. We’ve seen cases where programmers relied on this and got a nasty shock when a new version of the compiler stored the fields in a different sequence.
If you want to override RPG’s default behavior and take control of where the fields are stored, you can do so in several ways. The simplest approach is to define an externally described DS based on a file’s record format. Another approach is to simply define the field names in a DS of your choosing without any length or type definitions. This causes the compiler to effectively say, "Oh! So you want me to put it there!" We covered this technique in our “D-Spec Discoveries” article, so we won’t go into more detail here.
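As a rough sketch of the two approaches just described (CUSTMAST is a hypothetical file name, and FieldA and FieldB are assumed to be fields in its record format), the declarations might look something like this. The two DS definitions are alternatives, not meant to be used together, since a field can belong to only one unqualified DS.

```rpgle
      * Hypothetical externally described file; FieldA and FieldB
      * are assumed to be fields in its record format.
     FCUSTMAST  IF   E             DISK

      * Alternative 1: externally described DS. Every field in the
      * record format becomes a subfield, in record sequence.
     D custData      E DS                  ExtName(CUSTMAST)

      * Alternative 2 (instead of alternative 1): name the fields
      * with no length or type. The compiler takes the definitions
      * from the file and stores the fields here, FieldB directly
      * after FieldA.
     D adjacent        DS
     D   FieldA
     D   FieldB
```

Either way, the fields now live in storage whose layout you control, rather than wherever the compiler chose to put them.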
Q: How many bytes of storage does each of the following variables occupy? And what’s the total size of the DS "sizes"?
     D sizes           ds
     D   int1                         3i 0
     D   int2                         5i 0
     D   uInt1                        3u 0
     D   uInt2                        5u 0
     D   binary                       3b 0
     D   packed                       7p 2
     D   zoned                        7s 2
     D   float                        8f
     D   char                        20a
     D   varying                     20a   Varying
     D   varying4                    20a   Varying(4)
A: The answers are 1, 2, 1, 2, 2, 4, 7, 8, 20, 22 and 24 for a total of 93. Get any wrong?
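If you would rather have the compiler mark your answers, the %SIZE built-in function reports the storage a field or DS occupies. The following is a minimal sketch, assuming it is compiled as a small standalone test program; it repeats the quiz DS and displays a couple of the sizes.

```rpgle
     D sizes           ds
     D   int1                         3i 0
     D   int2                         5i 0
     D   uInt1                        3u 0
     D   uInt2                        5u 0
     D   binary                       3b 0
     D   packed                       7p 2
     D   zoned                        7s 2
     D   float                        8f
     D   char                        20a
     D   varying                     20a   Varying
     D   varying4                    20a   Varying(4)
      /free
       // Size of a single subfield: displays 2
       dsply (%char(%size(binary)));
       // Size of the whole DS: displays 93
       dsply (%char(%size(sizes)));
       *inlr = *on;
      /end-free
```

Swap any other subfield name into the first DSPLY to check the remaining answers.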
If you did, let’s start at the beginning and see where you went wrong. Each of the 3i and 3u fields takes one byte. A one-byte integer is defined as 3i because RPG traditionally (but sadly not always) defines the length of a numeric field in terms of the decimal value it can contain. A single byte can contain values in the range -128 to +127 or, in the case of unsigned integers, 0 to 255: in other words, three digits, and hence 3i and 3u. A two-byte integer is defined as 5i because it can contain values from -32,768 to +32,767, and the unsigned 5u from 0 to 65,535. The same principles hold true for four- and eight-byte integers, which are defined as 10i and 20i respectively.
So why does the 3b field occupy two bytes? Because the size of these old-style b(inary) fields is based on the number of bytes needed to hold the value should all of the digits specified reach their maximum value. In our case, this would be 999, and since that won’t fit in a single byte, which can hold only 256 distinct values, another byte is added to accommodate the full value range. As if this weren’t weird enough already, every time you use one of these fields, it’s converted to packed, the required math is performed and the resulting value is converted back to binary. What a waste of energy, and as good a reason as any to NEVER use type b. Even if IBM examples foolishly include it, take no notice and define the field as a type i of the appropriate size.
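To illustrate the swap, here is a small sketch using the hypothetical field names oldCount and newCount. It assumes a standalone field; a two-byte integer (5i) occupies the same storage as the 3b it replaces.

```rpgle
      * Old style: binary field, two bytes, limited to 999
     D oldCount        s              3b 0
      * Preferred: two-byte integer, same storage
     D newCount        s              5i 0
      /free
       // Each DSPLY shows 2
       dsply (%char(%size(oldCount)));
       dsply (%char(%size(newCount)));
       *inlr = *on;
      /end-free
```

Note that the integer version also accepts values up to 32,767, so any logic that relied on the three-digit limit would need its own check.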
For packed fields, there are two digits per byte plus one half-byte or nibble for the sign (this is sometimes spelled nybble or nyble, presumably because those spellings match better with "byte." Who says computer geeks have no sense of humor!). So a 7p field occupies four bytes. In the case of packed fields with an even number of digits (e.g., 6p), an additional half byte is added to round it up, so a 6p occupies the same space as a 7p. For the zoned (or signed numeric) field, the length specifies the number of bytes, so 7s occupies seven bytes. For both zoned and packed fields, no actual decimal point is stored with the data, so the number of decimal places has no impact on the length. Floating-point fields are defined as 8f because they’re eight bytes long. Yes, they break RPG’s rules about numeric fields being defined by the number of digits they can hold. Pity the rules weren’t broken the same way for integers; it would have made more sense.
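Put as arithmetic, a packed field of n digits needs (n + 1) half-bytes, rounded up to a whole number of bytes. The sketch below, with the hypothetical field names p6, p7 and z7, lets the compiler confirm it.

```rpgle
     D p6              s              6p 0
     D p7              s              7p 2
     D z7              s              7s 2
      /free
       // 6 digits + sign = 7 half-bytes, rounded up: displays 4
       dsply (%char(%size(p6)));
       // 7 digits + sign = 8 half-bytes: displays 4
       dsply (%char(%size(p7)));
       // Zoned: one byte per digit: displays 7
       dsply (%char(%size(z7)));
       *inlr = *on;
      /end-free
```

Because the 6p pays for a half byte it cannot use, many shops simply define packed fields with an odd number of digits.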
For basic character fields, the number of bytes of storage occupied is the same as the size of the field; no excuses for getting that wrong! For varying-length character fields, the storage occupied is normally two bytes longer than the field itself, because a two-byte count of the current length is stored in front of the data. If the field is longer than 65,535 bytes, then two bytes aren’t enough to contain the count and four bytes are used. Four bytes are also used if you ask for them by saying VARYING(4). So our character fields were 20, 22 and 24 bytes long respectively.
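The difference between the data a varying field currently holds and the storage it occupies can be seen by comparing %LEN with %SIZE. This is a minimal sketch; the field name "name" is purely illustrative.

```rpgle
     D name            s             20a   Varying
      /free
       name = 'Jon';
       // Current length of the data: displays 3
       dsply (%char(%len(name)));
       // Storage occupied, 20 bytes plus the 2-byte count: displays 22
       dsply (%char(%size(name)));
       *inlr = *on;
      /end-free
```

The storage never shrinks or grows; only the count at the front changes as the content does.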