New and Improved XML-INTO
Namespace support makes the opcode a viable option
When details of version 5 release 4 were announced, many people were surprised to find that RPG had added native support for XML processing in the form of two new opcodes, XML-INTO and XML-SAX. If you’re not familiar with the topic (and the rest of this may not make much sense unless you are), we have written a number of articles on the subject, starting with “A Traditional Approach to a Modern Technology,” which was followed by “More on RPG’s XML Support.” We also covered the subsequent enhancements, delivered via PTF for the V6 release, in “XML-INTO Revisited.” Now that we’re all on the same page in the hymnbook, we can take a look at IBM’s latest enhancements.
Despite the ease of use that this XML support appears to offer, uptake has been relatively slow. This is surprising, given the significant increase in the number of IBM i shops that handle XML data as part of their daily workload. The reason for the slow adoption, as many of you may have discovered for yourselves, is that XML-INTO has lacked namespace support, and namespaces appear in most standard XML documents.
The simplest way to think about namespaces is that they are a means of qualifying element names, in much the same way that qualifying an RPG data structure allows you to have two different versions of the same field in a program. Why is this necessary? Because anybody and everybody can devise XML schemas. XML is, after all, a language designed explicitly for this purpose. Such flexibility raises the possibility, indeed the probability, that two different organizations will decide to use identical element names for different purposes.
For example, consider names such as "value," "quantity," "name" or "address." Not only are these very common identifiers, but some may have multiple meanings depending on context. "name" might refer to the name of a company or of an individual. "address" could refer to a physical location such as "24 Main Street" or to an IP address such as 10.1.1.25. Add to this the fact that XML documents frequently contain other XML documents as the "payload" and you can see the potential for problems.
Namespaces are the mechanism by which these issues are avoided. Typically a namespace has a connection with the company or organization that originated it. For example, our company domain is Partner400.com. If we were to design an XML schema to contain our customer information, we might choose to use the namespace "partner400." Our elements would be identified by placing the namespace, followed by a colon, immediately before the relevant element names. So an element in our document might look something like this:
<partner400:street>61 Kenninghall Cres.</partner400:street>
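To put that element in context, here is a slightly fuller sketch of a document in which two organizations' street elements sit side by side. The "order" wrapper element, the "abc" prefix and both namespace URIs are hypothetical names made up for this illustration. In a complete document, each prefix is tied to its namespace by an xmlns declaration like the ones shown here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: the wrapper element, the "abc" prefix and
     both namespace URIs are invented purely for illustration. -->
<order xmlns:partner400="http://www.partner400.com/ns/customer"
       xmlns:abc="http://www.example.com/abc/shipping">
  <!-- Same local name "street", but qualified by different namespaces -->
  <partner400:street>61 Kenninghall Cres.</partner400:street>
  <abc:street>24 Main Street</abc:street>
</order>
```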
As you can see, there’s no chance that our street element would be confused with a street element from the ABC company’s schema. So while XML provided a solution to the problem, until now RPG had no way of handling it. Namespaces were simply not supported. As a result, many people who should have been able to use XML-INTO were forced to seek other solutions, such as XML-SAX, alternative parsers, or pre-processing the document (for example, folding the namespace into the element name) so that RPG could cope with it. None of these was an ideal solution, and in some cases people either gave up or expended the effort to write their own parsers to work around the inadequacies of the standard feature.
All of that changes with the new RPG runtime PTF. This is currently available for V6 (SI42426) and should shortly be available for V7. It adds several new options to the XML-INTO support specifically designed to deal with namespaces. Let’s take a look at how this is achieved.