DB2 9 delivered major advancements with respect to the CPU efficiency of the IBM DB2 utilities. Tests of the performance of the DB2 9 utilities versus their DB2 V8 counterparts showed the following:
- For LOAD: CPU savings of 5-30%
- For LOAD of a partition of a partitioned table space: CPU savings of 35%
- For LOAD REPLACE of a partition of a table space with non-partitioned indexes, using a dummy input data set (a technique regularly used to quickly "empty out" a partition): CPU savings of up to 70%
- For RUNSTATS INDEX: CPU savings of 30-50%
- For REORG INDEX: CPU savings of 40-50%
- For REORG TABLESPACE and REBUILD INDEX: CPU savings of 5-20%
- For CHECK INDEX: CPU savings of 20-60%
Similar gains were seen for utility job elapsed times. Detailed information on a number of performance tests can be found in chapter 6 of the IBM "red book" titled, "DB2 9 for z/OS Performance Topics."
How were these CPU efficiency and run time improvements, available in DB2 9 conversion mode, achieved? There are three primary factors:
- A block-level interface to the DB2 index manager. The most significant CPU and elapsed time reductions are associated with utilities that process index keys. Whereas previously a call to the index manager was necessary for each key involved in a utility execution, with DB2 9 (and 10) blocks of keys can be passed with an index manager call. The resulting reduction in required index manager calls reduced CPU overhead for index-intensive utilities.
- More efficient generation of index keys. This comes into play especially for non-padded indexes with varying-length keys.
- Exploitation of shared private storage. z/OS 1.7 provided a new type of virtual storage resource known as shared private storage. With the availability of this resource (configured via the HVSHARE parameter of the IEASYSxx member of PARMLIB), address spaces can register to use a Virtual Storage Object (VSO) that is created just for those address spaces (the VSO is thus part of those address spaces' virtual storage, but not part of other address spaces' virtual storage -- this in contrast to ECSA, a shared resource that is part of every address space). A DB2 9 (or 10) utility address space can use a VSO (which is located above the 2 GB "bar" in virtual storage) to exchange data rows with the DB2 database services address space (aka DBM1) in a way that does not require the use of z/OS cross-memory services. That, in turn, reduces the CPU cost of utility execution.
While the big splash in terms of IBM DB2 utility performance was made by DB2 9, DB2 10 added some utility performance enhancements of its own, to wit (and you can get the details in chapter 9 of the IBM red book, "DB2 10 for z/OS Performance Topics"):
- COPY utility exploitation of FlashCopy functionality. DB2 10's COPY utility can utilize FlashCopy to make near-instantaneous (from a data availability perspective) backups of individual database objects (this capability is also available for inline image copies generated through the execution of other utilities, such as LOAD and REORG). The object-level, DB2-directed FlashCopy backup functionality provided with DB2 10 builds on the volume-level FlashCopy exploitation delivered via the BACKUP SYSTEM utility introduced with DB2 V8 (and significantly enhanced with DB2 9). Object-level backups created using FlashCopy can reduce host CPU consumption, particularly when used for larger objects.
- zIIP exploitation for RUNSTATS. A major portion of the processing done by a RUNSTATS execution in a DB2 10 system can be offloaded to zIIP engines (the percentage of zIIP offload will vary based on the nature of the work done by RUNSTATS). As just about everyone knows by now, zIIP MIPS are less expensive than general-purpose-engine MIPS.
- Page-level versus row-level RUNSTATS sampling. People commonly specify, on a RUNSTATS utility control statement, a sampling percentage -- this to reduce the CPU consumption of RUNSTATS execution. Basically, this told RUNSTATS to look at a percentage of rows in the table -- versus all rows -- in generating statistics for the catalog. The thing is, in looking at a relatively low percentage of a table space's rows (25% is the default SAMPLE value), RUNSTATS might have to read a large percentage of a table space's pages. The new sampling capability, specified via the TABLESAMPLE SYSTEM option, can result in RUNSTATS examining significantly fewer of a table space's pages during execution, thereby reducing CPU time for an execution of the utility.
- Online REORG use of list prefetch for disorganized indexes. When a DB2 10 online REORG job reorganizes an index that is disorganized (and this is particularly applicable to a REORG INDEX job, or to a partition-level REORG TABLESPACE job when the associated table has non-partitioned indexes), it can use list prefetch to efficiently access the index leaf pages in logical order when unloading the index to the shadow data set. This technique avoids the large number of synchronous index pages reads that might otherwise be required for the index unload, with associated CPU and elapsed time benefits.
- Exploitation of data buffering enhancements for BSAM files. By utilizing more buffers for BSAM data and page-fixing these buffers (exploiting enhancements delivered in z/OS 1.9 and 1.10), DB2 10 utilities using BSAM can run faster (COPY and UNLOAD) and use less CPU time (COPY, UNLOAD, LOAD, and RECOVER) than in previous-release DB2 systems.
- Variable-blocked spanned record support for LOAD and UNLOAD. I blogged about this a few weeks ago. Big-time performance improvement for LOAD and UNLOAD of tables containing LOB (large object) data.
- LOAD and UNLOAD of data in internal format. By eliminating column-level processing, this option (retrofit to DB2 9 via the fix for APAR PM19584) can substantially reduce CPU and elapsed time for some LOAD and UNLOAD jobs.
- PRESORTED option for LOAD. This new option (also retrofit to DB2 9 via the aforementioned APAR PM19584) eliminates LOAD-related sort of input data when that data has already been sorted in clustering key sequence prior to utility execution. Bypassing that sort operation can result in reduced CPU and elapsed time for LOAD utility execution.
On top of all this, the IBM DB2 utilities, which by default use DFSORT for sort processing, benefit from performance enhancements delivered by the DFSORT development team. An example is the performance boost -- especially as it pertains to CPU consumption -- provided by the fix for DFSORT APAR PM18196.
So, having laid out all these recent IBM DB2 utility performance benefits, I have two questions for you:
- How fast, and how CPU-efficient, do your DB2 utility operations need to be, and can IBM's DB2 utilities meet those requirements? As noted in the opening paragraph of this blog entry, if you haven't put IBM's DB2 utilities to the test in a DB2 9 or DB2 10 environment, and if performance has been the reason for your organization's use of other vendors' DB2 utility products, it's time to give the IBM utilities another look. You could end up helping to reduce your company's mainframe software costs (especially if your organization is already licensed for IBM's DB2 Utilities Suite for z/OS, which is often the case).
- If IBM's DB2 utilities, even in a DB2 9 or DB2 10 environment, don't quite clear your performance bar, would they if you also had the IBM DB2 Sort product? I blogged about DB2 Sort a couple of weeks ago. DB2 Sort can take the performance of the IBM DB2 utilities to another level, and the added cost of DB2 Sort could still mean software cost savings for your organization in light of what your company may be paying for another vendor's DB2 utility products.
Let IBM's DB2 utilities perform for your company. Could be a good move on your part.