Monday, January 14, 2013

Reproduce a Dynamic, Client-Server Production DB2 for z/OS Workload in Test with Optim Query Capture and Replay

Effective regression testing, and comparative application testing in general, depends largely on your ability to reproduce in a test environment a workload that closely mimics the production application workload of concern. Historically, the difficulty of reproducing a production workload in test has depended on the nature of that workload. Back when I started in IT (Ronald Reagan was in his first term as President of the USA), the focus of mainframe application performance testing efforts was often on batch workloads, and reproducing that type of production workload in test was not so tough: you had your input files, you had your batch jobs that were submitted in a certain sequence and with a certain timing through a job scheduling tool, you ran the jobs and measured the results against baseline data, and that was that.

Batch is still an important part of mainframe computing, but over time the emphasis at many DB2 for z/OS sites has shifted to transactional workloads; moreover, the nature of transactional workloads -- especially during the past 10 years or so -- has changed, with multi-tiered, network-attached, DRDA-using, client-server applications coming to the fore. On top of that, these modern transactional applications are much more dynamic than their forerunners. They are not driven by in-house staff performing clerical and data-entry functions via a "green screen" interface; instead, in many cases the end-users are external to the organization -- maybe consumers browsing online product catalogs and making purchase decisions, or perhaps employees of client companies in a business-to-business context, checking on orders or reviewing fulfillment history. If the end-users are internal to the organization, increasingly they are not clerical workers; rather, they are senior professionals, managers, and executives using analytics-oriented applications to improve speed and quality-of-outcome with respect to decision-making. The actions of these individuals, and the frequency and timing of their interactions with your DB2 for z/OS subsystem, are often hard to predict. For testing purposes, how do you get your arms around that?

And getting your arms around that particular kind of application testing scenario is becoming more and more important. If an environmental change (e.g., new system software releases) or an application modification is going to negatively impact performance from the end-user perspective, you REALLY want to catch that before the change goes into production. Elongated response time for in-house clerical staff is one thing, but poor performance affecting an external-to-the-organization end user can lead to lost business and, perhaps, long-term loss of customers (as the now-familiar adage goes, your competition is often just a click away). If performance degrades for an internal-use decision support application, likely as not it won't be a DBA getting calls from irate users -- it'll be directors and VPs and maybe your CIO getting calls from their peers on the business side of the organization.

In short, the challenge is tougher than it's been before, and the stakes are higher than they've been before. Gulp.

Fortunately, a recently released IBM tool addresses this need very nicely. It's called IBM InfoSphere Optim Query Capture and Replay for DB2 on z/OS, and it became available just a couple of months ago. The fundamentals of what Optim Query Capture and Replay can do are spelled out in the product's name: it can capture a DDF application workload executing in your production DB2 for z/OS environment and play that captured workload back in a test environment, with the same SQL statements, the same values, and the same volume and timing, through the same number of connections to the DB2 subsystem.

This capture and replay capability by itself would come in very handy for things like regression testing, but that's not where the story ends. Suppose you want to see what would happen to response times if transaction volume were to increase by some amount? No problem: not only can Optim Query Capture and Replay play back a captured workload -- it can play it back at a higher speed; so, instead of, say, the 100 transactions per second seen in production for a client-server application workload, you could see how response times hold up at 150 transactions per second.
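To make the "play it back at a higher speed" idea concrete, here is a minimal Python sketch of the arithmetic involved. It is an illustration only, not the product's interface or implementation: the `CapturedStatement` record, `replay_schedule`, and `arrival_rate` are hypothetical names invented for this example. The assumption is simply that each captured statement carries a time offset from the start of the capture, and that replaying at a speed factor of 1.5 means dividing every offset by 1.5 while keeping statement order and connection assignments intact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapturedStatement:
    """Hypothetical record for one captured SQL statement."""
    offset_s: float      # seconds elapsed since the capture started
    connection_id: int   # connection the statement arrived on
    sql: str


def replay_schedule(captured, speed=1.0):
    """Return (offset_s, connection_id, sql) tuples compressed by `speed`.

    speed=1.0 reproduces the original timing; speed=1.5 issues the same
    statements, in the same order, in two-thirds of the original time.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    ordered = sorted(captured, key=lambda s: s.offset_s)
    return [(s.offset_s / speed, s.connection_id, s.sql) for s in ordered]


def arrival_rate(schedule):
    """Average statements per second between first and last arrival."""
    if len(schedule) < 2:
        raise ValueError("need at least two statements to compute a rate")
    span = schedule[-1][0] - schedule[0][0]
    if span <= 0:
        raise ValueError("statements must span a positive time interval")
    return (len(schedule) - 1) / span


if __name__ == "__main__":
    # 101 statements spread evenly over one second: 100 per second.
    workload = [
        CapturedStatement(i / 100, i % 4, "SELECT 1 FROM SYSIBM.SYSDUMMY1")
        for i in range(101)
    ]
    for factor in (1.0, 1.5):
        rate = arrival_rate(replay_schedule(workload, speed=factor))
        print(f"speed {factor}: about {rate:.0f} statements per second")
```

Under these assumptions, a workload captured at 100 statements per second arrives at roughly 150 per second when replayed with a factor of 1.5, which is the scenario described above.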

Speaking of response time, Optim Query Capture and Replay provides built-in reporting capabilities that help you to easily zero in on changes between baseline and replay test results.
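For readers who want a feel for what a baseline-versus-replay comparison boils down to, here is a small Python sketch. It does not represent the tool's built-in reports; the `compare_runs` function, the input layout (a dictionary mapping a statement identifier to a list of elapsed times in seconds), and the 10 percent threshold are all assumptions made for illustration.

```python
from statistics import mean


def compare_runs(baseline, replay, threshold_pct=10.0):
    """Compare average elapsed time per statement between two runs.

    `baseline` and `replay` map a statement identifier to a list of
    elapsed times in seconds (a hypothetical layout). Returns tuples of
    (statement, baseline_avg, replay_avg, change_pct, regressed),
    sorted so the largest slowdown comes first. Statements missing from
    either run, or with a zero baseline average, are skipped.
    """
    rows = []
    for stmt in baseline.keys() & replay.keys():
        base_avg = mean(baseline[stmt])
        new_avg = mean(replay[stmt])
        if base_avg == 0:
            continue  # a percentage change is undefined here
        change_pct = (new_avg - base_avg) / base_avg * 100.0
        rows.append((stmt, base_avg, new_avg, change_pct,
                     change_pct > threshold_pct))
    return sorted(rows, key=lambda r: r[3], reverse=True)


if __name__ == "__main__":
    baseline = {"Q1": [1.0, 1.0], "Q2": [2.0], "Q3": [4.0]}
    replay = {"Q1": [2.0, 2.0], "Q2": [2.0], "Q3": [3.0]}
    for stmt, base, new, pct, regressed in compare_runs(baseline, replay):
        flag = "REGRESSED" if regressed else "ok"
        print(f"{stmt}: {base:.2f}s -> {new:.2f}s ({pct:+.1f}%) {flag}")
```

Sorting by percentage change puts the statements that slowed down the most at the top, which is usually where you want to start looking.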

What's more, Optim Query Capture and Replay can be used to invoke the IBM DB2 Cloning Tool to make a copy of a DB2 subsystem for testing purposes.

Oh, and I would be remiss if I failed to tell you that Optim Query Capture and Replay is not just about comparative workload testing. It's also a great tool for helping you to better understand a DB2 client-server application workload. Often, when it comes to these very dynamic, shape-shifting transactional applications, people want to get a better look at the trees in the forest. What SQL statements are being executed? What predicate values are being supplied by users? What columns are being retrieved? We are familiar with the idea of taking a "snapshot" of a database, but taking a snapshot (more accurately, a time slice) of a DDF workload seemed implausible -- until now. And why stop with just a better understanding of a client-server application workload? How about tuning it? The SQL statements in a workload captured by Optim Query Capture and Replay can be exported for analysis and tuning -- something you might do with a tool such as IBM's InfoSphere Optim Query Workload Tuner.

Now, all this good stuff would be less appealing if it came with too great a cost in terms of system overhead, so it's nice to know that Optim Query Capture and Replay has a pretty small footprint. This is true largely because the tool employs a "catch and throw" mechanism (more like "copy" than "catch") to send statements associated with a workload being captured to an external appliance, from which the workload can be replayed; thus, the tool does not rely on relatively expensive performance trace classes to get the statement-level data that it records.

There you have it: a way to efficiently capture what may have appeared to you as an elusive workload, and to effectively use that captured workload for regression testing, "what if?" testing, and application SQL analysis. Check out Optim Query Capture and Replay, and get ready to go from, "My gut tells me..." to, "Here are the numbers." 
