Regression Analyzed, Tout de Suite: Leveraging Oracle 12c Database Real Application Testing (RAT) Suite, Part 2
Posted by Quest Customer Learning Team
Last updated 10/13/19
By Jim Czuprynski ◾ Nick Marcovecchio, Editor
To recap where Part 1 of this article series ended, we finished using Database Workload Capture to collect two different workloads from an Oracle 11.2.0.3 database (one OLTP workload and one DSS workload) and saved the resulting shadow capture files for eventual transfer to a new platform for Database Workload Replay testing and evaluation. The results of several queries against statistics for those Database Workload Capture sessions are available here.
Replaying a Workload: Preparations
To prepare for Database Replay operations against the new database system — an Oracle Database running the latest beta version of Release 12cR2 — I transferred the generated Database Workload Capture files via SFTP to two separate directories on the new destination database’s server. I also created a separate directory to hold just the results of the consolidated workloads’ execution and any Database Replay activity.
I then set up the destination database environment for replay using the code shown in Listing 1 to create corresponding directory objects for the DSS and OLTP workloads.
Listing 1: Preparing for Database Workload Replay
Note that I also created a new Oracle database user account named WRC that I’ll use to invoke multiple Workload Replay Client sessions. Using this separate account is a recommended best practice to make it easier to isolate any session(s) that are only involved in replaying workloads against the destination database.
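As a rough sketch of the kind of setup Listing 1 performs, the statements below create the directory objects and the WRC account; the directory object names, file system paths, password, and privileges are all hypothetical and should be adapted to your environment.

-- Hypothetical directory objects pointing at the transferred capture files
-- and at the location that will hold consolidated replay results
CREATE DIRECTORY dss_capture_dir  AS '/u01/app/oracle/rat/dss_capture';
CREATE DIRECTORY oltp_capture_dir AS '/u01/app/oracle/rat/oltp_capture';
CREATE DIRECTORY replay_results   AS '/u01/app/oracle/rat/replay_results';

-- Dedicated account for the Workload Replay Client (WRC) sessions
CREATE USER wrc IDENTIFIED BY Replay123;
GRANT CREATE SESSION TO wrc;
-- The replay client account needs enough privilege to drive the replayed
-- sessions' work; granting DBA is a common shortcut on a disposable test
-- system, but grant only what your own tests actually require
GRANT DBA TO wrc;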
Combining Database Replay Workloads
Next, I’ll combine the OLTP and DSS workloads into a single workload schedule using the BEGIN_REPLAY_SCHEDULE procedure of package DBMS_WORKLOAD_REPLAY, introduced in Oracle 12cR1. The code that combines them is shown in Listing 2.
Listing 2: Combining OLTP and DSS Workloads for Database Workload Replay
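As a rough sketch of the consolidation steps that Listing 2 carries out, assuming the documented DBMS_WORKLOAD_REPLAY signatures and the hypothetical directory objects from the earlier sketch (only the schedule name TEST2DESTRUCT is taken from the article):

DECLARE
  dss_cap_id  NUMBER;   -- ADD_CAPTURE returns an ID for each capture in the schedule
  oltp_cap_id NUMBER;
BEGIN
  -- Prepare both captures for replay on the destination database
  DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE(capture_dir => 'DSS_CAPTURE_DIR');
  DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE(capture_dir => 'OLTP_CAPTURE_DIR');

  -- All consolidated replay metadata and results are written here
  DBMS_WORKLOAD_REPLAY.SET_REPLAY_DIRECTORY(replay_dir => 'REPLAY_RESULTS');

  -- Define the consolidated schedule and add both captures to it
  DBMS_WORKLOAD_REPLAY.BEGIN_REPLAY_SCHEDULE(schedule_name => 'TEST2DESTRUCT');

  dss_cap_id := DBMS_WORKLOAD_REPLAY.ADD_CAPTURE(
                    capture_dir_name    => 'DSS_CAPTURE_DIR',
                    start_delay_seconds => 0,
                    stop_replay         => FALSE,
                    take_begin_snapshot => TRUE,
                    take_end_snapshot   => TRUE,
                    query_only          => FALSE);

  oltp_cap_id := DBMS_WORKLOAD_REPLAY.ADD_CAPTURE(
                    capture_dir_name    => 'OLTP_CAPTURE_DIR',
                    start_delay_seconds => 0,
                    stop_replay         => FALSE,
                    take_begin_snapshot => TRUE,
                    take_end_snapshot   => TRUE,
                    query_only          => FALSE);

  DBMS_WORKLOAD_REPLAY.END_REPLAY_SCHEDULE;
END;
/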
Let’s break down exactly what this step accomplished:
- The PROCESS_CAPTURE procedure prepares both the DSS and OLTP workloads, which were recorded against an 11.2.0.3 database, to be replayed against the destination 12.2.0.1 database.
- The SET_REPLAY_DIRECTORY procedure defines exactly where the results of the Database Workload Replay activity will be recorded.
- The call to BEGIN_REPLAY_SCHEDULE defines the Database Workload Replay schedule named TEST2DESTRUCT which includes the DSS and OLTP workloads captured against the source system.
- The ADD_CAPTURE function defines several key aspects of the impending Database Workload Replay session for both workloads:
  - CAPTURE_DIR_NAME defines the location of the directories that contain the existing DSS and OLTP workload capture files.
  - Setting START_DELAY_SECONDS to a value of zero (0) specifies that the workload sessions should begin without any delay. Note that it’s also possible to delay the start of a replay until some time has elapsed, for example, to let the database reach a desired steady state.
  - Setting STOP_REPLAY to FALSE means that the replay should continue until all captured workloads have been consumed completely.
  - The TAKE_BEGIN_SNAPSHOT and TAKE_END_SNAPSHOT settings specify that an AWR snapshot should be taken just before the beginning and just after the end of the consolidated workload activity.
  - Setting QUERY_ONLY to FALSE specifies that the database should replay all DML statements as well as all queries. If I had set this to TRUE, only read-only queries would have been replayed.
- Finally, the END_REPLAY_SCHEDULE procedure closes out the schedule definition.
Initializing a Consolidated Database Workload Replay
Now that I’ve finished building my Consolidated Replay schedule, my next step is to make sure my destination database is ready to accept that workload schedule. For these tests, I simply reset my database by reloading the data into the AP schema via a series of stored procedures that generated exactly the same data that was present in the source database before I began workload capture.
Of course, in the real world and for a larger database, I could use any number of methods to reset my destination database, including:
- Rewinding the database back to a prior point in time using FLASHBACK DATABASE.
- Employing a snapshot standby version of the original source database.
- Restoring the database from earlier backups using Recovery Manager (RMAN).
- Reloading the database completely via DataPump Import.
To initialize the database for a consolidated replay, I used the code in Listing 3. Procedure INITIALIZE_CONSOLIDATED_REPLAY assigns the task name of DBR_CONS_100 to the TEST2DESTRUCT replay schedule created earlier. Note the use of the REMAP_CONNECTION procedure, which remaps the connections that user sessions used against the source database during the original Database Workload Capture session to just two database services (OLTP and DSS) for the consolidated Database Workload Replay to come.
Listing 3: Initializing Database Workload Replay
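A rough sketch of what Listing 3 accomplishes, reusing the hypothetical names from the earlier sketches; the connection IDs and connect strings in the REMAP_CONNECTION calls are purely illustrative, and the real values come from DBA_WORKLOAD_CONNECTION_MAP and your destination service definitions.

BEGIN
  -- Tie the replay task DBR_CONS_100 to the TEST2DESTRUCT schedule
  DBMS_WORKLOAD_REPLAY.INITIALIZE_CONSOLIDATED_REPLAY(
      replay_name   => 'DBR_CONS_100',
      schedule_name => 'TEST2DESTRUCT');

  -- Remap captured connections to the two destination services.
  -- Connection IDs and connect strings below are hypothetical; query
  -- DBA_WORKLOAD_CONNECTION_MAP to see what was actually captured.
  DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION(
      connection_id     => 1,
      replay_connection => 'dest-host:1521/OLTP');
  DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION(
      connection_id     => 2,
      replay_connection => 'dest-host:1521/DSS');

  -- The replay must be placed into the PREPARE state before the WRC
  -- clients shown in Listing 4 can attach to it
  DBMS_WORKLOAD_REPLAY.PREPARE_CONSOLIDATED_REPLAY;
END;
/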
Starting a Consolidated Database Workload Replay
Next, I’ll start up two Workload Replay Client (WRC) sessions using the script shown in Listing 4. Note that I’m using the WRC user account to log into the destination database so that any additional activity generated by the WRC sessions themselves can easily be distinguished from the workloads being replayed.
Listing 4: Starting Workload Replay Client Sessions
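A rough sketch of such a script on Linux, with the connect identifier, password, and replay directory path all hypothetical:

#!/bin/bash
# Start two Workload Replay Client sessions in the background, both
# connecting as the dedicated WRC account and pointing at the directory
# that holds the processed, consolidated replay files
wrc wrc/Replay123@db122 mode=replay replaydir=/u01/app/oracle/rat/replay_results &
wrc wrc/Replay123@db122 mode=replay replaydir=/u01/app/oracle/rat/replay_results &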
Now that the Workload Replay Client sessions are running, I’ll start consuming the captured workloads by kicking off the Consolidated Replay with a call to procedure START_CONSOLIDATED_REPLAY, as shown in Listing 5. Note that setting START_REPLAY_TIMEOUT to FALSE turns off the default action of the WRC, which is to terminate any user call that may inadvertently cause a replay to run slowly or possibly cause it to hang.
Listing 5: Starting a Consolidated Workload Replay
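A rough sketch of that step; the replay timeout behavior discussed above is assumed here to be controlled through SET_REPLAY_TIMEOUT, since the original listing’s exact calls aren’t reproduced in this text.

BEGIN
  -- Disable the replay timeout so that slow calls are not terminated
  -- automatically during the replay (assumed equivalent of the setting
  -- discussed above)
  DBMS_WORKLOAD_REPLAY.SET_REPLAY_TIMEOUT(FALSE);

  -- Kick off the consolidated replay; the waiting WRC sessions from
  -- Listing 4 now begin consuming the captured workloads
  DBMS_WORKLOAD_REPLAY.START_CONSOLIDATED_REPLAY;
END;
/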
Monitoring Consolidated Database Workload Replay Progress
As the Consolidated Database Workload Replay ran, I monitored its progress through several panels in Oracle Enterprise Manager 13c Cloud Control (EM13cCC). As its name suggests, the Database Replay panel (see Figure 1) displayed the overall progress of the ongoing Database Replay operation, including a progress bar comparing the time it took to play back the original workloads with the elapsed time of the current consolidated replay. It also provided tallies of any regression observed so far between the captured and replayed workloads, for example, a variance in the number of rows queried or modified, or in the number of errors observed.
Figure 1: Database Workload Replay Progress Monitoring
Figure 2 shows a snapshot of the Database Performance Monitoring panel as the workload was replayed. Note that the bottom half of this panel allowed me to see how the different database services were consuming the captured workload and replaying it in the new environment.
Figure 2: Database Workload Replay: Database Performance Monitoring
Finally, I kept an eye on my database’s Database In-Memory (DBIM) memory utilization using one of the newest features of EM13cCC, the In-Memory Central panel. Figure 3 shows the results of monitoring that aspect of the consolidated replay; it reflects the fact that during this particular run of Consolidated Workload Replay, I had enabled population of several key tables into the In-Memory Column Store (IMCS) so that I could evaluate the impact of enabling Database In-Memory features as part of my beta-testing activities for Release 12cR2.
Figure 3: Database Workload Replay: Monitoring In-Memory Area Via In-Memory Central
Analyzing Consolidated Database Workload Replay Results
After the Consolidated Database Workload Replay completed its execution, I captured some statistics from the results via the queries in Listing 6; their output is shown in Listing 7.
Listing 6: Tracking Consolidated Workload Replay Progress
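Queries along the following lines, run against the standard replay data dictionary views, return comparable tracking information; the column choices here are illustrative rather than a reproduction of Listing 6.

-- Overall status and timings for each Database Workload Replay run
SELECT id, name, status, start_time, end_time
  FROM dba_workload_replays
 ORDER BY id;

-- How many Workload Replay Client threads are currently attached
SELECT COUNT(*) AS wrc_threads
  FROM v$workload_replay_thread;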
Listing 7: Consolidated Workload Replay Statistics
In actuality, these reports show the results from several tests I had run over a period of 24 hours with varying results, but I’ll focus on the latest run (#167) in the following sections as we analyze the results of the consolidated replay activity. Listing 8 shows the code I used to report against that last Consolidated Database Workload Replay run. Note that the output from that report (Listing 9) confirms that a consolidated replay task occurred, that it ran against an Oracle 12cR2 database, and that the workloads were originally captured from an 11.2.0.3 database. Of course, this information is also available from the Database Workload Replay panel in HTML format.
Listing 8: Workload Replay Report Generation
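A rough sketch of generating such a report, assuming that #167 is the replay ID recorded in DBA_WORKLOAD_REPLAYS; the text format is used here only so the output can be printed inline, and the HTML format works the same way.

SET SERVEROUTPUT ON
DECLARE
  v_report CLOB;
BEGIN
  -- Build the Workload Replay report for replay ID 167
  v_report := DBMS_WORKLOAD_REPLAY.REPORT(
                  replay_id => 167,
                  format    => 'TEXT');
  -- Print just the first few thousand characters; spool or write the
  -- full CLOB to a file to capture the complete report
  DBMS_OUTPUT.PUT_LINE(DBMS_LOB.SUBSTR(v_report, 4000, 1));
END;
/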
Listing 9: Workload Replay Report Output
Now let’s take a deeper look into whether this new configuration of my database was able to handle the consolidated workload that was replayed during this Database Workload Replay task. The easiest way to tackle that question is with a tool we’ve had since Oracle 11gR1: an Automatic Workload Repository (AWR) Compare Periods Report.
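For reference, a Compare Periods report like the one excerpted in the figures below can be produced interactively with the $ORACLE_HOME/rdbms/admin/awrddrpt.sql script or directly from SQL, as in this sketch; the DBIDs and snapshot IDs are placeholders, and the capture-side AWR data is assumed to have been exported from the source database and imported on the destination.

-- Compare the AWR period that bracketed the original captures with the
-- period that bracketed the consolidated replay
SELECT output
  FROM TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_DIFF_REPORT_HTML(
         1111111111, 1, 101, 102,     -- capture-side DBID, instance, begin/end snapshots (placeholders)
         2222222222, 1, 501, 502));   -- replay-side DBID, instance, begin/end snapshots (placeholders)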
Figure 4.1 shows some interesting initial results: The consolidated workload appears to have completed with significantly diminished Database Time (285.2 minutes vs. 191.9 minutes), a 32.7 percent improvement. Also, it shaved almost 21 percent off of the original elapsed time for the two separate captured workloads. So far, so good — we’re not seeing any apparent performance degradation at a database level.
Figure 4.1: AWR Results: Summary
Meanwhile, Figure 4.2 compares the Top Timed Events between captured and replayed consolidated workloads. This report snippet looks promising as well: CPU time decreased slightly, but a new wait event — Acknowledge Over PGA Limit — has appeared. However, this does make sense because this event is directly related to the population and use of the In-Memory Column Store in the newest release of the database.
Figure 4.2: AWR Results: Top Timed Events
As Figure 4.3 illustrates, there’s some good news in the Time Model Statistics section of the report as well. SQL execution time has dropped by nearly 31 percent, and SQL parse times have improved as well.
Figure 4.3: AWR Results: Time Model Statistics
Figure 4.4 shows some interesting information, too. This snippet of the report summarizes database activity by Wait Class, and it reveals some apparently significant “pushback” against the database workloads during replay. This was most likely due to a Database Resource Manager (DBRM) plan directive that was active during the replay, and it definitely warrants additional investigation, especially since that pushback wasn’t apparent during the previously captured workloads. Again, this doesn’t necessarily mean we’ve encountered performance regression; it could be the result of an errant DBRM plan directive that was enabled during replay, or it could even be related to how 12cR2 now manages parallelism during population of the In-Memory Column Store.
Figure 4.4: AWR Results: Wait Classes Detail
Finally, Figure 4.5 shows some interesting results, too, but this time at the Database Service level: Only the workload sessions played back via the OLTP database service show any significant regression in DB and CPU time, while most of the other application sessions show slight improvement in those categories.
Figure 4.5: AWR Results: Service Statistics
Conclusion
In this final part of the article series, we demonstrated how to leverage the latest features of Real Application Testing in Oracle Database 12c to replay previously captured workloads concurrently, even when they were recorded during completely different timeframes. We also discussed how to ramp up those captured workloads so that we could perform a test to destruction against the target database in its new 12c environment. Finally, we illustrated how easy it is to determine whether the new destination system can handle the stresses of those ramped-up workloads, using existing tools like the AWR Compare Periods Report to quickly identify potential vectors for performance improvement or regression.