Stori Logo




Analysis Results

This Stori analysis is based upon the ABC123 database Statspack data from 04-Dec-2012 03:00:04 to 04-Dec-2012 04:00:03. Details: host udbsou20, dbid 1523308526, instance 1, snap 14515 to 14525, Stori version 0.28-0.11-0.13.

Performance Analysis Summary

1375297246_opr_x_pie.png

Overall, processes are requesting more CPU resources than the server can provide. The CPU subsystem is certainly the bottleneck. Solutions will focus on either reducing CPU requirements or increasing CPU capacity.

The 8 core CPU subsystem is 88% utilized and this Oracle instance is consuming 84% of the database server CPU capacity. Once utilization gets into the 80% area, performance is noticeably affected.

While all the recommendations presented are technically valid, they need to be analyzed in conjunction with user priorities and comments, budgetary constraints, IT administration complexity, and application uptime requirements. Only then will the best overall initial solution be discovered.

Recommendations

  • For SQL ID 7aw349n5chnzz focus on reducing elapsed time and CPU time. (rank: .644)

  • Decrease CPU requirements on the server. For example, look for non-essential big CPU consuming processes or perhaps a process you do not expect to be running. The OS command "top" can be very useful in identifying high CPU processes. (rank: .41)

  • Increase database server CPU capacity by either increasing number of CPU cores and/or their processing speed. (rank: .41)

  • For module oracle@uaadc142.arxbs.redbrick.org.pe_(TNS_V1-V3 focus on reducing elapsed time by reducing CPU consumption and by appropriately increasing IO activity. (rank: .204)

  • To reduce time related to multiple block buffered physical reads, explore influencing Oracle server processes to do more direct (non buffered) reads using event 10949 and the hidden parameter _small_table_threshold. (rank: .165)

  • To reduce time related to multiple block buffered physical reads, if blocks are likely to be read multiple times, carefully consider increasing the buffer cache, caching popular application tables, and implementing Oracle compression features. (rank: .165)

  • Look for opportunities to parallelize IO intensive processes, which will increase the IO workload on the server and decrease process duration. (rank: .155)

Operating System Performance Analysis

After analyzing the operating system CPU, IO, network, and memory subsystems, there is a very serious CPU resource shortage. Each subsystem is analyzed below in more detail.

CPU Subsystem Summary

The CPU subsystem is certainly the bottleneck. The average CPU utilization of the 8 core server is 88%.

1375297245_line_os_cpu_rt.png

This does not mean the CPU subsystem is misconfigured or there is a technical problem. This means simply the database server CPU requirements have exceeded the capacity of the CPU subsystem. Since this Oracle system is consuming 84% of the available database server CPU capacity, it is certainly the combination of the Oracle configuration and the application code attempting to use more CPU resources than is physically possible. This is a long way of saying, there is no more CPU power available.

Refering the CPU Response Time plot, the database server is operating at the point where the arrival rate (the horizontal axis) is at 434.1 lio/ms. Notice where the arrival rate intersects the response time curve. As you can see, the system is operating pretty deep into the elbow of the response time curve.

From a very operating system CPU perspective, you have two fundamental options:

1) Decrease CPU requirements on the server. For example, look for non-essential big CPU consuming processes or perhaps a process you do not expect to be running. The OS command "top" can be very useful in identifying high CPU processes.

2) Increase database server CPU capacity by either increasing number of CPU cores and/or their processing speed.

IO Subsystem Summary

IO is somewhat likely a performance limiting factor from an Operating System perspective. From an Oracle mb/sec perspective, Oracle processes are likely waiting too much for IO resources. For this Oracle instance, IO activity averaged 3369.73 mb/sec (R:3366.69 W:3.03) and 13944.33 IOPS (R:13911.74 W:32.59).

Memory Subsystem Summary

The memory subsystem is not likely the bottleneck. The Oracle caches and process global area are consuming 8% (2.5 GB) of the database server memory (31.7 GB).

Network Subsystem Summary

The network subsystem is not likely the bottleneck. Oracle specific network traffic averaged 1.8 mb/sec and Oracle processes were not likely to be waiting for network resources.

Oracle Performance Analysis

1375297242_oracle_x_BB.png

To summarize from an Oracle perspective, performance can be improved by reducing Oracle wait time and reducing CPU consumption. Primary focus should be on CPU consumption and then wait time. From a total time perspective, over the 60 minute analysis period, Oracle processes consumed 24269 seconds of CPU (69%) and waited for non CPU resources for 10962 seconds (31%).

Oracle processes are consuming 84% of the available database server CPU capacity. This is a serious concern and will certainly have to be addressed. Since this instance is consuming all of the CPU being consumed on the database server, it is certainly the only instance doing any real work on the database server. Considering both the operating system CPU utilization (88%) and Oracle process CPU consumption, looking for ways to specifically tune the Oracle instance to reduce CPU consumption is certainly to be of value in improving performance.

When there is a high percentage of CPU consumption, serialization of Oracle memory structures needs to be investigated. In this system, serialization control accounts for 0% (17 seconds) of all Oracle process (server and background) wait time (non-idle) and is therefore not likely playing a significant role in CPU consumption.

1375297243_oracle_wait_x_BB.png

In relation to all Oracle wait time (that is, non-CPU consumption time), Oracle processes are waiting too much for IO requests to be returned from the operating system. This IO time (i.e., Oracle IO wait time) is equal to the average IO wait time multiplied by the number of IO wait occurrences. This means that the IO wait time can be reduced by reducing either the average IO wait time or the number of wait occurrences. The number of wait occurrences is related to the number of Oracle IO requests. Keep in mind that some Oracle IO requests may not result in a wait occurrence and therefore will have no associated wait time.

1375297244_oracle_io_x_pie.png

The read and write activity (mb/s) is far from being balanced. This is not necessarily a performance issue but rather a characteristic of the system. From an Oracle centric perspective, performance solutions will have the specific objective of reducing IO wait time by focusing on reducing predominately non Oracle cache (physical) read activity.

Multiple block buffered physical read (ref:dbfscatr) IO requests account for 53% of all non-idle wait time, 53% of all IO wait time, and take an average of 1.2 ms. A general IO read time rule of thumb threshold is 10 ms. Since this average read time is way less than our rule of thumb, the IO subsystem would not likely not benefit with some additional capacity.

Here are some Oracle focused solutions to consider:

1) If blocks are likely to be read multiple times, carefully consider increasing the buffer cache, caching popular application tables, and implementing Oracle compression features.

2) Explore influencing Oracle server processes to do more direct (non buffered) reads using event 10949 and the hidden parameter _small_table_threshold.

Single block buffered physical read (ref:dbfseqr) IO requests account for 37% of all non-idle wait time, 38% of all IO wait time, and take an average of 6.7 ms. A general IO read time rule of thumb threshold is 10 ms. Since this average read time is way less than our rule of thumb, the IO subsystem would somewhat likely not benefit with some additional capacity.

Here are some Oracle focused solutions to consider:

1) If blocks are likely to be read multiple times, carefully consider increasing the buffer cache and caching popular application tables.

2) If blocks are not likely to be read mutliple times, explore influencing Oracle server processes to do more direct (non buffered) reads using event 10949 and the hidden parameter _small_table_threshold. Direct read rates are typically much faster than buffer reads.

Always consider non-Oracle subsystem centric solution strategies as well.

Application Performance Analysis

1375297247_appl_sql_opr_pie.png

Combining the operating system and Oracle system performance analysis, the application analysis focus will center squarely on identifying SQL statements which directly contribute to the performance problems.

Oracle processes spend more time consuming CPU (24269 secs) compared to waiting for non CPU resources (10962 secs). This information combined with what is occurring at the operating system level and in Oracle will guide us in our search for the key application SQL.

CPU Consumption

1375297249_appl_sql_cpu_pie.png

We want to find opportunities to reduce CPU consumption. This means focusing on the high CPU consuming SQL; identifying them and then either tuning them or reducing their execution rate. For example, the top CPU consuming SQL statement (SQLID=7aw349n5chnzz) consumes 77% of all Oracle CPU resources (SQL and anything else). More details are provided in the below SQL And PL/SQL Statement Analysis section.

IO Consumption

1375297243_oracle_io_x_BB.png

While Oracle processes are IO read intensive, the IO subsystem is responding quickly. This presents us with an opportunity! We want to creatively use available IO resources. Look for long running IO intensive SQL statements or an IO intensive batch job running as a single serial process; it will likely have a low execution count along with a high elapsed time. By modifying the application to parallelize the SQL or batch job, its duration will decrease because multiple processes will run concurrently. It is like having a single large shopping list but having three people find specific items instead of one person finding all the items. This approach will consume unused IO resources while reducing run times.

For example, the top physical read consuming SQL statement (SQLID=7aw349n5chnzz) consumes 91% of all physical blocks read.

1375297252_appl_sql_pior_pie.png

Slow running IO intensive OLTP SQL that can not be parallelized will need to be tuned: A SQL statement requiring two million physical IOs with each physical IO taking only 2 ms will still run in 4 seconds.

Programatic Module Activity

1375297248_appl_module_x_pie.png

Application code associated with one or more of the 8 specific modules is playing an important role in the system. While it is important to look at specific statements with an overall perspective, many times users experience issues related to multiple PL/SQL and SQL statements functionally grouped together.

Module analysis allows us to grasp how multiple good performing statements can together result in a poor user experience. When this activity grouping is assigned a module name, we can analyze their activity together. Then by focusing on the PL/SQL and SQL within the identified module, performance tuning efforts can directly impact the user experience. Generally speaking, focus on reducing elapsed time by reducing CPU consumption and by appropriately increasing IO activity.

Top Module Matrix

Below is a matrix for the top five modules. The ranking variable, Opportunity Rating (OPR), is an OraPub proprietary method relating performance improvement opportunity between modules. The higher the Opportunity Rating (OPR) value and the larger the different between modules directs us to focus on the higher value module.

Referencing the matrix, the module presenting us with the greatest opportuninty is, oracle@uaadc142.arxbs.redbrick.org.pe_(TNS_V1-V3 containing 41% of all the opportunity.

You can ask Stori to create this matrix with, "is top module matrix".

Top Module Matrix

Module Name EL CPU LIO PIO Srt Exe Par DW OPR
oracle@uaadc142.ar 1 1 1 1 4 2 1 2 1.83
pmdtm@uaadc149_(TN 2 2 2 2 3 3 3 1 .88
sqlplus@udbsou20_( 4 4 4 4 4 4 4 2 .15
TOAD_11.6.0.43 5 5 5 5 2 5 5 2 .12

EL: elapsed time, CPU: CPU time, LIO: logical IOs, PIO: physical IO block reads, Srt: sorts to disk, Exe: executions, Par: parses, DW: direct writes OPR: opportunity rating

Top SQL Within Top Module Matrix

Within the top module are many PL/SQL and SQL statements. Performance can likely be improved by focusing on the top statements within this module! Shown below are these top five statements.

Referencing the matrix, focus on the statements with the greatest opportunity. It is likely the top module statements are detailed in the below Statement Analysis section.

You can ask Stori to create this matrix with, "is top sql matrix module oracle@uaadc142.arxbs.redbrick.org.pe_(TNS_V1-V3".

Top SQL Within Top Module Matrix

SQL ID EL CPU LIO PIO Srt Exe Par DW OPR
7aw349n5chnzz 1 1 1 1 1 1 1 1 1
33z4yynbc063m 2 2 2 2 1 2 2 1 .5
8b9hnf7af7z36 3 3 3 3 1 3 2 1 .33

EL: elapsed time, CPU: CPU time, LIO: logical IOs, PIO: physical IO block reads, Srt: sorts to disk, Exe: executions, Par: parses, DW: direct writes OPR: opportunity rating

SQL and PL/SQL Statement Analysis

1375297247_appl_sql_opr_pie.png

Considering all the above during the analysis period, of the 206 statements executed only one need our attention. Taking into consideration the above application focused analysis and from a specific statement perspective, focus on reducing (per execution) CPU consumption. Statement performance details are shown below in both the Top SQL Matrix and the Key SQL Statement Details section.

A proprietary method developed by OraPub is used to relate and differenciate the importance between objects, such as SQL and PL/SQL statements. The higher the Opportunity rating (OPR) value and the larger the difference between objects directs us to focus on the higher value objects.

Top SQL Matrix

Below is a matrix containing the SQL IDs for top ten SQL based on a number of criteria. Reviewing the matrix will increase awareness of the overall SQL situation. For example, sometimes there is a single SQL statement that is ranked high in many areas. At this time, focus on reducing (per execution) CPU consumption.

Top SQL Matrix

SQL ID EL CPU LIO PIO Srt Exe Par DW OPR
7aw349n5chnzz 1 1 1 1 18 5 1 2 1
33z4yynbc063m 2 2 2 2 18 29 2 .5
8b9hnf7af7z36 3 3 3 3 18 2 .33
gm61prsakf2ud 4 4 5 4 15 1 .25
b02ykty03gsxg 6 5 4 13 18 34 25 2 .2
3k8yvvqc0zazk 7 6 10 15 18 8 3 2 .17
db78fxqxwxt7r 5 7 6 6 1 1 18 2 .14
3dwb0ndumvaa6 12 8 21 24 18 17 9 2 .13
7ng34ruy5awxq 8 9 7 9 2 4 5 2 .11
3v3jhaqxm6xn5 9 10 11 5 15 2 .1

EL: elapsed time, CPU: CPU time, LIO: logical IOs, PIO: physical IO block reads, Srt: sorts to disk, Exe: executions, Par: parses, DW: direct writes OPR: opportunity rating

Key SQL Statement Details

The key SQL statements are selected based on a number of criteria including both operating system and Oracle system activity. Thankfully Statspack does not collect data for every SQL statement. But then percentages of all SQL can be inflated. Obviously some SQL statements reside within a function or procedure. If so, their values are counted multiple times resulting in percentages related to all SQL being deflated. Therefore, the Percent of all SQL (i.e., collected) statistics below need to be viewed with this in mind.

Stori contains a significant amount of statement activity available through the, "is sql id ..." command set. Do a "help is sql" for details.

Identifiers
SQL ID 7aw349n5chnzz
Opportunity rating 1
Focus areas elapsed time, CPU time
SQL type query
Module oracle@uaadc142.arxbs.redbrick.org.pe_(TNS_V1-V3
Elapsed Time
Total (sec) 22672.35
Average (sec/exe) 12.356
Percent of all SQL 75.605
CPU Consumed
Total (sec) 18759.775
Average (sec/exe) 10.223
Percent of all SQL 82.743
Wait Time
Total (sec) 3912.575
Average (sec/exe) 2.132
Percent of all SQL 17.257
Executions
Total completed 1835
Per second .51
Percent of all SQL .968
Parse Cals (soft+hard)
Total (#) 29040
Average (#/exe) 1
Percent of all SQL 66.258
Physical IO Blocks Read
Total (#) 1407755356
Average (#/exe) 767169.131
Percent of all SQL 91.115
Logical IO Consumed
Total (#) 1413190231
Average (#/exe) 770130.916
Percent of all SQL 91.054
SQL hash value
New 180900863
Old 3264153475
SQL text
Partial SELECT "DONATION_ATT_ID","DONAT

There are almost always non-Application subsystem centric solution strategies as well.