Recently one of my customers asked to investigate a database performance problem on a system where for multiple reasons I wasn’t having direct access.
To troubleshoot the performance, I started asking few AWR reports; at the first glance I didn’t spot any bottleneck, but while re-reading the statistics I found a strange ration between number of Transactions and Rollbacks.
Extract of AWR Load Profile
— | Per Second | Per Transaction |
Rollbacks | 476.8 | 0.9 |
Transactions | 546.6 |
Because all others OS and database statistics were quite good, I decided to follow the trail of the high percentage of transactions rolled back.
Before any fast conclusion I checked the nature of the rollbacks:
- transaction rollbacks: Oracle is automatically executing a rollback, this happens for example in case of constraint violation (i.e. Primary Key violation).
- user rollbacks: number of times users manually issue the ROLLBACK statement or an error occurs during users’ transactions.
Extract of AWR Activity Stats
— | Total | per Second | per Trans |
user commits |
125,878 |
69.90 | 0.13 |
user rollbacks |
858,562 | 476.76 |
0.87 |
transaction rollbacks |
2,704 |
0.75 |
0.00 |
rollback changes – undo records applied |
46,132 |
25.60 |
0.06 |
data blocks consistent reads – undo records applied |
423,396 |
235.14 |
0.49 |
transaction tables consistent reads – undo records applied |
128 |
0.08 |
0.00 |
consistent changes |
5,599,562 |
3,109.48 |
12.7 |
From the AWR analysis has emerged the following data:
- 87% of the user’s transactions ended with a rollback.
- While rolling back, Oracle applies 25.6 undo records/sec. which means that at least a certain percentage of user rollbacks generates “real work“.
- Even more important in term of performance analysis
- number of data block consistent reads 235.14 records/sec.
- number of consistent changes 3109.48 times/sec.
Because the database load profile is 90% SELECT and 10% DML, and there are important values about data block consistent reads and consisten changes, we can assume that most of the activity is concentrated in a small number of user objects.
Next step: I have to gain access to the system and continue the investigation…