Subscribe to our blog

Your email:

Follow Me

Database Blog

Current Articles | RSS Feed RSS Feed

Oracle Performance Tuning -- Log Sync Waits

  | Share on Twitter Twitter | Share on Facebook Facebook | Buzz This  Google Buzz | Submit to Digg digg it |  Add to delicious  delicious |  Submit to StumbleUpon StumbleUpon |  Share on LinkedIn LinkedIn |  Share On Technorati Technorati | Submit to Reddit reddit | 

I recently worked with a client to resolve a significant performance issue related to painfully slow queries in an Oracle database. One of the queries that they wished to execute was taking over five hours to run.

This is an Oracle 10g database running on a Solaris platform. The server has 8 processors so there is plenty of processing power. The database is very large with a very high insertion rate. The database is about 10 Terabytes and most of the data gets inserted into one large table and is partitioned by date, with a few partitions per day.  Therefore the query, which specifies a small date range should only need to look in one or two partitions, so queries should be relatively fast.

Before I became involved, the in-house dba looked at a number of the typical issues. He confirmed that there were appropriate indexes on the table, and that the query was expected to access only one partition. The cause of the significant execution time was not clear.

When I became involved, I was given limited access to the server. I was not given DBA access but I did have select access to the catalog tables so I was able to poke around a bit.

I also received execute privilege to dbms_workload_repository so I could create AWR snapshots and reports. Fortunately, the client had set up hourly snapshots so I was able to select snapshots before and after the query ran. The resulting report is where I found the clue to one cause of the problem.

I ran a report for an 9 hour period during which the query executed.

AWRStart

The top wait event in the AWR report for the day was log sync waits. These were very high, 1,104,069 with a total wait time of 72,781 seconds.

LogSyncAWR

There are a number of causes of log file wait events. Some of these are described in this excellent post.

However, in this case, the cause was something else. 

Because of a similar situation that I saw in the past with very high insertion rates, I thought that frequent log file switches may be the cause. I checked v$loghist and found that the logs were switching every five seconds! The optimal frequency of log switches is twenty minutes so this clearly a problem. 

The log size was set at 131M which was fine in the past when the insertion rates were lower, but as the rates grew, the logs were never increased to handle the higher rates. 

We first increased the logs to 1G and the log switches dropped to every 2-3 minutes. Much better but still far too high. We increased them again to 4G and now the logs switch every 30-40 minutes. This is higher than it needs to be and we will now reduce the size a bit in order to reduce the recovery time in the event of instance failure.  

After this change, the log file sync rates disappeared from the top wait events and the system performance has improved significantly. The query that was taking 4-5 hrs is now taking about 3 minutes.

I cannot be certain that insufficient log file size was the only cause of the performance problems because other steps were taken simultaneously including refreshing statistics which can have a significant impact as well.

However, I am certain that the insufficient log size was the cause of the log file sync waits And with a total of 72,781 seconds of wait time over nine hour period (that is about 2.25 hours of wait per processor if it is distributed evenly), it very likely had a significant impact on overall system performance as well.

Five reasons why I avoid database query hints.

  | Share on Twitter Twitter | Share on Facebook Facebook | Buzz This  Google Buzz | Submit to Digg digg it |  Add to delicious  delicious |  Submit to StumbleUpon StumbleUpon |  Share on LinkedIn LinkedIn |  Share On Technorati Technorati | Submit to Reddit reddit | 

nohints resized 600I know that some will disagree with me on this but I have never been a fan of query hints. 

Most database vendors give you the ability to override the query optimizer with hints that you specify in a query. Each query can be executed in a variety of ways by the database engine. Normally, the optimizer evaluates the statistics about the data in each of the tables involved in the query and makes a decision about the best way to execute that query.  Hints enable you to direct the database engine to use a specific approach, even if it prefers a different one.

Here are five reasons that I avoid hints.

Reason Number 1. I view hints as a workaround to a problem that should be solved in other ways. There are a few problems that may lead to the need to use a hint:

  1. One is that the database statistics may not be up to date which prevents the optimizer from making the best choices. The obvious solution to this problem is to ensure that all appropriate statistics are collected frequently.
  2. Another problem may be with the database or application design. If the query optimizer cannot choose a path that will execute the query quickly enough, we may need to look at the use of indexes or even the data model itself. Does the data model actually represent the true relationship between data or is the model causing the application to do more work than it should?
  3. Finally, all major database vendors upgrade and improve their optimizers over time. For example, Oracle's optimizer has improved very significantly over the past few major releases.  If you are not on a recent release of you database software, Upgrade! When you do, rip out your hints and see how the new optimizer does, you may be surprised.

Reason Number 2: Database vendors change their optimizers with new releases. Hints that have good results in one release can have poor results or even cause a query to break in later releases.

Reason Number 3: As data volumes grow and statistics are updated in the database, hints prevent the optimizer from selecting new and better execution paths as the result of the change in the characteristics of the data.

Reason number 4: As much as we would like to think that we are smarter than Microsoft or Oracle, most of us have not put millions of dollars and many years into query optimization technology as they have. Since we are paying for this through our license fees, we should take advantage of that investment.

Reason number 5: Now the reason that led me to write this post: This excellent blog post describes situations where a hint can cause a query to succeed when one set of criteria is given and to fail when other criteria is given. This type of problem is very difficult to diagnose because it is sporadic and depends on specific criteria.  This problem occurs on SQL Server and in that environment, this issue alone is enough to rule out the use of query hints.

Bottom line: use query hints only as a last resort. Getting to and fixing the root cause of the problem will ensure that the query will continue to perform well into the future. 

Using the Data Dictionary to find hidden data in SQL Server

  | Share on Twitter Twitter | Share on Facebook Facebook | Buzz This  Google Buzz | Submit to Digg digg it |  Add to delicious  delicious |  Submit to StumbleUpon StumbleUpon |  Share on LinkedIn LinkedIn |  Share On Technorati Technorati | Submit to Reddit reddit | 

needle Photo by Tomas BuchteleA client asked me recently how he could find a string in his SQL Server database without knowing what table or column it was in. The string was a translation of a code that appeared on one of the UI screens. This was a packaged CRM database and he had no documentation on the schema. With hundreds of tables whose names were not obvious, he had no idea where the translation might be kept, but wanted to do some reporting using the translations. It was like finding a needle in a haystack!

In order to help, I wrote a utility to find the data for him. Using the data dictionary tables and a little dynamic sql,  I created a script that would search for a string in every column of every table in the database. 

We can do this by using the information_schema.columns view to create statements that insert search results into a temporary table. We want to record the table, the column, and the full contents of any column that contained the string we wanted to search for in order to provide context for the search results.

There are some complications that we have to address as we do this.  First, since we want to do a like comparison against any of the fields we must restrict the search to char and varchar fields. This is necessary because the like comparison cannot be used against xml and some other datatypes. That restriction works in this case because I was searching for a string and it was very unlikely that this string would be embedded in an xml field. Second, to prevent errors resulting from spaces, hyphens, or other special characters in table or field names, we must surround the object names with brackets -- this is always a good practice when using dynamic sql with sql server.

Since this utility uses the like comparison with a wilcdard before and after the search string, indexes will not be used so performance will be an issue. This utility is best run during non-production hours and may take a long time to complete. This can be mitigated by modifying the application to remove the leading wildcard and then indexes on any of the columns would be used, but this will only find strings that start at the beginning of the column value that is being checked. 

We now have a nice utility that give a report of all places where the string lives. The utility can easily be extended to handle numbers, dates, and other data types as necessary. This script works with SQL Server but similar scripts can be created for any major database that has a user accessible data dictionary.

Note that this procedure does not automatically delete the prior contents of the search_findings table. You may wish to add that to the script if you don't want the results to be cumulative.

The instructions follow.

1. Start by downloading the find-it source code.

2. Create the table that will hold the search results using create_search_results_table.sql

3. Create the find-it procedure using create_find-it_procedure.sql

4. Issue the search with the following command:

 exec find_it  'search string'

Storing documents in a MongoDB database instead of the file system.

  | Share on Twitter Twitter | Share on Facebook Facebook | Buzz This  Google Buzz | Submit to Digg digg it |  Add to delicious  delicious |  Submit to StumbleUpon StumbleUpon |  Share on LinkedIn LinkedIn |  Share On Technorati Technorati | Submit to Reddit reddit | 

Cambridge LibraryWe are currently working on an application that uses a mysql database. The application produces a large number of csv files that must be kept around for a few weeks, and then removed after being used in various ways by the users of the system. 

We need to keep track of all of these files and present a list of the files to the user so they can download them to their local machine, or send them to other systems as needed.  We need to track when the file was created, who created it, and other information. 

We considered a few alternatives for managing these files;  We could keep then on the file system and have scripts that keep the database synchoronized with the file system. Or we could load each file into a blob type and keep it in the mysql database. Either of these options will work fine if we are careful. But we are seeking something more.

Our research led us to MongoDB. An open source document oriented database.

MongoDB claims to bridge the gap between key-value stores which are typically used by high volume web applications that have simple data storage requirements, and traditional RDBMS database systems. 

In addition to simple queries based on key-value pairs, you can use boolean expressions, regular expressions, set operations, and sorting and limiting. MongoDB also has the ability to grab documents directly from the file system using its gridFS functionality. 

We have not implemented MongoDB for this project yet but it looks very promising.  

I will send updates as we get closer. 

All Posts