Sunday 29 April 2012

Avoid Sorting

AVOID SORTING


Disk Sorting Excessive I/O which severs as the main purpose for Performance Hindrance when i do a sort via the Order by clause and which is not able to fit inside the sort_Area_size allocated in the PGA.Sorting starts in Oracle PGA RAM (defined by the limits of sort_area_size and pga_aggregate_target 5% session limit), and a "disk sort" is invoked when the sort exceeds the maximum PGA allowed for the session.

A disk sort can far outweigh the unwanted effects of regular disk I/O read activity because a disk sort involves both physical reads and physical writes. First, Oracle must perform physical writes to a session's temporary tablespace for the sort activity it cannot handle in memory. Then, the database must turn right around and read that information back from disk to build the result set requested by the SQL query. So in essence, it's a double whammy especially for large result sets that are the product of heavy sort requests.

Oracle states that "The new sort algorithm shows more performance improvement for in-memory sorts.".Oracle10gRw introduced a new sort algorithm which is using less memory and CPU resources.For sorting in Oracle 10g release 2 there is a hidden parameter called "_newsort_enabled" that turns-on the new sorting algorithm.
We also see that in-memory sorts are CPU-intensive, and that faster processors will also improve in-memory sort performance.

Disk operations are about 10,000 times slower than a row's access in the data buffers.

If the clustering factor-an integer-roughly matches the number of blocks in the table, your table is in sequence with the index order. However, if the clustering factor is close to the number of rows in the table, it indicates that the rows in the table are out of sequence with the index.


In large active tables with a large number of index scans, row re-sequencing can triple the performance of queries.

Also, excessive disk sorting will cause a high value for free buffer waits, paging other tasks' data blocks out of the buffer.

Each sort in a query can consume memory up to the amount specified by SORT_AREA_SIZE, and there can be multiple sorts in a query. Also, if a query is executed in parallel, each PQ slave can consume memory up to the amount specified by SORT_AREA_SIZE for each sort it does.


SORT_AREA_SIZE is also used for inserts and updates to bitmap indexes. Setting this value appropriately results in a bitmap segment being updated only once for each DML operation, even if more than one row in that segment changes.Larger values of SORT_AREA_SIZE permit more sorts to be performed in memory. If more space is required to complete the sort than will fit into the memory provided, then temporary segments on disk are used to hold the intermediate sort runs.

SORT_AREA_SIZE's are not allocated until they are needed (and GROW to be sort_area_size bytes in size -- they don't start there) and are freed the second the sort is over.

Oracle will always use the cheapest method for sequencing a result set, and the optimizer will use index retrieval (extracting the rows in sorted order) if it consumes fewer resources than a back-end sort. Remember, a sort that cannot fit into RAM will have to be done in the TEMP tablespace, very slow with lots of disk I/O.

As a general rule, the sort_area_size should be large enough that only index creation and ORDER BY clauses using functions should be allowed to use a disk sort.

A Query on the V$sysstat gives info on the sorts performed both in memory and disk sorts and also the number of rows that was sorted.

In general, Oracle sorting occurs under the following circumstances:
 
  • SQL using the ORDER BY clause
  • SQL using the GROUP BY clause
  • When an index is created
  • When a MERGE SORT is invoked by the SQL optimizer because inadequate indexes exist for a table join

Other SQL operations that might require disk sorting are:
 
  • ANALYZE
  • Select DISTINCT
  • UNION  causes a SORT to eliminate the duplicates whereas UNIONALL Does not
  • INTERSECT
  • MINUS
  • Sort-Merge joins.

IN MEMORY , ONE PASS AND MULTI PASS SORT

Explanation by Tom Kyte

In memory sort Everything fits in memory, no disk is needed, everything done in ram.

one pass sort We read some amount of data and sort it, the sort work area fills up, we write that to disk. We do it again (read more, sort it) - this too spills to disk. We do this over and over until we've read the entire set of data and sorted each of the chunks. Now we need read a bit of each of the chunks we've sorted on disk and start returning data from there. In the one pass sort, we can read a bit of each chunk and start returning data. If we cannot read a bit of EVERY chunk then....

we are in the realm of the multi-pass sort. We read and merge the first few chunks - resulting in a new set. We read and merge another set of chunks - resulting in a new set. And so on - until we've processed all of the output from the first pass and created a "second pass" of more merged (more sorted) data. Then we merge those in the sort area and start returning data.







No comments:

Post a Comment