Filtering by joining is difficult to optimise

 



Joining data is one of the most complex parts of SQL optimisation. Especially when you are dealing with large tables and you are trying to visualise what set processing the query engine will do.


You might say you don't need to do all that, you can just use the Database Tuning Adviser. Well you're wrong. The database tuning advisor only helps out indexing a database. It doesn't help out when you have bad SQL. Do you have an app that fires a query for each line in an order, or calculates the order totals for all customers but only returns one of them


All these situations are about database querying design. The two examples above are quite simple to see and fix, but what happens when you just have a very large query that takes ages.You've got covering indexes, and key aligned indexes as much as possible but its still slow and you are only returning a few rows. Well what is most likely happening is that within the query you are processing more data than you need to. Trying to resolve this is very complex and oftens comes down to joining.


There are a number of ways to reduce the number of rows processed during a query,


1. use a where clause with a literal


2. join to another table


option 1 is very easy for the engine to process and can often mean the rows never make it out of the storage operator, index seek, index scan. The second is more difficult. If you have two tables with 1 million rows and when joined you only get 10 records. The Query processor has to likely process all rows in both tables.You maybe lucky and have key aligned indexes (indexes where the key columns are in the same order and match those of the join clause) which means you can do a merge join, if you haven;t got key aligned indexes then you will likely have a hash join. In both case you still have to read all the data. What happens when you add a 3rd table into the equation. that makes things more complex.


The summary is that even though you only need 10 rows if you are joining tables to filter it is likely you have to process all dat in both/all tables. If you can't do anything about the query/table design you need lots of memory and if your data is greater than your memory you need fast storage


 


Leave a Reply

Your email address will not be published. Required fields are marked *


*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>