YITT
Data Retrieval Performance

9/15/2019 11:09:08 PM

I recently had a situation where multiple IT associates stated they had "exhausted" all performance enhancement options and the users were just going to have to deal with the report taking 4-6 minutes to return their result. In this situation, just one result. Yes, you heard me right: the report wasn't a Data Warehouse analytical summation of millions of records across complex filters and joins, but rather an Operational Data Store report where the users entered one parameter and the result should only return the data for that one record. Even worse, the parameter the users were providing identified a row in the topmost parent table of the whole model, and it was a unique identifier in that table.

Sure, the report needed to pull information about this one parameter from multiple tables, around 20, but they were all related by PK / FK to the main driving table. I contended that there must be something wrong, because even if there were 100 tables, as long as the users were giving us one record to search by, the result should return in under a second.

After a nice Friday evening of troubleshooting the report, it was determined that the developer had tried to use multiple sub-reports and chain the results from one into the next. This is a fairly common practice among the big-hitter reporting tools, but it was not working for one reason or another. In addition, another developer who had helped with some of the SQL statements had added a WHERE condition that wasn't really needed, against a huge table, on a column that wasn't indexed. This in turn caused a full table scan and essentially killed the performance of the report.
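To see why one stray WHERE condition on an unindexed column hurts so much, here is a minimal sketch using Python's built-in sqlite3 module. It is not the report's actual database or SQL: the `order_line` table, its columns, and the `query_plan` helper are all made up for illustration. It only asks the database how it intends to run each query.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable steps of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, unused, detail); the detail text is what we want.
    return [row[3] for row in rows]


# Hypothetical table standing in for the "huge table" in the story.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_line ("
    "id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL, note TEXT)"
)
conn.executemany(
    "INSERT INTO order_line (order_id, note) VALUES (?, ?)",
    [(i % 1000, "n%d" % i) for i in range(10000)],
)

# Filter on a column with no index: the engine has to read every row.
by_note = query_plan(conn, "SELECT * FROM order_line WHERE note = ?", ("n42",))

# Filter on the primary key: the engine goes straight to the row.
by_key = query_plan(conn, "SELECT * FROM order_line WHERE id = ?", (42,))

# Alternative remedy (not what we did in the story): index the column.
conn.execute("CREATE INDEX ix_order_line_note ON order_line (note)")
by_note_indexed = query_plan(
    conn, "SELECT * FROM order_line WHERE note = ?", ("n42",)
)

print(by_note)          # plan step begins with SCAN (full table scan)
print(by_key)           # plan step begins with SEARCH (keyed lookup)
print(by_note_indexed)  # plan step begins with SEARCH (index lookup)
```

The exact wording of the plan text varies between SQLite versions, and other databases have their own tools for this (an "explain plan" of some form), but the distinction between a scan and a keyed lookup is the thing to look for.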

To remedy the issue quickly, we changed each sub-report to just use the parameter that was entered by the user instead of trying to chain from the results of the previous sub-report. The unneeded WHERE condition was removed and, just like that, the report that was taking 4-6 minutes to run with "no way" to improve its performance was now returning in 2 seconds or less.

The real reason that I bring this up is to emphasize what a relational database does. By design, it breaks out the data into normalized structures. In most models you will have many, many tables. You may have upwards of 100 or so reference tables. You may have many associative tables to break up the so-called many-to-many relationships. Regardless, if each of those tables is architected appropriately with PKs and enforced FK relationships, getting data out should be no problem.
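As a rough illustration of that point, the sketch below builds a tiny parent/child model in SQLite and pulls everything for one parent key. The `customer`, `address`, and `phone` tables are hypothetical, not the model from the story. One assumption worth calling out: SQLite does not index foreign key columns automatically, so the sketch creates those indexes itself; whether your database does this for you depends on the product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the FK relationships
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE address (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        city TEXT
    );
    CREATE TABLE phone (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        number TEXT
    );
    -- Assumption: FK columns are indexed explicitly (SQLite will not do it).
    CREATE INDEX ix_address_customer ON address (customer_id);
    CREATE INDEX ix_phone_customer ON phone (customer_id);
    """
)

ids = range(1, 5001)
conn.executemany(
    "INSERT INTO customer (id, name) VALUES (?, ?)",
    [(i, f"cust{i}") for i in ids],
)
conn.executemany(
    "INSERT INTO address (customer_id, city) VALUES (?, ?)",
    [(i, f"city{i}") for i in ids],
)
conn.executemany(
    "INSERT INTO phone (customer_id, number) VALUES (?, ?)",
    [(i, f"555-{i:04d}") for i in ids],
)

# One user-supplied parameter drives every join through PK / FK columns.
REPORT_SQL = """
    SELECT c.name, a.city, p.number
    FROM customer c
    JOIN address a ON a.customer_id = c.id
    JOIN phone p ON p.customer_id = c.id
    WHERE c.id = ?
"""

plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + REPORT_SQL, (1234,))]
rows = conn.execute(REPORT_SQL, (1234,)).fetchall()

for step in plan:
    print(step)  # every table is reached by a keyed SEARCH, none by a SCAN
print(rows)
```

Adding more child tables in the same pattern adds more keyed lookups, not more scans, which is why the number of tables alone should not make a one-record report slow.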

Be careful not to let anyone in IT get away with saying "It just is what it is and can't be improved," because that is often not the case. There are very few cases in my career where I haven't been able to shave at least some significant amount of time off of performance-hog queries.

Now, when we get into the Data Warehouse and more analytical number crunching, sub-second response times can get more challenging... but I will share my thoughts on that at a later time...

If you have any reports that do not return the data you expect in a timely manner, please let me have a look.
