Supplement Your Microsoft Business
Intelligence Strategy with the Fast Performance and Excellent ROI of
Personal Data Marts
Tim Peterson
How fast is fast enough for OLAP browsing?
OLAP is at its finest when the data changes in less than a second. The
user clicks and the data changes instantaneously. This is the kind of
performance that allows a user to really explore an OLAP cube –
drilling up, drilling down, trying one filter, adding an additional
filter, switching the dimension displayed on the columns, etc.
Nigel Pendse wrote an article called Performance Matters for
The OLAP Report (www.olapreport.com) in August, 2007. In it he
says the following about OLAP tools:
Slow query performance has been consistently the most serious
product-related reported problem, and for the last few years it has
been the single most often complained of problem.
According to data gathered by the OLAP Report, industry-wide median
OLAP query time averaged between 7 and 7.5 seconds in 2006.
|

Median input data volume and median query performance trends
Used with permission of The OLAP Report |
|
Microsoft SQL Server 2005 Analysis Services provides
excellent OLAP browsing speed. But most organizations using Analysis
Services could benefit in certain situations with the performance
boost provided by personal data marts. If your OLAP query response
time is over sixty seconds, you may be able to bring it down to five
seconds. If you now have a response time of ten seconds, you may be
able to reduce it to one second.
With faster speed your OLAP cubes will provide more actionable
business intelligence because your business users will be willing to
examine the data from more perspectives.
What Are Personal Data Marts and Why Haven’t I
Heard About Them Before?
A personal data mart is a set of one or more customized local cube
files that have been created for a specific user’s queries. Each local
cube contains a small portion of the data available in the Analysis
Server cube. Because they are small, they can be very fast.
For example, a company could have a large Sales cube, with data for
1,000 Sales Representatives for the past 5 years. The Sales
Representatives examine the Sales cube every day. They spend most of
their time analyzing their own sales for the most recent month. A
local cube containing only the data for one Sales Representative for
one month will have only 1/60,000th of the data in the Analysis Server
cube. Unused members from the Product dimension, the Customer
dimension, and other dimensions can be removed from the local cube.
This small cube will browse much more quickly than the Analysis Server
cube, especially when calculations are being evaluated. This fast
speed can be achieved while still keeping all of the cube’s
dimensions, attributes, measures, and calculated members.
Besides improving browsing speed, the local cube files also provide
the flexibility for the Sales Representative to look at data when not
connected to the corporate network.
Local cube files have been a part of the Microsoft BI toolset since
1998 with the release of SQL Server 7. Their purpose has been to
provide users access to OLAP data while they are offline. The use of
local cubes to improve cube browsing speed has often not been
recognized.
While Microsoft has always provided the ability to create local cube
files, it has often been difficult to find tools to automate the
repeated creation of customized local cubes.
With the release of SQL Server 2005, the ability to design local cube
files has been improved. The Analysis Services Scripting Language
(ASSL) gives the user full control over the creation of the local
cube. Most users, however, do not have access to a tool that creates
local cubes using ASSL. Local cube creation tools usually use the
Create Global Cube statement, which gives far fewer local cube
creation options. A cube created with Create Global Cube cannot be
optimized for browsing speed in the same way as a cube created with
ASSL.
The result is that there are organizations with a lot of experience in
using Microsoft Business Intelligence, but little experience in using
local cube files. Many of these organizations could achieve a dramatic
improvement in their cube browsing speed by switching to personal data
marts for their OLAP data delivery or by using personal data marts for
at least some of their more challenging cube browsing situations.
How Can I Know If My OLAP Performance Will
Improve With Personal Data Marts?
The only way to know for sure, of course, is to try it. However, we
have found that personal data marts can be particularly useful in the
following situations.
1. The cubes have one or more large, flat dimensions. Sometimes
dimensions are designed without hierarchies or with hierarchies having
very few levels. One example of this is the Sales Order dimension in
the Adventure Works sample database that is distributed with Analysis
Services. The Analysis Server cannot build effective aggregations for
a large, flat dimension.
If you design a local cube that removes a significant portion of a
large, flat dimension (such as limiting the Sales Order dimension to
the sales for a single Sales Representative), you will greatly improve
browsing speed when the user is looking at that dimension.
2. The cubes use complex calculated members. Calculations can
slow browsing speed. Because there is less data in a local cube, the
calculations can be evaluated more quickly. The improved performance
is most evident in a calculation that has to evaluate a large number
of null cells. Because local cubes can have all unused dimension
members removed, the number of null cells can be exponentially
reduced.
3. Each individual user is using only a small portion of the cube.
Personal data marts are especially effective at improving browsing
speed when each user typically looks at a small subset of the cube’s
data – like the previous example of a Sales Representative looking at
the most recent month’s data.
4. Some users only want to see an odd subset of the cube. I
have often seen situations where users want to see an oddly shaped
subset of the data – such as all sales from Region 1, together with
the sales from two cities in Region 2, excluding sales initiated by
the corporate headquarters staff, and only including sales to
customers who haven’t purchased anything in the previous two years. A
good OLAP client tool will allow a user to see this subset of the
data, but it is much easier and there can be much better performance
if the user starts browsing with this exact set of data.
5. Cube browsing speed is adequate for some but not all users, or
is adequate at certain times but not other times. Analysis Server
cubes can be effectively optimized for the typical user, but there may
be certain users who are browsing the cubes in unusual ways that
greatly increase query response time. When one user issues a long
query, the query response time can be slowed for everyone using that
Analysis Server. In these situations, it can be very helpful to create
personal data marts for the users that have the lengthy queries.
Specialized local cubes can be designed to improve the browsing speed
for these users. Because the Analysis Server no longer has to evaluate
these complex queries, the browsing speed for all the other users can
be improved and will be more consistent.
Can I Achieve the Same Level of Performance
Without Using Personal Data Marts?
In some situations, you can match the performance of personal data
marts by using other strategies. In many cases, however, these other
strategies are inadequate or may be more expensive to implement.
1. Add MOLAP aggregations. The Analysis Server gains a lot of
its ability to quickly respond to queries through the use of
aggregations. If you haven’t done so already, create MOLAP
aggregations for your cubes. Aggregations will not, however, increase
the browsing speed in large, flat dimensions. They may or may not be
helpful in improving the speed of calculated members.
2. Set proper attribute relationships in hierarchies. When
using the Analysis Server from SQL Server 2005, aggregations cannot be
created for hierarchies unless relationships are set for the
attributes used to create the levels.
3. Add partitions to the cube. When used with appropriate
slicing in the queries, partitioning can greatly improve cube browsing
speed. If there is one single dimension that is causing browsing speed
problems, adding partitions along that dimension can be very
effective. Personal data marts, though, are more effective when
different users need a cube optimized along different dimensions. It’s
very difficult to take every individual’s situation into account when
designing partitions for an Analysis Server cube.
4. Optimize calculated members. There are many strategies for
optimizing calculated members which, in some cases, can dramatically
improve browsing speed.
5. Use more powerful hardware. The Analysis Server performs
better with a 64-bit server, multiple processors, and a large amount
of memory.
6. Teach users how to avoid problem browsing areas. Many
Analysis Server cubes will perform well as long as the user doesn’t
pick the wrong combinations of attributes. In many, but not all,
situations, slow performance can be avoided by picking appropriate
slicers before attributes are placed on the columns and rows.
7. Simplify the Analysis Server cubes. Look at the list of
situations where personal data marts can improve performance and
modify the Analysis Server cube so that these conditions are removed:
• Eliminate large dimensions
• Remove the lowest level of detail in large dimensions
• Add levels in large, flat dimensions
• Remove calculated members if they are slowing cube browsing
• Remove seldom used attributes
In summary, all these strategies can speed OLAP browsing. Some of them
will make a big difference with very little effort. But in many cases,
the easiest and cheapest option is to give your users personal data
marts.
There are many excellent resources available for optimizing Analysis
Server cubes. See the end of this article for suggestions on papers,
web sites, and blogs.
Personal Data Mart ROI
In considering the ROI of a personal data mart, there are three
primary courses of action:
1. Be satisfied with cubes that have slow browsing
speed
2. Use one or more of the other strategies to improve
cube browsing speed
3. Use personal data marts to improve cube browsing
speed
Each of these options has a cost associated with it.
Course of Action #1 – Be Satisfied
When looking at the costs of slow browsing speed, consider the
following:
1. What is the overall value of your Microsoft Business Intelligence
system? How much did your BI system cost? What benefit is your
organization receiving from the cubes? What additional benefits are
available through more effective use of the cubes?
2. Your cubes will not give a good ROI unless they work well for your
users. Whatever benefit you receive from the cubes will be greatly
enhanced if they can be browsed quickly. Fast browsing cubes are an
essential part of a Business Intelligence system. It is unwise to
spend a lot of money building a BI system without also achieving the
best possible cube browsing speed.
3. What speed is too slow? Nigel Pendse, in an article entitled What
Is OLAP?, has said that a defining characteristic of OLAP is fast
browsing response time – where simple queries take one second, most
responses are returned in about 5 seconds, and very few responses take
more than 20 seconds. I think this should be considered to be a
minimum standard. Users (and organizations) will gain much more
benefit from OLAP if the typical query is under 1 second and very few
queries exceed 5 seconds. With this level of performance, users feel
that they’re getting their questions answered immediately.
Course of Action #2 – Other Strategies
The cost is quite varied for the other performance-enhancing
strategies:
1. Some strategies are very cost-effective. If you can achieve
excellent cube browsing speed by adding aggregations, setting
attribute relations, adding partitions, or by optimizing calculated
members these would usually be the best options. Even if you have to
hire a consultant to help you do this, the benefit should be worth the
cost.
2. Some strategies reduce the value of the cubes. If you have to
remove dimensions, attributes, levels, or calculations you need to
consider the cost of giving up some of the cube’s functionality.
Adding levels to a hierarchy can also make a cube harder to use. The
option of teaching users to avoid certain types of browsing has a very
high cost because it restricts OLAP browsing to the people who have
learned how to do it, rather than having the benefits of OLAP insight
available to everyone in the organization.
3. Some strategies can be expensive. The improved hardware option has
a definite cost attached to it. And though better hardware can be very
helpful, by itself hardware may not bring optimal performance.
Course of Action #3 – Personal Data Marts
When using the personal data mart strategy, you need to calculate the
costs of doing these tasks:
1. Designing and creating the local cubes. It is fairly easy to create
local cube files in SQL Server 2000. In SQL Server 2005, it is usually
necessary to use the Analysis Services Scripting Language (ASSL) to
create optimized local cubes. The ASSL can be quite difficult to
manipulate. Our company has developed a product, CubeSlice, which uses
ASSL to create customized local cube files that can be optimized for
browsing performance.
2. Distributing the local cube files. Local cube files need to be
recreated when new data is available. These new local cube files need
to be distributed to the users on a regular basis.
Summary of ROI Considerations
There are many factors involved in calculating the ROI of personal
data marts. From my perspective, the key factors are as follows:
1. Poor cube performance is not acceptable. If OLAP cubes are needed
by an organization, those OLAP cubes need to have excellent
performance.
2. There are some inexpensive ways to improve the performance of
Analysis Server cubes. You should always try these methods to see if
they will give the performance you need.
3. In some situations, personal data marts and local cubes provide the
fastest possible cube browsing speed. There are expenses in setting up
a system of creating and distributing local cube files, but the
results are often well worth the expense. When you use personal data
marts, you can give individual users a set of cubes that are optimized
for the querying they want to do.
Fast Cubes – Happy Users – More Business Insight
The goal of using personal data marts is to make OLAP fast –
convenient, easy, and effective.
Fast cubes make for happy users. If there are happy users, there will
be more users. And with fast cubes, each of those users will be able
to find more insights to improve the organization.
About the Author
Tim Peterson is the lead author of Microsoft OLAP Unleashed (SAMS,
1999) and the author of Microsoft SQL Server 2000 Data Transformation
Services (SAMS, 2000). He is a speaker, teacher, and consultant. He is
the chief software architect of CubeSlice, a tool that automates the
creation of local cube files.
Would you like to see if the use of local cube files in personal data
marts could speed up your cube browsing? Sign up for a free demo of
CubeSlice now. Using remote desktop software, we will show you how to
design high performance, customized local cube files. You will see for
yourself whether they would be useful in your organization.
For More Information
www.cubeslice.com
How to Dramatically Improve Browsing Speed Performance for Microsoft
SQL Server 2005 Analysis Services
Local Cubes in SQL Server 2005 Analysis Services
The Local Cube Information Center
Tim Peterson’s BI Performance Blog
– Local Cube and Other Strategies
to Boost OLAP Cube Browsing Speed Performance
www.olapreport.com
What is OLAP?
Performance Matters
Extensive and detailed reviews of OLAP products
www.microsoft.com
Microsoft SQL Server 2005 Analysis Services Performance Guide by
Elizabeth Vitt.
“Query performance directly impacts the quality of the end user
experience. As such, it is the primary benchmark used to evaluate the
success of an OLAP implementation.”
www.xlcubed.com
Build Better Cubes: Real-Life Advice on Building Analysis Services
Cubes by Gabhan Berry
“Every business intelligence solution will have its problems with data
scalability. It’s inevitable. Almost always, the source data will
increase in size over time. What you have today may perform adequately
but this may not be true next month or next year. As technology and
hardware has improved, this problem has been alleviated but not
solved; the limits have simply been moved, not extinguished. Building
a cube that uses all data for all time, and where the data increases
over time, is a recipe for a cube that will eventually be too slow to
use.”
Richard Tkachuk’s Analysis Services Page, with an article about
optimizing calculations – MDX Performance Hints
www.ssas-info.com
Everything about MS SQL Server Analysis Services 2005
(Suggestion – Select the menu item SSAS Articles by Subject and then
choose Performance)
Microsoft OLAP by Mosha Pasumansky
Microsoft SQL Server Development Customer Advisory Team
Chris Webb’s BI Blog – MDXtreme Programming
|