Business intelligence (BI) is a continually changing landscape. In-memory analysis tools are maturing rapidly, and business users are more technically savvy and are demanding more control over their information. In the Microsoft Office suite, Excel and Access have been the tools of choice for these business users, while BI experts have rejected these solutions as difficult to control and not enterprise grade.
Microsoft addressed this tension with the introduction of PowerPivot, which makes it much easier to get data into Excel. Now, Excel is becoming the standard tool for transforming data into relevant and meaningful information. With the releases of Excel 2013, SharePoint 2013, and SQL Server 2012, business solutions built on the Microsoft BI platform are changing the role of enterprise data warehouses and data cubes. The groundwork is laid to shift how organizations gain business insight and how data is managed and delivered in the world of modern business intelligence.
State of the BI World
The current world of BI seeks to achieve data warehousing's Holy Grail: a "single source of truth." Traditionally, this involves fairly complex projects that must move and transform data to meet the needs of the data consumer. Often, data is structured around Kimball's Dimensional Model, which organizes measurable facts and descriptive dimensions into a star schema. In order to achieve this level of simplicity in the model, a business intelligence or data warehousing project must move, transform, and reorganize data from a variety of sources. Not only is this a time-consuming process, but it is notoriously complex. Only after this process has been completed are users able to interact with the data.
Business Input Required
Solution builders depend on business users to help them understand and capture the requirements for a project. The quality of this communication is often the determining factor between success and failure of a BI solution. Most often, business users and solution builders struggle to adequately translate business needs into a usable solution design. But here's the interesting twist: where in the past these conversations were between business and technical folks, they are increasingly happening within business teams. New communication patterns are emerging to enable these discussions.
Without user input into the design, there is no way to make sure users get what they need. The solution may stand idle because it doesn't do what it is supposed to do, or because of unforeseen barriers to adoption. The challenge is to get those busy business people to invest their time up front, giving the input needed to produce the application that will return the investment many times over in the form of fast, efficient, leading-edge BI.
In a highly competitive and often global marketplace, businesses need to be exceedingly responsive to changing consumer trends, new regulations, and other external demands. A BI solution must be timely to be relevant. We have seen many situations where the enterprise data warehouse (EDW) is no longer relevant, yet is still being maintained. Historically, projects would span years before the solution was available to the business and the first analysis could be performed, but that model has changed over the last few years. In trying to resolve the issues of scope and duration, BI architects use agile methods and iterative development to minimize the impact of change on the solution. This allows delivery of solutions that meet user needs more quickly than traditional builds.
However, changing the deployed structures in a data warehouse to reflect a changing business is still a costly investment in time and resources.
Costly Barrier to Entry
While the value of a well-designed, properly implemented BI solution is well understood, the actual cost of BI projects is more than is reflected "on the books." For many organizations, these costs represent a formidable barrier to getting usable information from their BI solutions. That first step is way too high.
The user wants and needs their data turned into information they can use. How do we help them overcome this cost hurdle?
Changing Our Starting Point
Traditionally, BI projects start with intensive requirements-gathering sessions, including interviews and meetings. Once those requirements are gathered, they must be translated into the data model. Data architects build complex ETL solutions to pull data from its points of origin into a central repository called a data warehouse. What if we could change this paradigm? What if the users closest to the meaning and value of the data built the model? That is not only a possibility, but is quickly becoming the new normal.
PowerPivot for Excel
Microsoft introduced PowerPivot for Excel with Excel 2010. PowerPivot is an Excel add-in, part of the SQL Server family of products, that expands Excel's capabilities with an in-memory engine called the xVelocity In-Memory Analytics Engine (xVelocity). PowerPivot gives the Excel user the ability to pull data from a variety of data sources into Excel. Once in PowerPivot, modeling constructs familiar from data cubes, such as relationships, calculated columns, and calculated measures, can be used to organize the data for use in Excel. With its impressive compression and indexing capabilities, xVelocity performs traditionally difficult and time-intensive data tasks with ease. Operations such as sorting and filtering are now easily done against data tables with millions of rows. All of this is possible without leaving the comfort of Excel.
PowerPivot in Excel is our starting point. When we make data available to our users, they can put together a data model that meets their needs. They are able to pull data from OData feeds, SQL Server Reporting Services reports, SQL Server databases, Oracle databases, Teradata databases, Excel workbooks, and even other PowerPivot models. Then, the user can create the relationships and build a data model that is right for the job. In this scenario, the desired result drives the underlying implementation, not vice versa.
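To make that modeling pattern concrete, here is a minimal sketch of the same "pull, relate, analyze" steps written outside of Excel, in Python with pandas. It is only an illustration of what a PowerPivot author does through the add-in's dialogs, not of PowerPivot itself; the server, feed URL, and column names are hypothetical placeholders.

```python
# Illustration only: the "pull, relate, analyze" pattern a PowerPivot author follows,
# expressed in Python/pandas. Connection details, URLs, and column names are hypothetical.
import pandas as pd
import requests
from sqlalchemy import create_engine

# 1. Pull a fact table from a SQL Server database (hypothetical server and database).
engine = create_engine("mssql+pyodbc://analyst:secret@SQLSERVER01/SalesDW?driver=SQL+Server")
sales = pd.read_sql("SELECT CustomerKey, OrderDate, SalesAmount FROM dbo.FactSales", engine)

# 2. Pull a descriptive table from an OData feed (hypothetical URL; assumes a JSON
#    payload with a "value" array, as OData services commonly return).
response = requests.get("https://example.com/odata/Customers?$format=json")
customers = pd.DataFrame(response.json()["value"])

# 3. "Create the relationship": join fact rows to descriptive attributes on a shared key.
model = sales.merge(customers, on="CustomerKey", how="left")

# 4. Analyze: the code equivalent of dropping fields onto a PivotTable.
print(model.groupby("Region")["SalesAmount"].sum().sort_values(ascending=False))
```

In PowerPivot, the same relationship is created by dragging between key columns in the diagram view, and the summary comes from a PivotTable rather than code.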
Risk and Reward
Putting this type of power into the hands of business users creates opportunities for mistakes, and those mistakes are magnified when business decisions are based on user-created data analysis. But this isn't new – spreadmarts with cobbled-together data and inaccurate formulas have long persisted within organizations. The difference is that, with training and greater familiarity with the data, these users will begin to produce the business insight that the data warehouse was designed to deliver, at a fraction of the cost and time.
Although enabling users to create their own insights may appear risky, tools such as QlikView and Tableau have delivered the same capabilities since they arrived on the scene. These in-memory analysis tools are designed to let business users perform analysis of disparate data sets. The reward is worth the risk, and the market appears to agree, as shown by the success of these products.
Building the Solution
Let’s begin our exploration of this paradigm shift toward user empowerment with a discussion of Excel. How does that translate into a BI solution? Isn’t this just spreadmarts all over again? How do we prevent this from becoming an ad hoc file share nightmare? By strategically planning for growth in users and content, you can build a solution that will meet the needs of the organization while still empowering your users.
The Power of Excel
Excel is the cornerstone for creating and delivering business insight within your organization. All BI components, including reports, dashboards, and pivot tables, originate within Excel and, in particular, within PowerPivot. With the latest version of PowerPivot, users can create the data model visually using the diagram view.
It is clear that Microsoft's BI strategy is to give business users tools and capabilities once reserved for technical resources, enabling them to:
- Access data from a wide variety of sources (including OData)
- Create relationships within the data
- Identify and remove “dirty” data
- Create dimensional data models
- Publish data models to SharePoint
- Create relevant and timely analysis and visualizations
Excel, with PowerPivot, now provides all of these capabilities, without requiring any technical resources. Let that sink in for a moment. Some people may cry heresy, some may be skeptical, and others may say, "It's about time; why didn't we do this from the beginning?" Whatever your position, the reality is that this empowerment movement is already underway.
A Point of Convergence
In the good old days of BI, the technical person had all the sophisticated tools and access to data, while the business person had Excel with very limited access to 'real' data. PowerPivot changes this dynamic by providing a place where the technical person and the business person can come together as equal partners in delivering BI solutions.
Sharing with SharePoint
Picture this scenario: your Excel-savvy power user creates a PowerPivot workbook that the entire finance department wants to use. She comes to you and asks about sharing her spreadsheet. Should she email it? Put it on a fileshare? What do you tell her?
You tell her: publish the PowerPivot workbook to SharePoint. PowerPivot paired with Excel Calculation Services in SharePoint provides additional capabilities such as scheduled data refreshes. This is not just a file in a library – it is a fully functional, in-memory data model. Once a PowerPivot workbook is loaded into SharePoint and opened, the embedded data model is hosted on the server by SQL Server Analysis Services. Not only can users share the uploaded workbook online, but the workbook also becomes a data source for reports and other data visualization tools that can connect to a SQL Server Analysis Services (SSAS) database (that is, a data cube).
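The practical consequence is that the workbook's SharePoint URL can be used wherever an Analysis Services server name is expected. The sketch below is only an illustration under stated assumptions: a Windows machine with the Analysis Services OLE DB provider (MSOLAP) installed, the adodbapi package (which ships with pywin32), and a hypothetical workbook URL. It opens that kind of connection from Python simply to show that the hosted workbook answers like an Analysis Services database; reporting tools use the same connection string through their own data-source dialogs.

```python
# Sketch only: connect to a published PowerPivot workbook as if it were an
# Analysis Services database. The workbook URL is hypothetical; requires Windows,
# the MSOLAP OLE DB provider, and the adodbapi package from pywin32.
import adodbapi

conn = adodbapi.connect(
    "Provider=MSOLAP;"
    "Data Source=http://sharepoint/finance/PowerPivot%20Gallery/SalesModel.xlsx;"
)

cur = conn.cursor()
# Query a dynamic management view to list the cubes the hosted model exposes --
# a quick confirmation that the workbook is behaving like an SSAS database.
cur.execute("SELECT CUBE_NAME FROM $SYSTEM.MDSCHEMA_CUBES")
for row in cur.fetchall():
    print(row)

conn.close()
```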
There are a couple of restrictions with this solution. First, the workbook can be no larger than 2 GB, due to how SharePoint manages files. While PowerPivot does a great job of compressing data, the data set can eventually outgrow 2 GB. Furthermore, the data refresh is done at the table level, which means that large tables can be painful to refresh because all of the data in the table is removed and reloaded.
Converting to Tabular
When the PowerPivot model exceeds the 2 GB size limitation in SharePoint, or when more sophisticated requirements such as partitions or more granular security arise, the PowerPivot model can easily be converted to a Tabular model in SQL Server 2012 Analysis Services. The Tabular model was added to the SQL Server Analysis Services offering alongside the Multidimensional model (that is, data cubes). The Tabular model is an in-memory data store built on the same xVelocity engine that is in PowerPivot. If size is the only consideration, SQL Server 2012 Data Tools supports the Import from PowerPivot option to create the Tabular model, and you are ready to go. If you need to implement partitioning for more selective data refreshes, or you need more granular security, some modification of the model will be required after it has been imported. Beyond scalability and size limitations, the Tabular model adds management flexibility, more granular security, and the Visual Studio development platform.
Going Beyond xVelocity
Now imagine that the PowerPivot workbook originally created by the business has grown well beyond its starting point. What are the next steps? The traditional world of BI is still at your disposal, and now you have a data model from which to build. Cubes, data marts, and even data warehouses can be generated from the models created in PowerPivot, or from modifications of them. But the real question is, should you?
Rethinking the Role of the Data Warehouse
As you can see from the previous sections, the data warehouse is not what it used to be in this scenario. Does a data warehouse still fit into this model of BI delivery? In a word, yes. But the data warehouse needs to take on a new role. More precisely, the nature of the data changes: data warehouses have always been places to put data for use by users, and data warehouses, operational data stores, and data marts still serve a number of purposes in this design pattern. The new goal of a data warehouse should be to surface data for users to consume with PowerPivot. With PowerPivot's powerful capabilities for handling various data sources and relationships, data warehouses will likely need to focus on "edge" cases or specific scenarios where more direct data access will not work, such as complex ERP data sources or sources with sensitive data. A key difference in the world of modern BI is that data will be enhanced, or turned into information, by the users and not by the data warehouse architects. And that is exactly where insight should happen.
While the role of the data warehouse as the primary store of accessible corporate data is diminishing, the data warehouse will still be a valuable asset for many companies. PowerPivot highlights gaps in the data needed to gain additional insight, which serves as business justification for investments in enterprise-grade solutions and platforms such as data warehouses. A key change in the approach to data within an organization is to scale the data warehouse only large enough to get started. Communicating requirements at a high level to support a scaled-down data warehouse takes a fraction of the time it takes to plan for everything. If you omit critical needs in the early stages of developing the data warehouse, you can revisit those needs later with a much better-informed perspective. When PowerPivot authors need content added or modified in the data mart, they are capable of providing very precise requirements, which in turn reduces the time required to implement the change in the data warehouse.
Excel Isn’t Just for Data
Excel is no longer just a spreadsheet. With very little work, users can create dashboards, interactive analytic visualizations, and even reports using Excel. With the release of Excel 2013, Microsoft has unleashed a premium BI data visualization tool that everyone already knows how to use. As stated previously, Microsoft has positioned Excel as the de facto standard BI tool. This is a wise move considering the massive adoption of Excel within organizations.
Excel supports slicers, charts, conditional formatting, sparklines, and pivot tables. And that is just the beginning. Interactivity is key to the updated user experience in Excel 2013, with features such as the Timeline filter, Quick Explore, and Quick Analysis, to name a few. GeoFlow, a 3D mapping add-in currently in preview, allows you to create very cool geographic representations of data that actually "move" through time. Power View is also embedded in Excel 2013; this breakthrough visualization tool from Microsoft supports mapping, time series, and highly interactive analysis.
These tools are not limited to the desktop. Once workbooks are deployed to SharePoint 2013 or Office 365, data consumers can use them online without any concerns about what is installed on their desktops. This solution also works as your first step into mobile BI, as SharePoint and Office 365 expose the Excel Web App to tablets such as the Microsoft Surface RT and the Apple iPad.
A Glimpse into the Future
As we look into the future, Microsoft is “all-in” with Excel as the enterprise BI client. As a result, we should expect to see continued innovations in the product. This includes various preview products such as Data Explorer and GeoFlow. With the introduction of Office and SharePoint Apps, more visualization and data tools will be made available through the Office Store.
What about ETL for Excel? That is where Data Explorer comes in. While currently in preview, this add-in allows you to discover data in the public arena and retrieve data from additional data source types such as Hadoop. But what makes it truly powerful is its capability to "shape" data. Data Explorer uses a series of steps to manipulate the requested data before it lands in Excel. This includes adding columns, changing and separating data, and various other transformation tasks usually reserved for ETL tools such as SQL Server Integration Services (SSIS). While not as powerful as SSIS at this time, this capability allows the business user to create an ETL process to load the data.
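For readers who think in code, the pandas sketch below mirrors the kind of step-wise shaping described above: bring in raw data, separate a column, add a calculated column, filter, and land the result for analysis. Data Explorer records equivalent steps through its own interface and formula language inside Excel; the file and column names here are hypothetical.

```python
# Illustration only: a step-wise shaping pipeline similar in spirit to what
# Data Explorer records. File name and column names are hypothetical.
import pandas as pd

raw = pd.read_csv("web_orders_export.csv")                   # step 1: bring in the raw data

names = (
    raw["CustomerName"].str.split(" ", n=1, expand=True)     # step 2: separate one column
       .rename(columns={0: "FirstName", 1: "LastName"})      #         into two
)

shaped = (
    raw.join(names)
       .assign(OrderTotal=raw["Quantity"] * raw["UnitPrice"])  # step 3: add a calculated column
       .query("OrderTotal > 0")                                # step 4: filter out bad rows
)

shaped.to_excel("shaped_orders.xlsx", index=False)            # step 5: land the result for analysis
```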
Apps for Excel
This is how we see the modern BI solution taking shape: even with a data warehouse, the primary goal is to get the data into Excel. In the end, data is in the warehouse; information is in Excel. As data warehouses continue to age without grace, more people are going after the operational data directly. Traditionally, requesting reports in large IT organizations requires opening projects, planning, and weeks to months of labor to complete. Give it to me in Excel, and I can show my boss what she needs in hours.
The role of IT will be that of data provider more than report writer or information builder. Developers and data teams will create more ways to access data while meeting the organization's needs for security and performance. By enabling the people who best understand the data to harness the power of Excel 2013, timely and valuable insight is delivered at a much lower cost in resources and time. When IT enables business, everyone wins.