Why are data silos so prevalent when we really ought to know better? We know that it’s risky to have inconsistent data. We strive for the “single version of the truth” along with our IT and high tech vendors.
So, with all this awareness, why does almost every company end up with data silos?
There are two causes that I’ve seen in my consulting work. First, companies typically organize BI- and DW-related projects on a tactical, project-by-project basis. This makes data silos very, very likely. Second, these tactical projects are designed without an overall information architecture blueprint. This absolutely, positively guarantees a silo. You “can’t get there from here” if you don’t know where you are trying to go.
There are several reasons why otherwise smart people create projects on a tactical basis without an overall architecture:
Funding - Projects are funded by a group representing an organization or business process. These groups are focused mainly on their own information needs and delivering business value in the context of those needs. Building an enterprise-wide solution is not in their minds nor in their charter.
Confusion - Industry jargon creates the impression that things are different, when they’re actually similar. There are many similarities between CRM (customer relationship management), SCM (supply chain management), budgeting and forecasting, performance management and balanced scorecards. Many people do not see that each has data, data integration and business intelligence that can be shared across projects.
Technology - People tend to treat different technologies as different projects and applications. ETL (extract, transform and load), SOA (service-oriented architecture), EAI (enterprise application integration) and data virtualization are treated as separate entities rather than as different methods of implementing data integration.
Disconnects - Data warehousing and enterprise applications are treated as separate worlds. Even though the business person may need reporting from operational applications and the DW, different projects are created to implement reporting from each of these data sources. And often IT splits its ERP and DW people into separate groups, further perpetuating silos.
No Architecture - Finally, without an information architecture blueprint, how is anyone to know that these projects should be connected? If you split up building a house among various contractors and specialists, such as electricians and plumbers, without a blueprint and without these groups talking to each other, it is unlikely the house will get built correctly. It’s the same with a DW or BI project. You need the blueprint (information architecture), especially if the projects are set up tactically and on a project-by-project basis. Building without one is called accidental architecture, and it creates data silos.
Data silos produce poor-quality data that is often inconsistent, waste resources and time on overlapping and redundant projects, cost more money to build and maintain, and ultimately keep the business from getting information in a consistent, comprehensive and current manner. Data silos also encourage the creation of data shadow systems or spreadmarts, further exacerbating the problems.
How do we get out of this? Stay tuned for more posts in the People, Process and Politics series.






I'm working with Wayne Eckerson (Director, TDWI Research, The Data Warehousing Institute) on their new Best Practices Research report, "Strategies for Managing Spreadmarts and Integrating with Microsoft Office," which will be published in January 2008.
It was interesting to read a recent article from DMReview.com
Well, it does matter how business users get information. It matters because, quite often, they get it from data shadow systems, which are groups of spreadsheets and local, customized databases (often Microsoft Access and statistical databases) created by business groups to gather data for their users. While these systems provide exactly the information that business users are asking for, they are rarely part of an enterprise's official data warehouse or corporate performance management strategy. Because they sit outside the purview of the IT group, they often spawn data silos with the usual problems of inconsistency and quality.