Once upon a time, there was the enterprise data warehouse. The idea seemed to make sense: put all the data you want to analyze in one place, and then you can do all the analysis you might ever want to do. Economies of scale will make analysis easier and cheaper than having to figure out how to look at data sitting in lots of little piles scattered all over the place, which is messy and inefficient. The elegance of a single, optimized solution was very appealing.
And so we built massive monuments to data science. More and more data was sucked into the data warehouse, with complex ETL jobs to change it into the One True Format the data warehouse required. Data flowed into the data warehouse from all over the organization, and a future of endless insights gleaned from Big Data seemed inevitable.
But there was a problem.
And maybe our company would acquire another one, and then we’d have two data warehouses. Or a business unit would decide that they needed to do things a bit differently, so they’d create another data warehouse that was also, confusingly, an enterprise data warehouse.
[...]