You might be looking at your data & analytics tech stack and asking yourself: is this good enough, or is it worth investing in major improvements? It’s a tough question to answer, because the benchmark for what warrants the spend on a more contemporary architecture differs radically depending on your industry, your data volumes, your users’ needs and, really, your ability to answer critical business questions today. Users might be complaining, but then again, they might be squeaky wheels who aren’t familiar with the full set of features currently available.
Before we begin, a few quotes for your reading and consideration pleasure. My favorite principle behind the Agile Manifesto is:
Simplicity - the art of maximizing the amount of work not done - is essential.
I’m a huge believer in NOT doing things that are unnecessary. I love cutting scope. In fact, despite what many people believe about consultants, I LOVE cutting entire projects when the value proposition is questionable. Then again, in the spirit of Patrick Verona’s convictions in 10 Things I Hate About You,
Don't let anyone ever make you feel like you don't deserve what you want.
1. Is your industry radically evolving at warp speed?
Fast-moving industries often bring rapidly shifting targets, benchmarks and objectives. With these changes, your data needs to be able to evolve at a matching speed – and ideally, even get you thinking ahead of the curve to gain an edge over your competitors. For a recent data evolution project with a quick-serve fast food giant, this was absolutely the case. Within the last decade, the way we order and consume food has radically evolved. Most brands now have a mobile app or some mechanism for online ordering. Even more brands, including small mom-and-pop locations without the tech infrastructure to create an app for themselves, have partnered with delivery service providers to get food into the hands of customers without them having to leave their couch. Could you imagine what would have happened to this industry if COVID-19 had hit 10 years earlier, before these infrastructures were in place? It’s hard to remember the times before we could order anything we wanted in a few clicks from our phones, but it was in those times that most of the business intelligence architecture was implemented for these large restaurant corporations. Back then, the ability to quickly bring on new sellers for your brand or integrate new customer feedback channels into your core data set wasn’t a pressing issue. No one was really doing these things at a scale that mattered much yet. But by the time 2015 rolled around, these had started to become significant revenue channels for brands, and the ability to analyze sales results holistically across a number of experience channels surged in priority. Marketing, Sales and Operations DESPERATELY needed to invest in data evolution for these new channels, but the existing infrastructure didn’t make this possible.
2. Is your current architecture scalable?
Before solutions like Redshift and Snowflake came around to support the OLAP (online analytical processing) use case, traditional data architectures often involved cubing your big transactional data so that you could quickly analyze aggregations on select dimension sets. Cubes basically take every possible dimension combination with your facts, and (often overnight in big batch jobs) pre-aggregate these results so analysts can explore the rolled-up data by product, or day, or whatever other dimension slice is supported in the cube. In this fast-food example, trying to cube the amount of data flowing into our transactional systems every night led to problems, almost weekly, that would stop the decision-making of sales analysts in their tracks:
- Cubes would take 12+ hours to build, and if there was a single failure, it could take days to rebuild and recover.
- Cubes were virtually impossible to change; the amount of work and complexity that would go into adding even a single dimension in the model would introduce more risk than it was worth. So, the behavior around any new order capture mechanisms or marketing channels was impossible to analyze distinctly from the traditional data set.
- Cubes prevented analysts from being able to get down to any transactional behavioral data any more granular than the lowest dimension level in the cube (which was usually just product + hour).
The cubes were great for giving sub-second response times to queries across all of these transactions. BUT they couldn’t scale to additional dimensions without a significant amount of work. And for this fast food client, stakeholders joked they had to wait about 10 years for new data requests to make it through the pipeline. (And they weren’t necessarily wrong…)
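To make the cube idea concrete, here is a minimal sketch in Python. It is not the client's implementation or any particular OLAP product; the sample rows, the field names and the build_cube function are all made up for illustration. It pre-aggregates a measure for every combination of the chosen dimensions, which is also why adding a single dimension gets expensive.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical transaction rows; field names and values are illustrative only.
SALES = [
    {"product": "burger", "hour": 12, "channel": "mobile", "amount": 5.0},
    {"product": "fries", "hour": 12, "channel": "counter", "amount": 2.0},
    {"product": "burger", "hour": 13, "channel": "mobile", "amount": 5.0},
    {"product": "fries", "hour": 12, "channel": "mobile", "amount": 2.0},
]


def build_cube(rows, dimensions, measure):
    """Pre-aggregate `measure` for every subset of `dimensions`.

    Returns a dict keyed by the tuple of dimension names in a grouping;
    each value maps a tuple of dimension values to the summed measure.
    """
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


# Two dimensions produce 2**2 = 4 pre-computed groupings.
cube = build_cube(SALES, ("product", "hour"), "amount")
print(cube[("product",)][("burger",)])  # total burger sales
print(len(cube))

# Adding one dimension doubles the number of groupings to 2**3 = 8.
bigger = build_cube(SALES, ("product", "hour", "channel"), "amount")
print(len(bigger))
```

Lookups against the finished cube are plain dictionary reads, which is where the fast response times come from. The price is that every grouping has to be rebuilt when the data or the dimension list changes, and nothing below the finest grouping can be queried at all.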
In fast food product sales analysis, new dimensions like order capture and dining mode become important as the industry evolves, and so does the ability to inspect the behavior within each transaction at a very granular level, in order to inform a more advanced customer engagement model. If a customer selects product A, which product are they most likely to select next? Based on a mobile customer’s past transaction history, which deals or offers are most likely to result in a conversion? You can’t query individual transactions from the cube – and querying the underlying SQL tables directly requires a level of technical proficiency that many business users don’t have.
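The "which product comes next?" question only works if you can see the line items inside each transaction. The sketch below shows the general shape of that analysis in Python. It assumes each order is available as a list of products in the sequence they were selected; the sample orders and both function names are hypothetical, and a real engagement model would be far richer than a simple count.

```python
from collections import Counter, defaultdict

# Hypothetical orders: each list holds products in the order they were selected.
ORDERS = [
    ["burger", "fries", "cola"],
    ["burger", "fries"],
    ["burger", "cola"],
    ["nuggets", "fries", "cola"],
]


def next_product_counts(orders):
    """Count, for each product, which product was selected right after it."""
    followers = defaultdict(Counter)
    for items in orders:
        for current, following in zip(items, items[1:]):
            followers[current][following] += 1
    return followers


def most_likely_next(followers, product):
    """Return (next_product, share_of_cases), or None if nothing ever followed."""
    counts = followers.get(product)
    if not counts:
        return None
    following, hits = counts.most_common(1)[0]
    return following, hits / sum(counts.values())


followers = next_product_counts(ORDERS)
print(most_likely_next(followers, "burger"))
print(most_likely_next(followers, "cola"))
```

With this sample data, fries follow a burger in two of three cases, and nothing ever follows a cola. A cube rolled up to product + hour has already thrown away the sequence this relies on.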
As the volume of the data set grew and grew, the cubes became increasingly fragile. The polling pipeline that retrieved the data from the stores and brought it into the central warehouse had failures every single day. There were so many failures that at a certain point, it was almost impossible to quantify the amount of data that was missing. (Spoiler alert: we uncovered and backfilled thousands of missing records as we migrated to a new tech stack). So for us, the architecture couldn’t scale. In fact, it was struggling to support the basic use cases within the current state.
3. Do you have “BIG” data?
How do we measure data size? In BYTE SIZE chunks, that is (I’m here all day for the puns, folks)! 8 bits (binary digits) make up a byte. A terabyte is 1000^4 bytes. As data technology evolves quickly, the measure for what constitutes “big” is also changing. In the past, data volumes into the terabytes were difficult to query with adequate performance (without the aforementioned cube approach, which limits your querying flexibility as dimensions change). People are now talking in terms of ZETTABYTES (1000^7 bytes), and IDC estimates we’ll acquire 175 zettabytes of data worldwide by 2025. HOLY SMOKES that’s a lot of data and a LOT of spinning screens waiting to spit out something meaningful.
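If the powers of 1000 are hard to picture, a few lines of Python can do the arithmetic. This sketch uses the decimal (SI) prefixes quoted above; the bytes_in helper is just an illustrative name.

```python
BITS_PER_BYTE = 8

# Decimal (SI) prefixes: each step up multiplies by 1000.
DECIMAL_PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta"]


def bytes_in(prefix):
    """Return the number of bytes in one <prefix>byte, e.g. 'tera' -> 1000**4."""
    power = DECIMAL_PREFIXES.index(prefix) + 1
    return 1000 ** power


terabyte = bytes_in("tera")
zettabyte = bytes_in("zetta")

print(terabyte)                     # 1000000000000
print(zettabyte // terabyte)        # 1000000000 terabytes per zettabyte
print(175 * zettabyte // terabyte)  # 175000000000 terabytes in 175 zettabytes
```

Put differently, one zettabyte is a billion terabytes, so the 175 zettabyte estimate works out to 175 billion terabytes.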
Let’s think back to fast food. How much data is generated from a single transaction? You may think of your order as containing fries, a drink and a burger. Of course we capture those details, but a single transaction can result in a wealth of attribution:
- Products contained in the order;
- Every minute modification (extra tomatoes, no sauce, add bacon for $.50);
- How your order was captured;
- Who took your order;
- How quickly your order was fulfilled;
- How your order was obtained;
- How your order was paid for;
- And later, if the customer so chooses, how they felt about their order experience via survey results.
This one order could enrich 3-5 fact tables storing different details. In a single lunch hour, you could have tens or hundreds of these transactions from a single location. All of this adds up, and the sheer volume can be overwhelming if you’re not sure how to wrangle and capitalize on all of this detail. So, is your data big? Well, you have to determine whether your current architecture has the processing power to handle your users’ requests within a reasonable response time (ideally less than 30 seconds). If you have less than a terabyte of data, you’re probably not “BIG” yet. If you have 10+ terabytes, you may want to seriously consider contemporary technologies that can compute this volume of records with ease, along with the data evolution required.
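For a rough feel of where you land, a back-of-the-envelope estimate is often enough. The Python sketch below multiplies out a year of orders and applies the rule of thumb above. Every input in the example (5,000 locations, 1,000 orders a day, 4 fact rows per order, 200 bytes per row) is a made-up placeholder rather than a figure from the client, and the wording for the range between 1 and 10 terabytes is my own assumption, since the rule of thumb only covers the two ends.

```python
TERABYTE = 1000 ** 4  # decimal terabyte, in bytes


def estimate_yearly_bytes(locations, orders_per_day, fact_rows_per_order, bytes_per_row):
    """Rough yearly data volume: every order writes several fact-table rows."""
    return locations * orders_per_day * 365 * fact_rows_per_order * bytes_per_row


def size_verdict(total_bytes):
    """Apply the rule of thumb: under 1 TB is not big, 10+ TB probably is."""
    terabytes = total_bytes / TERABYTE
    if terabytes < 1:
        return "probably not big yet"
    if terabytes >= 10:
        return "big enough to consider contemporary technologies"
    # Assumption: the rule of thumb is silent between 1 and 10 TB.
    return "in between: let query response times decide"


# Hypothetical chain; all four inputs are placeholders.
yearly = estimate_yearly_bytes(
    locations=5000, orders_per_day=1000, fact_rows_per_order=4, bytes_per_row=200
)
print(yearly / TERABYTE)     # yearly volume in terabytes
print(size_verdict(yearly))
```

These placeholder inputs come out at roughly 1.46 terabytes a year, which falls in the in-between zone. That is where the 30-second response time test matters more than the raw size.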
4. Can you make the right decisions for your business?
I once met a business owner who would take pencil and paper reports from his storefronts – and he ran a successful franchise. Could it have been MORE successful? Probably, but by no means was he struggling to generate a decent profit margin within his current processes. Sometimes, all of this fancy pantsy architectural data evolution stuff doesn’t matter. At the end of the day, can you make the right decisions to grow your business in the way that meets your strategic goals? Some executives have an amazing gut and mind for strategically driving improvements to the top and bottom line. Others…don’t so much. The larger your enterprise is, the more it will need good data, because the decision making is decentralized across many different leaders and often siloed departments. Before you go down the path of considering new technology, I encourage you to take a deep look at what decisions you need to make today. What questions do you have? What data do you need to answer them? Can you access that data easily? Can you start to transform the information into insights? If you can do all of these things, then CONGRATULATIONS, you probably don’t need to invest time and energy into a data transformation right now. If you’re nodding your head to all of the points listed above, and your executives are struggling to run the business, then perhaps it’s time to take that dive into the deep end and take a sip of enlightened data tech tea.