When to Move to a Data Analytics Platform
Data analytics platforms and “data clouds” have become more accessible as the technology matures and the costs to acquire and maintain them fall to historic lows. Yet organizations still take a cautious approach to implementing a data warehouse and/or standalone analytics platform.
Two of the most memorable ways organizations have expressed the need for data analytics:
Every office will have a flat screen on the wall with a real-time dashboard
Pull out an iPad in the back of a cab to prepare while on the way to a meeting
Achieving these goals is absolutely possible, even if they don’t sound entirely relatable to you.
The catch is that practically all organizations start with, and continue to do, “analytics” in spreadsheets or the code-based equivalent, following roughly three steps:
Export data or connect to systems
Perform some tasks to gain insights
Share insights and datasets via email, chat, or a shared drive
There might be a fourth step: making the process repeatable, so others can open the same Excel workbook or Jupyter notebook and reproduce the analysis later with fresh data.
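As a rough illustration of the code-based equivalent, the steps above could be sketched in Python as follows. The column names (region, order_total), the inline sample data, and all function names are hypothetical, chosen only to show the shape of the workflow.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export from a source system; in practice this would be a file
# or an API response rather than an inline string.
EXPORTED_CSV = """region,order_total
East,120.50
West,80.00
East,39.50
"""

def load_rows(source):
    """Step 1: read the exported data into a list of dicts."""
    return list(csv.DictReader(source))

def revenue_by_region(rows):
    """Step 2: derive an insight, here total order value per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += float(row["order_total"])
    return dict(totals)

def write_summary(totals, target):
    """Step 3: write a summary that can be shared by email, chat, or drive."""
    writer = csv.writer(target)
    writer.writerow(["region", "revenue"])
    for region in sorted(totals):
        writer.writerow([region, f"{totals[region]:.2f}"])

def run(source, target):
    """Step 4: one entry point, so the analysis can be rerun on fresh data."""
    totals = revenue_by_region(load_rows(source))
    write_summary(totals, target)
    return totals

summary = io.StringIO()
totals = run(io.StringIO(EXPORTED_CSV), summary)
print(summary.getvalue())
```

Wrapping the three steps in a single run function is what makes the fourth step possible: anyone can point it at a fresh export and get the same summary layout.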
Ultimately, the reason organizations stay in this cycle is typically one or more of these:
Data is in disparate systems and not integrated
Derived data needs to be created from other data (e.g. formulas, feature engineering)
System doesn’t have analytics or its feature set is limited
Desired visualizations are not possible
Resources don’t exist or are too busy to take on additional work
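To make the first two reasons concrete, here is a minimal sketch of what analysts often end up doing by hand: joining records from two separate systems and adding a derived column. The record layouts, the field names, and the integrate function are assumptions made for illustration, not the schema of any particular product.

```python
# Hypothetical records exported from two separate systems.
crm_contacts = [
    {"email": "ana@example.com", "segment": "enterprise"},
    {"email": "raj@example.com", "segment": "smb"},
]
shop_orders = [
    {"email": "ana@example.com", "total": 200.0},
    {"email": "ana@example.com", "total": 100.0},
    {"email": "raj@example.com", "total": 50.0},
]

def integrate(contacts, orders):
    """Join the two sources on email and add derived columns."""
    by_email = {}
    for contact in contacts:
        by_email[contact["email"]] = {**contact, "order_count": 0, "revenue": 0.0}
    for order in orders:
        record = by_email.get(order["email"])
        if record is None:
            continue  # order with no matching CRM contact is skipped here
        record["order_count"] += 1
        record["revenue"] += order["total"]
    for record in by_email.values():
        # Derived value, the equivalent of a spreadsheet formula column.
        count = record["order_count"]
        record["avg_order_value"] = record["revenue"] / count if count else 0.0
    return list(by_email.values())

customers = integrate(crm_contacts, shop_orders)
for customer in customers:
    print(customer)
```

This works for two small exports, but every new source system adds another join to maintain by hand, which is the kind of work a warehouse or analytics platform is meant to centralize.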
To break free from this cycle, it’s worth getting started with an analytics platform and/or data warehouse.
Databox is a decent entry point for organizations with zero to few technology resources
Power BI and Tableau are long-standing leaders in visualization and lighter-weight data engineering tasks
Snowflake and Microsoft Fabric are best suited for large volumes and/or complex data engineering needs
These five questions are a starting point to understanding your organization’s needs:
Do you have datasets across multiple systems (e.g. CRM, email marketing, e-commerce, website analytics, core product/service) that are not represented in a single place?
Do you intend to regularly perform data engineering and machine learning on your datasets (e.g. regression, classification, recommender algorithms)?
Are your data volumes large enough to slow down or break your existing process, or are they growing rapidly?
Do multiple teams need concrete definitions and the same view of datasets (e.g. a “customer”, an “order”, etc.)?
Do you have regulatory or internal compliance constraints that require auditable and customizable security and access controls around your data?
If you answered yes to more than half of those, it’s time to consider collecting and analyzing your data in a standalone environment.
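The "more than half" rule can be written down as a small checklist. The sketch below is one way to do so; the question keys and the function name are hypothetical shorthand for the five questions above.

```python
# Hypothetical short keys standing in for the five questions, in order.
QUESTIONS = (
    "data_spread_across_systems",
    "regular_data_engineering_or_ml",
    "volumes_large_or_growing_fast",
    "teams_need_shared_definitions",
    "compliance_requires_access_controls",
)

def should_consider_platform(answers):
    """Return True when more than half of the questions were answered yes.

    `answers` maps question keys to booleans; missing keys count as "no".
    """
    yes_count = sum(1 for question in QUESTIONS if answers.get(question, False))
    return yes_count > len(QUESTIONS) / 2

example_answers = {
    "data_spread_across_systems": True,
    "regular_data_engineering_or_ml": False,
    "volumes_large_or_growing_fast": True,
    "teams_need_shared_definitions": True,
    "compliance_requires_access_controls": False,
}
print(should_consider_platform(example_answers))
```

With five questions, the threshold works out to three or more yes answers.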
TLDR: Data warehouses with standalone analytics platforms are meant for organizations with a data-driven or data-informed mindset, not just enterprises.