Data Architecture with SAP – Modern Data Stack
2024-1-9 14:37:12 Author: blogs.sap.com

Being interested in the Modern Data Stack (MDS) means being interested in current trends and developments in data management for analytics.

The term Modern Data Stack is not well defined. We can see the MDS as a collection of modular, cloud-based ‘as a Service’ tools that work seamlessly together in a data ecosystem, are comparatively easy to use, and offer a simple billing model.

How it began

The classical MDS arose with the first cloud data warehouses like Amazon Redshift and Google BigQuery in the 2010s. Through the cloud, data warehouses became much easier to obtain and brought a higher degree of automation. Around these central data platforms, a first ecosystem of easy-to-use, cloud-native data integration and analytics tools created the idea of not just a, but the MDS.

Fig. 1: Timeline Modern Data Stack until 2015

In these first years the MDS was fairly simple and covered the core of what is still needed for data and analytics today:

Fig. 2: Core functionality of a data stack

So far, this is a typical approach that is also very common in SAP environments, with SAP BW and SAP BusinessObjects BI on-premises or SAP Datasphere and SAP Analytics Cloud in the cloud.

The evolution of the data stack

But over the years we have seen several impulses and developments in modern data management, such as:

  • Data plays an increasingly important role for companies
  • Increasing demand for data and analytics solutions
  • More and more cloud-native offerings, not only for data and analytics
  • Unbundling of functionality
  • Growing demand for the best fit of a solution to maximize value from data
  • Agile working environments
  • Decentralized responsibilities for data and analytics across the enterprise

Today cloud data platforms like Snowflake or Databricks are the forerunners of the Data Lakehouse, an architecture that forms the modern, simplified and flexible center of today's MDS.

This growing interest in modern cloud data solutions went hand in hand with unbundling. One example is splitting the ETL process (Extract, Transform, Load) across different solutions, which coined the ELT approach: separate tools handle extraction and loading into a data platform (like Fivetran or Rivery), while others handle the data transformation (like dbt or Coalesce). This allows better scalability and makes it possible to split the work between different roles such as Data Engineer and Analytics Engineer. The idea started around 2016, and the demand for more specialized functionality led to what is sometimes called a ‘Cambrian explosion’ of data tools, or Modern Data Stack 2.0:

Fig. 3: Evolution of the MDS with example vendors
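
To make the ELT idea described above a bit more concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for a cloud data platform. The table names, the sample records and the function names are assumptions made for illustration; they do not come from Fivetran, Rivery, dbt or any other tool mentioned here. The point is only the separation of concerns: one step lands the raw data unchanged, a second step transforms it with SQL inside the platform.

```python
import sqlite3

# Hypothetical source records; in practice an extract-and-load tool would deliver these.
RAW_ORDERS = [
    ("1001", "2024-01-03", "19.90", "EUR"),
    ("1002", "2024-01-03", "5.00", "EUR"),
    ("1003", "2024-01-04", "12.50", "EUR"),
]


def extract_and_load(conn, rows):
    """EL step: land the source rows unchanged (all text) in a raw table."""
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id TEXT, order_date TEXT, amount TEXT, currency TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def transform(conn):
    """T step: build a typed, aggregated model with SQL inside the platform."""
    conn.execute(
        """
        CREATE TABLE daily_revenue AS
        SELECT order_date,
               currency,
               COUNT(*) AS order_count,
               SUM(CAST(amount AS REAL)) AS revenue
        FROM raw_orders
        GROUP BY order_date, currency
        """
    )


def run_pipeline():
    """Run both steps against an in-memory database and return the modeled rows."""
    conn = sqlite3.connect(":memory:")
    extract_and_load(conn, RAW_ORDERS)
    transform(conn)
    rows = conn.execute(
        "SELECT order_date, order_count, revenue "
        "FROM daily_revenue ORDER BY order_date"
    ).fetchall()
    conn.close()
    return rows
```

Because the two steps only share the raw table, they can be owned by different roles: a Data Engineer could maintain the loading, while an Analytics Engineer could maintain the SQL model.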

As a result, the MDS today is much more complex, in a very demanding world where companies want to get the most out of their data. What we typically see is therefore a Best-of-Breed ecosystem, well adapted to the needs of the company and the context in which it is used. The following framework shows the typical options for modern analytical data management in the sense of the MDS:

Fig. 4: Modern Data Stack Framework

The framework shows that modern data management needs specific integration options, e.g. for tracking customer events (Event Tracking) or for sending data back to operational systems such as CRM or ERP (Reverse ETL), to make use of the data where business happens. Furthermore, data governance tools like Data Catalog and Data Observability are playing an increasingly important role in many data landscapes today.
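
As a rough illustration of Reverse ETL, the following Python sketch reads modeled results from the analytical platform and hands them to an operational system. Everything in it is assumed for the example: the `customer_scores` table, the `churn_risk` threshold, the record shape and the `send` callback, which stands in for whatever CRM or ERP client a real setup would use.

```python
import sqlite3


def build_warehouse():
    """Create an in-memory stand-in for the analytical data platform."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_scores (customer_id TEXT, churn_risk REAL)")
    conn.executemany(
        "INSERT INTO customer_scores VALUES (?, ?)",
        [("C1", 0.82), ("C2", 0.10), ("C3", 0.65)],
    )
    return conn


def select_updates(conn, threshold=0.6):
    """Read modeled results and shape them as records for the operational system."""
    cursor = conn.execute(
        "SELECT customer_id, churn_risk FROM customer_scores "
        "WHERE churn_risk >= ? ORDER BY customer_id",
        (threshold,),
    )
    return [
        {"customer_id": cid, "fields": {"churn_risk": risk, "segment": "at_risk"}}
        for cid, risk in cursor
    ]


def sync(updates, send):
    """Hand each record to `send`, a placeholder for a CRM or ERP client call."""
    for update in updates:
        send(update)
    return len(updates)
```

Calling `sync(select_updates(build_warehouse()), print)` would simply print the records that a real integration would write back to the operational system.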

Besides classical transformation work, AI-based transformation to find complex patterns in data or to apply predictive analytics is becoming more and more common in modern data solutions, and a metric layer delivers decoupled, central and governed KPI and metric calculations.
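
The idea of a metric layer can be sketched in a few lines of Python. The example assumes a hypothetical central dictionary of metric definitions and a small `orders` table; it is not the API of any particular metric layer product. Consumers ask for a metric by name and a dimension, and the calculation itself is defined in exactly one place.

```python
import sqlite3

# Hypothetical central metric definitions: name -> SQL aggregate expression.
METRICS = {
    "revenue": "SUM(amount)",
    "order_count": "COUNT(*)",
    "avg_order_value": "SUM(amount) / COUNT(*)",
}


def metric_query(metric, dimension, table="orders"):
    """Compile a governed metric definition into a SQL query grouped by one dimension."""
    if metric not in METRICS:
        raise KeyError(f"unknown metric: {metric}")
    # Only plain identifiers are accepted, since they are placed directly into the SQL text.
    if not dimension.isidentifier() or not table.isidentifier():
        raise ValueError("dimension and table must be plain identifiers")
    return (
        f"SELECT {dimension}, {METRICS[metric]} AS {metric} "
        f"FROM {table} GROUP BY {dimension} ORDER BY {dimension}"
    )


def demo():
    """Evaluate one metric by region on a tiny in-memory example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("EMEA", 10.0), ("EMEA", 30.0), ("APJ", 8.0)],
    )
    rows = conn.execute(metric_query("avg_order_value", "region")).fetchall()
    conn.close()
    return rows
```

If the definition of `avg_order_value` changes, it changes once in `METRICS`, and every report or tool that requests the metric by name picks up the same calculation.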

SAP and the Modern Data Stack

To come back to SAP: it delivers modern solutions like SAP Datasphere in a Best-of-Suite approach, where we also see more and more adaptation to what an MDS has to offer. SAP Datasphere is more integrated in its data management functionality and offers the best integration into the SAP application ecosystem; in the last year, SAP has also integrated services and functionality such as Data Lake, orchestration and a data catalog for data governance.

In 2023 SAP started the transformation from SAP Data Warehouse Cloud to SAP Datasphere to deliver the concept of a Business Data Fabric. Additionally, it initiated an Open Data Partner Ecosystem with vendors like Databricks, Google, Collibra, Confluent and DataRobot to extend functionality and integrate these solutions into a common and efficient ecosystem.

With this ongoing transformation, three questions become more relevant concerning the MDS:

  1. How far is the idea of the MDS transferable to SAP Data & Analytics solutions like SAP Datasphere, SAP Analytics Cloud and others?
  2. As companies today often have more than one data & analytics stack in place, how good is the interoperability and optimization of the different vendors and components?
  3. Does it really make sense to establish or maintain more than one data & analytics platform in your company?

Together with colleagues from INFOMOTION and SAP, we set out to answer these questions in a whitepaper that will be published soon. I look forward to hearing your ideas and experiences about SAP and the MDS and to discussing them with you!


Source: https://blogs.sap.com/2024/01/09/data-architecture-with-sap-modern-data-stack/