SAP Datasphere & Databricks: Pioneering the Data-Driven Future Together
2023-11-23 15:23:50 Author: blogs.sap.com

Commencing the Journey of an Open Data Ecosystem with SAP

In March 2023, SAP announced SAP Datasphere, a comprehensive data service which aims to deliver seamless and scalable access to crucial business data.

SAP also announced its vision of building a powerful open data ecosystem around SAP Datasphere. To realize this vision, SAP has partnered with leading technology vendors. The picture below gives a brief overview of these partnerships.

Picture Credits: SAP (March 2023)

In May 2023, SAP also announced its partnership with Google Cloud as part of the Open Data Ecosystem.

In today’s business-driven tech landscape, customers operate a diverse and complex ecosystem of technology stacks to meet their evolving needs. This diversity lets them build applications tailored to their specific needs, but it can also quickly result in fragmented data silos. Through their partnership, SAP and Databricks aim to break down these silos by integrating data from all kinds of sources across data planes. The partnership empowers users to unleash the power of their data through a bi-directional integration of the Data Lakehouse with SAP Datasphere.

The ‘Why’ behind SAP and Databricks Integration

There is a lot of uncertainty surrounding the integration between Databricks and Datasphere, primarily around the relevance of Databricks in the SAP ecosystem. Throughout this blog we will look at architectural concepts such as data unification, data mesh, and business fabric.

We will also take a deep dive into the why, what, and how of the SAP and Databricks integration. We will explore how organizations can benefit from embracing this integration, from enhanced data governance and scalability to real-time analytics and machine learning capabilities.

Data: The new oil of the digital age

The authorship of famous quotes is often a subject of debate, and one such quote in our digital world is about data being the new oil. When British mathematician Clive Humby said in 2006 that “data is the new oil”, he was not only comparing data to oil as a very valuable commodity. He also made the comparison because oil in its raw state is not as valuable as it becomes after refinement. Similarly, the real potential of data lies not in its raw state but in the refined version we get after processing and transformation.

Even though a debate persists over the source of this quote, it captures a powerful idea: data surrounds us and is generated continuously, knowingly or unknowingly, through clicks, transactions, or even mere swipes. Organizations collect this vast reservoir of untapped data. However, managing this ever-expanding data, particularly in an era of prevalent cloud-based solutions, poses challenges. Data spreads across different platforms in varying formats, resulting in data silos that make analysis and value extraction increasingly complex.

This is exactly where the real value of integrating SAP and Databricks comes into play. It is the key that unlocks data’s hidden treasures. Below is a reference architecture of the Datasphere and Databricks (Open Data Ecosystem) integration. In later sections of this blog, we will dig deeper into different architecture patterns that leverage the best of both worlds.

Datasphere unifies data from these different source systems through different flavors of data integration and data management. Data federation is one of the key components of this architecture: it allows data to be used without actual replication or duplication. Databricks can then work on this data and apply its advanced analytics and machine learning capabilities.
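To make the federation idea concrete, here is a toy sketch in Python. It is not the Datasphere or Databricks API: two in-memory SQLite databases stand in for an SAP and a non-SAP source, and the table names (`sales`, `web_orders`) and function names are hypothetical. The point it illustrates is only that a single query can read from both sources at query time, without first copying the rows into a central store.

```python
import sqlite3


def open_sources() -> sqlite3.Connection:
    """Open one connection with two attached in-memory databases.

    The attached databases are stand-ins (assumptions for illustration)
    for an SAP source and a non-SAP source.
    """
    conn = sqlite3.connect(":memory:")
    # ATTACH has to run before any INSERT opens a transaction.
    conn.execute("ATTACH DATABASE ':memory:' AS sap_src")
    conn.execute("ATTACH DATABASE ':memory:' AS non_sap_src")
    conn.execute("CREATE TABLE sap_src.sales (product TEXT, qty INTEGER)")
    conn.execute("CREATE TABLE non_sap_src.web_orders (product TEXT, qty INTEGER)")
    conn.executemany(
        "INSERT INTO sap_src.sales VALUES (?, ?)", [("A", 10), ("B", 5)]
    )
    conn.executemany(
        "INSERT INTO non_sap_src.web_orders VALUES (?, ?)", [("A", 3), ("C", 7)]
    )
    conn.commit()
    return conn


def federated_totals(conn: sqlite3.Connection) -> list:
    """Aggregate quantities across both sources at query time.

    Nothing is copied into the main database; each source is read in place.
    """
    query = """
        SELECT product, SUM(qty) AS total_qty
        FROM (
            SELECT product, qty FROM sap_src.sales
            UNION ALL
            SELECT product, qty FROM non_sap_src.web_orders
        )
        GROUP BY product
        ORDER BY product
    """
    return conn.execute(query).fetchall()
```

Calling `federated_totals(open_sources())` returns one total per product across both stand-in sources, while the main database itself stays empty. A real landscape would replace the attached databases with remote connections managed by the platform.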

The ‘How’ behind SAP and Databricks Integration

So far we have covered the What and the Why of this integration. Now let’s cover the ‘How’ by traversing the data landscape and discussing two patterns that are far apart yet closely related: Data Fabric and Data Mesh. The names sound very similar, maybe because we have all heard of a cloth fabric that is a mesh? Jokes aside, these two are very different approaches, each relying on a strong architectural pattern to move swiftly through the expanding data landscape.

Understanding Data Fabrics: Weaving Data into Insights

Imagine data as threads, distributed across various sources, formats, and locations. A Data Fabric acts like a loom that weaves the different threads of data into a single fabric. Unified data makes it simple for organizations to access, understand, and use data to achieve the best possible results.

Described below are a few of the key features of using a Data Fabric

Below is an architectural overview of how a seamless integration of SAP and non-SAP data sources can be achieved with the Data Fabric of SAP Datasphere and Databricks.

Picture Credits: Databricks

Data federation is at the heart of this architecture: SAP and non-SAP data are unified by federating them to Databricks, and after processing, the resulting analytics are sent back to SAP. We will have a closer look at this in the next section.

Real-Life Pragmatic Business Use Cases

Let’s explore some practical use cases where Datasphere and Databricks can empower customers to bring more value out of their data.

Deep Dive: Inventory Management & Forecasting Use case

Maintaining the right inventory levels is a major challenge for every business. Major decisions revolve around the demand for a given product and around which fulfillment location needs to be stocked with which product, so that companies can deliver products to their customers quickly.

The data fabric of Datasphere and Databricks allows companies to apply advanced analytics to the data unified in Datasphere and make informed decisions.

With the power of data federation, Databricks uses the sales data from Datasphere and feeds it to an ML model for sales forecasting that has been trained and deployed in Databricks. The insights derived from these forecasts are then sent back to the core SAP transactional systems, such as SAP S/4HANA. For example, if the inventory forecast indicates high demand for a particular product while the actual inventory is low, the system automatically generates a purchase order to replenish the stock in the SAP S/4HANA instance.
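The replenishment rule in this use case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `PurchaseOrderProposal` and `propose_purchase_orders`, the `safety_stock` parameter, and the simple shortfall formula are all hypothetical, and the sketch stops at producing proposals. How a proposal becomes an actual purchase order in the SAP system is outside its scope.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PurchaseOrderProposal:
    """Hypothetical record of a suggested replenishment."""
    product: str
    quantity: int


def propose_purchase_orders(
    forecast_demand: Dict[str, int],
    on_hand: Dict[str, int],
    safety_stock: int = 0,
) -> List[PurchaseOrderProposal]:
    """Propose an order wherever forecast demand exceeds available stock.

    Assumed rule: shortfall = forecast + safety stock - on-hand inventory.
    A proposal is created only when the shortfall is positive.
    """
    proposals = []
    for product, demand in sorted(forecast_demand.items()):
        available = on_hand.get(product, 0)  # unknown products count as zero stock
        shortfall = demand + safety_stock - available
        if shortfall > 0:
            proposals.append(PurchaseOrderProposal(product, shortfall))
    return proposals
```

For example, with a forecast of 120 units for a product, 30 units on hand, and a safety stock of 10, the function proposes ordering 100 units; a product whose stock already covers the forecast yields no proposal.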

Understanding Data Mesh: Leverage the Power of Decentralization

Data Mesh brings a shift in the way organizations handle their data. Instead of a centralized approach, Data Mesh takes a decentralized one in which data is divided into smaller units referred to as ‘domains’.
Domains allow organizations to distribute ownership and responsibility: each domain has its own specific data and therefore its own group of data owners, which allows better control over the data. This in turn enables better collaboration and efficiency. Data Mesh improves an organization’s agility by making data management more adaptable and effective for modern business needs.
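Domain ownership can be pictured with a small Python sketch. It is a conceptual model only, not how any particular platform implements domains: the `Domain` class, its `publish` and `can_read` methods, and the "shared" flag are hypothetical names chosen for illustration. The idea shown is that only a domain's owners may publish its data products, and other teams can read a product only once the owners have shared it.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Domain:
    """Hypothetical model of a data domain with its own owners."""
    name: str
    owners: Set[str]
    # Maps data product name -> whether it is shared with other domains.
    products: Dict[str, bool] = field(default_factory=dict)

    def publish(self, user: str, product: str, shared: bool = False) -> None:
        """Only owners of the domain may publish or re-publish a product."""
        if user not in self.owners:
            raise PermissionError(f"{user} does not own domain {self.name}")
        self.products[product] = shared

    def can_read(self, user: str, product: str) -> bool:
        """Owners can read everything; others only products marked shared."""
        if product not in self.products:
            return False
        return user in self.owners or self.products[product]
```

With a `sales` domain owned by one user, a product published with `shared=True` is readable by users from other teams, while an unshared product stays visible to the owners only.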

Here is a wonderful article about cloud platforms for data mesh in the automotive industry, which sheds more light on the concepts and implementation of Data Mesh.

Described below are a few of the key features of Data Mesh

Leveraging Data Mesh with Datasphere and Databricks

Now let’s bring Data Mesh into action with Datasphere and Databricks, where both tech stacks play a vital role in building a robust solution.

SAP Datasphere enables decentralized data domains (so-called “spaces”), ensuring each team has ownership and control over its specific data domain. The picture below highlights a Data Mesh architecture with Datasphere.

We recently published a detailed blog on how to unlock the power of data in the SAP ecosystem by applying Data Mesh strategies. In that blog we talk in depth about integrating SAP services into the Data Mesh journey, improved data quality and governance, accelerated innovation and collaboration, and data monetization.

Databricks plays a vital role in empowering data domains with advanced analytics and processing capabilities. This gives the domain team a self-serve platform.

This collaborative approach enhances agility, scalability, and the efficient utilization of data, making Data Mesh with Datasphere and Databricks a powerful paradigm for modern data management.

Earlier we took a deep dive into Inventory Management and Forecasting by unifying the data in Datasphere and Databricks. Let’s now explore how inventory forecasting can be optimized using the Data Mesh architecture pattern.

Conclusion

SAP Datasphere together with Databricks enables a powerful ecosystem that weaves together a Data Fabric for unification, governance, and analytical insights. This synergy also enables domain-centric ownership, real-time analytics, and fast, enhanced query processing.

This partnership is paving the way for a seamless interplay of possibilities, which is the key to unlocking the data-driven future.

You can connect with us for further details: 

Pranav Kandpal

Specialist Lead 

[email protected]

Pranav Kandpal has a focus on driving strategy, architecture, and implementation of modern data platforms to help clients in realizing value out of their data. 

Bikas Panigrahi 

Senior Manager 

[email protected] 

Bikas works in the AI & Data portfolio with a strong focus on Analytics Strategy, Transformation, Cloud Analytics Platforms, Advanced Analytics and SAP Analytics.

Georgios Marios Giannantonakis

Senior Manager 

[email protected] 

Georgios focuses on SAP Analytics strategies and architectures. He helps clients transform, create valuable data-driven strategies, and shape their digital (cloud) strategy.


Source: https://blogs.sap.com/2023/11/23/sap-datasphere-databricks-pioneering-the-data-driven-future-together/