What is SAP Datasphere?
SAP Datasphere is SAP's cloud data warehouse service. It is designed to bring data from SAP systems (S/4HANA, BW/4HANA, SuccessFactors, etc.) and non-SAP sources together into a single governed, semantically rich layer that analysts, BI tools, and downstream applications can consume. It is the successor to SAP Data Warehouse Cloud (DWC) — same product, renamed in 2023 — and it runs on top of SAP HANA Cloud.
The motivation behind SAP Datasphere is a federation-first data architecture. Rather than forcing every byte into one warehouse, Datasphere lets you replicate hot data for performance, federate cold data in place, and expose a unified business model on top regardless of where the rows physically live. The result is a semantic layer that business users query through SAP Analytics Cloud (SAC), Excel, or any OData/SQL consumer.
Key Features
SAP Datasphere ships with a feature set tuned for enterprise data integration:
- Spaces — self-contained, tenant-like units with their own storage, compute, and user membership. Spaces are how you separate dev/test/prod, business domains, or partner tenants. Database users, data access controls, and storage budgets are all scoped to a space.
- Connections — more than 40 connection types covering SAP and non-SAP sources: S/4HANA, BW/4HANA, HANA Cloud, ABAP CDS views, SuccessFactors, cloud databases, object storage, and generic OData, plus partner connectors from Dremio, Collibra, Confluent, and others.
- Data flows — graphical, ETL-style pipelines that transform data as it lands in Datasphere. A data flow reads from a source, applies a chain of transforms (joins, projections, aggregations, calculated columns), and writes to a table.
- Replication flows — lightweight, near-real-time or scheduled replication of entire tables or ABAP CDS views from a source into Datasphere with minimal transformation. Ideal for bringing S/4HANA data in continuously.
- Task chains — orchestration objects that sequence data flows, replication flows, and other tasks on a schedule with dependencies.
- Content transport — a controlled transport mechanism (CT objects) for moving models, connections, and spaces between tenants (dev → prod) without manual rework.
- Data Marketplace — a catalog where providers publish data products and consumers subscribe to them, useful for sharing governed datasets across business units or with external partners.
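To make the data flow bullet above concrete, the following is a minimal plain-Python sketch of the read, transform, write pattern it describes. It is not Datasphere code: the `orders` and `customers` rows, the column names, and the helper functions are all hypothetical stand-ins for source tables and graphical operators, chosen only to show how a join, a calculated column, a projection, and an aggregation chain together before the result lands in a target table.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two source tables.
orders = [
    {"order_id": 1, "customer_id": "C1", "quantity": 2, "unit_price": 10.0},
    {"order_id": 2, "customer_id": "C2", "quantity": 1, "unit_price": 25.0},
    {"order_id": 3, "customer_id": "C1", "quantity": 5, "unit_price": 4.0},
]
customers = [
    {"customer_id": "C1", "region": "EMEA"},
    {"customer_id": "C2", "region": "APJ"},
]


def join(left, right, key):
    """Inner join two row lists on a shared key column."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def calculated_column(rows, name, fn):
    """Add a derived column computed from each row."""
    return [{**row, name: fn(row)} for row in rows]


def project(rows, columns):
    """Keep only the listed columns."""
    return [{col: row[col] for col in columns} for row in rows]


def aggregate(rows, group_by, measure):
    """Sum one measure per distinct value of the group-by column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[measure]
    return [{group_by: key, measure: total} for key, total in sorted(totals.items())]


def run_data_flow():
    """Source -> join -> calculated column -> projection -> aggregation -> target."""
    rows = join(orders, customers, "customer_id")
    rows = calculated_column(rows, "revenue", lambda r: r["quantity"] * r["unit_price"])
    rows = project(rows, ["region", "revenue"])
    return aggregate(rows, "region", "revenue")


# The returned rows play the role of the target table the data flow writes to.
target_table = run_data_flow()
print(target_table)
```

A replication flow, by contrast, would skip almost all of the middle steps and copy the source rows across largely unchanged.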
Data Architecture
SAP Datasphere sits in the middle of SAP's modern data architecture. It is not a replacement for every component; it is the federation and semantic layer that ties them together.
A typical landscape looks like this:
- Source systems — S/4HANA, BW/4HANA, SuccessFactors, third-party databases, files in cloud storage.
- SAP HANA Cloud — the in-memory database engine that Datasphere runs on. HANA Cloud instances can be federated as peers so Datasphere queries reach them transparently.
- SAP Datasphere — replication, federation, modeling, and the semantic layer.
- Consumption layer — SAP Analytics Cloud for stories and dashboards, SAP Analytics Cloud for planning, Microsoft Excel, third-party BI tools, and any OData/SQL consumer.
The design rule is: keep data in the system that owns it when you can, replicate when you must, and always expose consumption through Datasphere models so that definitions (metrics, hierarchies, security) are governed in one place.
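As a rough illustration of that rule, the sketch below encodes it as a small decision function. Everything in it is an assumption made for the example: the `Dataset` fields, the `hot_threshold` default, and the criteria themselves are hypothetical and are not SAP guidance. Real decisions also weigh network latency, licensing, and source system load.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    queries_per_day: int          # how often consumers read this data
    source_can_serve_load: bool   # can the owning system answer queries directly?
    needs_history: bool           # are snapshots needed that the source does not keep?


def access_mode(ds: Dataset, hot_threshold: int = 1000) -> str:
    """Return 'federate' or 'replicate' for a dataset (illustrative rule of thumb)."""
    # Replicate when you must: the source cannot serve the queries or keep the history.
    if ds.needs_history or not ds.source_can_serve_load:
        return "replicate"
    # Replicate hot data for performance.
    if ds.queries_per_day >= hot_threshold:
        return "replicate"
    # Otherwise keep the data in the system that owns it.
    return "federate"


print(access_mode(Dataset("sales_orders", 50_000, True, False)))
print(access_mode(Dataset("archived_invoices", 10, True, False)))
```

Either way, consumers would still query a Datasphere model rather than the source, so the choice stays an implementation detail.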
Modeling in Datasphere
Modeling in Datasphere happens in two builders, and that split is the heart of the product:
- Data Builder is where technical data lives. You create local tables, graphical or SQL views, and intelligent lookups (Datasphere's object for matching and joining records from two entities by rule, including fuzzy matches when no shared key exists). In current releases, analytic models are also created here, on top of fact views. Output is relational or analytic: tables, views, and models ready to be consumed.
- Business Builder is the semantic layer for business users. You build business entities, fact models, and consumption models on top of Data Builder objects, describing measures, dimensions, hierarchies, and associations in business terms. Whichever builder produces it, the consumption-facing model is what SAC and other consumers query; it is the contract between IT and the business.
This separation matters because it lets data engineers work on tables and views in the Data Builder while business analysts consume curated, semantically defined models, without either group stepping on the other. Underneath, everything is HANA Cloud SQL with Datasphere metadata layered on top.
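The idea of a semantic model as a contract over SQL can be sketched in a few lines of Python. This is a toy, not how Datasphere stores or compiles models: the `AnalyticModel` class, the `V_SALES` view, and the column names are hypothetical. It only shows the principle that a consumer asks for named dimensions and measures, and the model decides what SQL that means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalyticModel:
    """Toy semantic model: a source view plus its allowed dimensions and measures."""
    source_view: str
    dimensions: tuple
    measures: dict  # measure name -> SQL aggregate expression

    def to_sql(self, dimensions, measures):
        """Translate a request for named dimensions and measures into one SQL query."""
        unknown = [d for d in dimensions if d not in self.dimensions]
        unknown += [m for m in measures if m not in self.measures]
        if unknown:
            raise ValueError(f"not part of the model: {unknown}")
        quoted_dims = ['"' + d + '"' for d in dimensions]
        select_items = quoted_dims + [self.measures[m] + ' AS "' + m + '"' for m in measures]
        sql = "SELECT " + ", ".join(select_items) + ' FROM "' + self.source_view + '"'
        if quoted_dims:
            sql += " GROUP BY " + ", ".join(quoted_dims)
        return sql


model = AnalyticModel(
    source_view="V_SALES",
    dimensions=("REGION", "PRODUCT"),
    measures={"REVENUE": 'SUM("AMOUNT")'},
)
print(model.to_sql(["REGION"], ["REVENUE"]))
# SELECT "REGION", SUM("AMOUNT") AS "REVENUE" FROM "V_SALES" GROUP BY "REGION"
```

Because the measure definition lives in the model, every consumer that asks for `REVENUE` gets the same aggregation, which is the governance benefit the two-builder split is meant to deliver.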
For programmatic control, the Datasphere CLI lets you script tenants, spaces, connections, and content transport. The same operations are exposed as APIs, which is useful for CI/CD pipelines that promote content between dev, test, and prod tenants.
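A CI/CD step built on the CLI might look roughly like the sketch below, which only assembles the commands and prints them instead of running them. The subcommands and flags shown (`spaces read`, `spaces create`, `--host`, `--space`, `--output`, `--file-path`) are assumptions based on the CLI's general shape and should be checked against `datasphere --help` for your installed version; the host URLs, space ID, and file name are hypothetical, and authentication is left out.

```python
import shlex


def promotion_commands(space_id, dev_host, prod_host, export_file):
    """Build (but do not run) CLI calls that copy a space definition from dev to prod.

    Flag names are assumptions; verify them with `datasphere spaces --help`.
    """
    return [
        # Export the space definition from the dev tenant into a local JSON file.
        ["datasphere", "spaces", "read",
         "--host", dev_host, "--space", space_id, "--output", export_file],
        # Create or update the same space on the prod tenant from that file.
        ["datasphere", "spaces", "create",
         "--host", prod_host, "--file-path", export_file],
    ]


commands = promotion_commands(
    space_id="SALES",
    dev_host="https://dev-tenant.example.com",
    prod_host="https://prod-tenant.example.com",
    export_file="sales_space.json",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping each command as an argument list makes it straightforward to pass to `subprocess.run(cmd, check=True)` in a pipeline once the flags have been verified, and avoids shell quoting problems with space IDs or file paths.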
AI-Assisted Datasphere Development
Datasphere's vocabulary is dense and specific: spaces, data flows versus replication flows, intelligent lookups, analytic models, associations, content transport. A general AI assistant will routinely confuse data flows with replication flows, invent non-existent object types, or suggest SAC features that do not exist in Datasphere. The JSON and YAML configuration formats for connections, agents, and task chains are Datasphere-specific and unforgiving.
A Datasphere skill gives the assistant the correct terminology, object model, and configuration patterns so it can help you design models, draft data flow logic, generate CLI commands, and review content transport definitions that actually work against a live tenant.
The recommended skill is sap-datasphere. It covers Data Builder, Business Builder, analytic models, 40+ connection types, real-time replication, task chains, content transport, and the data marketplace, and ships with agents, slash commands, an MCP server, and validation hooks:
npx skills add secondsky/sap-skills --skill sap-datasphere
Pair it with sap-sac-planning or sap-sac-scripting if your consumption layer is SAP Analytics Cloud, and with sap-abap-cds if you are federating ABAP CDS views from S/4HANA, since Datasphere is the most common consumer of exposed CDS views.