SAP Datasphere Tutorial: Build Your First Data Model from Scratch

A hands-on SAP Datasphere tutorial covering spaces, connections, data flows, replication flows, analytic models, and consumption through SAP Analytics Cloud and SQL.

Updated June 14, 2026

SAP Datasphere Tutorial

SAP Datasphere is SAP's cloud data warehouse. This tutorial walks you through the practical path from a fresh tenant to a working analytic model that business users can consume in SAP Analytics Cloud, Excel, or any SQL client. If you are new to Datasphere and want a step-by-step walkthrough instead of a feature list, start here.

By the end of this tutorial you will have created a space, connected a data source, brought data in, built an analytic model, and queried it. The concepts you pick up along the way (spaces, connections, data flows vs replication flows, analytic models) are the ones you will use in every real Datasphere project.

Prerequisites

To follow this tutorial you need:

  • An SAP BTP account with an entitlement for SAP Datasphere. SAP offers a free tier on some hyperscaler regions, which is enough for this walkthrough.
  • The DW Administrator or DW Modeler role collection assigned to your user in the BTP subaccount.
  • A data source. Easiest options: an SAP S/4HANA system you can reach, a public CSV file in cloud storage, or one of SAP's open sample datasets (the SAP Goldref sample data and the sap-samples/datasphere GitHub repo both work well).
  • Basic SQL knowledge. You do not need to write raw SQL to complete the tutorial, but understanding SELECT, JOIN, and GROUP BY will make the modeling steps feel natural.

Step 1: Access Datasphere

If Datasphere is not yet subscribed in your subaccount:

  1. Open the BTP cockpit for your subaccount.
  2. Go to Service Marketplace and search for SAP Datasphere.
  3. Click Create to subscribe, accepting the default plan.
  4. Assign the DW Administrator role collection to your user under Security > Users.
  5. From the subscription in Service Marketplace, click Go to Application.

The link opens the Datasphere home page, which is the entry point for everything: spaces, connections, the Data Builder, the Business Builder, and the data marketplace. Bookmark this URL; it is the one you will use daily.

Step 2: Create a Space

A space is the unit of isolation in Datasphere. Each space has its own storage, its own compute, and its own membership. You typically create one space per environment (dev, test, prod) or per business domain (sales, finance, HR).

To create your first space:

  1. From the left navigation, open System > Spaces.
  2. Click New Space and give it a name and a space ID (for example, SALES).
  3. Set Storage to the default HANA Cloud database assigned to your tenant.
  4. Assign yourself and any collaborators as members with the Modeler role.
  5. Save.

The space is now the target for everything you build next. Database users, data access controls, and storage quotas are all scoped to the space you are working in. When you later transport content from dev to prod, the transport carries objects between spaces.

Step 3: Connect a Data Source

A connection is Datasphere's link to an external system. Datasphere ships with more than 40 connection types: S/4HANA, BW/4HANA, HANA Cloud, SuccessFactors, ABAP CDS views, cloud databases (Snowflake, Databricks, Redshift, BigQuery), object storage, generic OData, and partner connectors.

To connect an S/4HANA source (the most common case):

  1. Open Data Builder > Sources (or Connections depending on tenant version).
  2. Click New Connection and pick SAP S/4HANA or SAP ABAP.
  3. Fill in the connection details: host, client, system number, and credentials. If the source is on premise, route the connection through SAP Cloud Connector and, for features that need it such as remote tables, a Data Provisioning Agent.
  4. Test the connection, then save.

For a zero-setup alternative, use a CSV file connection or upload a local file directly into a local table. This is a great way to try the modeling steps even if you have no SAP system available.

After the connection is saved, the source's tables and ABAP CDS views become available as remote tables you can use in data flows, replication flows, and views.

Step 4: Create a Data Flow or Replication Flow

Datasphere gives you two main ways to bring data in, and the choice matters.

  • Replication flow copies data from a source into a Datasphere local table, either on a schedule or near real time, with minimal transformation. Use it when you want the data physically inside Datasphere for performance and reliability.
  • Data flow is an ETL pipeline. You read from a source, apply a chain of transformations (joins, projections, aggregations, calculated columns, filters), and write to a local table. Use it when the source data needs shaping before it is useful.

To create a replication flow for an S/4HANA sales order table:

  1. Open Data Builder and create a new Replication Flow.
  2. Add a source (your S/4HANA connection) and pick the table or CDS view to replicate.
  3. Set the target to your space and choose the load type: Initial Only for a one-off snapshot, or Initial and Delta for ongoing near-real-time replication.
  4. Set a schedule or enable streaming.
  5. Save, then Run the flow once to load the data.

To create a data flow that transforms the replicated data:

  1. In Data Builder, create a new Data Flow.
  2. Drag the replicated local table onto the canvas as the source.
  3. Add transforms: a Projection to pick columns, an Aggregation to compute totals by customer, a Join to enrich with a customer dimension.
  4. Add a Local Table target and map the output columns.
  5. Save and run the data flow.

The result is a clean, modeled local table that is ready to feed an analytic model.

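If you wrote the same logic by hand, the SQL would look roughly like this (Datasphere runs the flow for you, so treat it as an illustration of the logic rather than generated code):

-- Target: the local table written by the data flow
CREATE COLUMN TABLE "SALES"."CUSTOMER_REVENUE" AS (
  SELECT
    -- Projection: keep only the columns the model needs
    h."CustomerID",
    c."CustomerName",
    c."Region",
    -- Aggregation: totals per customer
    SUM(h."NetAmount")        AS "TotalRevenue",
    COUNT(DISTINCT h."OrderID") AS "OrderCount"
  FROM "SALES"."SALES_ORDER_HEADER" h
  -- Join: enrich orders with the customer dimension
  JOIN "SALES"."CUSTOMER" c
    ON h."CustomerID" = c."CustomerID"
  -- Filter: completed orders only
  WHERE h."Status" = 'Completed'
  GROUP BY h."CustomerID", c."CustomerName", c."Region"
);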
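
A quick way to sanity-check a flow like this is to reconcile the target against the source. The sketch below assumes the table and column names from the example above and that you can run ad hoc SQL against the space. It is an illustration, not a built-in Datasphere check.

```sql
-- 1) Compare total revenue in the source (same filter as the flow)
--    with total revenue in the target table.
SELECT 'source' AS "Side", SUM(h."NetAmount") AS "Revenue"
FROM "SALES"."SALES_ORDER_HEADER" h
WHERE h."Status" = 'Completed'
UNION ALL
SELECT 'target' AS "Side", SUM(t."TotalRevenue") AS "Revenue"
FROM "SALES"."CUSTOMER_REVENUE" t;

-- 2) Count completed orders with no matching customer row.
--    The inner join in the flow silently drops these.
SELECT COUNT(*) AS "OrphanOrders"
FROM "SALES"."SALES_ORDER_HEADER" h
LEFT JOIN "SALES"."CUSTOMER" c
  ON h."CustomerID" = c."CustomerID"
WHERE h."Status" = 'Completed'
  AND c."CustomerID" IS NULL;
```

If the two totals in the first query differ, the second query shows whether the gap comes from orders whose customer is missing from the dimension table.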

Step 5: Build an Analytic Model

Data flows and local tables are the technical layer. To make the data consumable by business tools, you put an analytic model on top of it. The analytic model is Datasphere's semantic object: it defines measures, dimensions, hierarchies, associations, and the default behavior consumers see. You create it in the Data Builder; the Business Builder holds a separate set of objects (business entities, fact models, consumption models).

To build an analytic model:

  1. Open the Data Builder and choose New Analytic Model.
  2. Pick the fact source (the local table or view from Step 4).
  3. Add dimensions (for example, Customer and Region) by associating to dimension views.
  4. Define measures: the default measures are the numeric columns, but you can add calculated measures such as Revenue per Order = TotalRevenue / OrderCount.
  5. Add hierarchies where they make sense (Region > Country > City for geography).
  6. Configure variables for runtime filtering (a date range, a region picker).
  7. Save and Deploy the model.
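
The calculated measure in step 4 is a ratio, and a ratio gives different answers depending on whether it is computed after aggregation or averaged row by row. A minimal SQL sketch of the difference, assuming the CUSTOMER_REVENUE table from Step 4 (the analytic model evaluates its own formula, so this only illustrates the arithmetic):

```sql
SELECT
  "Region",
  -- Ratio of sums: total revenue divided by total orders in the region
  SUM("TotalRevenue") / NULLIF(SUM("OrderCount"), 0) AS "RevenuePerOrder",
  -- Average of per-customer ratios: weights every customer equally
  AVG("TotalRevenue" / NULLIF("OrderCount", 0))      AS "AvgOfCustomerRatios"
FROM "SALES"."CUSTOMER_REVENUE"
GROUP BY "Region";
```

For example, a region with one customer at 100 revenue over 1 order and another at 300 revenue over 6 orders has a revenue per order of 400 / 7, about 57.14, while the average of the two customer ratios (100 and 50) is 75. Decide which of the two the business means before defining the measure, and check how your model aggregates it.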

Once deployed, the analytic model is published to the catalog and is available as a live data source to SAP Analytics Cloud, Excel, and any SQL consumer with access.

Step 6: Consume the Model

The analytic model is the contract between IT and the business. From here, you consume it through whichever tool fits the audience:

  • SAP Analytics Cloud is the primary consumer. Create a connection of type SAP Datasphere in SAC, select the analytic model, and build a story or dashboard on top of it with live data.
  • Microsoft Excel connects through the SAP Analytics Cloud add-in or the Datasphere ODBC driver, letting analysts build pivot tables directly on the model.
  • SQL clients (DBeaver, DataGrip, the Datasphere SQL console) query the model through a standard SQL interface. Datasphere exposes an OpenSQL schema for the space.

A SQL query against the deployed model looks like standard SQL:

SELECT
  "Region",
  SUM("TotalRevenue") AS "Revenue",
  SUM("OrderCount")   AS "Orders"
FROM "SALES"."CUSTOMER_REVENUE_MODEL"
WHERE "FiscalYear" = '2026'
GROUP BY "Region"
ORDER BY "Revenue" DESC;

Because the semantic definitions (currency handling, hierarchies, security) live in the analytic model, every consumer sees consistent results.

Datasphere Concepts Reference

The terminology is the steepest part of the learning curve. Keep this reference handy:

Concept | What it is | Where it lives
Space | Isolated unit with its own storage, compute, and members | System configuration
Connection | Link to an external source (S/4HANA, cloud DB, file, OData) | Connections area
Remote Table | A table or view exposed through a connection, queried in place | Data Builder
Local Table | A table physically stored in the Datasphere space | Data Builder
Replication Flow | Scheduled or streaming copy of source data into a local table | Data Builder
Data Flow | ETL pipeline with transforms, writes to a local table | Data Builder
Intelligent Lookup | A join object with business semantics and auto-matching | Data Builder
View (graphical/SQL) | A modeled view on top of tables and other views | Data Builder
Association | A relationship between two entities, used by models and lookups | Data Builder / Business Builder
Analytic Model | The semantic object consumed by SAC, Excel, and SQL | Data Builder
Task Chain | Orchestration that sequences flows on a schedule | Data Builder

Datasphere with AI Coding Assistants

Datasphere has a dense, specific vocabulary. A general-purpose AI assistant will routinely confuse data flows with replication flows, suggest BW concepts that do not apply, or generate SQL that uses syntax the Datasphere OpenSQL layer rejects. The YAML and JSON formats for connections, agents, and content transport are Datasphere-specific and unforgiving.

The sap-datasphere skill gives an AI assistant the correct terminology, object model, and configuration patterns so it can help you design spaces, draft data flow logic, generate content transport definitions, and produce SQL that actually runs on the Datasphere SQL layer. Install it to bring accurate Datasphere context into your editor:

npx skills add secondsky/sap-skills --skill sap-datasphere

Pair it with sap-sac-planning or sap-sac-scripting if your consumption layer is SAP Analytics Cloud, and with sap-abap-cds if you are federating ABAP CDS views from S/4HANA. With the right skills loaded, an assistant can help you pick the right object type for each requirement, draft the modeling steps, and generate CLI commands for automating content transport between dev and prod tenants.

Note: SAP Skills is a community-maintained, open-source collection of plugins for AI coding assistants. It is not an official SAP product and is not affiliated with or endorsed by SAP SE. The skills encode publicly documented SAP knowledge to help AI assistants produce more accurate SAP code.

Frequently Asked Questions

How do I access SAP Datasphere?

From the SAP BTP cockpit, open Service Marketplace, subscribe to SAP Datasphere, and assign the relevant role collections (such as DW Administrator or DW Modeler) to your user. The subscription URL opens the Datasphere home page where you manage spaces, connections, and models.

What is a space in Datasphere?

A space is a self-contained unit inside a Datasphere tenant with its own storage, compute, and user membership. Spaces are how you separate dev/test/prod, business domains, or partner projects. Database users and data access controls are scoped to a space.

How do I create a data model in Datasphere?

Use the Data Builder to create local tables or SQL/graphical views on top of your connections, then create an analytic model, also in the Data Builder, that defines measures, dimensions, and associations. The analytic model is what consumption tools like SAP Analytics Cloud query.

Can I use SAP Analytics Cloud with Datasphere?

Yes. SAP Analytics Cloud has a native live data connection to SAP Datasphere analytic models. Create a connection of type SAP Datasphere in SAC, point it at your tenant, and build stories and dashboards directly on top of the analytic models you publish.

What is a replication flow?

A replication flow copies data from a source (an S/4HANA table or ABAP CDS view, a cloud database, or a file) into a Datasphere local table on a schedule or near real time. It is the lightweight alternative to a data flow when you need the data in Datasphere with minimal transformation.

How is Datasphere different from BW?

Datasphere is the cloud-native, federation-first data warehouse. BW/4HANA is the traditional, ETL-heavy data warehouse that replicates and persists large volumes. The two are often integrated, with Datasphere acting as the consumption and federation layer over BW and S/4HANA data.