HANA Multi‑Model: Graph, Spatial & Document Store – Day 8 Hands‑On Lab Guide | Part 53 | FreeLearning365

 

HANA Multi‑Model: Graph, Spatial & Document Store – Day 8 Hands‑On Lab Guide | Part 53 | FreeLearning365


🔥 DAY08 [SAP HANA Database & Performance] {Part-53}

Welcome to the day where we break free from the rectangle. For the past week, you’ve mastered relational tables, column stores, SQLScript, and even machine learning. But the world’s data isn’t always rectangular. Supply chains are networks. Customers exist at geographic coordinates. Product specifications are messy, semi‑structured, and different for every category. HANA’s answer to this messy reality is its multi‑model engine: Graph, Spatial, and Document Store – all running natively inside the same database, alongside your transactional ACDOCA data.

Today you’ll learn how to build a bill‑of‑materials graph, query it with pattern matching, run spatial proximity searches, and store JSON documents that can be queried with SQL. These aren’t separate bolt‑ons; they’re integrated so tightly that you can join a spatial query with a column‑store table in a single statement. By the end, you’ll be designing hybrid data models that solve problems no flat table ever could.


🕸️ Graph Engine – Networks Inside Your Database

HANA’s Graph engine lets you define vertices (nodes) and edges (relationships) as database objects, then query them using openCypher, the standard graph query language. You can find shortest paths, detect communities, and run pattern matching – all within the safety of HANA’s ACID transactions.

🧱 Graph Workspace – The Foundation

A Graph Workspace is a metadata definition that points to an existing vertex table and an edge table in your schema. You don’t copy data; you just declare the relationship. The tables can be existing column‑store tables.

-- Vertex table for materials CREATE COLUMN TABLE MATERIALS ( MATNR NVARCHAR(40) PRIMARY KEY, MAKTX NVARCHAR(100), MATKL NVARCHAR(9) ); -- Edge table for BOM structure (parent -> child) CREATE COLUMN TABLE BOM_EDGES ( PARENT_MATNR NVARCHAR(40), CHILD_MATNR NVARCHAR(40), QUANTITY DOUBLE, PRIMARY KEY (PARENT_MATNR, CHILD_MATNR) ); -- Create graph workspace CREATE GRAPH WORKSPACE BOM_GRAPH EDGE TABLE BOM_EDGES SOURCE COLUMN PARENT_MATNR TARGET COLUMN CHILD_MATNR KEY COLUMN ID -- optional, we can skip VERTEX TABLE MATERIALS KEY COLUMN MATNR;

🔍 Querying with openCypher

Once the workspace is defined, you can execute openCypher queries directly via SQL procedure GRAPH_WORKSPACE or in HANA Database Explorer.

-- Find all direct children of material 'FINISHED_PRODUCT_A' SELECT * FROM OPENCYPHER_TABLE( GRAPH BOM_GRAPH MATCH (a)-[e]->(b) WHERE a.MATNR = 'FINISHED_PRODUCT_A' RETURN b.MATNR, e.QUANTITY );

🧪 Real‑Life Lab: Multi‑Level BOM Explosion with Pattern Matching

Scenario: You want all components of a final assembly, down to the raw materials, along with total required quantity. This is a recursive graph traversal that SQL can’t express easily. With openCypher:

-- Explode BOM up to 10 levels, aggregate quantity along paths SELECT * FROM OPENCYPHER_TABLE( GRAPH BOM_GRAPH MATCH (root)-[p:EDGE*1..10]->(leaf) WHERE root.MATNR = 'ASSY_TOP' RETURN leaf.MATNR, SUM(p.QUANTITY) AS TOTAL_QTY );

This single graph query replaces a complex ABAP recursion. It leverages HANA’s graph engine to traverse the network in parallel, returning results in milliseconds even on thousands of nodes.

🎯 Real‑World Use: Supply Chain Risk Analysis

A manufacturer linked their supplier‑material graph with location data. Using graph queries, they identified all parts sourced from a region hit by a natural disaster within seconds, allowing them to activate alternate suppliers. Without the graph, they would have manually traced each assembly, a process that used to take days.

⚙️ Performance Tuning for Graph Queries

Graph operations benefit from HANA’s in‑memory engine. However, ensure your vertex and edge tables are column‑store (avoid row store for large networks). Use proper indexing on the key columns. The GRAPH_WORKSPACE definition should match partitioning strategies if tables are huge.

✅ Best Practice: For BOMs and hierarchies up to a few hundred thousand nodes, native HANA Graph is excellent. For extremely large graphs (billions of edges) with complex algorithms like PageRank, you might still use HANA Graph but monitor memory carefully; consider sampling or specialized graph data stores if memory becomes a bottleneck.

🌍 Spatial Engine – Location Intelligence Where Your Data Lives

SAP HANA supports spatial data types (ST_GEOMETRY, ST_POINT, etc.) and hundreds of spatial functions to compute distances, areas, intersections, and buffer zones. You can store points, lines, and polygons directly in column‑store tables, and index them for lightning‑fast proximity queries.

🗺️ Spatial Data Types and Indexes

-- Create a table with a spatial column CREATE COLUMN TABLE CUSTOMER_LOCATIONS ( KUNNR NVARCHAR(10), LOCATION ST_GEOMETRY(4326) -- SRID 4326 for GPS coordinates ); -- Insert a point (longitude, latitude) INSERT INTO CUSTOMER_LOCATIONS VALUES ('CUST01', NEW ST_POINT('POINT (77.1025 28.7041)')); -- Create a spatial index CREATE SPATIAL INDEX CUST_LOC_IDX ON CUSTOMER_LOCATIONS(LOCATION);

📏 Spatial Queries: Distance, Buffers, and Containment

-- Find all customers within 50 km of distribution center DC1 SELECT c.KUNNR FROM CUSTOMER_LOCATIONS c, DISTRIBUTION_CENTERS d WHERE d.DC_ID = 'DC1' AND c.LOCATION.ST_WITHIN_DISTANCE(d.LOCATION, 50000, 'meter') = 1;

🧪 Lab: Optimize Delivery Zones

Given a set of customer locations and warehouse polygons, you can compute the number of customers per zone and even suggest new warehouse locations by clustering customer points using HANA’s spatial clustering (similar to PAL but spatial‑aware).

-- Count customers per polygon region SELECT r.REGION_NAME, COUNT(*) FROM SALES_REGIONS r, CUSTOMER_LOCATIONS c WHERE c.LOCATION.ST_WITHIN(r.AREA) = 1 GROUP BY r.REGION_NAME;

🎯 Real‑Life Win: Geospatial Sales Analysis

A beverage company used HANA Spatial to map all retail outlets and identify gaps in coverage. They joined sales data (from ACDOCA) with store locations and found that 15% of sales came from areas with no dedicated sales rep. They restructured territories and increased revenue by 8% in six months. The entire analysis ran directly on HANA, without any GIS software.

🔥 Pro Tip: Combine spatial queries with HANA PAL: cluster customers spatially using DBSCAN on coordinates, then assign each cluster to the nearest distribution center. All inside a single SQLScript procedure.

📄 Document Store – Flexibility for Semi‑Structured Data

Relational tables force every row to have the same columns. But in the real world, product specifications vary: a laptop has "processor", a t‑shirt has "fabric", a chemical has "pH". HANA Document Store allows you to store JSON documents in collections, query them with SQL/JSON functions, and even join them with relational tables – all while retaining transactional integrity.

📝 Creating a JSON Collection

CREATE COLLECTION PRODUCT_SPECS; -- Insert JSON documents INSERT INTO PRODUCT_SPECS VALUES ( '{"MATNR":"LAPTOP123", "attributes":{"CPU":"i7", "RAM":"16GB", "Storage":"512GB SSD"}}' ); INSERT INTO PRODUCT_SPECS VALUES ( '{"MATNR":"TSHIRT99", "attributes":{"Fabric":"Cotton", "Size":"L", "Color":"Red"}}' );

🔍 Querying JSON with SQL

You can use standard SQL with JSON functions to extract values, filter, and aggregate.

-- Find all products where the JSON attribute "CPU" exists SELECT JSON_VALUE(DOC, '$.MATNR') AS MATNR, JSON_VALUE(DOC, '$.attributes.CPU') AS PROCESSOR FROM PRODUCT_SPECS WHERE JSON_QUERY(DOC, '$.attributes.CPU') IS NOT NULL;

🧪 Lab: Flexible Material Master Extension

In S/4HANA, the material master (MARA) has hundreds of columns, but many are empty for specific material types. Instead of adding custom fields, you can store variant attributes in a document store collection, linked by MATNR. Then query both together:

-- Hybrid join: relational MARA with document store SELECT m.MATNR, m.MTART, JSON_VALUE(ps.DOC, '$.attributes.Fabric') AS FABRIC FROM MARA m LEFT JOIN PRODUCT_SPECS ps ON m.MATNR = JSON_VALUE(ps.DOC, '$.MATNR') WHERE m.MTART = 'FERT'; -- Finished goods

🎯 Real‑Life Use: Unified Product Catalog

A retailer used HANA Document Store to merge product data from multiple acquisitions, each with different attributes. They stored the common fields (price, SKU) in a column‑store table and the variable fields in a JSON collection. This allowed them to launch a unified web shop in weeks, not months, and update attributes without database changes.

⚠️ Document Store Best Practices

  • Keep frequently queried, filterable fields as separate columns in a relational table for performance; use JSON for truly variable attributes.
  • Index JSON fields with generated columns or JSON indexes if you frequently filter on a specific attribute.
  • Document Store collections are stored internally as column‑store tables, so they benefit from memory optimizations. However, large binary documents (MBs of JSON) should be limited; HANA is not a blob storage.
💡 Golden Rule: If a field exists for >80% of your products, keep it in a standard column. If it’s only for 20%, consider Document Store. This avoids sparse columns and saves memory.

🧩 Combining All Three Models – The Ultimate Hybrid Scenario

Imagine you need to analyze supplier risk. You have:

  • Graph: supplier‑part relationships and BOMs.
  • Spatial: supplier locations and natural disaster zones.
  • Document Store: supplier audit reports (unstructured text, scores).
  • Relational: purchase order history from ACDOCA.

With HANA, you can write a single SQLScript procedure that:

  1. Uses a graph query to find all parts impacted by a supplier disruption.
  2. Joins the result with spatial data to see which parts are in affected regions.
  3. Filters supplier audit scores from a JSON collection to focus on high‑risk suppliers only.
  4. Aggregates financial exposure from ACDOCA for those parts.

This is the holy grail of enterprise data: all data models working together without moving data. No separate graph database, no GIS server, no MongoDB. Just HANA.


⚖️ Pros, Cons & Alternatives to HANA Multi‑Model

✅ Pros

  • Eliminates data silos: All data in one engine, no ETL between models.
  • ACID compliance: Graph and document operations are fully transactional, same as relational.
  • Performance: In‑memory processing and columnar storage accelerate graph traversals and spatial joins.
  • Simplified architecture: One database to manage, backup, and secure.

⚠️ Cons

  • Not specialized: For extreme graph workloads (e.g., social networks with billions of edges), dedicated graph DBs like Neo4j may outperform HANA’s graph engine. But for enterprise scale, HANA is more than sufficient.
  • Memory cost: Complex spatial and graph structures can consume significant RAM; proper sizing is essential.
  • Learning curve: openCypher, spatial SQL, and JSON path expressions are new skills for traditional SAP developers.

🔄 Alternatives

  • Neo4j / TigerGraph: For pure graph use cases with extreme scale and advanced graph algorithms.
  • PostGIS: If you only need spatial and are on a different DB, but lacks HANA’s in‑memory speed.
  • MongoDB / Couchbase: For document‑only use cases, but they can’t join with relational tables directly.
🧠 Decision Framework: Use HANA multi‑model when the data already resides in HANA (e.g., S/4HANA). The integration and low latency outweigh the potential performance edge of a separate specialized database. For greenfield projects, evaluate the primary data volume and query patterns – but HANA is now a solid multi‑model contender.

📌 Wrapping Up Day 8 – Your Data Model Horizon Expands

Today you’ve added three powerful arrows to your quiver: graphs for connected data, spatial for location intelligence, and document store for flexible schemas. These aren’t theoretical; they’re production‑ready features used by leading S/4HANA customers to solve problems that relational tables alone cannot. You can now model a supply chain, optimize delivery networks, and manage diverse product attributes – all within the database you already know.

🔥 Action Item: In your sandbox, create a graph workspace on a sample BOM table, then run a shortest path query. Next, load some spatial data (even fictional coordinates) and try proximity queries. Finally, create a JSON collection for product attributes and join it with a relational table. Share your experiments with the community.

Post a Comment

0 Comments