🔥 DAY06 [SAP HANA Database & Performance] {Part-51}
Welcome to the day where you truly understand what’s happening inside HANA’s RAM. Over the last five days, you’ve built procedures, tuned queries, and modeled data like a pro. But none of that matters if the database runs out of memory and crashes. Today we’re diving into the dark, often misunderstood world of HANA memory management: how the heap allocator works, how the column store consumes memory, and what to do when the dreaded “Out of Memory” (OOM) kills your production system. This is the survival knowledge that separates a reactive Basis admin from a proactive performance engineer.
We’ll decode the numbers you see in HANA Cockpit, explain why a 128 GB system can OOM with only 80 GB allocated, and give you a step‑by‑step method to analyze and fix memory issues. By the end, you’ll be able to read an OOM dump, identify the culprit, and prevent future incidents – skills that directly save companies tens of thousands in avoided outages.
🧠 The HANA Heap Allocator – The Gatekeeper of RAM
HANA manages its dynamic memory through a heap allocator, a layer that sits between the operating system and the internal memory consumers (column store, row store, SQL execution, caches). Understanding this allocator is key because when memory runs out, it’s the allocator that throws the OOM error, not the OS.
🗂️ Shared Memory vs Process‑Local Heap
HANA’s memory landscape is split into two main areas:
- Shared Memory: Used by the row store, table metadata, and some internal structures. This memory is pre‑allocated at startup and cannot grow dynamically. It’s sized by parameters like
row_engine_memory. - Heap Memory: The dynamic, general‑purpose area. Column store tables, intermediate query results, SQLScript variables, and caches all live here. This is where OOMs happen.
The heap allocator obtains memory from the OS in chunks and then sub‑allocates it to internal components. It also handles garbage collection and defragmentation. When the total heap memory approaches the global_allocation_limit, the allocator starts rejecting allocation requests, triggering an OOM error.
📊 Key System Views for Heap Analysis
In M_HEAP_MEMORY, the column “EXCLUSIVE_SIZE” is crucial: it shows memory that is exclusively owned by a component and cannot be shared. “MALLOC_SIZE” is the total memory requested from the OS allocator. A large difference between the two often indicates memory fragmentation.
📚 Column Store Memory Architecture – Where Your Data Lives
The column store is by far the largest memory consumer. Understanding its internal memory breakdown helps you predict usage and size tables correctly.
🔹 Main Store and Delta Store Memory
As you learned in Day 1, each column‑store table consists of a Main Store (compressed, read‑optimized) and a Delta Store (write‑optimized, row‑oriented). Both consume heap memory, but in very different ways:
- Main Store Memory: Depends on the uncompressed size of columns, compression ratio, and dictionary sizes. Typically 5‑10x smaller than original raw data. You can query
M_CS_TABLESto seeRAW_SIZE(uncompressed size) vsMEMORY_SIZE_IN_MAIN(actual memory used). - Delta Store Memory: Each row in the delta store is stored uncompressed (or lightly compressed), so memory grows linearly with the number of unmerged rows. A large delta can suddenly spike memory usage.
🔹 Other Hidden Memory Consumers
- MVCC (Multi‑Version Concurrency Control): Each update creates a new version. Old versions are kept until they are no longer needed by any active transaction. This can temporarily double the memory of updated rows. Seen via
M_CS_TABLES.MEMORY_SIZE_IN_MVCC. - Intermediate Results: When a query executes, it may create temporary structures (hash tables for joins, aggregation buffers) that allocate heap memory. The
statement_memory_limitcaps these per‑statement. - Plan Cache: Compiled execution plans for SQL statements are cached to avoid re‑compilation. This uses a modest amount of memory but can grow with thousands of unique queries.
🛡️ Memory Parameters – The Invisible Fences
HANA’s memory behavior is governed by several key parameters. Misconfiguring them is the leading cause of OOM. Let’s demystify the most important ones.
global_allocation_limit
This is the maximum total memory HANA can allocate from the OS. It’s usually set to a value slightly below physical RAM (e.g., 90% for dedicated appliances). If HANA’s total allocation reaches this limit, no new allocations are allowed – and any query asking for more memory gets an OOM. Check it with:
statement_memory_limit
Limits memory per individual SQL statement. It prevents a single query from starving all other processes. It’s expressed in bytes (e.g., 1073741824 = 1 GB). If a statement needs more, it aborts with an OOM error before consuming system‑wide resources. Tune this based on your workload.
column_engine_memory_limit (available in later HANA 2.0 SPS)
This restricts the total memory used by the column store engine, including main and delta stores. It helps prevent the column store from taking 100% of the heap and leaving nothing for other operations.
Workload Classes and Memory Quotas
In HANA 2.0, you can assign memory budgets to workload classes. For example, critical OLTP queries get a higher guarantee, while batch analytics get a hard limit. This is advanced tuning but extremely powerful in mixed S/4HANA environments.
global_allocation_limit to 85‑90% of physical RAM on a dedicated HANA server. For multi‑tenant, allocate per tenant. Never set it to 100%; the OS needs memory for networking, file caches, and HANA’s own non‑heap processes.💀 OOM Analysis – Decoding the Death Certificate
When HANA hits a memory limit, it generates an OOM dump – a snapshot of memory state at the time of failure. This dump is written to the trace directory and contains the information you need to identify the cause. The dump file name looks like indexserver_oom_20260520_143022.trc.
🧐 Step‑by‑Step OOM Diagnosis
- Locate the OOM dump: In the HANA trace directory (
/usr/sap/{SID}/HDB{instance}/trace) find the latest*.oom.trcfile. Download it to your local machine. - Open with HANA Studio: In Studio, use “Open OOM Dump” from the menu (or double‑click the file). You’ll see a graphical breakdown.
- Check the “Allocator Summary”: It shows how much memory each allocator (Heap, Shared, etc.) was using. Often the heap is near the global limit.
- Identify the top memory consumer: The dump includes a list of the largest allocations at the time of OOM. Look for a component with a huge chunk – often “ColumnStore”, “SqlScript”, or “IntermediateResults”.
- Find the offending statement: The dump often includes the last SQL statement that caused the allocation failure. Note the connection ID and the full SQL text.
- Correlate with M_EXPENSIVE_STATEMENTS: Use the timestamp and connection ID to find the statement in the expensive statement trace. You’ll see its duration, memory, and CPU usage.
🧪 Hands‑on Lab: Simulate an OOM and Analyze the Dump (Safe Sandbox Only!)
We’ll intentionally trigger an OOM by setting a very low statement_memory_limit and running a memory‑hungry query. This is for demonstration purposes in a non‑production system.
After the error, reset the limit to a normal value. Then analyze the dump to see the failed allocation size and statement. This exercise builds intuition for production situations.
📊 Real‑Life OOM Scenario: The Delta Merge Avalanche
A manufacturing company experienced OOM every Monday morning. Investigation revealed that over the weekend, a large number of invoices were posted via IDocs, filling the delta store of ACDOCA. On Monday, the scheduled merge could not complete because the delta merge itself needed extra memory to rewrite the main store, pushing the total column store memory over the limit. The fix: stagger the IDoc processing to spread delta growth, increase the global allocation limit slightly, and schedule merges during off‑peak hours. The OOM vanished.
📈 Monitoring Memory Proactively – Cockpit and SQL Queries
The best OOM is the one that never happens. Use these tools to stay ahead of memory pressure.
HANA Cockpit Memory Overview
Navigate to the tenant → “Memory” card. You’ll see a pie chart of memory usage by component: Column Store, Row Store, Heap Allocator, etc. Set alerts at 80% and 90% usage. The “Memory History” view shows trends over days/weeks, so you can spot slow leaks or weekend growth.
Custom Proactive Queries
Schedule these queries via a simple SQLScript job and feed them into a monitoring dashboard. Being proactive is always cheaper than reactive.
🔁 Best Practices, Pros & Cons of HANA Memory Management
✅ Best Practices
- Size memory correctly: Use SAP Quick Sizer and real‑world data samples. Always include a buffer for growth.
- Set global_allocation_limit to 85‑90% of physical RAM.
- Implement statement_memory_limit to prevent a single query from killing the system. Start with 2‑4 GB per statement for analytical workloads.
- Regularly monitor delta merges: A growing delta is a ticking memory bomb. Automate delta merge alerts.
- Partition large tables (Day 5) to reduce memory scan footprint and improve cache efficiency.
- Use workload classes to isolate critical OLTP from heavy reporting.
- Keep HANA version current: Memory allocator improvements in SPS07 and later HANA 2.0 versions significantly reduce OOM risks.
⚠️ Common Pitfalls
- Ignoring delta size: Some teams focus only on main store and are shocked when delta consumes 30% of memory.
- Over‑allocating memory to row store: S/4HANA uses row store only for small control tables. Over‑sizing row_engine_memory wastes heap that column store could use.
- Not monitoring MVCC memory: Long‑running transactions (like a forgotten open query) can hold old row versions, slowly bloating memory.
- Setting statement_memory_limit too high: Then a single bad report can exhaust the entire heap.
🔄 Alternatives & Cloud Considerations
- In SAP HANA Cloud, you don’t set global_allocation_limit directly; you choose an instance size. The service automatically manages memory parameters. However, you still need to understand statement limits and delta merge implications.
- On‑premise, some companies use an additional monitoring tool like Solution Manager or a third‑party APM to predict memory trends.
📌 What You’ve Gained Today – Memory Confidence
You now understand the heap allocator, column store memory breakdown, and the exact steps to diagnose an OOM. You can read system views, set limits, and proactively protect your S/4HANA system from memory‑related outages. Tomorrow, we’ll extend into predictive analytics: HANA PAL for regression and clustering directly in SQL. You’ll see how to apply machine learning without moving data out of the database.

0 Comments
thanks for your comments!