SCN Document List - SAP HANA and In-Memory Computing

SAP HANA REVISION UPDATE – SPS10


Reason for HANA DB patch level update

We are copying a HANA database from SLES 11.3 (revision 102.01) to RHEL 6.5 (revision 102.00) using the backup/restore method with SWPM (homogeneous system copy). When restoring a HANA database, the target environment must be on the same or a higher patch level than the source. For this reason we are updating the target HANA environment from revision 102.00 to the latest available patch level, 102.04.
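
Before restoring, you can verify the revision on the source and target systems with a simple query (a minimal check; the same information is also shown by "HDB version" on OS level):

SELECT VERSION FROM "SYS"."M_DATABASE";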

Download SAP HANA patches

Download the following updates (database, studio, and client) from the SAP Service Marketplace and transfer them to the HANA server.

Fig6.png Fig7.png

The currently available patch level is 102.04 (PL04), so we are updating to PL04. We download the studio, client, and database packages for the update.

SAP HANA Backup before update

Take a complete data backup before starting the revision update.
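
For example, a complete data backup to the default file-based backup location can be triggered via SQL (a minimal sketch; the backup prefix is only an example, and Backint-based backups work just as well):

BACKUP DATA USING FILE ('BEFORE_REV_UPDATE');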

Fig4.png

Extract HANA Patches

Move all SAR files to the HANA host and extract them with SAPCAR using the switch -manifest SIGNATURE.SMF.

Fig8.png

If you extract more than one component SAR into a single directory, you need to move the SIGNATURE.SMF file into the corresponding subfolder (SAP_HANA_DATABASE, SAP_HANA_CLIENT, SAP_HANA_STUDIO, etc.) before extracting the next SAR, in order to avoid overwriting the SIGNATURE.SMF file. For more information, see SAP Note 2178665.

Fig9.png

Fig10.png

Do the same for the client and studio packages as well.

Fig11.png


HANA Update via STUDIO

Run SAP HANA Platform Lifecycle Management from HANA STUDIO

Fig12.png

Fig13.png

Fig14.png

Select the location of the extracted packages on the HANA host.

Fig16.png

Fig17.png

Fig18.png

Fig19.png

Fig20.png

Fig21.png

Fig22.png

Fig23.png

This completes the HANA patch level update.


HANA System Rename (hostname) through hdblcmgui command


Prerequisites

  • You are logged in as root user.
  • The SAP HANA system has been installed with the SAP HANA database lifecycle manager (HDBLCM).
  • The SAP HANA database server is up and running. Otherwise, inconsistencies in the configuration might occur.

Go to the HDBLCM directory on the HANA host:

# cd /hana/shared/SEC/hdblcm

# ./hdblcmgui

Fig1.png

Choose 'Rename the SAP HANA System'.

Fig2.png

Enter the <sid>adm password of the HANA database and specify the new hostname that needs to be set.

Fig3.png

Check the information on the screen and proceed to the next step.

Fig4.png

Adjust any settings you still want to change; otherwise proceed to the next step.

Fig5.png

Click the Rename button.

Fig6.png

The HANA database hostname has now been changed to <<new hostname>>.


SAP HANA - interesting notes and other information


Dear all,

 

I have been working with SAP HANA for more than four years now. During this time we had to find solutions for many different kinds of problems, and along the way I gathered additional information that extended my knowledge of SAP HANA.

 

The following notes and web pages might be very useful for your daily work with SAP HANA. The SAP Notes listed below explain various parts of SAP HANA very well, and on the web you can find well-written examples and explanations. All of this information may help you administer and maintain your HANA landscape.

 

Note number | Description
2063657 | HANA system replication takeover decision guidelines
1925267 | Forgot SYSTEM password
1999997 | FAQ: SAP HANA memory
2044468 | FAQ: SAP HANA partitioning
1999998 | FAQ: SAP HANA lock analysis
2000002 | FAQ: SAP HANA SQL optimization
2100009 | FAQ: SAP HANA savepoints
2147247 | FAQ: SAP HANA statistics server
2000003 | FAQ: SAP HANA
2114710 | FAQ: SAP HANA threads and thread samples
1999993 | How-to: interpreting SAP HANA mini check results
1514967 | SAP HANA: central note
2186744 | FAQ: SAP HANA parameters
2036111 | Configuration parameters for the SAP HANA systems
1969700 | SQL statement collection for SAP HANA
1999880 | FAQ: HANA system replication

 

Interesting websites with useful information.

URL | Description
https://blogs.saphana.com | SAP HANA Blog
http://help.sap.com/hana | SAP HANA Platform (Core) Help
http://help.sap.com/hana_platform | SAP HANA Platform (Core)
http://hana.sap.com/abouthana.html | SAP HANA Information
http://scn.sap.com/community/business-suite/blog/2015/03/02/sap-s4hana-frequently-asked-questions--part-1 | HANA FAQ (links to parts 2 and 3 are given in the article, too)


Courses on SAP HANA are also published on https://open.sap.com. These courses will give you a deeper knowledge of SAP HANA, and you can register for them free of charge.

 

Enjoy the websites, and I hope you find some useful information for your daily work. Please add comments if you like.

 

Martin

HANA stopped unexpectedly due to the accidental deletion of shared memory lock


Symptom

One of my friends faced a strange problem: an unexpected stop of the HANA server after some files under the /tmp directory were cleaned up.

After manually starting the system, HANA ran normally again (version: SPS 09, revision 95).

 

In nameserver_hosta....trc, the following error is shown:

[79877]{-1}[-1/-1] 2016-02-24 01:15:07.912168 f NameServer

TREXNameServer.cpp(03342) : shared memory lock

'/tmp/.hdb_ABC_30_lock'was deleted -> stopping instance ...

[79877]{-1}[-1/-1] 2016-02-24 01:15:10.655484 i Service_Shutdown

transmgmt.cc(06027) : Preparing for TransactionManager shutdown

 

Analysis

File /tmp/.hdb_<SID>_<instance number>_lock is used by HANA as a shared memory lock. If the file is deleted by accident, the database can no longer manage access to the shared memory segment and therefore has to stop.

For more detailed information, please refer to SAP Note 1984700 (HANA stopped unexpectedly).

If you are using Red Hat Enterprise Linux, be aware of tmpwatch, which deletes files that are older than a configured age from /tmp.

 

For HANA SPS 09 and earlier, the shared memory lock file is /tmp/.hdb_<sid>_<inst_id>_lock

For HANA SPS 10 and later, the shared memory lock file is /var/lib/hdb/<sid>/.hdb_<sid>_<inst_id>_lock

(1999998 - FAQ: SAP HANA Lock Analysis)

 

Solution

Do NOT delete the shared memory lock file.

If you are running Red Hat, remove tmpwatch from the system's cron jobs.

 

I hope this blog helps you fix this kind of problem if you face it. Thanks to Chiwo Lee for sharing the experience.

 

Regards,

Ning

Myth of HANA


Hi experts,

 

since SAP HANA became generally available in 2011, I have come across a lot of untruths about the new in-memory platform. As a consultant I have been able to talk to many customers and other consultants at events like TechEd, DSAG, business partner days, etc. Every time I was surprised that, after all this time, so much dangerous half-knowledge is still out there. Some of it can easily be eliminated by reading SAP Note 2100010 (SAP HANA: Popular Misconceptions).

Most answers to the statements are pretty easy to find in the official notes, guides, and other documents (blogs, presentations, articles, etc.), but maybe it is simply an overload of information.

 

1) start time

2) cross SID backup

3) col / row store conversion

4) sizing *2

5) statistics

6) data fragmentation
7) persistency layer

8) high memory consumption HANA vs. Linux

9) Backup

10) Backup catalog

 

S stands for the statement and A for the answer.

 

SQL scripts

The SQL scripts used here are available in the attachment of SAP Note 1969700 (SQL statement collection for SAP HANA).

 

 

1) Start time

S: "The start time (availability of the SAP system) must be 30 to 60min to load all data into memory"

A: Yes, to load all data into memory it takes some time, but for any DB it also takes time to fill its data buffer. For any DB the data buffer will be filled on first access of the data and stay there until the the LRU (least recently used) algorithm takes place and push it out of the buffer.

HANA is loading the complete row store on every start into memory. After this the system is available!

Short description of start procedure:

1) open data files

 

2) read out information about last savepoint ( mapping of logical pages to physical pages in the data file / open transaction list)

 

3) load row store (depends on the size and the I/O subsystem; about 5 min for 100 GB; see the query after this list to check your own row store size)

 

4) replay redo logs

 

5) roll back uncommitted transactions

 

6) perform savepoint

 

7) load column tables defined for preload and lazy load of column tables (asynchronous load of the column tables that were loaded before the restart)

For more details have a look at the SAP HANA Administration guide (search for "Restart Sequence") or the SAP HANA Administration book => Thanks to Lars and Richard for this great summary!
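
To estimate how long step 3 will take on your own system, you can check the current row store size per host, reusing the M_RS_MEMORY columns that also appear in the reorganization check later in this document:

SELECT HOST, PORT,
       TO_DECIMAL(SUM(ALLOCATED_SIZE)/1024/1024/1024, 10, 2) "Allocated GB",
       TO_DECIMAL(SUM(ALLOCATED_SIZE - FREE_SIZE)/1024/1024/1024, 10, 2) "Used GB"
FROM M_RS_MEMORY
WHERE CATEGORY IN ('TABLE', 'CATALOG')
GROUP BY HOST, PORT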

 

Example:

Test DB: 40 GB NW 7.40 system with non-enterprise storage (= slow):

SQL HANA_IO_KeyFigures_Total:

read: 33mb/s
avg-read-size: 31kb
avg-read-time: 0,93ms
write: 83mb/s
avg-write-size: 243kb
avg-write-time: 2,85ms
row store size: 11GB
CPU: 8vcpu (vmware; CPU E5-2680 v2 @ 2.80GHz)

Start time without preload: AVG 1:48

Stop time without preload: AVG 2:15

 

Start time with a 5 GB column table (REPOSRC) preloaded

SQL for preload (more information in the guide "SAP HANA SQL and System views Reference"):

alter table REPOSRC preload all

 

verify with HANA_Tables_ColumnStore_PreloadActive script from note 1969700 - SQL statement collection for SAP HANA

 

Start time with preload: AVG 1:49

Stop time with preload: AVG 2:18

 

Why doesn't the start time increase although 5 GB more data has to be loaded?

Since SPS 07, the preloading and reloading of tables happens asynchronously, directly after the HDB restart has finished. That way, the system is available again for SQL access that does not require the columns that are still being loaded.

 

With enterprise hardware the start times are faster!

 

If you want to know how long it takes to load all data into memory you can execute a python script.

load all tables into memory with python script:

cdpy (/usr/sap/HDB/SYS/exe/hdb/python_support/)
python ./loadAllTables.py --user=System --password=<password> --address=<hostname> --port=3xx15 --namespace=<schema_name>

[140737353893632, 854.406] << ending loadAllTables, rc = 0 (RC_TEST_OK) (91 of 91 subtests passed), after 854.399 secs

 

In a similar enterprise system it takes about 140-200 seconds.

 

 

 

2) Cross SID backup

S: "It is not possible not refresh a system via Cross-SID-copy"

A: Cross SID copy (single container) from disk is already available since a long time. Since SPS09 it is also available via backint interface.

Multitenant Database Container (MDC) for a Cross-SID-copy are currently (SPS11) only able to restore via disk.

 

 

 

3) Col / row store conversion

S: "Column tables can't be converted to row store and vice versa. It is defined by sap which tables are stored in which type."

A: It is correct that during the migration the SWPM (used for syscopy) procedure creates files in which store the tables are created.

But you can technically change the type from row to column and vice versa on the fly. But there must be a reason for it, e.g. in advise of SAP Support. If you have no depencies to the application, e.g. custom tables or a standalone HANA installation for your own applications, you can choose freely.
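
For illustration only (the table name is a placeholder and assumes a custom table without application dependencies), the store type can be switched with a plain ALTER TABLE statement:

-- move a table to the column store
ALTER TABLE "ZMY_TABLE" COLUMN;
-- move it back to the row store
ALTER TABLE "ZMY_TABLE" ROW;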

 

In the past SAP delivered a rowstorelist.txt with SAP Note 1659383 (row store list for SAP NetWeaver 7.30/7.31 on SAP HANA Database). This approach is outdated. Nowadays you can use the latest version of SMIGR_CREATE_DDL with the option "RowStore List" (SAP Note 1815547 - row/column store check without rowstorelist.txt).

 

 

 

4) Sizing * 2

S: "You have to double the sizing the result of the sizing report."

A: Results of Sizing reports are final, you dont have to double them.

 

Example (BW on HANA):

|SIZING DETAILS                                                                |

|==============                                                                |

|                                                                              |

| (For 512 GB node)      data [GB]     total [GB]                              |

|                                      incl. dyn.                              |

| MASTER:                                                                      |

| -------                                                                      |

|                                                                              |

|  Row Store                    53            106                              |

|  Master Column Store          11             21                              |

|  Caches / Services            50             50                              |

|  TOTAL (MASTER)              114            178                              |

|                                                                              |

| SLAVES:                                                                      |

| -------                                                                      |

|                                                                              |

|  Slave  Column Store          67            135                              |

|  Caches / Services             0              0                              |

|  TOTAL (SLAVES)               67            135                              |

| ---------------------------------------------------------------              |

|  TOTAL (All Servers)         181            312                              |

 

This is a scale-up solution, so master and slave are functionally on one host. In a scale-out solution you have one host as master for the transaction load; this host holds all row store tables. SAP recommends a minimum of 3 hosts in a BW scale-out solution. The other 2 slaves are for the reporting load.

 

Static and dynamic RAM

SAP HANA main memory sizing is divided into a static and a dynamic RAM requirement. The static part relates to the amount of main memory that is used for holding the table data. The dynamic part has exactly the same size as the static one and is used for temporary data (grouping, sorting, query temp objects, etc.).

 

In this example you have:

Row store: 53 * 2 = 106 GB
Column store: master 11 * 2 = 21 GB (rounded) + slave 67 * 2 = 135 GB (rounded) => 156 GB
Caches / services: 50 GB is needed for every host
Total: 106 + 156 + 50 = 312 GB

 

 

 

5) Statistics

S: "Statistics are not needed any more. So no collect runs are needed"

A: For the Col store the Statement is correct in cause of the known data distribution through the dictionary. For the row store there is an automatically collection of statistics on the fly. So you don't have to schedule them. Currently it is not documented how you can trigger the collection or change sample size.

 

 

 

6) Data Fragmentation

S: "You don't have to take care of data fragmentation. All is saved in memory via col store and there is no fragmention of data"

A: Some tables are created in the row store. The row store still follows the old rules and conditions which results in fragmentation of data. How to analyze it?

Please see note 1813245 - SAP HANA DB: Row store reorganization

 

SELECT HOST, PORT, CASE WHEN (((SUM(FREE_SIZE) / SUM(ALLOCATED_SIZE)) > 0.30)
AND SUM(ALLOCATED_SIZE) > TO_DECIMAL(10)*1024*1024*1024)
THEN 'TRUE' ELSE 'FALSE' END "Row store Reorganization Recommended",
TO_DECIMAL( SUM(FREE_SIZE)*100 / SUM(ALLOCATED_SIZE), 10,2)"Free Space Ratio in %"
,TO_DECIMAL( SUM(ALLOCATED_SIZE)/1048576, 10, 2) "Allocated Size in MB"
,TO_DECIMAL( SUM(FREE_SIZE)/1048576, 10, 2) "Free Size in MB"
FROM M_RS_MEMORY WHERE ( CATEGORY = 'TABLE' OR CATEGORY = 'CATALOG' ) GROUP BY HOST, PORT

Reorg advice: a reorganization is recommended if the row store is bigger than 10 GB and has more than 30% free space.

 

!!! Please check all prerequisites in the notes before you start the reorg !!! (online / offline reorg)

Row Store offline Reorganization is triggered at restart time and thus service downtime is required. Since it's guaranteed that there are no update transactions during the restart time, it achieves the maximum compaction ratio.

 

Before

Row Store Size: 11GB

Freespace: ~3GB

in %: 27% (no reorg needed)

 

But for testing I configured the needed parameters in indexserver.ini (don't forget to remove them afterwards!):

4 min startup time => during the start the row store is reorganized in offline mode

 

After

Row Store Size: 7,5GB

Freespace: ~250MB

in %: 3,5%

 

Additionally, you should consider tables with multiple containers if the revision is 90+. Multiple containers are typically introduced when additional columns are added to an existing table. As a consequence of multiple containers the performance can suffer, e.g. because indexes only take effect for a subset of containers.

HANA_Tables_RowStore_TablesWithMultipleContainers

 

The compression methods of the column store (incl. indexes) should also be considered.

As of SPS 09 you can switch the largest unique indexes to INVERTED HASH indexes. On average you can save more than 30% of space. See SAP Note 2109355 (How-To: Configuring SAP HANA Inverted Hash Indexes) for more information. Compression optimization for those tables:

UPDATE "<table_name>" WITH PARAMETERS ('OPTIMIZE_COMPRESSION' = 'FORCE')

Details: SAP Note 2112604 (FAQ: SAP HANA Compression)

 

 

 

7) Persistency layer

S: "The persistency layer consists of exactly the same data which are loaded into memory"

A: As descibed in statement 3) the memory is parted into 2 areas. The temp data won't be stored on disk. The persistency layer on disk consists of the payload of data, before&after images / shadow pages concept + snapshot data + delta log (for delta merge). The real delta structure of the merge scenario only exists in memory, but it is written to the delta logs.

 

Check out this delta by yourself:

SQL: HANA_Memory_Overview

check memory usage vs. disk size
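
As a rough cross-check without the script, you can compare the memory currently used by all services with the payload stored in the data volumes (a sketch only; HANA_Memory_Overview uses more detailed sources):

SELECT TO_DECIMAL(SUM(TOTAL_MEMORY_USED_SIZE)/1024/1024/1024, 10, 2) "Used Memory GB"
FROM M_SERVICE_MEMORY;

SELECT TO_DECIMAL(SUM(USED_SIZE)/1024/1024/1024, 10, 2) "Used Data Volume GB"
FROM M_VOLUME_FILES WHERE FILE_TYPE = 'DATA';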

 

 

 

8) High Memory consumption HANA vs. Linux

S: "The used memory of the processes is the memory which is currently in use by HANA"

A: No, for the Linux OS it is not transparent what HANA currently real uses. The numbers in "top" are never maching the ones in the hana studio. HANA communicates free pages not instantly to the OS. There is a time offset for freed memory.

There is a pretty nice document which explains this behaviour in detail:

http://scn.sap.com/docs/DOC-60337

 

By default, garbage collection kicks in pretty late. If your system shows high memory consumption, the root cause is not necessarily bad sizing or high load; the reason could also be late garbage collection.

 

2169283 - FAQ: SAP HANA Garbage Collection

One kind of garbage collection we already discussed in 6) (row and column store fragmentation). Another one exists for hybrid LOBs, and there is one for the whole memory. Check your current heap memory usage with HANA_Memory_Overview.

 

In my little test system the value is 80 GB. In this example we have 14 GB for Pool/Statistics, 13 GB for Pool/PersistenceManager/PersistentSpace(0)/DefaultLPA/Page and 9 GB for Pool/RowEngine/TableRuntimeData.

Check also the value of the column EXCLUSIVE_ALLOCATED_SIZE in the monitoring view "M_HEAP_MEMORY". It contains the sum of all allocations in this heap allocator since the last startup.

 

select CATEGORY, EXCLUSIVE_ALLOCATED_SIZE,EXCLUSIVE_DEALLOCATED_SIZE,EXCLUSIVE_ALLOCATED_COUNT,
EXCLUSIVE_DEALLOCATED_COUNT from M_HEAP_MEMORY
where category = 'Pool/Statistics'
or category='Pool/PersistenceManager/PersistentSpace(0)/DefaultLPA/Page'
or category='Pool/RowEngine/TableRuntimeData';

Just look at the indexserver port 3xx03 (the xsengine may also be listed if it is active).

 

CATEGORY | EXCL_ALLOC_SIZE | EXCL_DEALLOC_SIZE | EXCL_ALLOC_COUNT | EXCL_DEALLOC_COUNT
Pool/PersistenceManager/PersistentSpace(0)/DefaultLPA/Page | 384.055.164.928 | 369.623.433.216 | 6.177.019 | 5.856.165
Pool/RowEngine/TableRuntimeData | 10.488.371.360 | 792.726.992 | 83.346.945 | 26
Pool/Statistics | 2.251.935.681.472 | 2.237.204.512.696 | 7.146.662.527 | 7.084.878.887

 

Because of the many deallocations there is a gap between EXCLUSIVE_ALLOCATED_SIZE and the currently allocated size. The difference is usually free for reuse and can be freed with a GC run.

 

By default the memory garbage collection is triggered in the following cases:

Parameter + default value | Details
async_free_target = 95 (%) | When proactive memory garbage collection is triggered, SAP HANA tries to reduce allocated memory below async_free_target percent of the global allocation limit.
async_free_threshold = 100 (%) | With the default of 100 % the garbage collection is quite "lazy" and only kicks in when there is a memory shortage. This is in general no problem and provides performance advantages, as the number of memory allocations and deallocations is minimized.
gc_unused_memory_threshold_abs = 0 (MB) | Memory garbage collection is triggered when the amount of allocated, but unused memory exceeds the configured value (in MB).
gc_unused_memory_threshold_rel = -1 (%) | Memory garbage collection is triggered when the amount of allocated memory exceeds the used memory by the configured percentage.

 

The % values are related to the configured global allocation limit.
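
If, after careful analysis, you decide to change one of these thresholds, the parameters can be set online (a hedged example; to my knowledge they belong to the memorymanager section of global.ini, and the value of 10240 MB is purely illustrative):

ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
SET ('memorymanager', 'gc_unused_memory_threshold_abs') = '10240'
WITH RECONFIGURE;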

 

Unnecessarily triggered garbage collection should absolutely be avoided; how you configure these values depends on your system load and sizing.

The unused memory is normally reused by the HDB (free pool), so there is no need to trigger the GC manually. But in some cases it is possible that a pool uses more memory. This should be analyzed (SAP Note 1999997 - FAQ: SAP HANA Memory, question 14: How can I identify how a particular heap allocator is populated?).

If we now trigger a manual GC for the memory area:

hdbcons 'mm gc -f'

 

Before:

heap: 80GB

 

free -m

             total       used       free     shared    buffers     cached

Mem:        129073     126877       2195      15434        142      32393

-/+ buffers/cache:     94341      34731

 

 

Garbage collection. Starting with 96247664640 allocated bytes.

82188451840 bytes allocated after garbage collection.

 

After:

heap: 72GB

 

free -m

             total       used       free     shared    buffers     cached

Mem:        129073     113680      15393      15434        142      32393

-/+ buffers/cache:     81144      47929


 

So in this scenario there is not much difference inside the HDB at this point, but on the OS side the memory that is no longer allocated is freed.

You don't have to do this manually! HANA fully manages its own memory!

 

If you get an alert (ID 1 / 43) because of the memory usage of your services, you should analyze not only the row and column store. Also take care of the garbage collection of the heap memory. In the past there were some bugs in this area.

Alert defaults:

ID 1: Host physical memory usage:      low: 95% medium: 98% high:100%

ID43: memory usage of services:         low: 80% medium: 90% high:95%

As you can see, by default the GC is triggered lazily at a 100% fill ratio of the global allocation limit. That may be too late for your system: the memory shortage may hit before the GC takes place or before you can react to it.

 

In addition to the memory usage, check the mini check script and the advice in the notes. If you are not sure how to analyze or solve the issue, you can order a TPO service from SAP (SAP Note 2177604 - FAQ: SAP HANA Technical Performance Optimization Service).

 

 

 

9) Backup

S: "Restore requires logs for consistent restore"

A: wrong, a HANA backup based on snapshot technology. So the backup is consistent without any additional log file. This means it is a full online copy of one particular consistent state which is defined by the log position at the time executing the backup.

Sure if you want to roll forward you have to apply Log Files for point in time recovery or most recent state.

 

 

 

10) Backup Catalog

S: "Catalog information are stored in a file like oracle *.anf which is needed for recovery"

A: The backup catalog is saved on every data AND log backup. It is not saved as human readable file! you can check the catalog in hana database studio or with command "strings log_backup_0_0_0_0.<backupid>" in the backup location of your system if you make backup-to-disk.

 

The backup catalog includes all the information needed to determine which file belongs to which backup set. If you delete your backups on disk/VTL/tape level, the backup catalog still holds the invalid entries.

 

Housekeeping of the backup catalog

There is currently no automatism that cleans it up. Just check the size of your backup catalog: if it is bigger than about 20 MB you should take care of housekeeping the backup catalog (this depends on your backup retention and the size of the system), because, as already mentioned, it is saved with EVERY log AND data backup. This can mean more than 200 times a day! How big is the current backup catalog of your productive HANA system? Check the backup editor in HANA Studio and click on "Show Log Backups". Search for the backup catalog, select it, and check the size.
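
The housekeeping itself can be done with the BACKUP CATALOG DELETE statement (a minimal sketch; pick the backup ID of an older complete data backup that you no longer need, and add the COMPLETE option only if the physical backup files should be deleted as well):

-- find the backup ID of an older successful complete data backup
SELECT BACKUP_ID, SYS_START_TIME FROM "SYS"."M_BACKUP_CATALOG"
WHERE ENTRY_TYPE_NAME = 'complete data backup' AND STATE_NAME = 'successful'
ORDER BY SYS_START_TIME;

-- remove all catalog entries older than that backup
BACKUP CATALOG DELETE ALL BEFORE BACKUP_ID <backup_id>;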

 

 

Summary

In the end you also have to take care of your data housekeeping and resource management. You can save a lot of resources if you consider all the hints in the notes.


I hope I could clarify some statements for you.



###########

# Edit V4

###########

2100010 - SAP HANA: Popular Misconceptions

(Thanks to Lars for the hint)

 


Best Regards,

Jens Gleichmann



###########

# History

###########

V4: Updated statistics (5); row/col statement adjusted, format adjusted

V5: adjusted format and added details for backup catalog

SAP HANA: The Row Store, Column Store and Data Compression


Here is an attempt to explain the row store data layout, column store data layout and the data compression technique.

 

Row store: here all data belonging to a row is placed next to each other. See the example below.

 

Table 1:

Name | Location | Gender
... | ... | ...
Sachin | Mumbai | M
Sania | Hyderabad | F
Dravid | Bangalore | M
... | ... | ...

 

Row store corresponding to above table is

 

row store.jpg

 

 

Column store: here the contents of a column are placed next to each other. See the illustration of table 1 below.

 

column store.png

 

 

Data compression: SAP HANA provides a series of data compression techniques that can be used for data in the column store. To store the contents of a column, the HANA database creates a minimum of two data structures: a dictionary vector and an attribute vector. See table 2 below and the corresponding column store.

 

Table 2:

Record | Name | Location | Gender
... | ... | ... | ...
3 | Blue | Mumbai | M
4 | Blue | Bangalore | M
5 | Green | Chennai | F
6 | Red | Mumbai | M
7 | Red | Bangalore | F
... | ... | ... | ...

 

column store2.png

 

 

In the above example the column 'Name' has repeating values 'Blue' and 'Red'; similarly for 'Location' and 'Gender'. The dictionary vector stores each value of a column only once, in sorted order, and a position is maintained for each value. With reference to the above example, the dictionary vectors of Name, Location and Gender could be as follows.

 

Dictionary vector: Name

Name | Position
... | ...
Blue | 10
Green | 11
Red | 12
... | ...

 

Dictionary vector: Location

Location | Position
... | ...
Bangalore | 3
Chennai | 4
Mumbai | 5
... | ...

 

 

Dictionary vector: Gender

Gender | Position
F | 1
M | 2

 

 

Now the attribute vector corresponding to the above table would be as follows. It stores integer values, which are the positions in the dictionary vector.

 

  dictionary enco.png
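
You can see the result of the dictionary encoding and the compression type actually chosen per column in the monitoring view M_CS_COLUMNS (a small illustrative query; schema and table name are placeholders):

SELECT COLUMN_NAME, COMPRESSION_TYPE, "COUNT", DISTINCT_COUNT,
       UNCOMPRESSED_SIZE, MEMORY_SIZE_IN_TOTAL
FROM M_CS_COLUMNS
WHERE SCHEMA_NAME = '<schema>' AND TABLE_NAME = '<table>';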

Parallelization options with the SAP HANA and R-Integration


Why is parallelization relevant?

 

The R-Integration with SAP HANA aims at leveraging R’s rich set of powerful statistical, data mining capabilities, as well as its fast, high-level and built-in convenience operations for data manipulation (eg. Matrix multiplication, data sub setting etc.) in the context of a SAP HANA-based application. To benefit from the power of R, the R-integration framework requires a setup with two separate hosts for SAP HANA and the R/Rserve environment. A brief summary of how R processing from a SAP HANA application works is described in the following:

 

  • SAP HANA triggers the creation of a dedicated R-process on the R-host machine, then
  • R-code plus data (accessible from SAP HANA) are transferred via TCP/IP to the spawned R-process.
  • Some computational tasks take place within the R-process, and
  • the results are sent back from R to SAP HANA for consumption and further processing.


For more details, see the SAP HANA R Integration Guide: http://help.sap.com/hana/SAP_HANA_R_Integration_Guide_en.pdf

 

There are certain performance-related bottlenecks within the default integration setup which should be considered. The main ones are the following:

  • Firstly, latency is incurred when transferring large datasets from SAP HANA to the R-process for computation on the foreign host machine.
  • Secondly, R inherently executes in a single threaded mode. This means that, irrespective of the number of CPU resources available on the R-host machine, an R-process will by default execute on a single CPU core. Besides full memory utilization on the R-host machine, the available CPU processing capabilities will remain underutilized.


A straightforward approach to gain performance improvements in the given setup is by leveraging parallelization. Thus I want to present an overview and highlight avenues for parallelization within the R-Integration with SAP HANA in this document.


Overview of parallelization options


The parallelization options to consider vary from hardware scaling (host box) to R-process scaling and are illustrated in the following diagram


0-overview.png

The three main paths to leverage parallelization, as illustrated above, are the following:

     (1) Trigger the execution of multiple R-calls in parallel from within SQLScript procedures in SAP HANA

     (2) Use parallel R libraries to spawn child (worker) R processes within parent (master) R-process execution

     (3) Scale the number of R-host machines connected to SAP HANA for parallel execution (scale memory and add computational power)


While each option can be implemented independently of the others, they can also be combined and mixed. For example, if you go for (3) - scaling the number of R-hosts - you need (1) - triggering the execution of multiple R-calls - for parallelism to take place. Without (1), you may end up "only" with a better high availability/fault tolerance scenario.


Based on the following use case, I would illustrate the different parallelization approaches using some code examples:

A health care unit wishes to predict cancer patients' survival probability over different time horizons after following various treatment options based on diagnosis. Let's assume the following information:

  • Survival periods for prediction are: half year, one year and two years
  • Accordingly, 3 predictive models have been trained (HALF, ONE, TWO) to predict a new patient's survival probability over these periods, given a set of predictor variables based on historical treatment data.


In a default approach without leveraging parallelization, you would have one R-CALL transferring a full set of new patient data to be evaluated, plus all three models from SAP HANA to the R-host. On the R-host, a single-threaded R process will be spawned. Survival predictions for all 3 periods would be executed sequentially. An example of the SAP HANA stored procedure of type RLANG is as shown below.


0-serial.png

In the code above 3 trained models (variable tr_models) are passed to the R-Process for predicting the survival of new patient data (variable eval). The survival prediction based on each model takes place in the body of the “for loop” statement highlighted above.

 

Performance measurement: For a dataset of 1.038.024 observations (~16.15 MB) and 3 trained BLOB model objects (each ~26.8 MB), an execution time of 8.900 seconds was recorded.


There are various sources of overhead involved in this scenario. The most notable ones are:

  • Network communication overhead, in copying one dataset + 3 models (BLOB) from SAP HANA to R.
  • Code complexity, sequentially executing each model in a single-threaded R-process. Furthermore, the “for” loop control construct, though in-built into base R, may not be efficient from a performance perspective in this case.

 

By employing parallelization techniques, I hope to achieve better results in terms of performance. Let the results of this scenario constitute our benchmark for parallelization.



Applying the 3 parallelization options to the example scenario


1. Parallelize by executing multiple R-calls from SAP HANA


We can exploit the inherent parallel nature of SAP HANA's database processing engines by triggering multiple R-calls to run in parallel, as illustrated above. For each R-call triggered by SAP HANA, the Rserve process spawns an independent R-runtime process on the R-host machine.

 

An example illustrating how an SAP HANA SQLScript-stored procedure with multiple parallel calls of stored procedure type RLANG is given below. In the example, one thought is to separate patient survival prediction across 3 separate R-Calls as follows:

1-1 Rlang.png

  • Create an RLANG stored procedure handling survival prediction for just one model ( see input variable tr_model).
  • Include the expression "READS SQL DATA" (as highlighted above) in the RLANG procedure definition for parallel execution of the R-operators to occur when embedded in a procedure of type SQLScript. Without this instruction, R-calls embedded in an SQLScript procedure will execute sequentially.
  • Then create an SQLSCRIPT procedure

1-2 SQLScript.png


  • Embed 3 RLANG procedure-calls within the SQLSCRIPT procedure as highlighted. Notice that I am calling the same RLANG procedure defined previously but I pass on different trained model objects (trModelHalf, trModelOne, trModelTwo) to separate survival predication across different R-calls.
  • In this SQLScript procedure you can include the READS SQL DATA expression (recommended for security reasons as documented in the SAP HANA SQLScript Reference guide) in the SQLSCRIPT procedure definition, but to trigger R-Calls in parallel it is not mandatory. If included however, you cannot use DDL/DML instructions (INSERT/UPDATE/DELETE etc) within the SQLSCRIPT procedure.
  • On the R host, 3 R processes will be triggered and run in parallel. Consequently, 3 CPU cores will be utilized on the R machine. (A minimal sketch of the two procedures is shown after this list.)
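
The following is a minimal sketch of this pattern. All object names (PATIENT_T, MODEL_T, RESULT_T, PATIENT_EVAL, TRAINED_MODELS) and the R scoring code are illustrative placeholders, not the exact code used in the video; the sketch only shows where READS SQL DATA goes and how the three independent calls are issued:

-- hypothetical result table type
CREATE TYPE "RESULT_T" AS TABLE ("ID" INTEGER, "PROB" DOUBLE);

-- RLANG procedure that scores ONE model; READS SQL DATA enables parallel execution
CREATE PROCEDURE "PREDICT_SURVIVAL_R" (IN eval "PATIENT_T", IN tr_model "MODEL_T", OUT result "RESULT_T")
LANGUAGE RLANG READS SQL DATA AS
BEGIN
  -- R code executed on the R host: deserialize the single model and score the patient set
  model  <- unserialize(tr_model$MODEL[[1]])
  prob   <- predict(model, newdata = eval, type = "response")
  result <- data.frame(ID = eval$ID, PROB = prob)
END;

-- SQLScript wrapper: three independent R-calls that the calc engine can run in parallel
CREATE PROCEDURE "PREDICT_ALL_PERIODS" (OUT res_half "RESULT_T", OUT res_one "RESULT_T", OUT res_two "RESULT_T")
LANGUAGE SQLSCRIPT READS SQL DATA AS
BEGIN
  eval       = SELECT * FROM "PATIENT_EVAL";
  model_half = SELECT * FROM "TRAINED_MODELS" WHERE "PERIOD" = 'HALF';
  model_one  = SELECT * FROM "TRAINED_MODELS" WHERE "PERIOD" = 'ONE';
  model_two  = SELECT * FROM "TRAINED_MODELS" WHERE "PERIOD" = 'TWO';
  CALL "PREDICT_SURVIVAL_R"(:eval, :model_half, res_half);
  CALL "PREDICT_SURVIVAL_R"(:eval, :model_one,  res_one);
  CALL "PREDICT_SURVIVAL_R"(:eval, :model_two,  res_two);
END;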


Performance measurement: In this parallel R-calls scenario example, an execution time of 6.278 seconds was experienced. This represents a performance gain of roughly 29.46%. Although this indicates an improvement in performance, we may theoretically have expected a performance improvement close to 75%, given that we trigger 3 R-calls. The answer for this gap is overhead. But which one?


In this example, I parallelized survival prediction across 3 R-calls, but still transmit the same patient dataset in each R-call. The improvement in performance can be explained, firstly, by the fact that HANA now transmits less data per R-call (only one model, as opposed to three in the default scenario), and consequently the data transfer may be faster. Secondly, each model's survival prediction is performed in 3 separate R-runtimes.

 

There are two other avenues we could explore for optimization in this use case scenario. One is to further parallelize R-runtime prediction itself (see section 2). The other is to further reduce the amount of data transmitted per R-call by splitting the patient dataset in HANA and parallelize the data transmitted across separate R-calls (see section 4).

 

Please note that without the READS SQL DATA instruction in the RLANG procedure definition an execution time of 13.868 seconds was experienced. This is because each R-CALL embedded in the SQLscript procedure is executed sequentially (3 R-call roundtrips).


2. Parallelize the R-runtime execution using parallel R libraries



By default, R execution is single threaded. No matter how much processing resource is available on the R-host machine (64, 32, 8 CPU cores etc.), a single R runtime process will only use one of them. In the following I will give examples of some techniques to improve the execution performance by running R code in parallel.

 

Several open source R packages exist which offer support for parallelism with R. The most popular packages for R-runtime parallelism on a single host are "parallel" and "foreach". The "parallel" package offers a myriad of parallel functions, each specific to the nature of the data (lists, arrays, etc.) subject to parallelism. Moreover, for historical reasons, one can classify these parallel functions roughly under two broad categories, prefixed by "par-" (parallel snow cluster) and "mc-" (multicore).

 

In the following example I use the multicore function mclapply() to invoke parallel R processes on the patient dataset. Within each of the 3 parallel R-runtimes triggered from HANA, I split the patient data into 3 subsets and then parallelize survival prediction on each subset. See the figure below.


2-1.png

The script example above highlights the following:

  • 3 CPU cores are used (variable n.cores) by the R-process
  • The patient data is split into 3 partitions, according to the number of chosen cores, using the "splitIndices" function.
  • The task to be performed (survival prediction) by each CPU core is defined in the function "scoreFun"
  • Then I call mclapply(), passing the data partitions (split.idx), the number of CPU cores to use, and the function to be executed by each core.


In this example, 3 (master) R-processes are initially triggered in parallel on the R-host by the 3 R-calls. Then, within each master R-runtime, 3 additional child (worker) R-processes are spawned by calling mclapply(). On the R-host, therefore, we will have 3 processing groups executing in parallel, each consisting of 4 R-runtimes (1 master and 3 workers). Each group is dedicated to predicting patient survival based on one model. For this setup 12 CPUs will be used in total.

 

Performance measurement: In this parallel R package scenario using mclapply(), an execution time of 4.603 seconds was observed. This represents roughly a 48.28% gain in performance over the default (benchmark) scenario and roughly a 20% improvement over the parallel R-call example presented in section 1.


3. Parallelize by scaling the number of R-Host machines connected to HANA for parallel execution


It is also possible to connect SAP HANA to multiple R-hosts, and exploit this setup for parallelization. The major motivation for choosing this option is to increase the number of processing units (as well as memory) available for computation, provided the resources of a single host are not sufficient. With this constellation, however, it would not be possible to control which R-host receives which R request. The choice will be determined randomly via an equally-weighted round-robin technique. From an SQLScript procedure perspective, nothing changes. You can reuse the same parallel R-call scripts as exemplified in section 1 above.


Setup Prerequisites


  • Include more than one IPv4 address in the calc engine parameter cer_rserve_addresses in the indexserver.ini or xsengine.ini file (see section 3.3 of the SAP HANA R Integration Guide); see the example after this list.
  • Set up parallel R-calls within an SQLScript procedure, as described in section 1.
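
For example, the calc engine parameter could be set via SQL as follows (a hedged sketch; host names and ports are placeholders and must match your Rserve configuration):

ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
SET ('calcengine', 'cer_rserve_addresses') = '<rhost1>:<port>,<rhost2>:<port>'
WITH RECONFIGURE;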

3-1 config.png

I configure 2 R-host addresses in the calcengine rserve address option shown above. While still using the same SQLScript procedure as in the 3 R-Calls scenario example (I change nothing in the code), I trigger parallelization of 3 R-calls across two R-host machines.


3-2 Parallel R -call.png

Performance measurement: The scenario took 6.342 seconds to execute. This execution time is similar to the times experienced in the parallel R-calls example. This example only demonstrates that parallelism works in a multi R-host setup. Its real benefit for parallelization comes into play when the computational resources (CPUs, memory) available on one R-box are believed not to be enough.


4. Optimizing data transfer latency between SAP HANA and R


As discussed in section 1, one performance overhead is the transmission of the full patient dataset in each parallel R-call from HANA to R. We could further reduce the data transfer latency by splitting the dataset into 3 subsets in HANA and then, using 3 parallel R-calls, transferring each subset from HANA to R for prediction. In each R-call, however, we would also have to transfer all 3 models.


An example illustrating this concept is provided in the next figure.


4-1 split in hana.png


In the example above, the following is performed

  • The patient dataset (eval) is split into 3 subsets in HANA (eval1, eval2, eval3).
  • 3 R-calls are triggered, each transferring a data subset together with all 3 models.
  • On the R-host, 3 master R-processes will be triggered. Within each master R-process I parallelize survival prediction across 3 cores using the function pair mcparallel()/mccollect() from the "parallel" R package (task parallelism), as shown below.


4-2 parallelize in R.png

 

  • I create an R function (scoreFun) to specify a particular task. This function focuses on predicting survival based on one model input parameter.
  • For each call of the mcparallel() function an R process is started in parallel and will evaluate the expression in the R function definition scoreFun. I assign each model individually.
  • With a list of assigned tasks I then call mccollect() to retrieve the results of the parallel survival prediction.


In this manner, the overall data transfer latency is reduced to the size of the data in each subset. Furthermore, we still maintain completeness of the data via parallel R-calls. The consistency of the results of this approach is guaranteed if there is no dependency in the result computation between the observations in the dataset.

 

Performance measurement: With this scenario, an execution time of 2.444 seconds was observed. This represents a 72.54% performance gain over the default benchmark scenario. This represents roughly 43% improvement over the parallel R-call scenario example in section 1, and a 24.26% improvement over the parallel R-runtime execution (with parallel R-libraries) example in section 2. A fantastic result supporting the case for parallelization.


Concluding Remarks


The purpose of this document is to illustrate how techniques of parallelization can be implemented to address performance-related bottlenecks within the default integration setup between SAP HANA and R. The document presented 3 parallelization options one could consider:


  • Trigger parallel R-calls from HANA
  • Use parallel R libraries to parallelize the R-execution
  • Parallelize R-calls across multiple R-hosts.

 

With parallel R libraries you can improve the performance of a triggered R-process execution by spawning additional R-runtime instances executing on the R-host (see section 2). You can either parallelize by data (split the dataset computation across multiple R-runtimes) or by task (split the algorithmic computation across multiple R-runtimes). A good understanding of the nature of the data and the algorithm is, therefore, fundamental to choosing how to parallelize. When executing parallel R-runtimes using R libraries, we should remember that there is an additional setup overhead incurred by the system when spawning child (worker) R-processes and terminating them. The benefits of parallelism using this option should, therefore, be assessed after prior testing in an environment similar to the productive environment in which it will eventually run.


On the other hand, when using the parallel R-calls option, no additional overhead is incurred on the overall performance. This option provides a means to increase the number of data transmission lanes between HANA and the R-host, and it allows us to spawn multiple parent R-runtime processes on the R-host. Exploiting this option led to the following key finding: the data transfer latency between HANA and R can, in fact, be significantly reduced by splitting the dataset in HANA and then parallelizing the transfer of each subset from HANA to R using parallel R-calls (as illustrated in section 4).





Other Blog Links

Install R Language and Integrate it With SAP HANA

Custom time series analytics with HANA, R and UI5

New SQLScript Features in SAP HANA 1.0 SPS9

How to see which R packages are installed on an R server using SAP HANA Studio.

Quick SAP HANA and R usecase

Let R Embrace Data Visualization in HANA Studio

Connect ABAP with R via FastRWeb running on Rserve

HANA meets R    

Creating an OData Service using R

SAP HANA Application Example : Personal Spending Analysis - HANA Models

[SAP HANA Academy] Live3: Web Services - Using OData


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


Continuing the Live3 on SAP HANA Cloud Platform course, the SAP HANA Academy's Philip Mugglestone provides a closer examination of the previously set up OData web services by running some example queries. Watch Philip's tutorial video below.

Screen Shot 2015-04-29 at 12.10.32 PM.png

(0:20 –  4:20) Viewing Meta Data and Entities in JSON Format

 

Running the services.xsodata file has generated a URL based on the trial account (p number), SAP HANA instance (dev), project (live3), and file (services.xsodata). Calling the file lists the existing entities (Tweets, Tweeters, TweetersClustered and Clusters).

 

With OData we can make requests via URL-based syntax. For example, appending /$metadata to the end of the URL displays the full metadata for all of the properties within each entity. The data you get from OData is self-describing, which is very important as SAPUI5 can read this metadata automatically to generate the screens.

Screen Shot 2015-04-29 at 12.16.49 PM.png

Be careful when looking at the individual entities in OData, as there may be hundreds of thousands of Tweets, for example, and you don't want to read them all. Appending /Tweets?$top=3 to the URL only displays the top 3 Tweets in XML format.

Screen Shot 2015-04-29 at 12.24.54 PM.png

The XML format appears a bit messy, so you can convert it to JSON format by adding &$format=json to the URL. By default the JSON format isn't as readable as desired, so you can download JSONView for free from the Chrome Web Store in order to display it in a nicely readable format.

Screen Shot 2015-04-29 at 12.34.42 PM.png

To see only certain parts of an entity's data, for instance the id and text columns, you can append &$select=id,text to the URL. This returns only the id and text values, as well as the metadata for the Tweets entity.

Screen Shot 2015-04-29 at 12.37.03 PM.png

(4:20 – 6:30) OData's Filter, Expand and Count Parameters

 

Philip next shows the data for his Clusters entity by adding /Clusters?$format=json to the URL. Similar to a where clause in SQL, Philip filters his results by adding &$filter=clusterNumber eq 1 to display only his first cluster.

Screen Shot 2015-04-29 at 12.38.13 PM.png

To see the Tweeters association from the Clusters entity Philip adds an expand parameter by entering &$expand=Tweeters at the end of the URL. This returns all of the information for each of the individual Tweeters in cluster 1.

Screen Shot 2015-04-29 at 12.39.49 PM.png

To see the number of rows for an entity add /$count after the entity’s name in the URL.


Follow along with the Live3 on HCP course here.


SAP HANA Academy - over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy


[SAP HANA Academy] Live3: Web Service - Setup XSJS


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


Part of the SAP HANA Academy’s Live3 on HCP course, the below video tutorial from Philip Mugglestone shows how to add server-side scripting capabilities to the live3 web services project. With this you can configure actions to refresh the clustering and reset the database. Watch Philip’s video below.

Screen Shot 2015-05-05 at 3.27.02 PM.png

(0:35 – 3:00) Inserting the Proper Schema Name and P Number into the services.xsjs Code

 

With the live3 project selected in the SAP Web-based Development Workbench, open the services folder of the Live3 GitHub code repository and drag the services.xsjs file into the Multi-File Drop Zone. First you must do a global replace to insert your schema name. You must also insert your account p number where marked in the code, as the code checks whether the user has the execute privilege. After verification the user can perform the reset and/or cluster operation.

 

(3:00 – 6:00) Examining the Code’s Logic

 

The code is very straightforward. It first checks whether the user has the privilege to execute. If so, the URL command (cmd) parameter is read. The code waits for the command: if cmd=reset it calls the reset function, and if cmd=cluster it calls the cluster function. If neither reset nor cluster is entered, it displays an invalid command message. If the user isn't authorized, a not authorized message appears.

Screen Shot 2015-05-05 at 3.52.21 PM.png

The reset function’s code first sets the schema and then truncates (empties) the Tweets table that is loaded directly via node.js. Next it empties the PAL results and centers tables. Then the full text analysis index is first cleaned out and then recreated using the same code that was used earlier in the setup text analysis piece. The only difference from earlier is that the code is modified with a backslash in front of every single quotation mark in the SQL.

Screen Shot 2015-05-05 at 3.48.42 PM.png

The cluster function's code is similar to the setup Predictive SQL code. The schema is set and the PAL results and centers tables are truncated. Then the procedure is called. On the web, as opposed to seeing the results directly, the results table will first show question marks; the code then loops over a set of results and inserts those results into the table using JavaScript.

Screen Shot 2015-05-05 at 3.51.17 PM.png

(6:00 – 7:30) Testing services.xsjs

 

Executing the services.xsjs file will open a web page that displays invalid command: undefined. This should happen, as it didn't recognize the default command that was specified. So you must delete the default anti-caching parameter that appears after /services.xsjs? in the URL and then add a valid command, for instance cmd=cluster.

 

Entering the command for cluster won't display anything on the web page at this point. However, to show that the file has run with a valid command, open the developer tools (Ctrl+Shift+I in Chrome) and go to the network tab. In the network tab there will be information about the call.

Screen Shot 2015-05-05 at 3.53.00 PM.png

Follow along with the Live3 on HCP course here.


SAP HANA Academy - over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: Web Services - Debugging


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


The SAP HANA Academy’s Philip Mugglestone continues the Live3 on HCP course by showing how the server-side scripting application can be easily debugged using the SAP HANA Web-based Development Workbench.  Check out Philip’s tutorial video below.

Screen Shot 2015-05-07 at 10.29.27 AM.png

(0:15 – 4:10) How to Debug the XSJS Application

 

First identify the user account. This is listed near the top right corner of the SAP HANA Web-based Development Workbench. Right click on the user name (in Philip’s case it begins with DEV_) and select inspect element. Then copy the user account name so it can be used later on in the debugging.

Screen Shot 2015-05-07 at 10.37.01 AM.png

Now a definition must be created that enables this user to perform debugging. When logged into the server, go to the URL displayed below, ending with /sap/hana/xs/debugger. On the Grant Access screen paste the copied account name into the Username text box. Set an expiration date and time for when the debugging access will cease and then click the Grant button. Now this user can debug the session.

Screen Shot 2015-05-07 at 10.38.24 AM.png

Back in the SAP HANA Web-based Development Workbench choose the services.xsjs file and hit the execute button to open it in a new browser tab. Append cmd=cluster1 to the end of the URL to return an invalid command. Now open the developer tools (Ctrl+Shift+I in Chrome) and navigate to the resources tab. Then expand the Cookies folder and open the session cookie file. Identify the value of the xxSessionId.

Screen Shot 2015-05-07 at 10.44.59 AM.png

Now back in the SAP HANA Web-based Development Workbench click the settings button. Then choose the value of the xxSessionId as the session to debug and click apply. A message will appear that the debugger has been attached to the session. Next set a break point where the command is being processed in the code.

Screen Shot 2015-05-07 at 10.46.16 AM.png

Now make a call in the URL. Philip enters cmd=cluster2. The screen won't change from earlier and will still say Invalid Command: cluster1, as it is waiting for hanaxs.trail.ondemand. This is because the debugger has been opened in the SAP HANA Web-based Development Workbench. You will see that the cluster2 command has been entered and the debugger has stopped at the break point that was set. You have the normal debugging options such as step in, step over, step through, etc. If you hit the resume button on the debugger, the file page will now say Invalid Command: cluster2.

Screen Shot 2015-05-07 at 10.55.56 AM.png

This is how you can access the debugger to perform real-time debugging when using XS in SAP HANA.

 

Follow along with the Live3 on HCP course here.


SAP HANA Academy - over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: Web Services - Authentication


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


In the next part of the SAP HANA Academy’s Live3 on HCP course Philip Mugglestone explains why a “proxy” authentication server is needed to access your SAP HANA Cloud Platform web services from a SAP HANA Cloud HTML5 application. Watch Philip’s tutorial video below.

Screen Shot 2015-05-08 at 10.05.31 AM.png

(0:12 – 3:00) Issue with HTML5 Authentication for the HCP Developer Trial Edition

 

Prior to this tutorial the web services were set up using the SAP HANA instance. We now want to access our Live3 app, OData, and server side JavaScript from a front end application UI.

 

Back in the SAP HANA Cloud Platform Cockpit our SAP HANA instance now has one application. Clicking on the application shows the URL, which you can navigate to and then enter a command like we've done in the earlier videos in the Live3 course.

 

There is one slight complication to building an HTML5 front-end application. Our SAP HANA instances in the developer trial edition of HCP use SAML 2.0 authentication. Normally, to access a backend system when working with an HTML5 application, you use a destination in order to reference a folder or URL. The destination appears to be local to where the HTML5 application is hosted. However, it is pushed out to a backend system that can be hosted anywhere on the internet (even behind a firewall if you use the cloud connector). The destination is very important as it allows you to get around the cross-origin restrictions of most browsers.

 

The trial edition of the SAP HANA Cloud Platform uses only SAML 2.0 as the authentication method for the SAP HANA instance. SAML 2.0 is not an authentication method available in the destination configuration in the SAP HANA Cloud Platform Cockpit. Fortunately there is a workaround.

Screen Shot 2015-05-08 at 10.32.13 AM.png

(3:00 – 4:45) Explanation for Proxy’s Necessity via the Live3 Course Architecture

 

Normally the browser or mobile HTML5 app would access the SAP HANA Cloud Platform where the HTML5 app is hosted. It would then access a backend system, in this case the SAP HANA native web services, through a destination. However, we can't connect the destination directly to the SAP HANA XS instance because of the SAML 2.0 authentication. So a destination is defined that goes through the SAP HANA Cloud Connector installed locally on the desktop. A proxy is then inserted between the SAP HANA Cloud Connector and the native web services to take care of the SAML 2.0 authentication and connect back to the destination. This would not be run in production; it is used in this course purely as a workaround for a technical limitation of the free trial developer edition of the SAP HANA Cloud Platform.

Screen Shot 2015-05-08 at 10.35.08 AM.png

(4:45 – 5:45) Locating the Proxy

 

The necessary proxy was created by SAP Mentor Gregor Wolf. Search for “Gregor Wolf GitHub” and click on the link to his page. Under the popular repositories section open the hanatrail-auth-proxy repository. Written in Node.js, the proxy will allow us to access the SAP HANA web services via a destination. The next video details how to download and install the proxy.


Follow along with the SAP HANA Academy's Live3 on HCP course here.


SAP HANA Academy - Over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: Web Services - Authentication Setup Proxy


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


Continuing from the previous tutorial video of the SAP HANA Academy’s Live3 on HCP course, Philip Mugglestone shows how to set up the “proxy” authentication server for the HCP trial developer edition. Watch Philip's tutorial video below.

Screen Shot 2015-05-11 at 10.43.55 AM.png

(0:20 – 3:30) Installing the Prerequisites for the hanatrail-auth-proxy Repository and Modifying its Code

 

On the hanatrail-auth-proxy page on SAP Mentor Gregor Wolf’s GitHub, click on the download ZIP button. Extract the downloaded ZIP and then open a command window in the hanatrail-auth-proxy folder.

 

First a few prerequisite Node.js modules (cheerio and querystring) must be installed. In the command window enter npm install cheerio. Wait a few seconds for the cheerio installation to complete before entering npm install querystring.

Screen Shot 2015-05-11 at 12.04.10 PM.png

*Note – The component has been updated since this video was recorded. Simply use “npm install” from the main hanatrail-auth-proxy folder. There is now no need to install cheerio and querystring explicitly.*

 

Next we need to make a few changes to the hanatrail-auth-proxy code. Right-click to edit the config.js file with Notepad++. First you must set a port to use. This will create a web server similar to the Node.js server we created earlier for loading the Twitter data.


You must also insert the correct host. The host is the beginning of the services.xsodata URL. For example, Philip’s host is s7hanaxs.hanatrail.ondemand.com. Leave the timeout and https settings as they are before saving the file.

Screen Shot 2015-05-11 at 12.09.28 PM.png

*Note – The config.js and server-basic-auth files have moved to the examples subfolder. You must still verify that the “host” option in examples/config.js matches your SAP HANA XS instance.*

 

(3:30 – 6:30) Running the Proxy

 

To start the proxy application, back in the command window enter node server-basic-auth.js. A message will appear saying the SAP HANA Cloud Platform trial proxy is running on the configured port.

Screen Shot 2015-05-11 at 12.17.15 PM.png

Open a new web browser tab and enter localhost:<port number>/<application URL path>. In Philip’s example he enters the URL displayed below.

Screen Shot 2015-05-11 at 11.55.39 AM.png

After logging in with your HCP p-number, the authentication for the SAP HANA instance using SAML 2.0 should be performed automatically. Effectively the proxy, acting as a local web server, now talks as if it were the SAP HANA Cloud Platform trial edition. You can make all of the calls that were demonstrated in previous videos (e.g. metadata, clusters) using the localhost URL.

 

Follow along with the SAP HANA Academy's Live3 on HCP course here.


SAP HANA Academy - Over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

SAP HANA Data Sheet


SAP HANA is built on a next-generation, massively parallel, in-memory data processing design paradigm to enable faster information processing. This architecture enables converged OLTP and OLAP data processing within a single in-memory, column-based data store with ACID compliance, while eliminating data redundancy and latency. By providing advanced capabilities such as predictive and text analytics, search, spatial processing, graph, time series, and streaming, together with data integration, data quality, and application services on the same architecture, it further simplifies application development and processing across IoT and big data sources and structures. This makes SAP HANA the most suitable platform for building and deploying next-generation, real-time applications and analytics.

 

This data sheet explains the capabilities, features and benefits of the SAP HANA platform.

 

SAP HANA Platform Data Sheet

 

Last Update : April 2016

Unable to Open Alert History Information Due to Large Table _SYS_STATISTICS.STATISTICS_ALERTS_BASE


Recently, a customer reported huge numbers of alerts shown in SAP HANA Studio/DBACOCKPIT in one of their SAP HANA systems, which had not been monitored for a long time.
Alert Priority.jpg

The alert detail information page hangs and either does not return or returns the errors listed below after clicking, for example, the high priority alerts (in the worst case even the overview page hangs).

Return Error.jpg

 

From DBACOCKPIT -> System Information -> Large Tables, I can see that the table _SYS_STATISTICS.STATISTICS_ALERTS_BASE, which contains the alert history, has grown to more than 30 GB.
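
The size can also be checked directly with SQL; a minimal sketch using the M_CS_TABLES monitoring view (the figures will of course differ per system):

-- In-memory size and record count of the alert history table
SELECT TABLE_NAME,
       ROUND(MEMORY_SIZE_IN_TOTAL / 1024 / 1024 / 1024, 2) AS SIZE_GB,
       RECORD_COUNT
  FROM M_CS_TABLES
 WHERE SCHEMA_NAME = '_SYS_STATISTICS'
   AND TABLE_NAME  = 'STATISTICS_ALERTS_BASE';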

 

According to note 2170779 - SAP HANA DB: Big Statistics Server Table STATISTICS_ALERTS_BASE Leads to Performance Impact on the System, the following steps can be applied.

 

Firstly, since the customer uses the embedded statistics server in an MDC environment, I have to disable the embedded statistics server in the system DB to prevent an endless delete situation (the configuration takes effect immediately; no restart of the HANA DB is needed).

nameserver.ini [statisticsserver] active = false

 

Secondly, clean up the old alerts (in this example, everything older than 25 days), then check and fix the latest alerts, which took around 30 minutes for me.

DELETE FROM "_SYS_STATISTICS"."STATISTICS_ALERTS_BASE" WHERE "ALERT_TIMESTAMP" < add_days(CURRENT_TIMESTAMP, -25);

 

Then I review the latest alerts and their detail information and try to fix them one by one. For alerts that do not need to be kept for a long time, I set a shorter retention period.

UPDATE _SYS_STATISTICS.STATISTICS_SCHEDULE SET RETENTION_DAYS_CURRENT = 10 WHERE ID = 79;

 

Thirdly, re-enable the embedded statistics server.

nameserver.ini [statisticsserver] active = true
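
For reference, the same three steps can also be executed from the SQL console. A minimal sketch, assuming the retention value used above (in an MDC setup the configuration change is made from the SYSTEMDB):

-- 1. Disable the embedded statistics server (takes effect immediately, no restart needed)
ALTER SYSTEM ALTER CONFIGURATION ('nameserver.ini', 'SYSTEM')
    SET ('statisticsserver', 'active') = 'false' WITH RECONFIGURE;

-- 2. Delete old alerts (older than 25 days) and shorten the retention for alert ID 79
DELETE FROM "_SYS_STATISTICS"."STATISTICS_ALERTS_BASE"
    WHERE "ALERT_TIMESTAMP" < ADD_DAYS(CURRENT_TIMESTAMP, -25);
UPDATE "_SYS_STATISTICS"."STATISTICS_SCHEDULE"
    SET RETENTION_DAYS_CURRENT = 10 WHERE ID = 79;

-- 3. Re-enable the embedded statistics server
ALTER SYSTEM ALTER CONFIGURATION ('nameserver.ini', 'SYSTEM')
    SET ('statisticsserver', 'active') = 'true' WITH RECONFIGURE;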

 

Last but not least, I try to persuade the customer to monitor the system as part of their daily or weekly tasks.

HANA Rules Framework


Welcome to the SAP HANA Rules Framework (HRF) Community Site!


SAP HANA Rules Framework provides tools that enable application developers to build solutions with automated decisions and rules management services, implementers and administrators to set up a project/customer system, and business users to manage and automate business decisions and rules based on their organizations' data.

In daily business, strategic plans and mission-critical tasks are implemented through countless operational decisions, made either manually or automatically by business applications. These days, an organization's agility in decision-making has become critical to keeping up with dynamic changes in the market.


HRF Main Objectives are:

  • To seize the opportunity of Big Data by helping developers to easily build automated decisioning solutions and/or solutions that require business rules management capabilities
  • To unleash the power of SAP HANA by turning real time data into intelligent decisions and actions
  • To empower business users to control, influence and personalize decisions/rules in highly dynamic scenarios

HRF Main Benefits are:

Rapid Application Development | Simple tools to quickly develop auto-decisioning applications

  • Built-in editors in SAP HANA studio that allow easy modeling of the required resources for SAP HANA rules framework
  • An easy to implement and configurable SAPUI5 control that exposes the framework’s capabilities to the business users and implementers

Business User Empowerment | Give control to the business user

  • Simple, natural, and intuitive business condition language (Rule Expression Language)

Untitled.png

  • Simple and intuitive UI control that supports text rules and decision tables

NewTable.png

  • Simple and intuitive web application that enables business users to manage their own rules

Rules.png   

Scalability and Performance | HRF, as a native SAP HANA solution, leverages all the capabilities and advantages of the SAP HANA platform.


For more information on HRF please contact shuki.idan@sap.com  and/or noam.gilady@sap.com

Interesting links:

SAP solutions already utilizing HRF:

Here are some SAP solutions (a partial list) that utilize HRF in different domains:

Use cases of SAP solutions already utilizing HRF:

SAP Transportation Resource Planning

TRP_Use_Case.jpg

SAP Fraud Management

Fraud_Use_Case.JPG

SAP hybris Marketing (formerly SAP Customer Engagement Intelligence)

hybris_Use_Case.JPG

SAP Operational Process Intelligence

OPInt_Use_Case.JPG


Creating a copy of user SYSTEM in SAP HANA


In the SAP HANA Security Guide SAP recommends using user SYSTEM only at the beginning. After various users, e.g. for backup and monitoring purposes, have been created, user SYSTEM should be deactivated. Below you'll find some of the experiences I made when trying to copy user SYSTEM.

 

Trying to copy user SYSTEM to another user reveals the difficulties involved. The new user only receives the role PUBLIC; no package or privilege object that had already been assigned to user SYSTEM is copied to the new user. Only repository roles are copied. Unfortunately there is currently no way to automatically create an SQL script that contains all objects, packages and roles assigned to user SYSTEM. All objects assigned to user SYSTEM have to be assigned manually to the new user. The reason is that user SYSTEM in some cases isn't allowed to grant the respective object, package or role; therefore no object, package or role is copied from user SYSTEM to the new one.

 

The copy process is purely a UI function and thus cannot be automated. There's no SQL command "COPY USER"; only the SQL command "CREATE USER" is available.

 

User SYSTEM automatically receives the rights for new objects and packages created in the HANA system. The new user will not receive these automatically; they have to be assigned one by one manually.

 

The password lifetime of the new user can be disabled with the SQL statement "ALTER USER newuser DISABLE PASSWORD LIFETIME;". This way the given password does not have to be changed during the first logon.

 

User SYSTEM can be deactivated with the SQL statement "ALTER USER SYSTEM DEACTIVATE USER NOW".
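
Putting the statements together, a minimal sketch of the manual procedure (the user name, password and granted roles below are placeholders/examples only; the grant list has to be extended with every object, package and role identified for your system):

-- Create the replacement administration user (placeholder name and password)
CREATE USER ADMIN_COPY PASSWORD "Initial_Password_1";

-- Grant roles and privileges one by one (examples only)
GRANT MONITORING TO ADMIN_COPY;
CALL "_SYS_REPO"."GRANT_ACTIVATED_ROLE"('sap.hana.admin.roles::Monitoring', 'ADMIN_COPY');

-- Keep the initial password valid so it does not have to be changed at first logon
ALTER USER ADMIN_COPY DISABLE PASSWORD LIFETIME;

-- Finally, deactivate user SYSTEM
ALTER USER SYSTEM DEACTIVATE USER NOW;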

 

If you need to reset the password of user SYSTEM please follow the description given in note 1925267.

 

If you would like to exclude user SYSTEM from the current password policy, please follow note 2251556. Please bear in mind that this procedure isn't recommended by SAP AGS.

SAP HANA Database Campus – Open House 2016 in Walldorf


The SAP HANA Database Campus invites students, professors, and faculty members interested in database research to join our third Open House at SAP's headquarters. Throughout your day, you will get an overview of database research at SAP, meet the architects of SAP HANA and learn more about academic collaborations. There are a couple of interesting presentations by developers and academic partners. Current students and PhD candidates present their work and research. For external students and faculty members it is a great chance to find interesting topics for internships and theses.


The event takes place on June 2nd, 2016, from 09:30 to 16:00 in Walldorf, Germany. Free lunch and snacks are provided for all attendees. The entire event is held in English.

 

Register here

 

 

Looking forward to seeing you in Walldorf,

The SAP HANA Database Campus

students-hana@sap.com

 

 

Location:

  • SAP Headquarters, WDF03, Robert-Bosch-Str. 30, 69190 Walldorf, Germany
  • Room E4.02, Check-In Desk in the lobby of WDF03

 

Agenda:

  • 09:00-09:30 Arriving
  • 09:30-10:00 Check-In
  • 10:00-10:15 Opening
  • 10:15-11:00 Keynote
    • Daniel Schneiss (Head of SAP HANA Development) – Topic will be announced
  • 11:00-12:00 Poster Session Part 1 & Career Booth
  • 12:00-12:45 Lunch
  • 12:45-13:00 Office Tour
  • 13:00-14:00 Session 1 – Academic
    • Prof. Anastasia Ailamaki (EPFL) – Scaling Analytical and OLTP Workloads on Multicores: Are we there yet? [30 min]
    • Ismail Oukid (SAP HANA PhD student, TU Dresden) – FPTree: A Hybrid SCM-DRAM Persistent and Concurrent B-Tree for Storage Class Memory [15 min]
    • SAP HANA PhD Student Speaker and Topic will be announced [15 min]
  • 14:00-15:00 Poster Session Part 2, Career Booth & Coffee Break
  • 15:00-15:45 Session 2 – SAP
    • Hinnerk Gildhoff (SAP) – SAP HANA Spatial & Graph [20 min]
    • Daniel Booss (SAP) – SAP HANA Basis [20 min]
  • 15:45-16:00 Best Student/PhD-Student Poster & Open House Closing

 

 

Archive of previous events


By participating you agree to appear in photos and videos taken during the event and published on SCN and CareerLoft.

SAP HANA TDI - Overview


SAP HANA tailored data center integration (TDI) was released in November 2013 to offer an additional approach to deploying SAP HANA. While the deployment of an appliance is easy and comfortable for customers, appliances impose limitations on the flexibility of selecting the hardware components for compute servers, storage, and network. Furthermore, operating appliances may require changes to established IT operation processes. For those who prefer leveraging their established processes and gaining more flexibility in hardware selection for SAP HANA, SAP introduced SAP HANA TDI. For more information please download this overview presentation.

View this Document

SAP Hana EIM Connection Scenario Setup - Part 1


In this documentation I'll explain how to set up and configure an SAP HANA SPS10 EIM (SDI/SDQ) connection with the Data Provisioning Agent, based on cloud and on-premise scenarios.

 

This documentation is built in 3 parts:

SAP Hana EIM  Connection Scenario Setup - Part 1 (current)

SAP Hana EIM  Connection Scenario Setup - Part 2

SAP Hana EIM  Connection Scenario Setup - Part 3

 

In my first document I explained how to replicate data using SAP HANA SDI capabilities with the SAP HANA adapter; in this document I'll explain how to configure and connect the DP Agent to several source systems to retrieve and replicate data for on-premise and cloud scenarios.

 

 

I will show in detail the steps and configuration points needed to achieve this setup with the following adapters:

Log Reader (Oracle, DB2, MSSQL) / SAP ASE / Teradata and Twitter

 

 

Note: The Data Provisioning Agent must be installed on the same operating system as your source database, but not necessarily on the same machine

 

 

Order of execution

 

  • Security (Roles and Privileges)
  • Configuration for On-Premise scenario
  • Configuration for Cloud scenario
  • Log Reader adapter setup (Oracle / DB2 / MS SQL)
  • SAP ASE adapter setup
  • Teradata adapter setup
  • Twitter adapter setup
  • Real time table replication

 

 

Guide used

 

SAP Hana EIM Administration Guide SP10

SAP Hana EIM Configuration guide SP10

 

 

Note used

 

2179583 - SAP HANA Enterprise Information Management SPS 10 Central Release Note

2091095 - SAP HANA Enterprise Information Management

 

 

Link used

http://help.sap.com/hana_eim

  

 

Overview Architecture

 

archi1.jpg

On-premise landscape

 

archi2.jpg

Cloud landscape

 

 

  

Security (Roles and Privileges)

 

Before starting to make any connection between the DP Agent server and HANA, it's important to provide the necessary credentials to the users involved in the configuration, based on your landscape scenario.

 

 

For the on-premise and cloud scenarios, as an administrator ensure you have:

  • Privileges: AGENT ADMIN and ADAPTER ADMIN
  • Application privilege: sap.hana.im.dp.admin::Administrator

 

 

For the cloud scenario an additional user is required; it is used as a technical user (aka "XS agent") when you need to register an agent (see the SQL sketch below):

  • Privileges: AGENT MESSAGING
  • Application privilege : sap.hana.im.dp.proxy::AgentMessaging
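
A hedged SQL sketch of these grants (DP_ADMIN and DP_XS_AGENT are placeholder user names; application privileges are granted via the _SYS_REPO procedure):

-- Administrator used for agent/adapter setup (on-premise and cloud)
GRANT AGENT ADMIN TO DP_ADMIN;
GRANT ADAPTER ADMIN TO DP_ADMIN;
CALL "_SYS_REPO"."GRANT_APPLICATION_PRIVILEGE"('"sap.hana.im.dp.admin::Administrator"', 'DP_ADMIN');

-- Additional technical "XS agent" user (cloud scenario only)
GRANT AGENT MESSAGING TO DP_XS_AGENT;
CALL "_SYS_REPO"."GRANT_APPLICATION_PRIVILEGE"('"sap.hana.im.dp.proxy::AgentMessaging"', 'DP_XS_AGENT');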

 

 

 

Configuration for On-Premise scenario

 

In the case of the on-premise scenario, the interaction between the DP Agent and the HANA server is done through a TCP/IP connection, similar to the connection used by the HANA studio.

1.jpg

 

2.jpg

 

 

Configuration for Cloud Landscape scenario

 

The cloud scenario requires a specific setup in case of an SSL connection: HANA can be accessed directly via the internal Web Dispatcher or through a proxy. For my configuration I use the direct connection over SSL.

 

 

One of the requirements to make HANA available over HTTPS is a valid "CommonCryptoLib" library (libsapcrypto.so); it is included by default when HANA is installed.

3.jpg

 

 

Now, from the Web Dispatcher page in the SSL and Trust Configuration tab, I create a CA request, send it to my certificate authority, and import the response.

4.jpg

 

Once signed, I import the CA response into the trusted list of the PSE.

5.png

 

6.jpg

 

And change the format view from TEXT to PEM in order to review the chain.

7.jpg

 

 

Once completed, I change the default ports used by the Web Dispatcher to the standard ports 80/443. To do this, in webdispatcher.ini change the default port to the one you want to use and add the parameter "EXTBIND=1".

8.jpg

 

Once saved, at the OS layer you need to bind the default SSL port. By default, when HANA is installed it creates an "icmbnd.new" file; rename it to "icmbnd" and change the permissions on it. You must be root to do this.

9.jpg

 

Now my HANA instance is available over HTTPS.

10.jpg

 

 

The HANA certificate needs to be imported on the DP Agent server, into the "ssl" directory of the DP Agent.

 

First change the default "cacerts" password with the following keytool command, executed in the directory where the cacerts file is located:

 

keytool -storepasswd -new [new password] -keystore cacerts

11.jpg

 

Then create a “SAPSSL.cer” file, open it with your favorite editor and paste the entire chain from the imported webdispatcher certificate

12.jpg

 

And import it into the “cacerts”

keytool -importcert -keystore cacerts -storepass <password> -file SAPSSL.cer -noprompt

13.jpg

 

I can now configure the DP Agent to use an SSL connection to HANA.

14.jpg

 

15.jpg

 

 

LogReader adapter setup

 

Log Reader adapters provide real-time changed-data capture capability to replicate changed data from Oracle, Microsoft SQL Server, and IBM DB2 databases to SAP HANA in real time; in certain cases, you can also write back to a virtual table.

 

 

Oracle 12c LogReader adapter

 

The first point to take into consideration before starting the configuration of any LogReader adapter is to download the necessary JDBC libraries specific to the source used and store them in the lib directory of the Data Provisioning Agent.

 

 

Download the libraries from:

Oracle

Microsoft SQL

IBM DB2

17.jpg

 

Note: for all the database setups I will not explain how to install the databases themselves, but focus on the steps which need to be performed in order to work with the SDI configuration.

 

In order to enable the real-time replication capability in Oracle, a specific script needs to be run on the Oracle database; this script is located in the "scripts" directory on the DP Agent server.

18.png

Note: the script assumes that the default user for the replication is LR_USER

 

Before running it, check whether the database is in archivelog mode; if not enabled, it needs to be changed.

1.jpg
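
A hedged SQL*Plus sketch of that check (run as SYSDBA; switching to archivelog mode requires a short outage):

-- Check the current log mode
SELECT log_mode FROM v$database;

-- If it returns NOARCHIVELOG, enable archive logging (brief downtime required)
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;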

 

Since the DP Agent and Oracle do not reside on the same server, we need to copy the timezone_11.dat file from the Oracle server to the DP Agent server.

2.jpg

 

And specify the location of the file in the Oracle adapter preferences.

3.jpg

 

Now I'll use Oracle SQL Developer to execute the script, which will also create the LR_USER.

Note: don't forget to change the script in order to set up the password for the user.

5.jpg

 

Now done, I can register my adapter and create my remote connection

6.jpg

 

From the studio, when you specify the OracleLogReader adapter it's important to specify the LogReader administration port and the user defined for the replication.

7.jpg

 

From a connection point of view we are done with Oracle; next is the MS SQL setup.

 

 

  

Microsoft SQL 2008 R2 LogReader adapter

 

 

Since EIM relies on the database log to perform data movement, which means that logs must be available until the data has been successfully read and replicated to HANA, MS SQL Server must be configured in full recovery mode.

19.png

 

For my SQL Server based scenario I will enable CDC (change data capture) on the database; this feature is supported by EIM, but the "truncate" operation on tables is not.

8.jpg

 

Once activated, verify the setting.

9.jpg
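
A hedged T-SQL sketch of both settings (STOREDB is a placeholder database name; run with sysadmin rights):

-- Switch the source database to full recovery so log records stay available for the adapter
ALTER DATABASE STOREDB SET RECOVERY FULL;

-- Enable change data capture at the database level
USE STOREDB;
EXEC sys.sp_cdc_enable_db;

-- Verify both settings
SELECT name, recovery_model_desc, is_cdc_enabled
FROM sys.databases
WHERE name = 'STOREDB';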

 

After the feature is enabled on the database, the cdc schema, cdc user and change data capture metadata tables are created automatically.

10.jpg

 

Since I haven't created any tables to replicate yet, I'll explain later how to enable this feature for each table I want to track.

  

Once done, enable the DAC (dedicated administrator connection) to allow remote connections, from the Facets dialog.

11.jpg

 

And to make the log files readable, copy sybfilter and sybfiltermgr from the LogReader folder on the DP Agent server to the MSSQL server.

11.1.jpg

 

11.1.1.jpg

 

Anywhere on the server, create a file named "LogPath.cfg" and set the environment variable "RACFGFilePath" to point to its location.

11.1.2.jpg

 

Open the LogPath.cfg file and provide the location of the .ldf file

11.1.3.jpg

SAP Hana EIM Connection Scenario Setup - part 3


Twitter adapter setup

 

In order to replicate and consume Twitter content in HANA, I need to create a "Twitter app" in the developer space (https://dev.twitter.com).

25.jpg

 

From the documentation link click on “Manage my Apps”

26.jpg

 

This leads you to the application management page; click on the "Create New App" button.

27.jpg

 

Provide the necessary information, accept the license terms, and click on "Create your Twitter application" at the bottom of the page.

28.jpg

 

With the application now created, four pieces of information will be required in order to create the remote connection with HANA:

  • Consumer Key (API Key)
  • Consumer Secret (API Secret)
  • Access Token
  • Access Token Secret

 

From the created application page click on "Keys and Access Tokens".

29.jpg

 

From the page note the two keys for the consumer

30.jpg

 

And from the bottom of the page create the access tokens.

31.jpg

 

32.jpg

 

Now completed, I need to register my adapter and create my new connection in HANA.

32.1.jpg

 

32.2.jpg

 

 

 

Real time table replication

 

All my remote source connections are now created, so I can proceed with table replication. For my test lab I have created the same table, named "Store", to replicate in all remote source databases.

32.3.jpg
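
The exact column layout of the "Store" table is shown in the screenshot above; as a stand-in, a hypothetical minimal version of such a table could look like this (all column names are illustrative only, not taken from the lab):

-- Hypothetical sample table used for the replication tests (columns are illustrative only)
CREATE TABLE "Store" (
    store_id   INTEGER PRIMARY KEY,
    store_name VARCHAR(100),
    city       VARCHAR(100)
);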

 

MS SQL


For MS SQL, when you set up the database to use "Change Data Capture" to track changes, you need to specify on which tables you want the tracking to occur.

39.jpg
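
A hedged T-SQL sketch of enabling tracking for the "Store" table (the database name, schema and role settings are assumptions to adapt):

-- Enable change data capture for the Store table in the dbo schema
USE STOREDB;   -- placeholder database name
EXEC sys.sp_cdc_enable_table
     @source_schema = N'dbo',
     @source_name   = N'Store',
     @role_name     = NULL;   -- no gating role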

 

From the Workbench editor we need to create the replication task and uncheck "initial load".

21.png

 

 

ORACLE


Earlier I ran the "oracle_init.sql" script with the default user LR_USER for the replication; I created my "Store" table, which belongs to this user, for the replication.

40.jpg

 

From the workbench repeat the procedure to create a replication task and uncheck “initial load”

41.jpg

 

For DB2, Teradata and ASE, from the workbench repeat the procedure to create a replication task and uncheck “initial load”

 

Once the replication is working you can check the task from the "Data Provisioning Task Monitor".

42.jpg

 

TWITTER

 

For Twitter replication, when the remote connection is created two tables should appear; the one I will use for my tweet replication is the "status" table.

43.jpg

 

From the workbench I start to set up the live replication and check the replication task.

44.jpg

 

45.jpg

 

From the studio, I can see the content of the table, which contains all the tweets and news.

46.jpg

 

Replicating information into the status table brings in a lot of elements; for my test I create a tweet on my Twitter page and check whether it appears in the table.

47.jpg

 

And I can see my tweet in the table

48.jpg

 

The next step is to filter my content; I'll create additional tweets and filter the replication on them only.

49.jpg

 

In order to filter, from the workbench apply a filter on the "ScreenName" column; the screen name value should be your account name.

50.jpg

 

And refresh my status table

51.jpg

 

My HomeLab replication is now completed.

 

Link to :

SAP Hana EIM  Connection Scenario Setup - Part 1

SAP Hana EIM  Connection Scenario Setup - Part 2
