
SAP Hana EIM Connection Scenario Setup - part 2


Install the sybfilter and start it

11.2.jpg

 

11.1.1.jpg

 

11.2.1.jpg

 

Share the folder that contains the transaction log file (.ldf) and the logs with the DP Agent server

11.3.jpg

 

11.4.jpg

 

On the DP Agent server, edit the mssql_log_path_mapping.props file in the LogReader configuration directory in order to map the database log directory to the shared path. An illustrative example of this mapping file follows the screenshot below.

11.5.jpg
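The mapping file typically consists of simple key=value lines that map the database log directory on the SQL Server host to the shared path as seen from the DP Agent. A hedged, illustrative example (host name, share and directory are placeholders, not the values from the screenshots):

C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA=\\mssqlhost\mssql_data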

 

Finally, check that TCP/IP is enabled for the SQL Server instance

12.jpg

 

SQL Server is now ready; I can register my adapter and create my remote source

12.1.jpg

12.2.jpg

 

With this completed, let’s move on to the DB2 database setup

 

 

 

IBM DB2 10.5 LogReader adapter

 

To get DB2 working with the LogReader adapter, a few steps need to be completed: I will first add a new buffer pool and a temporary user tablespace, switch DB2 to archive log mode, and create a dedicated user for the replication.

 

 

You can do this either from the command line or from the studio.
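For the command-line route, a hedged sketch of these steps from the DB2 command line processor (buffer pool name, page size, database name and archive log path are illustrative):

db2 "CREATE BUFFERPOOL BP16K SIZE 5000 PAGESIZE 16K"
db2 "CREATE USER TEMPORARY TABLESPACE TMP16K PAGESIZE 16K BUFFERPOOL BP16K"
db2 "UPDATE DB CFG FOR MYDB USING LOGARCHMETH1 DISK:/db2/MYDB/archlog"
db2 "BACKUP DATABASE MYDB"

Switching LOGARCHMETH1 from circular to archive logging puts the database into backup-pending state, which is why a backup is included as the last step. The screenshots below show the same steps.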

13.jpg

 

14.jpg

 

15.jpg

 

Check the current log setting

16.jpg

 

The last step is to create the OS technical user (ra_user) for the replication

17.jpg

 

And grant the necessary privileges to the user
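A hedged sketch of such grants from the DB2 command line (the database name is illustrative; the exact privilege set the adapter needs is listed in the SAP HANA EIM documentation):

db2 "CONNECT TO MYDB"
db2 "GRANT CONNECT ON DATABASE TO USER ra_user"
db2 "GRANT DBADM ON DATABASE TO USER ra_user"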

18.jpg

 

With this completed, I can register my adapter and create my remote source

19.jpg

 

20.jpg

 

 

Sybase ASE adapter setup

 

The SAP ASE adapter provides real-time replication and change data capture functionality to SAP HANA or back to a virtual table.

 

In the interfaces file, add the additional required entries (a hedged example entry follows the list below):

  • The entry name must be the same as the Adapter Instance Name specified when creating remote source.
  • The host name or IP must be the name of server where SAP ASE adapter will be running.
  • The port must be the same as the SAP ASE Adapter Server port that you set up in the ASE adapter interface file, located in <DPAgent_root>/Sybase/interfaces.
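On Linux/UNIX, such an interfaces file entry typically looks like the following hedged sketch (entry name, host and port are illustrative):

aseadapter
    master tcp ether dpagenthost 5005
    query tcp ether dpagenthost 5005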

22.jpg

 

From an ASE perspective, I will create two users, which are required when the remote source is created from HANA (a hedged sketch follows the screenshot below):

rep_user : for replication with the role “replication”

mnt_user : for maintenance

21.jpg
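A hedged sketch of creating these users with isql (passwords are placeholders; the exact role and permission requirements are in the adapter documentation):

sp_addlogin rep_user, 'Initial1!'
go
grant role replication_role to rep_user
go
sp_addlogin mnt_user, 'Initial1!'
go

Depending on the setup, the logins may additionally have to be added to the replicated database with sp_adduser.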

 

Once done, register your adapter and create the remote source in HANA

23.jpg

 

24.jpg

 

With the connection completed, we can move on to the configuration for Teradata

 

 

 

Teradata adapter setup

 

 

For my Teradata database, I will use the revision 15.0 package for ESXi available from the Teradata website as "Teradata Virtual Machine Community Edition for VMware".

 

 

In order to create a remote source, another account is required with SELECT privileges on the following "DBC" tables (a hedged example of the grants follows the list below):

  • "DBC"."UDTInfo"
  • "DBC"."DBase"
  • "DBC"."AccessRights"
  • "DBC"."TVM"
  • "DBC"."TVFields"
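A hedged sketch of the corresponding grants (the account name hana_user is illustrative):

GRANT SELECT ON "DBC"."UDTInfo"      TO hana_user;
GRANT SELECT ON "DBC"."DBase"        TO hana_user;
GRANT SELECT ON "DBC"."AccessRights" TO hana_user;
GRANT SELECT ON "DBC"."TVM"          TO hana_user;
GRANT SELECT ON "DBC"."TVFields"     TO hana_user;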

 

33.jpg

 

34.jpg

 

Once done, register your adapter and create the remote source in HANA

 

Note: the official documentation for SPS 10 doesn’t mention that you need to load the Teradata JDBC driver into the lib folder of the DP Agent.

You can download this driver from the Teradata website according to your revision and extract it into the lib directory.

35.jpg

 

36.jpg

 

37.jpg

 

38.jpg

 

20.png

 

Link to :

SAP Hana EIM  Connection Scenario Setup - Part 1

SAP Hana EIM  Connection Scenario Setup - Part 3


Myth of HANA


Hi experts,

 

Since SAP HANA became generally available in 2011 (GA), I have come across a lot of untruths about the new in-memory platform. As a consultant I have been able to talk to many customers and other consultants at events like TechEd, DSAG, business partner days, etc. Every time I am surprised that, after all this time, so much dangerous half-knowledge is still out there. Some of it can easily be eliminated by reading note 2100010 (SAP HANA: Popular Misconceptions).

Most answers to these statements are pretty easy to find in the official notes, guides and other documents (blogs, presentations, articles, etc.), but maybe it is an overload of information.

 

1) start time

2) cross SID backup

3) col / row store conversion

4) sizing *2

5) statistics

6) data fragmentation
7) persistency layer

8) high memory consumption HANA vs. Linux

9) Backup

10) Backup catalog

 

S stands for the statement and A for the answer.

 

SQL scripts

Used SQL scripts are available in the attachment of note 1969700 - SQL statement collection for SAP HANA

 

 

1) Start time

S: "The start time (availability of the SAP system) must be 30 to 60min to load all data into memory"

A: Yes, it takes some time to load all data into memory, but for any DB it also takes time to fill its data buffer. For any DB the data buffer is filled on first access of the data, and the data stays there until the LRU (least recently used) algorithm kicks in and pushes it out of the buffer.

HANA loads the complete row store into memory on every start. After this, the system is available!

Short description of start procedure:

1) open data files

 

2) read out information about last savepoint ( mapping of logical pages to physical pages in the data file / open transaction list)

 

3) load row store (depends on the size and the I/O subsystem; about 5min for 100GB)

 

4) replay redo logs

 

5) roll back uncommitted transactions

 

6) perform savepoint

 

7) load column tables defined for preload and lazy load of column tables (asynchronous load of the column tables that were loaded before the restart)

For more details have a look at the SAP HANA Administration Guide (search for "Restart Sequence") or the SAP HANA Administration book. Thanks to Lars and Richard for this great summary!

 

Example:

Test DB: 40 GB NW 7.40 system with non-enterprise storage (= slow):

SQL HANA_IO_KeyFigures_Total:

read: 33mb/s
avg-read-size: 31kb
avg-read-time: 0,93ms
write: 83mb/s
avg-write-size: 243kb
avg-write-time: 2,85ms
row store size: 11GB
CPU: 8vcpu (vmware; CPU E5-2680 v2 @ 2.80GHz)

Start time without preload: AVG 1:48

Stop time without preload: AVG 2:15

 

Start time with a 5 GB column table (REPOSRC) preloaded:

SQL for preload (more information in the guide "SAP HANA SQL and System views Reference"):

alter table REPOSRC preload all

 

verify with HANA_Tables_ColumnStore_PreloadActive script from note 1969700 - SQL statement collection for SAP HANA

 

Start time with preload: AVG 1:49

Stop time with preload: AVG 2:18

 

Why doesn't the start time increase although 5 GB more data has to be loaded?

Since SPS 7, the preloading and reloading of tables happens asynchronously, directly after the HDB restart has finished. That way, the system is available again for SQL access that does not require the columns that are still being loaded.

 

With enterprise hardware the start times are faster!

 

If you want to know how long it takes to load all data into memory you can execute a python script.

load all tables into memory with python script:

cdpy (/usr/sap/HDB/SYS/exe/hdb/python_support/)
python ./loadAllTables.py --user=System --password=<password> --address=<hostname> --port=3xx15 --namespace=<schema_name>

[140737353893632, 854.406] << ending loadAllTables, rc = 0 (RC_TEST_OK) (91 of 91 subtests passed), after 854.399 secs

 

In a similar enterprise system it takes about 140-200 sec.

 

 

 

2) Cross SID backup

S: "It is not possible to refresh a system via Cross-SID copy"

A: Cross-SID copy (single container) from disk has been available for a long time. Since SPS 09 it is also available via the Backint interface.

Multitenant database containers (MDC) can currently (SPS 11) only be restored from disk for a Cross-SID copy.

 

 

 

3) Col / row store conversion

S: "Column tables can't be converted to row store and vice versa. It is defined by SAP which tables are stored in which store."

A: It is correct that during the migration, the SWPM procedure (used for system copy) creates files that define in which store the tables are created.

But technically you can change the type from row to column and vice versa on the fly. There should be a reason for it, though, e.g. advice from SAP Support. If you have no dependencies on the application, e.g. custom tables or a standalone HANA installation for your own applications, you can choose freely.
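A hedged example of such an on-the-fly conversion (schema and table name are illustrative):

ALTER TABLE "MYSCHEMA"."ZMY_TABLE" ALTER TYPE COLUMN;   -- row store -> column store
ALTER TABLE "MYSCHEMA"."ZMY_TABLE" ALTER TYPE ROW;      -- column store -> row store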

 

In the past SAP delivered a rowstorelist.txt with note 1659383 (RowStore list for SAP NetWeaver 7.30/7.31 on SAP HANA Database). This approach is outdated. Nowadays you can use the latest version of SMIGR_CREATE_DDL with the option "RowStore List" (note 1815547 - Row/ColumnStore check without rowstorelist.txt).

 

 

 

4) Sizing * 2

S: "You have to double the result of the sizing report."

A: The results of sizing reports are final; you don't have to double them.

 

example(BWoH):

| SIZING DETAILS                                      |
| ==============                                      |
| (For 512 GB node)       data [GB]    total [GB]     |
|                                      incl. dyn.     |
| MASTER:                                             |
| -------                                             |
|  Row Store                    53           106      |
|  Master Column Store          11            21      |
|  Caches / Services            50            50      |
|  TOTAL (MASTER)              114           178      |
|                                                     |
| SLAVES:                                             |
| -------                                             |
|  Slave Column Store           67           135      |
|  Caches / Services             0             0      |
|  TOTAL (SLAVES)               67           135      |
| --------------------------------------------------- |
|  TOTAL (All Servers)         181           312      |

 

This is a scale-up solution, so master and slave are functionally on one host. In a scale-out solution you have one host as master for the transactional load; this one holds all row store tables. SAP recommends a minimum of 3 hosts in a BW scale-out solution. The other 2 slaves handle the reporting load.

 

Static and dynamic RAM

SAP HANA main memory sizing is divided into a static and a dynamic RAM requirement. The static part relates to the amount of main memory that is used for holding the table data. The dynamic part has exactly the same size as the static one and is used for temporary data => grouping, sorting, query temp objects, etc.

 

In this example you have:

Row store: 53 * 2 = 106 GB

Column store: master 11 * 2 = 21 GB (rounded) + slave 67 * 2 = 135 GB (rounded) => 156 GB

Caches / Services: 50 GB is needed for every host

106 + 156 + 50 = 312 GB in total

 

 

 

5) Statistics

S: "Statistics are not needed any more. So no collect runs are needed"

A: For the column store the statement is correct, because the data distribution is known through the dictionary. For the row store, statistics are collected automatically on the fly, so you don't have to schedule them. Currently it is not documented how you can trigger the collection or change the sample size.

 

 

 

6) Data Fragmentation

S: "You don't have to take care of data fragmentation. Everything is held in memory via the column store and there is no fragmentation of data"

A: Some tables are created in the row store. The row store still follows the old rules and conditions, which results in fragmentation of data. How to analyze it?

Please see note 1813245 - SAP HANA DB: Row store reorganization

 

SELECT HOST, PORT, CASE WHEN (((SUM(FREE_SIZE) / SUM(ALLOCATED_SIZE)) > 0.30)
AND SUM(ALLOCATED_SIZE) > TO_DECIMAL(10)*1024*1024*1024)
THEN 'TRUE' ELSE 'FALSE' END "Row store Reorganization Recommended",
TO_DECIMAL( SUM(FREE_SIZE)*100 / SUM(ALLOCATED_SIZE), 10,2)"Free Space Ratio in %"
,TO_DECIMAL( SUM(ALLOCATED_SIZE)/1048576, 10, 2) "Allocated Size in MB"
,TO_DECIMAL( SUM(FREE_SIZE)/1048576, 10, 2) "Free Size in MB"
FROM M_RS_MEMORY WHERE ( CATEGORY = 'TABLE' OR CATEGORY = 'CATALOG' ) GROUP BY HOST, PORT

Reorg advice: a reorganization is recommended if the row store is bigger than 10 GB and has more than 30% free space.

 

!!! Please check all prerequisites in the notes before you start the reorg !!! (online / offline reorg)

Row store offline reorganization is triggered at restart time, and thus service downtime is required. Since it is guaranteed that there are no update transactions during the restart, it achieves the maximum compaction ratio.

 

Before

Row Store Size: 11GB

Freespace: ~3GB

in %: 27% (no reorg needed)

 

But for testing I configured the needed parameters in indexserver.ini (don't forget to remove them afterwards!):

4 min startup time => while starting, the row store is reorganized in offline mode

 

After

Row Store Size: 7,5GB

Freespace: ~250MB

in %: 3,5%

 

Additionally, you should consider tables with multiple containers if the revision is 90+. Multiple containers are typically introduced when additional columns are added to an existing table. As a consequence of multiple containers, performance can suffer, e.g. because indexes only take effect for a subset of the containers.

HANA_Tables_RowStore_TablesWithMultipleContainers

 

The compression methods of the col store (incl. indexes) should also be considered.

As of SPS 09 you can switch the largest unique indexes to INVERTED HASH indexes. On average you can save more than 30% of space. See SAP Note 2109355 (How-To: Configuring SAP HANA Inverted Hash Indexes) for more information. Compression optimization for those tables:

UPDATE "<table_name>" WITH PARAMETERS ('OPTIMIZE_COMPRESSION' = 'FORCE')

Details:2112604 - FAQ: SAP HANA Compression

 

 

 

7) Persistency layer

S: "The persistency layer consists of exactly the same data which are loaded into memory"

A: As described in statement 4), the memory is divided into two areas. The temp data is not stored on disk. The persistency layer on disk consists of the payload of the data, the before & after images / shadow pages concept + snapshot data + delta log (for the delta merge). The real delta structure of the merge scenario only exists in memory, but it is written to the delta logs.

 

Check out this delta by yourself:

SQL: HANA_Memory_Overview

check memory usage vs. disk size

 

 

 

8) High Memory consumption HANA vs. Linux

S: "The used memory of the processes is the memory which is currently in use by HANA"

A: No, for the Linux OS it is not transparent what HANA currently really uses. The numbers in "top" never match the ones in the HANA studio. HANA does not instantly communicate freed pages to the OS; there is a time offset for freed memory.

There is a pretty nice document which explains this behaviour in detail:

http://scn.sap.com/docs/DOC-60337

 

By default, garbage collection kicks in pretty late. If your system shows a high memory consumption, the root cause is not necessarily bad sizing or high load; the reason could also be a late GC.

 

2169283 - FAQ: SAP HANA Garbage Collection

One kind of garbage collection we already discussed in 6): row and column store fragmentation. Another one exists for hybrid LOBs, and there is one for the whole memory. Check your current heap memory usage with HANA_Memory_Overview.

 

In my little test system the value is 80GB. In this example we have 14GB for Pool/Statistics , 13GB for Pool/PersistenceManager/PersistentSpace(0)/DefaultLPA/Page and 9GB for Pool/RowEngine/TableRuntimeData

Check also the value of the column EXCLUSIVE_ALLOCATED_SIZE in the monitoring view M_HEAP_MEMORY. It contains the sum of all allocations in this heap allocator since the last startup.

 

select CATEGORY, EXCLUSIVE_ALLOCATED_SIZE,EXCLUSIVE_DEALLOCATED_SIZE,EXCLUSIVE_ALLOCATED_COUNT,
EXCLUSIVE_DEALLOCATED_COUNT from M_HEAP_MEMORY
where category = 'Pool/Statistics'
or category='Pool/PersistenceManager/PersistentSpace(0)/DefaultLPA/Page'
or category='Pool/RowEngine/TableRuntimeData';

Just look at the indexserver port 3xx03 (maybe the xsengine is also listed, if it is active).

 

CATEGORY                                                   | EXCL_ALLOC_SIZE    | EXCL_DEALLOC_SIZE  | EXCL_ALLOC_COUNT | EXCL_DEALLOC_COUNT
Pool/PersistenceManager/PersistentSpace(0)/DefaultLPA/Page | 384.055.164.928    | 369.623.433.216    | 6.177.019        | 5.856.165
Pool/RowEngine/TableRuntimeData                            | 10.488.371.360     | 792.726.992        | 83.346.945       | 26
Pool/Statistics                                            | 2.251.935.681.472  | 2.237.204.512.696  | 7.146.662.527    | 7.084.878.887

 

Because of a lot of deallocations there is a gap between EXCLUSIVE_ALLOCATED_SIZE and the currently allocated size. The difference is usually free for reuse and can be freed with a GC run.

 

But by default, the memory GC is triggered in the following cases:

Parameter (default value) | Details
async_free_target = 95 (%) | When proactive memory garbage collection is triggered, SAP HANA tries to reduce allocated memory below async_free_target percent of the global allocation limit.
async_free_threshold = 100 (%) | With the default of 100% the garbage collection is quite "lazy" and only kicks in when there is a memory shortage. This is in general no problem and provides performance advantages, as the number of memory allocations and deallocations is minimized.
gc_unused_memory_threshold_abs = 0 (MB) | Memory garbage collection is triggered when the amount of allocated, but unused memory exceeds the configured value (in MB).
gc_unused_memory_threshold_rel = -1 (%) | Memory garbage collection is triggered when the amount of allocated memory exceeds the used memory by the configured percentage.

 

The % values are related to the configured global allocation limit.
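These parameters are maintained in the memorymanager section of global.ini. A hedged example of changing one of them via SQL (the value is illustrative and should only be set after the load and sizing analysis discussed below):

ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('memorymanager', 'gc_unused_memory_threshold_abs') = '10240' WITH RECONFIGURE;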

 

Unnecessarily triggered GC runs should absolutely be avoided, but how you configure these values depends on your system load and sizing.

The unused memory will normally be reused by the HDB (free pool), so there is no need to trigger the GC manually. But in some cases it is possible that a pool uses more memory. This should be analyzed (1999997 - FAQ: SAP HANA Memory, 14. How can I identify how a particular heap allocator is populated?).

If we now trigger a manual GC for the memory area:

hdbcons 'mm gc -f'

 

Before:

heap: 80GB

 

free -m

             total       used       free     shared    buffers     cached

Mem:        129073     126877       2195      15434        142      32393

-/+ buffers/cache:     94341      34731

 

 

Garbage collection. Starting with 96247664640 allocated bytes.

82188451840 bytes allocated after garbage collection.

 

After:

heap: 72GB

 

free -m

             total       used       free     shared    buffers     cached

Mem:        129073     113680      15393      15434        142      32393

-/+ buffers/cache:     81144      47929


 

So at this point, inside the HDB there is not much difference in this scenario, but on the OS side the no-longer-allocated memory is freed.

You don't have to do this manually! HANA is fully aware of the memory management!

 

If you get an alert (ID 1 / 43) because of the memory usage of your services, you should not only analyze the row and column store; also take care of the GC of the heap memory. In the past there were some bugs in this area.

Alert defaults:

ID 1:  Host physical memory usage:  low: 95%  medium: 98%  high: 100%
ID 43: Memory usage of services:    low: 80%  medium: 90%  high: 95%

As you can see, by default the GC is triggered lazily at a 100% fill ratio of the global allocation limit; this may be too late for your system before the GC takes place, or before you can react to it.

 

In addition to the memory usage, check the mini check script and the advice in the notes. If you are not sure how to analyze or solve the issue, you can order a TPO service from SAP (2177604 - FAQ: SAP HANA Technical Performance Optimization Service).

 

 

 

9) Backup

S: "Restore requires logs for consistent restore"

A: Wrong. A HANA backup is based on snapshot technology, so the backup is consistent without any additional log file. This means it is a full online copy of one particular consistent state, which is defined by the log position at the time the backup is executed.

Of course, if you want to roll forward you have to apply log files for a point-in-time recovery or a recovery to the most recent state.

 

 

 

10) Backup Catalog

S: "Catalog information is stored in a file, like Oracle's *.anf, which is needed for recovery"

A: The backup catalog is saved with every data AND log backup. It is not saved as a human-readable file! You can check the catalog in the HANA studio or with the command "strings log_backup_0_0_0_0.<backupid>" in the backup location of your system if you use backup-to-disk.

 

The backup catalog includes all the information needed to determine which file belongs to which backup set. If you delete your backups at the disk/VTL/tape level, the backup catalog still holds the now invalid information.

 

Housekeeping of the backup catalog

There is currently no automatism that cleans it up. Just check the size of your backup catalog: if it is bigger than about 20 MB you should take care of housekeeping for the backup catalog (this depends on your backup retention and the size of the system), because, as already mentioned, it is saved with EVERY log AND data backup. This can mean more than 200 times a day! How big is the current backup catalog of your productive HANA system? Check the backup editor in HANA studio and click on "Show Log Backups". Search for the backup catalog, select it and check its size.
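A hedged sketch of catalog housekeeping via SQL (the BACKUP_ID is illustrative; pick the oldest data backup you still want to keep, e.g. from the backup catalog in the studio or from M_BACKUP_CATALOG):

SELECT BACKUP_ID, SYS_START_TIME, ENTRY_TYPE_NAME FROM "M_BACKUP_CATALOG" ORDER BY SYS_START_TIME DESC;

BACKUP CATALOG DELETE ALL BEFORE BACKUP_ID 1457971200000;            -- removes only the catalog entries
BACKUP CATALOG DELETE ALL BEFORE BACKUP_ID 1457971200000 COMPLETE;   -- additionally deletes the backups themselves (backup-to-disk)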

 

 

Summary

At the end you also have to take care of your data housekeeping and resource management. You can save a lot of resources if you consider all the hints in the notes.


I hope I could clarify some statements for you.



###########

# Edit V4

###########

2100010 - SAP HANA: Popular Misconceptions

(Thanks to Lars for the hint)

 


Best Regards,

Jens Gleichmann



###########

# History

###########

V4: Updated statistics (5); row/col statement adjusted, format adjusted

V5: adjusted format and added details for backup catalog

SAP HANA: The Row Store, Column Store and Data Compression


Here is an attempt to explain the row store data layout, column store data layout and the data compression technique.

 

Row store: here, all data belonging to a row is placed next to each other. See the example below.

 

Table 1:

Name    | Location   | Gender
…       | …          | …
Sachin  | Mumbai     | M
Sania   | Hyderabad  | F
Dravid  | Bangalore  | M
…       | …          | …

 

The row store corresponding to the above table is:

 

row store.jpg

 

 

Column store: here, the contents of a column are placed next to each other. See the illustration of Table 1 below.

 

column store.png

 

 

Data compression: SAP HANA provides a series of data compression techniques that can be used for data in the column store. To store the contents of a column, the HANA database creates at minimum two data structures: a dictionary vector and an attribute vector. See Table 2 below and the corresponding column store.

 

Table 2:

Record | Name  | Location  | Gender
…      | …     | …         | …
3      | Blue  | Mumbai    | M
4      | Blue  | Bangalore | M
5      | Green | Chennai   | F
6      | Red   | Mumbai    | M
7      | Red   | Bangalore | F
…      | …     | …         | …

 

column store2.png

 

 

In the above example the column 'Name' has repeating values 'Blue' and 'Red'; similarly for 'Location' and 'Gender'. The dictionary vector stores each value of a column only once, in sorted order, and a position is maintained against each value. With reference to the above example, the dictionary vectors of Name, Location and Gender could be as follows.

 

Dictionary vector: Name

Name  | Position
…     | …
Blue  | 10
Green | 11
Red   | 12
…     | …

 

Dictionary vector: Location

Location  | Position
…         | …
Bangalore | 3
Chennai   | 4
Mumbai    | 5
…         | …

 

 

Dictionary vector: Gender

Gender | Position
F      | 1
M      | 2

 

 

Now the attribute vector corresponding to the above table would be as follows. It stores integer values, which are the positions of the values in the dictionary vector.

 

  dictionary enco.png
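Based on the dictionary vectors above, the attribute vectors for records 3 to 7 of Table 2 would look as follows; each entry is simply the position of the original value in its dictionary vector:

Record | Name | Location | Gender
3      | 10   | 5        | 2
4      | 10   | 3        | 2
5      | 11   | 4        | 1
6      | 12   | 5        | 2
7      | 12   | 3        | 1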

Parallelization options with the SAP HANA and R-Integration


Why is parallelization relevant?

 

The R-Integration with SAP HANA aims at leveraging R’s rich set of powerful statistical and data mining capabilities, as well as its fast, high-level, built-in convenience operations for data manipulation (e.g. matrix multiplication, data subsetting, etc.) in the context of a SAP HANA-based application. To benefit from the power of R, the R-integration framework requires a setup with two separate hosts for SAP HANA and the R/Rserve environment. A brief summary of how R processing from a SAP HANA application works is described in the following:

 

  • SAP HANA triggers the creation of a dedicated R-process on the R-host machine, then
  • R-code plus data (accessible from SAP HANA) are transferred via TCP/IP to the spawned R-process.
  • Some computational tasks take place within the R-process, and
  • the results are sent back from R to SAP HANA for consumption and further processing.


For more details, see the SAP HANA R Integration Guide: http://help.sap.com/hana/SAP_HANA_R_Integration_Guide_en.pdf

 

There are certain performance-related bottlenecks within the default integration setup which should be considered. The main ones are the following:

  • Firstly, latency is incurred when transferring large datasets from SAP HANA to the R-process for computation on the foreign host machine.
  • Secondly, R inherently executes in a single threaded mode. This means that, irrespective of the number of CPU resources available on the R-host machine, an R-process will by default execute on a single CPU core. Besides full memory utilization on the R-host machine, the available CPU processing capabilities will remain underutilized.


A straightforward approach to gain performance improvements in the given setup is by leveraging parallelization. Thus I want to present an overview and highlight avenues for parallelization within the R-Integration with SAP HANA in this document.


Overview of parallelization options


The parallelization options to consider vary from hardware scaling (host box) to R-process scaling and are illustrated in the following diagram


0-overview.png

The three main paths to leverage parallelization, as illustrated above, are the following:

     (1) Trigger the execution of multiple R-calls in parallel from within SQLScript procedures in SAP HANA

     (2) Use parallel R libraries to spawn child (worker) R processes within parent (master) R-process execution

     (3) Scale the number of R-host machines connected to SAP HANA for parallel execution (scale memory and add computational power)


While each option can be implemented independently of the others, they can also be combined and mixed. For example, if you go for (3) – scaling the number of R-hosts – you need (1) – triggering the execution of multiple R-calls – for parallelism to take place. Without (1), you may end up "only" with a better high availability/fault tolerance scenario.


Based on the following use case, I would illustrate the different parallelization approaches using some code examples:

A Health Care unit wishes to predict cancer patient’s survival probability over different time horizons, after following various treatment options based on diagnosis.  Let's assume the following information:

  • Survival periods for prediction are: half year, one year and two years
  • Accordingly, 3 predictive models have been trained (HALF, ONE, TWO) to predict a new patient’s survival probability over these periods, given a set of predictor variables based on historical treatment data.


In a default approach without leveraging parallelization, you would have one R-call transferring the full set of new patient data to be evaluated, plus all three models, from SAP HANA to the R-host. On the R-host, a single-threaded R-process will be spawned, and survival predictions for all 3 periods would be executed sequentially. An example of the SAP HANA stored procedure of type RLANG is shown below.


0-serial.png

In the code above 3 trained models (variable tr_models) are passed to the R-Process for predicting the survival of new patient data (variable eval). The survival prediction based on each model takes place in the body of the “for loop” statement highlighted above.

 

Performance measurement: For a dataset of 1,038,024 observations (~16.15 MB) and 3 trained BLOB model objects (each ~26.8 MB), an execution time of 8.900 seconds was recorded.


There are various sources of overhead involved in this scenario. The most notable ones are:

  • Network communication overhead, in copying one dataset + 3 models (BLOB) from SAP HANA to R.
  • Code complexity, sequentially executing each model in a single-threaded R-process. Furthermore, the “for” loop control construct, though in-built into base R, may not be efficient from a performance perspective in this case.

 

By employing parallelization techniques, I hope to achieve better results in terms of performance. Let the results of this scenario constitute our benchmark for parallelization.



Applying the 3 parallelization options to the example scenario


1. Parallelize by executing multiple R-calls from SAP HANA


We can exploit the inherent parallel nature of SAP HANA’s database processing engines by triggering multiple R-calls to run in parallel, as illustrated above. For each R-call triggered by SAP HANA, the Rserve process spawns an independent R-runtime process on the R-host machine.

 

An example illustrating how an SAP HANA SQLScript-stored procedure with multiple parallel calls of stored procedure type RLANG is given below. In the example, one thought is to separate patient survival prediction across 3 separate R-Calls as follows:

1-1 Rlang.png

  • Create an RLANG stored procedure handling survival prediction for just one model ( see input variable tr_model).
  • Include the expression "READS SQL DATA" (as highlighted above) in the RLANG procedure definition for parallel execution of the R-operators to occur when embedded in a procedure of type SQLScript. Without this instruction, R-calls embedded in an SQLScript procedure will execute sequentially.
  • Then create an SQLSCRIPT procedure

1-2 SQLScript.png


  • Embed 3 RLANG procedure calls within the SQLScript procedure as highlighted. Notice that I am calling the same RLANG procedure defined previously, but I pass different trained model objects (trModelHalf, trModelOne, trModelTwo) to separate survival prediction across different R-calls.
  • In this SQLScript procedure you can include the READS SQL DATA expression (recommended for security reasons as documented in the SAP HANA SQLScript Reference guide) in the SQLSCRIPT procedure definition, but to trigger R-Calls in parallel it is not mandatory. If included however, you cannot use DDL/DML instructions (INSERT/UPDATE/DELETE etc) within the SQLSCRIPT procedure.
  • On the R host, 3 R processes will be triggered, and run in parallel. Consequently, 3 CPU cores will be utilized on the R machine.


Performance measurement: In this parallel R-calls scenario example, an execution time of 6.278 seconds was measured. This represents a performance gain of roughly 29.46%. Although this indicates an improvement in performance, we might theoretically have expected a performance improvement close to 75%, given that we trigger 3 R-calls. The answer for this gap is overhead. But which one?


In this example, I parallelized survival prediction across 3 R-calls, but still transmit the same patient dataset in each R-call. The improvement in performance can be explained, firstly, by the fact that HANA now transmits less data per R-call (only one model, as opposed to three in the default scenario), so the data transfer is faster; secondly, each model's survival prediction is performed in a separate R-runtime.

 

There are two other avenues we could explore for optimization in this use case scenario. One is to further parallelize R-runtime prediction itself (see section 2). The other is to further reduce the amount of data transmitted per R-call by splitting the patient dataset in HANA and parallelize the data transmitted across separate R-calls (see section 4).

 

Please note that without the READS SQL DATA instruction in the RLANG procedure definition, an execution time of 13.868 seconds was measured. This is because each R-call embedded in the SQLScript procedure is executed sequentially (3 R-call roundtrips).


2. Parallelize the R-runtime execution using parallel R libraries



By default, R execution is single threaded. No matter how much processing resource is available on the R-host machine (64, 32, 8 CPU cores etc.), a single R runtime process will only use one of them. In the following I will give examples of some techniques to improve the execution performance by running R code in parallel.

 

Several open source R packages exist which offer support for parallelism with R. The most popular packages for R-runtime parallelism on a single host are "parallel" and "foreach". The "parallel" package offers a myriad of parallel functions, each specific to the nature of the data (lists, arrays, etc.) subject to parallelism. Moreover, for historical reasons, one can classify these parallel functions roughly under two broad categories, prefixed by "par-" (parallel snow cluster) and "mc-" (multicore).

 

In the following example I use the multicore function mclapply() to invoke parallel R-processes on the patient dataset. Within each of the 3 parallel R-runtimes triggered from HANA, the patient data is split into 3 subsets and survival prediction is then parallelized on each subset. See the figure below.


2-1.png

The script example above highlights the following:

  • 3 CPU cores are used (variable n.cores) by the R-process
  • The patient data is split into 3 partitions, according to the number of chosen cores, using the "splitIndices" function.
  • The task to be performed (survival prediction) by each CPU core is defined in the function "scoreFun"
  • Then I call mclapply(), passing the data partitions (split.idx), the number of CPU cores to use, and the function that should be executed by each core. A hedged sketch of this pattern follows the list.
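A hedged sketch of this data-parallel pattern inside the RLANG procedure body (variable names follow the description above; tr_model and eval are the model and patient data passed in from HANA, and the exact predict() call depends on the model type):

library(parallel)

n.cores   <- 3
split.idx <- splitIndices(nrow(eval), n.cores)      # partition the patient data into 3 index sets

scoreFun <- function(idx) {
  # survival prediction for one partition, using the single model passed to this R-call
  predict(tr_model, newdata = eval[idx, ], type = "response")
}

res    <- mclapply(split.idx, scoreFun, mc.cores = n.cores)   # one worker R-process per partition
result <- data.frame(prediction = unlist(res))                # combine the partial results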


In this example, 3 R-processes (masters) are initially triggered in parallel on the R-host by the 3 R-calls. Then, within each master R-runtime, 3 additional child R-processes (workers) are spawned by calling mclapply(). On the R-host we will therefore have 3 processing groups executing in parallel, each consisting of 4 R-runtimes (1 master and 3 workers). Each group is dedicated to predicting patient survival based on one model. For this setup, 12 CPUs will be used in total.

 

Performance measurement: In this parallel R-package scenario using mclapply(), an execution time of 4.603 seconds was observed. This represents roughly a 48.28% gain in performance over the default (benchmark) scenario and roughly a 20% improvement over the parallel R-call example presented in section 1.


3. Parallelize by scaling the number of R-Host machines connected to HANA for parallel execution


It is also possible to connect SAP HANA to multiple R-hosts and exploit this setup for parallelization. The major motivation for choosing this option is to increase the number of processing units (as well as memory) available for computation, when the resources of a single host are not sufficient. With this constellation, however, it is not possible to control which R-host receives which R request; the choice is determined randomly via an equally-weighted round-robin technique. From an SQLScript procedure perspective, nothing changes. You can reuse the same parallel R-call scripts as exemplified in section 1 above.


Setup Prerequisites


  • Include more than one IPv4 address in the calc engine parameter cer_rserve_addresses in the indexserver.ini or xsengine.ini file (see section 3.3 of the SAP HANA R Integration Guide)
  • Set up parallel R-calls within an SQLScript procedure, as described in section 1

3-1 config.png

I configure 2 R-host addresses in the calc engine rserve addresses option shown above. While still using the same SQLScript procedure as in the 3 R-calls scenario example (I change nothing in the code), I trigger the parallelization of 3 R-calls across two R-host machines.


3-2 Parallel R -call.png

Performance measurement: The scenario took 6.342 seconds to execute. This execution time is similar to the times measured in the parallel R-calls example. This example only demonstrates that parallelism works in a multi R-host setup. Its real benefit comes into play when the computational resources (CPUs, memory) available on a single R-box are believed not to be enough.


4. Optimizing data transfer latency between SAP HANA and R


As discussed in section 1, one performance overhead is the transmission of the full patient dataset in each parallel R-call from HANA to R (see the example in section 1). We could further reduce the data transfer latency by splitting the dataset into 3 subsets in HANA and then, using 3 parallel R-calls, transferring each subset from HANA to R for prediction. In each R-call, however, we would still have to transfer all 3 models.


An example illustrating this concept is provided in the next figure.


4-1 split in hana.png


In the example above, the following is performed

  • The patient dataset (eval) is split into 3 subsets in HANA (eval1, eval2, eval3).
  • 3 R-calls are triggered, each transferring a data subset together with all 3 models.
  • On the R-host, 3 master R-processes will be triggered. Within each master R-process I parallelize survival prediction across 3 cores using the function pair mcparallel()/mccollect() from the "parallel" R package (task parallelism), as shown below.


4-2 parallelize in R.png

 

  • I create an R function (scoreFun) to specify a particular task. This function focuses on predicting survival based on one model input parameter.
  • For each call of the mcparallel() function an R-process is started in parallel and evaluates the expression in the R function definition scoreFun. I assign each model individually.
  • With the list of assigned tasks I then call mccollect() to retrieve the results of the parallel survival prediction. A hedged sketch of this pattern follows the list.
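A hedged sketch of this task-parallel pattern (model variable names and the predict() call are illustrative):

library(parallel)

scoreFun <- function(model) {
  # survival prediction for the whole data subset, based on one model
  predict(model, newdata = eval, type = "response")
}

# start one forked R-process per model
jobHalf <- mcparallel(scoreFun(trModelHalf))
jobOne  <- mcparallel(scoreFun(trModelOne))
jobTwo  <- mcparallel(scoreFun(trModelTwo))

# wait for the three tasks and collect their results
results <- mccollect(list(jobHalf, jobOne, jobTwo))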


In this manner, the overall data transfer latency is reduced to the size of the data in each subset. Furthermore, we still maintain the completeness of the data via the parallel R-calls. The consistency of the results of this approach is guaranteed if there is no dependency between the result computations for the individual observations in the dataset.

 

Performance measurement: With this scenario, an execution time of 2.444 seconds was observed. This represents a 72.54% performance gain over the default benchmark scenario, roughly a 43% improvement over the parallel R-call scenario example in section 1, and a 24.26% improvement over the parallel R-runtime execution (with parallel R libraries) example in section 2. A fantastic result supporting the case for parallelization.


Concluding Remarks


The purpose of this document is to illustrate how techniques of parallelization can be implemented to address performance-related bottlenecks within the default integration setup between SAP HANA and R. The document presented 3 parallelization options one could consider:


  • Trigger parallel R-calls from HANA
  • Use parallel R libraries to parallelize the R-execution
  • Parallelize R-calls across multiple R-hosts.

 

With parallel R libraries you can improve the performance of a triggered R-process execution by spawning additional R-runtime instances on the R-host (see section 2). You can either parallelize by data (split the dataset computation across multiple R-runtimes) or by task (split the algorithmic computation across multiple R-runtimes). A good understanding of the nature of the data and the algorithm is, therefore, fundamental to choosing how to parallelize. When executing parallel R-runtimes using R libraries, we should remember that there is an additional setup overhead incurred by the system when spawning child (worker) R-processes and terminating them. The benefits of parallelism using this option should, therefore, be assessed after prior testing in an environment similar to the productive environment it will eventually run in.


On the other hand, when using the parallel R-calls option, no additional overhead is incurred on the overall performance. This option provides us with a means to increase the number of data transmission lanes between HANA and the R-host, and also allows us to spawn multiple parent R-runtime processes on the R-host. Exploiting this option led to the following key finding: the data transfer latency between HANA and R can, in fact, be significantly reduced by splitting the dataset in HANA and then parallelizing the transfer of each subset from HANA to R using parallel R-calls (as illustrated in section 4).





Other Blog Links

Install R Language and Integrate it With SAP HANA

Custom time series analytics with HANA, R and UI5

New SQLScript Features in SAP HANA 1.0 SPS9

How to see which R packages are installed on an R server using SAP HANA Studio.

Quick SAP HANA and R usecase

Let R Embrace Data Visualization in HANA Studio

Connect ABAP with R via FastRWeb running on Rserve

HANA meets R    

Creating an OData Service using R

SAP HANA Application Example : Personal Spending Analysis - HANA Models

SAP HANA Database Campus – Open House 2016 in Walldorf


The SAP HANA Database Campus invites students, professors, and faculty members interested in database research to join our third Open House at SAP's headquarters. Throughout your day, you will get an overview of database research at SAP, meet the architects of SAP HANA and learn more about academic collaborations. There are a couple of interesting presentations by developers and academic partners. Current students and PhD candidates present their work and research. For external students and faculty members it is a great chance to find interesting topics for internships and theses.


The event takes place on June 2nd, 2016, from 09:30 to 16:00 in Walldorf, Germany. Free lunch and snacks are provided for all attendees. The entire event is held in English.

 

Register here

 

 

Looking forward to seeing you in Walldorf,

The SAP HANA Database Campus

students-hana@sap.com

 

 

Location:

  • SAP Headquarters,WDF03, Robert-Bosch-Str. 30, 69190, Walldorf, Germany
  • Room E4.02, Check-In Desk in the lobby of WDF03

 

Agenda:

  • 09:00-09:30 Arriving
  • 09:30-10:00 Check-In
  • 10:00-10:15 Opening
  • 10:15-11:00 Keynote
    • Daniel Schneiss (Head of SAP HANA Development) – Topic will be announced
  • 11:00-12:00 Poster Session Part 1 & Career Booth
  • 12:00-12:45 Lunch
  • 12:45-13:00 Office Tour
  • 13:00-14:00 Session 1 – Academic
    • Prof. Anastasia Ailamaki (EPFL) – Scaling Analytical and OLTP Workloads on Multicores: Are we there yet? [30 min]
    • Ismail Oukid (SAP HANA PhD student, TU Dresden) – FPTree: A Hybrid SCM-DRAM Persistent and Concurrent B-Tree for Storage Class Memory [15 min]
    • SAP HANA PhD Student Speaker and Topic will be announced [15 min]
  • 14:00-15:00 Poster Session Part 2, Career Booth & Coffee Break
  • 15:00-15:45 Session 2 – SAP
    • Hinnerk Gildhoff (SAP) – SAP HANA Spatial & Graph [20 min]
    • Daniel Booss (SAP) – SAP HANA Basis [20 min]
  • 15:45-16:00 Best Student/PhD-Student Poster & Open House Closing

 

 

Archive of previous events


By participating you agree to appear in photos and videos taken during the event and published on SCN and CareerLoft.

SAP HANA TDI - Overview


SAP HANA tailored data center integration (TDI) was released in November 2013 to offer an additional approach of deploying SAP HANA. While the deployment of an appliance is easy and comfortable for customers, appliances impose limitations on the flexibility of selecting the hardware components for compute servers, storage, and  network. Furthermore, operating appliances may require changes to established IT operation processes. For those who prefer leveraging their established processes and gaining more flexibility in hardware selection for SAP HANA, SAP introduced SAP HANA TDI. For more information please download this overview presentation.

View this Document

HANA Rules Framework (HRF)


Welcome to the SAP HANA Rules Framework (HRF) Community Site!


SAP HANA Rules Framework provides tools that enable application developers to build solutions with automated decisions and rules management services, implementers and administrators to set up a project/customer system, and business users to manage and automate business decisions and rules based on their organizations' data.

In daily business, strategic plans and mission critical tasks are implemented by a countless number of operational decisions, either manually or automated by business applications. These days - an organization's agility in decision-making becomes a critical need to keep up with dynamic changes in the market.


HRF Main Objectives are:

  • To seize the opportunity of Big Data by helping developers to easily build automated decisioning solutions and/or solutions that require business rules management capabilities
  • To unleash the power of SAP HANA by turning real time data into intelligent decisions and actions
  • To empower business users to control, influence and personalize decisions/rules in highly dynamic scenarios

HRF Main Benefits are:

Rapid Application Development |Simple tools to quickly develop auto-decisioning applications

  • Built-in editors in SAP HANA studio that allow easy modeling of the required resources for SAP HANA rules framework
  • An easy to implement and configurable SAPUI5 control that exposes the framework’s capabilities to the business users and implementers

Business User Empowerment | Give control to the business user

  • Simple, natural, and intuitive business condition language (Rule Expression Language)

Untitled.png

  • Simple and intuitive UI control that supports text rules and decision tables

NewTable.png

  • Simple and intuitive web application that enables business users to manage their own rules

Rules.png   

Scalability and Performance |HRF as a native SAP HANA solution leverages all the capabilities and advantages of the SAP HANA platform.


For more information on HRF please contact shuki.idan@sap.com  and/or noam.gilady@sap.com

Interesting links:

SAP solutions already utilizing HRF:

Here is a (partial) list of SAP solutions that utilize HRF in different domains:

Use cases of SAP solutions already utilizing HRF:

SAP Transportation Resource Planning

TRP_Use_Case.jpg

SAP Fraud Management

Fraud_Use_Case.JPG

SAP hybris Marketing (formerly SAP Customer Engagement Intelligence)

hybris_Use_Case.JPG

SAP Operational Process Intelligence

OPInt_Use_Case.JPG

[SAP HANA Academy] Live3: Explain Clustering


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


In the next part of the Live3 course Philip Mugglestone explains how the SAP HANA predictive analysis library (PAL) can be used to cluster similar Tweeters together based on their influence and stance scores. This video will review the k-means clustering algorithm. Check out Philip’s tutorial video below.

Screen Shot 2015-03-18 at 6.16.17 PM.png

(0:35 – 3:20) Overview of PAL

 

For an extensive set of in-depth information about PAL browse through and view this playlist of 84 videos from Philip in the SAP HANA Academy. The Playlist covers many of PAL’s native algorithms including clustering with the K-means algorithm.

 

Reading through the SAP HANA PAL documentation is vital for getting a full understanding of the myriad capabilities PAL offers. In a web browser visit help.sap.com/hana and click on SAP HANA options. Select the SAP HANA Predictive link and then you can choose to view the PAL documentation in a PDF or online.

 

PAL is a set of data mining algorithms embedded in the SAP HANA engine (where the data actually resides). By navigating through the page you can find information on K-means clustering.

 

(3:20 – 4:40) K-means Clustering Information

 

K-means uses input data (in Live3, Twitter users) and then lists out information (influence and stance) about each piece of data so clustering can be performed based on similarities in the data.

 

K-means clustering is a table-based mechanism. This documentation is the go-to source for K-means clustering information including the data types of your input data, what parameters are required, how many clusters you have to create and what are their centers.

 

(4:40 – 7:20) Visualizing Tweeters' Stance and Influence Scores

 

Back in Eclipse do a data preview on the Tweeters table we just created. This Tweeters table will be the input table for the predictive analysis. Our id will be the Twitter users' handles and our inputs will be the stance and influence scores.

 

Clicking on the Distinct values tab quickly displays the range of the stance and influence values for all of the Twitter users. For Philip's data on the Australian Open over 67% of the users have a 0 stance score so they are considered neutral while over 70% have a -1 influence score.

Screen Shot 2015-03-18 at 6.57.16 PM.png

To further analyze the data, Philip clicks the Analytics tab and then drags both the stance and influence numerics to the Values axis. Then he selects a scatter chart to visualize a cross section of scores for each user. This divides all of the users into quadrants based on their stance and influence. One business value we could quickly derive would be to target the people in the top left quadrant, who are highly influential and expressing negative views, with educational outreach.

Screen Shot 2015-03-18 at 7.01.20 PM.png


Follow along with the Live3 course here.


The SAP HANA Academy offers over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy


[SAP HANA Academy] Live3: Perform Clustering


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


The SAP HANA Academy’s Philip Mugglestone continues along in the Live3 course by reviewing and running the SQL script necessary to set up and execute K-means clustering analysis using the SAP HANA predictive analysis library. Philip also shows how to create views that will be accessed via web services down the road. Please watch Philip’s video below.

Screen Shot 2015-04-16 at 1.04.43 PM.png

(0:30 – 2:00) Pasting Predictive SQL code and Replacing Schema Name

 

With Notepad++, open the file 05 setupPredictive.sql from the scripts folder of the Live3 code repository on GitHub. Copy and paste the code into a SQL console in Eclipse.

 

First, make sure you’re set to the proper schema. Second, you must paste your schema name everywhere in the code where it currently has neo_ as a placeholder. There are five lines of code that must be corrected, and you have the option of performing a global replace.

 

(2:00 – 4:35) Examination of the Setup Predictive Code - HCP AFL Wrapper Generator Stored Procedure

 

The first part of the code will perform a clean-up in case the syntax has been run previously.

 

The next section creates a stored procedure using a special HCP stored procedure called the HCP AFL Wrapper Generator. As HCP is a multi-tenant version of SAP HANA, this stored procedure is slightly different from the on-premise SAP HANA AFL Wrapper Generator. This stored procedure creates a set of table types that reflect the input data table, the parameter table and the two results tables.

Screen Shot 2015-04-16 at 1.15.25 PM.png

A reference to these table types is put into a signature table. Then the Wrapper Generator is called with the name of the procedure we want it to create (Tweeters_Cluster), the name of the algorithm we want to use (KMEANS) and the four input/output tables we want to use from the signature table. This allows SAP HANA to generate a stored procedure to perform the clustering that we can then call later on.

 

After executing the stored procedure, Philip examines it in the procedures folder of the newly created SYS_AFL schema.

 

(4:35 – 6:15) Examination of the Setup Predictive Code - Creating Parameter and Output Tables

 

The next part of the code creates the parameter table that will be used. The table’s type has already been defined in the stored procedure, so a column table can be created using the LIKE statement. Then values are inserted into the table. Philip recommends optimizing the number of clusters that will be created by setting a minimum value of 5 and a maximum value of 10. Execute that section of code to create the parameter table. A hedged sketch of this step follows the screenshot below.

Screen Shot 2015-04-16 at 1.56.00 PM.png
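A hedged sketch of this step (table and type names are illustrative, not necessarily those generated by the Live3 script; GROUP_NUMBER_MIN and GROUP_NUMBER_MAX are the PAL K-means parameters for the cluster range):

CREATE COLUMN TABLE "PAL_KMEANS_PARAMS" LIKE "PAL_CONTROL_T";
INSERT INTO "PAL_KMEANS_PARAMS" VALUES ('GROUP_NUMBER_MIN',  5, NULL, NULL);
INSERT INTO "PAL_KMEANS_PARAMS" VALUES ('GROUP_NUMBER_MAX', 10, NULL, NULL);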

Continuing on in the code are two lines that create a pair of empty output tables called PAL_RESULTS and PAL_Centers. These output tables are again created with the LIKE statement, so they use the table types that have already been set up. The table types respect the structure that was listed in the documentation.

 

(6:15 – 8:35) Running the Code and Examining the Tables


Actually running the clustering is rather simple: first the results tables must be emptied, and then the stored procedure must be called. This is the section that is run on a regular basis, possibly every minute or hour. The input table/view is the Twitter view, which sends real-time data straight into the stored procedure in the SAP HANA engine. The stored procedure does an intense job of reading the entire set of data and iterating over it up to 100 times to work out the clusters.
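A hedged sketch of the call itself (schema, procedure and table names are illustrative):

TRUNCATE TABLE "PAL_RESULTS";
TRUNCATE TABLE "PAL_CENTERS";
CALL "SYS_AFL"."TWEETERS_CLUSTER" ("TWEETERS", "PAL_KMEANS_PARAMS", "PAL_RESULTS", "PAL_CENTERS") WITH OVERVIEW;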

 

At the end of the code is a set of select statements that let us see the data in the output tables. The first result table shows a cluster number between 0 and 9 for each of the Twitter users from the input data. The distance number in the results table details how far a user is away from the center of their assigned cluster. The second output table shows the stance and influence value for each cluster.

Screen Shot 2015-04-16 at 2.17.32 PM.png

(8:35 – 10:30) Examination of the Setup Predictive Code - Creating Views for Web Services

 

The code also creates a pair of views. The TweetersClustered view pulls in the original tweeting information, adds 1 to the cluster number and counts the total number of tweets each user sent. This enables us to see the stance, influence, the total number of tweets and the cluster number for each Twitter user.

Screen Shot 2015-04-16 at 2.16.40 PM.png

The Clusters view shows the center of the clusters. This view adds one to the cluster’s value and shows the number of people in the cluster using a join of a select of the number of users assigned to each cluster.

 

(10:30 – 12:00) Analysis of the TweetersClustered View

 

Open a data preview for the TweetersClustered view in the HCP schema and go to the Analysis tab. Drag the user and the clusterNumber over to the Labels axis and the stance and influence over to the Values axis. Change to a bubble chart to see a plot of the influence and stance for each individual Twitter user, with a color for each individual cluster number.

Screen Shot 2015-04-16 at 2.14.34 PM.png

Follow along with the Live3 on HCP course here.


The SAP HANA Academy offers over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: SAP HANA Web-Based Development Workbench


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


Philip Mugglestone from the SAP HANA Academy continues the Live3 on HCP course by introducing the SAP HANA Web-based Development Workbench and showing how to create the live3 web services project. Check out Philip's video below.

Screen Shot 2015-04-17 at 4.43.54 PM.png

(0:18 – 3:10) Examination of the SAP HANA Web-based Development Workbench

 

You have the option to perform your live3 application development either with the SAP HANA Development perspective in Eclipse or with the SAP HANA Web-based Development Workbench in a web browser.

 

In this course Philip opts for the web-based option. To follow along with Philip in your SAP HANA Cloud Platform Cockpit navigate to the HANA instances section and click on the link to the SAP HANA Web-based Development Workbench in your dev trial instance. In the workbench you can develop your web services projects that will appear on the web. Already displayed on the workbench is a project with your p-number trial account name and within that is a project called dev. A new live3 project will be built in dev.

 

The SAP HANA Web-based Development Workbench Catalog provides a very similar look to the Catalog in Eclipse. So you can look at all of the data in the $TA_Tweets table in your application schema on the web. You can run SQL on the web to see the data that you want, and you will have the same rights as you do in Eclipse.

Screen Shot 2015-04-17 at 4.49.42 PM.png
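For example, a quick check of the text analysis output from the web-based SQL console could look like this (the schema name is a placeholder for your own application schema):

SELECT TOP 100 * FROM "<your_application_schema>"."$TA_Tweets";
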

(3:10 – 5:30) Creating the live3 Application

 

Philip recommends that you create the live3 project in the dev project and name it live3 in lowercase as there will be some dependencies in the sample code from GitHub. Right click on dev, choose create application and name the sub-package live3. Choose blank application as the template before clicking create.

Screen Shot 2015-04-17 at 4.48.20 PM.png

This will create a live3 project inside dev with three files. The index.html file is a boilerplate of some html. The .xsapp file basically says that this is an application in SAP HANA extended application services; it is currently empty, although typically it contains JSON syntax.


The .xsaccess file allows you to control who can access the application. To make the application visible on the web you need to set the exposed value as true. Also you can specify how people can authenticate when using the application.

 

Now click on the index.html file and then click the green arrow button to run on the server. This will run the application and generate a URL for the application.


Follow along with the Live3 on HCP course here.


SAP HANA Academy - over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: Web Services Authorization


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


Continuing the Live3 on HCP series the SAP HANA Academy’s Philip Mugglestone details how to configure the authorizations for the web services aspect of the Live3 application. Philip will define application privileges and user roles. Check out Philip’s video below.

Screen Shot 2015-04-24 at 2.03.01 PM.png

(0:20 – 5:05) Adding and Configuring the .xsprivileges and user.hdbrole Files

 

Picking up from the previous video Philip first removes the index.html file from the live3 project in the SAP Web-based Development Workbench. Next he opens the services folder from the Live3 code repository on GitHub and selects the user.hdbrole and the .xsprivileges files. Then he drags and drops the files into the Multi-File Drop Zone on the SAP HANA Web-based Development Workbench. This will automatically install and run the files on the server.

 

The .xsprivileges file defines the privilege, called execute, to run the application. The user.hdbrole file creates a role that has access to tables and views. The code initially fails as it's just boilerplate code with a template account and schema name. So insert your personal trial account name in both places and paste your schema name into the two marked lines in the code. Hitting the save button will save the role.

Screen Shot 2015-04-24 at 2.11.21 PM.png

(5:05 – 9:30) - Setting the Proper Authorization for the HDB Role

 

Next Philip makes sure that the .xsaccess file specifies that the user has the proper role before they can use the application. Drag and drop the .xsaccess file into the SAP HANA Web-based Development Workbench and then hit f5 to re-execute. Now the .xsaccess file will have a comment instructing you to paste in your trial name. After pasting it in and saving the file, the authentication method will be form based. However, in the HCP developer trial edition the authentication will be done automatically.

 

Now we will use another specific HCP stored procedure to authorize a user to this HDB role. Open the 06 setupAuthorizations.sql file from the scripts GitHub folder and paste the lines of code into a new SQL console in Eclipse. Make sure you add your p-number trial account before executing the call in your SAP HANA system to successfully grant the role.

Screen Shot 2015-04-24 at 2.27.40 PM.png
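For orientation only – the exact procedure name and arguments should be taken from the 06 setupAuthorizations.sql script, and the role path and user below are placeholders, not values confirmed by this document – the grant follows the HCP pattern of calling a stored procedure, roughly like:

call "HCP"."HCP_GRANT_ROLE_TO_USER"('p1234567890trial.dev.live3::user', 'YOUR_DATABASE_USER');  -- placeholder role path and user
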

Follow along with the Live3 on HCP course here.


SAP HANA Academy - over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: Web Services - Setup OData


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


In the next installment of the Live3 on HCP course the SAP HANA Academy’s Philip Mugglestone introduces OData and shows how to set up an OData service for the live3 web services project so you can access tables and views on the web. Check out Philip’s video below.

Screen Shot 2015-04-28 at 12.25.31 PM.png

(0:12 – 1:05) Introduction to OData

 

OData is short for Open Data Protocol. OData is a set of REST services that allow you to access the data in databases via URLs. You can liken it to ODBC for the web. SAP HANA supports REST services through OData, so there is nothing to set up and install. Therefore you can simply activate OData against your tables and views. For more information about OData visit odata.org.

 

(1:05 – 6:30) Configuring and Examining the services.xsodata File

 

With the live3 project selected in the SAP HANA Web-based Development Workbench drag the services.xsodata file from the services folder in the Live3 Github code repository and drop it into the Multi-File Drop Zone.

 

The powerful code in the services.xsodata file will reference data from the application schema so you must run a global replace to insert your personal schema name. After clicking the save button you will have activated the OData service.

Screen Shot 2015-04-28 at 12.46.23 PM.png

The code makes a reference in the service to the name of a table or view to access the data across the web. The code specifies the name of the table or view with the schema and gives a name for an entity because OData works with entities and properties as opposed to tables and columns. The first part of the code references the Tweets table and makes it so that the entity Tweets will be available through the OData format. The create, update and delete functions will be forbidden as we only want to read the data on the web.

 

A similar process is done for the Tweeters view. However, it's slightly different as OData requires a key and views don't have primary keys. So you must specify the name of the column, in this case user, which will then be the primary key.

 

The next section of code is for the TweetersClustered data and has a key of user. This piece of code is slightly different than the prior two sections of code as we want to make a navigation to associate it with another table. This is similar to a SQL join.

 

The 4th part of the code sets up the OData service for the Clusters view with ClusterNumber as the primary key and a navigation to set up an association.

 

(6:30 – 9:00) How Associations Work

 

The final two sections of the code are association statements that set up the associations' definitions between pairs of entities. In the first statement the principal is the Clusters entity with ClusterNumber as the key. For each ClusterNumber there may be multiple people in that cluster. So the dependent entity is the TweetersClustered entity with ClusterNumber as the key and a multiplicity of many (*). This is effectively equivalent to an inner join where data is read from the Clusters table and joined with the data read from the TweetersClustered table in order to get all of the Tweeters.
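
Conceptually – reusing the entity and column names from this project, with a placeholder schema – expanding the Tweeters of a cluster behaves much like this join:

SELECT c."clusterNumber", t."user", t."stance", t."influence"
  FROM "<schema>"."Clusters" c
  INNER JOIN "<schema>"."TweetersClustered" t
    ON t."clusterNumber" = c."clusterNumber";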

 

Names are given in order to reference the associations. For example, the Clusters2Tweeters association is exposed through a navigation property that contains each individual Tweeter in the cluster. The same one-to-many type of relationship is created for all of the Tweeters that have been clustered so we can see their individual tweets. So the Tweeters2Tweets association will be referred to as Tweets.

 

Hitting the execute button with services.xsodata selected will set up a working OData service. It will generate the XML file pictured below on the web.

Screen Shot 2015-04-28 at 12.59.39 PM.png

Follow along with the Live3 on HCP course here.


SAP HANA Academy - over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: Web Services - Using OData


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


Continuing the Live3 on the SAP HANA Cloud Platform course the SAP HANA Academy’s Philip Mugglestone provides a closer examination of the previously setup OData web services by running some example queries. Watch Philip's tutorial video below.

Screen Shot 2015-04-29 at 12.10.32 PM.png

(0:20 –  4:20) Viewing Meta Data and Entities in JSON Format

 

Running the services.xsodata file has generated a URL based on the trial account (p-number), SAP HANA instance (dev), project (live3), and file (services.xsodata). Calling the file lists out the existing entities (Tweets, Tweeters, TweetersClustered and Clusters).

 

With OData we can make requests via URL-based syntax. For example, appending /$metadata to the end of the URL displays the full metadata for all of the properties within each entity. The data you get from OData is self-describing, which is very important as SAPUI5 can read this metadata automatically to generate the screens.

Screen Shot 2015-04-29 at 12.16.49 PM.png

Be careful when looking at the individual entities in OData as there may be hundreds of thousands of, for example, Tweets, and you don't want to read them all. So appending /Tweets?$top=3 to the URL only displays the top 3 Tweets in XML format.

Screen Shot 2015-04-29 at 12.24.54 PM.png

The XML format appears a bit messy so you can convert it to JSON format by adding &$format=json to the URL. By default the JSON format isn't as readable as possibly desired, so you can download the free JSONView extension from the Chrome Web Store in order to display it in a nice readable format.

Screen Shot 2015-04-29 at 12.34.42 PM.png

To see only certain parts of an entity's data, for instance the id and text columns, you can append &$select=id,text to the URL. This returns only the id and text values, as well as the meta data for the Tweets entity.

Screen Shot 2015-04-29 at 12.37.03 PM.png

(4:20 – 6:30) OData's Filter, Expand and Count Parameters

 

Philip next shows the data for his Clusters entity by adding /Clusters?$format=json to the URL. Similar to a where clause in SQL, Philip filters his results by adding &$filter=clusterNumber eq 1 to display only his first cluster.

Screen Shot 2015-04-29 at 12.38.13 PM.png

To see the Tweeters association from the Clusters entity Philip adds an expand parameter by entering &$expand=Tweeters to the end of the URL. This returns all of the information for each of the individual Tweeters in cluster 1.

Screen Shot 2015-04-29 at 12.39.49 PM.png

To see the number of rows for an entity add /$count after the entity’s name in the URL.


Follow along with the Live3 on HCP course here.


SAP HANA Academy - over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: Web Service - Setup XSJS


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


Part of the SAP HANA Academy’s Live3 on HCP course, the below video tutorial from Philip Mugglestone shows how to add server-side scripting capabilities to the live3 web services project. With this you can configure actions to refresh the clustering and reset the database. Watch Philip’s video below.

Screen Shot 2015-05-05 at 3.27.02 PM.png

(0:35 – 3:00) Inserting the Proper Schema Name and P Number into the services.xsjs Code

 

With the live3 project selected in the SAP Web-based Development Workbench, open the services folder of the Live3 GitHub code repository and drag the services.xsjs file into the Multi-File Drop Zone. First you must do a global replace to insert your schema name. Also you must insert your account p-number where marked in the code, as the code checks to verify if the user has the execute privilege. After verification the user can perform the reset and/or cluster operation.

 

(3:00 – 6:00) Examining the Code’s Logic

 

The code is very straightforward. It first checks if the user has the privilege to execute. If so, the command (cmd) parameter is read from the URL. It will pause there and wait for the command. If cmd=reset then it will call the reset function, and if cmd=cluster then it will call the cluster function. If neither reset nor cluster is entered then it will display invalid command. If the user isn't authorized then a not authorized message will appear.

Screen Shot 2015-05-05 at 3.52.21 PM.png

The reset function’s code first sets the schema and then truncates (empties) the Tweets table that is loaded directly via node.js. Next it empties the PAL results and centers tables. Then the full text analysis index is first cleaned out and then recreated using the same code that was used earlier in the setup text analysis piece. The only difference from earlier is that the code is modified with a backslash in front of every single quotation mark in the SQL.

Screen Shot 2015-05-05 at 3.48.42 PM.png
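Stripped of its JavaScript wrapper, the reset logic corresponds to SQL along these lines; the schema, index, and text analysis configuration names are assumptions for illustration, not the course's confirmed identifiers:

TRUNCATE TABLE "<schema>"."Tweets";        -- tweets loaded directly via node.js
TRUNCATE TABLE "<schema>"."PAL_RESULTS";   -- PAL results
TRUNCATE TABLE "<schema>"."PAL_CENTERS";   -- PAL cluster centers
DROP FULLTEXT INDEX "<schema>"."<index_name>";
CREATE FULLTEXT INDEX "<index_name>" ON "<schema>"."Tweets" ("text")
  CONFIGURATION 'EXTRACTION_CORE_VOICEOFCUSTOMER' TEXT ANALYSIS ON;  -- assumed configuration; reuse whatever the earlier setup text analysis step used
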

The cluster function's code is similar to the setup Predictive SQL code. The schema is set and the PAL results and centers tables are truncated. Then the procedure is called. On the web, as opposed to seeing the results directly, the results table insert first shows question marks (placeholders). The code then loops around the set of results and inserts those results into the table using JavaScript.

Screen Shot 2015-05-05 at 3.51.17 PM.png

(6:00 – 7:30) Testing services.xsjs

 

Executing the services.xsjs file will open a web page that displays invalid command: undefined. This should happen as it didn't recognize the default command that was specified. So you must delete the default anti-caching parameter that appears after /services.xsjs? in the URL and then add a valid command. For instance cmd=cluster.

 

Entering the command for cluster won't display anything on the web page at this point. However, to show that the file has run with a valid command, open the developer tools (control+shift+I in Chrome) and go to the network tab. In the network tab there will be information about the call.

Screen Shot 2015-05-05 at 3.53.00 PM.png

Follow along with the Live3 on HCP course here.


SAP HANA Academy - over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: Web Services - Debugging


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


The SAP HANA Academy’s Philip Mugglestone continues the Live3 on HCP course by showing how the server-side scripting application can be easily debugged using the SAP HANA Web-based Development Workbench.  Check out Philip’s tutorial video below.

Screen Shot 2015-05-07 at 10.29.27 AM.png

(0:15 – 4:10) How to Debug the XSJS Application

 

First identify the user account. This is listed near the top right corner of the SAP HANA Web-based Development Workbench. Right click on the user name (in Philip’s case it begins with DEV_) and select inspect element. Then copy the user account name so it can be used later on in the debugging.

Screen Shot 2015-05-07 at 10.37.01 AM.png

Now a definition must be created that enables this user to perform debugging. When logged into the server go to the URL displayed below ending with /sap/hana/xs/debugger. On the Grant Access screen paste the copied account name into the Username text box. Set an expiration date and time for when the debugging access will cease and then click the grant button. Now this user can debug the session.

Screen Shot 2015-05-07 at 10.38.24 AM.png

Back in the SAP HANA Web-based Development Workbench choose the services.xsjs file and hit the execute button to open it up in a new browser tab. Append cmd=cluster1 to the end of the URL to return an invalid command. Now open the developer tools (control+shift+I in Chrome) and navigate to the resources tab. Then expand the Cookies folder and open the session cookie file. Identify the value of the xxSessionId.

Screen Shot 2015-05-07 at 10.44.59 AM.png

Now back in the SAP HANA Web-based Development Workbench click the settings button. Then choose the value of the xxSessionId as the session to debug and click apply. A message will appear that the debugger has been attached to the session. Next set a break point where the command is being processed in the code.

Screen Shot 2015-05-07 at 10.46.16 AM.png

Now make a call in the URL. Philip enters cmd=cluster2. The screen won't change from earlier and will still say Invalid Command: cluster1, while the browser shows it is waiting for hanaxs.trail.ondemand. This is because the debugger has been opened in the SAP HANA Web-based Development Workbench. You will see that the cluster 2 command has been entered and the debugger has come to the break point that was set. You have the normal debugging options such as step in, step over, step through, etc. If you hit the resume button on the debugger, then on the file page it will now say Invalid Command: cluster2.

Screen Shot 2015-05-07 at 10.55.56 AM.png

This is how you can access the debugger to perform real-time debugging when using XS in SAP HANA.

 

Follow along with the Live3 on HCP course here.


SAP HANA Academy - over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy


[SAP HANA Academy] Live3: Web Services - Authentication


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


In the next part of the SAP HANA Academy’s Live3 on HCP course Philip Mugglestone explains why a “proxy” authentication server is needed to access your SAP HANA Cloud Platform web services from a SAP HANA Cloud HTML5 application. Watch Philip’s tutorial video below.

Screen Shot 2015-05-08 at 10.05.31 AM.png

(0:12 – 3:00) Issue with HTML5 Authentication for the HCP Developer Trial Edition

 

Prior to this tutorial the web services were set up using the SAP HANA instance. We now want to access our Live3 app, OData, and server side JavaScript from a front end application UI.

 

Back in the SAP HANA Cloud Platform Cockpit our SAP HANA instance now has one application. Clicking on the application shows the URL, which you can navigate to and then enter a command like we've done in the earlier videos in the Live3 course.

 

There is one slight complication to building an HTML5 front end application. Our SAP HANA instances in the developer trial edition of HCP use SAML 2.0 authentication. Normally, to access a backend system when working with an HTML5 application you use a destination in order to reference a folder or URL. The destination appears to be local to where the HTML5 application is hosted. However, it is pushed out to a backend system that can be hosted anywhere on the internet (even behind a firewall if you use the cloud connector). The destination is very important as it allows you to get around the cross-origin restrictions of most browsers.

 

The trial edition of the SAP HANA Cloud Platform uses only SAML 2.0 as the authentication for the SAP HANA instance. SAML 2.0 is not an authentication method available in the destination configuration in the SAP HANA Cloud Platform Cockpit. Fortunately there is a workaround.

Screen Shot 2015-05-08 at 10.32.13 AM.png

(3:00 – 4:45) Explanation for Proxy’s Necessity via the Live3 Course Architecture

 

Normally the browser or mobile HTML5 app would access the SAP HANA Cloud Platform where the HTML5 app is hosted. It would then access a backend system, which is SAP native web services, through a destination. However, we can't connect the destination to the SAP HANA XS instance. So a destination can be defined that goes through the SAP HANA Cloud Connector that is installed locally on the desktop. Then a proxy is inserted in-between the SAP HANA Cloud Connector and the native web services to account for the SAML 2.0 authentication and then connect back to the destination. This would not be run in production but is being used in this course purely as a workaround for a technical limitation of the free trial developer edition of the SAP HANA Cloud Platform.

Screen Shot 2015-05-08 at 10.35.08 AM.png

(4:45 – 5:45) Locating the Proxy

 

The necessary proxy was created by SAP Mentor, Gregor Wolf. Search Google for "Gregor Wolf GitHub" and click on the link to his page. Under the popular repositories section open the hanatrail-auth-proxy repository. Written in node.js, it will allow us to access the SAP HANA web services via a destination. The next video will detail how to download and install the proxy.


Follow along with the SAP HANA Academy's Live3 on HCP course here.


SAP HANA Academy - Over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

[SAP HANA Academy] Live3: Web Services - Authentication Setup Proxy


[Update: April 5th, 2016 - The Live3 on HCP tutorial series was created using the SAP HANA Cloud Platform free developer trial landscape in January 2015. The HCP landscape has significantly evolved over the past year. Therefore one may encounter many issues while following along with the series using the most recent version of the free developer trial edition of HCP.]


Continuing from the previous tutorial video of the SAP HANA Academy's Live3 on HCP course, Philip Mugglestone shows how to set up the "proxy" authentication server for the HCP trial developer edition. Watch Philip's tutorial video below.

Screen Shot 2015-05-11 at 10.43.55 AM.png

(0:20 – 3:30) Installing the Prerequisites for the hanatrail-auth-proxy File and Modifying its Code

 

On the hanatrail-auth-proxy page located on SAP Mentor Gregor Wolf's GitHub, click on the download ZIP button. Extract the downloaded zip and then open a command window in the hanatrail-auth-proxy folder.

 

First a few prerequisite node.js modules (cheerio and querystring) must be installed. In the command window enter npm install cheerio. Wait a few seconds for the cheerio installation to be completed before entering npm install querystring.

Screen Shot 2015-05-11 at 12.04.10 PM.png

*Note – The component has been updated since this video was recorded. Simply use "npm install" from the main hanatrail-auth-proxy folder. There is now no need to install cheerio and querystring explicitly.*

 

Next we need to make a few changes to the hanatrail-auth-proxy code. Right click to edit the config.js file with Notepad++. First you must set a port to use. This will create a web server that is similar to the node.js server we created earlier for loading the Twitter data.


You also must insert the correct host. The host is the beginning of the services.xsodata URL. For example Philip’s host is s7hanaxs.hanatrail.ondemand.com. Leave the timeout and https as is before saving the code.

Screen Shot 2015-05-11 at 12.09.28 PM.png

*Note – The config.js and server-basic-auth files have moved to the examples subfolder. You must still verify that the “host” option in examples/config.js matches your SAP HANA XS instance.*

 

(3:30 – 6:30) Running the Proxy

 

To start the proxy application, back in the command window enter node server-basic-auth.js. A message will appear saying the SAP HANA Cloud Platform trial proxy is running on the configured port number.

Screen Shot 2015-05-11 at 12.17.15 PM.png

Open a new web browser tab and enter localhost:portnumber followed by the URL path of the application. So in Philip's example he enters the URL displayed below.

Screen Shot 2015-05-11 at 11.55.39 AM.png

After logging in with your HCP p-number, the authentication for the SAP HANA instance using SAML 2.0 should be performed automatically. Effectively the proxy, acting as a local web server, now talks as if it's the SAP HANA Cloud Platform trial edition. You can now make all of the calls that were demonstrated in previous videos (e.g. metadata, clusters) using the localhost URL.

 

Follow along with the SAP HANA Academy's Live3 on HCP course here.


SAP HANA Academy - Over 900 free tutorial videos on using SAP HANA and SAP HANA Cloud Platform.


Follow @saphanaacademy

SAP HANA Solution Brief

Enabling Historical Alerts in SAP HANA with DB Control Center


Introduction

SAP HANA is constantly collecting various health and performance statistics that can be viewed by database administrators (DBAs) to ensure the system is operating at optimal capacity. Situations may arise where SAP HANA encounters problems, and that will typically trigger an alert to notify DBAs of pending or potential issues. By analyzing the patterns of alerts in the past, the DBA can develop insights into the behavior of their systems, for instance, to learn how system configuration changes may be affecting performance. This document describes how to enable Historical Alerts on your SAP HANA systems using SAP HANA DB Control Center (DCC).

 

Requirements

  • An SAP HANA system with DCC, both SPS11 or higher
  • SAP HANA Studio

 

Steps

We will be using DCC and  SAP HANA Studio in order to accomplish our task. DCC collects both system health data (as viewed in the Enterprise Health Monitor), and HANA alert data (as viewed in the Alert Monitor). We will enable historical system health data using the SAP DCC Setup wizard, then complete our setup by enabling and configuring Historical Alerts collection in SAP HANA Studio. Finally, we will learn how to undo the changes made to our system.

  1. Preparing Your System
  2. Enable Historical Alerts in DCC
  3. Complete Setup of Historical Alerts in SAP HANA Studio
  4. Enable Historical Alerts on Registered Systems
  5. Turn Off Historical Alerts and Undo Changes


Preparing Your System

For a number of steps in this tutorial, we will be interacting with the tables inside the catalog of your HANA system. To access the necessary tables, log in to your system in HANA Studio, then in the Systems view expand your system along Catalog > SAP_HANA_DBCC > Tables, as per the screenshot below. The tables we will be using are named Alerts.HistoryConfig, Alerts.HistoryData, and Site.PreferenceValues. To see the contents of these tables, right click on each table and select “Open Content” from the context menu. At this time, you should also open a SQL Console for your system by right clicking on the System name in the Systems view, and selecting “Open SQL Console”  from the context menu.

Catalog_Structure.png
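If you prefer SQL over the context menu, the same tables can be inspected directly from the SQL console; the fully qualified names below follow the "SAP_HANA_DBCC"."sap.hana.dbcc.data::…" pattern used throughout this document:

select top 1000 * from "SAP_HANA_DBCC"."sap.hana.dbcc.data::Alerts.HistoryConfig";
select top 1000 * from "SAP_HANA_DBCC"."sap.hana.dbcc.data::Alerts.HistoryData";
select top 1000 * from "SAP_HANA_DBCC"."sap.hana.dbcc.data::Site.PreferenceValues";
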

 

Enable Historical Alerts in DCC

1. Navigate to your DCC System's Launchpad, using the URL format below, and provide credentials at the login prompt. Ensure that your user has the sap.hana.dbcc.roles::DBCCConfig role (without this role, the user cannot use the DCC Setup app).
    http://<host_name>:<port>/sap/hana/dbcc/


2. Click on the SAP DCC Setup tile. You should see a list of DCC settings as the default tab.


Info: By default, Data History should be Disabled and History in days should be 30 days. If Data History is Enabled on your DCC System, skip Step 3.


3. Click the Edit button in the bottom-right corner. Check the Enabled checkbox and choose a Length of history (Default 30 days). Click the Save button.

Data_History_SS.png

 

At this point, we have enabled the collection of historical system health data for the systems registered in DCC. We will confirm that this operation was successful in the next two steps.


4. In SAP HANA Studio, open/refresh the Site.PreferenceValues table. If the previous steps were performed correctly, you will notice an entry under name “apca.historical.enabled” with v_int value of 1, and one under name “apca.historical.purge.max_age” with v_int value of 43,200 (in minutes, equivalent to 30 days).


Note: If the Length of history in DCC was never changed from the default value, the “apca.historical.purge.max_age” may not be present. In this case SAP HANA will use the default value of 30 days. To generate this record, either adjust the Length of history in DCC (You may set it back to default and the record will remain) or execute the following SQL statement:


upsert "SAP_HANA_DBCC"."sap.hana.dbcc.data::Site.PreferenceValues" ("name", "v_int") 
values ('apca.historical.purge.max_age', 43200) with primary key;


5. (Optional) As a check to ensure the previous steps were performed correctly, you can execute the following SQL statement to ensure system health data (Availability/Capacity/Performance) is being collected correctly. The result should be historical system health data populated to the current minute, as shown below:

   

select top 1000 * from "SAP_HANA_DBCC"."sap.hana.dbcc.data::APCA.Historical" order by "timestamp" desc;

APCA_Historical_SS.png

 

Complete Setup of Historical Alerts in SAP HANA Studio

Now that Historical system health data has been enabled, we must enable the Historical Alert data. To prepare for this step, open a SQL console for your system, as shown in the Preparing Your System step.


1. Generate two new records in the Site.PreferenceValues table. These records will enable the collection of Historical Alerts data. To do this, execute the following SQL Statements:

   

upsert "SAP_HANA_DBCC"."sap.hana.dbcc.data::Site.PreferenceValues" ("name", "v_int") 
values ('alert.historical.enabled', 1) with primary key;

   

upsert "SAP_HANA_DBCC"."sap.hana.dbcc.data::Site.PreferenceValues" ("name", "v_int")
values ('alert.historical.purge.max_age', 43200) with primary key;


Info: The “alert.historical.enabled” record acts as a master toggle switch for Historical Alert collections. When it is set to 1, DCC allows Historical Alert Collections to occur. As we will see later in this document, Historical Alerts can still be turned off for an individual system registered in DCC. However, if the “alert.historical.enabled” record is set to 0, no Historical Alert collection will occur, whether or not it is enabled on each system.

 

Info: The “alert.historical.purge.max_age” acts as a global purge age default. As we will see later in this document, the purge age can be overridden for an individual system registered in DCC.


2. (Optional) In order to check the values in the Site.PreferenceValues table have been entered correctly, execute the following SQL Statement:

select top 1000 * from "SAP_HANA_DBCC"."sap.hana.dbcc.data::Site.PreferenceValues";

Site.PreferenceValues_SS.png

 

Enable Historical Alerts on Registered Systems

At this point, we have configured DCC to allow Historical Alert collection to occur. Now, all that is left to do is configure the individual systems registered in DCC to allow or block Historical Alert collection, according to our landscape requirements. For the purposes of this document, we will enable Historical Alert Collection for all but one registered system, and we will override the purge age for another system.

For this step, we will need to open the content of the Alerts.HistoryConfig table. This table contains one record for each registered system, with a variety of associated alert collection parameters.

Alerts.HistoryConfig_SS.png

 

1. You can identify each system by its historyUrl, which contains the system host name. For any systems that require additional configuration, it is recommended that you note the resourceId value to avoid using the long historyUrl in your SQL statements. In our example, we will be disabling alerts for the system with resourceId = 132, and overriding the purge age for the system with resourceId = 140.
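
For example, to list the mapping of resourceId to historyUrl (plus the current settings) for all registered systems, you can run:

select "resourceId", "historyUrl", "isEnabled", "maxAge"
from "SAP_HANA_DBCC"."sap.hana.dbcc.data::Alerts.HistoryConfig";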


2. To enable Historical Alerts for all registered systems, execute the following SQL Statement:

   

update "SAP_HANA_DBCC"."sap.hana.dbcc.data::Alerts.HistoryConfig"
set "isEnabled" = 1;


3. After opening/refreshing the Alerts.HistoryConfig table, we notice that the isEnabled field is set for each system. If you wish to have Historical Alert data collected for all systems, and you do not require any additional system-specific configuration, you may skip to Step 6.


4. To disable Historical Alert collection for the “mdc-tn2” system only (with resourceId = 132), we will run the following SQL Statement:

   

update "SAP_HANA_DBCC"."sap.hana.dbcc.data::Alerts.HistoryConfig"
set "isEnabled" = 0 where "resourceId" = 132;


5. To set the purge age for the “dewdflhana2314” system only (with resourceId = 140) to 60 days (86400 minutes), we will run the following SQL Statement:

   

update "SAP_HANA_DBCC"."sap.hana.dbcc.data::Alerts.HistoryConfig"
set "maxAge" = 86400 where "resourceId" = 140;


6. Finally, refresh the Alerts.HistoryData table (you may need to order by “collectTimestamp” DESC). You should now see the Historical Alerts data populating this table, for the systems on which you have enabled Historical Alert collection.
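
For example (assuming the fully qualified name follows the same package pattern as the other DCC tables):

select top 1000 * from "SAP_HANA_DBCC"."sap.hana.dbcc.data::Alerts.HistoryData" order by "collectTimestamp" desc;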

 

 

Turn off Historical Alerts and Undo Changes

The simplest way to turn off Historical Alerts collection is to navigate back to the SAP DCC Setup tile in SAP DB Control Center, and uncheck the Enabled box. This action will set “v_int” = 0 in the “apca.historical.enabled” record of the Site.PreferenceValues table, causing DCC to stop collecting Historical System Health data. However, if you wish to undo the changes made in this document, you can execute the following SQL Statements:

   

delete from "SAP_HANA_DBCC"."sap.hana.dbcc.data::Site.PreferenceValues"
where "name" = 'alert.historical.enabled';
delete from "SAP_HANA_DBCC"."sap.hana.dbcc.data::Site.PreferenceValues"
where "name" = 'alert.historical.purge.max_age';
update "SAP_HANA_DBCC"."sap.hana.dbcc.data::Alerts.HistoryConfig"
set "isEnabled" = 0, "maxAge" = 0;

Conclusion

Historical Alerts can prove useful for deeper insight and analysis of system health and performance. SAP HANA provides the capability to fine tune Historical Alert collection and storage over your landscape of systems, using SAP DB Control Center and SAP HANA Studio. For a collection of all the SQL Statements used in this document, please refer to the available files, enableAlerts.sql and disableAlerts.sql.

It is also possible to move the historical alerts into extended storage using SAP HANA Dynamic Tiering.  For complete details on this topic, please refer to this document: http://scn.sap.com/docs/DOC-69205.

Table Transpose in SAP HANA Modeling


This document is prepared based on HANA SPS6 revision 63.


Jody Hesch beautifully explained how to do transposition and its use cases in the document (How To: Dynamic Transposition in HANA), which involves an additional table. If our requirement is to do a table transpose without creating any additional table, then we can do this completely by modeling in HANA studio. There could be many ways of doing this, and this is just another way of doing it.


Once the table is available in HANA studio, modeling will be done based on HANA base table and output of the Information view will be the transposed data.


Based on comments, this document was modified on Jan 8, 2014, and 2 other approaches have been added. Special thanks to Justin Molenaur and Krishna Tangudu for making this better.


Based on a comment, this document was modified on May 2, 2016. Special thanks to Pradeep Gupta for making this document even better than before.

 

Approach 1:

  • Analytic view will be built on each base table column which needs transposition.
  • In this case 6 columns need transposition, hence 6 Analytic views will be created.
  • Calculated Column (VALUE) is created in each Analytic view which derives the value of a particular month in a year.
  • Create Calculation View based on Analytic Views created above and join them together using Union with Constant Value.
  • No need to create Calculated Column (MONTH) in each Analytic view as this can be derived in Calculation View to improve performance.

 

Approach 2:

  • 1 general Analytic view will be created instead of several Analytic views in which selected attributes and measures will be selected.
  • In this case we select 6 measures M_JAN, M_FEB, M_MAR, M_APR, M_MAY, M_JUN in addition to common attributes.
  • Create Calculation View based on general Analytic View created above and join them together using Union with Constant Value.
  • Calculated Column (VALUE) is created in each Projection node which derives the value of a particular month in a year.

 

Approach 3:

  • No Analytic view will be created instead base table will be used directly.
  • Create Calculation View based on direct base table in each projection node.
  • Here also 6 projection nodes will be used.
  • Calculated Column (VALUE) is created in each Projection node which derives the value of a particular month in a year.

 

---------------------------------------------------------------------------------------------------

Approach 4 (Recommended):

  • With single SQLScript calculation view, the table can be easily transposed.
  • This is the easiest way, and better compared to the other approaches.

---------------------------------------------------------------------------------------------------




DDL used for workaround is given below:
---------------------------------------------------------------------------------------------------
CREATE COLUMN TABLE TEST.ACTUALS (
     ID INTEGER NOT NULL,
     NAME VARCHAR (20) NOT NULL,
     YEAR VARCHAR (4),
     M_JAN INTEGER,
     M_FEB INTEGER,
     M_MAR INTEGER,
     M_APR INTEGER,
     M_MAY INTEGER,
     M_JUN INTEGER,
     PRIMARY KEY (ID));

INSERT INTO TEST.ACTUALS VALUES (1,'NAME1','2012',101,102,103,104,105,106);
INSERT INTO TEST.ACTUALS VALUES (2,'NAME2','2012',111,112,113,114,115,116);
INSERT INTO TEST.ACTUALS VALUES (3,'NAME3','2012',121,122,123,124,125,126);
INSERT INTO TEST.ACTUALS VALUES (4,'NAME4','2012',131,132,133,134,135,136);
INSERT INTO TEST.ACTUALS VALUES (5,'NAME5','2012',141,142,143,144,145,146);

INSERT INTO TEST.ACTUALS VALUES (6,'NAME6','2013',201,202,203,204,205,206);
INSERT INTO TEST.ACTUALS VALUES (7,'NAME7','2013',211,212,213,214,215,216);
INSERT INTO TEST.ACTUALS VALUES (8,'NAME8','2013',221,222,223,224,225,226);
INSERT INTO TEST.ACTUALS VALUES (9,'NAME9','2013',231,232,233,234,235,236);
INSERT INTO TEST.ACTUALS VALUES (10,'NAME10','2013',241,242,243,244,245,246);
---------------------------------------------------------------------------------------------------

 

The data in the table is:

01.Table_Data.jpg

Transposed data:

  02.Example.jpg

Implementation steps for Approach 1:

  • Analytic view will be built on each base table column which needs transposition.
  • In this case 6 columns need transposition, hence 6 Analytic views will be created.
  • Calculated Column (VALUE) is created in each Analytic view which derives the value of a particular month in a year.
  • Create Calculation View based on Analytic Views created above and join them together using Union with Constant Value.
  • No need to create Calculated Column (MONTH) in each Analytic view as this can be derived in Calculation View to improve performance.

 

Now let us see this in action.

 

Let's start with building the Analytic view (AN_M_JAN) based on column M_JAN. In the Data foundation select the attributes ID, NAME, YEAR, which will be common in all Analytic views, and only the month M_JAN, skipping the other columns as shown below.

 

03.M_JAN.jpg
In the Logical Join, create a Calculated Column (VALUE) and hard-code its value to the name of the base table column (“M_JAN”), then validate the syntax as shown below.

04.CC_M_JAN.jpg

In the Semantics, hide the attribute M_JAN as it is not required in the output as shown below.
  05.M_JAN_Semantics.jpg
Now validate and activate the Analytic view and do a data preview. You will see only the values corresponding to M_JAN.

06.M_JAN_DATA_PREVIEW.jpg
Create a second Analytic view AN_M_FEB based on column M_FEB; the process is the same as for M_JAN above. In the data foundation make sure that you select month M_FEB, not M_JAN.

07.M_FEB.jpg

08.CC_M_FEB.jpg

 

Data preview for AN_M_FEB corresponds to M_FEB only.

10.M_FEB_DATA_PREVIEW.jpg

Similarly create other 4 Analytic views AN_M_MAR, AN_M_APR, AN_M_MAY, AN_M_JUN.

 

Create the Calculation View (CA_ACTUALS_MONTH). From the scenario panel, drag and drop the "Projection" node and add the Analytic view to it. Do not select the M_JAN column, as the Calculated Column VALUE is used instead. Similarly add a Projection node for each of the other Analytic views. In total, 6 Projection nodes are required, one for each Analytic view.

  12.P_MONTHS.jpg

Now add the "Union" node above the six "Projection" nodes and join them. In the details section click "Auto Map by Name". The only attribute missing in the output is "MONTH". In Target(s) under the Details section, click on create target as MONTH with datatype VARCHAR and size 3, which will contain 3-letter month names (e.g. JAN, FEB, MAR, etc.).

13.Union.jpg

Right click on MONTH, choose "Manage Mappings" and enter the constant value for each source model accordingly.

14.UCV.jpg

The final Calculation view would look like:

15.CA.jpg

Save, validate, and activate the view, then do the data preview:

  16.Final_Output.jpg

which is our desired output of the view, with the data transposed.

 

 

But what about the performance?

 

Total number of records the information view contains:

17.Total_Count.jpg

To check if the filters are pushed down to the Analytic search, you need to find the “BWPopSearch” operation and check the details on the node in the visual plan. Please refer to the excellent document by Ravindra Channe explaining "Projection Filter push down in Calculation View", which in turn points to the great Lars Breddemann blog "Show me the timelines, baby!"


Let us apply filter for the year 2012.

 

 

 

SELECT NAME, YEAR, MONTH, VALUE FROM "_SYS_BIC"."MDM/CA_ACTUALS_VALUE" WHERE YEAR = '2012';

18.VP_CS.jpg

The Analytic search when expanded will show:

  20.BWPopSearch.jpg

Though the table in our case is small, irrespective of table size the filter is pushed down, fetching only the required records from the base table, which helps improve performance.

 

Implementation steps for Approach 2:

  • 1 general Analytic view will be created instead of several Analytic views in which selected attributes and measures will be selected.
  • In this case we select 6 measures M_JAN, M_FEB, M_MAR, M_APR, M_MAY, M_JUN in addition to common attributes.
  • Create Calculation View based on general Analytic View created above and join them together using Union with Constant Value.
  • Calculated Column (VALUE) is created in each Projection node which derives the value of a particular month in a year.


Let us see this in action.


Create general Analytic view with no calculated columns, simple and straight forward as shown below:

  21.AN_VIEW.jpg

Create the Calculation view. Drag and drop the Projection node and add the general Analytic view, selecting only the measure M_JAN in addition to the common attributes. Create the Calculated Column VALUE as shown below:

  22.CC_Proj1.jpg

Now add 5 more projection nodes with the same Analytic view added to each. Create the Calculated Column VALUE in each projection node corresponding to the respective month (M_FEB, M_MAR, etc.).

  23.Other_Projections.jpg

Now add the Union node above these projections; the rest of the process is already covered in Approach 1.

  24.Final_CA_VIEW.jpg
Implementation steps for Approach 3:

  • No Analytic view will be created instead base table will be used directly.
  • Create Calculation View based on direct base table in each projection node.
  • Here also 6 projection nodes will be used.
  • Calculated Column (VALUE) is created in each Projection node which derives the value of a particular month in a year.

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------

Implementation steps for Approach 4: (recommended)

 

Create the SQLScript as below:

 

BEGIN

  var_out = 

       SELECT ID, NAME, YEAR, 'JAN' as "MONTH", M_JAN as "VALUE" from TEST.ACTUALS

       UNION

       SELECT ID, NAME, YEAR, 'FEB' as "MONTH", M_FEB as "VALUE" from TEST.ACTUALS

       UNION

       SELECT ID, NAME, YEAR, 'MAR' as "MONTH", M_MAR as "VALUE" from TEST.ACTUALS

       UNION

       SELECT ID, NAME, YEAR, 'APR' as "MONTH", M_APR as "VALUE" from TEST.ACTUALS

       UNION

       SELECT ID, NAME, YEAR, 'MAY' as "MONTH", M_MAY as "VALUE" from TEST.ACTUALS

       UNION

       SELECT ID, NAME, YEAR, 'JUN' as "MONTH", M_JUN as "VALUE" from TEST.ACTUALS

  ;

END

 

Script1.jpg

Output:

Output.jpg

Isn't it simple compared to the other approaches? Yes, it is.

 

 

Now you are familiar with different approaches to table transposition.

Thank You for your time.
