
SAP Hana Vora 1.2 setup with SAP Hana SP11 integration


In this document I’ll explain how to install and configure SAP HANA Vora 1.2 with SAP HANA SP11 integration, and I will demonstrate in detail how to set up a Hortonworks ecosystem in order to realize this configuration.

 

For my setup I’ll use my own lab on VMware vSphere 6.0, running SAP HANA Vora 1.2, SAP HANA Revision 112, and the Hadoop HDFS stack 2.7.2.

 

Disclaimer: My deployment is for test purposes only; I keep the security simple from a network perspective in order to realize this configuration, and I use open source software.

 

 

Order of execution

 

  • Deploy Hortonworks ecosystem
  • Install SAP Hana Vora for Ambari
  • Install SAP Hana Spark Controller 1.5
  • Install Spark assembly and dependent library
  • Configure Hive Metastore
  • Configure Spark queue
  • Adjust MapReduce2 class path
  • Connect SAP Hana to SAP Hana Vora

 

 

Guides used

 

SAP HANA Vora Installation and Developer Guide

SAP HANA Administration Guide

 

 

Notes used

 

2284507 - SAP HANA Vora 1.2 Release Note

2203837 - SAP HANA Vora: Central Release Note

2213226 - Prerequisites for installing SAP HANA Vora: Operating Systems and Hadoop Components

 

 

Links used

 

SAP Help for SAP HANA Vora 1.2

HDP Documentation Ver 2.3.4

 

 

Overview Architecture

 

5-9-2016 9-14-03 AM.jpg

 

The architecture is based on a fully virtual environment; running SAP HANA Vora 1.2 requires the following mandatory components as part of the Hadoop ecosystem:

• HDFS 2.6.x or 2.7.x

• ZooKeeper

• Spark 1.5.2

• Yarn cluster manager

 

For my configuration, all my servers are registered in my DNS and synchronized with an NTP server.

 

 

 

Deploy Hortonworks Ecosystem

 

The Hortonworks ecosystem deployment consists of several steps:

1. Prepare the server by sharing SSH Public Key

2. Install MySQL connector

3. Install Ambari

4. Install Hive database

5. Install and configure HDP cluster

 

 

To keep the installation simple, I decided to use the “Ambari Automated Installation” based on HDP version 2.3.4, which can be deployed with Spark version 1.5.2.

8.2.jpg

 

To realize this configuration, my deployment comprises 3 VMs:

Ambari: ambari.will.lab

Yarn: yarn.will.lab

Hana: vmhana02.will.lab


 

Prepare the server by sharing SSH Public Key

 

With my 3 servers up and running, we have to set up the SSH public key on the Ambari server in order to allow it to install the Ambari agent on the hosts that are part of the cluster.

 

I first create the RSA key pair

1.jpg

 

And copy the public key to the remote server “yarn”

2.jpg

 

And try to SSH to my remote server to confirm that I no longer need to enter the password

3.jpg
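For reference, a minimal sketch of the commands behind these screenshots, assuming the default key location and root access on the remote host:

# generate an RSA key pair on the Ambari server (default path, no passphrase)
ssh-keygen -t rsa

# copy the public key to the remote host so Ambari can log in without a password
ssh-copy-id root@yarn.will.lab

# verify that password-less login now works
ssh root@yarn.will.lab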

 

Install MySQL connector

 

Hive requires a relational database to store the Hive Metastore. I install the MySQL connector and note its path; it will be required during the Ambari setup

3.1.jpg

 

3.2.jpg
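A minimal sketch of this step, assuming the connector is installed from the SLES repositories and lands in the usual /usr/share/java location:

# install the MySQL JDBC connector
zypper install mysql-connector-java

# note the path of the jar, it will be needed during the Ambari setup
ls -l /usr/share/java/mysql-connector-java.jar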

 

Install Ambari

 

On the Ambari server we have to download the Ambari repository for SLES 11:

wget -nv http://public-repo-1.hortonworks.com/ambari/suse11/2.x/updates/2.2.0.0/ambari.repo -O /etc/zypp/repos.d/ambari.repo

4.jpg

 

And finally install Ambari

5.jpg

 

Now installed, the Ambari server needs to be set up:

Note: I decided to use Oracle JDK 1.8 and the embedded PostgreSQL database for Ambari

6.jpg

 

Once done, start the server and check its status

8.jpg
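The screenshots above correspond roughly to the following commands; the interactive setup is where I picked Oracle JDK 1.8 and kept the embedded PostgreSQL database:

# install the Ambari server from the repository added above
zypper install ambari-server

# run the interactive setup (JDK and database selection)
ambari-server setup

# start the server and verify its status
ambari-server start
ambari-server status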

 

Note: I did not specify the MySQL connector path during the initial Ambari setup. To include it, stop Ambari and load it by re-executing the setup command shown below

8.1.jpg
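The command in question registers the MySQL JDBC driver with Ambari; a sketch, assuming the connector jar sits in /usr/share/java:

ambari-server stop
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
ambari-server start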

 

Install Hive Database

 

By default on RHEL/CentOS/Oracle Linux 6, Ambari will install an instance of MySQL on the Hive Metastore host. Since I'm using SLES, I need to create the MySQL instance for the Hive Metastore myself.

20.jpg
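A minimal sketch of the MySQL preparation for the Hive Metastore, assuming a database named hive and a hive user; pick your own password:

# connect to MySQL as root, then run the following from the mysql prompt
mysql -u root -p

CREATE USER 'hive'@'%' IDENTIFIED BY 'hivepassword';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%';
CREATE DATABASE hive;
FLUSH PRIVILEGES;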

 

Install and configure HDP cluster

 

With the server up and running, we can start the installation and configuration of the HDP cluster components. To proceed, open the Apache Ambari URL and run the wizard with the default user and password “admin/admin”

9.jpg

 

Follow the steps provided by the wizard to create your cluster

10.jpg

 

11.jpg

 

For this section, provide the private key generated earlier on the Ambari server

12.jpg

 

13.jpg

 

Hosts added successfully, but check the warning message

14.jpg

 

Choose the services you want to deploy

17.jpg

 

Assign the services you want to run on the selected master node; since I’m using only one host, it’s a no-brainer. Additional hosts can be assigned later as needed

18.jpg

 

Assign slaves and clients

18.1.png

 

Customize your services as needed as well; in my case I use a MySQL database, so I need to provide the database information

19.jpg

 

19.1.png

 

Review the configuration for all services and execute

21.jpg

 

21.2.jpg

 

21.3.jpg

 

Once completed, access the Ambari web page and do some checks to confirm the services are running

22.jpg

With the Hortonworks ecosystem now installed, we can proceed with the SAP HANA Vora for Ambari installation

 

 

 

SAP HANA Vora for Ambari

 

SAP HANA Vora 1.2 is now available for download as a single installation package for the Ambari and Cloudera cluster provisioning tools. These packages also contain the SAP HANA Vora Spark extension library (spark-sap-datasources-<VERSION>-assembly.jar), which no longer needs to be downloaded separately.

23.jpg

 

The following components will be deployed from the provisioning tool

24.jpg

 

The Vora DLog component requires a specific library on the server, “libaio”; make sure it’s installed

25.jpg
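A quick way to check and, if needed, install it on SLES, assuming the standard repositories are available:

# check whether libaio is already present
rpm -q libaio

# install it if missing
zypper install libaio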

 

Once downloaded, from the Ambari server copy the VORA_AM* file into the following folder:

/var/lib/ambari-server/resources/stacks/HDP/2.3/services

26.jpg

 

And decompress it; this will generate the various Vora application folders

27.jpg

 

Then restart the Ambari server in order to load the new services

27.1.png
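A sketch of this step, assuming the downloaded VORA_AM* package is a tar.gz archive sitting in the current directory:

# copy the Vora package into the Ambari stack services folder and extract it there
cp VORA_AM* /var/lib/ambari-server/resources/stacks/HDP/2.3/services/
cd /var/lib/ambari-server/resources/stacks/HDP/2.3/services/
tar -xzf VORA_AM*

# restart Ambari so the new services are picked up
ambari-server restart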

  

Once completed, install the new Vora services from the Ambari dashboard

29.jpg

 

Select the Vora applications to deploy and hit Next to install them

28.jpg

 

The Vora Discovery and Thriftserver services will require some customization entries, such as the hostname and the Java location

30.jpg


30.1.png

 

31.jpg

 

The new services now appear; yes, I have some red services, but they will be fixed.

31.1.jpg

 

With the Vora engine installed, I now need to install the Spark Controller

 

 

 

Install SAP Hana Spark Controller 1.5

 

The Spark Controller needs to be downloaded from the SAP marketplace; it is an .rpm package.

32.jpg

 

Once downloaded, execute the rpm command to install it

33.jpg
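A sketch of the installation; the package name below is a placeholder, since the actual file name depends on the version downloaded from the SAP marketplace:

# install the Spark Controller package (replace with the actual .rpm file name)
rpm -ivh SAPHanaSparkController.rpm

# verify the target folder was created
ls -l /usr/sap/spark/controller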

 

When the installation completes, the /usr/sap/spark/controller folder should have been generated

33.1.jpg

 

The next phase is to install the Spark assembly file and dependent libraries

 

 

 

Install Spark assembly and dependent library

 

The Spark assembly file and dependent libraries need to be copied into the Spark Controller’s external lib folder.

Note: as of now, the assembly .jar from Spark version 1.5.2 is the only supported version that works with Vora 1.2; I download it from https://spark.apache.org/download.html

34.jpg

 

Decompress the archive and copy the necessary library into the “/usr/sap/spark/controller/lib/external” folder

34.1.jpg
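A sketch of this step, assuming the Spark 1.5.2 pre-built package for Hadoop 2.6 was downloaded; the archive and jar names may differ slightly depending on the build:

# extract the Spark download and copy the assembly jar into the controller's external lib folder
tar -xzf spark-1.5.2-bin-hadoop2.6.tgz
cp spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar /usr/sap/spark/controller/lib/external/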

 

Then I update the hanaes-site.xml file in the /usr/sap/spark/controller/conf folder to adjust its content

34.2.jpg

 

Spark and Yarn create staging directories in the /user/hanaes directory in HDFS; this directory needs to be created manually with the following command, run as the hdfs user:

hdfs dfs -mkdir /user/hanaes

35.jpg
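A sketch of the directory creation, assuming the controller runs as the hanaes user and should own the staging directory (the group is an assumption):

# switch to the hdfs user, create the staging directory and hand it over to hanaes
su - hdfs
hdfs dfs -mkdir /user/hanaes
hdfs dfs -chown hanaes:hdfs /user/hanaes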

 

 

 

Configure Hive Metastore

 

Since the SAP HANA Spark Controller connects to the Hive Metastore, the hive-site.xml file needs to be available in the controller’s class path.

To do this, I create a symbolic link in the /usr/sap/spark/controller/conf folder

36.jpg
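A sketch of the symbolic link, assuming the HDP client configuration lives in the usual /etc/hive/conf location:

# expose the cluster's hive-site.xml on the Spark Controller class path
ln -s /etc/hive/conf/hive-site.xml /usr/sap/spark/controller/conf/hive-site.xml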

 

And adjust the hive-site.xml file with the following parameters (a sketch of the resulting entries follows the list):

• hive.execution.engine = mr

• hive.metastore.client.connect.retry.delay = remove the (s) from the value

• hive.metastore.client.socket.timeout = remove the (s) from the value

• hive.security.authorization.manager = org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
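As a sketch, the corresponding hive-site.xml entries could look like this; the numeric values are simply the HDP defaults with the “s” suffix stripped, so adjust them to whatever your cluster currently uses:

<property>
  <name>hive.execution.engine</name>
  <value>mr</value>
</property>
<property>
  <name>hive.metastore.client.connect.retry.delay</name>
  <value>5</value>
</property>
<property>
  <name>hive.metastore.client.socket.timeout</name>
  <value>1800</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider</value>
</property>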

 

Note: these changes are made only because we are using the Hortonworks distribution for our example; with Cloudera they are not required

 

 

 

Configure Spark queue

 

 

In order to avoid Spark taking all available resources from the Yarn resource manager, and thus leaving no resources for any other application running on Yarn, I need to configure Spark dynamic allocation by setting up a queue in the “Queue Manager”

37.jpg

 

Create it, then save and refresh from the Actions button

38.jpg

 

Once done, add the spark.yarn.queue property to the hanaes-site.xml file (a sketch follows the screenshots)

39.jpg

 

39.1.jpg
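A sketch of the property entry in hanaes-site.xml, assuming the queue created above is named “hanaes”:

<property>
  <name>spark.yarn.queue</name>
  <value>hanaes</value>
</property>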

 

 

 

Adjust MapReduce2 class path

 

One important point to take into consideration about the Spark Controller is that the library class path called during startup doesn’t support variables such as “${hdp.version}”.

 

This variable is declared in the MapReduce2 configuration

39.2.jpg

 

Expand the Advanced mapred-site section and locate the parameter “mapreduce.application.classpath”

39.3.jpg

 

Copy/paste the whole string into your favorite editor and replace all ${hdp.version} entries with the current HDP version

39.4.jpg

 

Before the change

$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure

 

After the change

$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.3.0.0-2557/hadoop/lib/hadoop-lzo-0.6.0.2.3.0.0-2557.jar:/etc/hadoop/conf/secure
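To find the concrete version string and perform the substitution, a sketch assuming the copied class path was saved to a file named classpath.txt:

# show the HDP version installed on the node, e.g. 2.3.0.0-2557
hdp-select versions

# replace every ${hdp.version} occurrence with the concrete version
sed -i 's/\${hdp\.version}/2.3.0.0-2557/g' classpath.txt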

 

Once done, as “hanaes” user, start the Spark Controller from the directory /usr/sap/spark/controller/bin

40.2.jpg
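A sketch of the startup, assuming the start script shipped with the controller is named hanaes as described in the Spark Controller guide:

su - hanaes
cd /usr/sap/spark/controller/bin
./hanaes start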

 

Check the Spark Controller log in /var/log/hanaes/hana_controller.log to see if it’s running properly.

As we can see, I have an error in my config file

40.1.jpg

 

 

 

Connect SAP Hana to SAP Hana Vora

 

With my Hortonworks ecosystem in place and SAP HANA Vora 1.2 deployed, I can connect my HANA instance to it over the Spark adapter.

Before trying to make any connection, one specific library needs to be copied into the “/usr/sap/spark/controller/lib” folder: from “/var/lib/ambari-agent/cache/stacks/HDP/2.3/services/vora-base/package/lib/vora-spark/lib”, copy the spark-sap-datasources-1.2.33-assembly.jar file

41.jpg
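The copy boils down to a single command; a sketch:

cp /var/lib/ambari-agent/cache/stacks/HDP/2.3/services/vora-base/package/lib/vora-spark/lib/spark-sap-datasources-1.2.33-assembly.jar /usr/sap/spark/controller/lib/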

 

Once done restart the Spark Controller

 

Now, to connect to my Hadoop cluster from HANA, I need to create a new remote connection using the following SQL statement

42.jpg
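As a sketch of what the screenshot shows, the remote source is created with the “sparksql” adapter; the source name, host, port, and credentials below are assumptions from my lab, so adapt them to your Spark Controller configuration:

CREATE REMOTE SOURCE "VORA_SPARK" ADAPTER "sparksql"
CONFIGURATION 'port=7860;ssl_mode=disabled;server=yarn.will.lab'
WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=hanaes;password=hanaes';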

 

 

Since I did not create any table in my Hadoop environment, nothing appears below “default”. To test it, I’ll create a new schema, load a table (CSV) into it, and see the result in HANA

43.jpg
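A minimal Hive sketch of the test, assuming one of the sample CSV files listed below was uploaded to HDFS under /tmp; the schema, table, and column names are mine:

CREATE DATABASE vora_demo;

CREATE TABLE vora_demo.sales (
  transaction_date STRING,
  product STRING,
  price DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA INPATH '/tmp/sales_transactions.csv' INTO TABLE vora_demo.sales;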

 

Note: you can download some CSV samples here:

Sample insurance portfolio

Real estate transactions

Sales transactions

Company Funding Records

Crime Records

44.jpg

 

Once done, check the result from the Hive view

46.jpg

 

And verify in HANA by creating and querying the virtual table

47.jpg
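The same can be done in SQL; a sketch, assuming the remote source created earlier is named VORA_SPARK and the Hive table is vora_demo.sales (the "<NULL>" database part is how remote objects without a database level are typically addressed; in HANA Studio you can also right-click the remote table and choose Add as Virtual Table):

CREATE VIRTUAL TABLE "SYSTEM"."VT_SALES" AT "VORA_SPARK"."<NULL>"."vora_demo"."sales";

SELECT * FROM "SYSTEM"."VT_SALES";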

 

48.jpg

 

49.jpg

 

It’s all good, I have my data

50.jpg

 

My configuration is now complete, with SAP HANA Vora 1.2 set up and connected to SAP HANA SP11.

