Getting started with Hadoop using Kognitio Console (part 1)

by MikeAtkinson » Tue Jun 14, 2016 11:53 am

This note describes how to set up a Kognitio system using Console once Kognitio has been started on Hadoop and is running.

Connect to Kognitio

First, connect to the Kognitio system. Use the IP address of the Kognitio system, port 6550 (the default), the SYS user, and the SYS user's password.


Create the HADOOP module

Next, ensure that the HADOOP plugin (module) has been created and enabled by checking the list of modules in Console.


If the HADOOP module is not in this list, create and activate it with SQL like:

Code:

CREATE MODULE hadoop;                  -- create the Hadoop plugin module
ALTER MODULE hadoop SET MODE active;   -- and activate it

Click on the SQL icon to create a script, enter the text, then click on the green arrow to execute it.


Create the HDFS connector

Next, we need to create a connector between the Kognitio system and Hadoop (called HDFS_CON in this instance), using SQL like:

Code:

-- namenode gives the HDFS NameNode's host:port; bitness 64 selects the 64-bit connector
CREATE CONNECTOR hdfs_con SOURCE hdfs TARGET 'namenode 172.17.5.220:8020, bitness 64';

Again, click on the SQL icon to create another script, enter the text, then click on the green arrow to execute it.


Check that the connector is present by expanding the Connectors item in the System metadata tree.


(On this system, two other HDFS connectors had already been defined to perform TPC-DS tests.)
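The connector can also be checked from SQL. A minimal sketch, assuming the SYS.IPE_EXTERNAL_CONNECTOR view (used in the grants later in this post) lists the connector definitions:

Code:

-- List connector definitions; assumes SYS.IPE_EXTERNAL_CONNECTOR holds one row per connector
SELECT * FROM SYS.IPE_EXTERNAL_CONNECTOR;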

Create a test user

Now create a test user, MIKE in this case. Right-click on the Users item in the metadata tree, select "New User ..." and enter the user's details in the New User dialog.
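If you prefer SQL to the dialog, a user can be created with a statement along these lines (a sketch only; the user name and password are placeholders, and the exact CREATE USER options may vary with your Kognitio version):

Code:

-- Hypothetical equivalent of the New User dialog
CREATE USER mike PASSWORD 'change_me';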


Create a group

Also create a new group; I've called it HDFS_ACCESS. Right-click on the Groups item in the metadata tree, select "New Group ..." and enter the group's details in the New Group dialog.


Now double-click on the HDFS_ACCESS item in the metadata tree, which brings up the Group Object View for HDFS_ACCESS. We need to add the test user to this group, which can be done by dragging the user into the "Users" area of the HDFS_ACCESS object view and clicking "Save Changes".
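These two steps can also be done in SQL. A sketch, assuming a PostgreSQL-style ALTER GROUP ... ADD USER form for membership (check your Kognitio SQL reference for the exact syntax):

Code:

-- Hypothetical equivalents of the New Group dialog and the drag-and-drop step
CREATE GROUP hdfs_access;
ALTER GROUP hdfs_access ADD USER mike;  -- membership syntax is an assumption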


Add privileges to the group

Now privileges need to be added to the group so it can access the HDFS_CON connector. Single-click on HDFS_CON while the HDFS_ACCESS object view is showing. The available privileges for HDFS_CON are then enabled within the Privileges item in the metadata tree; drag and drop them into the "Privileges" area. I've chosen to drag and drop the Connector All privilege and the Table (connector-wide) All privilege, which gives users inheriting from the HDFS_ACCESS group complete control over the HDFS_CON connector and any external tables created using that connector. A more locked-down system would give fewer privileges to ordinary users and grant full privileges only to a few special users (which inherit from a group granting full privileges); a sketch of such a grant follows the note below.

Note: privileges may also be assigned directly to users, but it is almost always better to assign privileges to a group and then make users members of that group.
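For example, a more restrictive setup might give ordinary users read-only access by narrowing the ALL grants shown below to SELECT (HDFS_READERS is a hypothetical group, and this assumes SELECT can be granted wherever ALL can):

Code:

-- Read-only access to external tables, with no control over the connector itself
grant select on every table in connector HDFS_CON to HDFS_READERS;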


In the same way I've also given the group HDFS_ACCESS select and view privileges on two system views, SYS.IPE_EXTERNAL_DATA_SOURCE and SYS.IPE_EXTERNAL_CONNECTOR.

The SQL to grant these privileges is as follows:

Code:

grant all on connector HDFS_CON to HDFS_ACCESS;
grant all on every table in connector HDFS_CON to HDFS_ACCESS;
grant view on view SYS.IPE_EXTERNAL_DATA_SOURCE to HDFS_ACCESS;
grant select on view SYS.IPE_EXTERNAL_DATA_SOURCE to HDFS_ACCESS;
grant view on view SYS.IPE_EXTERNAL_CONNECTOR to HDFS_ACCESS;
grant select on view SYS.IPE_EXTERNAL_CONNECTOR to HDFS_ACCESS;
Create an External Data Source

The last thing that needs to be done is to create an External Data Source. We do this as SYS because SYS has the required privileges by default, and the External Data Source will then be visible to all users who can view and select from the underlying connector. It would also have been possible to grant another user the privileges to create the External Data Source instead.

Right-click on the External Data Sources item in the metadata tree, then select "New External Data Source ..." from the pop-up menu.


Click "Next" and change the name to something appropriate, HDFS_DATA in this example.


The next page allows various parameters to be changed; this is because the External Data Browser is a general-purpose tool designed to work with other potential data sources. Leave these parameters unchanged and click "Next".


The final page of the dialog shows the SQL which will be run to create the External Data Source.


Click on "Finish" to create the External Data Source.

Having done that, we can disconnect as SYS and reconnect as the test user (MIKE in this case).
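Once reconnected as the test user, a quick way to confirm that the grants took effect is to select from the two views granted above; run as MIKE, these should succeed rather than raise a permissions error:

Code:

-- Run as the test user to verify the SELECT grants
SELECT * FROM SYS.IPE_EXTERNAL_DATA_SOURCE;
SELECT * FROM SYS.IPE_EXTERNAL_CONNECTOR;

Another post will show how to use Console to browse the Hadoop data and create external tables from it.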