Using memory images

Kognitio is designed for fast, efficient processing for complex SQL queries and advanced analytics. For best performance, your data needs to be in memory so the CPUs can access it quickly without any disk read/write overhead. Data held in memory is called an image.

The commands to build images are simple extensions of standard SQL. This section takes you through the basics of creating memory images in Kognitio.

The sections on this page show how to create views and then create memory images of those views:

  • Create a view and image of the main fact data.

  • Replicate look-up views in memory.

  • Create views and images for the product hierarchy.

  • Create dependent views.

  • Create a dates look-up view.

  • Calculate statistics for views and images.

Create a view and image of the main fact data

1. Create a view for the sale data.

The simplest view you can create is a direct copy of all the data in a table:

CREATE VIEW demo_ret.v_ret_sale AS
SELECT * FROM demo_ret.t_ret_sale;

When you create a view, Kognitio doesn’t perform any processing. It stores the SQL statement as metadata. Then it executes the statement whenever you use the view in a query. You can think of a view as an alias for a derived table statement.

2. Run a query on the data in your view.

You can now test your view:

SELECT TOP 10 * FROM demo_ret.v_ret_sale;

We have not imaged the view yet so the query won’t run quickly.

3. Create an image for the sale data.

The IMAGE keyword is one of Kognitio’s extensions to standard SQL:

CREATE VIEW IMAGE demo_ret.v_ret_sale;

A view image is a snapshot of the data defined by the view, taken at the time you create the image.

Whenever you create a view for multiple queries or multiple users it is worth pinning it into memory as an image.

4. Run a query on the data in your image.

Now that you have imaged the view, a copy of the data is in memory so the performance will be much better:

SELECT TOP 10 * FROM demo_ret.v_ret_sale;

5. Verify that the sale data image is in memory.

You can use the DESCRIBE command to check whether a view has been imaged:

DESCRIBE VIEW demo_ret.v_ret_sale;

If you see “location RAM” in the response, you’ve created the view and imaged it successfully.

If you see “no RAM image” in the response, you’ve created the view but you haven’t imaged it.

Tip

If you’re using Kognitio Console, you can easily see if a view has been imaged. In the System pane, the default black font for a view name turns blue if the view has been imaged.

Replicate look-up views in memory

6. Create a view for the store data.

You can create views in Kognitio based on any SQL query. Views are very useful for carrying out data transformations without having to change the underlying data.

Here we add two columns to the store data to create a top level called “All Stores” in the store hierarchy:

CREATE VIEW demo_ret.v_ret_store AS
SELECT *,
    1 all_stores_id,
    'All Stores' all_stores_desc
FROM demo_ret.t_ret_store;

7. Create an image for the store data.

The demo_ret.t_ret_store table contains lookup data for each store. It has a small number of rows so we can replicate it in memory to enable faster joining to large fact tables:

CREATE VIEW IMAGE demo_ret.v_ret_store REPLICATED;

Optionally, you can run a SELECT TOP 10 query against this image too, similar to the query you ran for on the v_ret_sale image.

8. Run a query on the data in your image.

If you want, you can use the DESCRIBE command again too:

DESCRIBE VIEW demo_ret.v_ret_store;

In the response, you should see “location RAM” which confirms that you created the image successfully. You should also see “distribution REPL” which confirms that the replication was successful.

Create views and images for the product hierarchy

In the product data there is a hierarchy defined from top to bottom as:

  • Department

  • Group

  • Section

  • Product

The next steps are to build the views and images for this hierarchy, starting with product and working up to department.

9. Create all the views and images for the product hierarchy.

-- Product data
CREATE VIEW demo_ret.v_ret_product AS
SELECT prodno,
        (group_no*100)+section_no AS section_no,
        group_no,
        dept_no,
        product_name
FROM demo_ret.t_ret_product;

CREATE VIEW IMAGE demo_ret.v_ret_product REPLICATED;

-- Section data
CREATE VIEW demo_ret.v_ret_prod_section AS
SELECT (group_no*100)+section_no AS section_no,
        section_name,
        group_no
FROM demo_ret.t_ret_prod_section;

CREATE VIEW IMAGE demo_ret.v_ret_prod_section REPLICATED;

-- Group data
CREATE VIEW demo_ret.v_ret_prod_group AS
SELECT *
FROM demo_ret.t_ret_prod_group;

CREATE VIEW IMAGE demo_ret.v_ret_prod_group REPLICATED;

-- Department data
CREATE VIEW demo_ret.v_ret_prod_dept as
SELECT *,
        1 as all_prod_id,
        'All Products' as all_prod_desc
FROM demo_ret.t_ret_prod_dept;

CREATE VIEW IMAGE demo_ret.v_ret_prod_dept REPLICATED;

Notes:

  • The ID for the section called “section_no” is not always unique across product groups so we combine the group_no and section_no columns to create a unique ID.

  • The underlying table is not large so we can replicate the image to boost performance.

Create dependent views

Now you can build a product hierarchy from the four views you just created and imaged.

So far, all the views you’ve created on this page have been based on tables. It is possible, instead, to create views that are based on other views. These are called “dependent views”.

Dependent views are useful if:

  • there are series of transformation steps that you want to perform.

  • different users want access to the same data in different ways.

10. Create a view for the entire product hierarchy.

You can use dependent views to bring data in multiple sources together. You can bring the four views that you just created (product, section, group, and department) together like this:

CREATE VIEW demo_ret.v_ret_prod_hier as
SELECT p.*,
    section_name,
    group_name,
    dept_name
FROM demo_ret.v_ret_product p
JOIN demo_ret.v_ret_prod_section ps
ON p.section_no = ps.section_no
JOIN demo_ret.v_ret_prod_group pg
ON p.group_no = pg.group_no
JOIN demo_ret.v_ret_prod_dept pd
ON p.dept_no = pd.dept_no;

You have already imaged the four views that this view depends on. Those four images ensure that this view will be executed very quickly when it is used in a query.

You might decide that you don’t need to create an image of this view at all. Often, such an image is unnecessary if all the underlying views are in RAM. You can create an image of demo_ret.v_ret_prod_hier if you want to. Or you can save the RAM for other purposes if you prefer.

Optionally, you can run a query to test the product hierarchy:

SELECT TOP 10 * FROM demo_ret.v_ret_prod_hier;

Create a dates look-up view

11. Create a view and image for the date data.

This SQL uses the saledate field from the main EPOS sales data to create a dates lookup table:

CREATE VIEW demo_ret.v_ret_dates AS
SELECT
    saledate,
    DAYOFWEEK(saledate) sale_dow_no,
    MAPDAY(DAYOFWEEK(saledate)) sale_dow,
    DT_INFO(saledate, 'WEEKOFYEAR') sale_week,
    EXTRACT (MONTH FROM saledate) sale_month_no,
    MAPMONTH(EXTRACT (MONTH from saledate)) sale_month,
    DT_INFO(saledate, 'QUARTEROFYEAR') sale_quarter,
    EXTRACT (YEAR FROM saledate) sale_year,
    (EXTRACT (YEAR FROM saledate) * 100) +
        DT_INFO(saledate, 'WEEKOFYEAR') sale_year_week_no,
    (EXTRACT (YEAR FROM saledate) * 100) +
        EXTRACT (MONTH FROM saledate) sale_year_month_no,
    (EXTRACT (YEAR FROM saledate) * 100) +
        DT_INFO(saledate, 'QUARTEROFYEAR') sale_year_quarter_no
FROM (SELECT DISTINCT
          saledate
          FROM demo_ret.v_ret_sale
      ) dt1;

CREATE VIEW IMAGE demo_ret.v_ret_dates;

For more details on the functions used here see the date functions in the SQL reference.

Calculate statistics for views and images

12. Update the statistics.

Once a view has a memory image it is important to update its statistics. This helps the optimizer to make good decisions on how to run queries that utilize the view:

UPDATE STATISTICS FOR demo_ret.v_ret_sale;
UPDATE STATISTICS FOR demo_ret.v_ret_sale RAM;
UPDATE STATISTICS FOR demo_ret.v_ret_store;
UPDATE STATISTICS FOR demo_ret.v_ret_store RAM;
UPDATE STATISTICS FOR demo_ret.v_ret_product;
UPDATE STATISTICS FOR demo_ret.v_ret_product RAM;
UPDATE STATISTICS FOR demo_ret.v_ret_prod_section;
UPDATE STATISTICS FOR demo_ret.v_ret_prod_section RAM;
UPDATE STATISTICS FOR demo_ret.v_ret_prod_group;
UPDATE STATISTICS FOR demo_ret.v_ret_prod_group RAM;
UPDATE STATISTICS FOR demo_ret.v_ret_prod_dept;
UPDATE STATISTICS FOR demo_ret.v_ret_prod_dept RAM;
UPDATE STATISTICS FOR demo_ret.v_ret_dates;
UPDATE STATISTICS FOR demo_ret.v_ret_dates RAM;

Here you run two commands for each view:

  • The first finds out the cardinality of the view, takes samples of data from the view, and estimates the selectivity of each field.

  • The second collects information about the RAM utilized by the image.

Remember, we didn’t image the product hierarchy view so we don’t update the statistics for ret.v_ret_prod_hier.

Prev: Creating tables

Next: Querying in Kognitio