Running queries

This tutorial introduces some of the functionality that Kognitio supports. We start with simple examples and gradually build complexity.

All the queries on this page run against the retail demo data. You can find them in the retail demo tarball file: retail/scripts/kog-retail-demo-04-query-set.sql. Alternatively they’re available here.

Simple aggregation queries

1. Count the total number of items sold.

Each row in the main fact sales view in the retail data represents one item sold. To find the total number of items sold:

SELECT COUNT(*) num_items
FROM demo_ret.v_ret_sale;

2. Count the total number of stores.

The items are sold in stores across the company. To find how many stores there are, count the unique values in storeno column:

SELECT COUNT(DISTINCT storeno) num_stores
FROM demo_ret.v_ret_sale;

3. Create a simple aggregation.

You can create simple aggregations using the GROUP BY syntax. Here we group by storeno to obtain the number of items sold and total revenue for each store. We order the result by the total revenue in descending order:

SELECT storeno,
       COUNT(*) num_items,
       SUM(price)/100.00 total_revenue
FROM demo_ret.v_ret_sale
GROUP BY 1
ORDER BY 3 DESC;

4. Count the total number of items sold per store.

You can use the lookup views to make these results more meaningful. Here we use the store view to look up the store name:

SELECT st.storename,
       COUNT(*) num_items,
       SUM(s.price)/100.00 total_revenue
FROM demo_ret.v_ret_sale s
JOIN demo_ret.v_ret_store st
    ON s.storeno=st.storeno
GROUP BY st.storename
ORDER BY total_revenue DESC;

If you prefer, you can specify joins by aliases instead of view names. Aliases are convenient because they’re shorter. Best practice is to use aliases for every column declaration if your query contains a join. In this example the aliases indicate that storename came from v_ret_store and price came from v_ret_sale.

Complex SQL queries

5. Build a weekly report using our analytical functions.

Kognitio supports a wide range of SQL. In this example we’re using analytical windowing functions to obtain week-by-week running totals of items and value:

SELECT saleweek,COUNT(*) Items,
       SUM(price)/100.00 total_value,
       SUM(COUNT(*)) OVER (ORDER BY saleweek ROWS UNBOUNDED PRECEDING) cumul_items,
       SUM(SUM(price)/100.00) OVER (ORDER BY saleweek ROWS UNBOUNDED PRECEDING) cumul_value
FROM demo_ret.v_ret_sale
GROUP BY saleweek
ORDER BY saleweek;

You can find more details on this functionality in the SQL analytical functions section of our SQL Reference Guide.

Train-of-thought analysis

When using SQL to discover patterns in your data it is useful to be able to build queries on top of previous results in an ad hoc manner.

Kognitio is designed to run complex ad hoc queries. When details or patterns in the data are unknown it is useful to be able to build queries on top of previous results.

6. Define a product set.

This query uses a wildcard search to identify a subset of products for a brand:

SELECT *
FROM demo_ret.v_ret_product
WHERE UPPER(product_name) LIKE '%SHARWOOD%';

The SQL here is just example. In practice you could use any query to qualify a group of products such as a marketing campaign or a new product range.

7. Use a product set in a further query.

Once you’ve identified the product set, you might want to investigate how many baskets these products were found in:

SELECT p.product_name,
       dt1.basket_count
FROM (SELECT prodno,
             COUNT(DISTINCT basketno) as basket_count
      FROM demo_ret.v_ret_sale
      WHERE prodno IN (SELECT prodno
                       FROM demo_ret.v_ret_product
                       WHERE UPPER(product_name) LIKE '%SHARWOOD%')
      AND saledate BETWEEN DATE '2015-01-01' AND DATE '2015-03-31'
      GROUP BY 1
      ) dt1
JOIN demo_ret.v_ret_product p
ON dt1.prodno=p.prodno
ORDER BY basket_count DESC;

You can try any uppercase string in the LIKE predicate. As an alternative to LIKE, try MATCHING.

For larger data sets you’ll usually achieve better performance by using derived tables because they allow you to avoid duplication of potentially long strings.

Best practice is to use a derived table to obtain results first, then join them to lookup views.

8. Try a WITH clause as an alternative SQL syntax.

This query can also be written using the WITH clause:

WITH dt1 AS (SELECT prodno,
                    COUNT(DISTINCT basketno) as basket_count
             FROM demo_ret.v_ret_sale
             WHERE prodno IN (SELECT prodno
                              FROM demo_ret.v_ret_product
                              WHERE UPPER(product_name) LIKE '%SHARWOOD%')
      AND saledate BETWEEN DATE '2015-01-01' AND DATE '2015-03-31'
      GROUP BY 1)
SELECT p.product_name,
       dt1.basket_count
FROM dt1
JOIN demo_ret.v_ret_product p
ON dt1.prodno=p.prodno
ORDER BY basket_count DESC;

This is a simple illustration of the WITH clause. It is particularly useful if the same derived table is used multiple times in different sections of the main SELECT statement and the syntax is complex.

The code in step 7 is equivalent to the code in step 8 and both will produce the same results. Usually, but not always, Kognitio will process these kind of queries in the same way internally.

9. Try a real-world example.

This final query demonstrates the kind of commercially significant insight we can extract from the retail data. It performs basket analysis to discover which products were purchased in the same baskets as a selected brand:

SELECT ps.section_name,
       p.product_name,
       p.prodno,
       COUNT(DISTINCT s2.basketno) num_baskets,
       COUNT(s2.prodno) items_sold,
       SUM(s2.price)/100.00 value
FROM (SELECT basketno,
             saledate
      FROM demo_ret.v_ret_sale
      WHERE prodno IN (SELECT prodno
                       FROM demo_ret.v_ret_product
                       WHERE UPPER(product_name) LIKE '%SHARWOOD%')
      AND saledate BETWEEN DATE '2015-01-01' AND DATE '2015-03-31'
      GROUP BY 1,2
      ) s1
JOIN demo_ret.v_ret_sale s2
ON s1.basketno = s2.basketno
JOIN demo_ret.v_ret_product p
ON s2.prodno = p.prodno
JOIN demo_ret.v_ret_prod_section ps
ON p.section_no = ps.section_no
WHERE s2.prodno not in (SELECT prodno
                       FROM demo_ret.v_ret_product
                       WHERE UPPER(product_name) LIKE '%SHARWOOD%')
AND s2.saledate BETWEEN DATE '2015-01-01' AND DATE '2015-03-31'
GROUP BY 1,2,3
ORDER BY 4 DESC;

You define the brand you want to study in the LIKE clause in the first FROM statement. In this case we’re looking for any products that match LIKE ‘%SHARWOOD%’.

10. Learn more.

Kognitio supports a very wide range of SQL functionality including many compatible functions from other databases such as Oracle and MS SQL Server. For more information and examples check out our SQL reference guide.

Prev: Creating views and images