sub-SELECT Clause

Overview

A sub-SELECT clause is simply a SELECT statement used to select values for comparison in a predicate. It lets you tie the result of one query to another. An EXISTS clause used in the inner SELECT tests to see if rows exist, but in all other forms specific columns are compared. In its simplest form the sub-SELECT compares a single column with another single column:

SELECT select-list
FROM ttt
WHERE ccc IN (SELECT cc FROM tt2)

A form such as:

SELECT select_list
FROM ttt
WHERE ccc IN (SELECT cc1, cc2
              FROM tt2)

Is clearly in error, as you can’t compare a single column to two. However, you can replace it with the following:

SELECT select_list
FROM ttt
WHERE (ccc1, ccc2) IN (SELECT cc1, cc2
                       FROM tt2)

And obtain a result that compares two columns. This construct is called ‘row value constructors’ (RVC).

Row value Constructors can also be used with [NOT] IN.

The select_list in subqueries follows the same syntax as select_list in main queries.

Sub-SELECTS can be used in WHERE and CASE clauses if they are non-correlated and return a single row, single column answer, for example:

SELECT a1,
       (SELECT MAX(b2) FROM b)
FROM a
WHERE (SELECT MAX(c1) FROM c) + a3 > 12;

SELECT *
FROM a
WHERE CASE WHEN a1 IN (SELECT b1 FROM b)
           THEN 'Good'
           ELSE 'Bad'
      END = 'Good'

By default Kognitio may perform an automatic DISTINCT on the subquery results; you can specify that Kognitio should not do this by using the ALL keyword as illustrated by the following example:

SELECT ...
FROM ttt
WHERE ccc IN (SELECT ALL cc FROM tt2)

The ALL keyword would typically be used when you know that the subquery is already distinct (or nearly distinct) and so any attempt to apply the DISTINCT could be expensive and not assist with the next step of query evaluation.

You can also ensure that the DISTINCT is performed by using the DISTINCT keyword in the subquery as illustrated by the following example:

SELECT ...
FROM ttt
WHERE ccc IN (SELECT DISTINCT cc FROM tt2)

Example 1: Comparing a Single Column to a Specific Value

Find the cheapest part in the partsupp table, and identify the supplier. (Here the comparison is between a single column that is compared to the minimum value for that column in the sub-SELECT clause, and results in a row for each part sold at the lowest supply cost.):

SELECT ps_partkey part,
       ps_suppkey supplier,
       ps_supplycost cost
FROM partsupp
WHERE ps_supplycost = (
    SELECT MIN(ps_supplycost)
    FROM partsupp
)

Example 2: Comparing Two Columns from One Table

You can also find the cheapest part for each supplier. This time, the result compares two columns (ps_suppkey and ps_supplycost), and returns the minimum value for each supplier:

SELECT ps_partkey part,
       ps_suppkey supplier,
       ps_supplycost cost
FROM partsupp
WHERE (ps_suppkey, ps_supplycost) IN (SELECT ps_suppkey,
                                             MIN(ps_supplycost)
                                      FROM partsupp
                                      GROUP By 1)
ORDER BY 1, 2

Example 3: Nested sub-SELECTs Comparing Values in Multiple Tables

This example uses sub-SELECT statements to obtain names of suppliers that supply pumps:

SELECT s_name
FROM supplier
WHERE s_suppkey IN (SELECT ps_suppkey
                    FROM partsupp
                    WHERE ps_partkey IN (SELECT p_partkey
                                         FROM part
                                         WHERE UPPER(p_name) LIKE '%PUMP%'))
ORDER BY 1

Example 4: Using Exists

This example gets the same result as Example 3, but this time the query uses EXISTS. Note the use of SELECT * in the sub-SELECT—this is only permissible where a query uses EXISTS or ALL:

SELECT s_name
FROM supplier
WHERE EXISTS (SELECT *
              FROM part, partsupp
              WHERE p_partkey = ps_partkey
              AND ps_suppkey = s_suppkey
              AND UPPER (p_name) LIKE '%PUMP%')