Disk Slabs

Kognitio logically divides available disk resource into individually addressable areas called ‘slabs’. Each slab comprises a percentage of available disk resource. The benefits of slabs are:

  • Data for individual schemas/tables can be assigned to specific slabs, thus isolating issues such as exhausting disk capacity to those slabs. Reclaim and repack operations can then be performed on a per-slab basis to free up space

  • Data is less interleaved on the disk, resulting in lower I/O overheads during data retrieval. In particular, logging tables can have a dedicated slab rather than having their data interleaved with user data across the whole disk

The following example is a possible slab layout for a system with 3200GB total disk space:

slab

disk space (GB)

purpose

1

37.1

This slab stores data for the system tables in the sys schema. Examples include ipe_alltable (contains data for every table and view in the system), ipe_allcolumn (contains data for every column of every table/view) and others

2

37.1

This is the logging slab. It stores log entries for historical queries, transactions, logins/logouts and errors

3

390.4

Slab for user data

4

390.4

Slab for user data

5

390.4

Slab for user data

10

390.4

Slab for user data

Slabs 1 and 2 are always reserved for system usage, and are much smaller than other slabs. Slabs 3 and above are available for user data, and are equally sized (with the exception of the last slab which may be smaller, as it uses the remainder of available disk). The number of slabs can be configured before commissioning a system with the desired_slabs parameter. Note that changing the number of slabs involves wiping all data from the system. The maximum size of each slab is 16GB.

Slab assignments

A slab assignment is a list of slabs applied to an object, such that any new rows for the object will be written to one of those slabs. All slab assignments are recorded in the table SYS.IPE_SLABS. Slabs can be assigned at three levels:

  • system: The system default slab list defines which slabs new rows will be written to, for any object which has no schema or table assignments. You can view the current setting with SELECT DISTINCT(slab_id) FROM sys.ipe_slabs WHERE schema_id=-1 AND table_id=-1;, and change it with ALTER SYSTEM SET SLABS TO {ALL | slab-list} [MIGRATE [DEFRAG]]

  • schema: A schema can be assigned to a list of slabs. Thereafter, new rows for any table in the schema will be written to these slabs, unless the table has it’s own table-level assignment. A schema’s current slab assignments can be seen by running SELECT DISTINCT(slab_id) FROM sys.ipe_slabs WHERE schema_id IN (SELECT id FROM ipe_allschema WHERE name ='schema-name') AND table_id=-1, and assignments can be changed with ALTER SCHEMA schema-name SET SLABS TO {ALL | SYSTEM DEFAULT | slab-list} [MIGRATE [DEFRAG]]

  • table: If a table has an assigned slab list, any new rows added to the table will be sent to one of those slabs. A table’s current assignments can be seen by running SELECT DISTINCT(slab_id) FROM sys.ipe_slabs WHERE table_id=table_id(schema-name.table-name), and assignments can be changed with ALTER TABLE table-name SET SLABS TO slab-list | SCHEMA DEFAULT | ALL [MIGRATE [DEFRAG]]

Whenever a new slab assignment is set, the assignment only applies to new rows written after that point. If you want the new assignment to be applied to existing rows as well, you must specify MIGRATE. For example ALTER TABLE s1.t1 SET SLABS TO 3,4,5 MIGRATE. This will take any existing rows for the table s1.t1 that are not on slabs 3,4 or 5, write them to one of those slabs, and mark the old rows as deleted. MIGRATE can be specified at the system, schema or table level.

If MIGRATE is specified, you may optionally specify DEFRAG as well. This will ensure that rows already in one of the slabs in the specified slab-list are also moved. This can be used to defragment a table if required.

Overlapping Assignments

It is good practice to avoid overlapping slab assignments, as they can interfere with the recovery of disk space. An overlap is when one assignment has some - but not all - slabs in common with another. For example, suppose table s1.t1 is assigned to slabs 3,4 and 5, while the schema s2 is assigned to slabs 5 and 6. This constitutes an overlap. The correct approach would be to either separate s2 (e.g. assign it to slabs 6 and 7) or assign it to slabs 3,4 and 5.

Note that overlaps are not prevented by the software in case people in an emergency want to allocate an existing data set to an extra slab, with the intention of quickly resolving the space issue and reverting to the original slab set as soon as possible.

To check a system for overlapping assignments, run the following query:

select with slabagg (schema_id, table_id, nslabs, minslab, maxslab) as (
                        select schema_id, table_id, max(seq + 1) nslabs, min(slab_id) minslab, max(slab_id) maxslab from sys.ipe_allslabs group by 1,2),
                        slabline (schema_id, table_id, slab_id) as (
                        select schema_id, table_id, slab_id from sys.ipe_allslabs),
                        slabinfo (slab_id, schema_id, table_id, nslabs, minslab, maxslab) as (
                        select sl.slab_id, sl.schema_id, sl.table_id, sa.nslabs, sa.minslab, sa.maxslab
                        from slabline sl, slabagg sa where sl.schema_id = sa.schema_id and sl.table_id = sa.table_id
                )
                select s1.*, s2.schema_id schema2, s2.table_id table2, s2.nslabs nslabs2, s2.minslab minslab2, s2.maxslab maxslab2,
                case when s1.nslabs <> s2.nslabs then 'slab number mismatch' when s1.minslab <> s2.minslab then 'min slab mismatch'
                        when s1.maxslab <> s2.maxslab then 'max slab mismatch' end reason
                        from slabinfo s1, slabinfo s2
                        where s1.slab_id = s2.slab_id
                        and (s1.nslabs <> s2.nslabs or s1.minslab <> s2.minslab or s1.maxslab <> s2.maxslab)
                        and (s1.schema_id < s2.schema_id or (s1.schema_id = s2.schema_id and s1.table_id < s2.table_id))
                        union
                        select -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 'no default slabs set but slab allocations exist' from sys.ipe_system
                                where not exists (select * from sys.ipe_allslabs where schema_id = -1 and table_id = -1)
                        and exists (select * from sys.ipe_allslabs where slab_id <> -1)

The following is an example of the output from this query:

slab_id

schema_id

table_id

nslabs

minslab

maxslab

schema2

table2

nslabs2

minslab2

maxslab2

reason

4

-1

-1

2

4

5

-1

1099

1

4

4

slab number mismatch

5

-1

-1

2

4

5

2

-1

1

5

5

slab number mismatch

The first row indicates that table_id 1099 (in the table2 column) is assigned to slab 4 (see nslabs2, minslab2, maxslab2), while the system default assignment (due to schema_id and table_id both being -1) is slabs 4 and 5.

The second row indicates that schema_id 2 is assigned to slab 5, while the system default assignment is slabs 4 and 5.

In this case, one way to resolve the overlap would be to assign table_id 1099 to slab 3, and assign schema_id 2 to slabs 4 and 5.