wxloader - Kognitio bulk data loading tool

SYNOPSIS

wxloader -s server -u user -t table [options] files…

DESCRIPTION

wxloader is Kognitio’s bulk data loading tool. It sends formatted text files to the server for fast conversion and insertion into a table. Unlike wximport, the server rather than the client performs the conversion to Kognitio’s native binary format. The server is instructed how to convert a record by the format string, which resembles that used by scanf(3) and is normally generated automatically by wxloader.

In its simplest use case, wxloader expects an ordinary CSV file which it loads into the specified table. It generates the format string automatically using the table’s column types.

If conversion options are given, wxloader modifies the generated format string accordingly. If necessary, the user can supply their own custom format string, which allows loading of arbitrary flat file formats.

OPTIONS

At least one file name must be given. If a file name is "-", wxloader reads from standard input.

If multiple file names are given, wxloader loads them all, sequentially, into the destination table. If any of the individual loads fail, the transaction in which that load took place is rolled back, and no further loads are attempted.

-s server
The DSN or host name of the Kognitio server. This option is required.
-u user
The user name with which to log in to Kognitio. This option is required.
-p password
The password to use. If no password is specified, the ODBC driver will attempt to authenticate using your SSH key. If that fails, wxloader will prompt for a password when run from a terminal; otherwise it will fail.
-t table
The name of the table into which to import the data. This is required, and should be in the form TABLE or SCHEMA.TABLE. Normally the argument is automatically upcased, but if wxloader sees double quotes in the argument, anything within double quotes will not be upcased. For example, -t sales."Parts" refers to the table Parts (not PARTS) in the schema SALES.
-d, --truncate-table
Truncate the table before loading into it. By default, imported records are appended to the table's existing contents.
--no-statistics
Don’t run an UPDATE STATISTICS on the table after the load.
--max-conversion-errors n
Fail when more than n records in one input file are rejected due to conversion errors. By default this is 0, which means if any record is rejected the load is aborted and wxloader fails. N.B. This feature is intended to allow occasional bad records to be rejected. It is not recommended to set this to a very high number in an attempt to cope with files where most of the records have errors. Doing this will probably result in a near-total loss of load performance; delivery of errors from the server to the client is not a fast process. If your data file appears to have hundreds of thousands of malformed records, it is better to fix the input file or check the formatting options passed to wxloader.
--use-checksums
Calculate an XOR checksum on the data in every block for the server to check. This can be useful for debugging if wxloader is ever suspected of sending corrupt blocks to the server, but normally this feature is left unused to improve performance.
--no-checksums
Don’t use checksums on the data blocks. This is the default.
--block-size bytes
Set the size, in bytes, of the blocks wxloader sends to the server. This much data will be sent before a reply is expected from the server. Higher values can improve throughput on high-latency connections, but on pre-8.0 servers, too many concurrent loads with large block sizes can fail if the server cannot allocate a contiguous block of the required size. These failures can manifest themselves as “08S01 Communications link failure” messages, while other connections to the server are unaffected. Lower values are recommended on pre-8.0 servers if many concurrent loads are running, particularly on a low-latency connection to the server. Suffix letters can be used, e.g. 128K and 5M. The default is 20M. If you experience unexplained connection failures when running many loads concurrently to a pre-8.0 server, try a lower value such as 128K.
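For example, to reduce the block size for a load against a pre-8.0 server (mysvr, joe, PART and parts.csv are placeholder names):
  wxloader -s mysvr -u joe --block-size 128K -t part parts.csv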
-j coldef1,coldef2,…,coldefN
Tell wxloader that the input is JSON. The argument is a column definition list, specifying for each table column which JSON object member should be written to it. A simple column definition list is a comma-separated list of JSON object members (also known as properties) in the order the columns appear in the table. See the JSON section of this man page, or see the wxjson2csv man page, for more information.
-J filename
As -j, but the argument is a filename which contains the JSON column definition list.
--avro
Tell wxloader that the input is in the Avro file format. In this case the format string, specified with -F, must be a comma-separated list of Avro fullnames matching the number of columns in the table. This tells the server which Avro value from the schema is to go in which destination column. If no format string is given, one is automatically generated which includes all the scalar fields defined in the Avro schema, in order. In this case the columns in the destination table must be compatible with the fields of the Avro record definition, and in the same order. If a JSON format string is given with -j or -J, each Avro record is converted to a JSON object on the server, and from then on treated as if it were a JSON object. In this case, the names listed in the format string refer to an object’s position in the record hierarchy, and the Avro fullnames are disregarded.
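For example, assuming a destination table whose columns are compatible with the Avro record's scalar fields, in order, a simple Avro load might look like this (events.avro and the EVENTS table are hypothetical names):
  wxloader -s mysvr -u joe --avro -t events events.avro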
-w
Input is WCB. This is assumed if the filename ends with .wcb.

FIELD-LEVEL CONVERSION OPTIONS

Options which take a char argument can take a single character, a decimal number, an octal number if preceded with 0, a hexadecimal number if preceded with 0x, or any of the escape sequences \n (newline), \t (tab) or \r (carriage return).
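
For example, the following all specify a tab character (ASCII 9) as the field separator (see -f below):

  -f '\t'
  -f 9
  -f 011
  -f 0x09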

These conversion options do not apply to JSON files.

For each option, where applicable, a short explanation is given of how the option affects the format string generated by wxloader.

--allow-bool
If fields corresponding to an integer column contain TRUE or FALSE, accept them as 1 or 0 respectively.
--allow-tz-offset
Allow TIME and TIMESTAMP fields to have an optional time zone offset indicator after them in the form +HHNN or -HHNN, with an optional colon (:) between the hours and minutes. This offset may be up to 24 hours in either direction. The value loaded into the column is what the value would be if the offset were zero. For example, if a timestamp field contained 2017-12-07 03:42:15 -0800, the value loaded into the column would be 2017-12-07 11:42:15. This option has the effect of adding a Z onto the end of the time and timestamp format strings for each %T and %S conversion, if the time and timestamp format strings do not already contain a Z. See the DATE/TIME FORMAT section for further information on time and timestamp formats.
--blank-null
Blank strings should always be loaded as NULL, even if quoted. This option sets the blanknull attribute on all directives in the format string.
--char-for-unknown char
Characters in the input file which cannot be expressed in their corresponding column’s character set should be represented in the table by this character without generating an error. The default behaviour if this option is not given is to reject the record if this situation arises. This option sets the unknownchar attribute on all directives in the format string.
--currency-symbol char
Specify a currency symbol, which will be ignored if it appears before or after any numeric field. For example, if the currency symbol is ‘$’, the expressions -1.23, $-1.23, -$1.23 and -1.23$ are all valid. This option sets the currency attribute on all directives in the format string.

-D, --date-format format

-T, --time-format format

-S, --timestamp-format format
Specify how date, time and timestamp fields are formatted in the input file. See the DATE/TIME FORMAT section for further information. These options set the format attribute on %D, %T and %S format string directives as appropriate.
--decimal-point char
Decimal point character for DECIMAL fields. The default is a full stop. This option sets the dp attribute on all directives in the format string.

--exclude-fields 1,2,3,…

--exclude-fields first-last[/step],…
Give a comma-separated list of field numbers which are expected in the input file but which should not be imported into the table. The first field is field 1, so 1,2,3 excludes the first three fields. This option attaches the * (ignore) flag to the appropriate directives in the format string. Ranges of field numbers can be used; for example, 1-4,8-10 would exclude fields 1, 2, 3, 4, 8, 9 and 10. A range can optionally be followed by a slash and a positive number N to exclude every Nth field in the range. For example, 4-10/2,14-23/3 would exclude the even-numbered fields between 4 and 10 inclusive, and every third field between 14 and 23; that is, fields 4, 6, 8, 10, 14, 17, 20 and 23.
--discard-trailing-junk
If a quoted string in a CSV file has extra characters following it and before the next field separator or record terminator, silently skip these characters without an error.
-F, --field-format-file file
Specify the file name containing the wxloader format string to use. If this is given, all other field-level conversion options are ignored. If this is not given, wxloader assumes the input is CSV and uses the field-level conversion options and the table’s column types to generate a format string.
-f, --field-separator char
Specify the field separator character. The default is a comma. wxloader uses this option to determine which character to place as a literal between the directives in the format string.
--filename-basename
If --filename-column is given (see below) then strip any leading directory components from the path name of the input file before inserting it into the table row. For example, if the input file is /user/joe/myinputfile.csv and --filename-basename is used, the filename inserted into the appropriate column will be myinputfile.csv. This option does not affect which file is actually loaded.
--filename-column num
Column num in the destination table will consume no input in the data file, but it will have the name of the input file inserted into it. This will cause a conversion error if the numth column in the table is not a string column. The first column in the table is column 1.
--filename-override name
For the purposes of --filename-column, behave as if the input file name is name. If --filename-basename is given then anything that looks like a directory component on the start of name will be removed. This option does not affect which file is actually loaded.
--filename-null-value string
Indicate how a null value is specified in a directory component when using --filename-partitions (see below).
--filename-partitions name1,name2,…
Extract the values of these columns from directory components of the form name=value in the filename’s path, rather than expecting a field for it in the input file. Each name contains a string (the name part of name=value), optionally followed by a space and a number indicating which column number this value should go into. If name matches the desired column name in the table (case-insensitively), this number is optional. For example, if you have the table SALES, which has columns (PRODUCT_ID, COUNTRY, YEAR, QUANTITY), the data file contains two fields PRODUCT_ID and QUANTITY, and the filename is sales/country=UK/year=2018/data.csv, you might specify --filename-partitions “country 2, year 3”. (In this case “country,year” would also work because the column names are the same as the partition names in the path.) This tells wxloader, “the value for the second column in the table isn’t in the data file, instead look for a directory component called country=… and use that; also the value for the third column in the table is coming from a directory component called year=…”.
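Continuing the example above, the load might be invoked as follows (mysvr and joe are placeholder connection details):
  wxloader -s mysvr -u joe --filename-partitions "country 2, year 3" -t sales sales/country=UK/year=2018/data.csv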
--implied-dp
DECIMAL fields do not have explicit decimal points. Instead, for a DECIMAL field an integer is to be read from the file and divided by 10^s, where s is the scale of the destination column. For example, the field 12345 loaded into a DECIMAL(10,2) column is stored as 123.45. This option sets the implieddp attribute on all %f directives in the format string.
--int-allow-decimals
Allow non-integer decimals (e.g. “14.782”) in fields corresponding to integer columns. This has the effect of generating %f instead of %d for the format string for such fields. The value will be rounded to the nearest integer. If two integers are equally close the even one will be chosen.
--load-no-nulls
If set, blank fields corresponding to string columns are imported as empty strings, and blank fields corresponding to any kind of number are imported as zero. Blank fields corresponding to DATE, TIME and TIMESTAMP fields are still imported as NULL. This option sets the notnull attribute on all directives in the format string.
--long-field-separator str
Specify a field-separator string. Use this rather than -f to load a data file where the field delimiter is more than one character long.
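For example, to load a file whose fields are delimited by the two-character sequence || (a hypothetical delimiter):
  wxloader -s mysvr -u joe --long-field-separator '||' -t part parts.txt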
--min-year year
Two-digit years in DATE and TIMESTAMP fields should be assumed to be no earlier than year. This option sets the minyear attribute on all format string directives which correspond to a DATE or TIMESTAMP column.
-N, --naive-delimiting
Strings are not quoted, and quote characters are not to be treated differently from any other character. This option sets the unquoted attribute on all directives in the format string.
--null-value str
If a field matches str, it is to be imported as NULL. By default, blank fields are imported as NULL unless the column is not nullable. Setting this option does not change that behaviour - it provides another way for the file to express NULL. To ensure blank fields do not go in as NULL, see the --blank-null and --load-no-nulls options or the defaultval format string attribute in wxloader-format. This option sets the nullvalue attribute on all directives in the format string.
--oracompat-ts-format
If set, the date, time and timestamp format strings supplied to -D, -T and -S are expected to use the syntax understood by the TO_DATE(), TO_TIME() and TO_TIMESTAMP() SQL functions provided for Oracle compatibility, rather than the syntax detailed in the DATE/TIME FORMAT section below. For details on the syntax of this format string, consult the Kognitio SQL Guide. Note that when using this feature, the --min-year option has no effect.
--pad-short-records
If fewer fields exist in a record than expected, use NULL for the missing fields. This option sets the defaultval attribute on all directives in the format string, makes all the field separators optional and causes strings to stop matching on a record terminator as well as a field separator.
--percent-encode
Tell wxloader that fields are percent-encoded and not quoted. Any percent symbol is expected to be followed by two hex digits representing a character code. For example, %2C would be a comma.
--preserve-leading-spaces
Don’t skip spaces at the start of an unquoted field. This option causes wxloader not to put a space before every format string directive which corresponds to a column, as it would otherwise.
--quote char
Specify the character used to quote strings. The default is a double quote character (U+0022, "). This option sets the quotechar attribute on every directive in the format string.
--record-number-column num
Column num in the destination table will consume no input in the data file, but it will have the record number inserted into it. This will cause a conversion error if the numth column in the table cannot take a number. The first record is record 1.
--separator-terminated-values
Expect an additional field separator after the last field. This option adds an extra field separator literal at the end of the format string.
--thousand-separator char
Thousand separator for numeric fields. If this character appears in a numeric field it is ignored, but no checks are made that the separators appear in the right places. So if the thousand separator is ‘,’ then “1234”, “1,234”, “12,34” and even “1,,,234” are all valid fields and represent the number 1234. This option sets the thousep attribute on all directives in the format string.
--trim-varchars
If a field is to be loaded into a VARCHAR column, remove any trailing spaces first. This option sets the trim attribute on all directives in the format string which correspond to a VARCHAR column.
--truncate-long-records
If more fields exist in a record than expected, ignore the excess fields. This option adds a %*0c directive to the end of the format string (consume all remaining characters in the record and ignore).
--truncate-strings
Silently truncate strings that are too long to fit into their corresponding column. This option sets the truncate attribute on all directives in the format string which correspond to a CHAR or VARCHAR column.

RECORD-LEVEL CONVERSION OPTIONS

-A jsonarray
The target JSON array which contains the records to load. When loading JSON, you must specify this, or specify the target array in square brackets at the start of the column definition list supplied to -j or -J, unless your JSON input consists only of an unnamed array or of a series of JSON objects rather than one root object.
--one-record-per-line
This option is only applicable to JSON. If the JSON input consists of exactly one JSON object per line, with no target array, this option can be used to enable an optimisation in the server which causes the file to be split up into records more efficiently, improving load performance. If you use this option and the JSON input does not have exactly one record per line, or if any JSON structure exists outside the list of records, you will get conversion errors.
--record-terminator char
Specify the record terminator character. This will almost always be a newline (also known as LF, ‘\n’, U+000A), as it is by default. If records are terminated with a CRLF sequence (‘\r\n’), as with files which originate on Windows, it is still sufficient to leave this as the default.
--record-type type
Specify the type of record the server should expect for the purpose of splitting up the file into records. By default, auto is used, which tells the server to infer the record type from the format string. If you’ve specified your own format string, there may be some cases when the server can’t determine the record type in this way (normally resulting in an IO001A error, “unsupported data format”), or tries to divide the file into records in an unexpected way. If this happens, you can tell the server how to identify the record boundaries with this option. Acceptable arguments to --record-type are as follows:

auto
the server should determine the record type for itself. This is the default.
term
records are terminated by a single character (the record terminator), which never appears anywhere else in the record, not even in quoted strings.
qterm
records are terminated by a single character (the record terminator), but this character can also appear in quoted strings without terminating the record.
fixed
every record has a fixed length. The record length, in characters, must be specified using the --record-width option.
--record-width size
For fixed-width records, specify the size of a record, in characters, if the server could not infer it itself using the format string. This option is only useful in conjunction with --record-type fixed.
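As an illustration, a fixed-width file with 80-character records might be loaded with a custom format string like this (part.fmt, parts.dat and the record width are hypothetical):
  wxloader -s mysvr -u joe -F part.fmt --record-type fixed --record-width 80 -t part parts.dat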

FILE-LEVEL CONVERSION OPTIONS

-c, --character-set charset
Character set of the input file. By default it is assumed to match the character set specified in the locale. The character set must match one the server knows about. The character sets known by the server are listed in the IPE_CHARACTER_SET system table. Additionally, the character set specified may end with the string “BE” or “LE” to specify big- or little-endianness respectively. Any hyphens are removed from the character set name prior to comparison with character set names known by the server.
--file-start-offset n
Skip n bytes at the start of each file. N.B. The reported file offsets in any conversion errors will be reckoned from this point, not from the start of the file.
--max-records n
Load no more than n records from each file.
--skip-records n
Skip n records at the start of each file. Skipped records are not loaded; they are not even parsed by the format string.
--pad-incomplete-char
If the file ends with an incomplete character and the file’s character encoding is a fixed-width encoding of more than one byte per character, this option tells the server to pad it with NUL bytes as necessary. Normally, every block sent to the server must contain a whole number of characters - that is, the block is not allowed to end with part of a character. wxloader always ensures that blocks sent to the server are a multiple of the character size, but if the last character in the file is incomplete, and the character encoding is little-endian, and you know the last character has only had NUL bytes stripped from it, this option will allow you to load the file.

REPORTING OPTIONS

-b, --bad-records-file file
Destination file for rejected records. If this option is not given, rejected records are discarded.
--log-file file
Output commentary to this file, not to standard output.
--no-progress
Do not show a progress indicator. This is the default if -q is given or if /dev/tty (or standard output on Windows) does not exist or is not a terminal to which wxloader can write.
-P, --show-progress
Show a progress indicator on the terminal. The progress indicator is written to /dev/tty on environments which have it, and to standard output on Windows. By default, a progress indicator is shown anyway if /dev/tty (or standard output if applicable) exists, is a terminal, and is writable, and -q is not given.
--progress-interval-bytes n
Update the progress indicator no more often than every n bytes.
--progress-interval-records n
Update the progress indicator no more often than every n records.
-q, --quiet
Less commentary. Repeat for no commentary at all.
-r, --extra-summary-info
Display the number of records read in the summary output at the end of the load, which may differ from the number of records imported.
--show-timings
Show timing information in the progress indicator and after the load.
-v, --verbose
More commentary. Repeat for even more commentary.

DATE/TIME FORMAT

The arguments to -D, -T and -S are date/time format strings, whose syntax is described below.

Date/time format specifiers

Format  Meaning
YY      year without the century
YYYY    year with the century
M       month number, one or two characters
MM      month number, two characters
MMM     month name as a three-letter abbreviation
D       day of month, one or two characters
DD      day of month, two characters
DDD     day of the year (001-366)
H       hour, one or two characters
HH      hour, two characters
N       minute, one or two characters
NN      minute, two characters
S       second, one or two characters
SS      second, two characters
I       hour, minute and second run together as an integer
T       subsecond; the number of Ts is the number of characters
P       A or P
PP      AM or PM
Z       optional time zone offset (see below)
!       next character is a literal (e.g. !M matches a literal M)
?       skip this character

Any other characters are treated as literals. DD, MM, YY, HH, NN and SS consume exactly two characters, one of which may be a space and at least one of which must be a digit (but see the note on SS below). D, M, H, N and S initially consume one character; if this is not a digit it is a matching failure, otherwise a second character will be consumed if and only if that is also a digit. DDD and MMM consume exactly three characters. YYYY consumes exactly four characters. In the case of DDD, MMM and YYYY, as with the two-character conversions, leading or trailing spaces are permitted but there must be at least one digit in the characters consumed.

The I format letter consumes at least as many characters as the number of Is that appear. A maximum of six characters are read. This is used to read times which have no separators or leading zeroes; for example, 12345 is read as 01:23:45.

Date/time format letters are case-insensitive. The strings they match are also matched case-insensitively, so MMM in the format will match Jan or JAN equally, and PP will match AM or am equally.

If S or SS is not followed by a dot in the format string, then as well as consuming one or two seconds digits it will also consume a dot and any number of subsecond digits, if present. So the format string “HH:NN:SS” will match “12:34:56”, “12:34:56.7”, “12:34:56.78”, and even “12:34:56.7890123456”, although in the latter case note that all the decimal places after the sixth will be ignored, and in any case Kognitio’s TIMESTAMP is only accurate to two decimal places.

If P or PP is given in the format string, the time is assumed to use the 12-hour clock; otherwise it is assumed to use the 24-hour clock.

Z matches a time zone offset indicator in the following format: any number of spaces (including none), followed by a plus or minus sign, followed by two digits (the hours), then an optional colon, then two more digits (the minutes). If the offset indicator is not present at all (that is, no plus or minus sign is found) then this is not an error - the offset is taken as zero. It is an error if the sign is found but there aren’t four digits after it, or if the offset is greater than 24 hours in either direction. The value loaded is what the value would be if the offset were zero, so an offset of -0800 will cause 8 hours to be added to the time or timestamp value before loading it into the column.

The default date, time and timestamp formats are the same as their default representations in SQL; that is, “YYYY-MM-DD”, “HH:NN:SS” and “YYYY-MM-DD HH:NN:SS” respectively. The default time and timestamp formats have an additional Z appended to them if --allow-tz-offset is set.
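
For example, to load a file in which dates appear as 07/12/2017 and timestamps as 07/12/2017 03:42:15 (connection and table names are placeholders):

  wxloader -s mysvr -u joe -D "DD/MM/YYYY" -S "DD/MM/YYYY HH:NN:SS" -t part parts.csv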

FORMAT STRING

N.B. This does not apply to JSON or Avro files. See the JSON section.

wxloader uses the field-level conversion options described above to generate a format string which it passes to the server. The server uses this format string to decide where records start and end (unless a different method was requested with --record-type) and to parse the fields within the record. If wxloader is run with -v, it shows the format string that has been generated. It resembles, but is not the same as, a format string that might be passed to the C function scanf(3). A format string to parse a file whose records consist of two integers and two strings might look like this:

%d , %d , %[^,] , %[^,\n]

The user should not have to worry about the format string for CSV files in most cases, as it is generated automatically by wxloader. However, if a file is formatted in such a way that wxloader’s command line options do not cater for it, a custom format string may be supplied in a file whose name is given to the -F option. If a file is in a fixed-width format, the user must supply a format string in this way. For more information on the wxloader format string and some examples, see the wxloader-format man page.

JSON

wxloader can load JSON files into tables, but the user must tell wxloader how each JSON object is to be converted into a table row, in the argument to -j.

This is best explained with an example. Suppose you have a JSON file of country details that looks like this:

{
  "lastupdated" : "2014-09-29",
  "author" : "John Smith",
  "countries" : [
      {
        "name" : "United Kingdom",
        "population" : 70000000,
        "capital" : {
            "name": "London", "lat": 51.5072, "long": -0.1275
        }
      },
      {
        "name" : "United States of America",
        "population" : 300000000,
        "capital" : {
            "name": "Washington DC", "lat": 38.8951, "long": -77.0367
        }
      },
      {
        "name" : "Australia",
        "population" : 23000000,
        "capital" : {
            "name": "Canberra", "lat": -35.3075, "long": 149.1244
        }
      }
  ]
}

You want to load the data for each country into a table defined like this:

create table countries (
        country_name varchar(50),
        capital_city varchar(50),
        capital_latitude decimal(6, 4),
        capital_longitude decimal(7, 4),
        population bigint
)

wxloader needs to know firstly which array of objects in the JSON input it should be looking at (in this case, the “countries” array, whose name is passed to -A) and secondly what to put in which table column (specified with -j).

In this case the argument to the -A option should be countries, and the argument to the -j option should list for each column the JSON property which should be converted into that column, as follows:

name, capital.name, capital.lat, capital.long, population
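
Putting this together, a complete invocation might look like this (mysvr, joe and countries.json are placeholder names):

  wxloader -s mysvr -u joe -A countries -j "name, capital.name, capital.lat, capital.long, population" -t countries countries.json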

If the whole JSON file is an unnamed array, or is a series of JSON objects rather than an array, specify a blank argument to -A.

Nested arrays

Objects in JSON files are often denormalised, in that an object may contain lists of values as well as scalars. For example, consider the following JSON file describing students and modules:

{
    "Students": [
        {
            "StudentId" : 100214,
            "StudentName" : "John Smith",
            "Modules" : [
                { "Id": "CS203", "Name": "Language Design" },
                { "Id": "CS236", "Name": "Data Structures" }
            ]
        },
        {
            "StudentId" : 100215,
            "StudentName" : "Jill Watson",
            "Modules" : [
                { "Id": "CS301", "Name": "Algorithmic Complexity" },
                { "Id": "CS319", "Name": "Database Theory" },
                { "Id": "CS331", "Name": "Neural Computing" }
            ]
        }
    ]
}

Suppose you want to load the relationships between students and modules into the two-column table STUDENT_MODULE, defined as follows:

create table student_module (
    student_id int,
    module_id char(5)
)

The argument to the -A option would be Students, because that’s the array containing the data we’re interested in. The appropriate column definition, to be passed to -j, would be:

StudentId, Modules[].Id

The [] operator tells the server that it must insert a row for every element in the JSON array “Modules”. The table would look like this after loading:

STUDENT_ID   MODULE_ID
----------------------
100214       CS203
100214       CS236
100215       CS301
100215       CS319
100215       CS331
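
A complete invocation for this load might look like this (students.json is a hypothetical file name):

  wxloader -s mysvr -u joe -A Students -j "StudentId, Modules[].Id" -t student_module students.json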

Further information

The JSON column definition syntax has additional features for more complicated JSON data. It is the same format used by wxjson2csv. More information on the column definition format can be found in the COLUMN DEFINITION FORMAT section of the wxjson2csv man page.

ERRORS

Import conversion errors begin with the prefix “IE”. By default, if any record is rejected due to a conversion error the whole load is aborted and the transaction is rolled back. This behaviour can be changed with the --max-conversion-errors option to tolerate a given number of rejected records. If the number of rejected records for a single file is less than or equal to this number, the load for that file is considered to have succeeded.

Regardless of the maximum number of conversion errors allowed, any error which does not have an “IE” prefix is assumed to be an unrecoverable error and causes the load to fail.

Rejected records are discarded by default. To write rejected records to a file, use the -b option with a suitable destination file name.

If the load succeeds, the transaction is committed. If wxloader’s attempt to UPDATE STATISTICS on the table fails, the table will still have been loaded, but an error message is shown explaining that the statistics update failed and wxloader will exit with a failure status.

LIMITATIONS

There are some file formats which the wxloader format string cannot deal with. Limitations of the format string are discussed in the wxloader-format man page.

The user can specify whether quotes are used to delimit strings or are to be treated as any other character (see -N), but this setting applies to every field in the file. Attempting to set this on a per-field basis, even in the format string using the - (unquoted) flag, will cause problems as the file will be split up into records incorrectly.

wxloader aims eventually to replace wximport for all purposes, but there is still functionality only provided by wximport. Currently there is no means of performing a checkpointed import, whereby if the load is interrupted it can be resumed without starting from the beginning.

wxloader cannot guess the column types of a CSV file and create the table automatically, like wximport can with the -ct option. Use wxcsvprofile to generate a CREATE TABLE statement from a CSV file.

EXAMPLES

The following examples will prompt for a password unless key authentication is available. To supply the password on the command line, add -p password. All the examples assume the existence of the DSN or hostname mysvr and the Kognitio user joe.

Load the CSV file parts.csv into the PART table:

  wxloader -s mysvr -u joe -t part parts.csv

Truncate the PART table, then load the CSV file parts.csv into it:

  wxloader -s mysvr -u joe -d -t part parts.csv

Load many files into the same table:

  wxloader -s mysvr -u joe -t part parts1.csv parts2.csv parts3.csv

Load the tab-separated file suppliers.txt into the SUPPLIER table:

  wxloader -s mysvr -u joe -f "\t" -t supplier suppliers.txt

Load the tab-separated file suppliers.txt into the SUPPLIER table, where fields are not quoted and quote characters should not be treated specially:

  wxloader -s mysvr -u joe -f '\t' -N -t supplier suppliers.txt

Load the CSV file parts.csv into the PART table, skipping the first record, tolerating up to 10 malformed records, writing all malformed records to a separate file:

  wxloader -s mysvr -u joe --skip-records=1 -b parts_rejected.csv --max-conversion-errors=10 -t part parts.csv

Load a data file which uses the LATIN-1 encoding:

  wxloader -s mysvr -u joe -c LATIN1 -t part parts.csv

Write information to a log file rather than stdout, and disable the on-screen progress indicator:

  wxloader -s mysvr -u joe --log-file parts.log --no-progress -t part parts.csv

Load the CSV file parts.csv into the PART table, ignoring the first three fields in every record:

  wxloader -s mysvr -u joe --exclude-fields=1,2,3 -t part parts.csv

EXIT STATUS

wxloader returns 0 on success, and 1 on failure.

AUTHOR

Kognitio Ltd.