15.15. The CSV/TSV Import Formatters

Documentation

VoltDB Home » Documentation » Using VoltDB

15.15. The CSV/TSV Import Formatters

Besides the source of the data, which can be configured using properties in the <configuration> element of the configuration file, the import infrastructure uses formatters to interpret the incoming data and convert it for insertion into the database. If you use the CSV or TSV formatter, you can control how the data is interpreted by setting additional properties associated with those formatters. For example, the following configuration for the Kafka importer includes the formatter property blank specifying that blank entries should generate an error, rather than being interpreted as null or empty values:

<import>
  <configuration type="kafka" format="csv" enabled="true">
    <property name="brokers">kafkasvr:9092</property>
    <property name="topics">employees</property>
    <property name="procedure">EMPLOYEE.insert</property>
    <property name="blank">error</property>
  </configuration>
</import>

You include the formatter properties in the <configuration> element along with the import type properties. Table 15.9, “CSV and TSV Formatter Properties” lists the allowable properties for the CSV and TSV import formatters.

Table 15.9. CSV and TSV Formatter Properties

PropertyAllowable ValuesDescription
blankempty, error, nullSpecifies what to do with missing values in the input. If you specify empty, missing entries result in the corresponding "empty" value (that is, zero for INTEGER, a zero-length string for VARCHAR, and so on); if you specify error, missing entries generate an error, if you specify null, missing entries result in a null value. The default interpretation of missing values is null.
nowhitespacetrue, falseSpecifies whether the input can contain whitespace between data values and separators. If you specify true, any input lines containing whitespace will generate an error and not be inserted into the database. The default is false.
nullstringstring

Specifies a custom string to be interpreted as a null value. By default, the following entries are interpreted as null:

  • An empty entry

  • NULL (unquoted, uppercase)

  • \N (quoted or unquoted, either upper or lowercase)

If you specify a custom null string, it overrides all default null strings.

trimrawtexttrue, falseSpecifies whether any white space around unquoted string values is included in the string input or not. If you specify true, surrounding white space is dropped; if you specify false, surrounding white space between the string value and the separators is included in the input value. The default is true.