B.5. High Availability and Durability

The following tables describe the metrics related to Volt features that provide high availability and durability for the database, including snapshots, command logging, and Active(N) cross datacenter replication.

Table B.8. Snapshots

Metric | Type | Description
voltdb_snapshot_site_summary_info | Metadata

Informational metric. One for every snapshot file in the recent snapshots performed on the cluster. Tags:

  • SNAPSHOT_NONCE - The unique identifier for the snapshot.

  • TABLE_NAME - The name of the database table whose data the file contains.

  • SNAPSHOT_COLLECTION_ITERATIONS - .

  • SNAPSHOT_COLLECTION_TIME - The length of time (in seconds) it took to complete the snapshot.

  • SNAPSHOT_WRITE_TIME - The length of time (in seconds) it took to complete the write stage of the snapshot operation.

voltdb_snapshot_summary_info | Metadata

Informational metric. One for every snapshot file in the recent snapshots performed on the cluster. Tags:

  • SNAPSHOT_NONCE - The unique identifier for the snapshot.

  • SNAPSHOT_TXN_ID - The transaction ID of the snapshot.

  • SNAPSHOT_TYPE - String value indicating how the snapshot was initiated. Possible values are: "Auto" - an automated snapshot as defined by the configuration file; "Commandlog" - a command log snapshot; "Manual" - a manual snapshot initiated by a user.

  • SNAPSHOT_PATH - The directory path where the snapshot file resides.

  • SNAPSHOT_START_TIME - The timestamp when the snapshot began (in milliseconds).

  • SNAPSHOT_END_TIME - The timestamp when the snapshot was completed (in milliseconds).

  • SNAPSHOT_BYTES_WRITTEN - Total number of bytes written to the file so far.

  • SNAPSHOT_PROGRESS - For snapshots currently in progress, the percent complete at the time of the call (0-100).

  • SNAPSHOT_RESULT - Value indicating whether the writing of the snapshot file was successful ("Success") or not ("Failure").

voltdb_snapshot_table_summary_info | Metadata

Informational metric. One for every snapshot file in the recent snapshots performed on the cluster. Tags:

  • SNAPSHOT_NONCE - The unique identifier for the snapshot.

  • SNAPSHOT_TXN_ID - The transaction ID of the snapshot.

  • TABLE_NAME - The name of the database table whose data the file contains.

  • TABLE_FILENAME - The file name.

  • SNAPSHOT_BYTES_WRITTEN - The total size, in bytes, of the file.

  • SNAPSHOT_RESULT - Value indicating whether the writing of the snapshot file was successful ("Success") or not ("Failure").
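
The SNAPSHOT_START_TIME and SNAPSHOT_END_TIME tags on voltdb_snapshot_summary_info are both millisecond timestamps, so their difference gives the elapsed time of a snapshot. The following is a minimal Python sketch of that arithmetic. It assumes the tag values have already been scraped into a dictionary of strings and that each value parses as an integer. The function name and the way the tags are collected are hypothetical and not part of Volt.

```python
def snapshot_duration_seconds(tags: dict) -> float:
    """Elapsed snapshot time in seconds, derived from the millisecond tags.

    Assumption: `tags` maps tag names to string values, as they might
    appear after scraping voltdb_snapshot_summary_info.
    """
    start_ms = int(tags["SNAPSHOT_START_TIME"])
    end_ms = int(tags["SNAPSHOT_END_TIME"])
    # Both tags are documented as milliseconds, so convert to seconds.
    return (end_ms - start_ms) / 1000.0
```

A snapshot that is still in progress may not yet have a meaningful end time, so it is probably safest to apply this only to snapshots whose SNAPSHOT_RESULT is "Success".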


Table B.9. Command Logging

Metric | Type | Description
voltdb_commandlog_fsync_interval_seconds | Gauge

The average interval between the last 10 fsync system calls.

voltdb_commandlog_in_use_segment_count_total | Gauge

The total number of segment files in use for command logging.

voltdb_commandlog_outstanding_bytes_bytes | Gauge

The size, in bytes, of pending command log data. For synchronous logging, this value is always zero.

voltdb_commandlog_outstanding_txns_total | Gauge

The number of transactions that have been initiated for which the log has yet to be written to disk. For synchronous logging, this value is always zero.

voltdb_commandlog_segment_count_total | Gauge

The number of segment files allocated, including currently unused segments.
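
Because voltdb_commandlog_segment_count_total counts every allocated segment while voltdb_commandlog_in_use_segment_count_total counts only those in use, their ratio gives a rough picture of how much of the allocated command log space is occupied. The sketch below shows that calculation in Python. It assumes the two gauge values have already been read from the metrics endpoint. The function name is hypothetical, and any alerting threshold is a local choice rather than a Volt recommendation.

```python
def segment_utilization(in_use: float, allocated: float) -> float:
    """Fraction of allocated command log segments currently in use.

    in_use    -- value of voltdb_commandlog_in_use_segment_count_total
    allocated -- value of voltdb_commandlog_segment_count_total
    """
    if allocated <= 0:
        # No segments reported (for example, command logging is off).
        return 0.0
    return in_use / allocated
```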


Table B.10. Active(N) and XDCR

Metric | Type | Description
voltdb_dr_conflicts_count_total | Counter

The total number of conflicts that have been recorded for this table in this partition.

voltdb_dr_constraint_violation_count_total | Counter

The number of constraint violation conflicts that occurred.

voltdb_dr_consumer_info | Metadata

Tags:

  • DR_STATE - A text string indicating the current state of replication. Possible values are:

      • UNINITIALIZED - DR has not begun yet or has stopped.

      • INITIALIZE - DR is enabled and the consumer is attempting to contact the producer.

      • SYNC - DR has started and the consumer is synchronizing by receiving snapshots of existing data from the master.

      • RECEIVE - DR is underway and the consumer is receiving binary logs from the master.

      • DISABLE - DR has been canceled for some reason and the consumer is stopping DR.

voltdb_dr_consumer_bytes_replicated_bytes | Counter

Total number of bytes this consumer received.

voltdb_dr_consumer_partition_info | Metadata

Tags:

  • IS_COVERED - Boolean value indicating whether this partition is currently connected to and receiving data from a matching partition on the producer cluster.

  • COVERING_HOST - The host name of the server in the producer cluster that is providing DR data to this partition. If IS_COVERED is "false", this label is empty.

  • IS_PAUSED - Boolean value indicating whether this partition is paused. A partition "pauses" when the schema of the DR tables on the producer changes so that it no longer matches the consumer and all binary logs prior to the change have been processed.

  • CONSUMER_LIMIT_TYPE - The type of limit on the DR queue. The response is either BYTES or BUFFERS.

  • LAST_APPLIED_DR_PROTOCOL - The current DR protocol version of binary logs being received and applied for this partition.

voltdb_dr_consumer_partition_available_buffers_total | Gauge

The number of free buffers left in the DR queue.

voltdb_dr_consumer_partition_available_buffer_bytes_bytes | Gauge

The number of free bytes left in the DR queue.

voltdb_dr_consumer_partition_duplicate_buffers_total | Gauge

The number of repeated buffers received after the initial packets were dropped because the queue was full.

voltdb_dr_consumer_partition_ignored_buffers_total | Gauge

The number of buffers received but dropped because the queue was full.

voltdb_dr_consumer_partition_last_applied_timestamp_seconds | Gauge

The timestamp of the last transaction successfully applied to this partition on the consumer.

voltdb_dr_consumer_partition_last_received_timestamp_seconds | Gauge

The timestamp of the last transaction received from the producer.

voltdb_dr_consumer_remote_creation_timestamp_seconds | Gauge

The timestamp when the remote cluster started for the first time.

voltdb_dr_divergence_count_total | Counter

The number of conflicts that may have resulted in divergence between the clusters, which is a subset of the total conflicts.

voltdb_dr_last_conflict_timestamp_seconds | Gauge

The timestamp of the last conflict.

voltdb_dr_missing_row_count_total | Counter

The number of missing row conflicts that occurred.

voltdb_dr_producer_cluster_info | Metadata

Informational metric, presents cluster level metadata. Tags:

  • DR_STATE - The current state of the DR relationship. Possible values are the following: "Disabled", "Pending", "Active", "Stopped".

  • LAST_APPLIED_DR_PROTOCOL - The current DR protocol version of binary logs being received and applied for this partition.

  • SUPPORTED_DR_PROTOCOL - The highest version of DR protocol this cluster is capable of using to send data to consumers.

voltdb_dr_producer_node_info | Metadata

Informational metric, presents node level metadata. Tags:

  • DR_STATE - The current state of the DR relationship. Possible values are the following: "Disabled", "Pending", "Active", "Stopped".

  • DR_SYNC_SNAPSHOT_STATE - The current state of the synchronization snapshot that begins replication. During normal operation, this value is "None" indicating either that replication is not active or that transactions are actively being replicated. If a synchronization snapshot is in progress, this value provides additional information about the specific activity underway.

voltdb_dr_producer_node_remote_creation_timestamp_seconds | Gauge

The timestamp (in seconds) when the remote cluster started for the first time.

voltdb_dr_producer_node_rows_acked_for_sync_snapshot_total | Gauge

voltdb_dr_producer_node_rows_in_sync_snapshot_total | Gauge

voltdb_dr_producer_node_tasks_queue_depth_total | Gauge

The number of DR tasks waiting to be processed.

voltdb_dr_producer_partition_info | Metadata

Informational metric. Tags:

  • DR_STREAM_TYPE - The type of stream, which can either be "Transactions" or "Snapshot".

  • DR_LAST_QUEUED_ID - The ID of the last transaction queued for transmission to the consumer.

  • DR_LAST_ACK_ID - The ID of the last transaction acknowledged by the consumer.

  • DR_IS_SYNCED - Indicates whether the database is currently being replicated. If replication has not started, or the overflow capacity has been exceeded (that is, replication has failed), the value of DR_IS_SYNCED is "false". If replication is currently in progress, the value is "true".

  • DR_MODE - Indicates whether this particular partition is replicating data to the consumer ("Normal") or not ("Paused"). Only one copy of each logical partition actually sends data during replication. So for clusters with a K-safety value greater than zero, not all physical partitions will report "Normal" even when replication is in progress.

  • DR_CONNECTION_STATUS - Indicates whether the connection to the consumer is operational ("UP") or not ("DOWN").

  • CONSUMER_LIMIT_TYPE - The type of limit on the DR queue. The response is either BYTES or BUFFERS.

  • CURRENT_DR_PROTOCOL - The DR protocol version currently in use when sending data to consumers.

  • SUPPORTED_DR_PROTOCOL - The highest version of DR protocol this cluster is capable of using to send data to consumers.

voltdb_dr_producer_partition_available_to_send_buffers_total | Gauge

The number of buffers waiting to be sent to the consumer.

voltdb_dr_producer_partition_available_to_send_total_bytes | Gauge

The number of bytes waiting to be sent to the consumer.

voltdb_dr_producer_partition_buffers_waiting_for_ack_total | Gauge

The total number of buffers in this partition waiting for acknowledgement from the consumer.

voltdb_dr_producer_partition_last_ack_timestamp_seconds | Gauge

The timestamp of the last transaction acknowledged by the consumer.

voltdb_dr_producer_partition_last_queued_timestamp_seconds | Gauge

The timestamp of the last transaction queued for transmission to the consumer.

voltdb_dr_producer_partition_queued_in_memory_total_bytes | Gauge

The total number of bytes of queued data currently held in memory. If the amount of total bytes is larger than the amount in memory, the remainder is kept in overflow storage on disk.

voltdb_dr_producer_partition_queued_total_bytes | Gauge

The total number of bytes currently queued for transmission to the consumer.

voltdb_dr_producer_partition_queue_gap_total | Gauge

The number of missing transactions between those already acknowledged by the consumer and the next available for transmission. Under normal operating conditions, this value is zero.

voltdb_dr_producer_partition_round_trip_time_seconds | Histogram

The distribution of time it took to receive acknowledgement from the consumer.

voltdb_dr_role_info | Metadata

Informational metric. Tags:

  • DR_ROLE_NAME - The role of the current cluster in a DR relationship. Possible values are NONE, MASTER, REPLICA, and XDCR.

  • DR_STATE - The current state of the DR relationship. Possible values are DISABLED, PENDING, ACTIVE, and STOPPED.

  • REMOTE_CLUSTER_ID - The DR ID of the other DR cluster, or -1 if not available (for example, when DR is disabled or communication has not begun).

  • SUPPORTED_DR_PROTOCOL - The highest version of DR protocol this cluster is capable of using to send data to consumers.

voltdb_dr_row_timestamp_mismatch_count_total | Counter

The number of timestamp mismatch conflicts that occurred.

voltdb_dr_schema_change_info | Metadata

Informational metric containing metadata. Tags:

  • SITE_ID - Numeric ID of the execution site on the host node.

  • TABLE_TYPE - The type of the table. E.g. "PersistentTable" for normal data tables.

  • TABLE_NAME - The name of the database table for which schema was mismatched.

  • CLUSTER_ID - The numeric ID of the current cluster.

  • REMOTE_CLUSTER_ID - The numeric ID of the remote cluster.

  • DR_SCHEMA_CHANGE_MATCH - A text string of "true" or "false" indicating whether the schema for the table matches on the two clusters.

voltdb_dr_schema_change_tuple_count_total | Counter

The total number of tuples exchanged for this table while the schema did not match.
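
Several of the gauges in Table B.10 are most informative when combined. Two examples follow as a minimal Python sketch. On the producer, the difference between voltdb_dr_producer_partition_queued_total_bytes and voltdb_dr_producer_partition_queued_in_memory_total_bytes approximates how much queued data has spilled to overflow storage on disk. On the consumer, the difference between the last received and last applied timestamps approximates how far the partition is behind in applying what it has received. The function names are hypothetical, the gauge values are assumed to have been scraped already, and the lag calculation assumes both timestamps use the same clock and unit.

```python
def overflow_bytes(queued_total: float, queued_in_memory: float) -> float:
    """Approximate bytes of queued DR data held in overflow storage on disk.

    queued_total     -- voltdb_dr_producer_partition_queued_total_bytes
    queued_in_memory -- voltdb_dr_producer_partition_queued_in_memory_total_bytes
    """
    # Clamp at zero in case the two gauges were sampled at slightly
    # different moments and the in-memory value briefly exceeds the total.
    return max(0.0, queued_total - queued_in_memory)


def consumer_apply_lag_seconds(last_received: float, last_applied: float) -> float:
    """Approximate gap, in seconds, between received and applied transactions.

    last_received -- voltdb_dr_consumer_partition_last_received_timestamp_seconds
    last_applied  -- voltdb_dr_consumer_partition_last_applied_timestamp_seconds
    """
    return max(0.0, last_received - last_applied)
```

Both values are per-partition, so in practice they would be computed for each partition separately and then summarized (for example, by taking the maximum lag across partitions).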