Chapter 10. Using VoltDB in a Cluster

It is possible to run VoltDB on a single server and still get all the advantages of parallelism because VoltDB creates multiple partitions on each server. However, there are practical limits to how much memory or processing power any one server can sustain.

One of the key advantages of VoltDB is its ease of expansion. You can increase both capacity and processing (i.e. the total number of partitions) simply by adding servers to the cluster to achieve almost linear scalability. Using VoltDB in a cluster also gives you the ability to increase the availability of the database — protecting it against possible server failures or network glitches.

This chapter explains how to create a cluster of VoltDB servers running a single database. It also explains how to expand the cluster when additional capacity or processing power is needed. The following chapters explain how to increase the availability of your database through the use of K-safety and database replication, as well as how to enable security to limit access to the data.

10.1. Starting a Database Cluster

As described in Chapter 3, Starting the Database, starting a VoltDB cluster is similar to starting VoltDB on a single server: you use the same commands. To start a single-server database, you use the voltdb start command by itself. To customize database features, you specify one or more configuration files when you initialize the root directory with voltdb init.

To start a cluster, you also use the voltdb start command. In addition, you must:

Specify the number of nodes in the cluster using the --count argument.
Choose one or more nodes as the potential lead or "host" node and specify those nodes using the --host argument on the start command.
Issue the same voltdb start command on all nodes of the cluster.

For example, if you are creating a new five-node cluster and choose nodes server2 and server3 as the hosts, you would issue a command like the following on all five nodes:

$ voltdb start --host=server2,server3 --count=5
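
Because the command must be identical on every node, it can help to generate it once and reuse it. The following is a minimal sketch rather than a supported tool. It assumes the node names server1 through server5, passwordless ssh between machines, and a root directory already initialized on each node; none of these come from VoltDB itself. As written, it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: build one "voltdb start" command and show it for every node.
# Node names and the ssh wrapper are illustrative assumptions.

NODES="server1 server2 server3 server4 server5"   # all cluster members
HOSTS="server2,server3"                           # potential host nodes

# Count the words in a space-separated list.
count_nodes() {
    set -- $1
    echo "$#"
}

# Build the start command; it must be the same on every node.
start_command() {
    echo "voltdb start --host=$1 --count=$2"
}

COUNT=$(count_nodes "$NODES")
CMD=$(start_command "$HOSTS" "$COUNT")

# Dry run: print what would be executed on each node.
# Removing "echo" would run it over ssh (an assumption about your setup).
for node in $NODES; do
    echo ssh "$node" "$CMD"
done
```

Note that voltdb start keeps the server process in the foreground, so a real launch script would also need to decide how to keep each remote process running after the ssh session ends.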

To restart a cluster using command logs or automatic snapshots, you repeat the same command. Alternatively, you can specify all nodes in the cluster in the --host argument and skip the server count:

$ voltdb start --host=server1,server2,server3,server4,server5
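
If you keep the node names in a single list, the full --host argument can be derived from it so that every node receives the same value. This is a small sketch under the assumption that the names are stored space-separated in a shell variable; the variable and function names are hypothetical.

```shell
#!/bin/sh
# Sketch: derive the full --host list from one list of node names.
NODES="server1 server2 server3 server4 server5"

# Join the space-separated node names with commas.
host_list() {
    echo "$1" | tr ' ' ','
}

# Print the start command that lists every node and omits --count.
echo "voltdb start --host=$(host_list "$NODES")"
```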

No matter which approach you choose, you must specify the same list of potential hosts on all nodes of the cluster. Once the database cluster is running, the lead node's special role is complete and all nodes become peers.