Chapter 2. Proof of Concept

A proof of concept (POC) is a small application that tests the key requirements of the proposed solution. For database applications, the POC usually focuses on a few critical transactions, verifying that the database can support the proposed schema, queries, and, ultimately, the expected volume and throughput. (More on this in the chapter on Benchmarking.)

A POC is not a full prototype. Instead, it is just enough code to validate that the technology meets the need. Depending upon the specific business requirements, each POC emphasizes different database functionality. Some may focus primarily on capacity, some on scalability, some on throughput, etc.

Whatever the business requirements, two key aspects of VoltDB, partitioning and stored procedures, must be designed correctly for a proof of concept to be effective. The following sections discuss the use of each in POCs.

2.1. Effective Partitioning

VoltDB is a distributed database. The data is partitioned automatically, based on a partitioning column you, as the application developer, specify. You do not need to determine where each record goes — VoltDB does that for you.
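As a rough illustration of what automatic partitioning means, the following sketch assigns rows to partitions by hashing the partitioning column value. It is a conceptual model only: the partition count, the sample rows, the `partition_for` helper, and the use of CRC32 are all assumptions made for the example and do not reflect how VoltDB hashes or stores data internally.

```python
import zlib

PARTITION_COUNT = 4  # hypothetical cluster size, chosen for illustration


def partition_for(key, partition_count=PARTITION_COUNT):
    """Map a partitioning-column value to a partition number.

    CRC32 is a stand-in: it is deterministic across runs, but it is not
    the hash function VoltDB itself uses.
    """
    return zlib.crc32(str(key).encode("utf-8")) % partition_count


# Hypothetical Employee rows: (EmpID, Name)
employees = [(101, "Ana"), (102, "Ben"), (103, "Chen"), (104, "Dee")]

# The placement is derived from the EmpID value alone; the application
# never names a partition when it inserts a row.
partitions = {p: [] for p in range(PARTITION_COUNT)}
for emp_id, name in employees:
    partitions[partition_for(emp_id)].append((emp_id, name))
```

The point of the sketch is that the same column value always maps to the same partition, which is what later lets a query on that value be sent to exactly one place.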

However, to be effective, you must choose your partitioning columns carefully. The best partitioning column is not always the most obvious one.

The important thing to keep in mind is that VoltDB partitions both the data and the work. For best performance you want to partition the database tables and associated queries so that the most common transactions can be run in parallel. That is, the transactions are, in VoltDB parlance, "single-partitioned".

To be single-partitioned, a transaction must contain only queries that access tables based on a specific value for the partitioning column. In other words, if a transaction is partitioned on the EmpID column of the Employee table (and that is the partitioning column for the table), all queries in the transaction that access the Employee table must include the constraint WHERE Employee.EmpID = {value}.
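The rule can be expressed as a small check. The sketch below models each query as a table name plus its equality constraints and tests whether every query pins the partitioning column to the same value. The `PARTITION_COLUMNS` mapping, the `is_single_partitioned` helper, and the sample queries are hypothetical; VoltDB performs its own analysis of real SQL, which this does not reproduce.

```python
# Hypothetical schema: table name -> partitioning column
PARTITION_COLUMNS = {"Employee": "EmpID"}


def is_single_partitioned(queries, partition_value):
    """Return True if every query constrains its table's partitioning
    column to partition_value with an equality test.

    Each query is a (table, equality_constraints) pair, where
    equality_constraints maps column names to required values.
    """
    for table, constraints in queries:
        column = PARTITION_COLUMNS[table]
        if column not in constraints or constraints[column] != partition_value:
            return False
    return True


# Both queries pin Employee.EmpID to 145, so one partition can do all the work.
lookup_and_update = [
    ("Employee", {"EmpID": 145}),
    ("Employee", {"EmpID": 145, "Status": "active"}),
]

# The second query filters on Dept only, so it cannot be confined
# to the partition that holds EmpID 145.
lookup_by_dept = [
    ("Employee", {"EmpID": 145}),
    ("Employee", {"Dept": "Sales"}),
]
```

Here `is_single_partitioned(lookup_and_update, 145)` returns `True`, while `is_single_partitioned(lookup_by_dept, 145)` returns `False` because one query omits the EmpID constraint.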

To make single-partitioned transactions easier to create, not all tables have to be partitioned. Tables that are not updated frequently can be replicated, meaning they can be accessed in any single-partitioned transaction, no matter what the partitioning key value.
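To see why replication helps, consider a sketch in which each partition holds its own slice of a partitioned Employee table plus a full copy of a small Department table. The table contents, the modulo placement rule, and the `employee_with_department` helper are assumptions for illustration, not VoltDB's storage layout.

```python
PARTITION_COUNT = 3  # hypothetical cluster size


def partition_for(emp_id):
    # Simple modulo placement, used here only to keep the example readable.
    return emp_id % PARTITION_COUNT


# Small, rarely updated lookup table: a candidate for replication.
departments = {10: "Sales", 20: "Support"}

# Hypothetical Employee rows: (EmpID, Name, DeptID)
employees = [(1, "Ana", 10), (2, "Ben", 20), (3, "Chen", 10), (4, "Dee", 20)]

# Every partition gets a slice of Employee and a full copy of Department.
partitions = [
    {"Employee": {}, "Department": dict(departments)}
    for _ in range(PARTITION_COUNT)
]
for emp_id, name, dept_id in employees:
    partitions[partition_for(emp_id)]["Employee"][emp_id] = (name, dept_id)


def employee_with_department(emp_id):
    """Join Employee to Department using data from one partition only."""
    local = partitions[partition_for(emp_id)]
    name, dept_id = local["Employee"][emp_id]
    return name, local["Department"][dept_id]
```

Because the Department copy is local to every partition, the join never needs data from another partition, whatever EmpID value the transaction is partitioned on. The trade-off is that every update to a replicated table must be applied to all copies, which is why replication suits tables that change infrequently.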

When planning the partitioning schema for your database, the important questions to ask yourself are:

  • Which are the critical, most frequent queries? (These are the transactions you want to be single-partitioned.)

  • For each critical query, what database tables does it access and using what column?

  • Can any of those tables be replicated? (Replicating smaller, less frequently updated tables makes joins in single-partitioned procedures easier.)
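One way to work through these questions is to tabulate the expected workload and total the call rate per table and lookup column. The sketch below does this for a made-up workload; the query names, rates, and the `rank_partition_columns` helper are hypothetical, and real estimates would come from your own application requirements.

```python
from collections import defaultdict

# Hypothetical workload estimate:
# (query name, calls per second, table accessed, column used for lookup)
workload = [
    ("GetEmployee", 5000, "Employee", "EmpID"),
    ("UpdateSalary", 1200, "Employee", "EmpID"),
    ("ListByDept", 50, "Employee", "DeptID"),
    ("GetDeptName", 4000, "Department", "DeptID"),
]


def rank_partition_columns(workload):
    """For each table, total the call rate per lookup column and
    return the columns sorted from busiest to least busy."""
    totals = defaultdict(lambda: defaultdict(int))
    for _query, rate, table, column in workload:
        totals[table][column] += rate
    return {
        table: sorted(cols.items(), key=lambda item: item[1], reverse=True)
        for table, cols in totals.items()
    }


ranking = rank_partition_columns(workload)
```

In this invented workload, EmpID carries almost all of the Employee traffic, which makes it the natural partitioning column to evaluate first. Department is read often but is small, so the remaining question is how often it is updated, which determines whether replication is practical.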