The riak_test application

riak_test is an integration/system testing driver for Riak that runs tests written in Erlang. You write the tests, and riak_test handles bringing up a Riak cluster and running your tests against it. riak_test can generate a cluster directly from a devrel build of a Riak source tree, or integrate with basho_expect to generate clusters from generated Riak packages.

The devrel mode operates against a generated devrel that has been turned into a git repo, allowing riak_test to reset all nodes by reverting the git repo back to the initial commit. This is the preferred mode of operation because it is easy to set up and can test a work-in-progress development tree during implementation and code/automation review.
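The reset step can be sketched as follows. This is a minimal illustration, not the code riak_test itself uses: the module name rtdev_reset_sketch and both function names are hypothetical, and the two git commands are copied from the devrel log output shown later in this overview.

```erlang
-module(rtdev_reset_sketch).
-export([reset_commands/1, reset/1]).

%% Build the two git commands that appear in the devrel log output for a
%% git-versioned devrel rooted at Path (for example "/tmp/rt/dev").
reset_commands(Path) ->
    Git = "git --git-dir=\"" ++ Path ++ "/.git\" --work-tree=\"" ++ Path ++ "\" ",
    [Git ++ "reset HEAD --hard",
     Git ++ "clean -fd"].

%% Run the commands in order through the shell and return their output.
%% Assumes git is on the PATH and Path is an existing git repository.
reset(Path) ->
    [os:cmd(Cmd) || Cmd <- reset_commands(Path)].
```

Keeping the command construction separate from os:cmd/1 makes it possible to inspect the commands without touching the devrel directory.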

The basho_expect mode is designed to allow the same system tests to be re-used for validating packages during the release cycle. In this mode, riak_test launches basho_expect in a new server mode, connects to it, and sends it commands (e.g. log in to a host, install riak, start riak). The server mode is based on Thrift and can therefore be used from languages other than Erlang, for example to control basho_expect from Ruby. The Thrift dependency could be removed by finishing the pre-existing basho_expect Erlang port driver if desired.

The following is an example of a riak_test test:

-import(rt, [deploy_nodes/1,
             join/2,
             owners_according_to/1,
             wait_until_nodes_ready/1,
             wait_until_no_pending_changes/1]).

verify_build_cluster() ->
    %% Deploy a set of new nodes
    lager:info("Deploying 3 nodes"),
    Nodes = deploy_nodes(3),

    %% Ensure each node owns 100% of its own ring
    lager:info("Ensure each nodes 100% of it's own ring"),
    [?assertEqual([Node], owners_according_to(Node)) || Node <- Nodes],

    %% Join nodes
    lager:info("Join nodes together"),
    [Node1|OtherNodes] = Nodes,
    [join(Node, Node1) || Node <- OtherNodes],

    lager:info("Wait until all nodes are ready and there are no pending changes"),
    ?assertEqual(ok, wait_until_nodes_ready(Nodes)),
    ?assertEqual(ok, wait_until_no_pending_changes(Nodes)),

    %% Ensure each node owns a portion of the ring
    lager:info("Ensure each node owns a portion of the ring"),
    [?assertEqual(Nodes, owners_according_to(Node)) || Node <- Nodes],
    lager:info("verify_build_cluster: PASS"),
    ok.

Running the test in devrel mode gives the following output:

16:10:27.787 [info] Application lager started on node 'riak_test@127.0.0.1'
16:10:27.792 [info] Deploying 3 nodes
16:10:27.792 [info] Running: /tmp/rt/dev/dev1/bin/riak stop
16:10:27.793 [info] Running: /tmp/rt/dev/dev2/bin/riak stop
16:10:27.793 [info] Running: /tmp/rt/dev/dev3/bin/riak stop
16:10:28.119 [info] Resetting nodes to fresh state
16:10:28.119 [debug] Running: git --git-dir="/tmp/rt/dev/.git" --work-tree="/tmp/rt/dev" reset HEAD --hard
16:10:34.313 [debug] Running: git --git-dir="/tmp/rt/dev/.git" --work-tree="/tmp/rt/dev" clean -fd
16:10:34.374 [info] Running: /tmp/rt/dev/dev1/bin/riak start
16:10:34.375 [info] Running: /tmp/rt/dev/dev2/bin/riak start
16:10:34.375 [info] Running: /tmp/rt/dev/dev3/bin/riak start
16:10:36.409 [debug] Supervisor inet_gethost_native_sup started undefined at pid <0.68.0>
16:10:36.411 [debug] Supervisor kernel_safe_sup started inet_gethost_native:start_link() at pid <0.67.0>
16:10:36.444 [info] Deployed nodes: ['dev1@127.0.0.1','dev2@127.0.0.1','dev3@127.0.0.1']
16:10:36.444 [info] Ensure each nodes 100% of it's own ring
16:10:36.445 [info] Join nodes together
16:10:36.465 [debug] [join] 'dev2@127.0.0.1' to ('dev1@127.0.0.1'): ok
16:10:36.491 [debug] [join] 'dev3@127.0.0.1' to ('dev1@127.0.0.1'): ok
16:10:36.491 [info] Wait until all nodes are ready and there are no pending changes
16:10:56.192 [info] Ensure each node owns a portion of the ring
16:10:56.193 [info] verify_build_cluster: PASS

Whereas running in basho_expect mode gives the following output:

16:20:13.066 [info] Application lager started on node 'riak_test@127.0.0.1'
16:20:13.086 [info] Connecting to basho_expect at {"127.0.0.1",9000}
16:20:13.090 [debug] Supervisor inet_gethost_native_sup started undefined at pid <0.53.0>
16:20:13.092 [debug] Supervisor kernel_safe_sup started inet_gethost_native:start_link() at pid <0.52.0>
16:20:13.100 [info] Logging into hosts: ["ubuntu10-64-1","ubuntu10-64-2","ubuntu10-64-3"]
<snip basho_expect output>
16:20:30.227 [info] Deploying 3 nodes
16:20:32.378 [info] Re-installing riak on nodes
<snip basho_expect output>
16:22:16.905 [info] Starting riak on nodes
<snip basho_expect output>
16:22:30.141 [info] Deployed nodes: ['riak@192.168.60.10','riak@192.168.60.11','riak@192.168.60.12']
16:22:30.141 [info] Ensure each nodes 100% of it's own ring
16:22:30.143 [info] Join nodes together
16:22:30.153 [debug] [join] 'riak@192.168.60.11' to ('riak@192.168.60.10'): ok
16:22:30.164 [debug] [join] 'riak@192.168.60.12' to ('riak@192.168.60.10'): ok
16:22:30.164 [info] Wait until all nodes are ready and there are no pending changes
16:22:49.433 [info] Ensure each node owns a portion of the ring
16:22:49.436 [info] verify_build_cluster: PASS

As shown, the tests look similar to standard eunit tests and are straightforward Erlang. The rt module provides the basic commands to control Riak nodes and hides the underlying harness (devrel or basho_expect) from the test. The rt module also provides a set of useful built-in operations, and it is expected to be extended over time with more reusable functions.

Writing additional checks and preconditions is the same as with an eunit test. For example, rt:owners_according_to(Node) is just a basic RPC call:

owners_according_to(Node) ->
    {ok, Ring} = rpc:call(Node, riak_core_ring_manager, get_raw_ring, []),
    Owners = [Owner || {_Idx, Owner} <- riak_core_ring:all_owners(Ring)],
    lists:usort(Owners).

Similarly, rt:wait_until_nodes_ready builds on the built-in wait_until primitive:

wait_until_nodes_ready(Nodes) ->
    [?assertEqual(ok, wait_until(Node, fun is_ready/1)) || Node <- Nodes],
    ok.

is_ready(Node) ->
    case rpc:call(Node, riak_core_ring_manager, get_raw_ring, []) of
        {ok, Ring} ->
            lists:member(Node, riak_core_ring:ready_members(Ring));
        _ ->
            false
    end.
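The wait_until primitive itself is not shown in this overview. The following is a minimal sketch of how such a polling primitive could look, assuming it retries a predicate at a fixed delay until a maximum wait time is used up. The module name wait_until_sketch, the explicit MaxTime and Delay arguments, and the {error, timeout} return value are all assumptions made for illustration; the real rt:wait_until/2 takes only the node and the predicate.

```erlang
-module(wait_until_sketch).
-export([wait_until/4]).

%% Poll Fun(Node) until it returns true. MaxTime and Delay are in
%% milliseconds; the number of retries is derived from the two.
wait_until(Node, Fun, MaxTime, Delay) when is_function(Fun, 1), Delay > 0 ->
    poll(Node, Fun, MaxTime div Delay, Delay).

poll(Node, Fun, Retries, Delay) ->
    case Fun(Node) of
        true ->
            ok;
        _ when Retries > 0 ->
            %% Not ready yet: sleep, then try again with one retry fewer.
            timer:sleep(Delay),
            poll(Node, Fun, Retries - 1, Delay);
        _ ->
            %% Out of retries (hypothetical failure value).
            {error, timeout}
    end.
```

With a maximum wait of 180000 ms and a retry delay of 500 ms, this sketch would make up to 360 retries before giving up.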

Much like basho_bench, riak_test uses Erlang term-based config files. The rtdev.config file used for devrel mode:

%% Deps should include the path to compiled Riak libs
{rt_deps, ["/Users/jtuple/basho/riak/deps"]}.

%% Maximum time in milliseconds for wait_until to wait
{rt_max_wait_time, 180000}.

%% Delay between each retry in wait_until
{rt_retry_delay, 500}.

%% Use the devrel harness
{rt_harness, rtdev}.

%% Path to generated devrel of Riak that is git-versioned
{rtdev_path, "/tmp/rt"}.

And the rtbe.config file:

%% Deps should include Riak libs and basho_expect
{rt_deps, ["/Users/jtuple/basho/riak/deps",
           "/Users/jtuple/basho/basho_expect"]}.
{rt_max_wait_time, 180000}.
{rt_retry_delay, 500}.

%% Use the rtbe harness (in basho_expect source tree)
{rt_harness, rtbe}.

%% Host/Port to connect to basho_expect server.
{rtbe_host, "127.0.0.1"}.
{rtbe_port, 9000}.
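Because these files are plain Erlang terms, they can be read with the standard library. The sketch below shows one way to load such a file and look up a setting; it is not how riak_test itself reads its config, and the module name rt_config_sketch and its functions are hypothetical.

```erlang
-module(rt_config_sketch).
-export([load/1, lookup/3]).

%% Read every term from an Erlang-term config file. For the files above,
%% the result is a list of {Key, Value} tuples.
load(Path) ->
    {ok, Terms} = file:consult(Path),
    Terms.

%% Look up Key, falling back to Default when the setting is absent.
lookup(Key, Terms, Default) ->
    proplists:get_value(Key, Terms, Default).
```

For example, rt_config_sketch:lookup(rt_harness, rt_config_sketch:load("rtdev.config"), undefined) would return rtdev for the rtdev.config file shown above.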

And to conclude, a slightly longer test that uses most of the built-in primitives including starting and stopping Riak nodes:

verify_claimant() ->
    Nodes = build_cluster(3),
    [Node1, Node2, _Node3] = Nodes,

    %% Ensure all nodes believe node1 is the claimant
    lager:info("Ensure all nodes believe ~p is the claimant", [Node1]),
    [?assertEqual(Node1, claimant_according_to(Node)) || Node <- Nodes],

    %% Stop node1
    lager:info("Stop ~p", [Node1]),
    stop(Node1),
    ?assertEqual(ok, wait_until_unpingable(Node1)),

    %% Ensure all nodes still believe node1 is the claimant
    lager:info("Ensure all nodes still believe ~p is the claimant", [Node1]),
    Remaining = Nodes -- [Node1],
    [?assertEqual(Node1, claimant_according_to(Node)) || Node <- Remaining],

    %% Mark node1 as down and wait for ring convergence
    lager:info("Mark ~p as down", [Node1]),
    down(Node2, Node1),
    ?assertEqual(ok, wait_until_ring_converged(Remaining)),
    [?assertEqual(down, status_of_according_to(Node1, Node)) || Node <- Remaining],

    %% Ensure all nodes now believe node2 to be the claimant
    lager:info("Ensure all nodes now believe ~p is the claimant", [Node2]),
    [?assertEqual(Node2, claimant_according_to(Node)) || Node <- Remaining],

    %% Restart node1 and wait for ring convergence
    lager:info("Restart ~p and wait for ring convergence", [Node1]),
    start(Node1),
    ?assertEqual(ok, wait_until_nodes_ready([Node1])),
    ?assertEqual(ok, wait_until_ring_converged(Nodes)),

    %% Ensure node has rejoined and is no longer down
    lager:info("Ensure ~p has rejoined and is no longer down", [Node1]),
    [?assertEqual(valid, status_of_according_to(Node1, Node)) || Node <- Nodes],

    %% Ensure all nodes still believe node2 is the claimant
    lager:info("Ensure all nodes still believe ~p is the claimant", [Node2]),
    [?assertEqual(Node2, claimant_according_to(Node)) || Node <- Nodes],
    ok.

Installation and Usage

Check out riak_test from GitHub and build it: http://github.com/basho/riak_test

riak_test comes with two pre-written config files: rtdev.config for devrel mode and rtbe.config for basho_expect mode. The devrel operating mode comes built into riak_test, and can be used right away.

To use the basho_expect mode, you must check out a version of basho_expect that includes the basho_expect_thrift server mode, and run make from within the basho_expect/basho_expect_thrift directory.

For both modes, ensure the rt_deps setting in the config file is accurate, and adjust other settings as appropriate. The comments should be self-explanatory.

To run riak_test in devrel mode, you must first generate a git-versioned devrel directory and ensure the rtdev_path setting in rtdev.config is accurate. To generate the versioned devrel, modify the following as necessary:

       riak$ make devrel
       riak$ mkdir /tmp/rt
       riak$ cp -r dev /tmp/rt/dev
       riak$ cd /tmp/rt/dev
/tmp/rt/dev$ git init
/tmp/rt/dev$ git add .
/tmp/rt/dev$ git commit -m "initial"

You can then launch tests from within the riak_test directory:

./riak_test rtdev.config verify_build_cluster

riak_test currently assumes that there is one test per module and that the module contains a zero-arity function of the same name that implements the test. For example, the above calls verify_build_cluster:verify_build_cluster().
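The naming convention means a driver can dispatch to a test from nothing but its name. The following is a rough sketch of that idea, not riak_test's actual runner: the module name run_test_sketch and the {pass, _} / {fail, _} return values are hypothetical.

```erlang
-module(run_test_sketch).
-export([run/1]).

%% Call Test:Test(), the zero-arity function named after its module,
%% and report whether it returned normally or raised an exception.
run(Test) when is_atom(Test) ->
    try Test:Test() of
        Result -> {pass, Result}
    catch
        Class:Reason -> {fail, {Class, Reason}}
    end.
```

Under this sketch, run_test_sketch:run(verify_build_cluster) would call verify_build_cluster:verify_build_cluster(), and a failed ?assertEqual inside the test would surface as a {fail, _} result.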

To run in basho_expect mode, you must first launch the basho_expect server and then run riak_test:

basho_expect$ python run_be_thrift.py &
   riak_test$ ./riak_test rtbe.config verify_build_cluster ubuntu10-64-1 ubuntu10-64-2 ubuntu10-64-3

To make this easier, basho_expect includes a script, Run-riak-test, that operates more like traditional basho_expect tests. Just make sure to modify it to point to the right riak_test directory. The above simplifies to the following:

basho_expect$ ./Run-riak-test verify_build_cluster ubuntu10-64-1 ubuntu10-64-2 ubuntu10-64-3

Generated by EDoc, Jan 10 2012, 17:33:57.