Package com.basho.riak.client.core.query.indexes

Low-level API for managing Riak Secondary Indexes (2i)


Introduction

Secondary Indexing (2i) in Riak gives developers the ability, at write time, to tag an object stored in Riak with one or more queryable values.

Since the KV data in Riak is completely opaque to 2i, the user must tell 2i exactly what attribute to index on and what its index value should be via key/value metadata. This is different from Search, which parses the data and builds indexes based on a schema.

The classes in this package provide an API for managing secondary indexes.

Important note: 2i currently requires Riak to be configured to use the eleveldb or memory backend. The default bitcask backend does not support 2i.

Overview

In Riak there are two types of secondary indexes: "integer" and "binary". The former represents an index with numeric values, whereas the latter is used for textual (String) data. The server API distinguishes between the two via a suffix ("_int" and "_bin" respectively) added to the index's name. In the client this is encapsulated in the IndexType enum. When specifying an index name you do not have to append this suffix; it is done automatically.
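
The naming rule above can be sketched in a few lines. This is a stand-alone illustration, not the client's implementation: the names SketchIndexType and fullName are hypothetical, and the real IndexType enum may expose a different API.

```java
// Hypothetical stand-in for the client's IndexType enum; illustration only.
enum SketchIndexType {
    INT("_int"),
    BIN("_bin");

    private final String suffix;

    SketchIndexType(String suffix) {
        this.suffix = suffix;
    }

    // Builds the name the server sees: the user-supplied name plus the type suffix.
    String fullName(String name) {
        return name + suffix;
    }
}

class IndexNameSketch {
    public static void main(String[] args) {
        System.out.println(SketchIndexType.INT.fullName("number_on_hand")); // number_on_hand_int
        System.out.println(SketchIndexType.BIN.fullName("colors"));         // colors_bin
    }
}
```

Keeping the suffix inside the type means calling code only ever deals with the short name.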

A RiakIndex is made up of the index name, its type, and one or more queryable index values.

RiakIndex instances are created and managed via the RiakIndexes container. The container is stored in a RiakObject.

Working with RiakIndexes

Data in Riak, including secondary indexes, is stored as raw bytes. The conversion to and from bytes is handled by the concrete RiakIndex implementations and all indexes are managed by the RiakIndexes container.

Each concrete RiakIndex includes a hybrid builder class named Name. The methods of the RiakIndexes container take an instance of that builder as an argument, which allows for proper type inference and for construction of the RiakIndex objects the container exposes.

The RiakIndexes' getIndex() method will either return a reference to the existing RiakIndex or atomically add and return a new one. The returned reference is of the type provided by the Name and is the mutable index; changes are made directly to it.

 RiakIndexes myIndexes = riakObject.getIndexes();
 LongIntIndex myIndex = myIndexes.getIndex(new LongIntIndex.Name("number_on_hand"));
 myIndex.removeAll();
 myIndex.add(6L);
 

Calls can be chained, allowing for easy addition or removal of values from an index.

 riakObject.getIndexes()
           .getIndex(new StringBinIndex.Name("colors"))
           .remove("blue")
           .add("red");
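
The get-or-create behaviour and the chaining shown above can be modelled without the client library. The sketch below is only an assumption about how such a container could work: SketchIndex and SketchIndexes are hypothetical classes, and the choice of ConcurrentHashMap.computeIfAbsent is illustrative, not a statement about how RiakIndexes is implemented.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified model of an index: a mutable set of values.
class SketchIndex {
    private final Set<String> values = ConcurrentHashMap.newKeySet();

    // add() and remove() return this so that calls can be chained.
    SketchIndex add(String value) {
        values.add(value);
        return this;
    }

    SketchIndex remove(String value) {
        values.remove(value);
        return this;
    }

    boolean hasValue(String value) {
        return values.contains(value);
    }
}

// Hypothetical container: one index per full name, created atomically on first use.
class SketchIndexes {
    private final ConcurrentMap<String, SketchIndex> indexes = new ConcurrentHashMap<>();

    SketchIndex getIndex(String fullName) {
        // Returns the existing index, or atomically installs and returns a new one.
        return indexes.computeIfAbsent(fullName, k -> new SketchIndex());
    }
}

class GetOrCreateSketch {
    public static void main(String[] args) {
        SketchIndexes indexes = new SketchIndexes();
        indexes.getIndex("colors_bin").remove("blue").add("red");
        // A second lookup returns the same mutable index, so the change is visible.
        System.out.println(indexes.getIndex("colors_bin").hasValue("red")); // true
    }
}
```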
 
Special note when using RawIndex

A RiakIndex is uniquely identified by its textual name and IndexType, regardless of the concrete RiakIndex implementation being used to view or update it. The RiakIndexes container enforces this uniqueness by being the source of all RiakIndex instances and managing them in a thread-safe way with atomic operations.

This means that any RiakIndex instances with the same name and IndexType refer to the same index. This only matters if you mix typed access with access through RawIndex. The test case below demonstrates the relationship.

 public void wrapping()
 {
     // creates or fetches the BIN (_bin) index named "foo", adds a value to it  
     RawIndex index = indexes.getIndex(new RawIndex.Name("foo", IndexType.BIN));
     BinaryValue baw = BinaryValue.unsafeCreate("value".getBytes());
     index.add(baw);
       
     // fetches the previously created index as a StringBinIndex
     StringBinIndex wrapper = indexes.getIndex(new StringBinIndex.Name("foo"));

     // The references are to different objects
     assertNotSame(index, wrapper);
     // The two objects are equal ( index.equals(wrapper) == true )
     assertEquals(index, wrapper);
     // The value exists
     assertTrue(wrapper.hasValue("value"));
     
     // Removing the value via the StringBinIndex is reflected in the RawIndex
     wrapper.remove("value");
     assertFalse(index.hasValue(baw));
 }
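
One way to picture the relationship in the test case is as two views over a single shared set of raw values. The following stand-alone sketch rests on that assumption; SharedView and its methods are hypothetical and do not correspond to classes in this package.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

// Hypothetical view over shared raw values. Views with the same full name
// compare equal even though they are distinct objects.
class SharedView {
    private final String fullName;
    private final Set<ByteBuffer> backing; // shared storage, owned by the container

    SharedView(String fullName, Set<ByteBuffer> backing) {
        this.fullName = fullName;
        this.backing = backing;
    }

    // Raw access: values are plain bytes.
    void addRaw(byte[] value) { backing.add(ByteBuffer.wrap(value)); }
    boolean hasRaw(byte[] value) { return backing.contains(ByteBuffer.wrap(value)); }

    // String access: the same bytes, interpreted as UTF-8 text.
    boolean hasString(String value) { return hasRaw(value.getBytes(StandardCharsets.UTF_8)); }
    void removeString(String value) {
        backing.remove(ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8)));
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SharedView && ((SharedView) o).fullName.equals(fullName);
    }

    @Override
    public int hashCode() { return fullName.hashCode(); }
}

class SharedViewSketch {
    public static void main(String[] args) {
        Set<ByteBuffer> storage = new HashSet<>();
        SharedView raw = new SharedView("foo_bin", storage);
        SharedView wrapper = new SharedView("foo_bin", storage);

        byte[] value = "value".getBytes(StandardCharsets.UTF_8);
        raw.addRaw(value);
        System.out.println(raw != wrapper && raw.equals(wrapper)); // true
        System.out.println(wrapper.hasString("value"));            // true

        // Removing through one view is visible through the other.
        wrapper.removeString("value");
        System.out.println(raw.hasRaw(value));                     // false
    }
}
```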
 
Riak 2i _bin indexes and sorting

One of the key features of 2i is the ability to do range queries. As previously noted, the values are stored in Riak as bytes, and comparison is done byte by byte. UTF-8 lends itself well to this because its byte ordering is the same as its lexical ordering.

If you are using a _bin index with a character set whose byte ordering differs from its lexical ordering, range queries will be affected: values are returned and bounded according to byte order, not lexical order.
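
The effect of the character set on ordering can be checked locally. The sketch below assumes that bytes are compared as unsigned values, as a plain memory comparison would do; ByteOrderSketch and its methods are hypothetical helpers, and UTF-16LE is used only as an example of an encoding whose byte order does not follow lexical order.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteOrderSketch {
    // Compares two byte arrays byte by byte, treating each byte as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // Encodes both strings with the given charset, then compares the bytes.
    static int compareEncoded(String x, String y, Charset charset) {
        return compareBytes(x.getBytes(charset), y.getBytes(charset));
    }

    public static void main(String[] args) {
        String a = "a";      // U+0061
        String b = "\u0100"; // U+0100, sorts after "a" lexically

        System.out.println(a.compareTo(b) < 0);                                  // true
        System.out.println(compareEncoded(a, b, StandardCharsets.UTF_8) < 0);    // true
        System.out.println(compareEncoded(a, b, StandardCharsets.UTF_16LE) < 0); // false
    }
}
```

With UTF-8 the byte comparison agrees with the lexical one; with UTF-16LE the low-order byte comes first, so "a" (61 00) compares greater than U+0100 (00 01) and a range query bounded by these values would behave differently.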

See Also:
Using Secondary Indexes in Riak

Copyright © 2014. All rights reserved.