Search
Riak 2 features two search systems. New in Riak 2 is “Riak Search 2.0,” which was
developed under the codename “Yokozuna.” It uses Solr for indexing and
querying, and riak-core
for distributing and sharding indexes.
This document covers using Riak Search 2.0 with the Ruby client. See the full Riak Search documentation for more about working with search itself.
tl;dr
This documentation assumes you have a yokozuna
bucket type defined.
require 'riak'
client = Riak::Client.new
bucket = client.bucket_type('yokozuna').bucket('pizzas')
# Create an index
index = Riak::Search::Index.new client, 'pizzas'
index.exists? #=> false
index.create!
# Add the new index to a typed bucket. Setting the index on the bucket
# may fail until the index creation has propagated.
props = Riak::BucketProperties.new bucket
props['search_index'] = index
props.store
# Store records
meat = bucket.new 'meat'
meat.data = {toppings_ss: %w{pepperoni ham sausage}}
meat.store
hawaiian = bucket.new 'hawaiian'
hawaiian.data = {toppings_ss: %w{ham pineapple}}
hawaiian.store
# Search the pizzas index for hashes that have a "ham" entry in the
# toppings_ss array
query = index.query 'toppings_ss:ham'
query.rows = 5
result = query.results
result.num_found # total number of results
result.length # total number returned, can be less than num_found
result.docs.first # metadata about the search result
result.docs.first['score'] # result score
result.first # the first found RObject
Indexes
Indexes connect search terms to documents. They can be created, attached to buckets, and inspected from the Ruby client:
existing_index = Riak::Search::Index.new client, 'existing_index'
existing_index.exists? #=> true
existing_index.create! # raises Riak::SearchError::IndexExistsError
new_index = Riak::Search::Index.new client, 'a_cool_new_index'
new_index.exists? #=> false
# Creating an index can only be done once
new_index.create! #=> true
new_index.create! # raises Riak::Search::IndexExistsError
# Creating an index allows you to specify the schema and n-value for replication
fancy_index = Riak::Search::Index.new client, 'fancy_index_for_fancy_documents'
fancy_index.create! 'schema_name', n_value
# Indexes have accessors:
fancy_index.n_val #=> 3
fancy_index.schema #=> 'schema_name'
Indexes and Buckets
Riak objects aren’t indexed by default. You can set a bucket’s properties to
index objects on write. The BucketProperties
object accepts either a
String
index name, or a Riak::Search::Index
instance for the search_index
property.
props = Riak::BucketProperties.new bucket
props['search_index'] = 'index_name' # String
props['search_index'] = index_object # Riak::Search::Index
props.store
Queries and Results
Riak allows you to search a given index. You can do this with the Ruby client
by creating a Riak::Search::Query
object for a given index.
# Already materialized the index? Ask it for a query:
query = index.query 'search query'
# Initialize a query with a client, index, and the search terms:
query = Riak::Search::Query.new client, index, 'search query'
# You can initialize a query with the index name instead of a materialized
# index:
query = Riak::Search::Query.new client, 'index_name', 'search query'
# Perform the query
results = query.results
You can use normal Lucene query syntax for searching:
query = Riak::Search::Query.new(client, 'famous', "name_s:Lion*")
query = Riak::Search::Query.new(client, 'famous', "age_i:[30 TO *]")
query = Riak::Search::Query.new(client, 'famous', "leader_b:true AND age_i:[30 TO *]")
Queries have optional parameters that can be assigned at initialization or using regular attribute setters:
# Index#query takes an options hash as a second argument
query = index.query 'name_s:Lion*', rows: 5, df: 'dog_ss'
# Query.new takes an options hash as the fourth argument
query = Riak::Search::Query.new(client,
index,
'age_i:[30 TO *]',
sort: 'age_i desc',
start: 15
)
# Options also have accessor methods defined
query.sort = 'age_i asc'
query.rows = 1
query.df = 'dog_ss'
Result Collections and Result Documents
The Query#result
method returns a ResultCollection
object. This object has
useful information about the query response:
results.num_found #=> number of results matching the query
results.length #=> number of results returned from the query
results.max_score #=> highest score found by Solr
Perhaps more usefully, it provides access to an array of ResultDocument
instances, one for each document returned in the query.
docs = results.docs # Array<ResultDocument>
first_result = docs.first # ResultDocument
# addressing information
first_result.bucket_type # Riak::BucketType instance
first_result.bucket # Riak::BucketTyped::Bucket instance
first_result.key # String
Materializing Results into Objects
You can materialize a Riak object from a ResultDocument
, either a RObject
key-value object, or one of the many flavors of CRDT.
# ask the result if it refers to a CRDT
first_result.crdt?
# ask the result what class it will use to materialize the object; returns
# the class Riak::RObject, or a Riak::Crdt::Base subclass
first_result.type_class
# materializes the object, no matter what the type_class
first_result.object
# materializes a CRDT, raises an error if it's not a CRDT
first_result.crdt
# materializes this kind of obejct, raises an error if it's not that
first_result.robject # Riak::RObject
first_result.counter # Riak::Crdt::Counter
first_result.map # Riak::Crdt::Map
first_result.set # Riak::Crdt::Set
Technically, any CRDT object can also be materialized as a regular key-value object. This API doesn’t allow you to do this to make corrupting a CRDT object more difficult.
If you do actually need the RObject for a CRDT, perhaps to delete it, use the
fields on the ResultDocument
to help out.
map_result.map #=> Riak::Crdt::Map instance
map_result.robject # raise Riak::SearchError::UnexpectedResultError
map_robject = map_result.bucket.get map_result.key #=> Riak::RObject instance
Schemas
Schemas explain to Solr how fields should be indexed. They can be created and read with the Ruby client:
schema_content = File.read 'schema.xml'
schema = Riak::Search::Schema.new client, 'schema_for_cool_cats'
schema.exists? #=> false
schema.content = schema_content
schema.create!
other_schema = Riak::Search::Schema.new client, 'some_other_schema'
other_schema.name #=> "some_other_schema"
other_schema.content #=> "<?xml version..."
other_schema.exists? #=> true
other_schema.create! # raises Riak::SearchError::SchemaExistsError
Just like indexes, schemas can only be created once per cluster.