Big Data applications require powerful search capabilities, but integrating search into highly available and resilient distributed systems is a challenge. Riak KV meets this challenge with Riak Search and Secondary Indexes (2i). Riak Search is integrated with Apache Solr.
Riak Search pairs the availability and scalability of Riak KV with Apache Solr’s powerful full-text search. It allows for distributed, scalable, fault-tolerant, and transparent indexing and querying of Riak KV data sets. As data changes, search indexes are automatically synchronized, which ensures fast, accurate, and easy queries.
Once data is stored in Riak KV, you can search and retrieve it. Riak Search monitors for changes to data in Riak KV and propagates those changes to indexes managed by Apache Solr. Riak KV ensures that incoming data is evenly distributed across all nodes in the cluster. Each node in the Riak KV cluster also supervises an instance of Solr, and that Solr instance manages the indexes for its node.
On the query side, Riak Search accepts standard Solr queries and expands them to distributed search queries behind the scenes. Distributed search queries target multiple Solr instances to provide a complete result set across replicas.
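The scatter-gather idea behind a distributed search query can be illustrated with a toy sketch. This is not Riak Search's actual implementation: the function name `distributed_search`, the per-node hit lists, and the merge-by-score rule are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input shape: node name -> list of (key, score) hits from that node's index.
NodeHits = Dict[str, List[Tuple[str, float]]]


def distributed_search(node_hits: NodeHits, rows: int = 10) -> List[Tuple[str, float]]:
    """Merge per-node hits into one ranked result set, dropping duplicate keys."""
    best: Dict[str, float] = {}
    for hits in node_hits.values():
        for key, score in hits:
            # The same key may come back from several replicas; keep its highest score.
            if key not in best or score > best[key]:
                best[key] = score
    # Sort by descending score, then by key so ties have a stable order.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:rows]


if __name__ == "__main__":
    hits = {
        "node1": [("jake", 1.0), ("roger", 0.5)],
        "node2": [("roger", 0.5), ("maria", 2.0)],
    }
    for key, score in distributed_search(hits):
        print(key, score)
```

The caller talks to a single entry point and receives one merged list, which is the "connect to one, talk to all" behavior in miniature.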
Riak Search with integrated Apache Solr includes:
Distributed Full-text Search – Connect to one, talk to all
Standard full-text Solr queries are automatically expanded into distributed search queries to provide a complete result set across instances.
Ad-hoc Query Support – Ask complex questions of your data
Broad support for a wide range of Solr query parameters, including exact match, range queries, Boolean operators (AND/OR/NOT), sorting, pagination, scoring, and ranking.
Index Synchronization – Automate index updates
Automatically synchronize data between Riak KV and Solr. Intelligent monitoring picks up changes to data and propagates those changes to Solr indexes.
Solr API Support – Integrate with existing software
Riak Search also supports Solr client query APIs, so existing Solr-based software can integrate with your application.
Auto-restart – Reduce or eliminate slow manual restarts
Monitor the Solr OS processes and automatically start or restart processes when failures are detected.
Example: Ad-hoc Solr Queries
Step 1: add suffix and index
The first step is to tell Solr the attribute type by appending a suffix.
For example, if you had a name attribute inside a JSON object and you wanted to index it as a string, you’d rename it to name_s. If, instead, you wanted to index an age attribute, you could rename it to age_i to index it as an integer for range queries.
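The renaming step can be sketched in a few lines of Python. This is a minimal illustration only: the helper name `add_solr_suffixes` is hypothetical, and the mapping covers just the two suffixes mentioned above (`_s` for strings, `_i` for integers); values of any other type keep their original names.

```python
import json

# Suffixes taken from the examples in the text; anything else is left unchanged.
SUFFIXES = {str: "_s", int: "_i"}


def add_solr_suffixes(doc: dict) -> dict:
    """Return a copy of doc whose string/integer attributes carry a type suffix."""
    renamed = {}
    for name, value in doc.items():
        # Exact type lookup, so bool (a subclass of int) is not treated as an integer.
        suffix = SUFFIXES.get(type(value))
        renamed[name + suffix if suffix else name] = value
    return renamed


if __name__ == "__main__":
    runner = {"name": "Jake", "age": 34}
    print(json.dumps(add_solr_suffixes(runner)))
```

The resulting JSON object is what you would store, so that the suffixed attribute names tell the index how to treat each value.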
Step 2: query against index
After the values are indexed, you can send Solr queries (the default query engine is lucene) against the index:
Give me all runners with 10 or more miles run (open-ended range-query)
search/runners?wt=json&q=miles_run_i:[10 TO *]
Give me all of the runners with a name that begins with Jake (wildcard query)
Give me all of the runners with bios that contain references to “Roger Bannister” (exact match query)
search/runners?wt=json&q=bio_t:"Roger Bannister"
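When these queries are sent from application code, the query string should be URL-encoded, because characters such as spaces, brackets, and quotes are not safe in a URL. The sketch below builds the three request paths with Python's standard library. The helper `search_path` is hypothetical, the `search/runners` prefix simply mirrors the examples above, and the wildcard query `name_s:Jake*` is an assumption based on the `name_s` field from Step 1, since the source does not show that query.

```python
from urllib.parse import parse_qs, urlencode


def search_path(index: str, query: str, wt: str = "json") -> str:
    """Build a relative search path with URL-encoded Solr parameters."""
    return f"search/{index}?" + urlencode({"wt": wt, "q": query})


queries = {
    "range": "miles_run_i:[10 TO *]",      # 10 or more miles run
    "wildcard": "name_s:Jake*",            # assumed form of the wildcard query
    "exact": 'bio_t:"Roger Bannister"',    # exact phrase match
}

paths = {label: search_path("runners", q) for label, q in queries.items()}

if __name__ == "__main__":
    for label, path in paths.items():
        # Decode the query string again to confirm the encoding is lossless.
        decoded = parse_qs(path.split("?", 1)[1])["q"][0]
        print(label, path, decoded == queries[label])
```

Prepend your node's host and port to each path before sending the request; those values depend on your deployment and are not shown here.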
Once you store your data, you want to quickly query and analyze it. You need complex query support to manage user sessions, store and retrieve patient data, or ensure your gamers continue to chat. Riak KV provides fast, complex query support to meet your most demanding customer and application requirements.
Make real-time decisions
Your application must provide a variety of complex queries and real-time analysis to keep your users engaged. Riak KV supports complex queries so you can search and retrieve your data almost instantly to enable real-time responses.
Increase performance and scale
Scaling your application requires not only the ability to add capacity but also a scalable query engine. Riak KV enables your searches and queries to scale as your data grows.
Improve customer experiences
Your customers won’t wait. They expect your application to respond quickly. Riak KV provides fast queries to ensure a seamless customer experience.