HyperDex: A Distributed, Searchable Key-Value Store

kyann

Découvre YouScribe en t'inscrivant gratuitement

Je m'inscris

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

12 pages

English

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

A propos
Informations
Extrait

Description

Sujets

HyperDex: ADistributed,SearchableKey-ValueStore
RobertEscriva BernardWong EminGünSirer
ComputerScience CheritonSchoolofComputer ComputerScience
Department Science Department
CornellUniversity UniversityofWaterloo CornellUniversity
escriva@cs.cornell.edu bernard@uwaterloo.ca egs@systems.cs.cornell.edu
ABSTRACT data retrieval API is narrow and restrictive, permitting an
object to be retrieved using only the key under which itDistributed key-value stores are now a standard component
was stored, and the consistency guarantees are often quiteof high-performance web services and cloud computing ap-
weak. Queries based on secondary attributes are either notplications. While key-value stores oer signi cant perfor-
supported, utilize costly secondary indexing schemes or enu-mance and scalability advantages compared to traditional
merate all objects of a given type.databases, they achieve these properties through a restricted
This paper introduces HyperDex, a high-performance, scal-API that limits object retrieval|an object can only be re-
able, consistent and distributed key-value store that providestrieved by the (primary and only) key under which it was
a new search primitive for retrieving objects by secondaryinserted. This paper presents HyperDex, a novel distributed
attributes. HyperDex achieves this extended functionalitykey-value store that provides a unique search primitive that
by organizing its data using a novel technique called hyper-enables queries on secondary attributes. The key insight
space hashing. Similar to other hashing techniques [23, 29,behind HyperDex is the concept of hyperspace hashing in
36,47], hyperspace hashing deterministically maps objects towhich objects with multiple attributes are mapped into a
servers to enable e cient object insertion and retrieval. Butmultidimensional hyperspace. This mapping leads to e -
it di ers from these techniques because it takes into accountcient implementations not only for retrieval by primary key,
the secondary attributes of an object when determining thebut also for partially-speci ed secondary attribute searches
mapping for an object. Specically, it maps objects to co-and range queries. A novel chaining protocol enables the
ordinates in a multi-dimensional Euclidean space { a hyper-system to achieve strong consistency, maintain availability
space { which has axes de ned by the objects’ attributes.and guarantee fault tolerance. An evaluation of the full sys-
Each server in the system is mapped onto a region of thetem shows that HyperDex is 12-13 faster than Cassandra
same hyperspace, and owns the objects that fall within itsand MongoDB for nding partially specied objects. Ad-
region. Clients use this mapping to deterministically insert,ditionally, HyperDex achieves 2-4 higher throughput for
remove, and search for objects.get/put operations.
Hyperspace hashing facilitates e cient search by signi-
Categories and Subject Descriptors cantly reducing the number of servers to contact for each
partially-speci ed search. The construction of the hyper-D.4.7 [Operating Systems]: Organization and Design
space mapping guarantees that objects with matching at-
Keywords tribute values will reside on the same server. Through geo-
metric reasoning, clients can restrict the search space for aKey-Value Store, NoSQL, Fault-Tolerance, Strong Consis-
partially-speci ed query to a subset of servers in the system,tency, Performance
thereby improving search e ciency. Speci city in searches
works to the clients’ advantage: a fully-specied search con-1. INTRODUCTION
tacts exactly one server.
Modern distributed applications are reshaping the land-
A naive hyperspace construction, however, may suer from
scape of storage systems. Recently emerging distributed
a well-known problem with multi-attribute data known as
key-value stores such as BigTable [11], Cassandra [32] and
\curse of dimensionality [6]." With each additional secondary
Dynamo [19] form the backbone of large commercial appli-
attribute, the hyperspace increases in volume exponentially.
cations because they oer scalability and availability prop-
If constructed in this fashion, each server would be respon-
erties that traditional database systems simply cannot pro-
sible for a large volume of the resulting hyperspace, which
vide. Yet these properties come at a substantial cost: the
would in turn force search operations to contact a large num-
ber of servers, counteracting the bene ts of hyperspace hash-
ing. HyperDex addresses this problem by partitioning the
data into smaller, limited size subspaces of fewer dimensions.Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are Failures are inevitable in all large-scale deployments. The
not made or distributed for proﬁt or commercial advantage and that copies standard approaches for providing fault tolerance store ob-
bear this notice and the full citation on the ﬁrst page. To copy otherwise, to jects on a xed set of replicas determined by a primary
republish, to post on servers or to redistribute to lists, requires prior speciﬁc
key. These techniques, whether they employ a consensus
permission and/or a fee.
algorithm among the replicas and provide strong consis-SIGCOMM’12, August 13–17, 2012, Helsinki, Finland.
Copyright 2012 ACM 978-1-4503-1419-0/12/08 ...$15.00.tency [18,46], or spray the updates to the replicas and only tables. This organization permits straightforward migration
achieve eventual consistency [19,32,45,49], assume that the from existing key-value stores and database systems.
replica sets remain xed even as the objects are updated. HyperDex provides a rich API that supports a variety of
Such techniques are not immediately suitable in our setting datastructures and a wide range of operations. The system
because, in hyperspace hashing, object attributes determine natively supports primitive types, such asstrings,integers
the set of servers on which an object resides, and conse- and floats, as well as composite types, such as lists, sets
quently, each update may implicitly alter the replica set. or maps constructed from primitive types. The dozens of op-
Providing strong consistency guarantees with low overhead erations that HyperDex provides on these datatypes fall into
is di cult, and more so when replica sets change dynami- three categories. First, basic operations, consisting of get,
cally and frequently. HyperDex utilizes a novel replication put, and delete, enable a user to retrieve, update, and de-
protocol called value-dependent chaining to simultaneously stroy an object identied by its key. Second, the search op-
achieve fault tolerance, high performance and strong con- eration enables a user to specify zero or more ranges for sec-
sistency. Value-dependent chaining replicates an object to ondary attributes and retrieve the objects whose attributes
withstand f faults (which may span server crashes and net- fall within the speci ed, potentially singleton, ranges. Fi-
work partitions) and ensures linearizability, even as replica nally, a large set of atomic operations, such as cas and
sets are updated. Thus, HyperDex’s replication protocol atomic-inc, enable applications to safely perform concur-
guarantees that all get operations will immediately see the rent updates on objects identi