An Integrated Database Benchmark Suite
Hoe Jin Jeong and Sang Ho Lee
School of Computing, Soongsil University, Seoul, Korea
bqangel@gmail.com and shlee@comp.ssu.ac.kr
Abstract. This paper presents an integrated database benchmark suite named SIMS. SIMS offers generic benchmarks, custom benchmarks, and hybrid benchmarks to users on a unified Web interface. Users can run benchmarks in realistic environments with the workload generation facility of SIMS, which generates composite workloads similar to those of the real world. Users can also easily implement new custom and generic benchmarks in SIMS. An illustrative demonstration of adding a new custom benchmark to SIMS is presented.
1 Introduction
As new database systems are developed or new functions are added to existing database systems, developers and users want to evaluate the new systems or functions systematically in different environments. A number of database benchmarks [1][3][4][5][9][10][11], which are indispensable for evaluating the performance of a database system, have accordingly been proposed in the literature.

Database benchmarks can be classified into two categories: generic benchmarks and custom benchmarks [7]. A generic benchmark is created to represent a commonly perceived paradigm of use in an application domain. A custom benchmark is created by a particular customer on the basis of a specific application. There is also an approach that creates a custom benchmark by starting from the framework of a generic benchmark and modifying or adding tests to reflect the specific needs of users; the result is called a hybrid benchmark.
To get the best results, all kinds of benchmarks are usually executed with no applications running simultaneously. This benchmark environment is much different from real-world environments, in which a number of applications are running simultaneously. Benchmark results obtained in such unrealistic environments are unlikely to represent a meaningful performance yardstick of database systems for acquisition decisions. End users often suggest that database benchmarks should be run in realistic environments to gain meaningful performance results.
This paper describes an integrated database benchmark suite, the Soongsil Integrated Benchmark Suite (SIMS), which provides a convenient environment for running generic benchmarks, custom benchmarks, and hybrid benchmarks together, along with a facility for realistic workload generation. SIMS offers a number of generic benchmarks, a framework for developing custom benchmarks, and hybrid benchmarks for database systems on a unified Web user interface. SIMS is developed to meet the benchmark requirements of both system developers and end users: system developers generally want to execute generic benchmarks to learn the relative performance of database systems, while end users want to know the performance of specific functions of database systems. The workload generation facility of SIMS provides users with realistic benchmark environments. In terms of data generation, SIMS can generate not only text data but also XML data for database benchmarks.
The remainder of this paper is organized as follows. Section 2 introduces the design philosophy of SIMS. Section 3 describes the modules and the parameter files of SIMS, as well as the text and XML data generation facilities and the workload generation facility. Section 4 shows how to implement a new custom benchmark in SIMS with ease, and the procedure for executing a benchmark in realistic environments with the workload generation facility. Finally, Section 5 contains closing remarks.
2 Design Philosophy
The objective of SIMS is to provide users with a benchmark environment that helps evaluate the performance of database systems in various ways. Recognizing the difficulties most users face when benchmarking database systems, we set out to build a framework that helps users effectively meet a variety of benchmark requirements. SIMS allows users to execute all kinds of benchmarks (i.e., generic benchmarks, custom benchmarks, and hybrid benchmarks) through a unified user interface.
One important consideration in the design phase of SIMS was extensibility. An extensible benchmark suite should allow users to implement any benchmark easily. Our suite is extensible in that, without changing the underlying architecture of SIMS, users can (1) add new generic benchmarks to SIMS, (2) extend the pre-specified test queries of generic benchmarks to model new features of database systems, and (3) add new custom benchmarks to SIMS.
Any benchmark has, by nature, a limited ability to simulate real-world environments, and the results of such benchmarks are likely to be far from what users see in practical situations. We designed SIMS to simulate real-world environments as much as possible; providing realistic test environments is important in many respects. To give users realistic benchmark environments, we developed a workload generation facility that puts a synthesized workload on the system under test. Workload generation can be performed in two ways [2]. One is an analytic approach, which uses mathematical models to simulate the behaviors of users or the characteristics of specific workloads. The other is a trace-based approach, which collects data on resource status and generates workloads based on the collected information. In the analytic approach, only a limited number of mathematical distributions can be applied for workload generation. The analytic approach may be unsuitable for generating realistic workloads, because real-world workloads are often too complicated to be simulated by mathematical models and are subject to dramatic changes in unexpected ways. Hence, we adopt a trace-based approach to generate workloads in SIMS.
3 An Integrated Database Benchmark Suite

3.1 Modules and Parameter Files
SIMS consists of five independent modules: the unified interface module, the database creation module, the data generation module, the query execution module, and the workload generation module. Each module has exactly one parameter file and is implemented as a single process. Figure 1 illustrates the modules and parameter files of SIMS. Each parameter file is built from three components (sections, entries, and values) and must be written in a pre-determined format.
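As a concrete illustration, a parameter file in this section/entry/value form might look as follows (the paper does not give the exact syntax, so the INI-style layout and the names here are assumptions):

  [Data Generation]          ; a section
  Output Type = integer      ; an entry and its value
  Distribution = uniform

Since each module reads exactly one such file, a benchmark run is described by the set of these files.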
Figure 1. Modules and parameters of SIMS
SIMS currently supports four generic benchmarks (the Wisconsin benchmark, the Set Query benchmark, TPC-C, and the BORD benchmark), and it offers four custom benchmarks that evaluate primitive database functions in an intensive manner: full table scan, index scan, projection, and join. We completely separate data generation from query specification, so that any combination of data generation and query specification is possible in SIMS. This modular organization allows us to implement the most important feature of SIMS, extensibility, which implies that additional custom benchmarks can be added to SIMS at the users' disposal without changing the underlying structure. Other generic benchmarks can also be added to SIMS. With SIMS, hybrid benchmarks can be carried out by executing the test queries of a custom benchmark under the schema of one of the generic benchmarks available in SIMS.
3.2 Data Generation
This section describes the data generation module, the component of SIMS with which users can generate a large volume of text data as well as XML data. SIMS supports a total of eight distributions over three output data types (integer, float, and character string): ordinal, uniform, random, constant, Poisson, normal, negative exponential, and Zipfian. We support the normal, negative exponential, Poisson, and Zipfian distributions by resorting to the algorithms in [6].
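As a rough sketch of what such a generator involves (this is not the SIMS implementation; the simple samplers below stand in for the algorithms of [6], and all parameters are illustrative), the distributions can be produced with nothing more than a standard library:

import math
import random

def poisson(lam):
    # Knuth's method: multiply uniform draws until the product
    # falls below e^(-lambda).
    limit, k, prod = math.exp(-lam), 0, random.random()
    while prod > limit:
        k += 1
        prod *= random.random()
    return k

def zipf(n, theta=0.86):
    # Invert the Zipfian CDF over 1..n by a linear scan; a simple
    # stand-in for the faster method of Gray et al. [6].
    norm = sum(1.0 / k ** theta for k in range(1, n + 1))
    u, acc = random.random(), 0.0
    for k in range(1, n + 1):
        acc += (1.0 / k ** theta) / norm
        if u <= acc:
            return k
    return n

# One column's worth of sample values per distribution.
column = {
    "ordinal": list(range(1, 6)),
    "uniform": [random.randint(1, 100) for _ in range(5)],
    "normal":  [random.normalvariate(50.0, 10.0) for _ in range(5)],
    "neg_exp": [random.expovariate(1.0 / 50.0) for _ in range(5)],
    "poisson": [poisson(5.0) for _ in range(5)],
    "zipfian": [zipf(100) for _ in range(5)],
}
random.shuffle(column["ordinal"])   # the shuffling step described below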
SIMS supports a shuffling function that reorders the generated data; every time the shuffling function is called, SIMS guarantees that the resulting order of the data is totally different. This feature helps make the generated data look realistic. In addition to synthesized data, SIMS can also generate real-world data, including resident registration numbers (similar to social security numbers in the United States), phone numbers, and zip codes.
The data generation module of SIMS offers users three different methods to generate XML data. The first method uses a text file, which can be obtained automatically (for example, from a database system that can export data, or from a text data generator), together with a data structure definition file. The second method uses a text file and a parameter file; the parameter file holds elementary information about the data structures to be generated. Users can use the third method to generate XML data when they have neither text files nor data structure files. A detailed description of XML data generation is omitted for reasons of space.
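A minimal sketch of the first method follows; the paper does not give the file formats, so the comma-separated input, the tag names, and the helper below are all hypothetical:

import csv
import xml.etree.ElementTree as ET

def text_to_xml(text_path, root_tag, record_tag, column_names):
    # Wrap each line of a delimited text file in one XML record,
    # with one child element per column.
    root = ET.Element(root_tag)
    with open(text_path, newline="") as f:
        for row in csv.reader(f):
            record = ET.SubElement(root, record_tag)
            for name, value in zip(column_names, row):
                ET.SubElement(record, name).text = value
    return ET.ElementTree(root)

# Hypothetical usage: turn a text export of "tableB" into XML.
tree = text_to_xml("tableB.txt", "tableB", "record",
                   ["col1B", "col2B", "chpadB"])
tree.write("tableB.xml", encoding="utf-8", xml_declaration=True)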
3.3 Workload Generation
SIMS generates memory-bound, I/O-bound, and CPU-bound workloads. The workloads can be generated independently or in combination by forking user processes that directly consume the resources of the operating system. It is difficult for a user process to control operating system resources precisely; hence, we use the notion of a tolerance margin, which represents the gap between the user's intended workload and the actual generated workload.
Figure 2. Memory-bound workload generation
As shown in Figure 2, the workload generation module generates the memory-bound workload by causing the operating system to exchange data between physical memory and swap space under the virtual memory scheme. The workload generation module forks a number of processes that compete against each other to secure space in the physical memory that the operating system manages. Users can express the memory-bound workload in terms of the "iowait" value shown by system commands (for example, "top" or "iostat"). The control process tries to achieve a workload as close as possible to the requested one by adjusting the number of processes, the amount of memory each process requests, and the sleep time of the processes. Given the user's requested workload, we are able to generate the actual workload within a 5% tolerance margin. See [8] for a detailed description.
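The following sketch shows the gist of this scheme (assumptions: the process count, block size, and sleep time are illustrative constants, whereas the real controller of [8] adjusts them at run time to stay within the tolerance margin):

import multiprocessing
import time

def memory_hog(megabytes, sleep_s):
    # Hold a large block of memory and keep touching one byte per
    # page, so that competing processes force the operating system
    # to move pages between physical memory and swap.
    block = bytearray(megabytes * 1024 * 1024)
    while True:
        for i in range(0, len(block), 4096):
            block[i] = (block[i] + 1) % 256
        time.sleep(sleep_s)

if __name__ == "__main__":
    # Fork several competing processes; a controller would watch the
    # resulting "iowait" value and tune the parameters until the
    # observed workload matches the requested one.
    workers = [multiprocessing.Process(target=memory_hog, args=(256, 0.1))
               for _ in range(4)]
    for w in workers:
        w.start()
    time.sleep(60)           # run the workload for the test window
    for w in workers:
        w.terminate()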
4 Illustrative Demonstration
In this section, we describe how easily a new custom benchmark can be added to SIMS and how the benchmark can be executed in realistic environments with the workload generation facility. The new custom benchmark we want to add evaluates join operations and contains 42 queries; it is different from the aforementioned built-in custom benchmark that evaluates primitive join operations in an intensive manner. To add the new custom benchmark to SIMS, users create five parameter files.
Figure 3 shows how to create the data generation parameter file, one of the five parameter files. There are three tables in the "Tables" section, and information on the "tableB" table is currently shown. The "Number of Records" entry shows that the number of records of the "tableB" table is set to 208,000. The "Columns" section shows that the "tableB" table has three columns ("col1B", "col2B", and "chpadB") and that the "col2B" column has an integer data type, a uniform distribution, no shuffling, and no null values. The "Minimum Number" and "Maximum Number" entries are used only for the uniform distribution.
Figure 3. Data generation parameter file
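In the section/entry/value form described in Section 3.1, the part of this file that the prose above walks through could look roughly like this (the exact syntax and the uniform bounds are assumptions):

  [Tables]                      ; three tables; tableB currently selected
  Number of Tables = 3
  Number of Records = 208000    ; for tableB

  [Columns]                     ; the three columns of tableB
  Column Names = col1B, col2B, chpadB
  Data Type = integer           ; the entries below describe col2B
  Distribution = uniform
  Shuffling = no
  Null Value = no
  Minimum Number = 1            ; bounds are illustrative; used only
  Maximum Number = 208000       ; for the uniform distribution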
After completing the five parameter files through the Web user interface, users can start the custom benchmark by clicking the "Run" button in Figure 4, which shows the main Web page of SIMS. The main page contains links to the parameter files; the five parameter files we created are listed on the last line of Figure 4.
Figure 4. Main Web page of SIMS
We used a Sun Enterprise E3500 server and a commercial database system to execute the new custom benchmark. Figure 5 shows the elapsed times of the new custom benchmark's queries under two workloads: the real TPC-C workload and a synthesized workload similar to the TPC-C workload. The shapes of the two graphs in Figure 5 resemble each other, which indicates that SIMS generated a workload close to the real TPC-C workload. In summary, it is easy to add a new custom benchmark to SIMS, and it is possible to execute the benchmark in realistic environments with the workload generation facility of SIMS.
Figure 5. Results of the new custom benchmark
5 Conclusions
This paper describes an integrated database benchmark suite, SIMS, which offers generic benchmarks, custom benchmarks, and hybrid benchmarks to users in a unified Web interface. The extensibility of SIMS allows new benchmarks to be added easily. SIMS supports both data generation and workload generation facilities.
The TPC-C workload used in the demonstration above is the workload of a frequently used de facto standard benchmark, which evaluates a database system or a server system in a real-world setting; hence, we used it as a representative real-world workload. We executed many internal experiments to show that SIMS can generate the workloads users want to produce, and the experimental results confirmed that SIMS can generate such workloads, including the TPC-C workload.
SIMS was developed over several years and implemented with two commercial database systems. We initially started developing SIMS to fulfill the requirements of a local start-up company that develops and markets an object-relational database system, and the requirements for SIMS typically came from local database communities. In particular, the workload generation facility is a viable attempt to reflect the practical needs of vendors. SIMS is currently in use at a local database company. An extension of SIMS for multi-user environments is recommended as future work.
Acknowledgements. This work was supported by Korea Research Foundation Grant (KRF-2004-005-D00172).
References
1. Asgarian, M., Carey, M.J., DeWitt, D.J., Gehrke, J., Naughton, J.F., Shah, D.N.: The BUCKY Object-Relational Benchmark. Proceedings of the 1997 ACM SIGMOD Conference on Management of Data (1997) 135-146
2. Barford, P., Crovella, M.: Generating Representative Web Workloads for Network and Server Performance Evaluation. Proceedings of the 1998 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (1998) 151-160
3. Carey, M.J., DeWitt, D.J., Naughton, J.F.: The OO7 Benchmark. Proceedings of the 1993 ACM SIGMOD Conference on Management of Data (1993) 12-21
4. Cattell, R.G.G., Skeen, J.: Object Operations Benchmark. ACM Transactions on Database Systems, Vol. 17, No. 1 (1992) 1-31
5. DeWitt, D.J.: The Wisconsin Benchmark: Past, Present, and Future. In: Gray, J. (ed.): The Benchmark Handbook for Database and Transaction Processing Systems, 2nd Ed. Morgan Kaufmann (1993) 269-316
6. Gray, J., Sundaresan, P., Englert, S., Baclawski, K., Weinberger, P.J.: Quickly Generating Billion-Record Synthetic Databases. Proceedings of the 1994 ACM SIGMOD Conference on Management of Data (1994) 233-242
7. Hohenstein, U., Plesser, V., Heller, R.: Evaluating the Performance of Object-Oriented Database Systems by Means of a Concrete Application. Proceedings of the 8th Database and Expert Systems Applications Workshop (1997) 496-501
8. Jeong, H.J., Lee, S.H.: A Workload Generator for Database System Benchmarks. Proceedings of the 7th International Conference on Information Integration and Web-based Applications & Services (2005) 813-822
9. Lee, S.H., Kim, S.J., Kim, W.: The BORD Benchmark for Object-Relational Database. Proceedings of the 11th International Conference on Database and Expert Systems Applications (2000) 6-20
10. O'Neil, P.E.: The Set Query Benchmark. In: Gray, J. (ed.): The Benchmark Handbook for Database and Transaction Processing Systems, 2nd Ed. Morgan Kaufmann (1993) 359-395
11. The TPC home page, http://www.tpc.org/