An Integrated Database Benchmark Suite
Hoe Jin Jeong and Sang Ho Lee
School of Computing, Soongsil University, Seoul, Korea
bqangel@gmail.com and shlee@comp.ssu.ac.kr
Abstract. This paper presents an integrated database benchmark suite named SIMS. SIMS offers generic benchmarks, custom benchmarks, and hybrid benchmarks to users on a unified Web interface. Users can run benchmarks in realistic environments with the workload generation facility of SIMS, which generates composite workloads similar to those of the real world. Users can also easily implement new custom and generic benchmarks in SIMS. An illustrative demonstration of adding a new custom benchmark to SIMS is presented.
1 Introduction
As new database systems are developed or new functions are added to existing database systems, developers and users want to evaluate the new systems or functions systematically in different environments. A number of database benchmarks [1][3][4][5][9][10][11], which are indispensable for evaluating the performance of a database system, have accordingly been proposed in the literature.

Database benchmarks can be classified into two categories: generic benchmarks and custom benchmarks [7]. A generic benchmark is created to represent a commonly perceived paradigm of use in an application domain. A custom benchmark is created by a particular customer on the basis of a specific application. There is also an approach that creates a custom benchmark by starting from the framework of a generic benchmark and modifying or adding tests to reflect the specific needs of users; the result is called a hybrid benchmark.
To get the best results, all kinds of benchmarks are usually executed with no applications running simultaneously. This benchmark environment is much different from real-world environments, in which a number of applications are running simultaneously. Benchmark results obtained in such unrealistic environments are unlikely to represent a meaningful performance yardstick of database systems for acquisition decisions. End users often suggest that database benchmarks should be run in realistic environments to gain meaningful performance results.
This paper describes an integrated database benchmark suite, the Soongsil Integrated Benchmark Suite (SIMS), which provides a convenient environment for running generic benchmarks, custom benchmarks, and hybrid benchmarks together, along with a facility for realistic workload generation. SIMS offers a number of generic benchmarks, a framework for developing custom benchmarks, and hybrid benchmarks for database systems on a unified Web user interface. SIMS is developed to meet the benchmark requirements of both system developers and end users: system developers generally want to execute generic benchmarks to learn the relative performance of database systems, while end users want to know the performance of specific functions of database systems. The workload generation facility of SIMS provides users with realistic benchmark environments. In terms of data generation, SIMS can generate not only text data but also XML data for database benchmarks.
The remainder of this paper is organized as follows. Section 2 introduces the design philosophy of SIMS. Section 3 describes the modules and the parameter files of SIMS, as well as the text and XML data generation facilities and the workload generation facility. Section 4 shows how to implement a new custom benchmark in SIMS with ease, and the procedure for executing a benchmark in realistic environments with the workload generation facility. Finally, Section 5 contains closing remarks.
2 Design Philosophy
The objective of SIMS is to provide users with a benchmark environment that helps evaluate the performance of database systems in various ways. Recognizing the difficulties most users face when benchmarking database systems, we set out to build a framework that helps users effectively meet a variety of benchmark requirements. SIMS allows users to execute all kinds of benchmarks (i.e., generic benchmarks, custom benchmarks, and hybrid benchmarks) through a unified user interface.
One important consideration in the design phase of SIMS was extensibility. An extensible benchmark suite should allow users to implement any benchmark easily. Our suite is extensible in that, without changing the underlying architecture of SIMS, users can (1) add new generic benchmarks to SIMS, (2) extend the pre-specified test queries of generic benchmarks to model new features of database systems, and (3) add new custom benchmarks to SIMS.
Any benchmark has, by nature, a limited ability to simulate real-world environments, and the results of such benchmarks are likely to be far from what users see in practical situations. We designed SIMS to simulate real-world environments as much as possible; providing realistic test environments is important in many respects. To give users realistic benchmark environments, we developed a workload generation facility that puts a synthesized workload on the system under test. Workload generation can be performed in two ways [2]. One is an analytic approach, which uses mathematical models to simulate the behaviors of users or the characteristics of specific workloads. The other is a trace-based approach, which collects data on resource status and generates workloads based on the collected information. In the analytic approach, only a limited number of mathematical distributions can be applied for workload generation. The analytic approach may be unsuitable for generating realistic workloads, because real-world workloads are often too complicated to be simulated by mathematical models and are subject to dramatic changes in unexpected ways. Hence, we adopt a trace-based approach to generate workloads in SIMS.
3 An Integrated Database Benchmark Suite

3.1 Modules and Parameter Files
SIMS consists of five independent modules: the unified interface module, the database creation module, the data generation module, the query execution module, and the workload generation module. Each module has exactly one parameter file and is implemented as a single process. Figure 1 illustrates the modules and parameter files of SIMS. Each parameter file is built from three components (sections, entries, and values) and must be written in a pre-determined format.
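As a concrete illustration, a parameter file in this section/entry/value form might look as follows (the paper does not give the exact syntax, so the INI-style layout and the names here are assumptions):

  [Data Generation]          ; a section
  Output Type = integer      ; an entry and its value
  Distribution = uniform

Since each module reads exactly one such file, a benchmark run is described by the set of these files.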
Figure 1. Modules and parameters of SIMS
SIMS currently supports four generic benchmarks (the Wisconsin benchmark, the Set Query benchmark, TPC-C, and the BORD benchmark), and it offers four custom benchmarks that evaluate primitive database functions in an intensive manner: full table scan, index scan, projection, and join. We completely separate data generation from query specification, so that any combination of data generation and query specification is possible in SIMS. This modular organization allows us to implement the most important feature of SIMS, extensibility, which implies that additional custom benchmarks can be added to SIMS at the users' disposal without changing the underlying structure. Other generic benchmarks can also be added to SIMS. With SIMS, hybrid benchmarks can be carried out by executing the test queries of a custom benchmark under the schema of one of the generic benchmarks available in SIMS.
3.2 Data Generation
This section describes the data generation module, the component of SIMS with which users can generate a large volume of text data as well as XML data. SIMS supports a total of eight distributions over three output data types (integer, float, and character string): ordinal, uniform, random, constant, Poisson, normal, negative exponential, and Zipfian. We support the normal, negative exponential, Poisson, and Zipfian distributions by resorting to the algorithms in [6].
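As a rough sketch of what such a generator involves (this is not the SIMS implementation; the simple samplers below stand in for the algorithms of [6], and all parameters are illustrative), the distributions can be produced with nothing more than a standard library:

import math
import random

def poisson(lam):
    # Knuth's method: multiply uniform draws until the product
    # falls below e^(-lambda).
    limit, k, prod = math.exp(-lam), 0, random.random()
    while prod > limit:
        k += 1
        prod *= random.random()
    return k

def zipf(n, theta=0.86):
    # Invert the Zipfian CDF over 1..n by a linear scan; a simple
    # stand-in for the faster method of Gray et al. [6].
    norm = sum(1.0 / k ** theta for k in range(1, n + 1))
    u, acc = random.random(), 0.0
    for k in range(1, n + 1):
        acc += (1.0 / k ** theta) / norm
        if u <= acc:
            return k
    return n

# One column's worth of sample values per distribution.
column = {
    "ordinal": list(range(1, 6)),
    "uniform": [random.randint(1, 100) for _ in range(5)],
    "normal":  [random.normalvariate(50.0, 10.0) for _ in range(5)],
    "neg_exp": [random.expovariate(1.0 / 50.0) for _ in range(5)],
    "poisson": [poisson(5.0) for _ in range(5)],
    "zipfian": [zipf(100) for _ in range(5)],
}
random.shuffle(column["ordinal"])   # the shuffling step described below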
SIMS supports a shuffling function that reorders the generated data; every time the shuffling function is called, SIMS guarantees that the resulting order of the data is totally different. This feature helps make the generated data look realistic. In addition to synthesized data, SIMS can also generate real-world data, including resident registration numbers (similar to social security numbers in the United States), phone numbers, and zip codes.
The data generation module of SIMS offers users three different methods to generate XML data. The first method uses a text file, which can be obtained automatically (for example, from a database system that can export data, or from a text data generator), together with a data structure definition file. The second method uses a text file and a parameter file; the parameter file holds elementary information about the data structures to be generated. Users can use the third method to generate XML data when they have neither text files nor data structure files. A detailed description of XML data generation is omitted for reasons of space.
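A minimal sketch of the first method follows; the paper does not give the file formats, so the comma-separated input, the tag names, and the helper below are all hypothetical:

import csv
import xml.etree.ElementTree as ET

def text_to_xml(text_path, root_tag, record_tag, column_names):
    # Wrap each line of a delimited text file in one XML record,
    # with one child element per column.
    root = ET.Element(root_tag)
    with open(text_path, newline="") as f:
        for row in csv.reader(f):
            record = ET.SubElement(root, record_tag)
            for name, value in zip(column_names, row):
                ET.SubElement(record, name).text = value
    return ET.ElementTree(root)

# Hypothetical usage: turn a text export of "tableB" into XML.
tree = text_to_xml("tableB.txt", "tableB", "record",
                   ["col1B", "col2B", "chpadB"])
tree.write("tableB.xml", encoding="utf-8", xml_declaration=True)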
3.3 Workload Generation
SIMS generates memory-bound, I/O-bound, and CPU-bound workloads. The workloads can be generated independently or in combination by forking user processes that directly consume the resources of the operating system. It is difficult for a user process to control operating system resources precisely; hence, we use the notion of a tolerance margin, which represents the gap between the user's intended workload and the actual generated workload.
Figure 2. Memory-bound workload generation
As shown in Figure 2, the workload generation module generates the memory-bound workload by causing the operating system to exchange data between physical memory and swap space under the virtual memory scheme. The workload generation module forks a number of processes that compete against each other to secure space in the physical memory that the operating system manages. Users can express the memory-bound workload in terms of the "iowait" value shown by system commands (for example, "top" or "iostat"). The control process tries to achieve a workload as close as possible to the requested one by adjusting the number of processes, the amount of memory each process requests, and the sleep time of the processes. Given the user's requested workload, we are able to generate the actual workload within a 5% tolerance margin. See [8] for a detailed description.
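The following sketch shows the gist of this scheme (assumptions: the process count, block size, and sleep time are illustrative constants, whereas the real controller of [8] adjusts them at run time to stay within the tolerance margin):

import multiprocessing
import time

def memory_hog(megabytes, sleep_s):
    # Hold a large block of memory and keep touching one byte per
    # page, so that competing processes force the operating system
    # to move pages between physical memory and swap.
    block = bytearray(megabytes * 1024 * 1024)
    while True:
        for i in range(0, len(block), 4096):
            block[i] = (block[i] + 1) % 256
        time.sleep(sleep_s)

if __name__ == "__main__":
    # Fork several competing processes; a controller would watch the
    # resulting "iowait" value and tune the parameters until the
    # observed workload matches the requested one.
    workers = [multiprocessing.Process(target=memory_hog, args=(256, 0.1))
               for _ in range(4)]
    for w in workers:
        w.start()
    time.sleep(60)           # run the workload for the test window
    for w in workers:
        w.terminate()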
4 Illustrative Demonstration
In this section, we describe how easily a new custom benchmark can be added to SIMS and how the benchmark can be executed in realistic environments with the workload generation facility. The new custom benchmark we want to add evaluates join operations and contains 42 queries; it is different from the aforementioned built-in custom benchmark that evaluates primitive join operations in an intensive manner. To add the new custom benchmark to SIMS, users create five parameter files.
Figure 3 shows how to create the data generation parameter file, one of the five parameter files. There are three tables in the "Tables" section, and information on the "tableB" table is currently shown. The "Number of Records" entry shows that the number of records of the "tableB" table is set to 208,000. The "Columns" section shows that the "tableB" table has three columns ("col1B", "col2B", and "chpadB") and that the "col2B" column has an integer data type, a uniform distribution, no shuffling, and no null values. The "Minimum Number" and "Maximum Number" entries are used only for the uniform distribution.
Figure 3. Data generation parameter file
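In the section/entry/value form described in Section 3.1, the part of this file that the prose above walks through could look roughly like this (the exact syntax and the uniform bounds are assumptions):

  [Tables]                      ; three tables; tableB currently selected
  Number of Tables = 3
  Number of Records = 208000    ; for tableB

  [Columns]                     ; the three columns of tableB
  Column Names = col1B, col2B, chpadB
  Data Type = integer           ; the entries below describe col2B
  Distribution = uniform
  Shuffling = no
  Null Value = no
  Minimum Number = 1            ; bounds are illustrative; used only
  Maximum Number = 208000       ; for the uniform distribution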
After completing the five parameter files through the Web user interface, users can start the custom benchmark by clicking the "Run" button in Figure 4, which shows the main Web page of SIMS. The main page contains links to the parameter files; the five parameter files we created are listed on the last line of Figure 4.
Figure 4. Main Web page of SIMS
We used a Sun Enterprise E3500 server and a commercial database system to execute the new custom benchmark. Figure 5 shows the elapsed times of the new custom benchmark's queries under two workloads: the real TPC-C workload and a synthesized workload similar to the TPC-C workload. The shapes of the two graphs in Figure 5 resemble each other, which indicates that SIMS generated a workload close to the real TPC-C workload. In summary, it is easy to add a new custom benchmark to SIMS, and it is possible to execute the benchmark in realistic environments with the workload generation facility of SIMS.
Figure 5. Results of the new custom benchmark
5 Conclusions
This paper describes an integrated database benchmark suite, SIMS, which offers generic benchmarks, custom benchmarks, and hybrid benchmarks to users in a unified Web interface. The extensibility of SIMS allows new benchmarks to be added easily. SIMS supports both data generation and workload generation facilities.
The TPC-C workload used in the demonstration above is the workload of a frequently used de facto standard benchmark, which evaluates a database system or a server system in a real-world setting; hence, we used it as a representative real-world workload. We executed many internal experiments to show that SIMS can generate the workloads users want to produce, and the experimental results confirmed that SIMS can generate such workloads, including the TPC-C workload.
SIMS was developed over several years and implemented with two commercial database systems. We initially started developing SIMS to fulfill the requirements of a local start-up company that develops and markets an object-relational database system, and the requirements for SIMS typically came from local database communities. In particular, the workload generation facility is a viable attempt to reflect the practical needs of vendors. SIMS is currently in use at a local database company. An extension of SIMS for multi-user environments is recommended as future work.
Acknowledgements. This work was supported by Korea Research Foundation Grant (KRF-2004-005-D00172).
References
1. Asgarian, M., Carey, M.J., DeWitt, D.J., Gehrke, J., Naughton, J.F., Shah, D.N.: The BUCKY Object-Relational Benchmark. Proceedings of the 1997 ACM SIGMOD Conference on Management of Data (1997) 135-146
2. Barford, P., Crovella, M.: Generating Representative Web Workloads for Network and Server Performance Evaluation. Proceedings of the 1998 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (1998) 151-160
3. Carey, M.J., DeWitt, D.J., Naughton, J.F.: The OO7 Benchmark. Proceedings of the 1993 ACM SIGMOD Conference on Management of Data (1993) 12-21
4. Cattell, R.G.G., Skeen, J.: Object Operations Benchmark. ACM Transactions on Database Systems, Vol. 17, No. 1 (1992) 1-31
5. DeWitt, D.J.: The Wisconsin Benchmark: Past, Present, and Future. In: Gray, J. (ed.): The Benchmark Handbook for Database and Transaction Processing Systems, 2nd Ed. Morgan Kaufmann (1993) 269-316
6. Gray, J., Sundaresan, P., Englert, S., Baclawski, K., Weinberger, P.J.: Quickly Generating Billion-Record Synthetic Databases. Proceedings of the 1994 ACM SIGMOD Conference on Management of Data (1994) 233-242
7. Hohenstein, U., Plesser, V., Heller, R.: Evaluating the Performance of Object-Oriented Database Systems by Means of a Concrete Application. Proceedings of the 8th Database and Expert Systems Applications Workshop (1997) 496-501
8. Jeong, H.J., Lee, S.H.: A Workload Generator for Database System Benchmarks. Proceedings of the 7th International Conference on Information Integration and Web-based Applications & Services (2005) 813-822
9. Lee, S.H., Kim, S.J., Kim, W.: The BORD Benchmark for Object-Relational Database. Proceedings of the 11th International Conference on Database and Expert Systems Applications (2000) 6-20
10. O'Neil, P.E.: The Set Query Benchmark. In: Gray, J. (ed.): The Benchmark Handbook for Database and Transaction Processing Systems, 2nd Ed. Morgan Kaufmann (1993) 359-395
11. The TPC home page, http://www.tpc.org/