InnerEdge Server: Commodity Rack Mounted Server Benchmark
Whitepaper
Table of Contents
Introduction
Hardware and Software Summary
Continuous Availability Architecture – CA²
Benchmark Details
Benchmark Results
    Combined Replication Time – Linear Scalability
    Replication Throughput – Flat Response
    Connected Server versus Disconnected Workstation
Conclusions
About Progress Real Time Division

2 of 10 InnerEdge Server: Commodity Rack Mounted Server Benchmark
Introduction
The price difference between commodity rack mounted server hardware and large SMP
servers justifies alternate deployment strategies. This white paper explores the use of
rack mounted servers in a load-sharing environment to support complete database
applications. Benchmarks are presented to illustrate the performance of replication
operations in this environment and the relative cost of replication across a range of
database change. Replication costs are approximated as a percentage of overall time
for operations during a typical 8-hour work day.
Hardware and Software Summary
For this benchmark the server was not a typical database server with multiple CPUs and
large I/O bandwidth. Instead, desktop-PC-class hardware is demonstrated as an
effective DataXtend RE InnerEdge™ or OuterEdge™ Server. The software
configuration ran on Microsoft Windows 2000 Professional with Microsoft SQL Server
2000 Personal Edition. This software platform is considered appropriate for this
hardware. Limits in the operating system that impact I/O subsystem configuration and
memory utilization were not an issue with the hardware. Limits in the database software
on multi-processor utilization and memory use were also not issues for the configuration.
The database software ran effectively in a single-user multi-threaded mode sufficient for
a simple web application.
The specific hardware target for this benchmark is commodity rack-mounted servers
from vendors like IBM, HP, or Dell. Pictured below is a Dell 350-series server as an
example.


Uniprocessor 1GHz Intel PIII, 512MB RAM, 40GB 7200 RPM HD
Figure 1 - Front and Back of Dell 350-Series Server
Today's rack mounted servers are remarkable for the density that can be created within
limited floor space. In a typical rack up to 40 of these individual servers can be
configured to provide a complete infrastructure for a significant organization.
Most applications on this hardware use a three-tier architecture, with the web server in
tier 1, the application server in tier 2, and the database server in tier 3. Three major
configurations are possible. For small applications the three tiers can
be collapsed into a single configuration. For security reasons the web server may need to
be split off from the application and database servers, providing a second approach.
And to maximize hardware resources all three tiers can be split onto individual machines.
To scale up an application, multiple devices can be configured at each layer. For this
benchmark the application was collapsed into a single-tier configuration.
Continuous Availability Architecture – CA²
It is important to keep in mind that while the results are for a single server, continuously
available configurations can be built from commodity rack mount servers. With most
application deployment models, the database is restricted to a single layer in the hardware
deployment, and in fact a single device. Applications that deploy with commodity rack-
mounted hardware are often forced to place the database on a large server while the web
server and application server layers can be spread over the appropriate number of
devices. With bi-directional replication technology provided by a Progress®
DataXtend™ RE InnerEdge™ server, the database layer can be deployed on multiple
rack mounted servers and maintained as a synchronized single resource.
This approach differs from typical fail-over or cluster configurations which often require
expensive specialized software and hardware solutions. With InnerEdge servers,
scalability can be achieved at the database level using commodity hardware. Rather than
a complex fail-over scheme, the DataXtend RE Continuous Availability Architecture –
CA² – can be used to provide a scalable, reliable deployment. In a CA² environment
multiple database servers provide the same dataset for one or more applications and any
individual server can be off-line without impacting overall application availability. There
is no fail-over process or management and there is no recovery process or management
aside from repairing hardware.


CA² Characteristics
Fully Redundant Datasets
Synchronized Updates
No Fail-over Processing
No Fail-back Management
The DataXtend RE CA² approach is a marked contrast to typical fail-over schemes that
require significant operational decisions to be made during a failure event. Even more
significant is the difference after a failure has been resolved and fail-back is necessary.
Fail-back operations are often as significant from a business and management perspective
as the original failure. With CA², commodity hardware can be configured to provide
very low cost continuous operation environments.
In this benchmark we are limiting measurements to replication time and throughput on a
single server. The results of this benchmark apply not only to a single server, but can be
applied to a rack of DataXtend RE InnerEdge CA² Servers. Figure 2 illustrates an
example of this configuration. In a multi-machine configuration each server would need
to perform one additional replication pass, which would double overall
replication times regardless of the number of servers. Peak replication rates would
remain consistent.
Benchmark Details
To analyze performance of the hardware with varying workloads and database sizes a
benchmark was prepared with multiple databases that ranged in size from 100,000 to
5,000,000 records in the largest table (for discussion we use the terms 100K, 1M, 2M,
3M, and 5M to represent the databases containing these tables). The 5M row database
exceeded 2GB in total disk space use. Secondary tables were used with smaller numbers
of rows. All tables were indexed and all were modified as part of the transactions of the
benchmark. For each database size, the benchmark was designed to modify a varying
amount of data, from 1% to 20% randomly throughout the dataset. This change is called
a database delta. For most applications a 1% delta during a single day would satisfy real-
world workloads. For the most demanding applications, a 20% delta in the dataset was
modeled to provide an extreme measurement that might involve major bulk-load or batch
update processing.
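To make the delta sizes concrete, the following Python sketch computes the net number of changed records for each database size and delta percentage. It assumes, for simplicity, that the delta percentage applies to the row count of the largest table and that the benchmark used the five delta levels shown in the graphs. The function and variable names are illustrative and are not part of DataXtend RE.

```python
# Rows in the largest table of each benchmark database (from the text).
DATABASE_ROWS = {
    "100K": 100_000,
    "1M": 1_000_000,
    "2M": 2_000_000,
    "3M": 3_000_000,
    "5M": 5_000_000,
}
# Delta levels as they appear on the graph axes (assumed complete).
DELTA_PERCENTS = (1, 5, 10, 15, 20)


def modified_records(total_rows: int, delta_percent: int) -> int:
    """Net number of changed records for a given database delta."""
    return total_rows * delta_percent // 100


if __name__ == "__main__":
    for name, rows in DATABASE_ROWS.items():
        counts = [modified_records(rows, pct) for pct in DELTA_PERCENTS]
        print(name, counts)
```

For the most extreme case, a 20% delta on the 5M database, this gives 1,000,000 changed records.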
An individual delta is not equal to a single update to a record. A record can be updated
any number of times since it was last replicated, yet requires only a single delta to
represent the change. From this perspective the database delta we are working with is net
change to the overall dataset; the number of database operations executed to modify the
database is generally significantly higher. It is common practice in applications to update
records multiple times as part of a single business process. These updates may be part of
one or multiple transactions. Regardless of the number of transactions, or the number of
updates, if a record has been changed since it was last replicated it is only a single delta.
This concept is a critical difference between the DataXtend RE replication mechanism
and log-based replication schemes. In a log-based scheme, each individual update
operation would need to be replicated, dramatically increasing the number of individual
operations and the amount of data needed to synchronize the dataset.
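The difference between net-change deltas and log-based replication can be illustrated with a small Python sketch. The update stream and function names below are hypothetical, and the code does not reproduce the DataXtend RE mechanism; it only counts how many items each approach would need to replicate.

```python
def logged_operations(update_stream):
    """Log-based view: every individual update must be replicated."""
    return len(update_stream)


def net_deltas(update_stream):
    """Net-change view: one delta per record changed since the last replication."""
    return len(set(update_stream))


# Hypothetical record ids updated since the last replication session.
# Record 101 is updated three times and record 102 twice.
updates = [101, 102, 101, 103, 101, 102]

if __name__ == "__main__":
    print("log-based operations:", logged_operations(updates))
    print("net-change deltas:", net_deltas(updates))
```

In this example six update operations collapse into three deltas, because only three distinct records changed.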
The database delta methodology is designed to simulate the widest possible variety of
application types. Low deltas would be typical of Customer Relationship Management or
Field Force Automation applications, whereas high deltas would characterize
transaction-oriented applications. Low deltas are also associated with frequent replication sessions,
whereas high deltas are associated with infrequent replication sessions. While this
benchmark will give a customer insight into what can be done with commodity hardware,
the results using larger hardware platforms are not subject to the memory and disk
bandwidth limits of the commodity hardware tested. It is recommended that customers in
an evaluation process, or as part of sizing deployment hardware, always create an
application specific benchmark to more closely model the characteristics of replication
for that application. This is especially critical for large deployments of hundreds or
thousands of systems.
The database server configuration and tuning will have a significant impact on replication
times. For specific customer situations, replication time and rates will be highly
dependent on specific hardware and database configurations. In this benchmark, the
database was configured with basic tuning parameters, but no significant tuning was
attempted other than making sure tables were indexed and memory utilization was
configured to ensure that a significant dataset was in memory. The 5M row table could
not fit in database cache memory or in the operating system's file cache memory.
Benchmark Results
These results and observations are specific to the hardware and databases used in the
benchmarks. During the benchmark process, the system under test experienced a
significant workload during peak replication processing and was generally near or at
maximum CPU and memory utilization; for real world applications, workloads will vary
and will generally be less demanding. The workload on the underlying database software
is significant at peak replication rates and the database software itself was observed to be
the bottleneck.
Combined Replication Time – Linear Scalability
The first set of measurements was taken to determine combined replication time for a
given database delta. The total time to replicate and the maximum
sustained replication rates are important when sizing an environment. The total time to
replicate allows peak loads to be determined. Generally, an application would not change
a significant percentage of the overall database in a short period of time, but in the event
significant change does occur, it is important to understand the system behavior.
Graph 1 illustrates the combined replication time for varying database delta values.
Applications with small databases, where large tables contain 1 million rows or fewer, will
find that the replication process has minimal impact regardless of daily database change – these
applications do not stress memory and disk resources of the hardware. Larger databases
will stress the memory and disk subsystems on small hardware platforms. In the most
extreme case benchmarked in this test, 20% of a 5 million row table was modified
resulting in 1 million modified data records in the tables in the database.
Combined Replication Time
Vertical axis: Combined Time (Minutes). Horizontal axis: Database Change (Percent of Total Records).
Series: 100K, 1M, and 5M Total Records (the 100K series carries no data labels).
Delta       1%      5%     10%     15%     20%
1M        1.35    6.45   14.90   20.51   26.12
5M        7.42   34.92   79.28  109.01  138.75
Graph 1
It is important to note that as database delta increases, the combined replication time
increases at a linear rate. This characteristic of the DataXtend RE replication mechanism
– linear scalability - is illustrated in two ways. First, as the database delta increases for
individual databases (varying change within the same dataset) the combined replication
time remains linear.
Second, as database size increases (varying total number of rows) the combined
replication times remain linear – a 20% delta for 1 million rows takes 26 minutes and a 20%
delta for 5 million rows (5 times as much data) takes 139 minutes. A fivefold increase in
data incurs a roughly fivefold (about 5.3 times) increase in replication time.
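The linearity claim can be checked against the published numbers with a short Python sketch. It uses the data labels from Graph 1 and assumes the labels map to the delta percentages in increasing order; the names are illustrative only.

```python
# Combined replication times in minutes, read from the Graph 1 data labels.
# Assumption: labels are assigned to delta percentages in increasing order.
MINUTES_5M = {1: 7.42, 5: 34.92, 10: 79.28, 15: 109.01, 20: 138.75}
MINUTES_1M_AT_20 = 26.12


def minutes_per_delta_point(series):
    """Replication minutes per percentage point of delta."""
    return {pct: minutes / pct for pct, minutes in series.items()}


def size_scaling_ratio(large_minutes, small_minutes):
    """How much longer the larger database takes at the same delta."""
    return large_minutes / small_minutes


if __name__ == "__main__":
    for pct, rate in minutes_per_delta_point(MINUTES_5M).items():
        print(f"{pct:>2}% delta: {rate:.2f} minutes per point")
    ratio = size_scaling_ratio(MINUTES_5M[20], MINUTES_1M_AT_20)
    print(f"5M vs 1M at 20% delta: {ratio:.2f}x")
```

Under these assumptions the cost per percentage point stays between roughly 6.9 and 7.9 minutes across the 5M series, and the 5M database takes about 5.3 times as long as the 1M database at a 20% delta.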
Replication Throughput – Flat Response
The next measurements focused on understanding replication throughput while varying
the total number of records. Graph 2 illustrates the peak replication throughput measured
when synchronizing a 10% database delta for different size databases. Peak throughput
was 7500 replication operations per minute for a small database. That throughput
remained relatively flat while varying database size and the number of required
replication operations. The 5M row tables showed a reduction in overall replication
throughput due to the total size of the table and the associated memory characteristics
when performing bulk operations on that table. Specifically the table does not fit in
memory and therefore required I/O activity to read data into memory.
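As a rough cross-check, throughput can be derived from the combined replication times reported earlier. The Python sketch below assumes one replication operation per changed record in the largest table, which is a simplification, and takes the 10% delta times for the 1M and 5M databases from the Graph 1 data labels.

```python
def replication_rate(total_rows, delta_percent, minutes):
    """Replication operations per minute, assuming one operation per changed record."""
    changed_records = total_rows * delta_percent / 100
    return changed_records / minutes


if __name__ == "__main__":
    # 10% delta times in minutes, taken from the Graph 1 data labels.
    print(f"1M: {replication_rate(1_000_000, 10, 14.90):.2f} operations per minute")
    print(f"5M: {replication_rate(5_000_000, 10, 79.28):.2f} operations per minute")
```

The results land close to the 6711.41 and 6306.50 labels in the throughput graph; small differences are expected because the published times are rounded to two decimals.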
Replication Throughput
10% Database Change Per Day
Vertical axis: Replication Updates Per Minute. Horizontal axis: Database Size (Data Rows).
Database Size                       100K       1M       2M       3M       5M
Max Replication Rate Measured    7500.00  6711.41  7088.01  7188.50  6306.50
Minimum Rate Required Per Day       6.94    69.44   138.89   208.33   347.22
Additional Available Throughput  7493.06  6641.96  6949.12  6980.17  5959.27
Graph 2
It is important to note that because peak measurements were done in a single replication
operation (versus distributed in multiple operations over a longer period of time) these
throughput values represent a worst case measurement since they eliminate important
cache effects for large tables in both file system and internal database engine due to the
amount of change in the system. If the same change was replicated in repeated sessions
throughout a day, it would be much more likely that the data remained in the database
cache (depending on specific application) and would not require any I/O activity.
While the workload, when concentrated into a single replication session, delivered
replication updates at a peak rate of over 7,000 replication operations per minute,
most applications will spread replication cost over a much longer period. If the
delta was spread over an 8 hour work day, the replication rates necessary to maintain the
database updates are shown and drop to a range between 20 and 1000 replication
operations per minute. The difference between these minimum rates and the maximum
rate is shown as the available replication throughput. Available replication throughput
is an indication of the additional replication workload that could be handled by the
system as configured, given an increased workload. That workload could be as a result
of a spike in system activity or a permanent increase in database activity.
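The required and available rates can be reproduced with a few lines of Python. This is a sketch under stated assumptions: a 10% delta applied to the largest table, one replication operation per changed record, and peak rates paired with database sizes as inferred from the throughput graph labels (the text does not state the pairing). All names are illustrative.

```python
ROWS = {"100K": 100_000, "1M": 1_000_000, "2M": 2_000_000,
        "3M": 3_000_000, "5M": 5_000_000}
# Peak rates from the throughput graph labels. The pairing with database
# sizes is an assumption inferred from the graph, not stated in the text.
MAX_RATE = {"100K": 7500.00, "1M": 6711.41, "2M": 7088.01,
            "3M": 7188.50, "5M": 6306.50}


def required_rate(total_rows, delta_percent, period_hours):
    """Operations per minute needed to spread a delta evenly over a period."""
    return total_rows * delta_percent / 100 / (period_hours * 60)


def available_throughput(max_rate, required):
    """Headroom left after the required rate is subtracted from the peak rate."""
    return max_rate - required


if __name__ == "__main__":
    for name, rows in ROWS.items():
        per_8h = required_rate(rows, 10, 8)
        per_24h = required_rate(rows, 10, 24)
        spare = available_throughput(MAX_RATE[name], per_24h)
        print(f"{name:>4}: 8h {per_8h:8.2f}  24h {per_24h:7.2f}  spare {spare:8.2f}")
```

Spreading a 10% delta over 8 hours gives roughly 21 to 1,042 operations per minute, in line with the range quoted above. The 24-hour figures, about 6.94 to 347.22, match the minimum-rate labels in the throughput graph.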
Connected Server versus Disconnected Workstation
The final measurements are observations around the replication time expressed as a
percentage of a work day. For always-connected InnerEdge servers or remote offices
with 24 hour connectivity, the necessary replication time is always less than 10% of the
work period. For occasionally connected applications, connectivity requirements can
become a significant issue.
Graph 3 illustrates the replication time expressed as a fraction of work periods for the 5M
row database. For high delta environments, a significant amount of time must be
provided on a regular basis to synchronize the underlying database changes. For
example, a 20% delta might take as much as 1/3 of a workday for an office worker who
only maintained connectivity to the network with a notebook PC placed in an office
docking station.
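The percentages follow directly from the combined replication times. The Python sketch below converts the 5M-series times from the Graph 1 data labels into a share of an 8-hour and a 24-hour period; the assignment of labels to delta values is an assumption, and the names are illustrative.

```python
# Combined replication times in minutes for the 5M row database,
# read from the Graph 1 data labels (assumed order of delta values).
MINUTES_5M = {1: 7.42, 5: 34.92, 10: 79.28, 15: 109.01, 20: 138.75}


def percent_of_period(minutes, period_hours):
    """Replication time as a percentage of a work period."""
    return 100 * minutes / (period_hours * 60)


if __name__ == "__main__":
    for pct, minutes in MINUTES_5M.items():
        share_8h = percent_of_period(minutes, 8)
        share_24h = percent_of_period(minutes, 24)
        print(f"{pct:>2}% delta: {share_8h:5.2f}% of 8h, {share_24h:5.2f}% of 24h")
```

For the 20% delta this gives about 28.9% of an 8-hour day and about 9.6% of a 24-hour day.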
Replication Time as Percentage of Day
8 Hour Workday and 24 Hour Workday
Vertical axis: Percent of Work Period. Horizontal axis: Change (Percent of Records).
Delta                    1%      5%     10%     15%     20%
% Of 8 Hr Work Day     1.55%   7.27%  16.52%  22.71%  28.91%
% Of 24 Hr Work Day    0.52%   2.42%   5.51%   7.57%   9.64%
Graph 3
This information illustrates how network connectivity can play an important role in
determining a viable distributed dataset. There are techniques provided by DataXtend
RE, such as worksets, that allow a subset of the overall database data to be replicated and
can be used to dramatically reduce the required connection time while still providing
required information for a specific application. So while these limits are real, DataXtend
RE provides a variety of ways to tackle these problems.
Conclusions
Commodity rack mounted hardware provides a viable platform for distributing significant
applications. Multi-gigabyte datasets can easily be maintained with modest hardware and
software resources. DataXtend RE CA² servers can provide a continuous availability
solution using inexpensive hardware platforms. In benchmarks, DataXtend RE is able to
demonstrate linear scalability of the replication mechanism itself. As databases grow and
the amount of change in a database grows, replication performance remains linear.
Replication throughput in commodity hardware can maintain significant replication rates
of over 7000 replication operations per minute. This rate is far in excess of the rate of
change for most applications. Replication does require sufficient connected time to
transfer change. For environments where significant change occurs, this must be
accounted for in the overall system design and features like worksets may be valuable in
reducing the necessary connect time to a reasonable value. For low delta environments,
connectivity requirements can easily be managed with just minutes a day.
These benchmarks provide an insight into the performance characteristics of DataXtend
RE InnerEdge and OuterEdge Server products on specific hardware. It is recommended
that customers in an evaluation process, or as part of sizing deployment hardware,
understand the characteristics of replication for a specific application. This is especially
critical for large deployments of hundreds or thousands of systems.
About Progress Real Time Division
The Progress Real Time Division provides event stream processing, data management,
access and synchronization products to enable the real-time enterprise. Our products
manage and analyze real-time event stream data for applications such as algorithmic
trading and RFID; accelerate the performance of existing databases through sophisticated
caching; manage and process complex data in the industry’s leading object database; and
support occasionally connected mobile users requiring real-time access to enterprise
applications. The Progress Real Time Division is an operating unit of Progress Software
Corporation (Nasdaq: PRGS), a global software industry leader. The division is headquartered in
Bedford, Mass., and can be reached at www.progress.com/realtime or +1-781-280-4000.




www.progress.com/realtime

Worldwide and North American Headquarters
Progress Real Time Division, 14 Oak Park, Bedford, MA 01730 USA Tel: +1 781 280 4000
UK and Northern Ireland
Progress Real Time Division, 210 Bath Road, Slough, Berkshire, SL1 3XE England Tel: +44 1753 216 300
Central Europe
Progress Real Time Division, Konrad-Adenauer-Str. 13, 50996 Köln, Germany Tel: +49 6171 981 127
France
Progress Real Time Division, 3 Place de Saverne, Les Renardières B, 92901 Paris la Défense Tel: +33 1 41 16 16 56










© 2005 Progress Software Corporation. All rights reserved. Progress, DataXtend RE, InnerEdge and OuterEdge are trademarks or registered trademarks of
Progress Software Corporation, or any of its affiliates or subsidiaries, in the U.S. and other countries. Any other trademarks or service marks contained herein are
the property of their respective owners. Specifications subject to change without notice. Visit www.progress.com/realtime for more information.