<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD><TITLE>Bonnie++ Documentation</TITLE></HEAD>
<BODY>

<UL><LI><B>Introduction</B><BR>
This benchmark is named <B>Bonnie++</B>; it is based on the <B>Bonnie</B>
benchmark written by <A HREF="MAILTO:tbray@textuality.com">Tim Bray</A>. I was
originally hoping to work with Tim on developing the
next version of Bonnie, but we could not agree on the issue of whether C++
should be used in the program. Tim has graciously given me permission to use
the name "bonnie++" for my program, which is based around his benchmark.<BR>
Bonnie++ adds the facility to test more than 2G of storage on a 32-bit
machine, and adds tests for file creat(), stat(), and unlink() operations.<BR>
It will also output in CSV spreadsheet format to standard output. If you use
the "-q" option for quiet mode then the human-readable version will go to
stderr, so redirecting stdout to a file will leave only the CSV in the file.
The program bon_csv2html takes CSV format data on stdin and writes an HTML
file on standard output which gives a nice display of all the data. The program
bon_csv2txt takes CSV format data on stdin and writes a formatted plain-text
version on stdout; this was originally written to work with 80-column braille
displays, but also works well in email.<BR>
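<P>For example, a typical pipeline looks like this (a sketch; the directory,
size, and hostname values are illustrative, not required):</P>
<PRE>
# Quiet mode: human-readable report goes to stderr, CSV to stdout,
# so redirecting stdout captures only the CSV.
bonnie++ -q -d /mnt/test -s 1024 -m myhost &gt; results.csv

# Render the CSV as an HTML table, or as plain text.
bon_csv2html &lt; results.csv &gt; results.html
bon_csv2txt &lt; results.csv
</PRE></LI>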
<LI><B>A note on blocking writes</B><BR>
I have recently added a <B>-b</B> option to cause an fsync() after every
write (and an fsync() of the directory after file create or delete). This is
what you probably want if testing the performance of mail or database
servers, as they sync everything. The default is to allow write-back
caching in the OS, which is what you want if testing performance for copying
files, compiling, etc. A minimal sketch of the syscall pattern is below.<BR>
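<P>The following sketch (not Bonnie++'s actual code; the helper names are
illustrative) shows the pattern that <B>-b</B> enables: fsync() after each
write, and fsync() on the containing directory after a create or delete:</P>
<PRE>
#include &lt;fcntl.h&gt;
#include &lt;unistd.h&gt;

// Write a block and force it to stable storage before returning.
void blocking_write(int fd, const char *buf, size_t len)
{
    write(fd, buf, len);   // error checking omitted for brevity
    fsync(fd);             // block until the data reaches the disk
}

// After creating (or deleting) a file, sync the directory so the
// metadata change itself is durable, as mail servers require.
void create_durable(const char *dir, const char *name)
{
    int dirfd = open(dir, O_RDONLY);
    int fd = openat(dirfd, name, O_CREAT | O_WRONLY, 0600);
    close(fd);
    fsync(dirfd);          // commit the new directory entry
    close(dirfd);
}
</PRE></LI>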
<LI><B>Waiting for semaphores</B><BR>
There is often a need to test multiple types of IO at the same time. This is
most important for testing RAID arrays, where you will almost never see full
performance with only one process active. Bonnie++ 2.0 will address this
issue, but there is also a need for more flexibility than the ability to
create multiple files in the same directory and fork processes to access
them (which is what version 2.0 will do). There is also a need to perform tests
such as determining whether access to an NFS server will load the system and
slow down access to a local hard drive. <A HREF="MAILTO:c.kagerhuber@t-online.net">
Christian Kagerhuber</A> contributed the initial code to do semaphores, so that
several copies of Bonnie++ can be run in a synchronised fashion. This means
you can have 8 copies of Bonnie++ doing per-char reads to test out your 8-CPU
system!
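<P>As an illustration of the mechanism (a sketch using System V semaphores;
the key value and helper names are hypothetical, and this is not Bonnie++'s
actual code), a controller can hold every copy at a gate and release them
all at once:</P>
<PRE>
#include &lt;sys/ipc.h&gt;
#include &lt;sys/sem.h&gt;

union semun { int val; };               // needed on Linux for SETVAL
static const key_t kKey = 0x626f6e;     // arbitrary shared key ("bon")

// Controller: create the semaphore, close the gate, later open it.
void run_controller(int nworkers)
{
    int sem = semget(kKey, 1, IPC_CREAT | 0666);
    union semun arg;
    arg.val = 1;
    semctl(sem, 0, SETVAL, arg);        // gate closed while value != 0
    // ... start the nworkers Bonnie++ copies here ...
    arg.val = 0;
    semctl(sem, 0, SETVAL, arg);        // all waiters wake simultaneously
}

// Worker: block until the controller opens the gate.
void wait_for_start()
{
    int sem = semget(kKey, 1, 0666);
    struct sembuf wait_zero = {0, 0, 0}; // "wait until value is 0"
    semop(sem, &amp;wait_zero, 1);
}
</PRE></LI>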
<LI><B>Summary of tests</B><BR>
The first 6 tests are from the original Bonnie: specifically, these are the
types of filesystem activity that have been observed to be bottlenecks in
I/O-intensive applications, in particular the text database work done in
connection with the New Oxford English Dictionary Project at the University
of Waterloo.<BR>
It initially performs a series of tests on a file (or files) of known size.
By default, that size is 200 MiB (but that's not enough; see below). For
each test, Bonnie reports the number of kilobytes processed per elapsed
second and the % CPU usage (sum of user and system). If a size &gt;1G is
specified then we will use a number of files of size 1G or less. This way
we can use a 32-bit program to test machines with 8G of RAM! NB: I have not
yet tested more than 2100M of file storage; if you test with larger storage
than this, please send me the results.<BR>
The next 6 tests involve file create/stat/unlink to simulate some operations
that are common bottlenecks on large Squid and INN servers, and machines with
tens of thousands of mail files in /var/spool/mail.<BR>
In each case, an attempt is made to keep optimizers from noticing it's
all bogus. The idea is to make sure that these are real transfers between
user space and the physical disk.<P>
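As a worked example of the file-splitting rule (illustrative figures, not
program output): requesting an 8G test size on a 32-bit machine would use 8
files of 1G each, so no offset within any single file exceeds what a 32-bit
off_t can address.</LI>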
<LI><B>Test Details</B><BR>
<UL><LI>The file IO tests are:
<OL>
<LI><B>Sequential Output</B>
<OL>
<LI>Per-Character. The file is written using the putc() stdio macro.
The loop that does the writing should be small enough to fit into any
reasonable I-cache. The CPU overhead here is that required to do the
stdio code plus the OS file space allocation.</LI>

<LI>Block. The file is created using write(2). The CPU overhead
should be just the OS file space allocation.</LI>

<LI>Rewrite. Each BUFSIZ chunk of the file is read with read(2), dirtied, and
rewritten with write(2), requiring an lseek(2). Since no space
allocation is done, and the I/O is well-localized, this should test the
effectiveness of the filesystem cache and the speed of data transfer. (A
sketch of these three loops follows this list.)</LI>
</OL>
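<P>A rough sketch of the three output loops (illustrative only; buffer sizes
and error handling are simplified relative to the real code):</P>
<PRE>
#include &lt;cstdio&gt;
#include &lt;unistd.h&gt;

enum { CHUNK = 8192 };   // stand-in for BUFSIZ-sized chunks

void per_char_output(FILE *f, long size)
{
    for (long i = 0; i &lt; size; i++)
        putc(i &amp; 0x7f, f);           // tiny loop: stdio cost + allocation
}

void block_output(int fd, long chunks)
{
    static char buf[CHUNK];
    for (long i = 0; i &lt; chunks; i++)
        write(fd, buf, CHUNK);        // cost is just OS space allocation
}

void rewrite(int fd, long chunks)
{
    static char buf[CHUNK];
    for (long i = 0; i &lt; chunks; i++) {
        read(fd, buf, CHUNK);                 // read a chunk
        buf[0]++;                             // dirty it
        lseek(fd, -(off_t)CHUNK, SEEK_CUR);   // seek back over it
        write(fd, buf, CHUNK);                // rewrite it in place
    }
}
</PRE>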
</LI>

<LI><B>Sequential Input</B>
<OL>
<LI>Per-Character. The file is read using the getc() stdio macro. Once
again, the inner loop is small. This should exercise only stdio and
sequential input.</LI>

<LI>Block. The file is read using read(2). This should be a very pure
test of sequential input performance. (See the sketch after this list.)</LI>
</OL>
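<P>Again as an illustrative sketch (simplified, not the real code); the
totals are returned so an optimizer cannot drop the transfers:</P>
<PRE>
#include &lt;cstdio&gt;
#include &lt;unistd.h&gt;

long per_char_input(FILE *f, long size)
{
    long sum = 0;
    for (long i = 0; i &lt; size; i++)
        sum += getc(f);     // touch every byte so the transfer is real
    return sum;
}

long block_input(int fd, long chunks)
{
    static char buf[8192];
    long total = 0;
    for (long i = 0; i &lt; chunks; i++)
        total += read(fd, buf, sizeof buf);  // pure sequential read(2)
    return total;
}
</PRE>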
</LI>

<LI><B>Random Seeks</B><BR>

This test runs SeekProcCount processes (default 3) in parallel, doing a total of
8000 lseek()s to locations in the file specified by random() on BSD systems
or drand48() on SysV systems. In each case, the block is read with read(2).
In 10% of cases, it is dirtied and written back with write(2).<BR>

The idea behind the SeekProcCount processes is to make sure there's always
a seek queued up.<BR>

AXIOM: For any Unix filesystem, the effective number of lseek(2) calls
per second declines asymptotically to near 30, once the effect of
caching is defeated.<BR>
One thing to note about this is that the number of disks in a RAID set
increases the number of seeks. For reads, RAID-1 (mirroring) will
double the number of seeks. For writes, RAID-0 will multiply the number
of writes by the number of disks in the RAID-0 set (provided that enough
seek processes exist).<BR>

The size of the file has a strong nonlinear effect on the results of
this test. Many Unix systems that have the memory available will make
aggressive efforts to cache the whole thing, and report random I/O rates
in the thousands per second, which is ridiculous. As an extreme
example, an IBM RISC 6000 with 64 MiB of memory reported 3,722 per second
on a 50 MiB file. Some have argued that bypassing the cache is artificial,
since the cache is just doing what it's designed to. True, but in any
application that requires rapid random access to file(s) significantly
larger than main memory, running on a system which is doing
significant other work, the caches will inevitably max out. There is
a hard limit hiding behind the cache which has been observed by the
author to be of significant import in many situations; what we are trying
to do here is measure that number.
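<P>A sketch of one seek process (simplified; the real code coordinates
SeekProcCount children and divides the 8000 seeks among them):</P>
<PRE>
#include &lt;cstdlib&gt;
#include &lt;unistd.h&gt;

enum { CHUNK = 8192 };

void seek_test(int fd, long num_chunks, int seeks)
{
    char buf[CHUNK];
    for (int i = 0; i &lt; seeks; i++) {
        long where = random() % num_chunks;        // pick a random block
        lseek(fd, (off_t)where * CHUNK, SEEK_SET);
        read(fd, buf, CHUNK);                      // always read the block
        if (i % 10 == 0) {                         // in ~10% of cases:
            buf[0]++;                              // dirty it, then
            lseek(fd, (off_t)where * CHUNK, SEEK_SET);
            write(fd, buf, CHUNK);                 // write it back
        }
    }
}
</PRE></LI>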
</OL>
</LI>

<LI>
The file creation tests use file names with a 7-digit number and a random
number (from 0 to 12) of random alphanumeric characters.
For the sequential tests the random characters in the file name follow the
number; for the random tests the random characters come first.<BR>
The sequential tests involve creating the files in numeric order, then
stat()ing them in readdir() order (i.e. the order they are stored in the
directory, which is very likely to be the order in which they were
created), and deleting them in the same order.<BR>
For the random tests we create the files in an order that will appear
random to the file system (the last 7 characters are in numeric order on
the files). Then we stat() random files (NB: this will return very good
results on file systems with sorted directories, because not every file
will be stat()ed and the cache will be more effective). After that we
delete all the files in random order.<BR>
If a maximum size greater than 0 is specified then when each file is created
it will have a random amount of data written to it. Then when the file is
stat()ed its data will be read. (A sketch of the naming scheme is below.)
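<P>A sketch of the naming scheme (a hypothetical helper, not the real
generator):</P>
<PRE>
#include &lt;cstdio&gt;
#include &lt;cstdlib&gt;

// Build a name such as "0000042xK3q": a 7-digit number plus 0 to 12
// random alphanumerics, ordered according to the test mode.
void make_name(char *out, int num, bool sequential)
{
    static const char chars[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    char rnd[13];
    int len = random() % 13;                  // 0 to 12 random characters
    for (int i = 0; i &lt; len; i++)
        rnd[i] = chars[random() % (sizeof chars - 1)];
    rnd[len] = '\0';
    if (sequential)
        sprintf(out, "%07d%s", num, rnd);     // number first, then random part
    else
        sprintf(out, "%s%07d", rnd, num);     // random part first
}
</PRE>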
</LI>
</UL>
</LI>
<LI><B>COPYRIGHT NOTICE</B><BR>
* Copyright &copy; <A HREF="MAILTO:tbray@textuality.com">Tim Bray
(tbray@textuality.com)</A>, 1990.<BR>
* Copyright &copy; <A HREF="MAILTO:russell@coker.com.au">Russell Coker
(russell@coker.com.au)</A>, 1999.<P>
I have updated the program, added support for &gt;2G on 32-bit machines, and
added tests for file creation.<BR>
Licensed under the GPL version 2.0.
</LI><LI>
<B>DISCLAIMER</B><BR>
This program is provided AS IS with no warranty of any kind, and<BR>
The author makes no representation with respect to the adequacy of this
program for any particular purpose or with respect to its adequacy to
produce any particular result, and<BR>
The author shall not be liable for loss or damage arising out of
the use of this program regardless of how sustained, and<BR>
In no event shall the author be liable for special, direct, indirect
or consequential damage, loss, costs or fees or expenses of any
nature or kind.<P>

NB: The results of running this program on live server machines can include
extremely bad performance of server processes, and excessive consumption of
disk space and/or inodes, which may cause the machine to cease performing its
designated tasks. Also, the benchmark results are likely to be bad.<P>
Do not run this program on live production machines.
</LI>
</UL>
</BODY>
</HTML>