<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD><TITLE>Bonnie++ Documentation</TITLE></HEAD>
<BODY>
<UL><LI><B>Introduction</B><BR>
This benchmark is named <B>Bonnie++</B>; it is based on the <B>Bonnie</B>
benchmark written by <A HREF="MAILTO:tbray@textuality.com">Tim Bray</A>. I had
originally hoped to work with Tim on developing the next version of Bonnie,
but we could not agree on whether C++ should be used in the program. Tim has
graciously given me permission to use the name "Bonnie++" for my program,
which is based around his benchmark.<BR>
Bonnie++ adds the facility to test more than 2G of storage on a 32-bit
machine, and adds tests for the file creat(), stat(), and unlink()
operations.<BR>
It can also output in CSV spreadsheet format to standard output. If you use
the "-q" option for quiet mode, the human-readable version goes to stderr,
so redirecting stdout to a file leaves only the CSV in the file.
The program bon_csv2html takes CSV data on stdin and writes an HTML
file on standard output which has a nice display of all the data. The program
bon_csv2txt takes CSV data on stdin and writes a formatted plain-text
version on stdout; this was originally written to work with 80-column braille
displays, but also works well in email.<BR></LI>
<LI><B>A note on blocking writes</B><BR>
I have recently added a <B>-b</B> option to cause an fsync() after every
write (and an fsync() of the directory after each file create or delete).
This is what you probably want if testing the performance of mail or
database servers, as they like to sync everything. The default is to allow
write-back caching in the OS, which is what you want if testing performance
for copying files, compiling, etc.<BR></LI>
<LI><B>Waiting for semaphores</B><BR>
There is often a need to test multiple types of IO at the same time. This is
most important for testing RAID arrays, where you will almost never see full
performance with only one process active. Bonnie++ 2.0 will address this
issue, but there is also a need for more flexibility than the ability to
create multiple files in the same directory and fork processes to access
them (which is what version 2.0 will do). There is also a need to perform
tests such as determining whether access to an NFS server will load the
system and slow down access to a local hard drive.
<A HREF="MAILTO:c.kagerhuber@t-online.net">Christian Kagerhuber</A>
contributed the initial code to do semaphores, so that several copies of
Bonnie++ can be run in a synchronised fashion. This means you can have 8
copies of Bonnie++ doing per-char reads to test out your 8-CPU system!</LI>
<LI><B>Summary of tests</B><BR>
The first 6 tests are from the original Bonnie: specifically, these are the
types of filesystem activity that have been observed to be bottlenecks in
I/O-intensive applications, in particular the text database work done in
connection with the New Oxford English Dictionary Project at the University
of Waterloo.<BR>
It initially performs a series of tests on a file (or files) of known size.
By default, that size is 200 MiB (but that's not enough - see below). For
each test, Bonnie reports the number of kilobytes processed per elapsed
second, and the % CPU usage (sum of user and system). If a size >1G is
specified then a number of files of size 1G or less will be used. This way
a 32-bit program can test machines with 8G of RAM! NB I have not yet tested
more than 2100M of file storage; if you test with larger storage than this,
please send me the results.<BR>
The next 6 tests involve file create/stat/unlink to simulate some operations
that are common bottlenecks on large Squid and INN servers, and on machines
with tens of thousands of mail files in /var/spool/mail.<BR>
In each case, an attempt is made to keep optimizers from noticing it's
all bogus. The idea is to make sure that these are real transfers between
user space and the physical disk.<P></LI>
<LI><B>Test Details</B><BR>
<UL><LI>The file IO tests are:
<OL>
<LI><B>Sequential Output</B>
<OL>
<LI>Per-Character. The file is written using the putc() stdio macro.
The loop that does the writing should be small enough to fit into any
reasonable I-cache. The CPU overhead here is that required to do the
stdio code plus the OS file space allocation.</LI>

<LI>Block. The file is created using write(2). The CPU overhead
should be just the OS file space allocation.</LI>

<LI>Rewrite. Each BUFSIZ of the file is read with read(2), dirtied, and
rewritten with write(2), requiring an lseek(2). Since no space
allocation is done, and the I/O is well-localized, this should test the
effectiveness of the filesystem cache and the speed of data transfer.</LI>
</OL>
</LI>

<LI><B>Sequential Input</B>
<OL>
<LI>Per-Character. The file is read using the getc() stdio macro. Once
again, the inner loop is small. This should exercise only stdio and
sequential input.</LI>

<LI>Block. The file is read using read(2). This should be a very pure
test of sequential input performance.</LI>
</OL>
</LI>
<LI><B>Random Seeks</B><BR>
This test runs SeekProcCount processes (default 3) in parallel, doing a
total of 8000 lseek()s to locations in the file specified by random() on
BSD systems, or by drand48() on SysV systems. In each case, the block is
read with read(2). In 10% of cases, it is dirtied and written back with
write(2).<BR>
The idea behind the SeekProcCount processes is to make sure there's always
a seek queued up.<BR>
AXIOM: For any unix filesystem, the effective number of lseek(2) calls
per second declines asymptotically to near 30, once the effect of
caching is defeated.<BR>
One thing to note is that the number of disks in a RAID set increases the
number of seeks that can be performed. For reads, RAID-1 (mirroring) will
double the number of seeks. For writes, RAID-0 will multiply the number of
writes by the number of disks in the RAID-0 set (provided that enough
seek processes exist).<BR>
The size of the file has a strong nonlinear effect on the results of
this test. Many Unix systems that have the memory available will make
aggressive efforts to cache the whole thing, and report random I/O rates
in the thousands per second, which is ridiculous. As an extreme
example, an IBM RISC 6000 with 64 MiB of memory reported 3,722 per second
on a 50 MiB file. Some have argued that bypassing the cache is artificial,
since the cache is just doing what it's designed to do. True, but in any
application that requires rapid random access to file(s) significantly
larger than main memory, running on a system which is doing significant
other work, the caches will inevitably max out. There is a hard limit
hiding behind the cache which has been observed by the author to be of
significant import in many situations - what we are trying to do here is
measure that number.</LI>
</OL>
</LI>

<LI>
The file creation tests use file names consisting of a 7-digit number and a
random number (from 0 to 12) of random alphanumeric characters.
For the sequential tests the random characters in the file name follow the
number; for the random tests the random characters come first.<BR>
The sequential tests involve creating the files in numeric order, then
stat()ing them in readdir() order (i.e. the order in which they are stored
in the directory, which is very likely the same order in which they were
created), and deleting them in the same order.<BR>
For the random tests we create the files in an order that will appear
random to the file system (the last 7 characters of the names are in
numeric order). Then we stat() random files (NB this will return very good
results on file systems with sorted directories, because not every file
will be stat()ed and the cache will be more effective). After that we
delete all the files in random order.<BR>
If a maximum size greater than 0 is specified, then when each file is
created it will have a random amount of data written to it, and when the
file is stat()ed its data will be read.
</LI>
</UL>
</LI>
<LI><B>COPYRIGHT NOTICE</B><BR>
* Copyright © <A HREF="MAILTO:tbray@textuality.com">Tim Bray
(tbray@textuality.com)</A>, 1990.<BR>
* Copyright © <A HREF="MAILTO:russell@coker.com.au">Russell Coker
(russell@coker.com.au)</A>, 1999.<P>
I have updated the program, added support for >2G on 32-bit machines, and
added the tests for file creation.<BR>
Licensed under the GPL version 2.0.
</LI><LI>
<B>DISCLAIMER</B><BR>
This program is provided AS IS with no warranty of any kind, and<BR>
The author makes no representation with respect to the adequacy of this
program for any particular purpose or with respect to its adequacy to
produce any particular result, and<BR>
The author shall not be liable for loss or damage arising out of
the use of this program regardless of how sustained, and<BR>
In no event shall the author be liable for special, direct, indirect
or consequential damage, loss, costs or fees or expenses of any
nature or kind.<P>
NB Running this program on live server machines can result in extremely bad
performance of server processes, and in excessive consumption of disk space
and/or inodes, which may cause the machine to cease performing its
designated tasks. The benchmark results are also likely to be bad.<P>
Do not run this program on live production machines.
</LI>
</UL>
</BODY>
</HTML>