Category Archives: Testverse

FIO (Flexible I/O Tester) Part2 – First run and defining job files

First, there is an official HOWTO from Jens Axboe. So why am I writing a blog series?

  1. Because fio may be overwhelming for someone new to it.
  2. There are good examples of how to use it, but none that show real-world output and explain step by step how to interpret it.
  3. I want to fully understand this tool. Remember 8PP – 4.2
  4. I want to use it in my upcoming storage performance post.
  5. I want to raise awareness of some fio features most people don’t know about (fio2gnuplot, ioengine=rbd or ioengine=cpuio).

1. First run

I am working with Testverse on a Samsung 840 at /dev/sda (in my case not root) with ext4 mounted at “/840”.

fio runs jobs (workloads) that you define. Let's start with the minimum set of parameters, because there are a lot of them.

first-run
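The command and its output were shown as a screenshot that is not reproduced in this text. Going by the description (a job named “first-run” with size=1M and everything else left at the defaults), the call was presumably close to the following sketch. The exact command line is an assumption.

```shell
#!/bin/sh
# Assumed reconstruction of the first run; the original command is not available.
# --name labels the job, --size=1M ends it after 1 MB has been transferred.
# Every other option is left at fio's default.
first_run_cmd="fio --name=first-run --size=1M"
echo "$first_run_cmd"

# Execute it only where fio is installed. fio creates its data file
# in the current directory.
if command -v fio >/dev/null 2>&1; then
    $first_run_cmd
fi
```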

So what happened?

fio ran one job called “first-run”. We did not specify what that job should do, except that it should run until 1 MB has been transferred (size=1M). But what data was transferred, to where, and how? Don't get confused: you don't need to understand the whole output (Block 1 to Block 8) right now.

fio used default values for everything else, which can be seen in Block 1.

With those defaults, fio:

  1. ran one job called “first-run”
  2. assigned this job to “group 0”
  3. created a new file of 1 MB
  4. scheduled a sequential read against that file
  5. read 256 blocks of 4 KB each

A detailed explanation can be found here.

2. Job Files

If you don't want to type long commands in your terminal every time you call fio, I advise you to use job files instead. To avoid confusion between file names and options, I call the job files “jobfile1”, “jobfile2” and so on, but it is best practice to give them meaningful names like “read_4k_sync_1numjob”.

Job files define the jobs fio should run.

They are structured like classic ini files. Let's write a file which runs the same job as in “1. First run”.

File: jobfile1
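The listing for jobfile1 is not reproduced in this text. Here is a minimal sketch of what it could contain, assuming the job keeps the name “first-run” and sets only size=1M as in the first run. The section header names the job and each following line is one option.

```shell
#!/bin/sh
# Assumed contents of jobfile1; the original listing is not available.
# [first-run] is the job name, size=1M the only option that is set.
cat > jobfile1 <<'EOF'
[first-run]
size=1M
EOF

# Show what was written.
cat jobfile1
```

Running `fio jobfile1` should then behave like the command-line version of the first run.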

Now let's run it:

fio_jobfile1

Easy, right?

How to define a job file with 2 jobs which should run in order?

Let's write a file which contains 2 jobs. The first job should read 1M sequentially and the second should write 1M sequentially.

File: jobfile2
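The listing for jobfile2 is not reproduced here either, so the sketch below is an assumption about its contents. By default fio starts all jobs of a job file at the same time, so the sketch puts the `stonewall` option into the second job to make it wait until the first one has finished. Whether the original file used `stonewall` is not known.

```shell
#!/bin/sh
# Assumed contents of jobfile2; the original listing is not available.
# job1-read reads 1 MB sequentially, job2-write writes 1 MB sequentially.
# stonewall makes job2-write wait for all preceding jobs to finish.
cat > jobfile2 <<'EOF'
[job1-read]
rw=read
size=1M

[job2-write]
stonewall
rw=write
size=1M
EOF

cat jobfile2
```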

So job1-read will run first and then job2-write will run.

How to define a job file with 2 jobs which should run in order with the same workload?

Now we can make use of the global section to define defaults for all jobs. They apply unless a job overrides them in its own section.

File: jobfile3
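The listing for jobfile3 is also missing from this text. A possible version, assuming the shared workload is a 1 MB sequential read and using the hypothetical job names job1 and job2: the [global] section carries the common options, and the job sections only add what differs.

```shell
#!/bin/sh
# Assumed contents of jobfile3; job names and workload are illustrative.
# Options in [global] apply to every job unless a job section overrides them.
# stonewall in job2 keeps the two jobs running one after the other.
cat > jobfile3 <<'EOF'
[global]
rw=read
size=1M

[job1]

[job2]
stonewall
EOF

cat jobfile3
```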

Go Firefox.

FIO (Flexible I/O Tester) Part1 – Installation and compiling if needed

FIO (Flexible I/O Tester) is a decent I/O test tool which is often used to test the performance of HDD/SSD and PCIe flash drives. But it can do much more. For example, did you know that it provides an ioengine to test a Ceph RBD (RADOS block device) without the need to use the kernel rbd driver?

I couldn’t find good documents which offer more interpretation and explanation of the results you receive from “fio”. Voilà, I will do it then. This tiny tool is so complex that I am planning to split the topic into several parts.

But don’t forget: “fio is a synthetic testing (benchmarking) tool which in most cases doesn’t represent real world workloads”

Installation and compiling if needed (Ubuntu)

fio is developed by Jens Axboe and available on GitHub.

These posts are based on Testverse and Ubuntu 14.04.2, but the sources are available, so you can compile fio in your own environment. The easier way is to use the binary packages available for these OSes:

1. Installing the fio binary in Ubuntu 14.04.2

sudo_install_fio
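The screenshot for this step is not reproduced in this text. On Ubuntu the installation and the two checks described in this section would look roughly like the sketch below. The exact commands used in the original are an assumption, and the function name `install_fio` is only there for illustration.

```shell
#!/bin/sh
# Sketch only: the commands are assumed, not taken from the original post.
# Nothing is executed until install_fio is called.
install_fio() {
    sudo apt-get update
    sudo apt-get install -y fio   # binary package from the Ubuntu archive
    fio --help                    # list the help of the command
    fio --version                 # print the installed version
}
```

Call `install_fio` to run the steps.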

and list the help of the command.

That's it... or is it?

Checking the version shows that fio-2.1.3 is installed. The current version available on GitHub is 2.2.9 (30.07.2015), so let's have some fun with:

2. Compiling the newest fio version in Ubuntu 14.04.2

I am using git for the installation because I like git.

The ./configure output showed that some features use zlib-devel, so that's the reason why we install it. The packages libaio1 and libaio-dev are needed to use the ioengine libaio, which is often used to measure the raw performance of devices.

In other distributions you may need to install other packages like make, gcc, libaio etc. in advance. For Ubuntu the “build-essential” package should work.

make_fio
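The screenshots of the build are not reproduced in this text. The steps described above can be sketched as follows. The package list, the repository URL (`https://github.com/axboe/fio.git`) and the function name `build_fio` are assumptions; `zlib1g-dev` is the Ubuntu package that corresponds to zlib-devel.

```shell
#!/bin/sh
# Sketch only: assumed reconstruction of the build steps on Ubuntu 14.04.
# Nothing is executed until build_fio is called.
build_fio() {
    # Toolchain, git, zlib headers and libaio as discussed in the text.
    sudo apt-get install -y git build-essential zlib1g-dev libaio1 libaio-dev

    git clone https://github.com/axboe/fio.git
    cd fio || return 1

    ./configure        # reports which optional features were detected
    make
    ./fio --version    # check the freshly built binary
    sudo make install
}
```

Call `build_fio` from a directory where the source tree should be cloned.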

Checking the version again shows 2.2.9-g669e.

Done.

Go hadoop.

 

FIO (Flexible I/O Tester) Appendix 1 – Interpreting the output/result of the “first-run”

first-run

So what happened?

fio ran one job called “first-run”. We did not specify what that job should do, except that it should run until 1 MB has been transferred (size=1M). But what data was transferred, to where, and how?

fio used default values for everything else, which can be seen in Block 1.

Block 1

block1

  • “g=0”
    • this job belongs to group 0. Groups can be used to aggregate job results.
  • “rw=read”
    • the default IO pattern is sequential read
  • “bs=4K-4K/4K-4K/4K-4K”
    • the default (read/write/trim) block size is 4K
  • “ioengine=sync”
    • the default ioengine is synchronous, so there is no parallel (asynchronous) access
  • “iodepth=1”
    • by default there will be no more than “1 IO unit” in flight against the device
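Written out explicitly, the defaults listed in Block 1 correspond to a job like the one sketched below. The file name `first-run-explicit.fio` is hypothetical, and the values are simply the ones shown in this output. Defaults can differ between fio versions.

```shell
#!/bin/sh
# Sketch: the first-run job with the Block 1 defaults spelled out.
# The file name is made up for this example.
cat > first-run-explicit.fio <<'EOF'
[first-run]
rw=read
bs=4k
ioengine=sync
iodepth=1
size=1M
EOF

cat first-run-explicit.fio
```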

Block 2

block2

  • “Laying out IO files…. ”
    • This step creates a 1 MB file named “first-run.0.0” in the working directory, unless it already exists
    • This file is used for the data transfer

Block 3

block3

  • The name of the job and some information about it, such as
    • “err=0”
      • no errors occurred when running this job

Block 4

block4

  • This is the IO statistic for the job.
    • “read” (remember the default for this job is sequential read)
    • “io=1024.0KB”
      • number of KB transferred from the file (1 MB)
    • “bw=341333KB/s”
      • we transferred data at an average speed of ~333 MB per second
    • “iops=85333”
      • the average number of IOs per second (4K each in this case)
    • “runt=3msec”
      • the job ran for 3 milliseconds

Actually, in this case we only scheduled 256 IOs (1 MB / 4 KB) against the file, which took only 3 milliseconds. So the value of 85333 only means that we could achieve this many IOs per second if we kept reading at the same rate for a full second.

1 s / 0.003 s = ~333, and ~333 x 256 IOs (the number completed in 3 ms) = ~85333 IOs per second

  • the rest of Block 4 shows in detail the latency distribution. For more details read Part8.
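The iops and bw values in Block 4 can be re-derived from the size, the block size and the runtime. The following sketch does this with integer arithmetic in the shell. The variable names are made up for this example, and the input values are the ones from the output above.

```shell
#!/bin/sh
# Re-derive the Block 4 numbers from size, block size and runtime.
size_kb=1024      # io=1024.0KB
block_kb=4        # bs=4K
runtime_ms=3      # runt=3msec

ios=$(( size_kb / block_kb ))               # 256 IOs in total
iops=$(( ios * 1000 / runtime_ms ))         # 85333 IOs per second
bw_kbs=$(( size_kb * 1000 / runtime_ms ))   # 341333 KB per second

echo "ios=$ios iops=$iops bw=${bw_kbs}KB/s"
# prints: ios=256 iops=85333 bw=341333KB/s
```

With a runtime of only 3 ms, both values are extrapolations to a full second rather than rates that were sustained for that long.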

Block 5

block5

  • “cpu”
    • this line shows the CPU usage of the running job
  • “usr=0.00%”
    • this is the percentage of CPU usage of the running job at user level
    • It's nearly 0. Remember that the job ran for only 3 ms, so there is no visible impact on the CPU
    • 100% would mean that one CPU core is at 100% load, depending on whether HT is on or off
  • “sys=0.00%”
    • this is the percentage of CPU usage of the running job at system/kernel level
  • “ctx=8”
    • The number of context switches this thread encountered while running
  • “majf=0” and “minf=0”
    • The number of major and minor page faults

 Block 6

block6

  • This block shows the distribution of IO depths over the job lifetime.
    • “IO depths :   1=100.0%… ”
      • this number shows that the job always had 1 IO unit in flight (see Block 1)
    • “submit: …..4=100.0%….”
      • shows how many IOs were submitted in a single call. Here the bucket covers the range of 1 to 4
      • we know that the IO depth was always 1, so each call submitted exactly 1 IO
    • “complete: …. 4=100.0%….”
      • same as submit, but for completion calls
    • “issued:….total=r=256/w=0/d=0”…
      • 256 read IOs were issued, no writes and no discards, and none of them were short

 Block 7

block7

  • This is the group statistic. We ran only one job, belonging to group 0.
    • READ
      • “io=1024KB”
        • the same amount of data transferred as in the job statistics
      • “aggrb=341333KB/s”
        • aggregated bandwidth of all jobs/threads in group 0
      • “minb=341333KB/s maxb=341333KB/s”
        • the minimum and maximum bandwidth one thread saw. Here they are the same because only one job ran, and only for 3 ms
      • “mint=3msec” and “maxt=3msec”
        • shortest and longest runtime of any of the jobs. They are the same because we ran only one job

 Block 8

block8

  • Disk statistics for the disks involved. But they look strange, don't they?
    • “sda: ios=0/0”
      • 0 READ IOs and 0 WRITE IOs on /dev/sda
    • “merge=0/0”
      • number of IO merges performed by the IO scheduler
      • no merges here
    • “ticks=0/0”
      • number of ticks we kept the drive busy: none
    • “in_queue=0”
      • total time spent in the disk queue
    • “util=0.00%”
      • the utilization of the drive. So nothing was done on the disk?

So what we are seeing here is probably the Linux page cache (file cache/buffer) at work for files on ext4. It seems the blocks were already cached or prefetched, so the reads never reached the disk. Linux readahead can have an influence as well.
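One way to check the page cache explanation is to repeat the job with `direct=1`, which asks fio to use non-buffered I/O and so bypass the page cache. The sketch below is not part of the original run. The job name `first-run-direct` and the file name are hypothetical.

```shell
#!/bin/sh
# Sketch: same 1 MB sequential read, but with non-buffered (direct) I/O.
# Job and file names are made up for this example.
cat > first-run-direct.fio <<'EOF'
[first-run-direct]
rw=read
bs=4k
size=1M
direct=1
EOF

cat first-run-direct.fio
```

If the disk statistics then show non-zero values for ios and util, the zeros in Block 8 most likely came from reads served out of the cache.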

1. Testverse – A universe goes live

For the upcoming storage performance tests for docker, MySQL, etc. I built a decent test machine. This box will be called Testverse (Test Universe). Please no comments about the cables 🙂

testverse

But why do I use universe in the name?

I learned years ago (feels like ages) that the first step when forming sentential logic statements is to define your universe. That means a statement can be true or false depending on the environment (universe).

Example:

  • The statement I am making right now: “I am drunk.” This is not true.
  • But the same statement on a Saturday evening with friends at a bar may be true.

So it is important for performance tuning/analysis/testing to clearly define how and in which environment you are testing. If you do so, others are able to repeat the same tests or compare them to similar environments. This helps to prove your statements or to prove that you are wrong. Even if others prove that you are wrong, it's good, because then you are able to improve or correct your statement. Remember the 8PP 1.2.

The table is based on Ubuntu 14.04.2 server which will be the OS if not mentioned separately in other posts.

Testverse Hardware

| Item | Description | Firmware | Driver | Hints |
| --- | --- | --- | --- | --- |
| Chieftec Smart CH-09B-B | Midi Tower | - | - | - |
| ASUS P9X79 PRO | Socket 2011 with X79 Chipset | 4801 (4701 before 14.09.2015) | - | Intel SpeedStep On; HT - On; Intel VT - On; Intel VT-d On |
| Intel Core i7-3820 Bx | 4 Cores @ 3.6 GHz | - | - | OC - Off; Turbo mode - on |
| 64GB-Kit Corsair XMS3 | 8 x 8GB Modules | - | - | DRAM at 1347 MHz |
| EVGA GeForce GTX 650 | cheap GPU, not really needed | - | - | - |
| Samsung 840 SSD 128GB | 128 GB SSD at SATA 3.1 - 6 Gb/s | DXT09B0Q | | test drive; /dev/sda; smart On |
| Samsung 840 SSD 128GB | 128 GB SSD at SATA 3.1 - 6 Gb/s | DXT09B0Q | | OS drive; /dev/sdb |
| Sandisk/Fusion-IO PX-600 1000 | PCIe Flash card | 8.9.1 | 4.2.1 | test drive; /dev/fioa |
| Samsung EcoGreen F4 | HD204UI at SATA 2.6 - 3 Gb/s | 1AQ10001 | | test drive; /dev/sdc |
| Brocade CNA 1020 | 2 x 10 Gbit/s Ethernet Adapter | 3.2.5 | bna - 3.2.23.0 | 1 x 10Gbe mainly for cluster interconnection |
| Intel® 82579V, 1 x Gbit/s (Onboard) | 1 x 1Gbit/s Ethernet | 0.13-4 | e1000e - 2.3.2-k | management interface |
| be quiet! SYSTEM POWER 7 | 600W power supply | - | - | hope 600W will be enough |