
notes on smallfile development history

Ben England edited this page Oct 29, 2018 · 8 revisions

10/2018

Support was added for YAML parameter input to ease integration with CI systems like Teuthology and Jenkins. smallfile also switched over to the python argparse module for CLI parsing, to make it easier for others to add new features to it and understand how it works.

1/2014

I added support for running smallfile under pypy, a Python interpreter with a JIT compiler. Since smallfile launches a remote process, that process also needs to be configured to use the pypy interpreter; the environment variable PYTHON_PROG should allow you to do this. When I used it on a cached read workload I got a factor of 3 speedup.

10/16/2013

I've sped up smallfile so that it creates files in tmpfs at 40000 files/sec and reads them at 60000 files/sec. This is about 5x faster than it used to be. I did this by precomputing, for each thread, the subdirectory where each file will go and storing the results in a list, so what used to be a loop with string formatting becomes an array reference. Further improvement can probably be had by using the "buffer protocol" to avoid copying to/from the buffer when using only a slice of it.
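The precomputation idea can be sketched roughly as follows. This is not smallfile's actual code: the directory layout, the `files_per_dir` parameter, and all function names are assumptions made up for illustration.

```python
import os


def build_dir_table(top, total_files, files_per_dir):
    """Precompute, once, the subdirectory for every file index.

    The "d_NNN" layout is hypothetical, chosen only for this sketch.
    """
    table = []
    for i in range(total_files):
        table.append(os.path.join(top, "d_%03d" % (i // files_per_dir)))
    return table


def file_path_slow(top, i, files_per_dir):
    # Formats the directory name again on every call.
    return os.path.join(top, "d_%03d" % (i // files_per_dir), "f_%d" % i)


def file_path_fast(table, i):
    # The directory comes from a list lookup; only the file name is formatted.
    return table[i] + os.sep + "f_%d" % i
```

The table costs one list entry per file, which trades memory for less work in the per-file path.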

10/7/2013

Smallfile starts up much faster than yesterday's release after cutting down the max buffer size, removing the buffer from the object before returning it to the parent process, and optimizing directory tree construction and teardown. After it gets a little more use I'm going to create a stable branch.

10/6/2013

smallfile runs about 7 times faster for the smallest file sizes because of optimizations to mk_file_nm(). The optimizations were guided by the python profiling module.
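For readers who want to repeat this kind of measurement, here is a minimal sketch using the standard library `cProfile` and `pstats` modules under Python 3. The `make_name()` and `workload()` functions are hypothetical stand-ins, not smallfile code, and the notes above do not say which profiler options were actually used.

```python
import cProfile
import io
import pstats


def make_name(i):
    # Hypothetical stand-in for a hot name-building function.
    return "file_%08d" % i


def workload(n):
    return [make_name(i) for i in range(n)]


def profile_workload(n):
    """Run workload(n) under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    workload(n)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and show the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()
```

Calling `print(profile_workload(100000))` lists the per-function call counts and times, which is usually enough to spot a function that dominates the run.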

9/23/2013

smallfile now supports both python3 and python 2.6/2.7 with the same code base, using the forward-compatibility features of python 2.6. This means it should continue to run on older Linux distros such as RHEL6 while now supporting most modern distros.

8/24/2013

If you are interested in participating more in development and support of this benchmark please contact Ben. Here are some release notes about smallfile.

At present, smallfile uses a shared filesystem directory to coordinate activities of test threads and return test results. The smallfile v1.9 series had some problems with finding files in the shared directory, because distributed filesystems were not invalidating old cached directory contents. At least for glusterfs these are resolved now by performing a readdir() call on the shared directory before trying to read files in it. This makes smallfile much easier to use, but could eventually limit scalability.
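The readdir-before-read workaround might look something like the sketch below. The function name and the return-`None` convention are assumptions for illustration, not smallfile's actual code; the only behavior taken from the notes above is that the directory is listed before a file in it is read.

```python
import os


def read_shared_file(shared_dir, name):
    """List the shared directory, then read one file from it.

    Returns the file contents, or None if the file is not visible yet.
    """
    # os.listdir() reads the directory entries; per the notes above, doing
    # this first is what lets glusterfs clients see newly created files.
    entries = os.listdir(shared_dir)
    if name not in entries:
        return None
    with open(os.path.join(shared_dir, name)) as f:
        return f.read()
```

A caller would typically poll this function until it returns something other than `None`. Listing the whole directory on every poll is also where the scalability concern comes from.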

Smallfile V2.0 is a completely refactored code base (some of the refactoring is in smallfile v1.9.17 as well). This was done to make the code easier to understand and to pave the way for possible changes in the future (one proposal from Matus Kocka: use MPI to coordinate multi-host tests).
