Merge pull request #210 from ceph/wip-boto-3

boto3: Foundation laid for boto3 tests

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2025-04-18 11:08:09 +00:00 · 2019-01-17 14:00:37 -05:00
commit daade6614f
parent fa979f416d 67f4f5d356
25 changed files with 13990 additions and 150 deletions
--- a/README.rst
+++ b/README.rst
@@ -2,15 +2,9 @@
 S3 compatibility tests
 ========================

-This is a set of completely unofficial Amazon AWS S3 compatibility
-tests, that will hopefully be useful to people implementing software
-that exposes an S3-like API.
-
-The tests only cover the REST interface.
-
-The tests use the Boto library, so any e.g. HTTP-level differences
-that Boto papers over, the tests will not be able to discover. Raw
-HTTP tests may be added later.
+This is a set of unofficial Amazon AWS S3 compatibility
+tests, that can be useful to people implementing software
+that exposes an S3-like API. The tests use the Boto2 and Boto3 libraries.

 The tests use the Nose test framework. To get started, ensure you have
 the ``virtualenv`` software installed; e.g. on Debian/Ubuntu::
@@ -22,76 +16,41 @@ and then run::
 	./bootstrap

 You will need to create a configuration file with the location of the
-service and two different credentials, something like this::
+service and two different credentials. A sample configuration file named
+``s3tests.conf.SAMPLE`` has been provided in this repo. This file can be
+used to run the s3 tests on a Ceph cluster started with vstart.

-	[DEFAULT]
-	## this section is just used as default for all the "s3 *"
-        ## sections, you can place these variables also directly there
-
-	## replace with e.g. "localhost" to run against local software
-	host = s3.amazonaws.com
-
-	## uncomment the port to use something other than 80
-	# port = 8080
-
-	## say "no" to disable TLS
-	is_secure = yes
-
-	[fixtures]
-	## all the buckets created will start with this prefix;
-	## {random} will be filled with random characters to pad
-	## the prefix to 30 characters long, and avoid collisions
-	bucket prefix = YOURNAMEHERE-{random}-
-
-	[s3 main]
-	## the tests assume two accounts are defined, "main" and "alt".
-
-	## user_id is a 64-character hexstring
-	user_id = 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
-
-	## display name typically looks more like a unix login, "jdoe" etc
-	display_name = youruseridhere
-
-	## replace these with your access keys
-	access_key = ABCDEFGHIJKLMNOPQRST
-	secret_key = abcdefghijklmnopqrstuvwxyzabcdefghijklmn
-
-	## replace with key id obtained when secret is created, or delete if KMS not tested
-	kms_keyid = 01234567-89ab-cdef-0123-456789abcdef
-
-	[s3 alt]
-	## another user account, used for ACL-related tests
-	user_id = 56789abcdef0123456789abcdef0123456789abcdef0123456789abcdef01234
-	display_name = john.doe
-	## the "alt" user needs to have email set, too
-	email = john.doe@example.com
-	access_key = NOPQRSTUVWXYZABCDEFG
-	secret_key = nopqrstuvwxyzabcdefghijklmnabcdefghijklm
-
-Once you have that, you can run the tests with::
+Once you have that file copied and edited, you can run the tests with::

 	S3TEST_CONF=your.conf ./virtualenv/bin/nosetests

+You can specify which directory of tests to run::
+
+	S3TEST_CONF=your.conf ./virtualenv/bin/nosetests s3tests.functional
+
+You can specify which file of tests to run::
+
+	S3TEST_CONF=your.conf ./virtualenv/bin/nosetests s3tests.functional.test_s3
+
+You can specify which test to run::
+
+	S3TEST_CONF=your.conf ./virtualenv/bin/nosetests s3tests.functional.test_s3:test_bucket_list_empty
+
 To gather a list of tests being run, use the flags::

 	 -v --collect-only

-You can specify what test(s) to run::
-
-	S3TEST_CONF=your.conf ./virtualenv/bin/nosetests s3tests.functional.test_s3:test_bucket_list_empty
-
 Some tests have attributes set based on their current reliability and
 things like AWS not enforcing their spec strictly. You can filter tests
 based on their attributes::

 	S3TEST_CONF=aws.conf ./virtualenv/bin/nosetests -a '!fails_on_aws'

+Most of the tests have both Boto3 and Boto2 versions. Tests written in
+Boto2 are in the ``s3tests`` directory. Tests written in Boto3 are
+located in the ``s3tests_boto3`` directory.

-TODO
-====
+You can run only the boto3 tests with::

- We should assume read-after-write consistency, and make the tests
-  actually request such a location.
-  http://aws.amazon.com/s3/faqs/#What_data_consistency_model_does_Amazon_S3_employ
+        S3TEST_CONF=your.conf ./virtualenv/bin/nosetests -v -s -A 'not fails_on_rgw' s3tests_boto3.functional

- Test direct HTTP downloads, like a web browser would do.
--- a/config.yaml.SAMPLE
+++ b/config.yaml.SAMPLE
@@ -1,85 +0,0 @@
-fixtures:
-## All the buckets created will start with this prefix;
-## {random} will be filled with random characters to pad
-## the prefix to 30 characters long, and avoid collisions
-  bucket prefix: YOURNAMEHERE-{random}-
-
-file_generation:
-  groups:
-## File generation works by creating N groups of files. Each group of
-## files is defined by three elements: number of files, avg(filesize),
-## and stddev(filesize) -- in that order.
-    - [1, 2, 3]
-    - [4, 5, 6]
-
-## Config for the readwrite tool.
-## The readwrite tool concurrently reads and writes to files in a
-## single bucket for a set duration.
-## Note: the readwrite tool does not need the s3.alt connection info.
-## only s3.main is used.
-readwrite:
-## The number of reader and writer worker threads. This sets how many
-## files will be read and written concurrently.
-  readers: 2
-  writers: 2
-## The duration to run in seconds. Doesn't count setup/warmup time
-  duration: 15
-
-  files:
-## The number of files to use. This number of files is created during the
-## "warmup" phase. After the warmup, readers will randomly pick a file to
-## read, and writers will randomly pick a file to overwrite
-    num: 3
-## The file size to use, in KB
-    size: 1024
-## The stddev for the file size, in KB
-    stddev: 0
-
-s3:
-## This section contains all the connection information
-
-  defaults:
-## This section contains the defaults for all of the other connections
-## below. You can also place these variables directly there.
-
-## Replace with e.g. "localhost" to run against local software
-    host: s3.amazonaws.com
-
-## Uncomment the port to use soemthing other than 80
-#    port: 8080
-
-## Say "no" to disable TLS.
-    is_secure: yes
-
-## The tests assume two accounts are defined, "main" and "alt". You
-## may add other connections to be instantianted as well, however
-## any additional ones will not be used unless your tests use them.
-
-  main:
-
-## The User ID that the S3 provider gives you. For AWS, this is
-## typically a 64-char hexstring.
-    user_id: AWS_USER_ID
-
-## Display name typically looks more like a unix login, "jdoe" etc
-    display_name: AWS_DISPLAY_NAME
-
-## The email for this account.
-    email: AWS_EMAIL
-
-## Replace these with your access keys.
-    access_key: AWS_ACCESS_KEY
-    secret_key: AWS_SECRET_KEY
-
-## If KMS is tested, this if barbican key id. Optional.
-   kms_keyid: barbican_key_id
-
-  alt:
-## Another user accout, used for ACL-related tests.
-
-    user_id: AWS_USER_ID
-    display_name: AWS_DISPLAY_NAME
-    email: AWS_EMAIL
-    access_key: AWS_ACCESS_KEY
-    secret_key: AWS_SECRET_KEY
-
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,6 +1,7 @@
 PyYAML
 nose >=1.0.0
 boto >=2.6.0
+boto3 >=1.0.0
 bunch >=1.0.0
 # 0.14 switches to libev, that means bootstrap needs to change too
 gevent >=1.0
--- a/s3tests.conf.SAMPLE
+++ b/s3tests.conf.SAMPLE
@@ -0,0 +1,69 @@
+[DEFAULT]
+## this section is just used for host, port and bucket_prefix
+
+# host set for rgw in vstart.sh
+host = localhost
+
+# port set for rgw in vstart.sh
+port = 8000
+
+## say "False" to disable TLS
+is_secure = False
+
+[fixtures]
+## all the buckets created will start with this prefix;
+## {random} will be filled with random characters to pad
+## the prefix to 30 characters long, and avoid collisions
+bucket prefix = yournamehere-{random}-
+
+[s3 main]
+# main display_name set in vstart.sh
+display_name = M. Tester
+
+# main user_id set in vstart.sh
+user_id = testid
+
+# main email set in vstart.sh
+email = tester@ceph.com
+
+api_name = ""
+
+## main AWS access key
+access_key = 0555b35654ad1656d804
+
+## main AWS secret key
+secret_key = h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q==
+
+## replace with key id obtained when secret is created, or delete if KMS not tested
+#kms_keyid = 01234567-89ab-cdef-0123-456789abcdef
+
+[s3 alt]
+# alt display_name set in vstart.sh
+display_name = john.doe
+## alt email set in vstart.sh
+email = john.doe@example.com
+
+# alt user_id set in vstart.sh
+user_id = 56789abcdef0123456789abcdef0123456789abcdef0123456789abcdef01234
+
+# alt AWS access key set in vstart.sh
+access_key = NOPQRSTUVWXYZABCDEFG
+
+# alt AWS secret key set in vstart.sh
+secret_key = nopqrstuvwxyzabcdefghijklmnabcdefghijklm
+
+[s3 tenant]
+# tenant display_name set in vstart.sh
+display_name = testx$tenanteduser
+
+# tenant user_id set in vstart.sh
+user_id = 9876543210abcdef0123456789abcdef0123456789abcdef0123456789abcdef
+
+# tenant AWS access key set in vstart.sh
+access_key = HIJKLMNOPQRSTUVWXYZA
+
+# tenant AWS secret key set in vstart.sh
+secret_key = opqrstuvwxyzabcdefghijklmnopqrstuvwxyzab
+
+# tenant email set in vstart.sh
+email = tenanteduser@example.com
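The sample file above is INI-style, so it can be read with the standard library's ``configparser``. What follows is a minimal Python 3 sketch, not the code from this commit (which targets the Python 2 ``ConfigParser`` module). The ``load_s3_settings`` helper, the trimmed ``SAMPLE`` text, and the placeholder keys are assumptions made for illustration. It also shows why ``is_secure`` is safer to read with ``getboolean``: a plain ``get`` returns the string ``"False"``, which is truthy.

```python
import configparser

# Trimmed-down, hypothetical stand-in for s3tests.conf.SAMPLE.
SAMPLE = """
[DEFAULT]
host = localhost
port = 8000
is_secure = False

[fixtures]
bucket prefix = yournamehere-{random}-

[s3 main]
display_name = M. Tester
user_id = testid
access_key = EXAMPLEACCESSKEY
secret_key = EXAMPLESECRETKEY
"""


def load_s3_settings(text):
    """Hypothetical helper: parse the INI text and return a plain dict."""
    # RawConfigParser performs no interpolation, so '{random}' and any
    # '%' characters in secrets are left untouched.
    cfg = configparser.RawConfigParser()
    cfg.read_string(text)
    defaults = cfg.defaults()
    return {
        "host": defaults.get("host"),
        "port": int(defaults.get("port")),
        # Values from [DEFAULT] are visible in every section.
        # getboolean maps "False"/"no"/"0" to False.
        "is_secure": cfg.getboolean("s3 main", "is_secure"),
        "access_key": cfg.get("s3 main", "access_key"),
        "bucket_prefix": cfg.get("fixtures", "bucket prefix"),
    }


settings = load_s3_settings(SAMPLE)
print(settings)
```

Reading ``is_secure`` through a named section works because options in ``[DEFAULT]`` act as fallbacks for all other sections.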
--- a/s3tests_boto3/__init__.py
+++ b/s3tests_boto3/__init__.py
--- a/s3tests_boto3/analysis/__init__.py
+++ b/s3tests_boto3/analysis/__init__.py
--- a/s3tests_boto3/analysis/rwstats.py
+++ b/s3tests_boto3/analysis/rwstats.py
@@ -0,0 +1,142 @@
+#!/usr/bin/python
+import sys
+import os
+import yaml
+import optparse
+
+NANOSECONDS = int(1e9)
+
+# Output stats in a format similar to siege
+# see http://www.joedog.org/index/siege-home
+OUTPUT_FORMAT = """Stats for type: [{type}]
+Transactions:            {trans:>11} hits
+Availability:            {avail:>11.2f} %
+Elapsed time:            {elapsed:>11.2f} secs
+Data transferred:        {data:>11.2f} MB
+Response time:           {resp_time:>11.2f} secs
+Transaction rate:        {trans_rate:>11.2f} trans/sec
+Throughput:              {data_rate:>11.2f} MB/sec
+Concurrency:             {conc:>11.2f}
+Successful transactions: {trans_success:>11}
+Failed transactions:     {trans_fail:>11}
+Longest transaction:     {trans_long:>11.2f}
+Shortest transaction:    {trans_short:>11.2f}
+"""
+
+def parse_options():
+    usage = "usage: %prog [options]"
+    parser = optparse.OptionParser(usage=usage)
+    parser.add_option(
+        "-f", "--file", dest="input", metavar="FILE",
+        help="Name of input YAML file. Default uses sys.stdin")
+    parser.add_option(
+        "-v", "--verbose", dest="verbose", action="store_true",
+        help="Enable verbose output")
+
+    (options, args) = parser.parse_args()
+
+    if not options.input and os.isatty(sys.stdin.fileno()):
+        parser.error("option -f required if no data is provided "
+                     "in stdin")
+
+    return (options, args)
+
+def main():
+    (options, args) = parse_options()
+
+    total     = {}
+    durations = {}
+    min_time  = {}
+    max_time  = {}
+    errors    = {}
+    success   = {}
+
+    calculate_stats(options, total, durations, min_time, max_time, errors,
+                    success)
+    print_results(total, durations, min_time, max_time, errors, success)
+
+def calculate_stats(options, total, durations, min_time, max_time, errors,
+                    success):
+    print 'Calculating statistics...'
+    
+    f = sys.stdin
+    if options.input:
+        f = file(options.input, 'r')
+
+    for item in yaml.safe_load_all(f):
+        type_ = item.get('type')
+        if type_ not in ('r', 'w'):
+            continue # ignore any invalid items
+
+        if 'error' in item:
+            errors[type_] = errors.get(type_, 0) + 1
+            continue # skip rest of analysis for this item
+        else:
+            success[type_] = success.get(type_, 0) + 1
+
+        # parse the item
+        data_size = item['chunks'][-1][0]
+        duration = item['duration']
+        start = item['start']
+        end = start + duration / float(NANOSECONDS)
+
+        if options.verbose:
+            print "[{type}] POSIX time: {start:>18.2f} - {end:<18.2f} " \
+                  "{data:>11.2f} KB".format(
+                type=type_,
+                start=start,
+                end=end,
+                data=data_size / 1024.0, # convert to KB
+                )
+
+        # update time boundaries
+        prev = min_time.setdefault(type_, start)
+        if start < prev:
+            min_time[type_] = start
+        prev = max_time.setdefault(type_, end)
+        if end > prev:
+            max_time[type_] = end
+
+        # save the duration
+        if type_ not in durations:
+            durations[type_] = []
+        durations[type_].append(duration)
+
+        # add to running totals
+        total[type_] = total.get(type_, 0) + data_size
+
+def print_results(total, durations, min_time, max_time, errors, success):
+    for type_ in total.keys():
+        trans_success = success.get(type_, 0)
+        trans_fail    = errors.get(type_, 0)
+        trans         = trans_success + trans_fail
+        avail         = trans_success * 100.0 / trans
+        elapsed       = max_time[type_] - min_time[type_]
+        data          = total[type_] / 1024.0 / 1024.0 # convert to MB
+        resp_time     = sum(durations[type_]) / float(NANOSECONDS) / \
+                        len(durations[type_])
+        trans_rate    = trans / elapsed
+        data_rate     = data / elapsed
+        conc          = trans_rate * resp_time
+        trans_long    = max(durations[type_]) / float(NANOSECONDS)
+        trans_short   = min(durations[type_]) / float(NANOSECONDS)
+
+        print OUTPUT_FORMAT.format(
+            type=type_,
+            trans_success=trans_success,
+            trans_fail=trans_fail,
+            trans=trans,
+            avail=avail,
+            elapsed=elapsed,
+            data=data,
+            resp_time=resp_time,
+            trans_rate=trans_rate,
+            data_rate=data_rate,
+            conc=conc,
+            trans_long=trans_long,
+            trans_short=trans_short,
+            )
+
+if __name__ == '__main__':
+    main()
+
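The derived figures in ``print_results`` follow from a handful of per-type totals. As a worked illustration with made-up numbers, here is a Python 3 sketch of the same arithmetic. The ``summarize`` function and its inputs are hypothetical and not part of ``rwstats.py``.

```python
NANOSECONDS = int(1e9)


def summarize(durations_ns, first_start, last_end, total_bytes, failures=0):
    """Hypothetical helper mirroring the arithmetic in print_results."""
    successes = len(durations_ns)
    trans = successes + failures
    elapsed = last_end - first_start  # wall-clock seconds for this type
    # Mean response time in seconds, over successful transactions only.
    resp_time = sum(durations_ns) / NANOSECONDS / successes
    trans_rate = trans / elapsed
    return {
        "avail": successes * 100.0 / trans,
        "data_mb": total_bytes / 1024.0 / 1024.0,
        "resp_time": resp_time,
        "trans_rate": trans_rate,
        # Average number of requests in flight: rate x mean response time.
        "conc": trans_rate * resp_time,
    }


stats = summarize(
    durations_ns=[2 * NANOSECONDS, 4 * NANOSECONDS],
    first_start=100.0,
    last_end=104.0,
    total_bytes=2 * 1024 * 1024,
    failures=2,
)
print(stats)
```

With two successes and two failures over four seconds, availability is 50 %, the transaction rate is 1 per second, the mean response time is 3 seconds, and the resulting concurrency is 3.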
--- a/s3tests_boto3/common.py
+++ b/s3tests_boto3/common.py
@@ -0,0 +1,301 @@
+import boto.s3.connection
+import bunch
+import itertools
+import os
+import random
+import string
+import yaml
+import re
+from lxml import etree
+
+from doctest import Example
+from lxml.doctestcompare import LXMLOutputChecker
+
+s3 = bunch.Bunch()
+config = bunch.Bunch()
+prefix = ''
+
+bucket_counter = itertools.count(1)
+key_counter = itertools.count(1)
+
+def choose_bucket_prefix(template, max_len=30):
+    """
+    Choose a prefix for our test buckets, so they're easy to identify.
+
+    Use template and feed it more and more random filler, until it's
+    as long as possible without exceeding max_len.
+    """
+    rand = ''.join(
+        random.choice(string.ascii_lowercase + string.digits)
+        for c in range(255)
+        )
+
+    while rand:
+        s = template.format(random=rand)
+        if len(s) <= max_len:
+            return s
+        rand = rand[:-1]
+
+    raise RuntimeError(
+        'Bucket prefix template is impossible to fulfill: {template!r}'.format(
+            template=template,
+            ),
+        )
+
+def nuke_bucket(bucket):
+    try:
+        bucket.set_canned_acl('private')
+        # TODO: deleted_cnt and the while loop is a work around for rgw
+        # not sending the
+        deleted_cnt = 1
+        while deleted_cnt:
+            deleted_cnt = 0
+            for key in bucket.list():
+                print 'Cleaning bucket {bucket} key {key}'.format(
+                    bucket=bucket,
+                    key=key,
+                    )
+                key.set_canned_acl('private')
+                key.delete()
+                deleted_cnt += 1
+        bucket.delete()
+    except boto.exception.S3ResponseError as e:
+        # TODO workaround for buggy rgw that fails to send
+        # error_code, remove
+        if (e.status == 403
+            and e.error_code is None
+            and e.body == ''):
+            e.error_code = 'AccessDenied'
+        if e.error_code != 'AccessDenied':
+            print 'GOT UNWANTED ERROR', e.error_code
+            raise
+        # seems like we're not the owner of the bucket; ignore
+        pass
+
+def nuke_prefixed_buckets():
+    for name, conn in s3.items():
+        print 'Cleaning buckets from connection {name}'.format(name=name)
+        for bucket in conn.get_all_buckets():
+            if bucket.name.startswith(prefix):
+                print 'Cleaning bucket {bucket}'.format(bucket=bucket)
+                nuke_bucket(bucket)
+
+    print 'Done with cleanup of test buckets.'
+
+def read_config(fp):
+    config = bunch.Bunch()
+    g = yaml.safe_load_all(fp)
+    for new in g:
+        config.update(bunch.bunchify(new))
+    return config
+
+def connect(conf):
+    mapping = dict(
+        port='port',
+        host='host',
+        is_secure='is_secure',
+        access_key='aws_access_key_id',
+        secret_key='aws_secret_access_key',
+        )
+    kwargs = dict((mapping[k],v) for (k,v) in conf.iteritems() if k in mapping)
+    #process calling_format argument
+    calling_formats = dict(
+        ordinary=boto.s3.connection.OrdinaryCallingFormat(),
+        subdomain=boto.s3.connection.SubdomainCallingFormat(),
+        vhost=boto.s3.connection.VHostCallingFormat(),
+        )
+    kwargs['calling_format'] = calling_formats['ordinary']
+    if conf.has_key('calling_format'):
+        raw_calling_format = conf['calling_format']
+        try:
+            kwargs['calling_format'] = calling_formats[raw_calling_format]
+        except KeyError:
+            raise RuntimeError(
+                'calling_format unknown: %r' % raw_calling_format
+                )
+    # TODO test vhost calling format
+    conn = boto.s3.connection.S3Connection(**kwargs)
+    return conn
+
+def setup():
+    global s3, config, prefix
+    s3.clear()
+    config.clear()
+
+    try:
+        path = os.environ['S3TEST_CONF']
+    except KeyError:
+        raise RuntimeError(
+            'To run tests, point environment '
+            + 'variable S3TEST_CONF to a config file.',
+            )
+    with file(path) as f:
+        config.update(read_config(f))
+
+    # These 3 should always be present.
+    if 's3' not in config:
+        raise RuntimeError('Your config file is missing the s3 section!')
+    if 'defaults' not in config.s3:
+        raise RuntimeError('Your config file is missing the s3.defaults section!')
+    if 'fixtures' not in config:
+        raise RuntimeError('Your config file is missing the fixtures section!')
+
+    template = config.fixtures.get('bucket prefix', 'test-{random}-')
+    prefix = choose_bucket_prefix(template=template)
+    if prefix == '':
+        raise RuntimeError("Empty Prefix! Aborting!")
+
+    defaults = config.s3.defaults
+    for section in config.s3.keys():
+        if section == 'defaults':
+            continue
+
+        conf = {}
+        conf.update(defaults)
+        conf.update(config.s3[section])
+        conn = connect(conf)
+        s3[section] = conn
+
+    # WARNING! we actively delete all buckets we see with the prefix
+    # we've chosen! Choose your prefix with care, and don't reuse
+    # credentials!
+
+    # We also assume nobody else is going to use buckets with that
+    # prefix. This is racy but given enough randomness, should not
+    # really fail.
+    nuke_prefixed_buckets()
+
+def get_new_bucket(connection=None):
+    """
+    Get a bucket that exists and is empty.
+
+    Always recreates a bucket from scratch. This is useful to also
+    reset ACLs and such.
+    """
+    if connection is None:
+        connection = s3.main
+    name = '{prefix}{num}'.format(
+        prefix=prefix,
+        num=next(bucket_counter),
+        )
+    # the only way for this to fail with a pre-existing bucket is if
+    # someone raced us between setup nuke_prefixed_buckets and here;
+    # ignore that as astronomically unlikely
+    bucket = connection.create_bucket(name)
+    return bucket
+
+def teardown():
+    nuke_prefixed_buckets()
+
+def with_setup_kwargs(setup, teardown=None):
+    """Decorator to add setup and/or teardown methods to a test function::
+
+      @with_setup_kwargs(setup, teardown)
+      def test_something():
+          " ... "
+
+    The setup function should return (kwargs) which will be passed to
+    test function, and teardown function.
+
+    Note that `with_setup_kwargs` is useful *only* for test functions, not for test
+    methods or inside of TestCase subclasses.
+    """
+    def decorate(func):
+        kwargs = {}
+
+        def test_wrapped(*args, **kwargs2):
+            k2 = kwargs.copy()
+            k2.update(kwargs2)
+            k2['testname'] = func.__name__
+            func(*args, **k2)
+
+        test_wrapped.__name__ = func.__name__
+
+        def setup_wrapped():
+            k = setup()
+            kwargs.update(k)
+            if hasattr(func, 'setup'):
+                func.setup()
+        test_wrapped.setup = setup_wrapped
+
+        if teardown:
+            def teardown_wrapped():
+                if hasattr(func, 'teardown'):
+                    func.teardown()
+                teardown(**kwargs)
+
+            test_wrapped.teardown = teardown_wrapped
+        else:
+            if hasattr(func, 'teardown'):
+                test_wrapped.teardown = func.teardown
+        return test_wrapped
+    return decorate
+
+# Demo case for the above, when you run test_gen():
+# _test_gen will run twice,
+# with the following stderr printing
+# setup_func {'b': 2}
+# testcase ('1',) {'b': 2, 'testname': '_test_gen'}
+# teardown_func {'b': 2}
+# setup_func {'b': 2}
+# testcase () {'b': 2, 'testname': '_test_gen'}
+# teardown_func {'b': 2}
+# 
+#def setup_func():
+#    kwargs = {'b': 2}
+#    print("setup_func", kwargs, file=sys.stderr)
+#    return kwargs
+#
+#def teardown_func(**kwargs):
+#    print("teardown_func", kwargs, file=sys.stderr)
+#
+#@with_setup_kwargs(setup=setup_func, teardown=teardown_func)
+#def _test_gen(*args, **kwargs):
+#    print("testcase", args, kwargs, file=sys.stderr)
+#
+#def test_gen():
+#    yield _test_gen, '1'
+#    yield _test_gen
+
+def trim_xml(xml_str):
+    p = etree.XMLParser(remove_blank_text=True)
+    elem = etree.XML(xml_str, parser=p)
+    return etree.tostring(elem)
+
+def normalize_xml(xml, pretty_print=True):
+    if xml is None:
+        return xml
+
+    root = etree.fromstring(xml.encode(encoding='ascii'))
+
+    for element in root.iter('*'):
+        if element.text is not None and not element.text.strip():
+            element.text = None
+        if element.text is not None:
+            element.text = element.text.strip().replace("\n", "").replace("\r", "")
+        if element.tail is not None and not element.tail.strip():
+            element.tail = None
+        if element.tail is not None:
+            element.tail = element.tail.strip().replace("\n", "").replace("\r", "")
+
+    # Sort the elements
+    for parent in root.xpath('//*[./*]'): # Search for parent elements
+          parent[:] = sorted(parent,key=lambda x: x.tag)
+
+    xmlstr = etree.tostring(root, encoding="utf-8", xml_declaration=True, pretty_print=pretty_print)
+    # there are two different DTD URIs
+    xmlstr = re.sub(r'xmlns="[^"]+"', 'xmlns="s3"', xmlstr)
+    xmlstr = re.sub(r'xmlns=\'[^\']+\'', 'xmlns="s3"', xmlstr)
+    for uri in ['http://doc.s3.amazonaws.com/doc/2006-03-01/', 'http://s3.amazonaws.com/doc/2006-03-01/']:
+        xmlstr = xmlstr.replace(uri, 'URI-DTD')
+    #xmlstr = re.sub(r'>\s+', '>', xmlstr, count=0, flags=re.MULTILINE)
+    return xmlstr
+
+def assert_xml_equal(got, want):
+    assert want is not None, 'Wanted XML cannot be None'
+    if got is None:
+        raise AssertionError('Got input to validate was None')
+    checker = LXMLOutputChecker()
+    if not checker.check_output(want, got, 0):
+        message = checker.output_difference(Example("", want), got, 0)
+        raise AssertionError(message)
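The prefix-padding behaviour of ``choose_bucket_prefix`` is easiest to see by running it. The block below is a Python 3 rendering of the same logic, offered as a sketch for experimentation rather than a copy of the committed code. The example templates are assumptions.

```python
import random
import string


def choose_bucket_prefix(template, max_len=30):
    # Start with far more random filler than can fit, then trim one
    # character at a time until the formatted prefix fits in max_len.
    rand = ''.join(
        random.choice(string.ascii_lowercase + string.digits)
        for _ in range(255)
    )
    while rand:
        s = template.format(random=rand)
        if len(s) <= max_len:
            return s
        rand = rand[:-1]
    # Even a single filler character did not fit.
    raise RuntimeError(
        'Bucket prefix template is impossible to fulfill: {!r}'.format(template)
    )


prefix = choose_bucket_prefix('test-{random}-')
print(prefix, len(prefix))

# A template whose fixed part already fills max_len cannot be satisfied.
try:
    choose_bucket_prefix('x' * 30 + '{random}')
    impossible = False
except RuntimeError:
    impossible = True
```

Because the filler starts at 255 characters and shrinks, the first string that fits is also the longest one that fits, so the prefix uses the full 30 characters whenever the template leaves room for at least one filler character.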
--- a/s3tests_boto3/functional/__init__.py
+++ b/s3tests_boto3/functional/__init__.py
@@ -0,0 +1,405 @@
+import boto3
+from botocore import UNSIGNED
+from botocore.client import Config
+from botocore.handlers import disable_signing
+import ConfigParser
+import os
+import bunch
+import random
+import string
+import itertools
+
+config = bunch.Bunch()
+
+# this will be assigned by setup()
+prefix = None
+
+def get_prefix():
+    assert prefix is not None
+    return prefix
+
+def choose_bucket_prefix(template, max_len=30):
+    """
+    Choose a prefix for our test buckets, so they're easy to identify.
+
+    Use template and feed it more and more random filler, until it's
+    as long as possible without exceeding max_len.
+    """
+    rand = ''.join(
+        random.choice(string.ascii_lowercase + string.digits)
+        for c in range(255)
+        )
+
+    while rand:
+        s = template.format(random=rand)
+        if len(s) <= max_len:
+            return s
+        rand = rand[:-1]
+
+    raise RuntimeError(
+        'Bucket prefix template is impossible to fulfill: {template!r}'.format(
+            template=template,
+            ),
+        )
+
+def get_buckets_list(client=None, prefix=None):
+    if client is None:
+        client = get_client()
+    if prefix is None:
+        prefix = get_prefix()
+    response = client.list_buckets()
+    bucket_dicts = response['Buckets']
+    buckets_list = []
+    for bucket in bucket_dicts:
+        if bucket['Name'].startswith(prefix):
+            buckets_list.append(bucket['Name'])
+
+    return buckets_list
+
+def get_objects_list(bucket, client=None, prefix=None):
+    if client is None:
+        client = get_client()
+
+    if prefix is None:
+        response = client.list_objects(Bucket=bucket)
+    else:
+        response = client.list_objects(Bucket=bucket, Prefix=prefix)
+    objects_list = []
+
+    if 'Contents' in response:
+        contents = response['Contents']
+        for obj in contents:
+            objects_list.append(obj['Key'])
+
+    return objects_list
+
+def get_versioned_objects_list(bucket, client=None):
+    if client is None:
+        client = get_client()
+    response = client.list_object_versions(Bucket=bucket)
+    versioned_objects_list = []
+
+    if 'Versions' in response:
+        contents = response['Versions']
+        for obj in contents:
+            key = obj['Key']
+            version_id = obj['VersionId']
+            versioned_obj = (key,version_id)
+            versioned_objects_list.append(versioned_obj)
+
+    return versioned_objects_list
+
+def get_delete_markers_list(bucket, client=None):
+    if client is None:
+        client = get_client()
+    response = client.list_object_versions(Bucket=bucket)
+    delete_markers = []
+
+    if 'DeleteMarkers' in response:
+        contents = response['DeleteMarkers']
+        for obj in contents:
+            key = obj['Key']
+            version_id = obj['VersionId']
+            versioned_obj = (key,version_id)
+            delete_markers.append(versioned_obj)
+
+    return delete_markers
+
+
+def nuke_prefixed_buckets(prefix, client=None):
+    if client is None:
+        client = get_client()
+
+    buckets = get_buckets_list(client, prefix)
+
+    if buckets != []:
+        for bucket_name in buckets:
+            objects_list = get_objects_list(bucket_name, client)
+            for obj in objects_list:
+                response = client.delete_object(Bucket=bucket_name,Key=obj)
+            versioned_objects_list = get_versioned_objects_list(bucket_name, client)
+            for obj in versioned_objects_list:
+                response = client.delete_object(Bucket=bucket_name,Key=obj[0],VersionId=obj[1])
+            delete_markers = get_delete_markers_list(bucket_name, client)
+            for obj in delete_markers:
+                response = client.delete_object(Bucket=bucket_name,Key=obj[0],VersionId=obj[1])
+            client.delete_bucket(Bucket=bucket_name)
+
+    print('Done with cleanup of buckets in tests.')
+
+def setup():
+    cfg = ConfigParser.RawConfigParser()
+    try:
+        path = os.environ['S3TEST_CONF']
+    except KeyError:
+        raise RuntimeError(
+            'To run tests, point environment '
+            + 'variable S3TEST_CONF to a config file.',
+            )
+    with file(path) as f:
+        cfg.readfp(f)
+
+    if not cfg.defaults():
+        raise RuntimeError('Your config file is missing the DEFAULT section!')
+    if not cfg.has_section("s3 main"):
+        raise RuntimeError('Your config file is missing the "s3 main" section!')
+    if not cfg.has_section("s3 alt"):
+        raise RuntimeError('Your config file is missing the "s3 alt" section!')
+    if not cfg.has_section("s3 tenant"):
+        raise RuntimeError('Your config file is missing the "s3 tenant" section!')
+
+    global prefix
+
+    defaults = cfg.defaults()
+
+    # vars from the DEFAULT section
+    config.default_host = defaults.get("host")
+    config.default_port = int(defaults.get("port"))
+    config.default_is_secure = cfg.getboolean('DEFAULT', "is_secure")
+
+    # vars from the main section
+    config.main_access_key = cfg.get('s3 main',"access_key")
+    config.main_secret_key = cfg.get('s3 main',"secret_key")
+    config.main_display_name = cfg.get('s3 main',"display_name")
+    config.main_user_id = cfg.get('s3 main',"user_id")
+    config.main_email = cfg.get('s3 main',"email")
+    try:
+        config.main_kms_keyid = cfg.get('s3 main',"kms_keyid")
+    except (ConfigParser.NoSectionError, ConfigParser.NoOptionError):
+        config.main_kms_keyid = None
+        pass
+
+    try:
+        config.main_api_name = cfg.get('s3 main',"api_name")
+    except (ConfigParser.NoSectionError, ConfigParser.NoOptionError):
+        config.main_api_name = ""
+        pass
+
+    config.alt_access_key = cfg.get('s3 alt',"access_key")
+    config.alt_secret_key = cfg.get('s3 alt',"secret_key")
+    config.alt_display_name = cfg.get('s3 alt',"display_name")
+    config.alt_user_id = cfg.get('s3 alt',"user_id")
+    config.alt_email = cfg.get('s3 alt',"email")
+
+    config.tenant_access_key = cfg.get('s3 tenant',"access_key")
+    config.tenant_secret_key = cfg.get('s3 tenant',"secret_key")
+    config.tenant_display_name = cfg.get('s3 tenant',"display_name")
+    config.tenant_user_id = cfg.get('s3 tenant',"user_id")
+    config.tenant_email = cfg.get('s3 tenant',"email")
+
+    # vars from the fixtures section
+    try:
+        template = cfg.get('fixtures', "bucket prefix")
+    except (ConfigParser.NoSectionError, ConfigParser.NoOptionError):
+        template = 'test-{random}-'
+    prefix = choose_bucket_prefix(template=template)
+
+    alt_client = get_alt_client()
+    tenant_client = get_tenant_client()
+    nuke_prefixed_buckets(prefix=prefix)
+    nuke_prefixed_buckets(prefix=prefix, client=alt_client)
+    nuke_prefixed_buckets(prefix=prefix, client=tenant_client)
+
+def teardown():
+    alt_client = get_alt_client()
+    tenant_client = get_tenant_client()
+    nuke_prefixed_buckets(prefix=prefix)
+    nuke_prefixed_buckets(prefix=prefix, client=alt_client)
+    nuke_prefixed_buckets(prefix=prefix, client=tenant_client)
+
+def get_client(client_config=None):
+    if client_config is None:
+        client_config = Config(signature_version='s3v4')
+
+    endpoint_url = "http://%s:%d" % (config.default_host, config.default_port)
+
+    client = boto3.client(service_name='s3',
+                        aws_access_key_id=config.main_access_key,
+                        aws_secret_access_key=config.main_secret_key,
+                        endpoint_url=endpoint_url,
+                        use_ssl=config.default_is_secure,
+                        verify=False,
+                        config=client_config)
+    return client
+
+def get_v2_client():
+
+    endpoint_url = "http://%s:%d" % (config.default_host, config.default_port)
+
+    client = boto3.client(service_name='s3',
+                        aws_access_key_id=config.main_access_key,
+                        aws_secret_access_key=config.main_secret_key,
+                        endpoint_url=endpoint_url,
+                        use_ssl=config.default_is_secure,
+                        verify=False,
+                        config=Config(signature_version='s3'))
+    return client
+
+def get_alt_client(client_config=None):
+    if client_config is None:
+        client_config = Config(signature_version='s3v4')
+
+    endpoint_url = "http://%s:%d" % (config.default_host, config.default_port)
+
+    client = boto3.client(service_name='s3',
+                        aws_access_key_id=config.alt_access_key,
+                        aws_secret_access_key=config.alt_secret_key,
+                        endpoint_url=endpoint_url,
+                        use_ssl=config.default_is_secure,
+                        verify=False,
+                        config=client_config)
+    return client
+
+def get_tenant_client(client_config=None):
+    if client_config is None:
+        client_config = Config(signature_version='s3v4')
+
+    endpoint_url = "http://%s:%d" % (config.default_host, config.default_port)
+
+    client = boto3.client(service_name='s3',
+                        aws_access_key_id=config.tenant_access_key,
+                        aws_secret_access_key=config.tenant_secret_key,
+                        endpoint_url=endpoint_url,
+                        use_ssl=config.default_is_secure,
+                        verify=False,
+                        config=client_config)
+    return client
+
+def get_unauthenticated_client():
+
+    endpoint_url = "http://%s:%d" % (config.default_host, config.default_port)
+
+    client = boto3.client(service_name='s3',
+                        aws_access_key_id='',
+                        aws_secret_access_key='',
+                        endpoint_url=endpoint_url,
+                        use_ssl=config.default_is_secure,
+                        verify=False,
+                        config=Config(signature_version=UNSIGNED))
+    return client
+
+def get_bad_auth_client(aws_access_key_id='badauth'):
+
+    endpoint_url = "http://%s:%d" % (config.default_host, config.default_port)
+
+    client = boto3.client(service_name='s3',
+                        aws_access_key_id=aws_access_key_id,
+                        aws_secret_access_key='roflmao',
+                        endpoint_url=endpoint_url,
+                        use_ssl=config.default_is_secure,
+                        verify=False,
+                        config=Config(signature_version='s3v4'))
+    return client
+
+bucket_counter = itertools.count(1)
+
+def get_new_bucket_name():
+    """
+    Get a bucket name that probably does not exist.
+
+    We make every attempt to use a unique random prefix, so if a
+    bucket by this name happens to exist, it's ok if tests give
+    false negatives.
+    """
+    name = '{prefix}{num}'.format(
+        prefix=prefix,
+        num=next(bucket_counter),
+        )
+    return name
+
+def get_new_bucket_resource(name=None):
+    """
+    Get a bucket that exists and is empty.
+
+    Always recreates a bucket from scratch. This is useful to also
+    reset ACLs and such.
+    """
+    endpoint_url = "http://%s:%d" % (config.default_host, config.default_port)
+
+    s3 = boto3.resource('s3',
+                        use_ssl=False,
+                        verify=False,
+                        endpoint_url=endpoint_url,
+                        aws_access_key_id=config.main_access_key,
+                        aws_secret_access_key=config.main_secret_key)
+    if name is None:
+        name = get_new_bucket_name()
+    bucket = s3.Bucket(name)
+    bucket.create()
+    return bucket
+
+def get_new_bucket(client=None, name=None):
+    """
+    Create a new, empty bucket and return its name.
+
+    Always recreates a bucket from scratch. This is useful to also
+    reset ACLs and such.
+    """
+    if client is None:
+        client = get_client()
+    if name is None:
+        name = get_new_bucket_name()
+
+    client.create_bucket(Bucket=name)
+    return name
+
+
+def get_config_is_secure():
+    return config.default_is_secure
+
+def get_config_host():
+    return config.default_host
+
+def get_config_port():
+    return config.default_port
+
+def get_main_aws_access_key():
+    return config.main_access_key
+
+def get_main_aws_secret_key():
+    return config.main_secret_key
+
+def get_main_display_name():
+    return config.main_display_name
+
+def get_main_user_id():
+    return config.main_user_id
+
+def get_main_email():
+    return config.main_email
+
+def get_main_api_name():
+    return config.main_api_name
+
+def get_main_kms_keyid():
+    return config.main_kms_keyid
+
+def get_alt_aws_access_key():
+    return config.alt_access_key
+
+def get_alt_aws_secret_key():
+    return config.alt_secret_key
+
+def get_alt_display_name():
+    return config.alt_display_name
+
+def get_alt_user_id():
+    return config.alt_user_id
+
+def get_alt_email():
+    return config.alt_email
+
+def get_tenant_aws_access_key():
+    return config.tenant_access_key
+
+def get_tenant_aws_secret_key():
+    return config.tenant_secret_key
+
+def get_tenant_display_name():
+    return config.tenant_display_name
+
+def get_tenant_user_id():
+    return config.tenant_user_id
+
+def get_tenant_email():
+    return config.tenant_email
--- a/s3tests_boto3/functional/policy.py
+++ b/s3tests_boto3/functional/policy.py
@ -0,0 +1,46 @@
+import json
+
+class Statement(object):
+    def __init__(self, action, resource, principal=None, effect="Allow", condition=None):
+        # Avoid a shared mutable default; fall back to the wildcard principal.
+        self.principal = {"AWS": "*"} if principal is None else principal
+        self.action = action
+        self.resource = resource
+        self.condition = condition
+        self.effect = effect
+
+    def to_dict(self):
+        d = { "Action" : self.action,
+              "Principal" : self.principal,
+              "Effect" : self.effect,
+              "Resource" : self.resource
+        }
+
+        if self.condition is not None:
+            d["Condition"] = self.condition
+
+        return d
+
+class Policy(object):
+    def __init__(self):
+        self.statements = []
+
+    def add_statement(self, s):
+        self.statements.append(s)
+        return self
+
+    def to_json(self):
+        policy_dict = {
+            "Version" : "2012-10-17",
+            "Statement":
+            [s.to_dict() for s in self.statements]
+        }
+
+        return json.dumps(policy_dict)
+
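+def make_json_policy(action, resource, principal=None, conditions=None):
+    """
+    Helper function to make single statement policies
+    """
+    if principal is None:
+        # Default to the wildcard principal without sharing a mutable default.
+        principal = {"AWS": "*"}
+    s = Statement(action, resource, principal, condition=conditions)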
+    p = Policy()
+    return p.add_statement(s).to_json()
--- a/s3tests_boto3/functional/rgw_interactive.py
+++ b/s3tests_boto3/functional/rgw_interactive.py
@ -0,0 +1,92 @@
+#!/usr/bin/python
+import boto3
+import os
+import random
+import string
+import itertools
+
+host = "localhost"
+port = 8000
+
+## AWS access key
+access_key = "0555b35654ad1656d804"
+
+## AWS secret key
+secret_key = "h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q=="
+
+prefix = "YOURNAMEHERE-1234-"
+
+endpoint_url = "http://%s:%d" % (host, port)
+
+client = boto3.client(service_name='s3',
+                    aws_access_key_id=access_key,
+                    aws_secret_access_key=secret_key,
+                    endpoint_url=endpoint_url,
+                    use_ssl=False,
+                    verify=False)
+
+s3 = boto3.resource('s3',
+                    use_ssl=False,
+                    verify=False,
+                    endpoint_url=endpoint_url,
+                    aws_access_key_id=access_key,
+                    aws_secret_access_key=secret_key)
+
+def choose_bucket_prefix(template, max_len=30):
+    """
+    Choose a prefix for our test buckets, so they're easy to identify.
+
+    Use template and feed it more and more random filler, until it's
+    as long as possible but still below max_len.
+    """
+    rand = ''.join(
+        random.choice(string.ascii_lowercase + string.digits)
+        for _ in range(255)
+        )
+
+    while rand:
+        s = template.format(random=rand)
+        if len(s) <= max_len:
+            return s
+        rand = rand[:-1]
+
+    raise RuntimeError(
+        'Bucket prefix template is impossible to fulfill: {template!r}'.format(
+            template=template,
+            ),
+        )
+
+bucket_counter = itertools.count(1)
+
+def get_new_bucket_name():
+    """
+    Get a bucket name that probably does not exist.
+
+    We make every attempt to use a unique random prefix, so if a
+    bucket by this name happens to exist, it's ok if tests give
+    false negatives.
+    """
+    name = '{prefix}{num}'.format(
+        prefix=prefix,
+        num=next(bucket_counter),
+        )
+    return name
+
+def get_new_bucket(session=boto3, name=None, headers=None):
+    """
+    Get a bucket that exists and is empty.
+
+    Always recreates a bucket from scratch. This is useful to also
+    reset ACLs and such.
+    """
+    s3 = session.resource('s3',
+                        use_ssl=False,
+                        verify=False,
+                        endpoint_url=endpoint_url,
+                        aws_access_key_id=access_key,
+                        aws_secret_access_key=secret_key)
+    if name is None:
+        name = get_new_bucket_name()
+    bucket = s3.Bucket(name)
+    bucket.create()
+    return bucket
--- a/s3tests_boto3/functional/test_headers.py
+++ b/s3tests_boto3/functional/test_headers.py
@ -0,0 +1,793 @@
+import boto3
+from nose.tools import eq_ as eq
+from nose.plugins.attrib import attr
+import nose
+from botocore.exceptions import ClientError
+from email.utils import formatdate
+
+from .utils import assert_raises
+from .utils import _get_status_and_error_code
+from .utils import _get_status
+
+from . import (
+    get_client,
+    get_v2_client,
+    get_new_bucket,
+    get_new_bucket_name,
+    )
+
+def _add_header_create_object(headers, client=None):
+    """ Create a new bucket, add an object w/header customizations
+    """
+    bucket_name = get_new_bucket()
+    if client is None:
+        client = get_client()
+    key_name = 'foo'
+
+    # pass in custom headers before PutObject call
+    add_headers = (lambda **kwargs: kwargs['params']['headers'].update(headers))
+    client.meta.events.register('before-call.s3.PutObject', add_headers)
+    client.put_object(Bucket=bucket_name, Key=key_name)
+
+    return bucket_name, key_name
+
+
+def _add_header_create_bad_object(headers, client=None):
+    """ Create a new bucket, add an object with a header. This should cause a failure 
+    """
+    bucket_name = get_new_bucket()
+    if client is None:
+        client = get_client()
+    key_name = 'foo'
+
+    # pass in custom headers before PutObject call
+    add_headers = (lambda **kwargs: kwargs['params']['headers'].update(headers))
+    client.meta.events.register('before-call.s3.PutObject', add_headers)
+    e = assert_raises(ClientError, client.put_object, Bucket=bucket_name, Key=key_name, Body='bar')
+
+    return e
+
+
+def _remove_header_create_object(remove, client=None):
+    """ Create a new bucket, add an object without a header
+    """
+    bucket_name = get_new_bucket()
+    if client is None:
+        client = get_client()
+    key_name = 'foo'
+
+    # remove custom headers before PutObject call
+    def remove_header(**kwargs):
+        if (remove in kwargs['params']['headers']):
+            del kwargs['params']['headers'][remove]
+
+    client.meta.events.register('before-call.s3.PutObject', remove_header)
+    client.put_object(Bucket=bucket_name, Key=key_name)
+
+    return bucket_name, key_name
+
+def _remove_header_create_bad_object(remove, client=None):
+    """ Create a new bucket, add an object without a header. This should cause a failure
+    """
+    bucket_name = get_new_bucket()
+    if client is None:
+        client = get_client()
+    key_name = 'foo'
+
+    # remove custom headers before PutObject call
+    def remove_header(**kwargs):
+        if (remove in kwargs['params']['headers']):
+            del kwargs['params']['headers'][remove]
+
+    client.meta.events.register('before-call.s3.PutObject', remove_header)
+    e = assert_raises(ClientError, client.put_object, Bucket=bucket_name, Key=key_name, Body='bar')
+
+    return e
+
+
+def _add_header_create_bucket(headers, client=None):
+    """ Create a new bucket, w/header customizations
+    """
+    bucket_name = get_new_bucket_name()
+    if client is None:
+        client = get_client()
+
+    # pass in custom headers before CreateBucket call
+    add_headers = (lambda **kwargs: kwargs['params']['headers'].update(headers))
+    client.meta.events.register('before-call.s3.CreateBucket', add_headers)
+    client.create_bucket(Bucket=bucket_name)
+
+    return bucket_name
+
+
+def _add_header_create_bad_bucket(headers=None, client=None):
+    """ Create a new bucket, w/header customizations that should cause a failure 
+    """
+    bucket_name = get_new_bucket_name()
+    if client is None:
+        client = get_client()
+
+    # pass in custom headers before CreateBucket call
+    add_headers = (lambda **kwargs: kwargs['params']['headers'].update(headers))
+    client.meta.events.register('before-call.s3.CreateBucket', add_headers)
+    e = assert_raises(ClientError, client.create_bucket, Bucket=bucket_name)
+
+    return e
+
+
+def _remove_header_create_bucket(remove, client=None):
+    """ Create a new bucket, without a header
+    """
+    bucket_name = get_new_bucket_name()
+    if client is None:
+        client = get_client()
+
+    # remove custom headers before CreateBucket call
+    def remove_header(**kwargs):
+        if (remove in kwargs['params']['headers']):
+            del kwargs['params']['headers'][remove]
+
+    client.meta.events.register('before-call.s3.CreateBucket', remove_header)
+    client.create_bucket(Bucket=bucket_name)
+
+    return bucket_name
+
+def _remove_header_create_bad_bucket(remove, client=None):
+    """ Create a new bucket, without a header. This should cause a failure
+    """
+    bucket_name = get_new_bucket_name()
+    if client is None:
+        client = get_client()
+
+    # remove custom headers before CreateBucket call
+    def remove_header(**kwargs):
+        if (remove in kwargs['params']['headers']):
+            del kwargs['params']['headers'][remove]
+
+    client.meta.events.register('before-call.s3.CreateBucket', remove_header)
+    e = assert_raises(ClientError, client.create_bucket, Bucket=bucket_name)
+
+    return e
+
+def tag(*tags):
+    def wrap(func):
+        for tag in tags:
+            setattr(func, tag, True)
+        return func
+    return wrap
+
+#
+# common tests
+#
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/invalid MD5')
+@attr(assertion='fails 400')
+def test_object_create_bad_md5_invalid_short():
+    e = _add_header_create_bad_object({'Content-MD5':'YWJyYWNhZGFicmE='})
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+    eq(error_code, 'InvalidDigest')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/mismatched MD5')
+@attr(assertion='fails 400')
+def test_object_create_bad_md5_bad():
+    e = _add_header_create_bad_object({'Content-MD5':'rL0Y20xC+Fzt72VPzMSk2A=='})
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+    eq(error_code, 'BadDigest')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/empty MD5')
+@attr(assertion='fails 400')
+def test_object_create_bad_md5_empty():
+    e = _add_header_create_bad_object({'Content-MD5':''})
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+    eq(error_code, 'InvalidDigest')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/no MD5 header')
+@attr(assertion='succeeds')
+def test_object_create_bad_md5_none():
+    bucket_name, key_name = _remove_header_create_object('Content-MD5')
+    client = get_client()
+    client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/Expect 200')
+@attr(assertion='garbage, but S3 succeeds!')
+def test_object_create_bad_expect_mismatch():
+    bucket_name, key_name = _add_header_create_object({'Expect': 200})
+    client = get_client()
+    client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/empty expect')
+@attr(assertion='succeeds ... should it?')
+def test_object_create_bad_expect_empty():
+    bucket_name, key_name = _add_header_create_object({'Expect': ''})
+    client = get_client()
+    client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/no expect')
+@attr(assertion='succeeds')
+def test_object_create_bad_expect_none():
+    bucket_name, key_name = _remove_header_create_object('Expect')
+    client = get_client()
+    client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/empty content length')
+@attr(assertion='fails 400')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the content-length header
+@attr('fails_on_rgw')
+def test_object_create_bad_contentlength_empty():
+    e = _add_header_create_bad_object({'Content-Length':''})
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/negative content length')
+@attr(assertion='fails 400')
+@attr('fails_on_mod_proxy_fcgi')
+def test_object_create_bad_contentlength_negative():
+    client = get_client()
+    bucket_name = get_new_bucket()
+    key_name = 'foo'
+    e = assert_raises(ClientError, client.put_object, Bucket=bucket_name, Key=key_name, ContentLength=-1)
+    status = _get_status(e.response)
+    eq(status, 400)
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/no content length')
+@attr(assertion='fails 411')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the content-length header
+@attr('fails_on_rgw')
+def test_object_create_bad_contentlength_none():
+    remove = 'Content-Length'
+    e = _remove_header_create_bad_object(remove)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 411)
+    eq(error_code, 'MissingContentLength')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/content length too long')
+@attr(assertion='fails 400')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the content-length header
+@attr('fails_on_rgw')
+def test_object_create_bad_contentlength_mismatch_above():
+    content = 'bar'
+    length = len(content) + 1
+
+    client = get_client()
+    bucket_name = get_new_bucket()
+    key_name = 'foo'
+    headers = {'Content-Length': str(length)}
+    add_headers = (lambda **kwargs: kwargs['params']['headers'].update(headers))
+    client.meta.events.register('before-sign.s3.PutObject', add_headers)
+
+    e = assert_raises(ClientError, client.put_object, Bucket=bucket_name, Key=key_name, Body=content)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/content type text/plain')
+@attr(assertion='succeeds')
+def test_object_create_bad_contenttype_invalid():
+    bucket_name, key_name = _add_header_create_object({'Content-Type': 'text/plain'})
+    client = get_client()
+    client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/empty content type')
+@attr(assertion='succeeds')
+def test_object_create_bad_contenttype_empty():
+    client = get_client()
+    key_name = 'foo'
+    bucket_name = get_new_bucket()
+    client.put_object(Bucket=bucket_name, Key=key_name, Body='bar', ContentType='')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/no content type')
+@attr(assertion='succeeds')
+def test_object_create_bad_contenttype_none():
+    bucket_name = get_new_bucket()
+    key_name = 'foo'
+    client = get_client()
+    # as long as ContentType isn't specified in put_object it isn't going into the request
+    client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/empty authorization')
+@attr(assertion='fails 403')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the authorization header
+@attr('fails_on_rgw')
+def test_object_create_bad_authorization_empty():
+    e = _add_header_create_bad_object({'Authorization': ''})
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/date and x-amz-date')
+@attr(assertion='succeeds')
+# TODO: remove 'fails_on_rgw' once we have learned how to pass both the 'Date' and 'X-Amz-Date' headers during signing, and not 'X-Amz-Date' before it
+@attr('fails_on_rgw')
+def test_object_create_date_and_amz_date():
+    date = formatdate(usegmt=True)
+    bucket_name, key_name = _add_header_create_object({'Date': date, 'X-Amz-Date': date})
+    client = get_client()
+    client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/x-amz-date and no date')
+@attr(assertion='succeeds')
+# TODO: remove 'fails_on_rgw' once we have learned how to pass both the 'Date' and 'X-Amz-Date' headers during signing, and not 'X-Amz-Date' before it
+@attr('fails_on_rgw')
+def test_object_create_amz_date_and_no_date():
+    date = formatdate(usegmt=True)
+    bucket_name, key_name = _add_header_create_object({'Date': '', 'X-Amz-Date': date})
+    client = get_client()
+    client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+# the teardown is really messed up here. check it out
+@tag('auth_common')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/no authorization')
+@attr(assertion='fails 403')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the authorization header
+@attr('fails_on_rgw')
+def test_object_create_bad_authorization_none():
+    e = _remove_header_create_bad_object('Authorization')
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/no content length')
+@attr(assertion='succeeds')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the content-length header
+@attr('fails_on_rgw')
+def test_bucket_create_contentlength_none():
+    remove = 'Content-Length'
+    _remove_header_create_bucket(remove)
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='acls')
+@attr(operation='set w/no content length')
+@attr(assertion='succeeds')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the content-length header
+@attr('fails_on_rgw')
+def test_object_acl_create_contentlength_none():
+    bucket_name = get_new_bucket()
+    client = get_client()
+    client.put_object(Bucket=bucket_name, Key='foo', Body='bar')
+
+    remove = 'Content-Length'
+    def remove_header(**kwargs):
+        if (remove in kwargs['params']['headers']):
+            del kwargs['params']['headers'][remove]
+
+    client.meta.events.register('before-call.s3.PutObjectAcl', remove_header)
+    client.put_object_acl(Bucket=bucket_name, Key='foo', ACL='public-read')
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='acls')
+@attr(operation='set w/invalid permission')
+@attr(assertion='fails 400')
+def test_bucket_put_bad_canned_acl():
+    bucket_name = get_new_bucket()
+    client = get_client()
+
+    headers = {'x-amz-acl': 'public-ready'}
+    add_headers = (lambda **kwargs: kwargs['params']['headers'].update(headers))
+    client.meta.events.register('before-call.s3.PutBucketAcl', add_headers)
+
+    e = assert_raises(ClientError, client.put_bucket_acl, Bucket=bucket_name, ACL='public-read')
+    status = _get_status(e.response)
+    eq(status, 400)
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/expect 200')
+@attr(assertion='garbage, but S3 succeeds!')
+def test_bucket_create_bad_expect_mismatch():
+    bucket_name = get_new_bucket_name()
+    client = get_client()
+
+    headers = {'Expect': 200}
+    add_headers = (lambda **kwargs: kwargs['params']['headers'].update(headers))
+    client.meta.events.register('before-call.s3.CreateBucket', add_headers)
+    client.create_bucket(Bucket=bucket_name)
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/expect empty')
+@attr(assertion='garbage, but S3 succeeds!')
+def test_bucket_create_bad_expect_empty():
+    headers = {'Expect': ''}
+    _add_header_create_bucket(headers)
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/empty content length')
+@attr(assertion='fails 400')
+# TODO: The request isn't even making it to the RGW past the frontend
+# This test had 'fails_on_rgw' before the move to boto3
+@attr('fails_on_rgw')
+def test_bucket_create_bad_contentlength_empty():
+    headers = {'Content-Length': ''}
+    e = _add_header_create_bad_bucket(headers)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/negative content length')
+@attr(assertion='fails 400')
+@attr('fails_on_mod_proxy_fcgi')
+def test_bucket_create_bad_contentlength_negative():
+    headers = {'Content-Length': '-1'}
+    e = _add_header_create_bad_bucket(headers)
+    status = _get_status(e.response)
+    eq(status, 400)
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/no content length')
+@attr(assertion='succeeds')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the content-length header
+@attr('fails_on_rgw')
+def test_bucket_create_bad_contentlength_none():
+    remove = 'Content-Length'
+    _remove_header_create_bucket(remove)
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/empty authorization')
+@attr(assertion='fails 403')
+# TODO: remove 'fails_on_rgw' once we have learned how to manipulate the authorization header
+@attr('fails_on_rgw')
+def test_bucket_create_bad_authorization_empty():
+    headers = {'Authorization': ''}
+    e = _add_header_create_bad_bucket(headers)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
+
+@tag('auth_common')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/no authorization')
+@attr(assertion='fails 403')
+# TODO: remove 'fails_on_rgw' once we have learned how to manipulate the authorization header
+@attr('fails_on_rgw')
+def test_bucket_create_bad_authorization_none():
+    e = _remove_header_create_bad_bucket('Authorization')
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/invalid MD5')
+@attr(assertion='fails 400')
+def test_object_create_bad_md5_invalid_garbage_aws2():
+    v2_client = get_v2_client()
+    headers = {'Content-MD5': 'AWS HAHAHA'}
+    e = _add_header_create_bad_object(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+    eq(error_code, 'InvalidDigest')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/content length too short')
+@attr(assertion='fails 400')
+# TODO: remove 'fails_on_rgw' once we have learned how to manipulate the Content-Length header
+@attr('fails_on_rgw')
+def test_object_create_bad_contentlength_mismatch_below_aws2():
+    v2_client = get_v2_client()
+    content = 'bar'
+    length = len(content) - 1
+    headers = {'Content-Length': str(length)}
+    e = _add_header_create_bad_object(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+    eq(error_code, 'BadDigest')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/incorrect authorization')
+@attr(assertion='fails 403')
+# TODO: remove 'fails_on_rgw' once we have learned how to manipulate the authorization header
+@attr('fails_on_rgw')
+def test_object_create_bad_authorization_incorrect_aws2():
+    v2_client = get_v2_client()
+    headers = {'Authorization': 'AWS AKIAIGR7ZNNBHC5BKSUB:FWeDfwojDSdS2Ztmpfeubhd9isU='}
+    e = _add_header_create_bad_object(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'InvalidDigest')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/invalid authorization')
+@attr(assertion='fails 400')
+# TODO: remove 'fails_on_rgw' once we have learned how to manipulate the authorization header
+@attr('fails_on_rgw')
+def test_object_create_bad_authorization_invalid_aws2():
+    v2_client = get_v2_client()
+    headers = {'Authorization': 'AWS HAHAHA'}
+    e = _add_header_create_bad_object(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+    eq(error_code, 'InvalidArgument')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/empty user agent')
+@attr(assertion='succeeds')
+def test_object_create_bad_ua_empty_aws2():
+    v2_client = get_v2_client()
+    headers = {'User-Agent': ''}
+    bucket_name, key_name = _add_header_create_object(headers, v2_client)
+    v2_client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/no user agent')
+@attr(assertion='succeeds')
+def test_object_create_bad_ua_none_aws2():
+    v2_client = get_v2_client()
+    remove = 'User-Agent'
+    bucket_name, key_name = _remove_header_create_object(remove, v2_client)
+    v2_client.put_object(Bucket=bucket_name, Key=key_name, Body='bar')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/invalid date')
+@attr(assertion='fails 403')
+def test_object_create_bad_date_invalid_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': 'Bad Date'}
+    e = _add_header_create_bad_object(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/empty date')
+@attr(assertion='fails 403')
+def test_object_create_bad_date_empty_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': ''}
+    e = _add_header_create_bad_object(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/no date')
+@attr(assertion='fails 403')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the date header
+@attr('fails_on_rgw')
+def test_object_create_bad_date_none_aws2():
+    v2_client = get_v2_client()
+    remove = 'x-amz-date'
+    e = _remove_header_create_bad_object(remove, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/date in past')
+@attr(assertion='fails 403')
+def test_object_create_bad_date_before_today_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': 'Tue, 07 Jul 2010 21:53:04 GMT'}
+    e = _add_header_create_bad_object(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'RequestTimeTooSkewed')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/date before epoch')
+@attr(assertion='fails 403')
+def test_object_create_bad_date_before_epoch_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': 'Tue, 07 Jul 1950 21:53:04 GMT'}
+    e = _add_header_create_bad_object(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
+
+@tag('auth_aws2')
+@attr(resource='object')
+@attr(method='put')
+@attr(operation='create w/date after 9999')
+@attr(assertion='fails 403')
+def test_object_create_bad_date_after_end_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': 'Tue, 07 Jul 9999 21:53:04 GMT'}
+    e = _add_header_create_bad_object(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'RequestTimeTooSkewed')
+
+@tag('auth_aws2')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/invalid authorization')
+@attr(assertion='fails 400')
+# TODO: remove 'fails_on_rgw' once we have learned how to manipulate the authorization header
+@attr('fails_on_rgw')
+def test_bucket_create_bad_authorization_invalid_aws2():
+    v2_client = get_v2_client()
+    headers = {'Authorization': 'AWS HAHAHA'}
+    e = _add_header_create_bad_bucket(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 400)
+    eq(error_code, 'InvalidArgument')
+
+@tag('auth_aws2')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/empty user agent')
+@attr(assertion='succeeds')
+def test_bucket_create_bad_ua_empty_aws2():
+    v2_client = get_v2_client()
+    headers = {'User-Agent': ''}
+    _add_header_create_bucket(headers, v2_client)
+
+@tag('auth_aws2')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/no user agent')
+@attr(assertion='succeeds')
+def test_bucket_create_bad_ua_none_aws2():
+    v2_client = get_v2_client()
+    remove = 'User-Agent'
+    _remove_header_create_bucket(remove, v2_client)
+
+@tag('auth_aws2')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/invalid date')
+@attr(assertion='fails 403')
+def test_bucket_create_bad_date_invalid_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': 'Bad Date'}
+    e = _add_header_create_bad_bucket(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
+
+@tag('auth_aws2')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/empty date')
+@attr(assertion='fails 403')
+def test_bucket_create_bad_date_empty_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': ''}
+    e = _add_header_create_bad_bucket(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
+
+@tag('auth_aws2')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/no date')
+@attr(assertion='fails 403')
+# TODO: remove 'fails_on_rgw' once we have learned how to remove the date header
+@attr('fails_on_rgw')
+def test_bucket_create_bad_date_none_aws2():
+    v2_client = get_v2_client()
+    remove = 'x-amz-date'
+    e = _remove_header_create_bad_bucket(remove, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
+
+@tag('auth_aws2')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/date in past')
+@attr(assertion='fails 403')
+def test_bucket_create_bad_date_before_today_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': 'Tue, 07 Jul 2010 21:53:04 GMT'}
+    e = _add_header_create_bad_bucket(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'RequestTimeTooSkewed')
+
+@tag('auth_aws2')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/date in future')
+@attr(assertion='fails 403')
+def test_bucket_create_bad_date_after_today_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': 'Tue, 07 Jul 2030 21:53:04 GMT'}
+    e = _add_header_create_bad_bucket(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'RequestTimeTooSkewed')
+
+@tag('auth_aws2')
+@attr(resource='bucket')
+@attr(method='put')
+@attr(operation='create w/date before epoch')
+@attr(assertion='fails 403')
+def test_bucket_create_bad_date_before_epoch_aws2():
+    v2_client = get_v2_client()
+    headers = {'x-amz-date': 'Tue, 07 Jul 1950 21:53:04 GMT'}
+    e = _add_header_create_bad_bucket(headers, v2_client)
+    status, error_code = _get_status_and_error_code(e.response)
+    eq(status, 403)
+    eq(error_code, 'AccessDenied')
--- a/s3tests_boto3/functional/test_s3.py
+++ b/s3tests_boto3/functional/test_s3.py
--- a/s3tests_boto3/functional/test_utils.py
+++ b/s3tests_boto3/functional/test_utils.py
@ -0,0 +1,11 @@
+from nose.tools import eq_ as eq
+
+import utils
+
+def test_generate():
+    FIVE_MB = 5 * 1024 * 1024
+    eq(len(''.join(utils.generate_random(0))), 0)
+    eq(len(''.join(utils.generate_random(1))), 1)
+    eq(len(''.join(utils.generate_random(FIVE_MB - 1))), FIVE_MB - 1)
+    eq(len(''.join(utils.generate_random(FIVE_MB))), FIVE_MB)
+    eq(len(''.join(utils.generate_random(FIVE_MB + 1))), FIVE_MB + 1)
--- a/s3tests_boto3/functional/utils.py
+++ b/s3tests_boto3/functional/utils.py
@ -0,0 +1,49 @@
+import random
+import requests
+import string
+import time
+
+from nose.tools import eq_ as eq
+
+def assert_raises(excClass, callableObj, *args, **kwargs):
+    """
+    Like unittest.TestCase.assertRaises, but returns the exception.
+    """
+    try:
+        callableObj(*args, **kwargs)
+    except excClass as e:
+        return e
+    else:
+        if hasattr(excClass, '__name__'):
+            excName = excClass.__name__
+        else:
+            excName = str(excClass)
+        raise AssertionError("%s not raised" % excName)
+
+def generate_random(size, part_size=5*1024*1024):
+    """
+    Generate the specified number of characters of random data.
+    (actually each MB is a repetition of the first KB)
+    """
+    chunk = 1024
+    allowed = string.ascii_letters
+    for x in range(0, size, part_size):
+        strpart = ''.join([allowed[random.randint(0, len(allowed) - 1)] for _ in xrange(chunk)])
+        s = ''
+        left = size - x
+        this_part_size = min(left, part_size)
+        for y in range(this_part_size / chunk):
+            s = s + strpart
+        s = s + strpart[:(this_part_size % chunk)]
+        yield s
+        if (x == size):
+            return
+
+def _get_status(response):
+    status = response['ResponseMetadata']['HTTPStatusCode']
+    return status
+
+def _get_status_and_error_code(response):
+    status = response['ResponseMetadata']['HTTPStatusCode']
+    error_code = response['Error']['Code']
+    return status, error_code
--- a/s3tests_boto3/fuzz/__init__.py
+++ b/s3tests_boto3/fuzz/__init__.py
--- a/s3tests_boto3/fuzz/headers.py
+++ b/s3tests_boto3/fuzz/headers.py
@ -0,0 +1,376 @@
+from boto.s3.connection import S3Connection
+from boto.exception import BotoServerError
+from boto.s3.key import Key
+from httplib import BadStatusLine
+from optparse import OptionParser
+from .. import common
+
+import traceback
+import itertools
+import random
+import string
+import struct
+import yaml
+import sys
+import re
+
+
+class DecisionGraphError(Exception):
+    """ Raised when a node in a graph tries to set a header or
+        key that was previously set by another node
+    """
+    def __init__(self, value):
+        self.value = value
+
+    def __str__(self):
+        return repr(self.value)
+
+
+class RecursionError(Exception):
+    """Runaway recursion in string formatting"""
+
+    def __init__(self, msg):
+        self.msg = msg
+
+    def __str__(self):
+        return '{0.__doc__}: {0.msg!r}'.format(self)
+
+
+def assemble_decision(decision_graph, prng):
+    """ Take in a graph describing the possible decision space and a random
+        number generator and traverse the graph to build a decision
+    """
+    return descend_graph(decision_graph, 'start', prng)
+
+
+def descend_graph(decision_graph, node_name, prng):
+    """ Given a graph and a particular node in that graph, set the values in
+        the node's "set" list, pick a choice from the "choice" list, and
+        recurse.  Finally, return a dictionary of values
+    """
+    node = decision_graph[node_name]
+
+    try:
+        choice = make_choice(node['choices'], prng)
+        if choice == '':
+            decision = {}
+        else:
+            decision = descend_graph(decision_graph, choice, prng)
+    except IndexError:
+        decision = {}
+
+    for key, choices in node['set'].iteritems():
+        if key in decision:
+            raise DecisionGraphError("Node %s tried to set '%s', but that key was already set by a lower node!" %(node_name, key))
+        decision[key] = make_choice(choices, prng)
+
+    if 'headers' in node:
+        decision.setdefault('headers', [])
+
+        for desc in node['headers']:
+            try:
+                (repetition_range, header, value) = desc
+            except ValueError:
+                (header, value) = desc
+                repetition_range = '1'
+
+            try:
+                size_min, size_max = repetition_range.split('-', 1)
+            except ValueError:
+                size_min = size_max = repetition_range
+
+            size_min = int(size_min)
+            size_max = int(size_max)
+
+            num_reps = prng.randint(size_min, size_max)
+            if header in [h for h, v in decision['headers']]:
+                raise DecisionGraphError("Node %s tried to add header '%s', but that header already exists!" %(node_name, header))
+            for _ in xrange(num_reps):
+                decision['headers'].append([header, value])
+
+    return decision
+
+
+def make_choice(choices, prng):
+    """ Given a list of (possibly weighted) options, or just a single option,
+        choose one of the options taking weights into account and return the
+        choice
+    """
+    if isinstance(choices, str):
+        return choices
+    weighted_choices = []
+    for option in choices:
+        if option is None:
+            weighted_choices.append('')
+            continue
+        try:
+            (weight, value) = option.split(None, 1)
+            weight = int(weight)
+        except ValueError:
+            weight = 1
+            value = option
+
+        if value == 'null' or value == 'None':
+            value = ''
+
+        for _ in xrange(weight):
+            weighted_choices.append(value)
+
+    return prng.choice(weighted_choices)
+
+
+def expand_headers(decision, prng):
+    expanded_headers = {} 
+    for header in decision['headers']:
+        h = expand(decision, header[0], prng)
+        v = expand(decision, header[1], prng)
+        expanded_headers[h] = v
+    return expanded_headers
+
+
+def expand(decision, value, prng):
+    c = itertools.count()
+    fmt = RepeatExpandingFormatter(prng)
+    new = fmt.vformat(value, [], decision)
+    return new
+
+
+class RepeatExpandingFormatter(string.Formatter):
+    charsets = {
+        'printable_no_whitespace': string.printable.translate(None, string.whitespace),
+        'printable': string.printable,
+        'punctuation': string.punctuation,
+        'whitespace': string.whitespace,
+        'digits': string.digits
+    }
+
+    def __init__(self, prng, _recursion=0):
+        super(RepeatExpandingFormatter, self).__init__()
+        # this class assumes it is always instantiated once per
+        # formatting; use that to detect runaway recursion
+        self.prng = prng
+        self._recursion = _recursion
+
+    def get_value(self, key, args, kwargs):
+        fields = key.split(None, 1)
+        fn = getattr(self, 'special_{name}'.format(name=fields[0]), None)
+        if fn is not None:
+            if len(fields) == 1:
+                fields.append('')
+            return fn(fields[1])
+
+        val = super(RepeatExpandingFormatter, self).get_value(key, args, kwargs)
+        if self._recursion > 5:
+            raise RecursionError(key)
+        fmt = self.__class__(self.prng, _recursion=self._recursion+1)
+
+        n = fmt.vformat(val, args, kwargs)
+        return n
+
+    def special_random(self, args):
+        arg_list = args.split()
+        try:
+            size_min, size_max = arg_list[0].split('-', 1)
+        except ValueError:
+            size_min = size_max = arg_list[0]
+        except IndexError:
+            size_min = '0'
+            size_max = '1000'
+
+        size_min = int(size_min)
+        size_max = int(size_max)
+        length = self.prng.randint(size_min, size_max)
+
+        try:
+            charset_arg = arg_list[1]
+        except IndexError:
+            charset_arg = 'printable'
+
+        if charset_arg == 'binary' or charset_arg == 'binary_no_whitespace':
+            num_bytes = length + 8
+            tmplist = [self.prng.getrandbits(64) for _ in xrange(num_bytes / 8)]
+            tmpstring = struct.pack((num_bytes / 8) * 'Q', *tmplist)
+            if charset_arg == 'binary_no_whitespace':
+                tmpstring = ''.join(c for c in tmpstring if c not in string.whitespace)
+            return tmpstring[0:length]
+        else:
+            charset = self.charsets[charset_arg]
+            return ''.join([self.prng.choice(charset) for _ in xrange(length)]) # Won't scale nicely
+
+
+def parse_options():
+    parser = OptionParser()
+    parser.add_option('-O', '--outfile', help='write output to FILE. Defaults to STDERR', metavar='FILE')
+    parser.add_option('--seed', dest='seed', type='int',  help='initial seed for the random number generator')
+    parser.add_option('--seed-file', dest='seedfile', help='read seeds for specific requests from FILE', metavar='FILE')
+    parser.add_option('-n', dest='num_requests', type='int',  help='issue NUM requests before stopping', metavar='NUM')
+    parser.add_option('-v', '--verbose', dest='verbose', action="store_true",  help='turn on verbose output')
+    parser.add_option('-d', '--debug', dest='debug', action="store_true",  help='turn on debugging (very verbose) output')
+    parser.add_option('--decision-graph', dest='graph_filename',  help='file in which to find the request decision graph')
+    parser.add_option('--no-cleanup', dest='cleanup', action="store_false", help='turn off teardown so you can peruse the state of buckets after testing')
+
+    parser.set_defaults(num_requests=5)
+    parser.set_defaults(cleanup=True)
+    parser.set_defaults(graph_filename='request_decision_graph.yml')
+    return parser.parse_args()
+
+
+def randomlist(seed=None):
+    """ Returns an infinite generator of random numbers
+    """
+    rng = random.Random(seed)
+    while True:
+        yield rng.randint(0,100000) #100,000 seeds is enough, right?
+
+
+def populate_buckets(conn, alt):
+    """ Creates buckets and keys for fuzz testing and sets appropriate
+        permissions. Returns a dictionary of the bucket and key names.
+    """
+    breadable = common.get_new_bucket(alt)
+    bwritable = common.get_new_bucket(alt)
+    bnonreadable = common.get_new_bucket(alt)
+
+    oreadable = Key(breadable)
+    owritable = Key(bwritable)
+    ononreadable = Key(breadable)
+    oreadable.set_contents_from_string('oreadable body')
+    owritable.set_contents_from_string('owritable body')
+    ononreadable.set_contents_from_string('ononreadable body')
+
+    breadable.set_acl('public-read')
+    bwritable.set_acl('public-read-write')
+    bnonreadable.set_acl('private')
+    oreadable.set_acl('public-read')
+    owritable.set_acl('public-read-write')
+    ononreadable.set_acl('private')
+
+    return dict(
+        bucket_readable=breadable.name,
+        bucket_writable=bwritable.name,
+        bucket_not_readable=bnonreadable.name,
+        bucket_not_writable=breadable.name,
+        object_readable=oreadable.key,
+        object_writable=owritable.key,
+        object_not_readable=ononreadable.key,
+        object_not_writable=oreadable.key,
+    )
+
+
+def _main():
+    """ The main script
+    """
+    (options, args) = parse_options()
+    random.seed(options.seed if options.seed else None)
+    s3_connection = common.s3.main
+    alt_connection = common.s3.alt
+
+    if options.outfile:
+        OUT = open(options.outfile, 'w')
+    else:
+        OUT = sys.stderr
+
+    VERBOSE = DEBUG = open('/dev/null', 'w')
+    if options.verbose:
+        VERBOSE = OUT
+    if options.debug:
+        DEBUG = OUT
+        VERBOSE = OUT
+
+    request_seeds = None
+    if options.seedfile:
+        FH = open(options.seedfile, 'r')
+        request_seeds = [int(line) for line in FH if line != '\n']
+        print>>OUT, 'Seedfile: %s' %options.seedfile
+        print>>OUT, 'Number of requests: %d' %len(request_seeds)
+    else:
+        if options.seed:
+            print>>OUT, 'Initial Seed: %d' %options.seed
+        print>>OUT, 'Number of requests: %d' %options.num_requests
+        random_list = randomlist(options.seed)
+        request_seeds = itertools.islice(random_list, options.num_requests)
+
+    print>>OUT, 'Decision Graph: %s' %options.graph_filename
+
+    graph_file = open(options.graph_filename, 'r')
+    decision_graph = yaml.safe_load(graph_file)
+
+    constants = populate_buckets(s3_connection, alt_connection)
+    print>>VERBOSE, "Test Buckets/Objects:"
+    for key, value in constants.iteritems():
+        print>>VERBOSE, "\t%s: %s" %(key, value)
+
+    print>>OUT, "Begin Fuzzing..."
+    print>>VERBOSE, '='*80
+    for request_seed in request_seeds:
+        print>>VERBOSE, 'Seed is: %r' %request_seed
+        prng = random.Random(request_seed)
+        decision = assemble_decision(decision_graph, prng)
+        decision.update(constants)
+
+        method = expand(decision, decision['method'], prng)
+        path = expand(decision, decision['urlpath'], prng)
+
+        try:
+            body = expand(decision, decision['body'], prng)
+        except KeyError:
+            body = ''
+
+        try:
+            headers = expand_headers(decision, prng)
+        except KeyError:
+            headers = {}
+
+        print>>VERBOSE, "%r %r" %(method[:100], path[:100])
+        for h, v in headers.iteritems():
+            print>>VERBOSE, "%r: %r" %(h[:50], v[:50])
+        print>>VERBOSE, "%r\n" % body[:100]
+
+        print>>DEBUG, 'FULL REQUEST'
+        print>>DEBUG, 'Method: %r' %method
+        print>>DEBUG, 'Path: %r' %path
+        print>>DEBUG, 'Headers:'
+        for h, v in headers.iteritems():
+            print>>DEBUG, "\t%r: %r" %(h, v)
+        print>>DEBUG, 'Body: %r\n' %body
+
+        failed = False # Let's be optimistic, shall we?
+        try:
+            response = s3_connection.make_request(method, path, data=body, headers=headers, override_num_retries=1)
+            body = response.read()
+        except BotoServerError, e:
+            response = e
+            body = e.body
+            failed = True
+        except BadStatusLine, e:
+            print>>OUT, 'FAILED: failed to parse response (BadStatusLine); probably a NUL byte in your request?'
+            print>>VERBOSE, '='*80
+            continue
+
+        if failed:
+            print>>OUT, 'FAILED:'
+            OLD_VERBOSE = VERBOSE
+            OLD_DEBUG = DEBUG
+            VERBOSE = DEBUG = OUT
+        print>>VERBOSE, 'Seed was: %r' %request_seed
+        print>>VERBOSE, 'Response status code: %d %s' %(response.status, response.reason)
+        print>>DEBUG, 'Body:\n%s' %body
+        print>>VERBOSE, '='*80
+        if failed:
+            VERBOSE = OLD_VERBOSE
+            DEBUG = OLD_DEBUG
+
+    print>>OUT, '...done fuzzing'
+
+    if options.cleanup:
+        common.teardown()
+
+
+def main():
+    common.setup()
+    try:
+        _main()
+    except Exception as e:
+        traceback.print_exc()
+        common.teardown()
+
--- a/s3tests_boto3/fuzz/test/__init__.py
+++ b/s3tests_boto3/fuzz/test/__init__.py
--- a/s3tests_boto3/fuzz/test/test_fuzzer.py
+++ b/s3tests_boto3/fuzz/test/test_fuzzer.py
@ -0,0 +1,403 @@
+"""
+Unit-test suite for the S3 fuzzer
+
+The fuzzer is a grammar-based random S3 operation generator
+that produces random operation sequences in an effort to
+crash the server.  This unit-test suite does not test
+S3 servers, but rather the fuzzer infrastructure.
+
+It works by running the fuzzer off of a simple grammar,
+and checking the produced requests to ensure that they
+include the expected sorts of operations in the expected
+proportions.
+"""
+import sys
+import itertools
+import nose
+import random
+import string
+import yaml
+
+from ..headers import *
+
+from nose.tools import eq_ as eq
+from nose.tools import assert_true
+from nose.plugins.attrib import attr
+
+from ...functional.utils import assert_raises
+
+_decision_graph = {}
+
+def check_access_denied(fn, *args, **kwargs):
+    e = assert_raises(boto.exception.S3ResponseError, fn, *args, **kwargs)
+    eq(e.status, 403)
+    eq(e.reason, 'Forbidden')
+    eq(e.error_code, 'AccessDenied')
+
+
+def build_graph():
+    graph = {}
+    graph['start'] = {
+        'set': {},
+        'choices': ['node2']
+    }
+    graph['leaf'] = {
+        'set': {
+            'key1': 'value1',
+            'key2': 'value2'
+        },
+        'headers': [
+            ['1-2', 'random-header-{random 5-10 printable}', '{random 20-30 punctuation}']
+        ],
+        'choices': []
+    }
+    graph['node1'] = {
+        'set': {
+            'key3': 'value3',
+            'header_val': [
+                '3 h1',
+                '2 h2',
+                'h3'
+            ]
+        },
+        'headers': [
+            ['1-1', 'my-header', '{header_val}'],
+        ],
+        'choices': ['leaf']
+    }
+    graph['node2'] = {
+        'set': {
+            'randkey': 'value-{random 10-15 printable}',
+            'path': '/{bucket_readable}',
+            'indirect_key1': '{key1}'
+        },
+        'choices': ['leaf']
+    }
+    graph['bad_node'] = {
+        'set': {
+            'key1': 'value1'
+        },
+        'choices': ['leaf']
+    }
+    graph['nonexistant_child_node'] = {
+        'set': {},
+        'choices': ['leafy_greens']
+    }
+    graph['weighted_node'] = {
+        'set': {
+            'k1': [
+                'foo',
+                '2 bar',
+                '1 baz'
+            ]
+        },
+        'choices': [
+            'foo',
+            '2 bar',
+            '1 baz'
+        ]
+    }
+    graph['null_choice_node'] = {
+        'set': {},
+        'choices': [None]
+    }
+    graph['repeated_headers_node'] = {
+        'set': {},
+        'headers': [
+            ['1-2', 'random-header-{random 5-10 printable}', '{random 20-30 punctuation}']
+        ],
+        'choices': ['leaf']
+    }
+    graph['weighted_null_choice_node'] = {
+        'set': {},
+        'choices': ['3 null']
+    }
+    return graph
+
+
+#def test_foo():
+    #graph_file = open('request_decision_graph.yml', 'r')
+    #graph = yaml.safe_load(graph_file)
+    #eq(graph['bucket_put_simple']['set']['grantee'], 0)
+
+
+def test_load_graph():
+    graph_file = open('request_decision_graph.yml', 'r')
+    graph = yaml.safe_load(graph_file)
+    graph['start']
+
+
+def test_descend_leaf_node():
+    graph = build_graph()
+    prng = random.Random(1)
+    decision = descend_graph(graph, 'leaf', prng)
+
+    eq(decision['key1'], 'value1')
+    eq(decision['key2'], 'value2')
+    e = assert_raises(KeyError, lambda x: decision[x], 'key3')
+
+
+def test_descend_node():
+    graph = build_graph()
+    prng = random.Random(1)
+    decision = descend_graph(graph, 'node1', prng)
+
+    eq(decision['key1'], 'value1')
+    eq(decision['key2'], 'value2')
+    eq(decision['key3'], 'value3')
+
+
+def test_descend_bad_node():
+    graph = build_graph()
+    prng = random.Random(1)
+    assert_raises(DecisionGraphError, descend_graph, graph, 'bad_node', prng)
+
+
+def test_descend_nonexistant_child():
+    graph = build_graph()
+    prng = random.Random(1)
+    assert_raises(KeyError, descend_graph, graph, 'nonexistant_child_node', prng)
+
+
+def test_expand_random_printable():
+    prng = random.Random(1)
+    got = expand({}, '{random 10-15 printable}', prng)
+    eq(got, '[/pNI$;92@')
+
+
+def test_expand_random_binary():
+    prng = random.Random(1)
+    got = expand({}, '{random 10-15 binary}', prng)
+    eq(got, '\xdfj\xf1\xd80>a\xcd\xc4\xbb')
+
+
+def test_expand_random_printable_no_whitespace():
+    prng = random.Random(1)
+    for _ in xrange(1000):
+        got = expand({}, '{random 500 printable_no_whitespace}', prng)
+        assert_true(reduce(lambda x, y: x and y, [x not in string.whitespace and x in string.printable for x in got]))
+
+
+def test_expand_random_binary_no_whitespace():
+    prng = random.Random(1)
+    for _ in xrange(1000):
+        got = expand({}, '{random 500 binary_no_whitespace}', prng)
+        assert_true(reduce(lambda x, y: x and y, [x not in string.whitespace for x in got]))
+
+
+def test_expand_random_no_args():
+    prng = random.Random(1)
+    for _ in xrange(1000):
+        got = expand({}, '{random}', prng)
+        assert_true(0 <= len(got) <= 1000)
+        assert_true(reduce(lambda x, y: x and y, [x in string.printable for x in got]))
+
+
+def test_expand_random_no_charset():
+    prng = random.Random(1)
+    for _ in xrange(1000):
+        got = expand({}, '{random 10-30}', prng)
+        assert_true(10 <= len(got) <= 30)
+        assert_true(reduce(lambda x, y: x and y, [x in string.printable for x in got]))
+
+
+def test_expand_random_exact_length():
+    prng = random.Random(1)
+    for _ in xrange(1000):
+        got = expand({}, '{random 10 digits}', prng)
+        assert_true(len(got) == 10)
+        assert_true(reduce(lambda x, y: x and y, [x in string.digits for x in got]))
+
+
+def test_expand_random_bad_charset():
+    prng = random.Random(1)
+    assert_raises(KeyError, expand, {}, '{random 10-30 foo}', prng)
+
+
+def test_expand_random_missing_length():
+    prng = random.Random(1)
+    assert_raises(ValueError, expand, {}, '{random printable}', prng)
+
+
+def test_assemble_decision():
+    graph = build_graph()
+    prng = random.Random(1)
+    decision = assemble_decision(graph, prng)
+
+    eq(decision['key1'], 'value1')
+    eq(decision['key2'], 'value2')
+    eq(decision['randkey'], 'value-{random 10-15 printable}')
+    eq(decision['indirect_key1'], '{key1}')
+    eq(decision['path'], '/{bucket_readable}')
+    assert_raises(KeyError, lambda x: decision[x], 'key3')
+
+
+def test_expand_escape():
+    prng = random.Random(1)
+    decision = dict(
+        foo='{{bar}}',
+        )
+    got = expand(decision, '{foo}', prng)
+    eq(got, '{bar}')
+
+
+def test_expand_indirect():
+    prng = random.Random(1)
+    decision = dict(
+        foo='{bar}',
+        bar='quux',
+        )
+    got = expand(decision, '{foo}', prng)
+    eq(got, 'quux')
+
+
+def test_expand_indirect_double():
+    prng = random.Random(1)
+    decision = dict(
+        foo='{bar}',
+        bar='{quux}',
+        quux='thud',
+        )
+    got = expand(decision, '{foo}', prng)
+    eq(got, 'thud')
+
+
+def test_expand_recursive():
+    prng = random.Random(1)
+    decision = dict(
+        foo='{foo}',
+        )
+    e = assert_raises(RecursionError, expand, decision, '{foo}', prng)
+    eq(str(e), "Runaway recursion in string formatting: 'foo'")
+
+
+def test_expand_recursive_mutual():
+    prng = random.Random(1)
+    decision = dict(
+        foo='{bar}',
+        bar='{foo}',
+        )
+    e = assert_raises(RecursionError, expand, decision, '{foo}', prng)
+    eq(str(e), "Runaway recursion in string formatting: 'foo'")
+
+
+def test_expand_recursive_not_too_eager():
+    prng = random.Random(1)
+    decision = dict(
+        foo='bar',
+        )
+    got = expand(decision, 100*'{foo}', prng)
+    eq(got, 100*'bar')
+
+
+def test_make_choice_unweighted_with_space():
+    prng = random.Random(1)
+    choice = make_choice(['foo bar'], prng)
+    eq(choice, 'foo bar')
+
+def test_weighted_choices():
+    graph = build_graph()
+    prng = random.Random(1)
+
+    choices_made = {}
+    for _ in xrange(1000):
+        choice = make_choice(graph['weighted_node']['choices'], prng)
+        if choices_made.has_key(choice):
+            choices_made[choice] += 1
+        else:
+            choices_made[choice] = 1
+
+    foo_percentage = choices_made['foo'] / 1000.0
+    bar_percentage = choices_made['bar'] / 1000.0
+    baz_percentage = choices_made['baz'] / 1000.0
+    nose.tools.assert_almost_equal(foo_percentage, 0.25, 1)
+    nose.tools.assert_almost_equal(bar_percentage, 0.50, 1)
+    nose.tools.assert_almost_equal(baz_percentage, 0.25, 1)
+
+
+def test_null_choices():
+    graph = build_graph()
+    prng = random.Random(1)
+    choice = make_choice(graph['null_choice_node']['choices'], prng)
+
+    eq(choice, '')
+
+
+def test_weighted_null_choices():
+    graph = build_graph()
+    prng = random.Random(1)
+    choice = make_choice(graph['weighted_null_choice_node']['choices'], prng)
+
+    eq(choice, '')
+
+
+def test_null_child():
+    graph = build_graph()
+    prng = random.Random(1)
+    decision = descend_graph(graph, 'null_choice_node', prng)
+
+    eq(decision, {})
+
+
+def test_weighted_set():
+    graph = build_graph()
+    prng = random.Random(1)
+
+    choices_made = {}
+    for _ in xrange(1000):
+        choice = make_choice(graph['weighted_node']['set']['k1'], prng)
+        if choices_made.has_key(choice):
+            choices_made[choice] += 1
+        else:
+            choices_made[choice] = 1
+
+    foo_percentage = choices_made['foo'] / 1000.0
+    bar_percentage = choices_made['bar'] / 1000.0
+    baz_percentage = choices_made['baz'] / 1000.0
+    nose.tools.assert_almost_equal(foo_percentage, 0.25, 1)
+    nose.tools.assert_almost_equal(bar_percentage, 0.50, 1)
+    nose.tools.assert_almost_equal(baz_percentage, 0.25, 1)
+
+
+def test_header_presence():
+    graph = build_graph()
+    prng = random.Random(1)
+    decision = descend_graph(graph, 'node1', prng)
+
+    c1 = itertools.count()
+    c2 = itertools.count()
+    for header, value in decision['headers']:
+        if header == 'my-header':
+            eq(value, '{header_val}')
+            assert_true(next(c1) < 1)
+        elif header == 'random-header-{random 5-10 printable}':
+            eq(value, '{random 20-30 punctuation}')
+            assert_true(next(c2) < 2)
+        else:
+            raise KeyError('unexpected header found: %s' % header)
+
+    assert_true(next(c1))
+    assert_true(next(c2))
+
+
+def test_duplicate_header():
+    graph = build_graph()
+    prng = random.Random(1)
+    assert_raises(DecisionGraphError, descend_graph, graph, 'repeated_headers_node', prng)
+
+
+def test_expand_headers():
+    graph = build_graph()
+    prng = random.Random(1)
+    decision = descend_graph(graph, 'node1', prng)
+    expanded_headers = expand_headers(decision, prng)
+
+    for header, value in expanded_headers.iteritems():
+        if header == 'my-header':
+            assert_true(value in ['h1', 'h2', 'h3'])
+        elif header.startswith('random-header-'):
+            assert_true(20 <= len(value) <= 30)
+            assert_true(string.strip(value, RepeatExpandingFormatter.charsets['punctuation']) == '')
+        else:
+            raise DecisionGraphError('unexpected header found: "%s"' % header)
+
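The tallying loops in the weighted-choice tests above can be written more compactly with `collections.Counter`. The following is a minimal sketch in Python 3 (the code in this diff targets Python 2). It assumes a hypothetical `pick` helper standing in for `make_choice`, whose implementation is not shown here, and it expresses weights by repeating list entries, which may not match how the real decision graph encodes them.

```python
import collections
import random


def pick(choices, prng):
    # Hypothetical stand-in for make_choice: weights are expressed by
    # repeating an entry, so 'bar' is twice as likely as 'foo' or 'baz'.
    return prng.choice(choices)


def tally(choices, prng, rounds=1000):
    # Counter replaces the "if key present: += 1, else: = 1" bookkeeping.
    counts = collections.Counter(pick(choices, prng) for _ in range(rounds))
    # Convert raw counts to fractions of the total number of rounds.
    return {name: count / float(rounds) for name, count in counts.items()}


fractions = tally(['foo', 'bar', 'bar', 'baz'], random.Random(1))
```

With a seeded `random.Random`, the fractions are reproducible from run to run, which is what lets the tests compare them against fixed expectations.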
--- a/s3tests_boto3/generate_objects.py
+++ b/s3tests_boto3/generate_objects.py
@ -0,0 +1,117 @@
+from boto.s3.key import Key
+from optparse import OptionParser
+from . import realistic
+import traceback
+import random
+from . import common
+import sys
+
+
+def parse_opts():
+    parser = OptionParser()
+    parser.add_option('-O', '--outfile', help='write output to FILE. Defaults to STDOUT', metavar='FILE')
+    parser.add_option('-b', '--bucket', dest='bucket', help='push objects to BUCKET', metavar='BUCKET')
+    parser.add_option('--seed', dest='seed', help='optional seed for the random number generator')
+
+    return parser.parse_args()
+
+
+def get_random_files(quantity, mean, stddev, seed):
+    """Create file-like objects with pseudorandom contents.
+       IN:
+           number of files to create
+           mean file size in bytes
+           standard deviation from mean file size
+           seed for PRNG
+       OUT:
+           list of file handles
+    """
+    file_generator = realistic.files(mean, stddev, seed)
+    return [next(file_generator) for _ in xrange(quantity)]
+
+
+def upload_objects(bucket, files, seed):
+    """Upload a bunch of files to an S3 bucket
+       IN:
+         boto S3 bucket object
+         list of file handles to upload
+         seed for PRNG
+       OUT:
+         list of boto S3 key objects
+    """
+    keys = []
+    name_generator = realistic.names(15, 4, seed=seed)
+
+    for fp in files:
+        print >> sys.stderr, 'sending file with size %dB' % fp.size
+        key = Key(bucket)
+        key.key = next(name_generator)
+        key.set_contents_from_file(fp, rewind=True)
+        key.set_acl('public-read')
+        keys.append(key)
+
+    return keys
+
+
+def _main():
+    '''To run the static content load test, make sure you've bootstrapped your
+       test environment and set up your config.yaml file, then run the following:
+          S3TEST_CONF=config.yaml virtualenv/bin/s3tests-generate-objects.py --seed 1234
+
+        This creates a bucket with your S3 credentials (from config.yaml) and
+        fills it with garbage objects as described in the
+        file_generation.groups section of config.yaml.  It writes a list of
+        URLs to those objects to the file listed in file_generation.url_file
+        in config.yaml.
+
+        Once you have objects in your bucket, run the siege benchmarking program:
+            siege --rc ./siege.conf -r 5
+
+        This tells siege to read the ./siege.conf config file which tells it to
+        use the urls in ./urls.txt and log to ./siege.log. It hits each url in
+        urls.txt 5 times (-r flag).
+
+        Results are printed to the terminal and written in CSV format to
+        ./siege.log
+    '''
+    (options, args) = parse_opts()
+
+    #SETUP
+    random.seed(options.seed if options.seed else None)
+    conn = common.s3.main
+
+    if options.outfile:
+        OUTFILE = open(options.outfile, 'w')
+    elif common.config.file_generation.url_file:
+        OUTFILE = open(common.config.file_generation.url_file, 'w')
+    else:
+        OUTFILE = sys.stdout
+
+    if options.bucket:
+        bucket = conn.create_bucket(options.bucket)
+    else:
+        bucket = common.get_new_bucket()
+
+    bucket.set_acl('public-read')
+    keys = []
+    print >> OUTFILE, 'bucket: %s' % bucket.name
+    print >> sys.stderr, 'setup complete, generating files'
+    for profile in common.config.file_generation.groups:
+        seed = random.random()
+        files = get_random_files(profile[0], profile[1], profile[2], seed)
+        keys += upload_objects(bucket, files, seed)
+
+    print >> sys.stderr, 'finished sending files. generating urls'
+    for key in keys:
+        print >> OUTFILE, key.generate_url(0, query_auth=False)
+
+    print >> sys.stderr, 'done'
+
+
+def main():
+    common.setup()
+    try:
+        _main()
+    except Exception as e:
+        traceback.print_exc()
+        common.teardown()
--- a/s3tests_boto3/readwrite.py
+++ b/s3tests_boto3/readwrite.py
@ -0,0 +1,265 @@
+import gevent
+import gevent.pool
+import gevent.queue
+import gevent.monkey; gevent.monkey.patch_all()
+import itertools
+import optparse
+import os
+import sys
+import time
+import traceback
+import random
+import yaml
+
+import realistic
+import common
+
+NANOSECOND = int(1e9)
+
+def reader(bucket, worker_id, file_names, queue, rand):
+    while True:
+        objname = rand.choice(file_names)
+        key = bucket.new_key(objname)
+
+        fp = realistic.FileValidator()
+        result = dict(
+                type='r',
+                bucket=bucket.name,
+                key=key.name,
+                worker=worker_id,
+                )
+
+        start = time.time()
+        try:
+            key.get_contents_to_file(fp._file)
+        except gevent.GreenletExit:
+            raise
+        except Exception as e:
+            # stop timer ASAP, even on errors
+            end = time.time()
+            result.update(
+                error=dict(
+                    msg=str(e),
+                    traceback=traceback.format_exc(),
+                    ),
+                )
+            # certain kinds of programmer errors make this a busy
+            # loop; let parent greenlet get some time too
+            time.sleep(0)
+        else:
+            end = time.time()
+
+            if not fp.valid():
+                m='sha1 check failed start={s} ({se}) end={e} size={sz} obj={o}'.format(s=time.ctime(start), se=start, e=end, sz=fp._file.tell(), o=objname)
+                result.update(
+                    error=dict(
+                        msg=m,
+                        traceback=''.join(traceback.format_stack()),  # no exception is active here
+                        ),
+                    )
+                print "ERROR:", m
+            else:
+                elapsed = end - start
+                result.update(
+                    start=start,
+                    duration=int(round(elapsed * NANOSECOND)),
+                    )
+        queue.put(result)
+
+def writer(bucket, worker_id, file_names, files, queue, rand):
+    while True:
+        fp = next(files)
+        fp.seek(0)
+        objname = rand.choice(file_names)
+        key = bucket.new_key(objname)
+
+        result = dict(
+            type='w',
+            bucket=bucket.name,
+            key=key.name,
+            worker=worker_id,
+            )
+
+        start = time.time()
+        try:
+            key.set_contents_from_file(fp)
+        except gevent.GreenletExit:
+            raise
+        except Exception as e:
+            # stop timer ASAP, even on errors
+            end = time.time()
+            result.update(
+                error=dict(
+                    msg=str(e),
+                    traceback=traceback.format_exc(),
+                    ),
+                )
+            # certain kinds of programmer errors make this a busy
+            # loop; let parent greenlet get some time too
+            time.sleep(0)
+        else:
+            end = time.time()
+
+            elapsed = end - start
+            result.update(
+                start=start,
+                duration=int(round(elapsed * NANOSECOND)),
+                )
+
+        queue.put(result)
+
+def parse_options():
+    parser = optparse.OptionParser(
+        usage='%prog [OPTS] <CONFIG_YAML',
+        )
+    parser.add_option("--no-cleanup", dest="cleanup", action="store_false",
+        help="skip cleaning up all created buckets", default=True)
+
+    return parser.parse_args()
+
+def write_file(bucket, file_name, fp):
+    """
+    Write a single file to the bucket using the file_name.
+    This is used during the warmup to initialize the files.
+    """
+    key = bucket.new_key(file_name)
+    key.set_contents_from_file(fp)
+
+def main():
+    # parse options
+    (options, args) = parse_options()
+
+    if os.isatty(sys.stdin.fileno()):
+        raise RuntimeError('Need configuration in stdin.')
+    config = common.read_config(sys.stdin)
+    conn = common.connect(config.s3)
+    bucket = None
+
+    try:
+        # setup
+        real_stdout = sys.stdout
+        sys.stdout = sys.stderr
+
+        # verify all required config items are present
+        if 'readwrite' not in config:
+            raise RuntimeError('readwrite section not found in config')
+        for item in ['readers', 'writers', 'duration', 'files', 'bucket']:
+            if item not in config.readwrite:
+                raise RuntimeError("Missing readwrite config item: {item}".format(item=item))
+        for item in ['num', 'size', 'stddev']:
+            if item not in config.readwrite.files:
+                raise RuntimeError("Missing readwrite config item: files.{item}".format(item=item))
+
+        seeds = dict(config.readwrite.get('random_seed', {}))
+        seeds.setdefault('main', random.randrange(2**32))
+
+        rand = random.Random(seeds['main'])
+
+        for name in ['names', 'contents', 'writer', 'reader']:
+            seeds.setdefault(name, rand.randrange(2**32))
+
+        print 'Using random seeds: {seeds}'.format(seeds=seeds)
+
+        # setup bucket and other objects
+        bucket_name = common.choose_bucket_prefix(config.readwrite.bucket, max_len=30)
+        bucket = conn.create_bucket(bucket_name)
+        print "Created bucket: {name}".format(name=bucket.name)
+
+        # check flag for deterministic file name creation
+        if not config.readwrite.get('deterministic_file_names'):
+            print 'Creating random file names'
+            file_names = realistic.names(
+                mean=15,
+                stddev=4,
+                seed=seeds['names'],
+                )
+            file_names = itertools.islice(file_names, config.readwrite.files.num)
+            file_names = list(file_names)
+        else:
+            print 'Creating file names that are deterministic'
+            file_names = []
+            for x in xrange(config.readwrite.files.num):
+                file_names.append('test_file_{num}'.format(num=x))
+
+        files = realistic.files2(
+            mean=1024 * config.readwrite.files.size,
+            stddev=1024 * config.readwrite.files.stddev,
+            seed=seeds['contents'],
+            )
+        q = gevent.queue.Queue()
+
+
+        # warmup - get initial set of files uploaded if there are any writers specified
+        if config.readwrite.writers > 0:
+            print "Uploading initial set of {num} files".format(num=config.readwrite.files.num)
+            warmup_pool = gevent.pool.Pool(size=100)
+            for file_name in file_names:
+                fp = next(files)
+                warmup_pool.spawn(
+                    write_file,
+                    bucket=bucket,
+                    file_name=file_name,
+                    fp=fp,
+                    )
+            warmup_pool.join()
+
+        # main work
+        print "Starting main worker loop."
+        print "Using file size: {size} +- {stddev}".format(size=config.readwrite.files.size, stddev=config.readwrite.files.stddev)
+        print "Spawning {w} writers and {r} readers...".format(w=config.readwrite.writers, r=config.readwrite.readers)
+        group = gevent.pool.Group()
+        rand_writer = random.Random(seeds['writer'])
+
+        # Don't create random files if deterministic_file_names is set and true
+        if not config.readwrite.get('deterministic_file_names'):
+            for x in xrange(config.readwrite.writers):
+                this_rand = random.Random(rand_writer.randrange(2**32))
+                group.spawn(
+                    writer,
+                    bucket=bucket,
+                    worker_id=x,
+                    file_names=file_names,
+                    files=files,
+                    queue=q,
+                    rand=this_rand,
+                    )
+
+        # Since the loop generating readers already uses config.readwrite.readers
+        # and the file names are already generated (randomly or deterministically),
+        # this loop needs no additional qualifiers. If zero readers are specified,
+        # it will behave as expected (no data is read)
+        rand_reader = random.Random(seeds['reader'])
+        for x in xrange(config.readwrite.readers):
+            this_rand = random.Random(rand_reader.randrange(2**32))
+            group.spawn(
+                reader,
+                bucket=bucket,
+                worker_id=x,
+                file_names=file_names,
+                queue=q,
+                rand=this_rand,
+                )
+        def stop():
+            group.kill(block=True)
+            q.put(StopIteration)
+        gevent.spawn_later(config.readwrite.duration, stop)
+
+        # wait for all the tests to finish
+        group.join()
+        print 'post-join, queue size {size}'.format(size=q.qsize())
+
+        if q.qsize() > 0:
+            for temp_dict in q:
+                if 'error' in temp_dict:
+                    raise Exception('exception:\n\t{msg}\n\t{trace}'.format(
+                                    msg=temp_dict['error']['msg'],
+                                    trace=temp_dict['error']['traceback'])
+                                   )
+                else:
+                    yaml.safe_dump(temp_dict, stream=real_stdout)
+
+    finally:
+        # cleanup
+        if options.cleanup:
+            if bucket is not None:
+                common.nuke_bucket(bucket)
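The seed handling in `main()` above derives one seed per component from a single `main` seed, so a whole run can be reproduced from one number. A minimal sketch of that pattern in Python 3 follows. `derive_seeds` is a hypothetical helper name and the plain dict stands in for the `random_seed` config section.

```python
import random


def derive_seeds(configured, names=('names', 'contents', 'writer', 'reader')):
    # Start from whatever the config pinned; fill in the rest.
    seeds = dict(configured)
    seeds.setdefault('main', random.randrange(2**32))
    rand = random.Random(seeds['main'])
    for name in names:
        # The argument is evaluated on every iteration, even for pinned
        # names, so pinning one seed does not shift the others.
        seeds.setdefault(name, rand.randrange(2**32))
    return seeds


first = derive_seeds({'main': 1234})
second = derive_seeds({'main': 1234})
pinned = derive_seeds({'main': 1234, 'reader': 7})
```

Printing the resulting dict, as the script does, is what makes a failing run replayable: the logged seeds can be pasted back into the config.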
--- a/s3tests_boto3/realistic.py
+++ b/s3tests_boto3/realistic.py
@ -0,0 +1,281 @@
+import hashlib
+import random
+import string
+import struct
+import time
+import math
+import tempfile
+import shutil
+import os
+
+
+NANOSECOND = int(1e9)
+
+
+def generate_file_contents(size):
+    """
+    A helper function that generates binary contents of a given size,
+    computes the SHA-1 hash of those contents, and appends the hash to the
+    end of the blob.
+    The hash is stored as sha1's hexdigest, which is 40 chars long. So any
+    reader should split the last 40 chars off the blob to retrieve the
+    original hash and binary so that validity can be proved.
+    """
+    size = int(size)
+    contents = os.urandom(size)
+    content_hash = hashlib.sha1(contents).hexdigest()
+    return contents + content_hash
+
+
+class FileValidator(object):
+
+    def __init__(self, f=None):
+        self._file = tempfile.SpooledTemporaryFile()
+        self.original_hash = None
+        self.new_hash = None
+        if f:
+            f.seek(0)
+            shutil.copyfileobj(f, self._file)
+
+    def valid(self):
+        """
+        Returns True if this file looks valid. The file is valid if the last
+        40 chars of the file are the sha1 hexdigest of the rest of the file.
+        """
+        self._file.seek(0)
+        contents = self._file.read()
+        self.original_hash, binary = contents[-40:], contents[:-40]
+        self.new_hash = hashlib.sha1(binary).hexdigest()
+        if not self.new_hash == self.original_hash:
+            print 'original hash: ', self.original_hash
+            print 'new hash: ', self.new_hash
+            print 'size: ', self._file.tell()
+            return False
+        return True
+
+    # XXX not sure if we need all of these
+    def seek(self, offset, whence=os.SEEK_SET):
+        self._file.seek(offset, whence)
+
+    def tell(self):
+        return self._file.tell()
+
+    def read(self, size=-1):
+        return self._file.read(size)
+
+    def write(self, data):
+        self._file.write(data)
+        self._file.seek(0)
+
+
+class RandomContentFile(object):
+    def __init__(self, size, seed):
+        self.size = size
+        self.seed = seed
+        self.random = random.Random(self.seed)
+
+        # Boto likes to seek once more after it's done reading, so we need to save the last chunks/seek value.
+        self.last_chunks = self.chunks = None
+        self.last_seek = None
+
+        # Let seek initialize the rest of it, rather than dup code
+        self.seek(0)
+
+    def _mark_chunk(self):
+        self.chunks.append([self.offset, int(round((time.time() - self.last_seek) * NANOSECOND))])
+
+    def seek(self, offset, whence=os.SEEK_SET):
+        if whence == os.SEEK_SET:
+            self.offset = offset
+        elif whence == os.SEEK_END:
+            self.offset = self.size + offset
+        elif whence == os.SEEK_CUR:
+            self.offset += offset
+
+        assert self.offset == 0
+
+        self.random.seed(self.seed)
+        self.buffer = ''
+
+        self.hash = hashlib.md5()
+        self.digest_size = self.hash.digest_size
+        self.digest = None
+
+        # Save the last chunks before emptying them, and record the
+        # seek time as our start time.
+        self.last_chunks = self.chunks
+        self.last_seek = time.time()
+        self.chunks = []
+
+    def tell(self):
+        return self.offset
+
+    def _generate(self):
+        # generate and return a chunk of pseudorandom data
+        size = min(self.size, 1*1024*1024) # generate at most 1 MB at a time
+        chunks = int(math.ceil(size/8.0))  # number of 8-byte chunks to create
+
+        l = [self.random.getrandbits(64) for _ in xrange(chunks)]
+        s = struct.pack(chunks*'Q', *l)
+        return s
+
+    def read(self, size=-1):
+        if size < 0:
+            size = self.size - self.offset
+
+        r = []
+
+        random_count = min(size, self.size - self.offset - self.digest_size)
+        if random_count > 0:
+            while len(self.buffer) < random_count:
+                self.buffer += self._generate()
+            self.offset += random_count
+            size -= random_count
+            data, self.buffer = self.buffer[:random_count], self.buffer[random_count:]
+            if self.hash is not None:
+                self.hash.update(data)
+            r.append(data)
+
+        digest_count = min(size, self.size - self.offset)
+        if digest_count > 0:
+            if self.digest is None:
+                self.digest = self.hash.digest()
+                self.hash = None
+            self.offset += digest_count
+            size -= digest_count
+            data = self.digest[:digest_count]
+            r.append(data)
+
+        self._mark_chunk()
+
+        return ''.join(r)
+
+
+class PrecomputedContentFile(object):
+    def __init__(self, f):
+        self._file = tempfile.SpooledTemporaryFile()
+        f.seek(0)
+        shutil.copyfileobj(f, self._file)
+
+        self.last_chunks = self.chunks = None
+        self.seek(0)
+
+    def seek(self, offset, whence=os.SEEK_SET):
+        self._file.seek(offset, whence)
+
+        if self.tell() == 0:
+            # only reset the chunks when seeking to the beginning
+            self.last_chunks = self.chunks
+            self.last_seek = time.time()
+            self.chunks = []
+
+    def tell(self):
+        return self._file.tell()
+
+    def read(self, size=-1):
+        data = self._file.read(size)
+        self._mark_chunk()
+        return data
+
+    def _mark_chunk(self):
+        elapsed = time.time() - self.last_seek
+        elapsed_nsec = int(round(elapsed * NANOSECOND))
+        self.chunks.append([self.tell(), elapsed_nsec])
+
+class FileVerifier(object):
+    def __init__(self):
+        self.size = 0
+        self.hash = hashlib.md5()
+        self.buf = ''
+        self.created_at = time.time()
+        self.chunks = []
+
+    def _mark_chunk(self):
+        self.chunks.append([self.size, int(round((time.time() - self.created_at) * NANOSECOND))])
+
+    def write(self, data):
+        self.size += len(data)
+        self.buf += data
+        digsz = -1*self.hash.digest_size
+        new_data, self.buf = self.buf[0:digsz], self.buf[digsz:]
+        self.hash.update(new_data)
+        self._mark_chunk()
+
+    def valid(self):
+        """
+        Returns True if this file looks valid. The file is valid if the end
+        of the file has the md5 digest for the first part of the file.
+        """
+        if self.size < self.hash.digest_size:
+            return self.hash.digest().startswith(self.buf)
+
+        return self.buf == self.hash.digest()
+
+
+def files(mean, stddev, seed=None):
+    """
+    Yields file-like objects with effectively random contents, where
+    the size of each file follows the normal distribution with `mean`
+    and `stddev`.
+
+    Beware, the file-likeness is very shallow. You can use boto's
+    `key.set_contents_from_file` to send these to S3, but they are not
+    full file objects.
+
+    The last 128 bits are the MD5 digest of the previous bytes, for
+    verifying round-trip data integrity. For example, if you
+    re-download the object and place the contents into a file called
+    ``foo``, the following should print two identical lines:
+
+      python -c 'import sys, hashlib; data=sys.stdin.read(); print hashlib.md5(data[:-16]).hexdigest(); print "".join("%02x" % ord(c) for c in data[-16:])' <foo
+
+    Except for objects shorter than 16 bytes, where the second line
+    will be proportionally shorter.
+    """
+    rand = random.Random(seed)
+    while True:
+        while True:
+            size = int(rand.normalvariate(mean, stddev))
+            if size >= 0:
+                break
+        yield RandomContentFile(size=size, seed=rand.getrandbits(32))
+
+
+def files2(mean, stddev, seed=None, numfiles=10):
+    """
+    Yields file objects with effectively random contents, where the
+    size of each file follows the normal distribution with `mean` and
+    `stddev`.
+
+    Rather than continuously generating new files, this pre-computes and
+    stores `numfiles` files and yields them in a loop.
+    """
+    rand = random.Random(seed)  # sizes follow `seed`; contents come from os.urandom
+    fs = []  # pre-computed files, saved as temporary files
+    for _ in xrange(numfiles):
+        t = tempfile.SpooledTemporaryFile()
+        t.write(generate_file_contents(rand.normalvariate(mean, stddev)))
+        t.seek(0)
+        fs.append(t)
+
+    while True:
+        for f in fs:
+            yield f
+
+
+def names(mean, stddev, charset=None, seed=None):
+    """
+    Yields strings that are somewhat plausible as file names, where
+    the length of each filename follows the normal distribution with
+    `mean` and `stddev`.
+    """
+    if charset is None:
+        charset = string.ascii_lowercase
+    rand = random.Random(seed)
+    while True:
+        while True:
+            length = int(rand.normalvariate(mean, stddev))
+            if length > 0:
+                break
+        name = ''.join(rand.choice(charset) for _ in xrange(length))
+        yield name
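The trailer scheme used by `generate_file_contents` and `FileValidator` above (payload followed by a 40-character SHA-1 hex digest) can be exercised on its own. This is a minimal sketch in Python 3, where the blob is `bytes` and the hex digest has to be encoded explicitly. `make_blob` and `blob_is_valid` are hypothetical names, not functions from this module.

```python
import hashlib
import os

DIGEST_LEN = 40  # length of a SHA-1 hex digest


def make_blob(size):
    # Random payload followed by the hex digest of that payload.
    payload = os.urandom(int(size))
    return payload + hashlib.sha1(payload).hexdigest().encode('ascii')


def blob_is_valid(blob):
    # Split off the trailer and recompute the digest over the rest.
    payload, trailer = blob[:-DIGEST_LEN], blob[-DIGEST_LEN:]
    return hashlib.sha1(payload).hexdigest().encode('ascii') == trailer


blob = make_blob(1024)
# Flip every bit of the first payload byte to simulate corruption.
corrupted = bytes([blob[0] ^ 0xFF]) + blob[1:]
```

Because the digest is stored as hex text, the stored object is always 40 bytes longer than the requested payload size.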
--- a/s3tests_boto3/roundtrip.py
+++ b/s3tests_boto3/roundtrip.py
@ -0,0 +1,219 @@
+import gevent
+import gevent.pool
+import gevent.queue
+import gevent.monkey; gevent.monkey.patch_all()
+import itertools
+import optparse
+import os
+import sys
+import time
+import traceback
+import random
+import yaml
+
+import realistic
+import common
+
+NANOSECOND = int(1e9)
+
+def writer(bucket, objname, fp, queue):
+    key = bucket.new_key(objname)
+
+    result = dict(
+        type='w',
+        bucket=bucket.name,
+        key=key.name,
+        )
+
+    start = time.time()
+    try:
+        key.set_contents_from_file(fp, rewind=True)
+    except gevent.GreenletExit:
+        raise
+    except Exception as e:
+        # stop timer ASAP, even on errors
+        end = time.time()
+        result.update(
+            error=dict(
+                msg=str(e),
+                traceback=traceback.format_exc(),
+                ),
+            )
+        # certain kinds of programmer errors make this a busy
+        # loop; let parent greenlet get some time too
+        time.sleep(0)
+    else:
+        end = time.time()
+
+    elapsed = end - start
+    result.update(
+        start=start,
+        duration=int(round(elapsed * NANOSECOND)),
+        chunks=fp.last_chunks,
+        )
+    queue.put(result)
+
+
+def reader(bucket, objname, queue):
+    key = bucket.new_key(objname)
+
+    fp = realistic.FileVerifier()
+    result = dict(
+            type='r',
+            bucket=bucket.name,
+            key=key.name,
+            )
+
+    start = time.time()
+    try:
+        key.get_contents_to_file(fp)
+    except gevent.GreenletExit:
+        raise
+    except Exception as e:
+        # stop timer ASAP, even on errors
+        end = time.time()
+        result.update(
+            error=dict(
+                msg=str(e),
+                traceback=traceback.format_exc(),
+                ),
+            )
+        # certain kinds of programmer errors make this a busy
+        # loop; let parent greenlet get some time too
+        time.sleep(0)
+    else:
+        end = time.time()
+
+        if not fp.valid():
+            result.update(
+                error=dict(
+                    msg='md5sum check failed',
+                    ),
+                )
+
+    elapsed = end - start
+    result.update(
+        start=start,
+        duration=int(round(elapsed * NANOSECOND)),
+        chunks=fp.chunks,
+        )
+    queue.put(result)
+
+def parse_options():
+    parser = optparse.OptionParser(
+        usage='%prog [OPTS] <CONFIG_YAML',
+        )
+    parser.add_option("--no-cleanup", dest="cleanup", action="store_false",
+        help="skip cleaning up all created buckets", default=True)
+
+    return parser.parse_args()
+
+def main():
+    # parse options
+    (options, args) = parse_options()
+
+    if os.isatty(sys.stdin.fileno()):
+        raise RuntimeError('Need configuration in stdin.')
+    config = common.read_config(sys.stdin)
+    conn = common.connect(config.s3)
+    bucket = None
+
+    try:
+        # setup
+        real_stdout = sys.stdout
+        sys.stdout = sys.stderr
+
+        # verify all required config items are present
+        if 'roundtrip' not in config:
+            raise RuntimeError('roundtrip section not found in config')
+        for item in ['readers', 'writers', 'duration', 'files', 'bucket']:
+            if item not in config.roundtrip:
+                raise RuntimeError("Missing roundtrip config item: {item}".format(item=item))
+        for item in ['num', 'size', 'stddev']:
+            if item not in config.roundtrip.files:
+                raise RuntimeError("Missing roundtrip config item: files.{item}".format(item=item))
+
+        seeds = dict(config.roundtrip.get('random_seed', {}))
+        seeds.setdefault('main', random.randrange(2**32))
+
+        rand = random.Random(seeds['main'])
+
+        for name in ['names', 'contents', 'writer', 'reader']:
+            seeds.setdefault(name, rand.randrange(2**32))
+
+        print 'Using random seeds: {seeds}'.format(seeds=seeds)
+
+        # setup bucket and other objects
+        bucket_name = common.choose_bucket_prefix(config.roundtrip.bucket, max_len=30)
+        bucket = conn.create_bucket(bucket_name)
+        print "Created bucket: {name}".format(name=bucket.name)
+        objnames = realistic.names(
+            mean=15,
+            stddev=4,
+            seed=seeds['names'],
+            )
+        objnames = itertools.islice(objnames, config.roundtrip.files.num)
+        objnames = list(objnames)
+        files = realistic.files(
+            mean=1024 * config.roundtrip.files.size,
+            stddev=1024 * config.roundtrip.files.stddev,
+            seed=seeds['contents'],
+            )
+        q = gevent.queue.Queue()
+
+        logger_g = gevent.spawn(yaml.safe_dump_all, q, stream=real_stdout)
+
+        print "Writing {num} objects with {w} workers...".format(
+            num=config.roundtrip.files.num,
+            w=config.roundtrip.writers,
+            )
+        pool = gevent.pool.Pool(size=config.roundtrip.writers)
+        start = time.time()
+        for objname in objnames:
+            fp = next(files)
+            pool.spawn(
+                writer,
+                bucket=bucket,
+                objname=objname,
+                fp=fp,
+                queue=q,
+                )
+        pool.join()
+        stop = time.time()
+        elapsed = stop - start
+        q.put(dict(
+                type='write_done',
+                duration=int(round(elapsed * NANOSECOND)),
+                ))
+
+        print "Reading {num} objects with {w} workers...".format(
+            num=config.roundtrip.files.num,
+            w=config.roundtrip.readers,
+            )
+        # avoid accessing them in the same order as the writing
+        rand.shuffle(objnames)
+        pool = gevent.pool.Pool(size=config.roundtrip.readers)
+        start = time.time()
+        for objname in objnames:
+            pool.spawn(
+                reader,
+                bucket=bucket,
+                objname=objname,
+                queue=q,
+                )
+        pool.join()
+        stop = time.time()
+        elapsed = stop - start
+        q.put(dict(
+                type='read_done',
+                duration=int(round(elapsed * NANOSECOND)),
+                ))
+
+        q.put(StopIteration)
+        logger_g.get()
+
+    finally:
+        # cleanup
+        if options.cleanup:
+            if bucket is not None:
+                common.nuke_bucket(bucket)
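The `writer` and `reader` workers above share one timing pattern: stop the clock first, record the error second, and always report the duration in nanoseconds. A minimal sketch of that pattern in Python 3 follows. `timed` is a hypothetical helper that does not exist in this diff, and it omits the gevent-specific `GreenletExit` handling.

```python
import time
import traceback

NANOSECONDS_PER_SECOND = int(1e9)


def timed(kind, operation):
    # Run operation() and return a result record shaped like the ones
    # the workers put on the queue.
    result = dict(type=kind)
    start = time.time()
    try:
        operation()
    except Exception as e:
        end = time.time()  # stop the timer before doing anything else
        result.update(error=dict(msg=str(e), traceback=traceback.format_exc()))
    else:
        end = time.time()
    elapsed = end - start
    result.update(start=start, duration=int(round(elapsed * NANOSECONDS_PER_SECOND)))
    return result


def failing():
    raise ValueError('boom')


ok = timed('w', lambda: None)
failed = timed('r', failing)
```

Calling `traceback.format_exc()` inside the `except` block matters: outside an active exception it has no traceback to format.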
--- a/s3tests_boto3/tests/test_realistic.py
+++ b/s3tests_boto3/tests/test_realistic.py
@ -0,0 +1,79 @@
+from s3tests import realistic
+import shutil
+import tempfile
+
+
+# XXX not used for now
+def create_files(mean=2000):
+    return realistic.files2(
+        mean=1024 * mean,
+        stddev=1024 * 500,
+        seed=1256193726,
+        numfiles=4,
+    )
+
+
+class TestFiles(object):
+    # the size and seed are what we can get when generating a bunch of files
+    # with pseudo random numbers based on stddev, seed, and mean.
+
+    # this fails, demonstrating the (current) problem
+    #def test_random_file_invalid(self):
+    #    size = 2506764
+    #    seed = 3391518755
+    #    source = realistic.RandomContentFile(size=size, seed=seed)
+    #    t = tempfile.SpooledTemporaryFile()
+    #    shutil.copyfileobj(source, t)
+    #    precomputed = realistic.PrecomputedContentFile(t)
+    #    assert precomputed.valid()
+
+    #    verifier = realistic.FileVerifier()
+    #    shutil.copyfileobj(precomputed, verifier)
+
+    #    assert verifier.valid()
+
+    # this passes
+    def test_random_file_valid(self):
+        size = 2506001
+        seed = 3391518755
+        source = realistic.RandomContentFile(size=size, seed=seed)
+        t = tempfile.SpooledTemporaryFile()
+        shutil.copyfileobj(source, t)
+        precomputed = realistic.PrecomputedContentFile(t)
+
+        verifier = realistic.FileVerifier()
+        shutil.copyfileobj(precomputed, verifier)
+
+        assert verifier.valid()
+
+
+# new implementation
+class TestFileValidator(object):
+
+    def test_new_file_is_valid(self):
+        size = 2506001
+        contents = realistic.generate_file_contents(size)
+        t = tempfile.SpooledTemporaryFile()
+        t.write(contents)
+        t.seek(0)
+        fp = realistic.FileValidator(t)
+        assert fp.valid()
+
+    def test_new_file_is_valid_when_size_is_1(self):
+        size = 1
+        contents = realistic.generate_file_contents(size)
+        t = tempfile.SpooledTemporaryFile()
+        t.write(contents)
+        t.seek(0)
+        fp = realistic.FileValidator(t)
+        assert fp.valid()
+
+    def test_new_file_is_valid_on_several_calls(self):
+        size = 2506001
+        contents = realistic.generate_file_contents(size)
+        t = tempfile.SpooledTemporaryFile()
+        t.write(contents)
+        t.seek(0)
+        fp = realistic.FileValidator(t)
+        assert fp.valid()
+        assert fp.valid()
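`FileVerifier`, exercised by `test_random_file_valid` above, checks a trailing MD5 digest while data streams in, without knowing the total size in advance. It does so by holding back the last 16 bytes and hashing everything before them. A minimal sketch of that technique in Python 3 follows. `TailDigestVerifier` is a hypothetical name, and the chunk timing bookkeeping of the original is left out.

```python
import hashlib


class TailDigestVerifier(object):
    # Hashes everything except the trailing digest_size bytes, which are
    # held back in a buffer until the stream ends.
    def __init__(self):
        self.hash = hashlib.md5()
        self.buf = b''
        self.size = 0

    def write(self, data):
        self.size += len(data)
        self.buf += data
        keep = self.hash.digest_size
        # Everything except the last `keep` bytes is certainly payload.
        ready, self.buf = self.buf[:-keep], self.buf[-keep:]
        self.hash.update(ready)

    def valid(self):
        if self.size < self.hash.digest_size:
            # Short objects carry only a prefix of the digest.
            return self.hash.digest().startswith(self.buf)
        return self.buf == self.hash.digest()


payload = b'x' * 1000
blob = payload + hashlib.md5(payload).digest()

good = TailDigestVerifier()
for start in range(0, len(blob), 64):  # feed in fixed-size chunks
    good.write(blob[start:start + 64])

bad = TailDigestVerifier()
bad.write(b'y' + blob[1:])

short = TailDigestVerifier()
short.write(hashlib.md5(b'').digest()[:5])
```

The held-back buffer never grows beyond 16 bytes, so memory use stays constant regardless of object size.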
--- a/setup.py
+++ b/setup.py
@ -14,6 +14,7 @@ setup(

    install_requires=[
        'boto >=2.0b4',
+        'boto3 >=1.0.0',
        'PyYAML',
        'bunch >=1.0.0',
        'gevent >=1.0',