FsCheck turns 3.0? #198

kurtschelfthout · 2016-02-14T11:17:42Z

This issue is to discuss ideas or changes for FsCheck 3.0. I started thinking about this starting from existing issues and recurring questions that indicate more problematic parts; so far there aren't very really new features in here - mostly a rewrite or update of existing ones.

Where I have worked on these things I've included the branch name; these branches won't necessarily work or even compile. Sometimes they are just broad strokes to play with the idea.

EDIT: all development marked done below is merged to the fscheck3 branch.

branch replace-random Replace the random number generation by the algorithm in "Fast splittable pseudo-random number generators" by Steel et al. which provides a better basis to do Splittable PRNG. The current implementation taken from Hugs has been shown to have serious statistical defects. Also FsCheck doesn't need to rely on splitting quite so much, if a generator also returns the seed after use.
- Figure out why it's slower (tuple allocations, fixed)
branch Gen-CPS even though function using Gen may be tail-recursive, that doesn't necessarily mean the overall generator is tail recursive. Using a CPS style representation of Gen can solve some of these issues. This change is relatively straightforward I think, and can be made without changing any part of the API except the Gen constructor.
- This is A LOT slower. Figure out why, may have to can this one...
branch no-static-arb Remove the Arbitrary registration mechanism. (Perhaps keep an immutable one just keeping the default generators that ship with FsCheck). Replace with explicitly passing Arbitrary instance (e.g. FsList takes an Arbitrary<'a> to generate the elements with). We'll still need some kind of map 'a -> Arbitrary<'a> to parametrize the reflective Arbitrary instance with, so it's likely the "typeclass" discovery mechanism is still useful. I feel like this is perhaps slightly more verbose, but will remove a bunch of confusion and implicit "magic". In addition, it's hard to maintain some kind of consistency when tests can for example run concurrently, because it relies on modifying global static values.
Re-think how we handle running tests. I've been wondering if we want to keep maintaining integration with test running frameworks. From a maintenance perspective (and given all the issues on how to interopate with xunit/nunit and all their versions, perhaps from a user perspective too) it would certainly be a lot simpler if FsCheck would just be a library. For easy integration, we could reverse the defaults: Check.Quick could throw an exception by default, we can remove Check.QuickThrowOnFailure and add Check.QuickNoThrow. I'd also like some to expose running tests more explicitly, perhaps as an IEnumerable of test/shink results instead of the callback based interface it is now; this would hopefully allow things like storing the results in a DB or whatever we may want to add later on.
Remove some of the assertion-like property combinators, like Prop.within. Users tend to use the assertions from some more focused framework, no real point in maintaining these here really.
Maybe remove the possibility of returning bool to signify success/failure, standardize on throwing exceptions. This is standard practice anyway for .NET based testing tools; it removes a bunch of overloads on the C# side that the C# compiler is not super happy with; it may be possible to get rid of label (as the test failure is propagated via the exception message).

Any thoughts on these or further ideas are appreciated.

The text was updated successfully, but these errors were encountered:

moodmosaic · 2016-02-14T12:15:49Z

replace-random

The current implementation taken from Hugs has been shown to have serious statistical defects

Have you read this thread? — Granted, the same seed produces the same results, but if you come up with a good seed the probability distribution on the generated set ought to be good enough. (?)

Also FsCheck doesn't need to rely on splitting quite so much

In fact, splitting should be required only in bind. — I can't imagine any other place where you'd really need to split.

Gen-CPS

That'd be awesome and essentially it'd bring up this great discussion you had on Stack Overflow back then.

no-static-arb

FsCheck is great, but I think it was a design mistake to emulate Arbitrary this way. — I'd keep the arbitrary class, as a pair of a generator and a (optional) shrinker like so:

/// <summary>
/// Represents a pair of a generator and a shrinker, used for random generation
/// and (optional) shrinking of values.
/// </summary>
[<AbstractClass>]
type Arbitrary<'a>() =

    /// <summary>
    /// A generator for values of the given type.
    /// </summary>
    abstract Generator : Gen<'a>

    /// <summary>
    /// Produces a (possibly) empty list of all the possible immediate shrinks
    /// of the given value. The default implementation returns the empty list,
    /// so will not try to shrink the value.
    /// Most implementations of shrink should try at least three things:
    ///  - Shrink a term to any of its immediate subterms.
    ///  - Recursively apply shrink to all immediate subterms.
    ///  - Type-specific shrinkings such as replacing a type by a simpler ones.
    /// </summary>
    abstract      Shrink   : 'a -> 'a seq
    override this.Shrink _ = Seq.empty

But then I'd use this approach instead. — On a side note: Quviq's version doesn't even use the notion of Arbitrary.

Remove Prop.within

Yes, I think it makes sense. After all, the quantification is universal, isn't it? (You don't currently support existentials.)

ploeh · 2016-02-14T15:21:41Z

branch no-static-arb

That sounds good. I never felt comfortable with the more imperative, side-effecty part of that API.

I still think that it'd be useful to have an API that 'configures' FsCheck to behave differently than it does by default, but I suppose this can be done with a pure function instead; something like Config -> Config.

Test runners

I support rethinking this, but I'd hate to dispense with the xUnit.net Glue Library. I think the declarative nature is extremely valuable: simply being able to declare a function with some arguments, and have those arguments automatically supplied by FsCheck is, in my opinion, extraordinarily compelling.

I'm happy to support this library, but I haven't seen any issues in this regard since we completed the migration to xUnit.net 2.

I'll also be happy to help develop a new version of this library for FsCheck 3.

kurtschelfthout · 2016-02-14T16:52:29Z

@moodmosaic

replace-random

Splitting as implemented has no statistical basis, no tests, and is known to have problematic behavior, see e.g. Claessen and Palka, "Splittable Pseudorandom number generators using cryptographic hashing".

You don't need split for bind: https://github.com/fscheck/FsCheck/blob/replace-random/src/FsCheck/Gen.fs#L59-L62 but you do need split for e.g. running tests in parallel, and for generating functions - basically everywhere you need a "branch" but can't predict how many values you'll need in that branch.

no-static-arb

Frankly I'm not sure Arbitrary is worth keeping. I've run into a few cases where you want to use a shrinker for a type without using its shrinker or vice versa - e.g. when a generator fails, for example - and then tying the two together makes you jump through hoops. I know about Erlang's version, gen/shrink is just passed around - that's basically what I'm proposing to do. At the same time we'll need some kind of dictionary type -> gen/shrink for that type to e.g. configure the reflective generator/shrinker.

Prop.within

I think there is some confusion here - Prop.within runs something within a timeout. So it's not related to quantification, as far as I can see.

kurtschelfthout · 2016-02-14T17:02:47Z

@ploeh

no-static-arb

I think configuring the generator or shrinker would happen via the Prop.forAll function, broadly speaking. Not sure we still need the option in Config. Would be good to simplify it.

Test runners

I don't disagree. NUnit is more of a problem child :)

Would also like it if it could be more straightforwardly used out of the box with existing test runners. I sometimes forget and write Check.Quick inside a Fact and then have a test that always passes...

Perhaps we should revisit the old idea of splitting the glue libraries in different repos/releases.

ploeh · 2016-02-14T17:09:21Z

Perhaps we should revisit the old idea of splitting the glue libraries in different repos/releases.

We could. I think the most important is that it still lives under the FsCheck umbrella, so that it's readily discoverable, but I don't mind if it has a separate repository, and a separate release cadence.

For this, I'll defer to your decision.

moodmosaic · 2016-02-15T09:45:38Z

replace-random

see e.g. Claessen and Palka, "Splittable Pseudorandom number generators using cryptographic hashing"

I'd start by replacing Hugs/Random's StdGen algorithm with SplitMix, as it performs faster than TFGen.

What do you think?

kurtschelfthout · 2016-02-15T09:48:44Z

@moodmosaic Yes, that's what I did on the replace-random branch. That item is actually almost done.

moodmosaic · 2016-02-15T09:52:45Z

👍

kurtschelfthout · 2016-02-17T20:51:19Z

@ploeh sorry overlooked your reply on separating test runners. Yes, agreed - if it were split up it would live in a separate repo under the FsCheck organization.

kurtschelfthout · 2016-04-11T14:38:20Z

Some progress report.

As you can tell by the above, I've switched the random number generator on the fscheck3.0 branch, and also started passing the seed around instead of relying on splitting. Unfortunately, that made running FsCheck's test about twice as slow (~11 sec vs 5 sec on my machine). So that doesn't bode well, but I will do some profiling to see what has happened - I suspect it's the tuple allocations in Gen which is now seed -> size -> 'a * seed. Perhaps some - gasp - mutability helps here.

I also have the CPS working but that is even slower. I'm considering just not doing that, but we'll see. I'll push it first, but to yet another branch...

moodmosaic · 2016-05-04T10:14:58Z

no-static-arb

@kurtschelfthout, like you said, the Arbitrary registration could (or rather should?) be removed.

Actually, I'd propose to remove the entire 'type-class emulation' discovery mechanism, or rather bring it down to something minimal, like this:

(Pseudo-code)

let private lexicon =
    dict [ (typeof<unit>, gen/shrink of unit)
           (typeof<bool>, gen/shrink of bool)
           ... all built-in gen/shrink pairs go here ]

let arbitrary<'a> = lexicon.[typeof<'a>] |> unbox<Arbitrary<'a>>

That'll bring the v3 model closer to the Erlang version, instead of the Haskell version.

The 'type-class emulation' discovery mechanism could still be valuable for the xUnit.net data theories and so it could probably be moved into that project (?)

moodmosaic · 2016-05-04T10:15:40Z

Linking #236 with this issue as well, since it's v3 related.

kurtschelfthout · 2016-05-04T10:55:58Z

We need some kind of discovery to figure out type -> Arbitrary mapping for the reflective generators/shirnkers. So I don't think completely removing that is a goal, necessarily.

moodmosaic · 2016-05-04T10:58:40Z

For sure though, it could be simplified.

kurtschelfthout · 2016-05-04T11:15:28Z

I'm not so sure. I haven't looked at it recently, but afaik a lot of the code deals with putting together generic types, i.e. you ask for int * string * list<bool> and it figures out it needs to create a generator for 3-tuples, instantiates generic type arguments recursively with int, string and list<bool>.

Using static members on a class is useful because you can create generic methods, i.e. to easily create generic types that the generator/shrinker is parametrized on.

The pseudo code you write above does not work well because the second item in each tuple effectively needs a different type. You'd need to box them explicitly to object or something, which is ugly and unwieldy.

moodmosaic · 2016-05-04T13:10:27Z

That pseudo-code should work only for the built-in gen/shrink (Arbitrary) pairs. Obviously int * string * list<bool> isn't a built-in one.

The Arbitrary registration isn't part of the reflective Arbitrary so it may still be removed in v3...

kurtschelfthout · 2016-05-04T13:55:14Z

There seems to be some misunderstanding. It is a built-in one, effectively. It's inferred automatically from the built in Arbitrary instances for Tuple'3, list'1, int, string and bool.

What I want to remove is the side-effecting part of the registration, i.e. changing the type -> Arbitrary mapping by mutating the static dictionary, effectively.

But, it should still be possible to, say, instantiate the default list<'a> Arbitrary with your custom int Arbitrary so that you can reuse the list Arbitrary to generate a list<int> with the ints that you choose.

I can think of two ways of achieving this:

We change Arb.Default from a static method to Arb.Default(<some type representing type -> Arbitrary mapping>) which creates a new instance that has the given type -> Arbitrary mapping "in context". I.e. generate<'a> would be an instance method on this class.
We add a parameter to each default generator for each dependent type, i.e. instead of Arb.Default.FsList() it would be Arb.Default.FsList(elementArb:Arbitrary<'a>). Then we'd need to change the mechanism to figure out how to create a list<int> based on the type using some more reflection.

Note that 2 is kind of appealing and it seems we don't need a type -> Arbitrary dictionary, until you think about Arb.Default.Derive which is the reflective code that infers Arbitrary from a type. The only realistic way to parametrize that is to pass it a type -> Arbitrary dictionary. So I'm kind of leaning towards option 1 right now. But have yet to make changes, want to see how it looks.

moodmosaic · 2016-05-04T15:41:30Z

Yes. Option 1 could be easier. — Given the existing 'type-class emulation' mechanism, you'd basically replace the internal dictionary, right?

Option 2 feels more like the Erlang version, where you'd have to pass gen/shrink pair(s) around, for example:

> eqc_gen:sample(eqc_gen:list(eqc_gen:int())). 
[-2,-3,6,-9]
[-4]
[-2,-12,2]
[5,11,2,12,-7]
[11,-13,-1,-8]
[-14,-4]
[11,-6]
[-5,-15,17,-3,-4,0]
[]
[-6]
[-9,20,3,-8,1,15,-13]

And it'd still be possible to supply a custom eqc_gen:int() as well. But in this case for FsCheck, the reflective arbitrary mechanism gets in the way.

jack-pappas · 2016-06-30T14:17:40Z

@kurtschelfthout Now that my changes for the splittable RNG have been merged in #281, can you re-run your benchmark? I'm interested to see if you still see a performance impact from the new RNG (compared to FsCheck 2.x).

kurtschelfthout · 2016-07-04T07:24:07Z

@jack-pappas yes I certainly will; not a benchmark though, just FsCheck's tests itself.

Sorry for no or slow responses lately; life has been busy. I expect normal service to resume in about two weeks...

kurtschelfthout · 2016-08-03T20:36:26Z

@jack-pappas I have now merged your change into the fscheck3 branch. So far it seems to improve performance considerably; it's again up to par with the old implementation. Thanks a bunch!

theimowski · 2016-08-04T08:50:38Z

Do you think it is reasonable for v3 to remove ArbitraryAttribute (obsolete as of 2.6.0) in favor of new PropertiesAttribute?

kurtschelfthout · 2016-08-04T08:53:58Z

@theimowski yes, in v3 all the obsoleted stuff will go. But I intend to do that at the end to avoid too many merge problems. I am merging from master to v3 branch quite regularly. In case you are wondering, no need to send a PR for that.

This is a port of "Fast Splittable Pseudorandom Number Generators" by Steele et. al. [1]. The paper's algorithm provides decent randomness for most purposes but sacrifices cryptographic-quality randomness in favor of speed. The original implementation is tested with DieHarder and BigCrush; see the paper for details. This implementation is a port from the paper, and also taking into account the SplittableRandom.java source code in OpenJDK v8u40-b25 as well as splittable_random.ml in Jane Street's standard library overlay (kernel) v113.33.03, and Random.fs in FsCheck v3, which is the initial motivation of doing this port [2] although the idea of doing this port is for having it as the default, splittable random generator in dotnet-jack [3] – QuickCheck with shrinking for free. Other than the choice of initial seed for 'ofRandomSeed' this port should be faithful. Currently, we have not rerun the DieHarder, or BigCrush tests on this implementation. 1. Guy L. Steele, Jr., Doug Lea, Christine H. Flood Fast splittable pseudorandom number generators Comm ACM, 49(10), Oct 2014, pp453-472. 2. fscheck/FsCheck#198 3. #26

ploeh · 2016-12-17T16:13:04Z

FWIW, I've suggested an alternative for handling custom Arbitraries in #334. This would, AFAICT, provide a simple and decoupled, Reflection-based mechanism for custom types.

Remaining is figuring out how to deal with pre-defined types like string, bool, int64 and so on.

How about removing the Arbitrary label from FsCheck.Config and replacing it with a dictionary of well-known types? This would also provide an explicit extensibility mechanism for users.

kurtschelfthout · 2016-12-19T10:58:44Z

Yes - making the type -> Arbitrary dictionary more explicit is something I want to do. For further discussion on your suggestion, let's do that on the other issue.

haf · 2017-03-25T08:17:19Z

Please don't start throwing exceptions. They cost me time and reading of code as someone who maintains a runner in Expecto. Values please.

I think eiriktsarpalis/TypeShape is a nice way to moduralise the lookup of types.

Make it easy to generate sample data.

Can you let the IRunner implementation be responsible for creating and then unwrapping monads, given a list of parameters, and then (having the runner) calling back into FsCheck, with the unwrapped values as some later point? That way we can support cooperative multithreading and you don't have to create a monadic evaluator for every sort of monad there is (async, task, job, ...)

kurtschelfthout · 2017-09-24T20:26:41Z

Thanks for your thoughts on this thread so far. I have summarised and expanded the main ideas here in more focused issues #396 and #397 and am closing this thread.

The only thing that's new here is the Gen in CPS style - but I'm less inclined to work on that in the immediate future - initial prototype made things lots slower - and anyway it's more of an objective thing that either works or doesn't so doesn't need much discussion really.

kurtschelfthout mentioned this issue Mar 16, 2016

Add support for async methods returning Task<T> or Task #179

Closed

kurtschelfthout mentioned this issue May 16, 2016

Improve interaction with Unquote #130

Closed

This was referenced May 22, 2016

Consider making Random and TypeClass modules hidden in v3 #267

Closed

Make Gen.eval easier to use in v3 #268

Closed

ploeh mentioned this issue Dec 20, 2016

Convention-based Arbitrary registration #334

Open

kurtschelfthout added the vNext label Feb 6, 2017

moodmosaic mentioned this issue May 3, 2017

Gen registration and creation? hedgehogqa/fsharp-hedgehog#93

Closed

kurtschelfthout mentioned this issue Sep 24, 2017

RFC: Arbitrary derivation and registration #396

Closed

kurtschelfthout closed this as completed Sep 24, 2017

WalkerCodeRanger mentioned this issue Oct 9, 2018

XUnit Test Fails When Run with Others, Passes When Run Alone. Arb.register() being Lost? #453

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FsCheck turns 3.0? #198

FsCheck turns 3.0? #198

kurtschelfthout commented Feb 14, 2016 •

edited

Loading

moodmosaic commented Feb 14, 2016

ploeh commented Feb 14, 2016

kurtschelfthout commented Feb 14, 2016

kurtschelfthout commented Feb 14, 2016

ploeh commented Feb 14, 2016

moodmosaic commented Feb 15, 2016

kurtschelfthout commented Feb 15, 2016

moodmosaic commented Feb 15, 2016

kurtschelfthout commented Feb 17, 2016

kurtschelfthout commented Apr 11, 2016

moodmosaic commented May 4, 2016

moodmosaic commented May 4, 2016

kurtschelfthout commented May 4, 2016

moodmosaic commented May 4, 2016

kurtschelfthout commented May 4, 2016

moodmosaic commented May 4, 2016

kurtschelfthout commented May 4, 2016

moodmosaic commented May 4, 2016

jack-pappas commented Jun 30, 2016

kurtschelfthout commented Jul 4, 2016

kurtschelfthout commented Aug 3, 2016

theimowski commented Aug 4, 2016

kurtschelfthout commented Aug 4, 2016 •

edited

Loading

ploeh commented Dec 17, 2016

kurtschelfthout commented Dec 19, 2016

haf commented Mar 25, 2017

kurtschelfthout commented Sep 24, 2017

FsCheck turns 3.0? #198

FsCheck turns 3.0? #198

Comments

kurtschelfthout commented Feb 14, 2016 • edited Loading

moodmosaic commented Feb 14, 2016

ploeh commented Feb 14, 2016

kurtschelfthout commented Feb 14, 2016

kurtschelfthout commented Feb 14, 2016

ploeh commented Feb 14, 2016

moodmosaic commented Feb 15, 2016

kurtschelfthout commented Feb 15, 2016

moodmosaic commented Feb 15, 2016

kurtschelfthout commented Feb 17, 2016

kurtschelfthout commented Apr 11, 2016

moodmosaic commented May 4, 2016

moodmosaic commented May 4, 2016

kurtschelfthout commented May 4, 2016

moodmosaic commented May 4, 2016

kurtschelfthout commented May 4, 2016

moodmosaic commented May 4, 2016

kurtschelfthout commented May 4, 2016

moodmosaic commented May 4, 2016

jack-pappas commented Jun 30, 2016

kurtschelfthout commented Jul 4, 2016

kurtschelfthout commented Aug 3, 2016

theimowski commented Aug 4, 2016

kurtschelfthout commented Aug 4, 2016 • edited Loading

ploeh commented Dec 17, 2016

kurtschelfthout commented Dec 19, 2016

haf commented Mar 25, 2017

kurtschelfthout commented Sep 24, 2017

kurtschelfthout commented Feb 14, 2016 •

edited

Loading

kurtschelfthout commented Aug 4, 2016 •

edited

Loading