Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

windowing: weird(?) behavior with the last window #363

Open
mufernando opened this issue Jun 25, 2021 · 4 comments
Open

windowing: weird(?) behavior with the last window #363

mufernando opened this issue Jun 25, 2021 · 4 comments

Comments

@mufernando
Copy link
Contributor

mufernando commented Jun 25, 2021

In the example below, we have four sites and I ask scikit-allel to output windowed diversity with windows of length 1.

Screen Shot 2021-06-25 at 15 30 33

Note, however, that I only get 3 values in the diversity array, and the last window ends up including sites 3 and 4.

This seems a bit odd to me, so I was trying to understand why scikit-allel handles it like this. Just to spell it out: I find it odd that the windows end up being 1,1], [2,2] and [3,4] instead of [1,1], [2,2], [3,3], [4,4].

cc @petrelharp

@petrelharp
Copy link
Collaborator

Here's what's happening:

  1. windows are specified as a sequence of (1-based) nonoverlapping [start, stop] values, with both ends inclusive.
  2. specifying start, stop, step uses range(start, stop, step) here to produce the starting points.
  3. range(1, 4, 1) produces [1,2,3], for the starting points, and then 4 is reserved for the last stop value, which makes sense, as you did say stop=4.

@mufernando, it's not clear to me what the right thing is here. At least this should be documented more (eg. say that endpoints of windows are both inclusive), but I'd have to work some more examples to figure out the consequences of different ways of doing this.

@alimanfoo
Copy link
Contributor

Looking at the implementation of position_windows again, I'm not sure why I did it like that. Perhaps this would be better:

    for window_start in range(start, stop, step):

        # determine window stop (inclusive)
        window_stop = window_start + size - 1
        if window_stop > stop:
            # last window
            window_stop = stop
            last = True

        windows.append([window_start, window_stop])

        if last:
            break

@petrelharp
Copy link
Collaborator

That seems like it'd be better...

@mufernando
Copy link
Contributor Author

mufernando commented Jun 28, 2021

That looks good @alimanfoo. The only thing to change there would be to do the range up until stop+1, because in general 1-based systems are left and right inclusive (see this).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants