Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

windowed_r_squared: IndexError for single-SNP windows #337

Open
pmonnahan opened this issue Aug 19, 2020 · 2 comments
Open

windowed_r_squared: IndexError for single-SNP windows #337

pmonnahan opened this issue Aug 19, 2020 · 2 comments

Comments

@pmonnahan
Copy link

Hi,

I have been trying to implement the windowed_r_squared function, and I believe that I've come across a bug related to windows containing a single SNP. Below is a snippet generated via pdb, showing that the error arises within the 'windowed_statistic' function. I noticed in the source code of 'windowed_statistic' that there is an 'if n==0' catch and am wondering if this needs to be generalized for r2 calculations where a pair of observations is required at a minimum.

I could increase window size or try to identify and remove these windows beforehand, but I thought I would check whether this is a known (or non-) issue on your end.

r2, windows, counts = allel.windowed_r_squared(pos, ct, size=winsize, start=start, stop=stop, step=int(winsize / 2), fill=-9)`
(Pdb) s
--Call--
/home/spectorl/pmonnaha/.conda/envs/py36/lib/python3.6/site-packages/allel/stats/ld.py(161)windowed
_r_squared()
-> def windowed_r_squared(pos, gn, size=None, start=None, stop=None, step=None,
(Pdb)
/home/spectorl/pmonnaha/.conda/envs/py36/lib/python3.6/site-packages/allel/stats/ld.py(214)windowed
_r_squared()
-> if isinstance(percentile, (list, tuple)):
(Pdb)
/home/spectorl/pmonnaha/.conda/envs/py36/lib/python3.6/site-packages/allel/stats/ld.py(222)windowed
_r_squared()
-> def statistic(gnw):
(Pdb)
/home/spectorl/pmonnaha/.conda/envs/py36/lib/python3.6/site-packages/allel/stats/ld.py(226)windowe$
_r_squared()

... After about 5 windows with error-free calculation, I encounter this:

return windowed_statistic(pos, gn, statistic, size, start=start,
(Pdb) s
wv = values[start_idx:stop_idx]
(Pdb) n
/home/spectorl/pmonnaha/.conda/envs/py36/lib/python3.6/site-packages/allel/stats/window.py(366)windowed_statistic()
s = statistic(wv)
(Pdb) print(n)
1
(Pdb) print(wv)
[[0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 2 1 1 1 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0]]
(Pdb) wv.shape
(1, 99)
(Pdb) n
IndexError: cannot do a non-empty take from an empty axes.
/home/spectorl/pmonnaha/.conda/envs/py36/lib/python3.6/site-packages/allel/stats/window.py(366)windowed_statistic()
-> s = statistic(wv)

Thanks!

@hardingnj
Copy link
Collaborator

Thanks @pmonnahan .

This looks to reproduce with:

import allel
import numpy as np

pos = np.random.choice(100_000, 100, replace=False)

gn = np.random.randint(0, 3, (100, 100))

r2, windows, counts = allel.windowed_r_squared(pos, gn, size=5000, start=1, stop=100_000, fill=-9)

And this line (equivalent) causes the error:

np.percentile(np.array([]), 50)

I guess a check for the shape of gnw is the best option here.

@pmonnahan
Copy link
Author

Thanks @hardingnj for the reprex and marking for change. Also, thanks to you and others for such a great package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants