TableNumber and ImageNumber discrepancy #8

Closed
niranjchandrasekaran opened this issue Dec 20, 2022 · 4 comments

Comments

@niranjchandrasekaran

@Arkkienkeli asked

I found that the TableNumber and ImageNumber columns are used differently across sources, which might be critical if they are used elsewhere for selection.

All queries below are for one arbitrary plate per source.

For source 3:
sqlite> select count(distinct TableNumber) from Nuclei;
384
sqlite> select count(distinct ImageNumber) from Nuclei;
9

For source 4:
sqlite> select count(distinct TableNumber) from Nuclei;
3416
sqlite> select count(distinct ImageNumber) from Nuclei;
3416

For source 2:
sqlite> select count(distinct TableNumber) from Nuclei;
1
sqlite> select count(distinct ImageNumber) from Nuclei;
2302
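
The three patterns above differ in which column, if any, identifies an image on its own. One way to check this on a plate is to compare the distinct count of each column with the count of distinct (TableNumber, ImageNumber) pairs. The following is a minimal sketch using Python's built-in `sqlite3` module. Only the `Nuclei` table and the two column names come from the queries above; the helper names `key_counts` and `make_db` and the toy rows are illustrative assumptions, not real plate data or existing tooling.

```python
import sqlite3


def key_counts(conn, table="Nuclei"):
    """Count distinct TableNumber values, ImageNumber values, and pairs."""
    # `table` is interpolated into the SQL, so pass only trusted table names.
    row = conn.execute(
        f"""
        SELECT
            (SELECT COUNT(DISTINCT TableNumber) FROM {table}),
            (SELECT COUNT(DISTINCT ImageNumber) FROM {table}),
            (SELECT COUNT(*) FROM
                (SELECT DISTINCT TableNumber, ImageNumber FROM {table}))
        """
    ).fetchone()
    return {"tables": row[0], "images": row[1], "pairs": row[2]}


def make_db(rows):
    """Build an in-memory toy Nuclei table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Nuclei (TableNumber, ImageNumber, ObjectNumber)")
    conn.executemany("INSERT INTO Nuclei VALUES (?, ?, ?)", rows)
    return conn


# Toy version of the source 3 pattern: ImageNumber restarts within each TableNumber.
restarting = make_db([(t, i, 1) for t in range(4) for i in range(1, 4)])
# Toy version of the source 2 pattern: a single TableNumber, many ImageNumbers.
single_table = make_db([(0, i, 1) for i in range(1, 6)])

print(key_counts(restarting))    # {'tables': 4, 'images': 3, 'pairs': 12}
print(key_counts(single_table))  # {'tables': 1, 'images': 5, 'pairs': 5}
```

When `pairs` is larger than both single-column counts, as in the first toy table, neither column alone is enough to identify an image.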

I believe that both TableNumber and ImageNumber should be unique per image, as in source 4, but sources 2 and 3 don't seem to follow this convention.
I noticed this because I use John Arevalo's script for location extraction, which joins on the TableNumber field. It usually works well on different datasets but fails on source 3.
The safe choice seems to be to use both TableNumber and ImageNumber in queries (if a per-image distinction is needed).
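
To illustrate why joining on both columns is the safer choice, here is a small sketch with Python's `sqlite3` module. It is not the location-extraction script mentioned above, and it does not claim to reproduce how that script fails. The `Image` table, the `Metadata_Well` and `Metadata_Site` columns, the location column names, and all rows are assumptions made up for the example. It shows one way a TableNumber-only join can go wrong when ImageNumber restarts within each TableNumber: every nucleus matches every image that shares its TableNumber.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Image (TableNumber, ImageNumber, Metadata_Well, Metadata_Site);
    CREATE TABLE Nuclei (TableNumber, ImageNumber, ObjectNumber,
                         Nuclei_Location_Center_X, Nuclei_Location_Center_Y);
""")

# Hypothetical data in the source 3 style: one TableNumber per well,
# ImageNumber counting sites 1..3 within that well.
images = [(t, s, f"W{t}", s) for t in (1, 2) for s in (1, 2, 3)]
# One nucleus per image, with made-up coordinates.
nuclei = [(t, s, 1, 10.0 * t, 10.0 * s) for t, s, _, _ in images]
conn.executemany("INSERT INTO Image VALUES (?, ?, ?, ?)", images)
conn.executemany("INSERT INTO Nuclei VALUES (?, ?, ?, ?, ?)", nuclei)

JOIN_ON_TABLE_ONLY = """
    SELECT COUNT(*) FROM Nuclei n
    JOIN Image i ON n.TableNumber = i.TableNumber
"""
JOIN_ON_BOTH = """
    SELECT COUNT(*) FROM Nuclei n
    JOIN Image i ON n.TableNumber = i.TableNumber
                AND n.ImageNumber = i.ImageNumber
"""

n_nuclei = conn.execute("SELECT COUNT(*) FROM Nuclei").fetchone()[0]
table_only = conn.execute(JOIN_ON_TABLE_ONLY).fetchone()[0]
both = conn.execute(JOIN_ON_BOTH).fetchone()[0]

# 6 nuclei; the TableNumber-only join yields 18 rows, the two-column join 6.
print(n_nuclei, table_only, both)
```

The two-column join returns exactly one row per nucleus in this toy data, and it would do the same for tables laid out like the source 2 or source 4 examples, where the pair is also unique per image.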

@niranjchandrasekaran
Author

tagging in @bethac07 and @shntnu

@bethac07

I'm not surprised to see the source3 setup - it's what you might expect to see in certain ways of setting up CellProfiler (that's WHY we add TableNumber, because if ImageNumber was always unique, we wouldn't need it!) - I can go into more detail if you're curious but it's not critical to understand here.

I'm not sure why source2 looks the way it does, but as long as between the two columns we can always figure out which is which, then I'm fine with it :)

@shntnu shntnu transferred this issue from jump-cellpainting/datasets Dec 20, 2022
@shntnu
Collaborator

shntnu commented Dec 20, 2022

I've moved this issue to the repo we are using to track our collaboration with Mike Ando + Verily.
@Arkkienkeli - you might find it useful to skim #5

I'll try to peek back in later

@shntnu
Collaborator

shntnu commented Feb 27, 2023

I'll close this in favor of #5

@shntnu shntnu closed this as completed Feb 27, 2023