discrepancies between (HRV) zarr files on gcp and downloaded satellite data #191

Open
tomasvanoyen opened this issue Aug 23, 2023 · 1 comment
Labels: bug, good first issue, help wanted

tomasvanoyen commented Aug 23, 2023

Describe the bug

I came across a discrepancy between the public (HRV) dataset on GCP and data downloaded directly from the EUMETSAT API.

To Reproduce

Steps to reproduce the behavior:

  1. Connect to public satellite data by:

import gcsfs
import xarray as xr

# Open the public 2020 HRV zarr store on GCS
gcs = gcsfs.GCSFileSystem()
zstore = 'gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_hrv.zarr'
mapper = gcs.get_mapper(zstore)
ds = xr.open_zarr(mapper, consolidated=True)
and plot the data with coastlines:

import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np

projection = {'proj': 'geos', 'lon_0': 9.5, 'h': 35785831, 'x_0': 0, 'y_0': 0, 'a': 6378169, 'rf': 295.488065897014}

fig = plt.figure(figsize=(20, 20))
crs = ccrs.Geostationary(central_longitude=projection['lon_0'], satellite_height=projection['h'])
ax = plt.axes(projection=crs)
ax.coastlines(resolution='10m', alpha=0.5, color='blue')

ds['data'].sel(time=np.datetime64('2020-07-02T07:00:00'), variable='HRV').plot(ax=ax, cmap='gray', add_colorbar=False)
This clearly shows that the coastlines are offset from the satellite observation data (have a look at Libya).

On the other hand, downloading the same data with the eumdac CLI (eumdac download -c EO:EUM:DAT:MSG:MSG15-RSS --start 2020-07-02T06:45 --end 2020-07-02T07:15) and combining the *.NAT files with the methods in scripts/extend_gcp_zarr.py (temporary link here) removes the discrepancy between the coastlines and the satellite observations.

Hence, it appears that something is incorrect in the satellite data in the public GCS bucket.

I am guessing here, but could it be that the public zarr file lumps a full year of observations together, while the spatial dimensions of the observations move during the year? If this is the case, the data should be divided temporally over more zarr files.

Best regards,

Tomas

@tomasvanoyen tomasvanoyen added the bug Something isn't working label Aug 23, 2023
@tomasvanoyen tomasvanoyen changed the title discrepancies between zarr files on gcp and downloaded satellite data discrepancies between (hrv) zarr files on gcp and downloaded satellite data Aug 23, 2023
@tomasvanoyen tomasvanoyen changed the title discrepancies between (hrv) zarr files on gcp and downloaded satellite data discrepancies between (HRV) zarr files on gcp and downloaded satellite data Aug 23, 2023
@jacobbieker
Member

Hi,

Sorry about the delayed response, this slipped through the cracks! But yes, there are some issues with the public GCP data because it takes the coordinate information of the first timestep of the year, and applies it to the whole year. Theoretically, the datasets of x_geostationary_coordinates and y_geostationary_coordinates should have the per-timestep coordinates, but the processing seems to have not worked, so they don't actually contain that data.
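To illustrate why reusing the first timestep's coordinates produces an offset, here is a minimal sketch in plain Python. The extents, the grid size, and the helper name `pixel_centres` are made-up values for illustration and are not taken from the actual zarr files. It assumes the scan window shifts by a whole number of pixels between two timesteps.

```python
def pixel_centres(extent_min, extent_max, n_pixels):
    """Return the centre coordinate of each pixel along one axis."""
    step = (extent_max - extent_min) / n_pixels
    return [extent_min + (i + 0.5) * step for i in range(n_pixels)]


# Hypothetical x extents (metres) for two timesteps of the same year.
# The later window is shifted east by 3 pixels of 1000 m each.
first_x = pixel_centres(-2_000_000.0, 2_000_000.0, 4000)
later_x = pixel_centres(-1_997_000.0, 2_003_000.0, 4000)

# A feature sits in column 1200 of the later image.
column = 1200
true_x = later_x[column]     # where the feature really is
assumed_x = first_x[column]  # where it is drawn if the first timestep's coordinates are reused

offset_m = true_x - assumed_x
print(f"feature is plotted {offset_m:.0f} m away from its true position")
```

In this toy case the image content is correct, but every pixel is labelled with a coordinate that is 3 km off, which is the kind of shift that would show up as misaligned coastlines.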

I will try to fix that processing so that the newer zarrs can have that fixed, although it might take quite a while. Primarily, we need to

  • Get the x and y geostationary coordinates for all timesteps in the dataset (per year working backwards seems to make the most sense)
  • Replace the current x_geostationary_coordinates and y_geostationary_coordinates with the actual values

That should allow the images to be shifted to the correct locations for plotting and the like. Sorry for the issue with that data.
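The two steps above could look roughly like the following sketch. It is not the repository's processing code: the extents, the grid size `N_X`, and the names `axis_coords`, `x_extents`, `x_per_timestep`, and `nearest_column` are hypothetical, and it assumes ascending coordinates along the axis. The idea is simply to store one coordinate row per timestep and use that row when locating a point.

```python
from bisect import bisect_left


def axis_coords(extent_min, extent_max, n_pixels):
    """Pixel-centre coordinates along one axis for a given extent."""
    step = (extent_max - extent_min) / n_pixels
    return [extent_min + (i + 0.5) * step for i in range(n_pixels)]


# Hypothetical per-timestep x extents in metres; in practice these would
# come from the metadata of the source files for each timestep.
x_extents = {
    "2020-01-01T00:00": (-2_000_000.0, 2_000_000.0),
    "2020-07-02T07:00": (-1_997_000.0, 2_003_000.0),
}
N_X = 4000

# One coordinate row per timestep, instead of one row reused for the whole year.
x_per_timestep = {t: axis_coords(lo, hi, N_X) for t, (lo, hi) in x_extents.items()}


def nearest_column(time, x_target):
    """Index of the pixel whose centre is closest to x_target at this timestep."""
    coords = x_per_timestep[time]
    i = bisect_left(coords, x_target)
    if i == 0:
        return 0
    if i == len(coords):
        return len(coords) - 1
    # Pick whichever neighbour is closer to the target.
    return i if coords[i] - x_target < x_target - coords[i - 1] else i - 1


# The same ground position maps to different columns at different timesteps.
print(nearest_column("2020-01-01T00:00", -799_500.0))
print(nearest_column("2020-07-02T07:00", -799_500.0))
```

With per-timestep rows available, a plot for a given time can use the matching row as its x axis rather than the first row of the year.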

@jacobbieker jacobbieker added good first issue Good for newcomers help wanted Extra attention is needed labels Nov 22, 2023
@jacobbieker jacobbieker self-assigned this Nov 22, 2023