Skip to content
Berit Müller edited this page May 7, 2018 · 31 revisions

REEEM Database

Database setup

The database setup contains all (SQL) scripts to create the REEEM database structure. It is highly oriented on the OEP structure in order to assure compatibility between these databases. The database setup comprises:

Database adapters

Database adapters are (python) scripts that read in data from a file and write it to a database table.

The reeem.io script is needed for each adapter. It contains a method to establish a database connection to the REEEM PostgreSQL database. In addition it contains a python logging function and a special database scenario log function.

Each REEEM modelling team has a customized data adapter. The adapter does not effect the data itself but is used an importer or exporter between the (online) database and a (local) file. There is a main issue for each modelling team (with green flag help wanted).

Data

File names

The naming of the data files must follow the convention:

[Date]_[PATHWAY]_[MODEL]_[FRAMEWORK]_[VERSION]_[I/O]

Example: 2018-01-16_BASE_TimesPanEU_FrameworkV2_DataV1_Input.xslx

[Date] YYYY-MM-DD. (ISO 8601) Use the hyphen (-) to separate years, months and days.

[PATHWAY] Name of the pathway. As of now (document to be kept updated)

[MODEL] Name of the model/tool.

[FRAMEWORK] FrameworkVn. It refers to the improved modelling frameworks.

[VERSION] DataVn. Necessary since the pathway, the model and the framework may be the same, but some inputs may have been tweaked/corrected.

[I/O] Input or Output.

Data structure #5

Each data set (table) needs at least these columns [datatype]:

  • nid - A unique id per sheet or file [integer]
  • region - Country code (iso_2) [text]
  • schema - Database schema (supply, demand, economy, ...) [text]
  • field - Quite flexible I (Power, Heat, Mobility, Demands, ...) [text]
  • category - Quite flexible II (Technology, ...) [text]
  • year - YYYY [integer]
  • indicator - Parameter name [text]
  • value - number [double precision]
  • unit - unit abbreviation [text]
  • aggregation - Single or sum (for a category) [boolean]
  • source - Where does the value come from [text]

In addition there are some fields added automatically:

  • id (unique table id) [serial]
  • Pathway (From folder structure) [text]
  • Version (From TIMES Version) [text]
  • Updated (now()) [timestamp]

Paths

The database adapters are using relative file paths. In order to assure a successful run on different machines the folder structure has to be followed!

  1. Clone the reeem_db repository -> HELP
  2. Make a copy of the Model_Data folder and insert into folder database_adapter -> see #15
  3. Execute adapter from repository

Warning: Do not add data to git!

Folder structure #15

reeem_folder_structure

Database views #29 with jupyter notebooks

The data can be selected and filtered using views. These views are written in SQL and are executed in the database. As additional service, the RLI has gathered visualisation scripts in jupyter notebooks. To run them on your computer you have to create an according environment. We do that with anaconda (which you have to install). There are additional requirements to run the notebooks. Therefore you have to execute the following commands (only once - when you run them again you start with the "activate" command):

Create conda environment

open cmd
cd ...\reeem_db\database_views\reeem_jupyter\
conda env create -f requirement_reeem-vis.yml
conda info --envs

Run Jupyter Notebooks

open cmd
cd ...\reeem_db\database_views\reeem_jupyter\
activate reeem-vis
jupyter notebook

  • 1: One indicator for one region in one pathway over time (LinePlot)
  • 2: One indicator for one region in different pathways over time (LinePlot)
  • 3: One indicator for all regions in one pathway over time (LinePlot, StapelPlot)

to execute the code in the notepads you have to push enter+shift