Import of "Services" data (to Service & ServiceSubmission) #107

Open
34 tasks
joncison opened this issue Sep 11, 2020 · 0 comments
Labels

  • content: Issues concerns content (migration of old, or addition of new)
  • high priority: A high-priority issue that should be acted on ASAP.
  • needs triage: Issues that need to be discussed


joncison commented Sep 11, 2020

Overarching issues

  • the "Services" data in the CSV and the corresponding entities (Service and ServiceSubmission) in the new model have a patchy mapping - a discussion is needed to optimise these models. If these data must be imported sooner rather than later, an intermediate solution (simply reuse the old model?) may be needed - to discuss. The mapping is below.
  • if going with the new model, a Service and ServiceSubmission object will need to be created for each "Service" in the CSV file.
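A minimal sketch of the "one Service plus one ServiceSubmission per CSV row" idea. The two dataclasses are hypothetical stand-ins for the real model classes, a comma-delimited file is assumed, and only three of the columns listed below are read. Everything else about the real import is left open.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Service:
    # Hypothetical stand-in for the real Service model.
    name: str
    description: str = ""


@dataclass
class ServiceSubmission:
    # Hypothetical stand-in for the real ServiceSubmission model.
    service: Service
    scope: str = ""


def import_services(csv_text: str) -> List[Tuple[Service, ServiceSubmission]]:
    """Build one (Service, ServiceSubmission) pair per row of the CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    pairs = []
    for row in reader:
        service = Service(
            name=(row.get("Name of the service") or "").strip(),
            description=(row.get("Brief service description") or "").strip(),
        )
        # Every submission is tied to the service created from the same row.
        submission = ServiceSubmission(
            service=service,
            scope=(row.get("Scope of the service") or "").strip(),
        )
        pairs.append((service, submission))
    return pairs
```

Keeping the pair together makes it easy to save the Service first and the ServiceSubmission second, since the submission needs a Service to point at.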

Existing fields (in "Services")

Name of the service

Mapping: Service:name
Issues:

  • Names can include special characters which would not be suitable for database look-up. Check that pk is used for database lookup of Services.

Name

Mapping: ServiceSubmission:authors & ServiceSubmission:submitters ?
Issues:

  • not 100% certain of the mapping
  • distinction is not made between "author" and "submitter" - what to do?
  • data clean-up! One or more names, and sometimes email addresses are given, with no consistent syntax.
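As a starting point for the clean-up, a rough sketch that pulls email addresses out of a "Name" value and splits what is left into candidate names. The separators (comma, semicolon, slash, ampersand, "and", "et") are an assumption about what appears in the CSV, not something checked against the data, and the result would still need a manual pass.

```python
import re
from typing import List, Tuple

# Deliberately loose email pattern; good enough for harvesting, not validation.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Assumed separators between people in the free-text "Name" column.
SEPARATOR_RE = re.compile(r"[,;/\n&]|\band\b|\bet\b")


def split_contacts(raw: str) -> Tuple[List[str], List[str]]:
    """Return (names, emails) found in one free-text "Name" value."""
    emails = EMAIL_RE.findall(raw)
    without_emails = EMAIL_RE.sub(" ", raw)
    names = []
    for part in SEPARATOR_RE.split(without_emails):
        # Drop leftover brackets and punctuation, collapse inner whitespace.
        cleaned = " ".join(part.strip(" <>()[]:-").split())
        if cleaned:
            names.append(cleaned)
    return names, emails
```

For example, `split_contacts("Jane Doe <jane.doe@example.org>; John Smith and Ann Lee")` separates three names from one address. It does not decide who is an author and who is a submitter; that question stays open.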

Laboratory

Institute

Adress

Mail

Name

Laboratory

Institute

Adress

Mail

Mapping: Service:bioinformaticsTeams
Issues:

  • very uncertain about the mapping, Service:bioinformaticsTeams is "The bioinformatics team(s) that provides this service."
  • Institute seems to correspond to Organisation in the new model
  • current values are a mixture of names and unit numbers, sometimes more than 1, often not specified
  • Corresponding BioinformaticsTeams would need to be created. Mandatory fields for BioinformaticsTeam are name, description, homepage, at least one member (a UserProfile), at least one maintainer (a UserProfile), orgid, ifbMembership (an enum) and fundedBy (an Organisation).
  • Organisation objects may need to be created and a new field added to Service or ServiceSubmission as appropriate to make the connection.

Scope of the service

Mapping: ServiceSubmission:scope
Issues: none

Service category

Mapping: none
Issues:

  • This actually maps to ElixirPlatform but this annotation should be made on associated objects, e.g. BioinformaticsTeams.

Brief service description

Mapping: Service:description
Issues:

  • lots of these descriptions describe individual tools and databases, not the service around them.

Communities served

Mapping: none
Issues:

  • a new field may be required for this, but the text in the existing content is very variable in nature. NB: specific (formal) relevant communities can be annotated via the Databases and Tools included.

Please check the ELIXIR communities potentially served

Mapping: ComputingFacility | Database | Tool | TrainingMaterial :: communities
Issues:

  • specific communities are annotated at the resource not service level.

Year of establishment

Mapping: Service:dateEstablished
Issues:

  • Service:dateEstablished is a date, whereas the value in the CSV is a year.
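One possible way around the year/date mismatch is to store 1 January of the given year. This is only a convention suggested here, not something the model prescribes, and values that are not a plain four-digit year are returned as `None` so they can be reviewed by hand.

```python
import re
from datetime import date
from typing import Optional


def year_to_date(raw: str) -> Optional[date]:
    """Convert a "Year of establishment" value to a date (1 January)."""
    match = re.fullmatch(r"\s*([12]\d{3})\s*", raw or "")
    if match is None:
        # Anything other than a bare four-digit year is left for manual review.
        return None
    return date(int(match.group(1)), 1, 1)
```

The alternative would be to change Service:dateEstablished to a year field, which avoids inventing a day and month that were never supplied.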

Life Cycle stage

Mapping: none
Issues:

  • Life cycle stage (Mature, Emerging, Legacy) is a tool or database-specific annotation.

a. Access to the services

Mapping: none
Issues:

  • content is a mixture of URLs, license types, free text etc. for specific databases and tools. Nearly all the information is captured (as distinct fields) in bio.tools.

b. Quality of the service

Mapping: ServiceSubmission:qaqc ?
Issues:

  • Mapping very uncertain! ServiceSubmission:qaqc is "Short description of quality assurance and control processes in place aimed to ensure a high-quality service." but the text we have for "Quality of the service" covers many different sorts of things.

a. Overall usage

Mapping: ServiceSubmission:usage
Issues:

  • ServiceSubmission:usage is "Description of the extent and ways the service is used, including quantitative usage metrics or indicators." The actual values we have for "Overall usage" are very variable.

Number of publications citing the resource, acknowledgements,

Number of publications where the developers of the resource are co-authors

Mapping: none
Issues:

  • these are magic numbers - an objective way to calculate such metrics from supplied data is needed.

b. Up to 5 key publications describing the resource with their DOI

List of the publications describing the resource

Mapping: Service:publications
Issues:

  • the values for "b. Up to 5 key publications describing the resource with their DOI" are very variable in nature (and usually don't include a DOI).
  • "List of the publications describing the resource" gives a file name (presumably an attachment made during the submission).
  • Service:publications is "Publication(s) that describe the service as a whole." - need to be mindful to avoid duplication with annotated publications on the Databases and Tools included in the service
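Where a DOI is present in the free text, it could be harvested with a pattern like the one below. This is a sketch: the regular expression is a common approximation of DOI syntax rather than a full validator, and rows without any DOI simply yield an empty list and would need manual curation.

```python
import re
from typing import List

# Approximate DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(raw: str) -> List[str]:
    """Return the distinct DOIs found in a free-text publications value."""
    found = []
    for match in DOI_RE.finditer(raw):
        # Trim trailing punctuation picked up from the surrounding sentence.
        doi = match.group(0).rstrip(".,;)").lower()
        # DOIs are case-insensitive, so lower-casing lets duplicates collapse.
        if doi not in found:
            found.append(doi)
    return found
```

The same list could then be compared against the publications already annotated on the Databases and Tools of the service, to spot the duplication mentioned above.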

a. Scientific Advisory Board, users committee,

Mapping: Service:governanceSab
Issues:

  • Service:governanceSab is "Link to the description of the SAB covering the service." but we have free text in the CSV file.
  • Service:governanceSab does not include user committee, but then in practice almost no resources have these.
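To see how many CSV values could go straight into a link field, a quick check like this one might help. It is a sketch that only accepts values consisting of a single http(s) URL; free text, including text that merely contains a URL, is reported as not a link.

```python
from urllib.parse import urlparse


def is_http_url(value: str) -> bool:
    """True if the whole value is a single http(s) URL."""
    candidate = value.strip()
    if not candidate or any(ch.isspace() for ch in candidate):
        return False
    parsed = urlparse(candidate)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

Rows where this returns `False` are the ones that need either a new free-text field or manual conversion to a link.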

b. Terms of Use

Mapping: none
Issues:

  • this is a property of the Database (especially) and Tools (sometimes) provided by the service.

c. Ethics policy

Mapping: none
Issues:

  • No field in the new model for this, but look at the data to decide whether it is really needed. If so, a URL to an ethics policy would suffice.

Sustainable support and funding

Mapping: ServiceSubmission:sustainability
Issues:

  • ServiceSubmission:sustainability is "Service funding and sustainability plan, including past and future funding commitments, and number of FTE engaged during the last four years and next year." The values for "Sustainable support and funding" are usually applicable.

What are your motivation(s) for this application

Mapping: ServiceSubmission:motivation
Issues:

  • the text values need mapping to the controlled vocabulary defined for ServiceSubmission:motivation
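A keyword-based first pass at this mapping could look like the sketch below. The vocabulary terms and keywords in the usage example are purely hypothetical placeholders; the real controlled vocabulary for ServiceSubmission:motivation has to be taken from the model.

```python
from typing import Dict, List


def map_motivations(raw: str, keywords_by_term: Dict[str, List[str]]) -> List[str]:
    """Return the vocabulary terms whose keywords occur in the free text."""
    text = raw.casefold()
    return [
        term
        for term, keywords in keywords_by_term.items()
        if any(keyword.casefold() in text for keyword in keywords)
    ]


# HYPOTHETICAL vocabulary, for illustration only.
EXAMPLE_VOCABULARY = {
    "FUNDING": ["funding", "financement"],
    "VISIBILITY": ["visibility", "visibilité"],
}

matched = map_motivations("We seek funding and better visibility", EXAMPLE_VOCABULARY)
```

Values that match nothing come back as an empty list and can be queued for manual mapping.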

support from IFB

Mapping: none
Issues:

  • if these data are still important, a new field is needed

Additional MANDATORY fields (in "ServiceSubmission")

ServiceSubmission:caseForSupport

"The motivation for the service and why it should be supported by IFB and/or included in the SDP."
Issues:

  • check existing columns in CSV again for something appropriate

ServiceSubmission:service

"The service associated with this submission to the ELIXIR FR SDP process."
Issues:

  • NB - A ServiceSubmission must be associated to a Service object

ServiceSubmission:year

"The year when the service was submitted for consideration of inclusion in the French SDP."
Issues:

  • could be auto-generated once the system is up and running

joncison added the "content" label on Sep 11, 2020
joncison added the "high priority" label on Sep 13, 2020
bryan-brancotte added the "needs triage" label on Jun 22, 2022