GEO 2017-2019 Work Programme


Research Data Science Summer Schools

Activity ID: 101


The ever-accelerating volume and variety of data being generated is having a huge impact on a wide variety of research disciplines, from the sciences to the humanities: the international, collective ability to create, share and analyse vast quantities of data is having a profound, transformative effect. What can justly be called the ‘Data Revolution’ offers many opportunities coupled with significant challenges. Prominent among these is the need to develop the necessary professions and skills. There is a recognized need for individuals with the combination of skills necessary to optimize use of the new data sets. Such individuals may have a variety of different titles: Data Scientist, Data Engineer, Data Analyst, Data Visualizer, Data Curator. All of them are essential in making the most of the data generated.

Contemporary research – particularly when addressing the most significant, transdisciplinary research challenges – cannot effectively be done without a range of skills relating to data. This includes the principles and practice of Open Science and research data management and curation, the use of a range of data platforms and infrastructures, large-scale analysis, statistics, visualization and modeling techniques, software development and annotation, etc. We define the ensemble of these skills as ‘Research Data Science’.

It is a strategic priority for both CODATA and the Research Data Alliance to build capacity and to develop skills, training young researchers in the principles of Research Data Science. Particular attention is paid to the needs of young researchers in low and middle income countries (LMICs). It is important that Open Data and Open Science benefit research in LMICs and that the unequal ability to exploit these developments does not become another lamentable aspect of the ‘digital divide’. On the contrary, it has been argued that the ‘Data Revolution’ provides a notable opportunity for reducing that divide in a number of respects.

This activity relates most specifically to the GEO Strategic Objective of ‘Engage’ and the ‘Capacity Building’ activity therein. The promotion and development of data science skills, as described here, is an important component of capacity building and essential to the greater use and reuse of Earth observation data in support of the Societal Benefit Areas.

The vision for the schools is a series of data science short courses that use a quality-assured set of reusable material, are supported by online delivery, and are quality controlled and accredited by an appropriate body or bodies so that they can count towards students’ postgraduate qualifications. The CODATA-RDA Working Group is seeking to put the mechanisms for these important features in place.

The CODATA-RDA Research Data Science Summer Schools will:

  • address a recognized need for Research Data Science skills across disciplines;
  • follow an accredited curriculum;
  • provide a pathway from a broad introductory course for all researchers (Vanilla) through more advanced and specialized courses (Flavors and Toppings);
  • be reproducible: all materials will be online with Open licenses;
  • be scalable: emphasis will be placed on Training New Teachers (TNT) and building sustainable partnerships.

Activities for the period

1) Vanilla School

The first school, named ‘Vanilla’ by analogy to the most basic flavour of ice cream, will provide a bedrock of introductory material, common to all research disciplines, upon which more advanced schools can build. This school is designed to run for up to two weeks; for what the participants will gain, see the Reference Document. The programme will be run in partnership with the Software and Data Carpentry communities and the UK’s Digital Curation Centre. Other partnerships are being explored.

2) Flavoured Schools

Schools following Vanilla will be more advanced and specialized, tailored as required to the ‘Research Data Science’ needs of particular disciplines. Such ‘flavoured’ schools, which will run for one or two weeks, will give students more specialized knowledge of Data Science as it is applied in a specific disciplinary research context. A flavoured school will not necessarily run directly after a Vanilla school and may be held in a completely different location.

Discussions are on-going on schools on:

Future Plans

The Working Group is liaising with a number of partners to host schools in future years. The initiative builds on events held by CODATA in Beijing, Nairobi and Bangalore. In addition to the various organisations mentioned, the WG is exploring whether the regional offices of the International Council for Science and The World Academy of Sciences can host schools from 2017.

Strong emphasis will be placed on Training New Teachers. Specific course components and accreditation will be established for participants wishing to instruct on and lead future schools.


  1. The first full introductory (Vanilla) course took place on 1-12 August 2016 at the Abdus Salam International Centre for Theoretical Physics (ICTP) in Trieste, Italy. As host, and following its general practice, the ICTP provided accommodation and subsistence for up to 120 students. The ICTP committed 15K euros, TWAS 10K euros and CODATA at least 5K euros to support student travel;
  2. The current funding from ICTP, TWAS and CODATA will be prioritized for participants from LMICs. The Working Group is looking for additional support from partner organizations, funders and sponsors.  Thanks to the hosting support, funds will be used entirely for student and instructor travel; 
  3. Resources for Flavoured Schools will be confirmed once the schools themselves are confirmed.


Leadership & Contributors (this list is being populated)




Implementing Entity



Simon Hodson