Code the Fly is a citizen science activity in which participants analyse the Drosophila melanogaster genome with bioinformatics tools to estimate the frequencies of mobile elements (also called transposons) in different populations.

Code the Fly is a dynamic, virtual program aimed at university students, high school students and members of the public who are curious about bioinformatics, genetics and big data. The activity offers participants a first contact with scientific research and an introduction to bioinformatics, helps them develop their talent and their understanding of the scientific process, and supports the development of the citizen science project Melanogaster: Catch the Fly (#MelanogasterCTF). In addition, Code the Fly allows students from centres located in arid climates, who have already collaborated with the project by collecting and classifying Drosophila biological samples, to continue participating once they finish their studies at their school.

Code the Fly allows participants to work autonomously and at their own pace through the open online platform Stepik. On the Stepik platform, collaborators can find a bioinformatics module with all the educational materials created by the #MelanogasterCTF scientific team specifically for this activity. The module contains enough information to give an introductory overview of concepts such as DNA, genomes and mobile elements, along with guided steps to install all the necessary software, generate the bioinformatics data and send the results to the online platform.

To generate the data, we use the T-lex3 software, developed by the Laboratory of Evolutionary and Functional Genomics at the Institute of Evolutionary Biology of Barcelona (IBE, CSIC-UPF). T-lex3 detects mobile elements and estimates their frequency in different populations of Drosophila melanogaster by analysing sequenced DNA data that are openly available in a database. The data generated by the participants are then sent to the online platform to be evaluated. The results are validated once several collaborators have sent equivalent findings, or flagged if some of the submitted results are inconsistent. Once the data have been validated, they are used by the research personnel of the European Drosophila Population Genomics Consortium (DrosEU). Currently, one of the needs of DrosEU scientists is to increase the number of mobile elements analysed in the Drosophila melanogaster genome. Through this interaction, participants collaborate directly in a real research project and help advance science by expanding the available data.
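The validation step described above can be illustrated with a minimal Python sketch. It is not the platform's actual implementation: the text does not say how many collaborators must agree or how "equivalent" is defined, so the function name `review_submissions`, the threshold `min_agreeing` and the result labels ("present", "absent") are all assumptions made for illustration.

```python
from collections import Counter


def review_submissions(calls, min_agreeing=3):
    """Classify the results submitted for one mobile element.

    calls: list of result labels sent by different participants,
           e.g. "present" or "absent" (hypothetical labels).
    min_agreeing: number of identical results required before the
           result counts as validated (assumed value, not from the source).

    Returns a (status, label) tuple where status is one of
    "validated", "flagged" or "pending".
    """
    if not calls:
        # Nobody has analysed this element yet.
        return ("pending", None)

    counts = Counter(calls)

    if len(counts) > 1:
        # Participants disagree: mark the element for review.
        return ("flagged", None)

    # All submissions agree; check whether there are enough of them.
    label, n = next(iter(counts.items()))
    if n >= min_agreeing:
        return ("validated", label)
    return ("pending", label)
```

In this sketch, three identical submissions are validated, any disagreement is flagged, and a result with too few submissions simply waits for more participants. A real platform might instead tolerate small numerical differences or use a majority vote.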


Code the Fly organization


The activity is structured as follows:

Part I: Introduction to the activity and software

  • Brief introduction to the Code the Fly project, transposable elements and T-lex3.
  • Introduction to the virtual participation platform Code the Fly.
  • T-lex3 installation assistance.
  • Description of the datasets to be analysed.

Part II: Autonomous work by the participants

  • Time for the participants to run T-lex3 and enter the results on the Code the Fly platform.
  • Online service for answering questions and resolving possible technical problems.


Part III: Results analysis

  • Sharing of the results obtained so far.
  • Preparation for selecting other datasets from the SRA database suitable for the project objectives.


The Code the Fly activity is run under the supervision of Dr. Sònia Casillas from the Genomics, Bioinformatics and Evolution group (GGBE) of the Department of Genetics and Microbiology of the Autonomous University of Barcelona (UAB). It is carried out within the citizen science project #MelanogasterCTF, which is directed by the Laboratory of Evolutionary and Functional Genomics of the Institute of Evolutionary Biology (IBE, CSIC-UPF) and the science dissemination platform Science in Your World (LCATM). The European Drosophila Population Genomics Consortium (DrosEU) also participates in the project. This project is part of a collaboration with the Spanish Foundation for Science and Technology-Ministry of Science, Innovation and Universities (FECYT), the European Research Council (ERC, H2020-ERC-2014-CoG-647900), and the CSIC General Foundation (FGCSIC).