Team:Heidelberg/Tempaltes/iGEM42-W-20c

From 2013.igem.org

Data update and completion

Many automatic and manual corrections were applied to the data. These included synchronising the track and award names and fixing duplicated or missing awards.

Name synchronisation

The names of the tracks and awards were not kept consistent over the years, although their meaning stayed essentially the same. One example is the extension of the 2007 Energy track to the Food or Energy track in 2008. The same applies to other combined tracks. There were also minor differences that needed fixing, for example medal names starting with upper or lower case letters. Most of these were resolved with regular expressions when converting the data from JSON to R. See table 23.2 for all synchronisations made.

Championship awards
Regular expression | Replacement | Covered occurrences
Grand Prize | Grand Prize
  • Grand Prize
  • Grand Prize, Winner of the BioBrick Trophy
  • Grand Prize Winner
(1st)|(First) Runner Up | 1st Runner Up
  • 1st Runner Up
  • First Runner Up
  • 1st Runner Up, Winner of the PoPS Prize
(2nd)|(Second) Runner Up | 2nd Runner Up
  • 2nd Runner Up
  • Second Runner Up
  • 2nd Runner Up, Winner of the Synthetic Standard
Environment | Best Environment Project
  • Best Environment Project
  • Best Environmental Project
  • Environmental Sensing
Energy | Best Food & Energy Project
  • Best Food & Energy Project
  • Best Food or Energy Project
  • Energy
Health | Best Health & Medicine Project
  • Best Health & Medicine Project
  • Best Health or Medicine Project
  • Health & Medicine
Foundational | Best Foundational Advance Project
  • Best Foundational Advance Project
  • Best Foundational Advance
  • Best Foundational Tech.
  • Foundational Research
New Application | Best New Application Project
  • Best New Application Project
  • Best New Application Area
Part, Natural | Best New BioBrick Part, Natural
  • Best New BioBrick Part, Natural
  • Best New BioBrick Part, Natural, Runner Up
  (differentiating between great teams at this level does not serve the purpose of the tool)
Best Model | Best Model
  • Best Model
  • Best Modeling / Sim.
Information Processing | Best Information Processing Project
  • Best Information Processing Project
  • Information Processing
Software Tool | Best Software
  • Best Software Tool
  (Best Software Tool is merged with Best Software)
Presentation | Best Presentation
  • Best Presentation
  • Best Presentation, Runner Up
Regional awards
Regular expression | Replacement | Covered occurrences
Grand Prize | Grand Prize | Regional prizes always end with the corresponding region. Apart from this, the same minor differences as for the championship awards are removed.
Finalist | Regional Finalist
Human Practices | Best Human Practices Advance
Experimental Measurement | Best Experimental Measurement Approach
Model | Best Model
Device, Engineered | Best New BioBrick Device, Engineered
Part, Natural | Best New BioBrick Part, Natural
Standard | Best New Standard
Poster | Best Poster
Presentation | Best Presentation
Wiki | Best Wiki
Safety | Safety Commendation
Medals, Regions, Tracks
Regular expression | Replacement | Covered occurrences
[Bb]ronze | Bronze | Medals starting with upper or lower case letters
[Ss]ilver | Silver
[Gg]old | Gold
America | America | All regions on the American continents were merged.
US | America
Canada | America
Medic | Health & Medicine
  • Health & Medicine
  • Health/Medicine
  • Medical
Energy | Food & Energy
  • Food & Energy
  • Food/Energy
  • Energy
Foundational | Foundational Advance
  • Foundational Advance
  • Foundational Research
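
The synchronisation described above can be illustrated with a minimal sketch. The original conversion was done on the way from JSON to R; the Python version below is only an illustration, and the function name normalise as well as the rule "the first matching pattern decides the canonical name" are assumptions, not taken from the tool. The patterns and replacements themselves come from the Medals, Regions, Tracks table.

```python
import re

# (pattern, replacement) pairs copied from the "Medals, Regions, Tracks" table.
TRACK_RULES = [
    (r"Medic", "Health & Medicine"),
    (r"Energy", "Food & Energy"),
    (r"Foundational", "Foundational Advance"),
]
MEDAL_RULES = [
    (r"[Bb]ronze", "Bronze"),
    (r"[Ss]ilver", "Silver"),
    (r"[Gg]old", "Gold"),
]


def normalise(name, rules):
    """Return the replacement of the first rule whose pattern occurs in name.

    Assumption: a match anywhere in the name replaces the whole name.
    Names that match no rule are returned unchanged.
    """
    for pattern, replacement in rules:
        if re.search(pattern, name):
            return replacement
    return name


if __name__ == "__main__":
    for raw in ["Health/Medicine", "Medical", "Food/Energy", "Foundational Research"]:
        print(raw, "->", normalise(raw, TRACK_RULES))
    print("gold", "->", normalise("gold", MEDAL_RULES))
```

Replacing the whole name rather than only the matched part is what allows a short pattern such as Medic to cover Health & Medicine, Health/Medicine and Medical at once.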

Updated scoring function

After the in-depth curation of all award names, the scoring function had to be updated: some awards had never been included in the scoring because they are rarely awarded, and others were lost through the naming issues. After the list of scores was updated, the data conversion to RData was rerun and the scoring was briefly checked by reviewing the list of the "best" teams.
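
One way to catch awards that are missing from the scoring is to compare the award names found in the data with the names in the score list. The sketch below is in Python, not the R code of the tool. The award names are taken from the tables above, but the point values, the team names and both function names are invented for illustration.

```python
# Hypothetical score list: the point values are made up, not the tool's real scores.
SCORES = {
    "Grand Prize": 10,
    "1st Runner Up": 8,
    "Best Model": 3,
}


def unscored_awards(team_awards, scores):
    """Return the sorted award names that occur in the data but have no score."""
    seen = {award for awards in team_awards.values() for award in awards}
    return sorted(seen - set(scores))


def team_score(awards, scores):
    """Sum the scores of a team's awards; awards without a score count as 0."""
    return sum(scores.get(award, 0) for award in awards)


if __name__ == "__main__":
    # Hypothetical teams, for illustration only.
    data = {
        "Team A": ["Grand Prize", "Best Wiki"],
        "Team B": ["Best Model"],
    }
    print(unscored_awards(data, SCORES))
    print({team: team_score(awards, SCORES) for team, awards in data.items()})
```

An award that shows up in the first list contributes nothing to any team's score, which is the kind of gap the update described above had to close.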

Manual data curation

For some reason the 2011 championship awards were entirely missing from the standard results page. We therefore had to go directly to the jamboree results page and add the right awards to every 2011 team. This was done directly in the JSON file, and the RData file was updated immediately afterwards.

Bug-fix: NA-values

The award filters match against the full team list to retrieve the names of the teams to keep in the data set. These names are then used to select rows from the already reduced data set, which produces empty rows for teams that match the award filter but had already been removed from the data frame by another filter before the award matching. These empty rows are named "NA.number". To get rid of them, all rows whose names contain "NA" were removed. This was a really bad idea: the names of the UNAM MEXICO teams also contain "NA", so these teams were removed as well, and we spent a lot of time trying to find out why our tool didn't like them. The bug was fixed by adding a dot to the regular expression and separately removing the first empty row, whose name is exactly "NA".
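
The difference between the two patterns can be shown with a small sketch. It is written in Python, not the R code of the tool; the row names, the example team name UNAM_Example_Mexico and both function names are made up. It also assumes that the added dot is escaped so that it matches a literal dot, because an unescaped dot would still match the "NAM" in "UNAM".

```python
import re

# Made-up row names: "NA" and "NA.<number>" stand for the empty rows
# described above, the other two are team names.
row_names = ["Heidelberg", "NA", "UNAM_Example_Mexico", "NA.1", "NA.2"]


def drop_buggy(names):
    """Original approach: drop every name containing "NA" (also hits "UNAM")."""
    return [n for n in names if not re.search(r"NA", n)]


def drop_fixed(names):
    """Fixed approach: drop names containing "NA." and the exact name "NA"."""
    return [n for n in names if not re.search(r"NA\.", n) and n != "NA"]


if __name__ == "__main__":
    print(drop_buggy(row_names))
    print(drop_fixed(row_names))
```

A stricter alternative, not described in the text, would be to anchor the pattern to the whole row name, for example `^NA(\.\d+)?$`, so that a team name could never match by accident.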