Team:Heidelberg/Tempaltes/iGEM42-W-4

From 2013.igem.org

Revision as of 23:11, 4 October 2013 by JuliaS1992

Wiki Scraping

  • whitespace handling added
  • track scraping added
  • output file can be specified via the command line
  • bug fixes:
    • exiting the script
    • proper construction of spider objects
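
The whitespace handling and the command-line output file listed above could look roughly like the following. This is only a sketch: it assumes the scraper is written in Python, and the function names, the `--output` option and the default file name `teams.json` are hypothetical, not taken from the actual script.

```python
import argparse
import re


def normalize_whitespace(text):
    """Collapse runs of whitespace (spaces, tabs, newlines) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def parse_args(argv=None):
    """Read the output file path from the command line (option name is an assumption)."""
    parser = argparse.ArgumentParser(description="Scrape team wiki pages (sketch)")
    parser.add_argument(
        "-o", "--output",
        default="teams.json",  # hypothetical default
        help="path of the JSON file to write",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print("Results would be written to", args.output)
```

Passing `argv` explicitly keeps `parse_args` easy to call from tests without touching the real command line.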

!! To be done by Ilia !!

Data conversion to R

  • JSON file needs to be converted to R-compatible data for the analysis
  • target file contains one data frame for all single-value parameters and one list for all multiple-value parameters / big text contents.
  • Unique naming of the teams is achieved by combination of name and year
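
A minimal sketch of the unique naming by name and year, again assuming Python for illustration. The function name `team_key` and the underscore separator are assumptions; the notes only state that name and year are combined.

```python
def team_key(name, year):
    """Build a unique team identifier from the team name and the year.

    The same team name can appear in several years, so the year is appended.
    The underscore separator is an arbitrary choice for this sketch.
    """
    return f"{name.strip()}_{int(year)}"


# The same team name in two different years yields two distinct keys.
keys = {team_key("Heidelberg", 2013), team_key("Heidelberg", 2010)}
print(sorted(keys))
```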
Data frame parameters:
  • Numerical:
    • year
    • students count
    • advisors count
    • instructors count
    • regional awards count
    • championship awards count
    • biobrick count
  • Character strings:
    • region
    • wiki
    • url (Team overview page)

List elements:
  • year
  • character vector of regional awards
  • character vector of championship awards
  • parts range
  • advisor names
  • project title
  • abstract
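
The split into single-value and multiple-value parameters could be prepared before the data reaches R, for example as below. This is a sketch under assumptions: the JSON field names (`students_count`, `regional_awards`, and so on) are hypothetical stand-ins for the parameters listed above, and the actual conversion may well be done directly in R instead.

```python
# Hypothetical field names mirroring the two groups listed above.
SINGLE_VALUE_FIELDS = [
    "year", "students_count", "advisors_count", "instructors_count",
    "regional_awards_count", "championship_awards_count", "biobrick_count",
    "region", "wiki", "url",
]
MULTI_VALUE_FIELDS = [
    "year", "regional_awards", "championship_awards", "parts_range",
    "advisor_names", "project_title", "abstract",
]


def split_teams(teams):
    """Split scraped team records into data frame rows and list elements.

    Both results are dictionaries keyed by "<name>_<year>", so that a row
    and its list entry can be matched up again later.
    """
    rows, lists = {}, {}
    for record in teams:
        key = f"{record['name']}_{record['year']}"
        if key in rows:
            raise ValueError(f"duplicate team key: {key}")
        # Missing fields become None, which maps to a missing value in R.
        rows[key] = {field: record.get(field) for field in SINGLE_VALUE_FIELDS}
        lists[key] = {field: record.get(field) for field in MULTI_VALUE_FIELDS}
    return rows, lists


# Made-up example record, only to show the shape of the output.
example = [{
    "name": "ExampleTeam", "year": 2013, "region": "Europe",
    "students_count": 12, "regional_awards": ["Example Award"],
    "abstract": "Example abstract text.",
}]
rows, lists = split_teams(example)
print(rows["ExampleTeam_2013"]["region"])
print(lists["ExampleTeam_2013"]["regional_awards"])
```

Keeping the year in both structures, as in the listing above, lets either one be used on its own.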