Team:Heidelberg/Tempaltes/iGEM42-W-9b

From 2013.igem.org

Revision as of 00:44, 5 October 2013

Text analysis drafts

For the text analysis we used Python and its NLTK platform (see http://nltk.org). The only text corpus used was the stopword corpus.
Regardless of the analysis performed, a "stemmer" was first run on the abstracts to reduce every word to its base form. The top words were determined by simple counting, and the information content was calculated as the proportion of stopwords in the whole text.
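The counting steps can be illustrated with a minimal sketch. It is not the code used for the project: the small STOPWORDS set is a hypothetical stand-in for NLTK's stopword corpus (nltk.corpus.stopwords), toy_stem is a crude placeholder for a real stemmer such as nltk.stem.PorterStemmer, and excluding stopwords before counting the top words is an assumption.

```python
import re
from collections import Counter

# Hypothetical stand-in for NLTK's stopword corpus (nltk.corpus.stopwords).
STOPWORDS = {"the", "of", "a", "in", "and", "is", "to", "for", "are"}


def toy_stem(word):
    """Crude suffix stripper; a placeholder for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_words(text, n=3):
    """Count stemmed non-stopwords and return the n most frequent."""
    stems = [toy_stem(w) for w in tokenize(text) if w not in STOPWORDS]
    return Counter(stems).most_common(n)


def stopword_proportion(text):
    """Fraction of all tokens in the text that are stopwords."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(w in STOPWORDS for w in tokens) / len(tokens)
```

For an abstract such as "The promoters of the gene are engineered; engineering a promoter is the goal.", top_words groups "promoters" and "promoter" under one stem, and stopword_proportion reports the share of the 13 tokens that are stopwords.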
To extract the MeSH terms, a list of synthetic biology terms from .... was used. The stemmer was applied to both the abstract and the term list, and the matches were again counted.
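The term matching can be sketched in the same spirit. This is only an illustration under stated assumptions: the term list in the usage note is invented, only single-word terms are handled, and toy_stem is a crude placeholder for a real stemmer such as nltk.stem.PorterStemmer.

```python
import re
from collections import Counter


def toy_stem(word):
    """Crude suffix stripper; a placeholder for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def count_term_matches(abstract, terms):
    """Stem both the abstract and the term list, then count the matches."""
    stemmed_terms = {toy_stem(t.lower()) for t in terms}
    tokens = re.findall(r"[a-z]+", abstract.lower())
    stems = (toy_stem(w) for w in tokens)
    return Counter(s for s in stems if s in stemmed_terms)
```

Calling count_term_matches with the hypothetical terms ["promoters", "plasmid", "biosensor"] on "A promoter drives the biosensors; two promoters were tested on one plasmid." counts the stem "promoter" twice, because stemming maps the singular and plural forms to the same key.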