Team:TU Darmstadt/Modelling/Statistics
Information Theory
The DKL Analysis
In information theory, the Kullback-Leibler divergence (DKL) quantifies the distance between two distributions P and Q, where P denotes an experimental distribution that is compared against a reference distribution Q. The DKL is also known as 'relative entropy' and is closely related to mutual information, which is the DKL between a joint distribution and the product of its marginals.
Although the DKL is often used as a distance measure, it is not a true metric, because it is not symmetric.
$$ D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)} $$

Here, P(i) and Q(i) denote the densities of P and Q at position i. In our study, we use the DKL to measure how far the survey datasets from the human practice project deviate from a reference distribution. To this end, we compute a histogram from each dataset; it is important to use a constant bin size throughout. In this approach, we assume the hypothetical reference distribution Q to be uniform, and we generate a corresponding reference dataset with the random number generator runif in R.
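The steps above can be sketched in R as follows. This is a minimal illustration, not the actual analysis script: the survey data are hypothetical (rnorm stands in for the real survey answers), and the number of bins and the pseudocount eps are illustrative choices of our own; only runif comes from the text above.

## Minimal sketch: DKL of a (hypothetical) survey dataset against
## a uniform reference generated with runif()
set.seed(42)                                     # reproducibility
survey <- rnorm(1000, mean = 5, sd = 1.5)        # placeholder for real survey answers
ref    <- runif(1000, min(survey), max(survey))  # uniform reference sample for Q

## Histograms with a constant bin size over a common range
breaks <- seq(min(survey), max(survey), length.out = 21)  # 20 equal bins
P <- hist(survey, breaks = breaks, plot = FALSE)$counts
Q <- hist(ref,    breaks = breaks, plot = FALSE)$counts

## Normalise counts to probabilities; a small pseudocount keeps
## log(P/Q) finite in bins that happen to be empty
eps <- 1e-10
P <- (P + eps) / sum(P + eps)
Q <- (Q + eps) / sum(Q + eps)

## Kullback-Leibler divergence D_KL(P || Q)
dkl <- sum(P * log(P / Q))
dkl

A DKL close to zero indicates that the survey answers are spread almost uniformly over the bins, while larger values point to a more structured response pattern.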