Gender BADGE Report

BADGE Report (December 26, 2004)
My Experiment

Materials

The complete database comprises 18 expression measurements of 24568 genes, 9 in the first condition and 9 in the second condition. The recorded expression values range between 0.70 and 104461.30. Only genes with at least one measurement above 150 were considered. As a result, a total of 23022 genes were included in our analysis.
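The filtering step described above can be sketched as follows. This is a minimal illustration on made-up data, not BADGE's own code: the function name `filter_genes`, the dictionary layout, and the toy gene names are assumptions, and only the cutoff of 150 comes from the report.

```python
# Cutoff stated in the report; everything else here is a hypothetical toy setup.
THRESHOLD = 150.0


def filter_genes(expression, threshold=THRESHOLD):
    """Keep genes with at least one measurement strictly above the threshold.

    expression: dict mapping a gene identifier to its list of measurements.
    """
    return {
        gene: values
        for gene, values in expression.items()
        if max(values) > threshold
    }


# Toy data (invented for illustration only).
toy = {
    "gene_a": [0.7, 12.0, 149.9],      # never above 150: dropped
    "gene_b": [30.0, 151.2, 80.0],     # one value above 150: kept
    "gene_c": [150.0, 150.0, 150.0],   # exactly 150 is not "above": dropped
}

kept = filter_genes(toy)
print(sorted(kept))
```

The comparison is strict because the report says "above 150"; whether BADGE treats a value of exactly 150 the same way is not stated.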

Methods

Data were analyzed using BADGE (Bayesian Analysis of Differential Gene Expression) version 1.0 (1), a computer program implementing a Bayesian approach to identify differentially expressed genes across experimental conditions. For each gene, the method computes the posterior probability that the gene is more highly expressed in the first condition than in the second, that is, that its fold change exceeds one. This posterior probability is computed as the weighted average of two posterior probabilities: one computed under the assumption that the observations are generated by a Gamma distribution, and one under the assumption that they are generated by a Log-normal distribution.
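The averaging step can be illustrated with a short sketch. The report does not say how the two weights are obtained, so here the weight of the Gamma model is simply passed in as a hypothetical parameter `w_gamma`, with the Log-normal model receiving the remainder; the function name and the example numbers are likewise assumptions.

```python
def averaged_posterior(p_gamma, p_lognormal, w_gamma):
    """Weighted average of two model-specific posterior probabilities.

    p_gamma:      posterior probability under the Gamma model
    p_lognormal:  posterior probability under the Log-normal model
    w_gamma:      weight of the Gamma model (assumed to lie in [0, 1]);
                  the Log-normal model gets weight 1 - w_gamma
    """
    if not 0.0 <= w_gamma <= 1.0:
        raise ValueError("w_gamma must be between 0 and 1")
    return w_gamma * p_gamma + (1.0 - w_gamma) * p_lognormal


# Invented example values: 0.25 * 0.98 + 0.75 * 0.90
print(averaged_posterior(0.98, 0.90, 0.25))
```

Because the weights sum to one, the averaged value always lies between the two model-specific probabilities.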

A predictive evaluation of the results obtained by this method was performed using leave-one-out cross validation. This technique consists of removing one case at a time from the database, estimating the model parameters from the remaining cases, and predicting the condition of the removed case on the basis of the selected genes and the parameters estimated from the remaining cases. If the predicted condition corresponds to the actual condition of the removed case, the prediction is considered correct; otherwise, it is considered incorrect. The reproducibility of the results was assessed by computing the difference between the posterior probabilities of the genes identified by the same Description.
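The leave-one-out loop described above can be sketched as below. The report does not specify the prediction rule BADGE uses, so a nearest-centroid classifier is swapped in purely as a stand-in; the function names and the toy data are assumptions, and the sketch takes the set of selected genes as already fixed.

```python
def nearest_centroid_predict(train, labels, case):
    """Stand-in classifier (an assumption, not BADGE's rule): assign the case
    to the condition whose mean profile is closest in squared Euclidean distance."""
    centroids = {}
    for label in set(labels):
        rows = [x for x, y in zip(train, labels) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(centroids, key=lambda lab: sqdist(centroids[lab], case))


def leave_one_out_accuracy(cases, labels, predict=nearest_centroid_predict):
    """Hold out each case once, fit on the rest, and score the prediction."""
    correct = 0
    for i in range(len(cases)):
        train = cases[:i] + cases[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if predict(train, train_labels, cases[i]) == labels[i]:
            correct += 1
    return correct / len(cases)


# Toy data: each row is one case, each column one selected gene (invented values).
cases = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [5.0, 6.0], [5.2, 5.9], [4.8, 6.1]]
labels = [1, 1, 1, 2, 2, 2]
print(leave_one_out_accuracy(cases, labels))
```

Any other prediction rule with the same `(train, labels, case)` signature can be passed as `predict` without changing the loop.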

Results

An expected false positive rate of 0.200% selects the genes with a chance of more than 99.900% or less than 0.100% of being more expressed in the first condition. Of these genes, we selected only those with an expected fold change between the first and the second condition, in either direction, higher than 3/2, for a total of 6 positively and 21 negatively changed genes. Figure 1 displays the distribution of the probabilities of all genes in the dataset.

Figure 1. Distribution of the posterior probability of being differentially expressed for each gene in the dataset.
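The two-stage selection rule described in the Results can be sketched as follows. This is an illustration under stated assumptions: the function name, the data layout, and the toy values are hypothetical, and a "negative" fold change is read here as the second-to-first ratio exceeding 3/2, which the report does not spell out.

```python
def select_genes(genes, prob_cutoff=0.999, fold_cutoff=1.5):
    """Split genes into positively and negatively changed lists.

    genes: dict mapping a gene name to (posterior_prob, expected_fold), where
           posterior_prob is the probability of higher expression in the first
           condition and expected_fold is the first/second expression ratio
           (assumed to be strictly positive).
    """
    up, down = [], []
    for name, (prob, fold) in genes.items():
        if prob > prob_cutoff and fold > fold_cutoff:
            up.append(name)
        elif prob < 1.0 - prob_cutoff and 1.0 / fold > fold_cutoff:
            down.append(name)
    return up, down


# Toy values, invented for illustration only.
toy = {
    "g1": (0.9995, 2.0),   # high probability, large fold: positively changed
    "g2": (0.9995, 1.2),   # high probability, but fold change too small
    "g3": (0.0004, 0.5),   # low probability, second/first ratio of 2: negatively changed
    "g4": (0.5000, 3.0),   # large fold, but probability not extreme
    "g5": (0.0004, 0.9),   # low probability, but fold change too small
}
up, down = select_genes(toy)
print(up, down)
```

The default cutoffs correspond to the 99.900% / 0.100% probabilities and the 3/2 fold change quoted in the report.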

Figure 2 displays a colormap of the selected genes. A complete list of the selected genes is available here.

Figure 2. A colormap of the genes selected by the analysis. The intensity of each color denotes the standardized ratio between each value and the estimated average expression of each gene.

Leave-one-out cross validation achieved an accuracy of 94.44%.
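As a quick arithmetic check, the reported accuracy can be compared with the fractions that are possible when each of the 18 cases is held out once. The sketch below assumes accuracy is the number of correct predictions divided by 18; the function name is hypothetical, and the report itself does not state the number of correct predictions.

```python
N_CASES = 18  # total number of expression measurements given in Materials


def loo_accuracy_percent(n_correct, n_cases=N_CASES):
    """Accuracy as a percentage, rounded to two decimals as in the report."""
    return round(100.0 * n_correct / n_cases, 2)


for n_correct in (16, 17, 18):
    print(n_correct, loo_accuracy_percent(n_correct))
```

Under that assumption, 94.44% is consistent with 17 correct predictions out of 18.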

Figure 3 displays the distribution of the distances between the posterior probabilities of differential expression of genes sharing the same Description.

Figure 3. Distribution of the distances between the posterior probabilities of differential expression of genes sharing the same Description. Red dots denote distances exceeding the difference between the maximum and the minimum posterior probability of differential expression.

The average distance between genes with the same Description was 0.26, with a minimum and a maximum distance of 0.00 and 0.96, respectively.
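One way to compute such a reproducibility summary is sketched below. The report does not define the distance precisely, so this sketch assumes it is the absolute difference between the posterior probabilities of every pair of genes sharing a Description; the function name and the toy records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def description_distances(records):
    """Pairwise absolute differences in posterior probability within each Description.

    records: iterable of (description, posterior_probability) pairs.
    Descriptions that occur only once contribute no distance.
    """
    groups = defaultdict(list)
    for description, prob in records:
        groups[description].append(prob)

    distances = []
    for probs in groups.values():
        for a, b in combinations(probs, 2):
            distances.append(abs(a - b))
    return distances


# Toy records, invented for illustration only.
records = [
    ("desc X", 0.75), ("desc X", 0.25),   # same Description, distance 0.5
    ("desc Y", 1.0), ("desc Y", 1.0),     # same Description, distance 0.0
    ("desc Z", 0.5),                      # single entry, no pair
]
d = description_distances(records)
print(sum(d) / len(d), min(d), max(d))
```

The three printed values correspond to the average, minimum, and maximum distance reported in the summary sentence.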

References

1. P Sebastiani, IS Kohane and MF Ramoni (2003). Bayesian Analysis of Differential Gene Expression. Under review.

This report was generated by BADGE v1.0.