Statistical methods in microarrays and high-throughput flow cytometry

Please use this identifier to cite or link to this item: http://hdl.handle.net/1928/10273

Title: Statistical methods in microarrays and high-throughput flow cytometry
Author: Meirelles, Osorio
Advisor(s): Werner-Washburne, Maggie
Committee Member(s): Toolson, Eric
Natvig, Donald
Wearing, Helen
Department: University of New Mexico. Biology Dept.
Subject: Empirical Bayes
Regression Calibration
Flow cytometry
Microarrays
LC Subject(s): Flow cytometry--Statistical methods
Cell populations--Mathematical models
Protein microarrays--Statistical methods
Bayesian statistical decision theory
Regression analysis
Degree Level: Doctoral
Abstract: Abstract-I
Background: Heterogeneous cell populations have previously been described as noisy. However, recent studies have demonstrated that heterogeneity can be biologically significant. We present here an approach for rapid and complete identification of heterogeneous cell populations from high-throughput flow cytometry data. We have developed a novel measure, Slope Differentiation Identification (SDI), which uses flow cytometry-based protein expression to quantify the rate of change in protein expression between two conditions (exponential and stationary phase) of yeast cells, as a function of cell size or cell granularity.
Results: SDI had superior Gene Ontology enrichment when compared with other approaches, such as k-means clustering and an approach based on the bimodality of the fluorescence intensity distribution. Cell populations were also validated using gradient separation followed by microscopy, where proteins with high SDI values showed significant levels of differentiation between high- and low-density cells.
Conclusion: Overall, our approach has identified novel protein expression patterns that differentiate quiescent and non-quiescent cell populations.

Abstract-II
Background: With the advent of genomics, there has been a rapid increase in the use of one- and two-color microarrays to measure mRNA abundance for the entire genome. Variability in microarray analysis undermines its utility in identifying the entire subset of differentially expressed mRNAs. Recent microarray studies have shown that, although variances are assumed to be constant for every hybridized spot within a microarray, variances may differ for each biological sample analyzed (Ritchie, Diyagama et al. 2006). Another common assumption is that log-intensity values for any given gene have a Normal distribution. For many datasets, both assumptions have been shown to be incorrect, resulting in distorted significance levels when testing each gene for differential expression (Bar-Even, Paulsson et al. 2006; Wentzell, Karakach et al. 2006).
Approach: To overcome the limitations of existing approaches in identifying significant, differentially expressed genes, we have developed a novel unsupervised statistical approach called Calibration Regression Analysis of Microarrays (CRAM) that uses a combination of empirical Bayes and regression calibration. The main novelty of our approach is the modeling of gene expression variances as a function of the log-intensity within each sample. A later version, CRAM-GS, captures the association between genes using an adjusted gene correlation measure.
Results: CRAM was compared to four existing approaches for identifying differentially expressed genes. Performance was assessed by the ability to identify co-regulated genes in the same Gene Ontology (GO) process. CRAM exhibited a marginal improvement in GO process enrichment compared with the other approaches. Three more datasets were then added to the original ones; on these, the later version, CRAM-GS, showed a significant improvement compared to CRAM, suggesting a major additional benefit of incorporating gene correlations into the model. All versions of CRAM were two orders of magnitude faster than the existing approaches. Overall, CRAM provides an adaptive, computationally efficient approach for accurate identification of differentially expressed genes.
Graduation Date: December 2009
URI: http://hdl.handle.net/1928/10273


Files in this item

File: Dissertation Osorio 20091109.pdf
Size: 1.393Mb
Format: PDF
