Duquesne CPMA Graduates (advisor: John Kern)
Jingyan Sun (Summer 2007)
Title: THE MATHEMATICS BEHIND SPECIATED ISOTOPE DILUTION MASS SPECTROMETRY.
Abstract: Speciated Isotope Dilution Mass Spectrometry (SIDMS) allows researchers to measure the concentration of species---usually elemental---in a sample by solving a system of non-linear equations. This thesis explores multiple mathematical methods to solve SIDMS equations (two existing, two new), and compares the properties of these solution methods. Simulation analysis is conducted to provide uncertainty estimates.
Interesting tidbit: The two-species non-linear SIDMS equations have a closed-form solution. Newton's method is recommended for solving three-species equations (and beyond), as there exist convergence criteria that can be checked.
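As a rough illustration of the Newton's method recommendation, the sketch below applies a generic multivariate Newton iteration to a small toy system. The SIDMS equations are not reproduced on this page, so the functions `toy_f` and `toy_jac`, the starting point, and the tolerances are all hypothetical stand-ins, not the thesis's equations or code.

```python
import numpy as np


def newton_system(f, jac, x0, tol=1e-10, max_iter=50):
    """Generic Newton iteration for f(x) = 0 with an explicit Jacobian.

    Returns the final iterate, the number of iterations used, and a flag
    saying whether both the step size and the residual fell below tol.
    """
    x = np.asarray(x0, dtype=float)
    for iteration in range(1, max_iter + 1):
        # Solve J(x) * step = -f(x) rather than inverting the Jacobian.
        step = np.linalg.solve(jac(x), -f(x))
        x = x + step
        if np.linalg.norm(step) < tol and np.linalg.norm(f(x)) < tol:
            return x, iteration, True
    return x, max_iter, False


# Hypothetical toy system (NOT the SIDMS equations):
#   x^2 + y^2 = 4
#   x * y     = 1
def toy_f(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x * y - 1.0])


def toy_jac(v):
    x, y = v
    return np.array([[2.0 * x, 2.0 * y],
                     [y, x]])


root, n_iter, converged = newton_system(toy_f, toy_jac, [2.0, 0.5])
print("root:", root, "iterations:", n_iter, "converged:", converged)
```

The returned flag is one simple way to make the convergence check explicit; a real application would also need to guard against a singular Jacobian and a poor starting point.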
Links to paper and presentation slides:
Paper [pdf]
Presentation [pdf]
Nicholas Bernini (Spring 2006)
Title: BAYESIAN ANALYSIS OF DISCRETE LONGITUDINAL DATA.
Abstract: This thesis explores a Bayesian hierarchical model to compare treatment effectiveness for menopausal symptom relief. Specifically, this model recognizes the discrete nature of the data, as well as its time dependency. Bayesian analysis is used to make inference on each individual profile, as well as on a group profile for each treatment group.
Interesting tidbit: By modeling the individual profiles of all subjects in a particular group, inference on a complete group profile can be made. Acupuncture administered in supposedly "non-effective" areas performs as well as acupuncture administered in supposedly "effective" areas.
Links to paper and presentation slides:
Paper [pdf]
Presentation [pdf]
Joseph Jordan (Spring 2005)
Title: BAYESIAN HIERARCHICAL MODELING FOR LONGITUDINAL FREQUENCY DATA.
Abstract: This research develops a longitudinal frequency model for data collected regularly for several individuals over an extended time period. This model must recognize explicitly the discrete nature of the data, as well as any dependence that exists among an individual's time-consecutive measurements. Motivated by a study investigating alternative treatments for relief of menopausal symptoms, we apply this model to actual study data in an effort to compare treatment effectiveness. We propose a Bayesian hierarchical model to describe not only frequency measurements, but also the parameters that govern an individual profile.
Interesting tidbit: By modeling individually the profiles of all subjects in a particular group, inference on an overall group mean can be made.
Links to paper and presentation slides:
Paper [pdf]
Presentation [ppt]
Sara Bennett (Spring 2004)
Title: ANALYSIS OF FACTORS THAT INFLUENCE MEMBER TURNOVER IN A HEALTH INSURANCE PLAN.
Abstract: In this research, we implement a multiple logistic regression model in which the coefficients of indicator variables are constrained to be zero or positive. By doing this, the contribution of each dichotomous variable to the failure probability can be assessed. Due to this restriction on the coefficients, a Bayesian approach to parameter estimation, which assigns mixture priors to the coefficients, is taken. The data are provided by a large health insurance company in western Pennsylvania and include the enrollment status and corresponding values of 84 predictor variables for 1,280,612 individuals. The insurer feels the analysis is needed to determine why its membership is declining, why its cost trend is higher than the national average, and what logical steps can be taken to reverse the current trends.
Interesting tidbit: Disenrollment probability increases if an individual is a resident of Mercer County or has been diagnosed with vascular disease. Knowing whether an individual is insured through a local vs. national company does not help predict disenrollment probability.
Links to paper and presentation slides:
Paper [pdf]
Presentation [pdf]
Jennifer Borgesi (Spring 2004)
Title: A PIECEWISE LINEAR GENERALIZED POISSON REGRESSION APPROACH TO MODELING LONGITUDINAL FREQUENCY DATA.
Abstract: In this research we consider experiments that generate longitudinal frequency data. Often these data come from two or more experimental groups. Experiments that yield such data are common in the medical field and are often designed with the purpose of ascertaining differences among experimental groups. Standard modeling techniques, such as repeated measures ANOVA, are inadequate for application to longitudinal frequency data because they ignore the correlation between the measurements as well as the discrete nature of the data. We present a piecewise-linear, generalized Poisson regression model for longitudinal frequency data. Based on the generalized Poisson distribution, this model is flexible enough to allow for (and detect) underdispersion, equidispersion, or overdispersion in the data. We apply this model to frequency data collected from a clinical trial studying the symptoms of menopausal women. A simulation study that implements a generalized Poisson model for univariate data is also provided.
Interesting tidbit: There are limits to the underdispersion modeling capabilities of a generalized Poisson distribution.
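To make the dispersion remark concrete, here is a minimal sketch that assumes the thesis uses Consul's parameterization of the generalized Poisson distribution, P(X = x) = theta (theta + lam x)^(x-1) exp(-theta - lam x) / x!, which this page does not state. Under that assumption, lam > 0 gives overdispersion, lam = 0 recovers the ordinary Poisson, and lam < 0 gives underdispersion, but the support must be cut off once theta + lam x is no longer positive, which is one source of the limits mentioned above. The function names and parameter values are hypothetical.

```python
import math


def generalized_poisson_pmf(theta, lam, max_x=200):
    """Probabilities P(X = 0), P(X = 1), ... under Consul's form (assumed).

    For lam < 0 the support is truncated at the last x with
    theta + lam * x > 0, so the list may be shorter than max_x + 1.
    """
    if theta <= 0 or lam >= 1:
        raise ValueError("need theta > 0 and lam < 1")
    probs = []
    for x in range(max_x + 1):
        base = theta + lam * x
        if base <= 0:
            break  # truncation point for negative lam
        # Work in log space to avoid overflow in base**(x - 1) and x!.
        log_p = (math.log(theta) + (x - 1) * math.log(base)
                 - base - math.lgamma(x + 1))
        probs.append(math.exp(log_p))
    return probs


def summarize(probs):
    """Total mass, mean, and variance/mean ratio of the (renormalized) pmf."""
    total = sum(probs)
    mean = sum(x * p for x, p in enumerate(probs)) / total
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs)) / total
    return total, mean, var / mean


total_over, mean_over, index_over = summarize(generalized_poisson_pmf(2.0, 0.3))
total_under, mean_under, index_under = summarize(generalized_poisson_pmf(2.0, -0.3))
print("lam = 0.3:  total", total_over, "dispersion index", index_over)
print("lam = -0.3: total", total_under, "dispersion index", index_under)
```

A dispersion index (variance divided by mean) above 1 indicates overdispersion and below 1 indicates underdispersion. For negative lam the truncated probabilities need not sum exactly to 1, which is why the summary renormalizes before computing moments.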
Links to paper and presentation slides:
Paper [pdf]
Presentation [pdf]
Yangchun Du (Spring 2003)
Title: MULTIPLE CORRESPONDENCE ANALYSIS IN MARKET RESEARCH.
Abstract: Multiple Correspondence Analysis (MCA) is a data mining tool used to display graphically the relationships among the categories of several categorical variables. Data collected across such variables are used by MCA algorithms to assign to each category of each categorical variable a two-dimensional coordinate in a special manner: categories whose coordinates are close (in Euclidean distance) share a greater association than those categories whose coordinates are relatively farther apart. This research dissects the MCA algorithm currently used by SAS software as well as the algorithm proposed by Greenacre (1988). Features and properties of MCA are highlighted through application to simulated data. We then apply MCA to a brand preference data set provided by Management Science Associates, Inc. Comparison of standard MCA with that of Greenacre in these applications reveals little meaningful difference.
Interesting tidbit: A data set twice the size of an original data set (constructed by duplicating the original) yields the exact same correspondence analysis as the original. (Complete proof included.)
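The duplication property can be checked numerically with a small sketch. The version below assumes a textbook formulation of MCA as simple correspondence analysis of the indicator (one-hot) matrix, computed through an SVD of standardized residuals; it is not the SAS algorithm or Greenacre's adjustment examined in the thesis, and the random data and function names are hypothetical.

```python
import numpy as np


def indicator_matrix(data):
    """One-hot encode each categorical column and stack the blocks."""
    blocks = []
    for j in range(data.shape[1]):
        levels = np.unique(data[:, j])
        blocks.append((data[:, [j]] == levels).astype(float))
    return np.hstack(blocks)


def ca_column_solution(Z):
    """Singular values and principal column coordinates for matrix Z."""
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    _, sv, Vt = np.linalg.svd(S, full_matrices=False)
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return sv, col_coords


rng = np.random.default_rng(0)
data = rng.integers(0, 3, size=(30, 3))   # 30 subjects, 3 variables
doubled = np.vstack([data, data])         # the same subjects, listed twice

sv_orig, coords_orig = ca_column_solution(indicator_matrix(data))
sv_dbl, coords_dbl = ca_column_solution(indicator_matrix(doubled))

# Individual axes are only defined up to sign, so compare the singular
# values and the matrix of inner products between category points.
same_inertia = np.allclose(sv_orig, sv_dbl)
same_geometry = np.allclose(coords_orig @ coords_orig.T,
                            coords_dbl @ coords_dbl.T)
print("same singular values:", same_inertia)
print("same category geometry:", same_geometry)
```

Comparing inner products rather than raw coordinates sidesteps the sign ambiguity of SVD axes while still confirming that the distances between category points are unchanged.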
Links to paper and presentation slides:
Paper [pdf]
Presentation [pdf]