Joseph C Watkins
 Professor, Mathematics
 Director, Data Science Academy
 Professor, BIO5 Institute
 Professor, Applied Mathematics - GIDP
 Professor, Genetics - GIDP
 Professor, Public Health
 Member of the Graduate Faculty
 Professor, Statistics - GIDP
 (520) 621-5245
 MATHEMATICS, Rm. 115
 TUCSON, AZ 85721-0089
 jwatkins@math.arizona.edu
Degrees
 Ph.D. Mathematics
 University of Wisconsin, Madison, Wisconsin, United States
 A Central Limit Problem in Random Evolutions
 M.S. Mathematics
 University of Wisconsin, Madison, Wisconsin, United States
 none
 M.A. Mathematics
 University of Tennessee, Knoxville, Tennessee, United States
 Second Quantization
 B.A. Mathematics
 University of Tennessee, Knoxville, Tennessee, United States
Work Experience
 University of Arizona, Tucson, Arizona (2007 - Ongoing)
 University of Arizona, Tucson, Arizona (1996 - 2007)
 University of Arizona, Tucson, Arizona (1992 - 1996)
 Northwestern University, Evanston, Illinois (1987)
 University of Southern California, Los Angeles, California (1986 - 1992)
 Institute for Mathematics and its Applications, University of Minnesota (1985)
 University of British Columbia, Vancouver, British Columbia (1982 - 1985)
 Freie Universität Berlin (1980)
Interests
Teaching
probability and statistics, stochastic processes, quantitative and mathematical biology
Research
probability theory and stochastic processes, statistics, applications to the life sciences, especially to genetics and genomics
Courses
2024-25 Courses

Honors Thesis
DATA 498H (Fall 2024) 
Topics in Math
MATH 596A (Fall 2024)
2023-24 Courses

Honors Thesis
DATA 498H (Spring 2024) 
Honors Thesis
MATH 498H (Spring 2024) 
Intro Statistical Method
DATA 363 (Spring 2024) 
Intro Statistical Method
MATH 363 (Spring 2024) 
Thesis
STAT 910 (Spring 2024) 
Topics in Math
MATH 596A (Spring 2024) 
Honors Thesis
DATA 498H (Fall 2023) 
Honors Thesis
MATH 498H (Fall 2023) 
Topics in Math
MATH 596A (Fall 2023)
2022-23 Courses

Honors Thesis
DATA 498H (Spring 2023) 
Topics in Math
MATH 596A (Spring 2023) 
Honors Thesis
DATA 498H (Fall 2022) 
Topics in Math
MATH 596A (Fall 2022)
2021-22 Courses

Topics in Math
MATH 596A (Spring 2022) 
Topics in Math
MATH 596A (Fall 2021)
2020-21 Courses

Thesis
STAT 910 (Spring 2021) 
Topics in Math
MATH 596A (Spring 2021) 
Theory of Probability
MATH 564 (Fall 2020) 
Theory of Probability
STAT 564 (Fall 2020) 
Topics in Math
MATH 596A (Fall 2020)
2019-20 Courses

Honors Thesis
DATA 498H (Spring 2020) 
Intro to Statistical Computing
DATA 375 (Spring 2020) 
Theory of Statistics
MATH 566 (Spring 2020) 
Theory of Statistics
STAT 566 (Spring 2020) 
Thesis
STAT 910 (Spring 2020) 
Topics in Math
MATH 596A (Spring 2020) 
Honors Thesis
MATH 498H (Fall 2019) 
Independent Study
STAT 599 (Fall 2019) 
Research
STAT 900 (Fall 2019) 
Theory of Probability
MATH 564 (Fall 2019) 
Theory of Probability
STAT 564 (Fall 2019) 
Thesis
STAT 910 (Fall 2019) 
Topics in Math
MATH 596A (Fall 2019)
2018-19 Courses

Intro Statistical Method
DATA 363 (Spring 2019) 
Research
STAT 900 (Spring 2019) 
Topics in Math
MATH 596A (Spring 2019) 
Intro Statistical Method
MATH 363 (Fall 2018) 
Research
STAT 900 (Fall 2018) 
Topics in Math
MATH 596A (Fall 2018)
2017-18 Courses

Thesis
STAT 910 (Summer I 2018) 
Directed Research
MATH 492 (Spring 2018) 
Dissertation
STAT 920 (Spring 2018) 
Independent Study
MATH 499 (Spring 2018) 
Intro Statistical Method
MATH 363 (Spring 2018) 
Thesis
STAT 910 (Spring 2018) 
Topics in Math
MATH 596A (Spring 2018) 
Dissertation
STAT 920 (Fall 2017) 
Intro Statistical Method
MATH 363 (Fall 2017) 
Thesis
STAT 910 (Fall 2017) 
Topics in Math
MATH 596A (Fall 2017)
2016-17 Courses

Thesis
STAT 910 (Summer I 2017) 
Dissertation
STAT 920 (Spring 2017) 
Honors Thesis
MATH 498H (Spring 2017) 
Intro Statistical Method
MATH 363 (Spring 2017) 
Thesis
STAT 910 (Spring 2017) 
Topics in Math
MATH 596A (Spring 2017) 
Topics in Undergrad Math
MATH 396T (Spring 2017) 
Directed Research
MATH 392 (Fall 2016) 
Dissertation
GENE 920 (Fall 2016) 
Dissertation
STAT 920 (Fall 2016) 
Honors Thesis
MATH 498H (Fall 2016) 
Independent Study
GENE 699 (Fall 2016) 
Intro Statistical Method
MATH 363 (Fall 2016) 
Thesis
STAT 910 (Fall 2016) 
Topics in Math
MATH 596A (Fall 2016)
2015-16 Courses

Intro Ord Diff Equations
MATH 254 (Summer I 2016) 
Directed Research
MATH 392 (Spring 2016) 
Dissertation
GENE 920 (Spring 2016) 
Dissertation
STAT 920 (Spring 2016) 
Independent Study
STAT 599 (Spring 2016) 
Research
STAT 900 (Spring 2016) 
Topics in Math
MATH 596A (Spring 2016)
Scholarly Contributions
Books
 Harris, T. E., Alexander, K. S., & Watkins, J. C. (1991).
Spatial stochastic processes: a festschrift in honor of Ted Harris on his seventieth birthday.
This volume has been created in honor of the seventieth birthday of Ted Harris, which was celebrated on January 11th, 1989. The papers represent the wide range of subfields of probability theory in which Ted has made profound and fundamental contributions. This breadth in Ted's research complicates the task of putting together in his honor a book with a unified theme. One common thread noted was the spatial, or geometric, aspect of the phenomena Ted investigated. This volume has been organized around that theme, with papers covering four major subject areas of Ted's research: branching processes, percolation, interacting particle systems, and stochastic flows. These four topics do not exhaust his research interests; his major work on Markov chains is commemorated in the standard terminology "Harris chain" and "Harris recurrent". The editors would like to take this opportunity to thank the speakers at the symposium and the contributors to this volume. Their enthusiastic support is a tribute to Ted Harris. We would like to express our appreciation to Annette Mosley for her efforts in typing the manuscripts and to Arthur Ogawa for typesetting the volume. Finally, we gratefully acknowledge the National Science Foundation and the University of Southern California for their financial support.
Chapters
 Didelot, X., Taylor, J. E., & Watkins, J. C. (2008). A Duality Identity between a Model of Bacterial Recombination and the Wright–Fisher Diffusion. In Markov Processes and Related Topics: A Festschrift for Thomas G. Kurtz. Institute of Mathematical Statistics. doi:10.1214/074921708000000453
In this article, we establish, using a duality argument, an identity stating that the Laplace transform of the length of a contiguous bacterial recombination region equals the probability of choosing a given allele in a stationary population evolving according to the one-dimensional Wright-Fisher diffusion model. Beyond giving us an improved inferential strategy for parameter estimation in bacterial recombination, the matching of the selection and recombination parameters in the identity also suggests the existence of an intriguing formal relationship between gene conversion and the ancestral selection graph.
Journals/Publications
 Andrews, J., Galindo, M. K., Hack, J. B., Watkins, J. C., Conecker, G. A., & Hammer, M. F. (2023).
The International SCN8A Patient Registry: A Scientific Resource to Advance the Understanding and Treatment of a Rare Pediatric Neurodevelopmental Syndrome. Journal of Registry Management, 50.
Genetic variants in the SCN8A gene underlie a wide spectrum of neurodevelopmental phenotypes that range from severe epileptic encephalopathy to benign familial infantile epilepsy to neurodevelopmental delays with or without seizures. A host of additional comorbidities also contribute to the phenotypic spectrum. As a result of the recent identification of the genetic etiology and the length of time it often takes to diagnose patients, little data are available on the natural history of these conditions. The International SCN8A Patient Registry was developed in 2015 to fill gaps in understanding the spectrum of the disease and its natural history, as well as the lived experiences of individuals with SCN8A syndrome. Another goal of the registry is to collect longitudinal data from participants on a regular basis. In this article, we describe the construction and structure of the International SCN8A Patient Registry, present the type of information available, and highlight particular analyses that demonstrate how registry data can provide insights into the clinical management of SCN8A syndrome.
 Bahramnejad, E., Barney, E. R., Lester, S., Hurtado, A., Thompson, T., Watkins, J. C., & Hammer, M. F. (2023).
Greater female than male resilience to mortality and morbidity in the Scn8a mouse model of pediatric epilepsy. International Journal of Neuroscience, 1-13. doi:10.1080/00207454.2023.2279497
Aims: Females and males of all ages are affected by epilepsy; however, unlike many clinical studies, most preclinical research has focused on males. Genetic variants in the voltage-gated sodium channel gene, SCN8A, are associated with a broad spectrum of neurological and epileptic syndromes. Here we investigate sex differences in the natural history of the Scn8aN1768D knock-in mouse model of pediatric epilepsy. Methods: We utilize 24/7 video to monitor juveniles and adults of both sexes to investigate variability in seizure activity (e.g., onset and frequency), mortality and morbidity, response to cannabinoids, and mode of death. We also monitor sleep architecture using a noninvasive piezoelectric method in order to identify factors that influence seizure severity and outcome. Results: Both sexes had nearly 100% penetrance in seizure onset and early mortality. However, adult heterozygous (D/+) females were more resilient as exhibited by the ability to tolerate more seizures over a longer lifespan. Homozygous (D/D) juveniles did not exhibit a sex difference in overall survival. Female estrus cycle was disrupted before seizure onset, while sleep was disrupted in both sexes in association with seizure onset. Females typically died while in convulsive status epilepticus, while a high proportion of males died while not experiencing behavioral seizures. Only juvenile and adult males benefited from cannabinoid administration. Conclusions: These results support the hypothesis that factors associated with sexual differentiation play a role in the neurobiology of epilepsy and point to the importance of including both sexes in the design of studies to identify new epilepsy therapies. Key Words: mouse epilepsy model; tonic-clonic seizures; SUDEP; sleep architecture; cannabinoids. Funding: The author(s) reported there is no funding associated with the work featured in this article.
 Chung, K. M., Hack, J., Andrews, J., Galindo‐Kelly, M., Schreiber, J., Watkins, J., & Hammer, M. F. (2023).
Clinical severity is correlated with age at seizure onset and biophysical properties of recurrent gain of function variants associated with SCN8A‐related epilepsy. Epilepsia, 84, 3365-3376. doi:10.1111/epi.17747
Genetic variants in the SCN8A gene underlie a wide spectrum of neurodevelopmental phenotypes including several distinct seizure types and a host of comorbidities. One of the major challenges facing clinicians and researchers alike is to identify genotype-phenotype (GP) correlations that may improve prognosis, guide treatment decisions, and lead to precision medicine approaches. We investigated GP correlations among 270 participants harboring gain-of-function (GOF) variants enrolled in the International SCN8A Registry, a patient-driven online database. We performed correlation analyses stratifying the cohort by clinical phenotypes to identify diagnostic features that differ among patients with varying levels of clinical severity, and that differ among patients with distinct GOF variants. Our analyses confirm positive correlations between age at seizure onset and developmental skills acquisition (developmental quotient), rate of seizure freedom, and percentage of cohort with developmental delays, and identify negative correlations with number of current and weaned antiseizure medications. This set of features is more detrimentally affected in individuals with a priori expectations of more severe clinical phenotypes. Our analyses also reveal a significant correlation between a severity index combining clinical features of individuals with a particular highly recurrent variant and an independent electrophysiological score assigned to each variant based on in vitro testing. This is one of the first studies to identify statistically significant GP correlations for individual SCN8A variants with GOF properties. The results suggest that individual GOF variants (1) are predictive of clinical severity for individuals carrying those variants and (2) may underlie distinct clinical phenotypes of SCN8A disease, thus helping to explain the wide SCN8A-related epilepsy disease spectrum. These results also suggest that certain features present at initial diagnosis are predictive of clinical severity, and with more informed treatment plans, may serve to improve prognosis for patients with SCN8A GOF variants.
 Hack, J. B., Horning, K. J., Short, D. M., Schreiber, J. M., Watkins, J. C., & Hammer, M. F. (2023).
Distinguishing Loss-of-Function and Gain-of-Function SCN8A Variants Using a Random Forest Classification Model Trained on Clinical Features. Neurology: Genetics, 9. doi:10.1212/nxg.0000000000200060
Background and Objectives: Pathogenic variants at the voltage-gated sodium channel gene, SCN8A, are associated with a wide spectrum of clinical disease outcomes. A critical challenge for neurologists is to determine whether patients carry gain-of-function (GOF) or loss-of-function (LOF) variants to guide treatment decisions, yet in vitro studies to infer channel function are often not feasible in the clinic. In this study, we develop a predictive modeling approach to classify variants based on clinical features present at initial diagnosis. Methods: We performed an exhaustive search for individuals deemed to carry SCN8A GOF and LOF variants by means of in vitro studies in heterologous cell systems, or because the variant was classified as truncating, and recorded clinical features. This resulted in a total of 69 LOF variants: 34 missense and 35 truncating variants, including 9 nonsense, 13 frameshift, 6 splice site, 6 indels, and 1 large deletion. We then assembled a truth set of variants with known functional effects, excluding individuals carrying variants at other loci associated with epilepsy. We then trained a predictive model based on random forest using this truth set of 45 LOF variants and 45 GOF variants randomly selected from a set of variants tested by in vitro methods. Results: Phenotypic categories assigned to individuals correlated strongly with GOF or LOF variants. All patients with GOF variants experienced early-onset seizures (mean age at onset = 4.5 ± 3.1 months) while only 64.4% of patients with LOF variants had seizures, most of which were late-onset absence seizures (mean age at onset = 40.0 ± 38.1 months). With high accuracy (95.4%), our model including 5 key clinical features classified individuals with GOF and LOF variants into 2 distinct cohorts differing in age at seizure onset, development of seizures, seizure type, intellectual disability, and developmental and epileptic encephalopathy. Discussion: The results support the hypothesis that patients with SCN8A GOF and LOF variants represent distinct clinical phenotypes. The clinical model developed in this study has great utility because it provides a rapid and highly accurate platform for predicting the functional class of patient variants during SCN8A diagnosis, which can aid in initial treatment decisions and improve prognosis.
 Watkins, J. C., Hack, J. B., Hammer, M. F., Schreiber, J. M., Horning, K., & Juroske Short, D. M. (2023). Distinguishing Loss- and Gain-of-Function SCN8A Variants Using a Random Forest Classification Model Trained on Clinical Features. Neurology Genetics, 15.
 Sahneh, F. D., Fries, W., Watkins, J. C., & Lega, J. (2022).
Epidemics from the Eye of the Pathogen. SIAM Journal on Applied Mathematics, 82, 2036-2056. doi:10.1137/21m1450719
While a common trend in disease modeling is to develop models of increasing complexity, it was recently pointed out that outbreaks appear remarkably simple when viewed in the incidence vs. cumulative cases (ICC) plane. This article details the theory behind this phenomenon by analyzing the stochastic SIR (Susceptible, Infected, Recovered) model in the cumulative cases domain. We prove that the Markov chain associated with this model reduces, in the ICC plane, to a pure birth chain for the cumulative number of cases, whose limit leads to an independent increments Gaussian process that fluctuates about a deterministic ICC curve. We calculate the associated variance and quantify the additional variability due to estimating incidence over a finite period of time. We also illustrate the universality brought forth by the ICC concept on real-world data for Influenza A and for the COVID-19 outbreak in Arizona.
 Watkins, J. C., Chilton, F. H., Yao, G., Zhang, H. H., McCall, C. E., Seeds, M. C., Hallmark, J. M., Hallmark, B., Sun, S., Hara, A., & Lu, E. (2022). Temporal Associations of Plasma Levels of the Secreted Phospholipase A2 Family and Mortality in Severe COVID-19. Cold Spring Harbor Laboratory - medRxiv. doi:10.1101/2022.11.21.22282595
Previous research suggests that group IIA secreted phospholipase A2 (sPLA2-IIA) plays a role in and predicts severe COVID-19 disease. The current study reanalyzed a longitudinal proteomic data set to determine the temporal (days 0, 3 and 7) relationship between the levels of several members of a family of sPLA2 isoforms and the severity of COVID-19 in 214 ICU patients. The levels of six secreted PLA2 isoforms, sPLA2-IIA, sPLA2-V, sPLA2-X, sPLA2-IB, sPLA2-IIC, and sPLA2-XVI, increased over the first 7 ICU days in those who succumbed to the disease. sPLA2-IIA outperformed top ranked cytokines and chemokines as predictors of patient outcome. A decision tree corroborated these results with day 0 to day 3 kinetic changes of sPLA2-IIA that separated the death and severe categories from the mild category and increases from day 3 to day 7 significantly enriched the lethal category. In contrast, there was a time-dependent decrease in sPLA2-IID and sPLA2-XIIB in patients with severe or lethal disease, and these two isoforms were at higher levels in mild patients. Taken together, proteomic analysis revealed temporal sPLA2 patterns that reflect the critical roles of sPLA2 isoforms in severe COVID-19 disease.
 Gentry, B., Richardson, M., Lopez, D. P., & Watkins, J. (2021). Indigenous Language Migration along the US Southwestern Border—the View from Arizona. Chance, 34(3), 47-55.
 Sahneh, F. D., Fries, W., Watkins, J. C., & Lega, J. (2021). Epidemics from the Eye of the Pathogen. arXiv preprint arXiv:2103.12848.
 Watkins, J. C., Zhou, J., Zhou, H., Liu, Y., & Zhang, M. (2021). A novel nonlinear dimension reduction approach to infer population structure for low-coverage sequencing data. BMC Bioinformatics.
 Watkins, J., Gentry, B., Richardson, M., & Lopez, D. P. (2021). Indigenous Language Migration along the U.S. Southwestern Border—the View from Arizona. CHANCE, 34(3), 47-55. doi:10.1080/09332480.2021.1979814
 Zhang, M., Liu, Y., Zhou, H., Zhou, J., & Watkins, J. C. (2021). A novel nonlinear dimension reduction approach to infer population structure for low-coverage sequencing data. BMC Bioinformatics, 22. doi:10.1186/s1285902104265
 Watkins, J. C., Longoria, I. A., Johnson, J. P., Hammer, M. F., & Encinas, A. C. (2020). Variable patterns of mutation density among NaV1.1, NaV1.2 and NaV1.6 point to channel-specific functional differences associated with childhood epilepsy. PloS one, 15(8), e0238121. doi:10.1371/journal.pone.0238121
Variants implicated in childhood epilepsy have been identified in all four voltage-gated sodium channels that initiate action potentials in the central nervous system. Previous research has focused on the functional effects of particular variants within the most studied of these channels (NaV1.1, NaV1.2 and NaV1.6); however, there have been few comparative studies across channels to infer the impact of mutations in patients with epilepsy. Here we compare patterns of variation in patient and public databases to test the hypothesis that regions of known functional significance within voltage-gated sodium (NaV) channels have an increased burden of deleterious variants. We assessed mutational burden in different regions of the NaV channels by (1) performing Fisher exact tests on odds ratios to infer excess variants in domains, segments, and loops of each channel in patient databases versus public "control" databases, and (2) comparing the cumulative distribution of variant sites along DNA sequences of each gene in patient and public databases (i.e., independent of protein structure). Patient variant density was concordant among channels in regions known to play a role in channel function, with statistically significant higher patient variant density in S4-S6 and DIII-DIV and an excess of public variants in S1-S3, DI-DII, DII-DIII. On the other hand, channel-specific patterns of patient burden were found in the NaV1.6 inactivation gate and NaV1.1 S5-S6 linkers, while NaV1.2 and NaV1.6 S4-S5 linkers and S5 segments shared patient variant patterns that contrasted with those in NaV1.1. These different patterns may reflect different roles played by the NaV1.6 inactivation gate in action potential propagation, and by NaV1.1 S5-S6 linkers in loss of function and haploinsufficiency. Interestingly, NaV1.2 and NaV1.6 both lack amino acid substitutions over significantly long stretches in both the patient and public databases, suggesting that new mutations in these regions may cause embryonic lethality or a non-epileptic disease phenotype.
 Ahmed, R., Angelini, P., Sahneh, F. D., Efrat, A., Glickenstein, D., Gronemann, M., Heinsohn, N., Kobourov, S. G., Spence, R. K., Watkins, J. C., & Wolff, A. (2019).
Multi-Level Steiner Trees. Journal of Experimental Algorithmics, 24, 1-22. doi:10.48550/arxiv.1804.02627
In the classical Steiner tree problem, given an undirected, connected graph $G=(V,E)$ with nonnegative edge costs and a set of terminals $T\subseteq V$, the objective is to find a minimum-cost tree $E' \subseteq E$ that spans the terminals. The problem is APX-hard; the best known approximation algorithm has a ratio of $\rho = \ln(4)+\varepsilon < 1.39$. In this paper, we study a natural generalization, the multi-level Steiner tree (MLST) problem: given a nested sequence of terminals $T_{\ell} \subset \dots \subset T_1 \subseteq V$, compute nested trees $E_{\ell}\subseteq \dots \subseteq E_1\subseteq E$ that span the corresponding terminal sets with minimum total cost. The MLST problem and variants thereof have been studied under various names including Multilevel Network Design, Quality-of-Service Multicast tree, Grade-of-Service Steiner tree, and Multi-Tier tree. Several approximation results are known. We first present two simple $O(\ell)$-approximation heuristics. Based on these, we introduce a rudimentary composite algorithm that generalizes the above heuristics, and determine its approximation ratio by solving a linear program. We then present a method that guarantees the same approximation ratio using at most $2\ell$ Steiner tree computations. We compare these heuristics experimentally on various instances of up to 500 vertices using three different network generation models. We also present various integer linear programming (ILP) formulations for the MLST problem, and compare their running times on these instances. To our knowledge, the composite algorithm achieves the best approximation ratio for up to $\ell=100$ levels, which is sufficient for most applications such as network visualization or designing multilevel infrastructure.
 Ahmed, R., Watkins, J. C., Wolff, A., Angelini, P., Efrat, A., Gronemann, M., Heinsohn, N., Kobourov, S. G., Spence, R., Sahneh, F. D., & Glickenstein, D. A. (2019). Multilevel Steiner Trees. Journal of Experimental Algorithmics (JEA). doi:10.1145/3368621
 Encinas, A. C., Moore, I. K., Watkins, J. C., & Hammer, M. F. (2019). Influence of age at seizure onset on the acquisition of neurodevelopmental skills in an SCN8A cohort. Epilepsia, 60(8), 1711-1720.
To characterize a cohort of patients with SCN8A-related epilepsy and to perform analyses to identify correlations involving the acquisition of neurodevelopmental skills.
 Hammer, M. F., Watkins, J. C., Encinas, A. C., & Moore, I. K. (2019). Influence of age at seizure onset on the acquisition of neurodevelopmental skills in an SCN8A cohort. Epilepsia, 60(8), 1711-1720. doi:10.1111/epi.16288
 Osipova, L. P., Lichman, D. V., Hallmark, B., Karafet, T. M., Hsieh, P. H., Watkins, J. C., & Hammer, M. F. (2019).
Genomic evidence of local adaptation to climate and diet in indigenous Siberians. Molecular Biology and Evolution, 36, 315-327. doi:10.18413/2658653320206304
 Sahneh, F. D., Efrat, A., Glickenstein, D., Wolff, A., Watkins, J. C., Spence, R., Sahneh, F. D., Kobourov, S. G., Heinsohn, N., Gronemann, M., Glickenstein, D., Efrat, A., Angelini, P., & Ahmed, R. (2019). Multilevel Steiner Trees. ACM Journal of Experimental Algorithms, 24(1), 1-22. doi:10.1145/3368621
In the classical Steiner tree problem, given an undirected, connected graph G=(V,E) with nonnegative edge costs and a set of terminals T ⊆ V, the objective is to find a minimum-cost tree E' ⊆ E that spans the terminals. The problem is APX-hard; the best-known approximation algorithm has a ratio of ρ = ln(4)+ε < 1.39.
 Watkins, J. C., Osipova, L. P., Karafet, T. M., Hsieh, P., Hammer, M. F., & Hallmark, B. (2019). Genomic Evidence of Local Adaptation to Climate and Diet in Indigenous Siberians. Molecular biology and evolution, 36(2), 315-327. doi:10.1093/molbev/msy211
The indigenous inhabitants of Siberia live in some of the harshest environments on earth, experiencing extended periods of severe cold temperatures, dramatic variation in photoperiod, and limited and highly variable food resources. While the successful long-term settlement of this area by humans required multiple behavioral and cultural innovations, the nature of the underlying genetic changes has generally remained elusive. In this study, we used a three-part approach to identify putative targets of positive natural selection in Siberians. We first performed selection scans on whole exome and genome-wide single nucleotide polymorphism array data from multiple Siberian populations. We then annotated candidates in the tails of the empirical distributions, focusing on candidates with evidence linking them to biological processes and phenotypes previously identified as relevant to adaptation in circumpolar groups. The top candidates were then genotyped in additional populations to determine their spatial allele frequency distributions and associations with climate variables. Our analysis reveals missense mutations in three genes involved in lipid metabolism (PLA2G2A, PLIN1, and ANGPTL8) that exhibit genomic and spatial patterns consistent with selection for cold climate and/or diet. These variants are unified by their connection to brown adipose tissue and may help to explain previously observed physiological differences in Siberians such as low serum lipid levels and increased basal metabolic rate. These results support the hypothesis that indigenous Siberians have genetically adapted to their local environment by selection on multiple genes.
 Quinto-Cortés, C. D., Woerner, A. E., Watkins, J. C., & Hammer, M. F. (2018). Modeling SNP array ascertainment with Approximate Bayesian Computation for demographic inference. Scientific reports, 8(1), 10209.
Single nucleotide polymorphisms (SNPs) in commercial arrays have often been discovered in a small number of samples from selected populations. This ascertainment skews patterns of nucleotide diversity and affects population genetic inferences. We propose a demographic inference pipeline that explicitly models the SNP discovery protocol in an Approximate Bayesian Computation (ABC) framework. We simulated genomic regions according to a demographic model incorporating parameters for the divergence of three well-characterized HapMap populations and recreated the SNP distribution of a commercial array by varying the number of haploid samples and the allele frequency cut-off in the given regions. We then calculated summary statistics obtained from both the ascertained and genomic data and inferred ascertainment and demographic parameters. We implemented our pipeline to study the admixture process that gave rise to the present-day Mexican population. Our estimate of the time of admixture is closer to the historical dates than those in previous works which did not consider ascertainment bias. Although the use of whole genome sequences for demographic inference is becoming the norm, there are still underrepresented areas of the world from where only SNP array data are available. Our inference framework is applicable to those cases and will help with the demographic inference.
 Watkins, J. C., Woerner, A. E., Veeramah, K. R., & Hammer, M. F. (2018). The Role of Phylogenetically Conserved Elements in Shaping Patterns of Human Genomic Diversity. Molecular Biology and Evolution, 35(9), 2284-2295. doi:10.1093/molbev/msy145
 Woerner, A. E., Veeramah, K. R., Watkins, J. C., & Hammer, M. F. (2018). The role of phylogenetically conserved elements in shaping patterns of human genomic diversity. Molecular biology and evolution.
Evolutionary genetic studies have shown a positive correlation between levels of nucleotide diversity and either rates of recombination or genetic distance to genes. Both positive-directional and purifying selection have been offered as the source of these correlations via genetic hitchhiking and background selection, respectively. Phylogenetically conserved elements (CEs) are short (∼100 bp), widely distributed (comprising ∼5% of genome) sequences that are often found far from genes. While the function of many CEs is unknown, CEs also are associated with reduced diversity at linked sites. Using high coverage (>80x) whole genome data from two human populations, the Yoruba and the CEU, we perform fine scale evaluations of diversity, rates of recombination, and linkage to genes. We find that the local rate of recombination has a stronger effect on levels of diversity than linkage to genes, and that these effects of recombination persist even in regions far from genes. Our whole genome modeling demonstrates that, rather than recombination or GC-biased gene conversion, selection on sites within or linked to CEs better explains the observed genomic diversity patterns. A major implication is that very few sites in the human genome are predicted to be free of the effects of selection. These sites, which we refer to as the human "neutralome", comprise only 1.2% of the autosomes and 5.1% of the X chromosome. Demographic analysis of the neutralome reveals larger population sizes and lower rates of growth for ancestral human populations than inferred by previous analyses.
 Hammer, M. F., Ishii, A., Johnstone, L., Tchourbanov, A., Lau, B., Sprissler, R., Hallmark, B., Zhang, M., Zhou, J., Watkins, J., & Hirose, S. (2017). Rare variants of small effect size in neuronal excitability genes influence clinical outcome in Japanese cases of SCN1A truncation-positive Dravet syndrome. PloS one, 12(7), e0180485.
Dravet syndrome (DS) is a rare, devastating form of childhood epilepsy that is often associated with mutations in the voltage-gated sodium channel gene, SCN1A. There is considerable variability in expressivity within families, as well as among individuals carrying the same primary mutation, suggesting that clinical outcome is modulated by variants at other genes. To identify modifier gene variants that contribute to clinical outcome, we sequenced the exomes of 22 individuals at both ends of a phenotype distribution (i.e., mild and severe cognitive condition). We controlled for variation associated with different mutation types by limiting inclusion to individuals with a de novo truncation mutation resulting in SCN1A haploinsufficiency. We performed tests aimed at identifying 1) single common variants that are enriched in either phenotypic group, 2) sets of common or rare variants aggregated in and around genes associated with clinical outcome, and 3) rare variants in 237 candidate genes associated with neuronal excitability. While our power to identify enrichment of a common variant in either phenotypic group is limited as a result of the rarity of mild phenotypes in individuals with SCN1A truncation variants, our top candidates did not map to functional regions of genes, or in genes that are known to be associated with neurological pathways. In contrast, we found a statistically significant excess of rare variants predicted to be damaging and of small effect size in genes associated with neuronal excitability in severely affected individuals. A KCNQ2 variant previously associated with benign neonatal seizures is present in 3 of 12 individuals in the severe category. To compare our results with the healthy population, we performed a similar analysis on whole exome sequencing data from 70 Japanese individuals in the 1000 genomes project. Interestingly, the frequency of rare damaging variants in the same set of neuronal excitability genes in healthy individuals is nearly as high as in severely affected individuals. Rather than a single common gene/variant modifying clinical outcome in SCN1A-related epilepsies, our results point to the cumulative effect of rare variants with little to no measurable phenotypic effect (i.e., typical genetic background) unless present in combination with a disease-causing truncation mutation in SCN1A.
 Hsieh, P., Hallmark, B., Watkins, J., Karafet, T. M., Osipova, L. P., Gutenkunst, R. N., & Hammer, M. F. (2017). Exome Sequencing Provides Evidence of Polygenic Adaptation to a Fat-Rich Animal Diet in Indigenous Siberian Populations. Molecular biology and evolution, 34(11), 2913-2926.
Siberia is one of the coldest environments on Earth and has great seasonal temperature variation. Long-term settlement in northern Siberia undoubtedly required biological adaptation to severe cold stress, dramatic variation in photoperiod, and limited food resources. In addition, recent archeological studies show that humans first occupied Siberia at least 45,000 years ago; yet our understanding of the demographic history of modern indigenous Siberians remains incomplete. In this study, we use whole-exome sequencing data from the Nganasans and Yakuts to infer the evolutionary history of these two indigenous Siberian populations. Recognizing the complexity of the adaptive process, we designed a model-based test to systematically search for signatures of polygenic selection. Our approach accounts for stochasticity in the demographic process and the hitchhiking effect of classic selective sweeps, as well as potential biases resulting from recombination rate and mutation rate heterogeneity. Our demographic inference shows that the Nganasans and Yakuts diverged ∼12,000-13,000 years ago from East-Asian ancestors in a process involving continuous gene flow. Our polygenic selection scan identifies seven candidate gene sets with Siberian-specific signals. Three of these gene sets are related to diet, especially to fat metabolism, consistent with the hypothesis of adaptation to a fat-rich animal diet. Additional testing rejects the effect of hitchhiking and favors a model in which selection yields small allele frequency changes at multiple unlinked genes.
 Ishii, A., Watkins, J. C., Chen, D., Hirose, S., & Hammer, M. F. (2017). Clinical implications of SCN1A missense and truncation variants in a large Japanese cohort with Dravet syndrome. Epilepsia, 58(2), 282-290. doi:10.1111/epi.13639
Two major classes of SCN1A variants are associated with Dravet syndrome (DS): those that result in haploinsufficiency (truncating) and those that result in an amino acid substitution (missense). The aim of this retrospective study was to describe the first large cohort of Japanese patients with SCN1A mutation-positive DS (n = 285), and investigate the relationship between variant (type and position) and clinical expression and response to treatment.
 Glazer, E. S., Zhang, H. H., Hill, K. A., Patel, C., Kha, S. T., Yozwiak, M. L., Bartels, H., Nafissi, N. N., Watkins, J. C., Alberts, D. S., & Krouse, R. S. (2016). Evaluating IPMN and pancreatic carcinoma utilizing quantitative histopathology. Cancer medicine, 5(10), 2841-2847. doi:10.1002/cam4.923
Intraductal papillary mucinous neoplasms (IPMN) are pancreatic lesions with uncertain biologic behavior. This study sought objective, accurate prediction tools, through the use of quantitative histopathological signatures of nuclear images, for classifying lesions as chronic pancreatitis (CP), IPMN, or pancreatic carcinoma (PC). Forty-four pancreatic resection patients were retrospectively identified for this study (12 CP; 16 IPMN; 16 PC). Regularized multinomial regression quantitatively classified each specimen as CP, IPMN, or PC in an automated, blinded fashion. Classification certainty was determined by subtracting the smallest classification probability from the largest probability (of the three groups). The certainty function varied from 1.0 (perfectly classified) to 0.0 (random). From each lesion, 180 ± 22 nuclei were imaged. Overall classification accuracy was 89.6% with six unique nuclear features. No CP cases were misclassified, 1/16 IPMN cases were misclassified, and 4/16 PC cases were misclassified. Certainty function was 0.75 ± 0.16 for correctly classified lesions and 0.47 ± 0.10 for incorrectly classified lesions (P = 0.0005). Uncertainty was identified in four of the five misclassified lesions. Quantitative histopathology provides a robust, novel method to distinguish among CP, IPMN, and PC with a quantitative measure of uncertainty. This may be useful when there is uncertainty in diagnosis.
 Ishii, A., Kang, J., Schornak, C. C., Hernández, C. C., Shen, W., Watkins, J. C., Macdonald, R. L., & Hirose, S. (2016). A de novo missense mutation of GABRB2 causes early myoclonic encephalopathy. Journal of Medical Genetics, 54(3), 202-211. doi:10.1136/jmedgenet-2016-104083
Background: Early myoclonic encephalopathy (EME), a disease with a devastating prognosis, is characterised by neonatal onset of seizures and massive myoclonus accompanied by a continuous suppression-burst EEG pattern. Three genes are associated with EMEs that have metabolic features. Here, we report a pathogenic mutation of an ion channel as a cause of EME for the first time. Methods: Sequencing was performed for 214 patients with epileptic seizures using a gene panel with 109 genes that are known or suspected to cause epileptic seizures. Functional assessments were demonstrated by using electrophysiological experiments and immunostaining for mutant γ-aminobutyric acid A (GABAA) receptor subunits in HEK293T cells. Results: We discovered a de novo heterozygous missense mutation (c.859A>C [p.Thr287Pro]) in the GABRB2-encoded β2 subunit of the GABAA receptor in an infant with EME. No GABRB2 mutations were found in three other EME cases or in 166 patients with infantile spasms. GABAA receptors bearing the mutant β2 subunit were poorly trafficked to the cell membrane and prevented γ2 subunits from trafficking to the cell surface. The peak amplitudes of currents from GABAA receptors containing only mutant β2 subunits were smaller than that of those from receptors containing only wild-type β2 subunits. The decrease in peak current amplitude (96.4% reduction) associated with the mutant GABAA receptor was greater than expected, based on the degree to which cell surface expression was reduced (66% reduction). Conclusion: This mutation has complex functional effects on GABAA receptors, including reduction of cell surface expression and attenuation of channel function, which would significantly perturb GABAergic inhibition in the brain.
 Bartels, P. H., Zhang, H. H., Yozwiak, M. L., Watkins, J. C., Patel, C., Krouse, R. S., Kha, S. T., Hill, K. A., Glazer, E. S., Bartels, H. G., & Alberts, D. S. (2015). Abstract A83: Nuclear morphometry differentiates chronic pancreatitis, IPMN, and pancreatic carcinoma. Cancer Research, 75. doi:10.1158/1538-7445.panca2014-a83
Background: It can be difficult to distinguish between chronic pancreatitis (CP), IPMN, and pancreatic carcinoma (PC) on tissue biopsy. Nuclear morphometry can measure up to 93 unique nuclear features based on standard histopathology. The goal of this work is to build novel, objective, and accurate prediction tools, based on nuclear morphometric signatures in high resolution images of nuclei of histologic sections, for classifying pancreatic tissues into three distinct groups. Materials & Methods: 44 patients who underwent pancreatic resections were identified. 12 cases of CP, 16 cases of IPMN, and 16 cases of PC were utilized in this pilot study. 180 ± 22 nuclei from each lesion were imaged with high resolution microscopy. Clinicodemographic data was obtained retrospectively from the medical record. Statistically significant nuclear features were determined by a fully automated penalized multinomial regression algorithm in order to determine a multiclass classifier and simultaneously identify important nuclear features. The LASSO penalty function, and associated regularization parameter, is adaptively chosen by cross validation to prevent overfitting. In order to test the veracity of the automated algorithm, we randomly removed 25% of the cases as a training set and utilized the remaining cases as a test set; this was repeated 10 times. Results: The average age was 64 ± 15 years, with patients in the CP being slightly younger; 63% were male. Median follow-up time was 3 years in the CP group, 3 years in the IPMN group, and 5 years in PC group. The method described automatically identified 6 unique and statistically significant nuclear features (corrected overall P ...) Conclusions: Nuclear morphometry classifies pancreatic lesions into CP, IPMN, and PC with 84.5% accuracy using a fully automated algorithm to determine statistically significant and unique nuclear features. Since the incorrectly classified lesions had a larger proportion of mixed nuclei, diagnostic uncertainty may be determined in a quantitative manner allowing for a confidence probability estimation of whether a given lesion should be classified as a CP, IPMN, or PC. Further studies will validate these results in a resected cohort as well as a cohort based on biopsied specimens alone. Citation Format: Evan S. Glazer, Hao Zhang, Kimberly A. Hill, Charmi Patel, Stephanie T. Kha, Peter H. Bartels, Michael L. Yozwiak, Hubert G. Bartels, Joseph C. Watkins, David S. Alberts, Robert S. Krouse. Nuclear morphometry differentiates chronic pancreatitis, IPMN, and pancreatic carcinoma. [abstract]. In: Proceedings of the AACR Special Conference on Pancreatic Cancer: Innovations in Research and Treatment; May 18-21, 2014; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2015;75(13 Suppl):Abstract nr A83.
 Bailey, B. L., Visscher, K., & Watkins, J. (2014). A stochastic model of translation with -1 programmed ribosomal frameshifting. Physical biology, 11(1), 016009.
Many viruses produce multiple proteins from a single mRNA sequence by encoding overlapping genes. One mechanism to decode both genes, which reside in alternate reading frames, is -1 programmed ribosomal frameshifting. Although recognized for over 25 years, the molecular and physical mechanism of -1 frameshifting remains poorly understood. We have developed a mathematical model that treats mRNA translation and associated -1 frameshifting as a stochastic process in which the transition probabilities are based on the energetics of local molecular interactions. The model predicts both the location and efficiency of -1 frameshift events in HIV-1. Moreover, we compute -1 frameshift efficiencies upon mutations in the viral mRNA sequence and variations in relative tRNA abundances, predictions that are directly testable in experiment.
 Bartels, P. H., Zhang, H. H., Watkins, J. C., Krouse, R. S., Hill, K. A., Glazer, E. S., & Alberts, D. S. (2014). Abstract 1362: Nuclear morphometry measures progressive atypia in the development of pancreatic carcinoma. Cancer Research, 74, 1362-1362. doi:10.1158/1538-7445.am2014-1362
Pancreatic lesions that are not clearly benign are often treated as malignant despite uncertainty in the true diagnosis due to the nearly universally fatal nature of pancreatic carcinoma (PC). Nuclear morphometry is a technique to quantify nuclear features too complex for the human eye to discern. We hypothesized that nuclear atypia can be quantified with morphometry in order to distinguish between chronic pancreatitis, IPMN, and PC. We retrospectively analyzed 14 specimens of chronic pancreatitis, 16 IPMN lesions, and 19 PC lesions. Clinicopathologic data were obtained. Nuclear morphometry determined overall atypia based on the average nuclear abnormality of 95 distinct nuclear features (a nuclear signature). For PC lesions, 5 nuclear features defined a classification score (CS) representing the proportion of aggressive nuclei in a given PC lesion. Statistical significance was determined with ANOVA and the Kruskal-Wallis test. The average age for all patients was 63 ± 15 years while 62% were male; there were no differences between the 3 groups. The follow up was approximately 4 years in all groups as well. The average nuclear atypia of chronic pancreatitis was 0.80, 0.99 for IPMN, and 1.08 for PC (P = 0.02). Importantly, based on 5 nuclear features, the CS for PC lesions that recurred was 87% while it was only 56% for those PC lesions that did not recur (P = 0.04). The research describes a precise, accurate, and objective method to distinguish IPMN from PC. The CS and overall atypia score provide a method not only to objectively describe a given lesion but also to describe where in the progression from benign to malignant lesion an unknown lesion exists. Clinically, this may be of great utility in risk stratifying unknown pancreatic lesions at the time of diagnosis or tissue biopsy. Citation Format: Evan S. Glazer, Kimberly A. Hill, Hao (Helen) Zhang, Peter Bartels, Joseph Watkins, David S. Alberts, Robert S. Krouse. Nuclear morphometry measures progressive atypia in the development of pancreatic carcinoma. [abstract]. In: Proceedings of the 105th Annual Meeting of the American Association for Cancer Research; 2014 Apr 5-9; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2014;74(19 Suppl):Abstract nr 1362. doi:10.1158/1538-7445.AM2014-1362
 Veeramah, K. R., Gutenkunst, R. N., Woerner, A. E., Watkins, J. C., & Hammer, M. F. (2014). Evidence for increased levels of positive and negative selection on the X chromosome versus autosomes in humans. Molecular Biology and Evolution, 31(9), 2267-2282. doi:10.1093/molbev/msu166
Partially recessive variants under positive selection are expected to go to fixation more quickly on the X chromosome as a result of hemizygosity, an effect known as faster-X. Conversely, purifying selection is expected to reduce substitution rates more effectively on the X chromosome. Previous work in humans contrasted divergence on the autosomes and X chromosome, with results tending to support the faster-X effect. However, no study has yet incorporated both divergence and polymorphism to quantify the effects of both purifying and positive selection, which are opposing forces with respect to divergence. In this study, we develop a framework that integrates previously developed theory addressing differential rates of X and autosomal evolution with methods that jointly estimate the level of purifying and positive selection via modeling of the distribution of fitness effects (DFE). We then utilize this framework to estimate the proportion of nonsynonymous substitutions fixed by positive selection (α) using exome sequence data from a West African population. We find that varying the female to male breeding ratio (β) has minimal impact on the DFE for the X chromosome, especially when compared with the effect of varying the dominance coefficient of deleterious alleles (h). Estimates of α range from 46% to 51% and from 4% to 24% for the X chromosome and autosomes, respectively. While dependent on h, the magnitude of the difference between α values estimated for these two systems is highly statistically significant over a range of biologically realistic parameter values, suggesting faster-X has been operating in humans.
 Mendez, F. L., Watkins, J. C., & Hammer, M. F. (2013). Neandertal origin of genetic variation at the cluster of OAS immunity genes. Molecular Biology and Evolution, 30(4), 798-801. doi:10.1093/molbev/mst004
Analyses of ancient DNA from extinct humans reveal signals of at least two independent hybridization events in the history of non-African populations. To date, there are very few examples of specific genetic variants that have been rigorously identified as introgressive. Here, we survey DNA sequence variation in the OAS gene cluster on chromosome 12 and provide strong evidence that a haplotype extending for ∼185 kb introgressed from Neandertals. This haplotype is nearly restricted to Eurasians and is estimated to have diverged from the Neandertal sequence ∼125 kya. Despite the potential for novel functional variation, the observed frequency of this haplotype is consistent with neutral introgression. This is the second locus in the human genome, after STAT2, carrying distinct haplotypes that appear to have introgressed separately from both Neandertals and Denisova.
 Mendez, F. L., Watkins, J. C., & Hammer, M. F. (2012). A haplotype at STAT2 introgressed from Neanderthals and serves as a candidate of positive selection in Papua New Guinea. The American Journal of Human Genetics, 91(2), 265-274. doi:10.1016/j.ajhg.2012.06.015
Signals of archaic admixture have been identified through comparisons of the draft Neanderthal and Denisova genomes with those of living humans. Studies of individual loci contributing to these genome-wide average signals are required for characterization of the introgression process and investigation of whether archaic variants conferred an adaptive advantage to the ancestors of contemporary human populations. However, no definitive case of adaptive introgression has yet been described. Here we provide a DNA sequence analysis of the innate immune gene STAT2 and show that a haplotype carried by many Eurasians (but not sub-Saharan Africans) has a sequence that closely matches that of the Neanderthal STAT2. This haplotype, referred to as N, was discovered through a resequencing survey of the entire coding region of STAT2 in a global sample of 90 individuals. Analyses of publicly available complete genome sequence data show that haplotype N shares a recent common ancestor with the Neanderthal sequence (∼80 thousand years ago) and is found throughout Eurasia at an average frequency of ∼5%. Interestingly, N is found in Melanesian populations at ∼10-fold higher frequency (∼54%) than in Eurasian populations. A neutrality test that controls for demography rejects the hypothesis that a variant of N rose to high frequency in Melanesia by genetic drift alone. Although we are not able to pinpoint the precise target of positive selection, we identify nonsynonymous mutations in ERBB3, ESYT1, and STAT2 (all of which are part of the same 250 kb introgressive haplotype) as good candidates.
 Mendez, F. L., Watkins, J. C., & Hammer, M. F. (2012). Global genetic variation at OAS1 provides evidence of archaic admixture in Melanesian populations. Molecular Biology and Evolution, 29(6), 1513-1520. doi:10.1093/molbev/msr301
Recent analysis of DNA extracted from two Eurasian forms of archaic human shows that more genetic variants are shared with humans currently living in Eurasia than with anatomically modern humans in sub-Saharan Africa. Although these genome-wide average measures of genetic similarity are consistent with the hypothesis of archaic admixture in Eurasia, analyses of individual loci exhibiting the signal of archaic introgression are needed to test alternative hypotheses and investigate the admixture process. Here, we provide a detailed sequence analysis of the innate immune gene OAS1, a locus with a divergent Melanesian haplotype that is very similar to the Denisova sequence from the Altai region of Siberia. We resequenced a 7-kb region encompassing the OAS1 gene in 88 individuals from six Old World populations (San, Biaka, Mandenka, French Basque, Han Chinese, and Papua New Guineans) and discovered previously unknown and ancient genetic variation. The 5′ region of this gene has unusual patterns of diversity, including 1) higher levels of nucleotide diversity in Papuans than in sub-Saharan Africans, 2) very deep ancestry with an estimated time to the most recent common ancestor of >3 myr, and 3) a basal branching pattern with Papuan individuals on either side of the rooted network. A global geographic survey of >1,500 individuals showed that the divergent Papuan haplotype is nearly restricted to populations from eastern Indonesia and Melanesia. Polymorphic sites within this haplotype are shared with the draft Denisova genome over a span of ∼90 kb and are associated with an extended block of linkage disequilibrium, supporting the hypothesis that this haplotype introgressed from an archaic source that likely lived in Eurasia.
 Veeramah, K. R., Wegmann, D., Woerner, A. E., Mendez, F. L., Watkins, J. C., Destro-Bisol, G., Soodyall, H., Louie, L., & Hammer, M. F. (2012). An early divergence of KhoeSan ancestors from those of other modern humans is supported by an ABC-based analysis of autosomal resequencing data. Molecular Biology and Evolution, 29(2), 617-630. doi:10.1093/molbev/msr212. Sub-Saharan Africa has consistently been shown to be the most genetically diverse region in the world. Despite the fact that a substantial portion of this variation is partitioned between groups practicing a variety of subsistence strategies and speaking diverse languages, there is currently no consensus on the genetic relationships of sub-Saharan African populations. San (a subgroup of KhoeSan) and many Pygmy groups maintain hunter-gatherer lifestyles and cluster together in autosomal-based analysis, whereas non-Pygmy Niger-Kordofanian speakers (non-Pygmy NKs) predominantly practice agriculture and show substantial genetic homogeneity despite their wide geographic range throughout sub-Saharan Africa. However, KhoeSan, who speak a set of relatively unique click-based languages, have long been thought to be an early branch of anatomically modern humans based on phylogenetic analysis. To formally test models of divergence among the ancestors of modern African populations, we resequenced a sample of San, Eastern, and Western Pygmies and non-Pygmy NKs individuals at 40 nongenic (∼2 kb) regions and then analyzed these data within an Approximate Bayesian Computation (ABC) framework. We find substantial support for a model of an early divergence of KhoeSan ancestors from a proto-Pygmy-non-Pygmy NKs group ∼110 thousand years ago over a model incorporating a proto-KhoeSan-Pygmy hunter-gatherer divergence from the ancestors of non-Pygmy NKs. The results of our analyses are consistent with previously identified signals of a strong bottleneck in Mbuti Pygmies and a relatively recent expansion of non-Pygmy NKs. We also develop a number of methodologies that utilize "pseudo-observed" data sets to optimize our ABC-based inference. This approach is likely to prove to be an invaluable tool for demographic inference using genome-wide resequencing data.
 Hammer, M. F., Woerner, A. E., Mendez, F. L., Watkins, J. C., & Wall, J. D. (2011). Genetic evidence for archaic admixture in Africa. Proceedings of the National Academy of Sciences of the United States of America, 108(37), 15123-15128. doi:10.1073/pnas.1109300108. A long-debated question concerns the fate of archaic forms of the genus Homo: did they go extinct without interbreeding with anatomically modern humans, or are their genes present in contemporary populations? This question is typically focused on the genetic contribution of archaic forms outside of Africa. Here we use DNA sequence data gathered from 61 noncoding autosomal regions in a sample of three sub-Saharan African populations (Mandenka, Biaka, and San) to test models of African archaic admixture. We use two complementary approximate-likelihood approaches and a model of human evolution that involves recent population structure, with and without gene flow from an archaic population. Extensive simulation results reject the null model of no admixture and allow us to infer that contemporary African populations contain a small proportion of genetic material (≈ 2%) that introgressed ≈ 35 kya from an archaic population that split from the ancestors of anatomically modern humans ≈ 700 kya. Three candidate regions showing deep haplotype divergence, unusual patterns of linkage disequilibrium, and small basal clade size are identified and the distributions of introgressive haplotypes surveyed in a sample of populations from across sub-Saharan Africa. One candidate locus with an unusual segment of DNA that extends for >31 kb on chromosome 4 seems to have introgressed into modern Africans from a now-extinct taxon that may have lived in central Africa. Taken together, our results suggest that polymorphisms present in extant populations introgressed via relatively recent interbreeding with hominin forms that diverged from the ancestors of modern humans in the Lower-Middle Pleistocene.
 Hammer, M. F., Woerner, A. E., Mendez, F. L., Watkins, J. C., Cox, M. P., & Wall, J. D. (2010). The ratio of human X chromosome to autosome diversity is positively correlated with genetic distance from genes. Nature Genetics, 42(10), 830-831. The ratio of X-linked to autosomal diversity was estimated from an analysis of six human genome sequences and found to deviate from the expected value of 0.75. However, the direction of this deviation depends on whether a particular sequence is close to or far from the nearest gene. This pattern may be explained by stronger locally acting selection on X-linked genes compared with autosomal genes, combined with larger effective population sizes for females than for males.
 Marsteller, P., de Pillis, L., Findley, A., Joplin, K., Pelesko, J., Nelson, K., Thompson, K., Usher, D., & Watkins, J. (2010). Toward integration: From quantitative biology to mathbio-biomath? CBE Life Sciences Education, 9(3), 165-171. doi:10.1187/cbe.10-03-0053. In response to the call of BIO2010 for integrating quantitative skills into undergraduate biology education, 30 Howard Hughes Medical Institute (HHMI) Program Directors at the 2006 HHMI Program Directors Meeting established a consortium to investigate, implement, develop, and disseminate best practices resulting from the integration of math and biology. With the assistance of an HHMI-funded minigrant, led by Karl Joplin of East Tennessee State University, and support in institutional HHMI grants at Emory and University of Delaware, these institutions held a series of summer institutes and workshops to document progress toward and address the challenges of implementing a more quantitative approach to undergraduate biology education. This report summarizes the results of the four summer institutes (2007-2010). The group developed four draft white papers, a wiki site, and a listserv. One major outcome of these meetings is this issue of CBE Life Sciences Education, which resulted from proposals at our 2008 meeting and a January 2009 planning session. Many of the papers in this issue emerged from or were influenced by these meetings.
 Watkins, J. C. (2010). Convergence time to the Ewens sampling formula in the infinite alleles Moran model. Journal of Mathematical Biology, 60(2), 189-206. doi:10.1007/s00285-009-0255-x. PMID: 19288263. In this paper, we establish an upper bound for time to convergence to stationarity for the discrete time infinite alleles Moran model. If M is the population size and μ is the mutation rate, this bound gives a cutoff time of log(Mμ)/μ generations. The stationary distribution for this process in the case of sampling without replacement is the Ewens sampling formula. We show that the bound for the total variation distance from the generation t distribution to the Ewens sampling formula is well approximated by one of the extreme value distributions, namely, a standard Gumbel distribution. Beginning with the card shuffling examples of Aldous and Diaconis and extending the ideas of Donnelly and Rodrigues for the two allele model, this model adds to the list of Markov chains that show evidence for the cutoff phenomenon. Because of the broad use of infinite alleles models, this cutoff sets the time scale of applicability for statistical tests based on the Ewens sampling formula and other tests of neutrality in a number of population genetic studies. © Springer-Verlag 2009.
 Watkins, J. C. (2010). On a calculus-based statistics course for life science students. CBE Life Sciences Education, 9(3), 298-310. doi:10.1187/cbe.10-03-0035. PMID: 20810962; PMCID: PMC2931677. The choice of pedagogy in statistics should take advantage of the quantitative capabilities and scientific background of the students. In this article, we propose a model for a statistics course that assumes student competency in calculus and a broadening knowledge in biology. We illustrate our methods and practices through examples from the curriculum. © 2010 The American Society for Cell Biology.
 Lansing, J. S., Watkins, J. C., Hallmark, B., Cox, M. P., Karafet, T. M., Sudoyo, H., & Hammer, M. F. (2008). Male dominance rarely skews the frequency distribution of Y chromosome haplotypes in human populations. Proceedings of the National Academy of Sciences of the United States of America, 105(33), 11645-11650. doi:10.1073/pnas.0710158105. A central tenet of evolutionary social science holds that behaviors, such as those associated with social dominance, produce fitness effects that are subject to cultural selection. However, evidence for such selection is inconclusive because it is based on short-term statistical associations between behavior and fertility. Here, we show that the evolutionary effects of dominance at the population level can be detected using noncoding regions of DNA. Highly variable polymorphisms on the nonrecombining portion of the Y chromosome can be used to trace lines of descent from a common male ancestor. Thus, it is possible to test for the persistence of differential fertility among patrilines. We examine haplotype distributions defined by 12 short tandem repeats in a sample of 1269 men from 41 Indonesian communities and test for departures from neutral mutation-drift equilibrium based on the Ewens sampling formula. Our tests reject the neutral model in only 5 communities. Analysis and simulations show that we have sufficient power to detect such departures under varying demographic conditions, including founder effects, bottlenecks, and migration, and at varying levels of social dominance. We conclude that patrilines seldom are dominant for more than a few generations, and thus traits or behaviors that are strictly paternally inherited are unlikely to be under strong cultural selection.
 Lansing, J. S., Cox, M. P., Downey, S. S., Gabler, B. M., Hallmark, B., Karafet, T. M., Norquest, P., Schoenfelder, J. W., Sudoyo, H., Watkins, J. C., & Hammer, M. F. (2007). Coevolution of languages and genes on the island of Sumba, eastern Indonesia. Proceedings of the National Academy of Sciences of the United States of America, 104(41), 16022-16026. doi:10.1073/pnas.0704451104. Numerous studies indicate strong associations between languages and genes among human populations at the global scale, but all broader scale genetic and linguistic patterns must arise from processes originating at the community level. We examine linguistic and genetic variation in a contact zone on the eastern Indonesian island of Sumba, where Neolithic Austronesian farming communities settled and began interacting with aboriginal foraging societies approximately 3,500 years ago. Phylogenetic reconstruction based on a 200-word Swadesh list sampled from 29 localities supports the hypothesis that Sumbanese languages derive from a single ancestral Austronesian language. However, the proportion of cognates (words with a common origin) traceable to Proto-Austronesian (PAn) varies among language subgroups distributed across the island. Interestingly, a positive correlation was found between the percentage of Y chromosome lineages that derive from Austronesian (as opposed to aboriginal) ancestors and the retention of PAn cognates. We also find a striking correlation between the percentage of PAn cognates and geographic distance from the site where many Sumbanese believe their ancestors arrived on the island. These language-gene-geography correlations, unprecedented at such a fine scale, imply that historical patterns of social interaction between expanding farmers and resident hunter-gatherers largely explain community-level language evolution on Sumba. We propose a model to explain linguistic and demographic coevolution at fine spatial and temporal scales.
 Watkins, J. C. (2007). Microsatellite evolution: Markov transition functions for a suite of models. Theoretical Population Biology, 71(2), 147-159. PMID: 17123560. This paper takes from the collection of models considered by Whittaker et al. [2003. Likelihood-based estimation of microsatellite mutation rates. Genetics 164, 781-787] derived from direct observation of microsatellite mutation in parent-child pairs and provides analytical expressions for the probability distributions for the change in number of repeats over any given number of generations. The mathematical framework for this analysis is the theory of Markov processes. We find these expressions using two approaches, approximating by circulant matrices and solving a partial differential equation satisfied by the generating function. The impact of the differing choice of models is examined using likelihood estimates for time to most recent common ancestor. The analysis presented here may play a role in elucidating the connections between these two approaches and shows promise in reconciling differences between estimates for mutation rates based on Whittaker's approach and methods based on phylogenetic analyses. © 2006 Elsevier Inc. All rights reserved.
 Karafet, T. M., Lansing, J. S., Redd, A. J., Reznikova, S., Watkins, J. C., Surata, S. P., Arthawiguna, W. A., Mayer, L., Bamshad, M., Jorde, L. B., & Hammer, M. F. (2005). Balinese Y-chromosome perspective on the peopling of Indonesia: Genetic contributions from pre-Neolithic hunter-gatherers, Austronesian farmers, and Indian traders. Human Biology, 77(1), 93-114. doi:10.1353/hub.2005.0030. The island of Bali lies near the center of the southern chain of islands in the Indonesian archipelago, which served as a stepping-stone for early migrations of hunter-gatherers to Melanesia and Australia and for more recent migrations of Austronesian farmers from mainland Southeast Asia to the Pacific. Bali is the only Indonesian island with a population that currently practices the Hindu religion and preserves various other Indian cultural, linguistic, and artistic traditions (Lansing 1983). Here, we examine genetic variation on the Y chromosomes of 551 Balinese men to investigate the relative contributions of Austronesian farmers and pre-Neolithic hunter-gatherers to the contemporary Balinese paternal gene pool and to test the hypothesis of recent paternal gene flow from the Indian subcontinent. Seventy-one Y-chromosome binary polymorphisms (single nucleotide polymorphisms, SNPs) and 10 Y-chromosome-linked short tandem repeats (STRs) were genotyped on a sample of 1,989 Y chromosomes from 20 populations representing Indonesia (including Bali), southern China, Southeast Asia, South Asia, the Near East, and Oceania. SNP genotyping revealed 22 Balinese lineages, 3 of which (O-M95, O-M119, and O-M122) account for nearly 83.7% of Balinese Y chromosomes. Phylogeographic analyses suggest that all three major Y-chromosome haplogroups migrated to Bali with the arrival of Austronesian speakers; however, STR diversity patterns associated with these haplogroups are complex and may be explained by multiple waves of Austronesian expansion to Indonesia by different routes. Approximately 2.2% of contemporary Balinese Y chromosomes (i.e., K-M9*, K-M230, and M lineages) may represent the pre-Neolithic component of the Indonesian paternal gene pool. In contrast, eight other haplogroups (e.g., within H, J, L, and R), making up approximately 12% of the Balinese paternal gene pool, appear to have migrated to Bali from India. These results indicate that the Austronesian expansion had a profound effect on the composition of the Balinese paternal gene pool and that cultural transmission from India to Bali was accompanied by substantial levels of gene flow.
 Lansing, J. S., Redd, A. J., Karafet, T. M., Watkins, J., Ardika, I. W., Surata, S. P., Schoenfelder, J. S., Campbell, M., Merriwether, A. M., & Hammer, M. F. (2004). An Indian trader in ancient Bali? Antiquity, 78(300), 287-293. DNA analysis of a tooth found with imported pottery in Bali offers a strong possibility of the presence of a trader of Indian extraction in the late first millennium BC.
 Watkins, J. C. (2004). The role of marriage rules in the structure of genetic relatedness. Theoretical Population Biology, 66(1), 13-24. PMID: 15225572. In this work, we take a forward-in-time approach to compute the probabilities of nonidentity by descent for a population consisting of n sections obeying one of a class of marriage rules that is invariant under cyclical relabeling of sections. A perturbation method allows for exact asymptotics using the reciprocal of the section population as a small parameter. The analysis yields relatedness measures that generalize Wright's F-statistics. © 2004 Elsevier Inc. All rights reserved.
 Anderson, K. R., Mendelson, N. H., & Watkins, J. C. (2000). A new mathematical approach predicts individual cell growth behavior using bacterial population information. Journal of Theoretical Biology, 202(1), 87-94. doi:10.1006/jtbi.1999.1051. PMID: 10623502. A theoretical methodology has been developed for studying the growth kinetics of bacterial cells. It utilizes the steady-state cell length distribution in a bacterial population to predict the dependency of growth and division rates on cell length and age. The mathematical model has been applied to the analysis of two bacterial populations, a wild-type strain of Bacillus subtilis, and a minicell-producing strain that carries the divIVB1 mutation. The results show that our model describes the wild-type population very well and that the assumptions typically used in traditional methods are unrealistic. In the case of the minicell-producing mutant we find evidence that the rate of cell division must be a function not only of cell size but also of cell age. © 2000 Academic Press.
 DeGrandi-Hoffman, G., & Watkins, J. C. (2000). The foraging activity of honey bees Apis mellifera and non-Apis bees on hybrid sunflowers (Helianthus annuus) and its influence on cross-pollination and seed set. Journal of Apicultural Research, 39(1-2), 37-45. The repercussions of concurrent foraging by honey bee (Apis mellifera) and non-Apis bee populations on cross-pollination and seed set in hybrid sunflowers (Helianthus annuus) were investigated. The amount of sunflower pollen on the bodies of honey bees foraging in rows of male-sterile (MS) sunflowers was positively correlated with the size of the non-Apis bee population. The combined population of non-Apis bees and honey bees foraging on male-fertile (MF) and MS sunflowers also was positively correlated to seed set in MS rows. There were more honey bees than non-Apis bees foraging in MF and MS rows, but there was no evidence of competition for resources between the two populations. The size of the honey bee population was positively correlated to the area of open flowers on sunflower capitula, while the non-Apis population remained relatively constant throughout bloom. Results from this study indicate that a combined honey bee and non-Apis bee population might result in better pollination of hybrid sunflowers than either population alone.
 DeGrandi-Hoffman, G., Watkins, J., Guerrero, P., & Erickson, E. (2000). Using honey bees to teach mathematics and science to high school students. American Bee Journal, 140(4), 293-295. Honey bees have been an integral part of human civilization for centuries. They pollinate crops and supply us with honey and pollen. In fact, humans use almost everything that honey bees collect or produce, from royal jelly in cosmetics to wax for candles and propolis for finishing the wood of fine musical instruments. Honey bees also have another function. They make great tools for teaching the fundamentals of mathematics and biology to students of all ages.
 Watkins, J. C. (2000). Consistency and fluctuation theorems for discrete time structured population models having demographic stochasticity. Journal of Mathematical Biology, 41(3), 253-271. PMID: 11072758. Abstract: In this paper we prove a consistency theorem (law of large numbers) and a fluctuation theorem (central limit theorem) for structured population processes. The basic assumptions for these theorems are that the individuals have no statistically distinguishing features beyond their class and that the interaction between any two individuals is not too high. We apply these results to density dependent models of Leslie type and to a model for flour beetle dynamics.
 Mendelson, N. H., Bourque, A., Wilkening, K., Anderson, K. R., & Watkins, J. C. (1999). Organized cell swimming motions in Bacillus subtilis colonies: Patterns of short-lived whirls and jets. Journal of Bacteriology, 181(2), 600-609. doi:10.1128/jb.181.2.600-609.1999. PMID: 9882676; PMCID: PMC93416. Abstract: The swimming motions of cells within Bacillus subtilis colonies, as well as the associated fluid flows, were analyzed from video films produced during colony growth and expansion on wet agar surfaces. Individual cells in very wet dense populations moved at rates between 76 and 116 μm/s. Swimming cells were organized into patterns of whirls, each approximately 1,000 μm², and jets of about 95 by 12 μm. Whirls and jets were short-lived, lasting only about 0.25 s. Patterns within given areas constantly repeated with a periodicity of approximately 1 s. Whirls of a given direction became disorganized and then re-formed, usually into whirls moving in the opposite direction. Pattern elements were also organized with respect to one another in the colony. Neighboring whirls usually turned in opposite directions. This correlation decreased as a function of distance between whirls. Fluid flows associated with whirls and jets were measured by observing the movement of marker latex spheres added to colonies. The average velocity of markers traveling in whirls was 19 μm/s, whereas those traveling in jets moved at 27 μm/s. The paths followed by markers were aligned with the direction of cell motion, suggesting that cells create flows moving with them into whirls and along jets. When colonies became dry, swimming motions ceased except in regions close to the periphery and in isolated islands where cells traveled in slow whirls at about 4 μm/s. The addition of water resulted in immediate though transient rapid swimming (> 80 μm/s) in characteristic whirl and jet patterns. The rate of swimming decreased to 13 μm/s within 2 min, however, as the water diffused into the agar. Organized swimming patterns were nevertheless preserved throughout this period. These findings show that cell swimming in colonies is highly organized.
 DeGrandi-Hoffman, G., & Watkins, J. C. (1998). Queen development time and the Africanization of European honey bees. American Bee Journal, 138(6), 467-469. Abstract: Organisms that inhabit an area thrive because they have become well adapted to the environmental conditions that surround them. The general rule is that the distinguishing characteristics of a species change very slowly over time. Suppose an individual of a particular species migrates into a territory in which that same species is established. If the migrant individual is similar to the current inhabitants, then the small differences in traits that the new arrival might bring will mix into the population and the distinguishing traits will rarely, if ever, be seen. On the other hand, if the immigrant has distinctive genetic characteristics, it often cannot compete for survival and reproduction with the resident population. Thus, these distinguishing characteristics are rapidly removed from the population's gene pool.
 DeGrandi-Hoffman, G., Watkins, J. C., Collins, A. M., Loper, G. M., Martin, J. H., Arias, M. C., & Sheppard, W. S. (1998). Queen developmental time as a factor in the Africanization of European honey bee (Hymenoptera: Apidae) populations. Annals of the Entomological Society of America, 91(1), 52-58. doi:10.1093/aesa/91.1.52. Abstract: The development times of daughter queens from African and European matrilines mated to both African and European drones were recorded. Regardless of the matriline, African patriline queens completed their development and emerged 8–12 h before those with European paternity. A probability distribution function derived from the emergence time data indicated that because of differences in development times between patrilines, the probability that an African patriline queen will emerge 1st can be 2–3 times greater than the proportion of the African patrilines in the colony population. Because the 1st queen to emerge has the best chance of becoming the colony's new queen, differences in queen development times between African and European patrilines might be a factor contributing to the asymmetrical gene flow between African and European honey bee, Apis mellifera L., populations, and the eventual loss of European nuclear markers and behavioral attributes in European honey bee populations where African bees have migrated.
 Watkins, J. C. (1997). Mechanical models for cell movement  Locomotion, translocation, migration. Journal of Applied Probability, 34(4), 827-846. Abstract: This paper provides a detailed stochastic analysis of leucocyte cell movement based on the dynamics of a rigid body. The cell's behavior is studied in two relevant anisotropic environments displaying adhesion mediated movement (haptotaxis) and stimulus mediated movement (chemotaxis). This behavior is modeled by diffusion processes on three successively longer time scales, termed locomotion, translocation, and migration.
 Watkins, J. C. (1996). Review of "Lectures on Random Evolutions," by Mark A. Pinsky. Annals of Probability, 24(3), 1647-1652. doi:10.1214/aop/1065725198
 Heubach, S., & Watkins, J. C. (1995). A stochastic model for the movement of a white blood cell. Advances in Applied Probability, 27(2), 443-475. doi:10.2307/1427835. Abstract: We present a stochastic model for the movement of a white blood cell both in uniform concentration of chemoattractant and in the presence of a chemoattractant gradient. It is assumed that the rotational velocity is proportional to the weighted difference of the occupied receptors in the two halves of the cell and that each of the receptors stays free or occupied for an exponential length of time. We define processes corresponding to a cell with 2nP + 1 receptors (receptor sites). In the case of constant concentration, we show that the limiting process for the rotational velocity is an Ornstein-Uhlenbeck process. Its drift coefficient depends on the parameters of the exponential waiting times and its diffusion coefficient depends in addition also on the weight function. In the inhomogeneous case, the velocity process has a diffusion limit with drift coefficient depending on the concentration gradient and diffusion coefficient depending on the concentration and the weight function.
 Watkins, J. C., & Woessner, B. (1991). Diffusion models for chemotaxis: a statistical analysis of noninteractive unicellular movement. Mathematical Biosciences, 104(2), 271-303. PMID: 1804464. Abstract: A program is developed for applying stochastic differential equations to models for chemotaxis. First a few of the experimental and theoretical models for chemotaxis both for swimming bacteria and for cells migrating along a substrate are reviewed. In physical and biological models of deterministic systems, finite difference equations are often replaced by a limiting differential equation in order to take advantage of the ease in the use of calculus. A similar but more intricate methodology is developed here for stochastic models for chemotaxis. This exposition is possible because recent work in probability theory gives ease in the use of the stochastic calculus for diffusions and broad applicability in the convergence of stochastic difference equations to a stochastic differential equation. Stochastic differential equations suggest useful data for the model and provide statistical tests. We begin with phenomenological considerations as we analyze a one-dimensional model proposed by Boyarsky, Noble, and Peterson in their study of human granulocytes. In this context, a theoretical model consists in identifying which diffusion best approximates a model for cell movement based upon theoretical considerations of cell physiology. Such a diffusion approximation theorem is presented along with discussion of the relationship between autocovariance and persistence. Both the stochastic calculus and the diffusion approximation theorem are described in one dimension. Finally, these tools are extended to multidimensional models and applied to a three-dimensional experimental setup of spherical symmetry. © 1991.
 Watkins, J. C. (1990). A remark on Kunita's decomposition theorem. Stochastic Processes and their Applications, 35(1), 81-85. Abstract: We use Michel Emery's stability theorem for stochastic differential equations to give a short proof for explicit solutions to linear stochastic differential equations over a solvable Lie group. © 1990.
 Watkins, J. C. (1989). Donsker's Invariance Principle for Lie Groups. Annals of Probability, 17(3), 1220-1242. doi:10.1214/aop/1176991265. Abstract: This paper establishes a functional central limit theorem for Lie groups under a mixing hypothesis. The main theorem generalizes results by Patrick Billingsley for Euclidean space and the author for the general linear group.
 Watkins, J. C. (1987). A companion to the Oseledec multiplicative ergodic theorem. Proceedings of the American Mathematical Society, 99(4), 772-776. doi:10.1090/S0002-9939-1987-0877055-7 (also doi:10.2307/2046491). Abstract: Let $F_1, F_2, \ldots$ be a stationary sequence of continuously differentiable mappings from $[0,1]$ into the set of $d \times d$ matrices. Assume $F_k(0) = I$ for each $k$ and $E[\sup_{0 \leq p \leq 1} \|F'_k(p)\|] < \infty$. Let $\mathcal{I}$ denote the invariant sigma field for the sequence. Then \[ \lim_{n \to \infty} F_n\left(\frac{1}{n}\right) \cdots F_2\left(\frac{1}{n}\right) F_1\left(\frac{1}{n}\right) = \exp E[F'_1(0) \mid \mathcal{I}] \] with probability one.
 Watkins, J. C. (1987). Functional central limit theorems and their associated large deviation principles for products of random matrices. Probability Theory and Related Fields, 76(2), 133-166. Abstract: This paper establishes a functional central limit theorem for a product of random matrices. The sequence of matrices forms a stationary process which is φ-mixing. The individual matrices in the product become closer and closer to the identity matrix with longer and longer products. In addition, these perturbations from the identity matrix have mean zero. A large deviation principle for the limit process is proved. © 1987 Springer-Verlag.
 Watkins, J. C. (1986). Limit theorems for products of random matrices: a comparison of two points of view. Random matrices and their applications, 50, 5-22. doi:10.1090/conm/050/841078
 Watkins, J. C. (1985). A stochastic integral representation for random evolutions. Annals of Probability, 13(2), 531-557. doi:10.1214/aop/1176993007
 Watkins, J. C. (1985). Limit theorems for stationary random evolutions. Stochastic Processes and their Applications, 19(2), 189-224. Abstract: On a separable Banach space, let $A(\xi_1), A(\xi_2), \ldots$ be a strictly stationary sequence of infinitesimal operators, centered so that $E A(\xi_i) = 0$, $i = 1, 2, \ldots$. This paper characterizes the limit of the random evolutions $Y_n(t) = \exp\left(\frac{1}{n} A(\xi_{[n^2 t]})\right) \cdots \exp\left(\frac{1}{n} A(\xi_2)\right) \exp\left(\frac{1}{n} A(\xi_1)\right) Y_n(0)$ as the solution to a martingale problem. This work is a direct extension of previous work on i.i.d. random evolutions. © 1985.
 Watkins, J. C. (1984). A Central Limit Problem in Random Evolutions. Annals of Probability, 12(2), 480-513. doi:10.1214/aop/1176993302. Abstract: Let $\{T_n; n \geq 1\}$ be a sequence of independent and identically distributed strongly continuous semigroups on a separable Banach space. The corresponding generators $\{A_n; n \geq 1\}$ satisfy $E[A_n] = 0$. Conditions are given to guarantee that the weak limit $Y(t) = \lim_{n \to \infty} T_{[n^2 t]}(1/n) \cdots T_2(1/n)\, T_1(1/n)\, Y_n(0)$ exists, and is characterized as the unique solution of a martingale problem. Transport phenomena, random classical mechanics, and families of bounded operators are the featured examples.
Proceedings Publications
 Ahmed, R., Angelini, P., Efrat, A., Glickenstein, D., Gronemann, M., Heinsohn, N., Kobourov, S. G., Sahneh, F. D., Spence, R., Watkins, J. C., & Wolff, A. (2018). Multi-Level Steiner Trees. In Symposium on Experimental Algorithms. Abstract: In the classical Steiner tree problem, one is given an undirected, connected graph $G=(V,E)$ with nonnegative edge costs and a set of terminals $T \subseteq V$. The objective is to find a minimum-cost edge set $E' \subseteq E$ that spans the terminals. The problem is APX-hard; the best known approximation algorithm has a ratio of $\rho = \ln(4)+\varepsilon < 1.39$. In this paper, we study a natural generalization, the multi-level Steiner tree (MLST) problem: given a nested sequence of terminals $T_1 \subset \cdots \subset T_k \subseteq V$, compute nested edge sets $E_1 \subseteq \cdots \subseteq E_k \subseteq E$ that span the corresponding terminal sets with minimum total cost. The MLST problem and variants thereof have been studied under names such as Quality-of-Service Multicast tree, Grade-of-Service Steiner tree, and Multi-Tier tree. Several approximation results are known. We first present two natural heuristics with approximation factor $O(k)$. Based on these, we introduce a composite algorithm that requires $2^k$ Steiner tree computations. We determine its approximation ratio by solving a linear program. We then present a method that guarantees the same approximation ratio and needs at most $2k$ Steiner tree computations. We compare five algorithms experimentally on several classes of graphs using four types of graph generators. We also implemented an integer linear program for MLST to provide ground truth. Our combined algorithm outperforms the others both in theory and in practice when the number of levels is small (k
Presentations
 Watkins, J. C. (2022, April). Epidemics from the Eye of the Pathogen. Mathematical Biology Seminar. virtual: Arizona State University.
 Watkins, J. C. (2022, January/March). Data Sciences Academy at the University of Arizona. Academic Data Science Alliance Annual Conference. Online/Irvine, California: Academic Data Science Alliance.
 Watkins, J. C., Gentry, B., & Richardson, M. (2020, August). Indigenous Language Migration along the US Southern Border, the View from Arizona. Joint Statistical Meetings. virtual: American Statistical Association. Panel presentation for the ASA Committee on Human Rights.
 Watkins, J. C. (2018, January). Data and the Human Condition. AAAS. AAAS Washington DC: AAAS.
 Watkins, J. C. (2018, October). Multilevel Steiner Trees. TRIPODS Conference. Santa Clara, California: University of California, Santa Cruz.
 Watkins, J. C. (2017, September). On the Human Condition. Biostatistics Seminar. University of Arizona: College of Public Health.
 Watkins, J. C. (2014, June). BEEPOP: The Population Dynamics of the Honey Bee in the Hive and in the Wild. BioQuest/HHMI. University of Delaware: BioQuest/HHMI. Workshop on the Native American Summer Program at the University of Arizona.
 Watkins, J. C. (2014, September). Curriculum & Classroom Strategies in a Statistics Course for Life Sciences Majors/Math Minors. High Performance Computing in Undergraduate Quantitative Biology. Cold Spring Harbor, New York: Cold Spring Harbor.
Poster Presentations
 Watkins, J. C. (2018, October). A novel nonlinear dimension reduction approach to infer population structure for low-coverage sequencing data. American Society of Human Genetics Annual Meeting. San Diego, California: American Society of Human Genetics.
 Bender, C., Watkins, J. C., & Tolbert, L. P. (2012, October). Assessing Undergraduate Research and BioMath Efforts at the University of Arizona. Howard Hughes Medical Institute Program Directors' Meeting. Chevy Chase, Maryland: Howard Hughes Medical Institute.