UA Profiles | Home

Xueying Tang

  • Associate Professor, Mathematics
  • Member of the Graduate Faculty
  • Associate Professor, Applied Mathematics - GIDP
Contact
  • xytang@arizona.edu

Awards

  • Best Reviewer Award
    • Psychometric Society, Summer 2024
  • Elected to Arizona Alpha Chapter of Mu Sigma Rho (National Honor Society in Statistics)
    • Spring 2024
  • Outstanding Reviewer
    • American Educational Research Association and Journal of Educational and Behavioral Statistics, Spring 2021

Interests

No activities entered.

Courses

2025-26 Courses

  • Independent Study
    STAT 599 (Spring 2026)
  • Adv Stat Regress Analys
    MATH 571A (Fall 2025)
  • Adv Stat Regress Analys
    STAT 571A (Fall 2025)
  • Theory of Statistics
    MATH 466 (Fall 2025)

2024-25 Courses

  • Independent Study
    DATA 499 (Spring 2025)
  • Adv Stat Regress Analys
    MATH 571A (Fall 2024)
  • Adv Stat Regress Analys
    STAT 571A (Fall 2024)
  • Independent Study
    DATA 499 (Fall 2024)

2023-24 Courses

  • Dissertation
    MATH 920 (Spring 2024)
  • Dissertation
    STAT 920 (Spring 2024)
  • Theory of Statistics
    MATH 466 (Spring 2024)
  • Theory of Statistics
    MATH 566 (Spring 2024)
  • Theory of Statistics
    STAT 566 (Spring 2024)
  • Dissertation
    MATH 920 (Fall 2023)
  • Dissertation
    STAT 920 (Fall 2023)
  • Statistical Machine Learning
    MATH 574M (Fall 2023)

2022-23 Courses

  • Dissertation
    STAT 920 (Spring 2023)
  • Honors Thesis
    DATA 498H (Spring 2023)
  • Theory of Statistics
    MATH 466 (Spring 2023)
  • Theory of Statistics
    MATH 566 (Spring 2023)
  • Theory of Statistics
    STAT 566 (Spring 2023)
  • Dissertation
    STAT 920 (Fall 2022)
  • Honors Thesis
    DATA 498H (Fall 2022)
  • Theory of Statistics
    MATH 466 (Fall 2022)

2021-22 Courses

  • Independent Study
    STAT 599 (Spring 2022)
  • Research
    STAT 900 (Spring 2022)
  • Theory of Statistics
    MATH 466 (Spring 2022)
  • Theory of Statistics
    MATH 566 (Spring 2022)
  • Theory of Statistics
    STAT 566 (Spring 2022)
  • Theory of Statistics
    MATH 466 (Fall 2021)

2020-21 Courses

  • Theory of Statistics
    MATH 566 (Spring 2021)
  • Theory of Statistics
    STAT 566 (Spring 2021)
  • Adv Stat Regress Analys
    MATH 571A (Fall 2020)
  • Adv Stat Regress Analys
    STAT 571A (Fall 2020)
  • Thesis
    STAT 910 (Fall 2020)

2019-20 Courses

  • Theory of Statistics
    MATH 466 (Spring 2020)
  • Theory of Statistics
    MATH 466 (Fall 2019)

Scholarly Contributions

Journals/Publications

  • Tang, X., Liu, J., & Ying, Z. (2025). A path signature perspective of process data feature extraction. British Journal of Mathematical and Statistical Psychology, 78. doi:10.1111/bmsp.12390
    Computer-based interactive items have become prevalent in recent educational assessments. In such items, the entire human-computer interactive process is recorded in a log file and is known as the response process. These data are noisy, diverse, and in a nonstandard format. Several feature extraction methods have been developed to overcome the difficulties in process data analysis. However, these methods often focus on the action sequence and ignore the time sequence in response processes. In this paper, we introduce a new feature extraction method that incorporates the information in both the action sequence and the response time sequence. The method is based on the concept of path signature from stochastic analysis. We apply the proposed method to both simulated data and real response process data from PIAAC. A prediction framework is used to show that taking time information into account provides a more comprehensive understanding of respondents' behaviors.
  • Zhu, Z., & Tang, X. (2025). Modeling sparsity with super heavy-tailed priors. Electronic Journal of Statistics, 19(1). doi:10.1214/25-ejs2370
    Sparsity is often a desired structure for parameters in high-dimensional statistical problems. Within a Bayesian framework, sparsity is usually induced by spike-and-slab priors or global-local shrinkage priors. The latter choice is often expressed as a scale mixture of normal distributions. It marginally places a polynomial-tailed distribution on the parameter. In general, a heavier-tailed distribution performs better in estimating sparse parameters. We consider the log Cauchy prior and, more generally, super heavy-tailed priors in the normal mean estimation problem. This class of priors is proper while having a tail order arbitrarily close to one. The resulting posterior mean is a shrinkage estimator, and the posterior contraction rate is sharp minimax. The empirical performance of these priors is demonstrated through simulations and a real data example.
  • Tang, X. (2024). A Latent Hidden Markov Model for Process Data. Psychometrika, 89(1). doi:10.1007/s11336-023-09938-1
    Response process data from computer-based problem-solving items describe respondents’ problem-solving processes as sequences of actions. Such data provide a valuable source for understanding respondents’ problem-solving behaviors. Recently, data-driven feature extraction methods have been developed to compress the information in unstructured process data into relatively low-dimensional features. Although the extracted features can be used as covariates in regression or other models to understand respondents’ response behaviors, the results are often not easy to interpret since the relationship between the extracted features and the original response process is often not explicitly defined. In this paper, we propose a statistical model for describing response processes and how they vary across respondents. The proposed model assumes a response process follows a hidden Markov model given the respondent’s latent traits. The structure of hidden Markov models resembles problem-solving processes, with the hidden states interpreted as problem-solving subtasks or stages. Incorporating the latent traits in hidden Markov models enables us to characterize the heterogeneity of response processes across respondents in a parsimonious and interpretable way. We demonstrate the performance of the proposed model through simulation experiments and case studies of PISA process data.
  • Tang, X., & Zhang, L. (2024). A hierarchical gamma prior for modeling random effects in small area estimation. Survey Methodology, 50(1).
    Small area estimation (SAE) is becoming increasingly popular among survey statisticians. Since the direct estimates of small areas usually have large standard errors, model-based approaches are often adopted to borrow strength across areas. SAE models often use covariates to link different areas and random effects to account for the additional variation. Recent studies showed that random effects are not necessary for all areas, so global-local (GL) shrinkage priors have been introduced to effectively model the sparsity in random effects. The GL priors vary in tail behavior, and their performance differs under different sparsity levels of random effects. As a result, one needs to fit the model with different choices of priors and then select the most appropriate one based on the deviance information criterion or other evaluation metrics. In this paper, we propose a flexible prior for modeling random effects in SAE. The hyperparameters of the prior determine the tail behavior and can be estimated in a fully Bayesian framework. Therefore, the resulting model is adaptive to the sparsity level of random effects without repetitive fitting. We demonstrate the performance of the proposed prior via simulations and real applications.
  • Zhang, S., Tang, X., He, Q., Liu, J., & Ying, Z. (2024). External Correlates of Adult Digital Problem-Solving Process: An Empirical Analysis of PIAAC PSTRE Action Sequences. Zeitschrift für Psychologie / Journal of Psychology, 232(2). doi:10.1027/2151-2604/a000554
    Computerized assessments and interactive simulation tasks are increasingly popular and afford the collection of process data, i.e., an examinee's sequence of actions (e.g., clickstreams, keystrokes) that arises from interactions with each task. Action sequence data contain rich information on the problem-solving process but are in a nonstandard, variable-length discrete sequence format. Two methods that directly extract features from the raw action sequences, namely multidimensional scaling and sequence-to-sequence autoencoders, produce multidimensional numerical features that summarize original sequence information. This study explores the utility of action sequence features in understanding how problem-solving behavior relates to cognitive proficiencies and demographic characteristics. This is empirically illustrated with the process data from the 2012 PIAAC PSTRE digital assessment. Regularized regression results showed that action sequence features are more predictive of examinees' demographic and cognitive characteristics compared to final outcomes. Partial least squares analysis further aided the identification of behavioral patterns systematically associated with demographic/cognitive characteristics.
  • Zhang, S., Tang, X., Wang, Z., Liu, J., & Ying, Z. (2023). External Correlates of Adult Digital Problem-Solving Process: An Empirical Analysis of PIAAC PSTRE Action Sequences. Zeitschrift für Psychologie.
  • Tang, X. (2023). A latent hidden Markov model for response process data. Psychometrika.
  • Tang, X., & Ghosh, M. (2023). Global-Local Priors for Spatial Small Area Estimation. Calcutta Statistical Association Bulletin. doi:10.1177/00080683231186378
  • Tang, X., Wang, Z., Liu, J., & Ying, Z. (2023). Subtask analysis of process data through a predictive model. British Journal of Mathematical and Statistical Psychology, 76(1), 211-235. doi:10.1111/bmsp.12290
  • Ghosh, T., Ghosh, M., Maples, J., & Tang, X. (2022). Multivariate global-local priors for small area estimation. Stats, 5(3), 673-688. doi:10.3390/stats5030040
  • Lippitt, W., Sethuraman, S., & Tang, X. (2022). Stationarity and Inference in Multistate Promoter Models of Stochastic Gene Expression via Stick-Breaking Measures. SIAM Journal on Applied Mathematics, 82(6), 1953-1986. doi:10.1137/21m1440876
  • Tang, X., Wang, Z., Liu, J., & Ying, Z. (2021). An exploratory analysis of the latent structure of process data via action sequence autoencoders. British Journal of Mathematical and Statistical Psychology, 74(1), 1-33. doi:10.1111/bmsp.12203
  • Tang, X., Zhang, S., Wang, Z., Liu, J., & Ying, Z. (2021). ProcData: An R Package for Process Data Analysis. Psychometrika.
  • Tang, X., Wang, Z., & Liu, J. (2020). Statistical Analysis of Multi-Relational Network Recovery. Frontiers in Applied Mathematics and Statistics.
  • Tang, X., Wang, Z., He, Q., Liu, J., & Ying, Z. (2020). Latent feature extraction for process data via multidimensional scaling. Psychometrika.
  • Merrill, H. R., Tang, X., & Bliznyuk, N. (2019). Spatio-temporal additive regression model selection for urban water demand. Stochastic Environmental Research and Risk Assessment, 33(4-6). doi:10.1007/s00477-019-01682-2
    Understanding the factors influencing urban water use is critical for meeting demand and conserving resources. To analyze the relationships between urban household-level water demand and potential drivers, we develop a method for Bayesian variable selection in partially linear additive regression models, particularly suited for high-dimensional spatio-temporally dependent data. Our approach combines a spike-and-slab prior distribution with a modified version of the Bayesian group lasso to simultaneously perform selection of null, linear, and nonlinear models and to penalize regression splines to prevent overfitting. We investigate the effectiveness of the proposed method through a simulation study and provide comparisons with existing methods. We illustrate the methodology on a case study to estimate and quantify uncertainty of the associations between several environmental and demographic predictors and spatio-temporally varying household-level urban water demand in Tampa, FL.
  • Tang, X., Chen, Y., Li, X., Liu, J., & Ying, Z. (2019). A reinforcement learning approach to personalized learning recommendation systems. British Journal of Mathematical and Statistical Psychology, 72(1). doi:10.1111/bmsp.12144
    Personalized learning refers to instruction in which the pace of learning and the instructional approach are optimized for the needs of each learner. With the latest advances in information technology and data science, personalized learning is becoming possible for anyone with a personal computer, supported by a data-driven recommendation system that automatically schedules the learning sequence. The engine of such a recommendation system is a recommendation strategy that, based on data from other learners and the performance of the current learner, recommends suitable learning materials to optimize certain learning outcomes. A powerful engine achieves a balance between making the best possible recommendations based on the current knowledge and exploring new learning trajectories that may potentially pay off. Building such an engine is a challenging task. We formulate this problem within the Markov decision framework and propose a reinforcement learning approach to solving the problem.
  • Tang, X., Yang, Y., Yu, H. J., Liao, Q. H., & Bliznyuk, N. (2019). A Spatio-Temporal Modeling Framework for Surveillance Data of Multiple Infectious Pathogens With Small Laboratory Validation Sets. Journal of the American Statistical Association, 114(528). doi:10.1080/01621459.2019.1585250
    Many surveillance systems of infectious diseases are syndrome-based, capturing patients by clinical manifestation. Only a fraction of patients, mostly severe cases, undergo laboratory validation to identify the underlying pathogen. Motivated by the need to understand transmission dynamics and associate risk factors of enteroviruses causing the hand, foot, and mouth disease (HFMD) in China, we developed a Bayesian spatio-temporal modeling framework for surveillance data of infectious diseases with small validation sets. A novel approach was proposed to sample unobserved pathogen-specific patient counts over space and time and was compared to an existing sampling approach. The practical utility of this framework in identifying key parameters was assessed in simulations for a range of realistic sizes of the validation set. Several designs of sampling patients for laboratory validation were compared with and without aggregation of sparse validation data. The methodology was applied to the 2009 HFMD epidemic in southern China to evaluate transmissibility and the effects of climatic conditions for the leading pathogens of the disease, enterovirus 71 and Coxsackie A16. Supplementary materials for this article are available online.
  • Tang, X., Ghosh, M., Ha, N. S., & Sedransk, J. (2018). Modeling Random Effects Using Global–Local Shrinkage Priors in Small Area Estimation. Journal of the American Statistical Association, 113(524). doi:10.1080/01621459.2017.1419135
    Small area estimation is becoming increasingly popular for survey statisticians. One very important program is Small Area Income and Poverty Estimation undertaken by the United States Bureau of the Census, which aims at providing estimates related to income and poverty based on American Community Survey data at the state level and even at lower levels of geography. This article introduces global–local (GL) shrinkage priors for random effects in small area estimation to capture wide area level variation when the number of small areas is very large. These priors employ two levels of parameters, global and local parameters, to express variances of area-specific random effects so that both small and large random effects can be captured properly. We show via simulations and data analysis that use of the GL priors can improve estimation results in most cases. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
  • Tang, X., Xu, X., Ghosh, M., & Ghosh, P. (2018). Bayesian Variable Selection and Estimation Based on Global-Local Shrinkage Priors. Sankhya A, 80(2). doi:10.1007/s13171-017-0118-2
    We consider in this paper simultaneous Bayesian variable selection and estimation for linear regression models with global-local shrinkage priors on the regression coefficients. We propose a variable selection procedure that selects a variable if the ratio of the posterior mean of its regression coefficient to the corresponding ordinary least squares estimate is greater than a half. The regression coefficient is estimated by the posterior mean or zero depending on whether the corresponding variable is selected or not. Under the assumption of orthogonal designs, we prove that if the local parameters have polynomial-tailed priors, the proposed method enjoys the oracle property in the sense that it can achieve variable selection consistency and optimal estimation rate at the same time. However, if, instead, an exponential-tailed prior is used for the local parameters, the proposed method has variable selection consistency but not the optimal estimation rate. We show via simulation and real data examples that our proposed selection mechanism works for nonorthogonal designs as well.
  • Tang, X., Li, K., & Ghosh, M. (2017). Bayesian multiple testing under sparsity for polynomial-tailed distributions. Statistica Sinica, 27(3). doi:10.5705/ss.202015.0362
    This paper considers Bayesian multiple testing under sparsity for polynomial-tailed distributions satisfying a monotone likelihood ratio property. Included in this class of distributions are the Student's t, the Pareto, and many other distributions. We prove some general asymptotic optimality results under fixed and random thresholding. As examples of these general results, we establish the Bayesian asymptotic optimality of several multiple testing procedures in the literature for appropriately chosen false discovery rate levels. We also show by simulation that the Benjamini-Hochberg procedure with a false discovery rate level different from the asymptotically optimal one can lead to high Bayes risk.
  • Ghosh, P., Tang, X., Ghosh, M., & Chakrabarti, A. (2016). Asymptotic properties of Bayes risk of a general class of shrinkage priors in multiple hypothesis testing under sparsity. Bayesian Analysis, 11(3). doi:10.1214/15-ba973
    Consider the problem of simultaneous testing for the means of independent normal observations. In this paper, we study some asymptotic optimality properties of certain multiple testing rules induced by a general class of one-group shrinkage priors in a Bayesian decision theoretic framework, where the overall loss is taken as the number of misclassified hypotheses. We assume a two-groups normal mixture model for the data and consider the asymptotic framework adopted in Bogdan et al. (2011), who introduced the notion of asymptotic Bayes optimality under sparsity in the context of multiple testing. The general class of one-group priors under study is rich enough to include, among others, the families of three parameter beta, generalized double Pareto priors, and in particular the horseshoe, the normal-exponential-gamma and the Strawderman-Berger priors. We establish that within our chosen asymptotic framework, the multiple testing rules under study asymptotically attain the risk of the Bayes Oracle up to a multiplicative factor, with the constant in the risk close to the constant in the Oracle risk. This is similar to a result obtained in Datta and Ghosh (2013) for the multiple testing rule based on the horseshoe estimator introduced in Carvalho et al. (2009, 2010). We further show that under a very mild assumption on the underlying sparsity parameter, the induced decisions using an empirical Bayes estimate of the corresponding global shrinkage parameter proposed by van der Pas et al. (2014) asymptotically attain the optimal Bayes risk up to the same multiplicative factor. We provide a unifying argument applicable for the general class of priors under study. In the process, we settle a conjecture regarding the optimality property of the generalized double Pareto priors made in Datta and Ghosh (2013). Our work also shows that the result in Datta and Ghosh (2013) can be improved further.

Presentations

  • Tang, X. (2024, April). A Latent Hidden Markov Model for Process Data. Arizona Data Science Day.
  • Tang, X. (2024, August). Hidden Markov Cognitive Diagnostic Models for Response Process Data. Joint Statistical Meetings.
  • Tang, X. (2024, December). A Hierarchical Gamma Prior for Modeling Random Effects in Small Area Estimation. 17th International Conference of the ERCIM WG on Computational and Methodological Statistics.
  • Tang, X. (2023, April). Modeling Sparsity Using Log-Cauchy Priors. Statistics Seminar at the University of Pittsburgh.
  • Tang, X. (2023, August). Adaptive Bayesian Shrinkage of Random Effects in Small Area Estimation. Joint Statistical Meetings. Toronto, Canada.
  • Tang, X. (2023, December). Global-Local Priors for Spatial Small Area Estimation. 16th International Conference of the ERCIM WG on Computational and Methodological Statistics.
  • Tang, X. (2023, February). A Latent Hidden Markov Model for Response Process Data. Special Interest Group Seminar at ETS.
  • Tang, X. (2023, July). A Latent Hidden Markov Model for Response Process Data. International Meeting of the Psychometric Society. College Park, Maryland.
  • Tang, X. (2023, September). A Latent Hidden Markov Model for Response Process Data. Psychometrics Workshop at Columbia University.
  • Tang, X. (2022, April). Modeling sparsity using log Cauchy prior. University of Minnesota Statistics Seminar.
  • Tang, X. (2022, April). Subtask analysis of process data through a predictive model. Arizona State University Machine Learning Day.
  • Tang, X. (2022, December). Measurement Error Models with Global-Local Random Effects in Small Area Estimation. 15th International Conference of the ERCIM WG on Computational and Methodological Statistics. Online.
  • Tang, X. (2022, June). A latent hidden Markov model for response process data. International Chinese Statistical Association Applied Statistics Symposium.
  • Tang, X. (2022, October). Modeling sparsity using log-Cauchy prior. Arizona State University Statistical Seminar.
  • Tang, X. (2022, October). Modeling sparsity using log-Cauchy prior. University of Cincinnati Statistics Seminar.
  • Tang, X. (2021). Using log Cauchy priors for modeling sparsity. 14th International Conference of the ERCIM WG on Computational and Methodological Statistics. Virtual.
  • Tang, X. (2021, April). Subtask Analysis of Process Data Through a Predictive Model. The Ohio State University Biostatistics Seminar. Virtual.
  • Tang, X. (2021, June). Subtask Analysis of Process Data Through a Predictive Model. University of California Davis Statistics Seminar. Virtual.
  • Tang, X. (2021, October). Subtask Analysis of Process Data Through a Predictive Model. 34th New England Statistics Symposium. Virtual.
  • Tang, X. (2020, August). Bayesian Semiparametric Regression Model Selection with Correlated Errors. Joint Statistical Meetings.
  • Tang, X. (2020, December). Subtask Analysis of Process Data Through a Predictive Model. International Chinese Statistical Association Applied Statistics Symposium.
  • Tang, X. (2020, July). A Hidden Markov Model for Identifying Problem Solving Strategies in Process Data. International Meeting of the Psychometric Society.
  • Tang, X. (2020, July). Introduction to R package ProcData. Workshop on Statistical Learning for Process Data.

Profiles With Related Publications

  • Sunder Sethuraman

© 2026 The Arizona Board of Regents on behalf of The University of Arizona.