
Liangming Pan
- Assistant Professor, School of Information
- Member of the Graduate Faculty
Contact
- Richard P. Harvill Building, Rm. 409
- Tucson, AZ 85721
- liangmingpan@arizona.edu
Degrees
- Ph.D. Integrative Sciences and Engineering Programme (ISEP)
- National University of Singapore, Singapore
- Thesis: Towards Generating Deep Questions from Text
- M.S. Computer Science
- Tsinghua University, Beijing, China
- Thesis: NLP for Massive Open Online Courses (MOOCs)
- B.S. Software Engineering
- Beihang University, Beijing, China
Work Experience
- University of California, Santa Barbara (2022 - 2024)
Awards
- Best Paper Award (Runner-up)
- 3rd Table Representation Learning Workshop @ NeurIPS 2024, Fall 2024
Courses
2024-25 Courses
- INFO 621: Adv ML Apps (Spring 2025)
- INFO 692: Directed Research (Spring 2025)
- INFO 523: Data Mining/Discovery (Fall 2024)
Scholarly Contributions
Journals/Publications
- Pan, L., Saxon, M., Xu, W., Nathani, D., Wang, X., & Wang, W. (2024). Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Automated Correction Strategies. Transactions of the Association for Computational Linguistics, 12. doi:10.1162/tacl_a_00660. While large language models (LLMs) have shown remarkable effectiveness in various NLP tasks, they are still prone to issues such as hallucination, unfaithful reasoning, and toxicity. A promising approach to rectify these flaws is correcting LLMs with feedback, where the LLM itself is prompted or guided with feedback to fix problems in its own output. Techniques leveraging automated feedback, either produced by the LLM itself (self-correction) or by some external system, are of particular interest as they make LLM-based solutions more practical and deployable with minimal human intervention. This paper provides an exhaustive review of the recent advances in correcting LLMs with automated feedback, categorizing them into training-time, generation-time, and post-hoc approaches. We also identify potential challenges and future directions in this emerging field.
- Pan, L., Chen, J., Liu, S., Ngo, C., Kan, M., & Chua, T. (2021). A Hybrid Approach for Detecting Prerequisite Relations in Multi-Modal Food Recipes. IEEE Transactions on Multimedia, 23. doi:10.1109/TMM.2020.3042706. Modeling the structure of culinary recipes is the core of recipe representation learning. Current approaches mostly focus on extracting the workflow graph from recipes based on text descriptions. Process images, which constitute an important part of cooking recipes, have rarely been investigated in recipe structure modeling. We study this recipe structure problem from a multi-modal learning perspective, proposing a prerequisite tree to represent recipes with cooking images at a step-level granularity. We propose a simple yet effective two-stage framework to automatically construct the prerequisite tree for a recipe by (1) using a trained classifier that fuses multi-modal features to detect pairwise prerequisite relations, then (2) applying different strategies (greedy method, maximum weight, and beam search) to build the tree structure. Experiments on the MM-ReS dataset demonstrate the advantages of introducing process images for recipe structure modeling. Also, compared with neural methods, which require large amounts of training data, we show that our two-stage pipeline can achieve promising results using only 400 labeled prerequisite trees as training data.
Proceedings Publications
- Amayuelas, A., Yang, X., Antoniades, A., Hua, W., Pan, L., & Wang, W. (2024). MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate. In Findings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. Large Language Models (LLMs) have shown exceptional results on current benchmarks when working individually. The advancement in their capabilities, along with a reduction in parameter size and inference times, has facilitated the use of these models as agents, enabling interactions among multiple models to execute complex tasks. Such collaborations offer several advantages, including the use of specialized models (e.g., coding), improved confidence through multiple computations, and enhanced divergent thinking, leading to more diverse outputs. Thus, the collaborative use of language models is expected to grow significantly in the coming years. In this work, we evaluate the behavior of a network of models collaborating through debate under the influence of an adversary. We introduce pertinent metrics to assess the adversary's effectiveness, focusing on system accuracy and model agreement. Our findings highlight the importance of a model's persuasive ability in influencing others. Additionally, we explore inference-time methods to generate more compelling arguments and evaluate the potential of prompt-based mitigation as a defensive strategy.
- Wu, X., Dong, X., Pan, L., Nguyen, T., & Luu, A. (2024). Modeling Dynamic Topics in Chain-Free Fashion by Evolution-Tracking Contrastive Learning and Unassociated Word Exclusion. In Findings of Annual Meeting of the Association for Computational Linguistics (ACL), 2024. Dynamic topic models track the evolution of topics in sequential documents, supporting applications such as trend analysis and opinion mining. However, existing models suffer from repetitive-topic and unassociated-topic issues, failing to reveal the evolution and hindering further applications. To address these issues, we break the tradition of simply chaining topics in existing work and propose a novel neural Chain-Free Dynamic Topic Model. We introduce a new evolution-tracking contrastive learning method that builds similarity relations among dynamic topics. This not only tracks topic evolution but also maintains topic diversity, mitigating the repetitive-topic issue. To avoid unassociated topics, we further present an unassociated word exclusion method that consistently excludes unassociated words from discovered topics. Extensive experiments demonstrate that our model significantly outperforms state-of-the-art baselines, tracking topic evolution with high-quality topics, showing better performance on downstream tasks, and remaining robust to the evolution-intensity hyperparameter. Our code is available at https://github.com/bobxwu/CFDTM.
- Wu, X., Pan, L., Wang, W. Y., & Luu, A. T. (2024). AKEW: Assessing Knowledge Editing in the Wild. In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. Knowledge editing injects knowledge updates into language models to keep them correct and up-to-date. However, its current evaluations deviate significantly from practice: their knowledge updates solely consist of structured facts derived from meticulously crafted datasets, instead of practical sources such as unstructured texts like news articles, and they often overlook practical real-world knowledge updates. To address these issues, in this paper we propose AKEW (Assessing Knowledge Editing in the Wild), a new practical benchmark for knowledge editing. AKEW fully covers three editing settings of knowledge updates: structured facts, unstructured texts as facts, and extracted triplets. It further introduces new datasets featuring both counterfactual and real-world knowledge updates. Through extensive experiments, we demonstrate the considerable gap between state-of-the-art knowledge-editing methods and practical scenarios. Our analyses further highlight key insights to motivate future research for practical knowledge editing.
- Yang, X., Pan, L., Zhao, X., Chen, H., Petzold, L., Wang, W. Y., & Cheng, W. (2024). A Survey on Detection of LLMs-Generated Content. In Findings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. The burgeoning capabilities of advanced large language models (LLMs) such as ChatGPT have led to an increase in synthetic content generation with implications across a variety of sectors, including media, cybersecurity, public discourse, and education. As such, the ability to detect LLM-generated content has become of paramount importance. We aim to provide a detailed overview of existing detection strategies and benchmarks, scrutinizing their differences and identifying key challenges and prospects in the field, and advocating for more adaptable and robust models to enhance detection accuracy. We also posit the necessity of a multi-faceted approach to defend against various attacks and counter the rapidly advancing capabilities of LLMs. To the best of our knowledge, this work is the first comprehensive survey of LLM-generated content detection. We hope it will provide a broad understanding of the current landscape of LLM-generated content detection, offering a guiding reference for researchers and practitioners striving to uphold the integrity of digital information in an era increasingly dominated by synthetic content. The relevant papers are summarized and will be consistently updated at https://github.com/Xianjun-Yang/Awesome_papers_on_LLMs_detection.git.
- Diao, S., Keh, S., Pan, L., Tian, Z., Song, Y., & Zhang, T. (2023). Hashtag-Guided Low-Resource Tweet Classification. In International World Wide Web Conference (WWW), 2023. Social media classification tasks (e.g., tweet sentiment analysis, tweet stance detection) are challenging because social media posts are typically short, informal, and ambiguous. Thus, training on tweets is challenging and demands large-scale human-annotated labels, which are time-consuming and costly to obtain. In this paper, we find that providing hashtags to social media tweets can help alleviate this issue because hashtags can enrich short and ambiguous tweets with various information, such as topic, sentiment, and stance. This motivates us to propose a novel Hashtag-guided Tweet Classification model (HashTation), which automatically generates meaningful hashtags for the input tweet to provide useful auxiliary signals for tweet classification. To generate high-quality and insightful hashtags, our hashtag generation model retrieves and encodes the post-level and entity-level information across the whole corpus. Experiments show that HashTation achieves significant improvements on seven low-resource tweet classification tasks, in which only a limited amount of training data is provided, showing that automatically enriching tweets with model-generated hashtags could significantly reduce the demand for large-scale human-labeled data. Further analysis demonstrates that HashTation is able to generate high-quality hashtags that are consistent with the tweets and their labels. The code is available at https://github.com/shizhediao/HashTation.
- Nathani, D., Wang, D., Pan, L., & Wang, W. (2023). MAF: Multi-Aspect Feedback for Improving Reasoning in Large Language Models. In 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023). Language Models (LMs) have shown impressive performance in various natural language tasks. However, when it comes to natural language reasoning, LMs still face challenges such as hallucination, generating incorrect intermediate reasoning steps, and making mathematical errors. Recent research has focused on enhancing LMs through self-improvement using feedback. Nevertheless, existing approaches relying on a single generic feedback source fail to address the diverse error types found in LM-generated reasoning chains. In this work, we propose Multi-Aspect Feedback, an iterative refinement framework that integrates multiple feedback modules, including frozen LMs and external tools, each focusing on a specific error category. Our experimental results demonstrate the efficacy of our approach in addressing several errors in the LM-generated reasoning chain, thus improving the overall performance of an LM on several reasoning tasks. We see a relative improvement of up to 20% in Mathematical Reasoning and up to 18% in Logical Entailment. We release our source code, prompts, and data to accelerate future research.
- Pan, L., Albalak, A., Wang, X., & Wang, W. (2023). LOGIC-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning. In Findings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023. Large Language Models (LLMs) have shown human-like reasoning abilities but still struggle with complex logical problems. This paper introduces a novel framework, LOGIC-LM, which integrates LLMs with symbolic solvers to improve logical problem-solving. Our method first utilizes LLMs to translate a natural language problem into a symbolic formulation. Afterward, a deterministic symbolic solver performs inference on the formulated problem. We also introduce a self-refinement module, which utilizes the symbolic solver's error messages to revise symbolic formalizations. We demonstrate LOGIC-LM's effectiveness on five logical reasoning datasets: ProofWriter, PrOntoQA, FOLIO, LogicalDeduction, and AR-LSAT. On average, LOGIC-LM achieves a significant performance boost of 39.2% over LLMs with standard prompting and 18.4% over LLMs with chain-of-thought prompting. Our findings suggest that LOGIC-LM, by combining LLMs with symbolic logic, offers a promising avenue for faithful logical reasoning.
- Wu, J., Pan, L., Chen, J., & Jiang, Y. (2022). Ingredient-enriched Recipe Generation from Cooking Videos. In International Conference on Multimedia Retrieval (ICMR), 2022. Cooking video captioning aims to generate the text instructions that describe the cooking procedures presented in a video. Current approaches tend to use large neural models or more robust feature extractors to increase the expressive ability of features, ignoring the strong correlation between consecutive cooking steps in the video. However, it is intuitive that previous cooking steps can provide clues for the next cooking step. In particular, consecutive cooking steps tend to share the same ingredients, so accurate ingredient recognition can help introduce more fine-grained information into captioning. To improve procedural captioning of cooking videos, this paper proposes a framework with an ingredient recognition module that uses a copy mechanism to fuse the predicted ingredient information into the generated sentence. Moreover, we integrate the visual information of the previous step into the generation of the current step, so that the visual information of the two steps jointly assists the generation process. Extensive experiments verify the effectiveness of our proposed framework, which achieves promising performance on both the YouCookII and Cooking-COIN datasets.
- Liu, S., Chen, J., Pan, L., Ngo, C., Chua, T., & Jiang, Y. (2020). Hyperbolic Visual Embedding Learning for Zero-Shot Recognition. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. This paper proposes a Hyperbolic Visual Embedding Learning Network for zero-shot recognition. The network learns image embeddings in hyperbolic space, which is capable of preserving the hierarchical structure of semantic classes in low dimensions. Compared with existing zero-shot learning approaches, the network is more robust because embedding features in hyperbolic space better represent the class hierarchy and thereby avoid the misleading effects of unrelated siblings. Our network outperforms existing baselines under hierarchical evaluation in an extremely challenging setting, i.e., learning from only 1,000 categories to recognize 20,841 unseen categories. Under flat evaluation, it performs competitively with state-of-the-art methods while using embedding dimensions five times lower. Our code is publicly available.
- Pan, L., Chen, J., Wu, J., Liu, S., Ngo, C., Kan, M., Jiang, Y., & Chua, T. (2020). Multi-modal Cooking Workflow Construction for Food Recipes. In ACM International Conference on Multimedia (ACM MM), 2020. Understanding a food recipe requires anticipating the implicit causal effects of cooking actions, so that the recipe can be converted into a graph describing its temporal workflow. This is a non-trivial task that involves common-sense reasoning. However, existing efforts rely on hand-crafted features to extract the workflow graph from recipes due to the lack of large-scale labeled datasets. Moreover, they fail to utilize the cooking images, which constitute an important part of food recipes. In this paper, we build MM-ReS, the first large-scale dataset for cooking workflow construction, consisting of 9,850 recipes with human-labeled workflow graphs. Cooking steps are multi-modal, featuring both text instructions and cooking images. We then propose a neural encoder-decoder model that utilizes both visual and textual information to construct the cooking workflow, achieving an over 20% performance gain over existing hand-crafted baselines.