Gus Hahn-Powell
- Assistant Professor, Linguistics
- Member of the Graduate Faculty
- (520) 621-6897
- Communication, Rm. 316A
- Tucson, AZ 85721
- hahnpowell@arizona.edu
Biography
I am an Assistant Professor in the Department of Linguistics, where I serve as the founding director of the online MS in Human Language Technology (HLT) and of the Graduate Certificate in Natural Language Processing (NLP). I also hold appointments in the Cognitive Science GIDP and the Computational Social Science Graduate Certificate Program.
My research centers on machine reading for scientific discovery. In other words, I design and build intelligent systems that help researchers surmount the problem of information overload by scouring the vast body of scientific literature, analyzing findings, and synthesizing discoveries to generate novel hypotheses.
Please see my website for details on my research and teaching.
Degrees
- Ph.D. Computational Linguistics
- University of Arizona, Tucson, Arizona, United States
- Machine Reading for Scientific Discovery
- M.S. Human Language Technology
- University of Arizona, Tucson, Arizona, United States
- M.A. Applied Linguistics
- University of Alabama, Tuscaloosa, Alabama, United States
- B.A. Japanese
- University of Alabama, Tuscaloosa, Alabama, United States
Awards
- Recognition from SBS Dean
- Office of the Dean, College of Social & Behavioral Sciences, Fall 2021
- Best System Demonstration
- Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations), Summer 2019
Interests
Research
Computational Linguistics, Machine Reading, Distributional Semantics, Information Extraction, Natural Language Processing (NLP), Literature-based Discovery (LBD)
Courses
2024-25 Courses
- Independent Study, LING 599 (Spring 2025)
- Adv Statistical NLP, LING 582 (Fall 2024)
- HLT I, LING 529 (Fall 2024)
2023-24 Courses
- Independent Study, LING 699 (Spring 2024)
- Professionalism In Ling, LING 689 (Spring 2024)
- Stat Nat Lang Processing, CSC 539 (Spring 2024)
- Stat Nat Lang Processing, INFO 539 (Spring 2024)
- Stat Nat Lang Processing, LING 539 (Spring 2024)
- Adv Statistical NLP, LING 582 (Fall 2023)
- HLT I, LING 529 (Fall 2023)
- Independent Study, LING 599 (Fall 2023)
- Independent Study, LING 699 (Fall 2023)
2022-23 Courses
- Independent Study, LING 599 (Spring 2023)
- Internship/Hum Lang Tech, LING 593A (Spring 2023)
2021-22 Courses
- Internship/Hum Lang Tech, LING 593A (Summer I 2022)
- Adv Statistical NLP, LING 582 (Spring 2022)
- Dissertation, LING 920 (Spring 2022)
- Internship/Hum Lang Tech, LING 593A (Spring 2022)
- Stat Nat Lang Processing, CSC 539 (Spring 2022)
- Stat Nat Lang Processing, INFO 539 (Spring 2022)
- Stat Nat Lang Processing, LING 539 (Spring 2022)
- Dissertation, LING 920 (Fall 2021)
- HLT I, LING 529 (Fall 2021)
- Independent Study, LING 399 (Fall 2021)
- Internship/Hum Lang Tech, LING 593A (Fall 2021)
2020-21 Courses
- Internship/Hum Lang Tech, LING 593A (Summer I 2021)
- Independent Study, LING 699 (Spring 2021)
- Internship/Hum Lang Tech, LING 593A (Spring 2021)
- Stat Nat Lang Processing, CSC 439 (Spring 2021)
- Stat Nat Lang Processing, CSC 539 (Spring 2021)
- Stat Nat Lang Processing, INFO 539 (Spring 2021)
- Stat Nat Lang Processing, ISTA 439 (Spring 2021)
- Stat Nat Lang Processing, LING 439 (Spring 2021)
- Stat Nat Lang Processing, LING 539 (Spring 2021)
- Independent Study, LING 699 (Fall 2020)
- Internship/Hum Lang Tech, LING 593A (Fall 2020)
2019-20 Courses
- Independent Study, LING 599 (Spring 2020)
- Stat Nat Lang Processing, CSC 439 (Spring 2020)
- Stat Nat Lang Processing, CSC 539 (Spring 2020)
- Stat Nat Lang Processing, INFO 539 (Spring 2020)
- Stat Nat Lang Processing, ISTA 439 (Spring 2020)
- Stat Nat Lang Processing, LING 439 (Spring 2020)
- Stat Nat Lang Processing, LING 539 (Spring 2020)
- Adv Statistical NLP, LING 582 (Fall 2019)
Scholarly Contributions
Chapters
- Smalheiser, N. R., Hahn-Powell, G., Hristovski, D., & Sebastian, Y. (2023). From Knowledge Discovery to Knowledge Creation: How can Literature-based Discovery Accelerate Progress in Science? In Artificial Intelligence in Science: Challenges, Opportunities and the Future of Science (p. 300). Organisation for Economic Cooperation and Development (OECD). doi:10.1787/a8d820bd-en
Journals/Publications
- Gopalakrishnan, S., Chen, V. Z., Dou, W., Hahn-Powell, G., Nedunuri, S., & Zadrozny, W. (2023). Text to Causal Knowledge Graph: A Framework to Synthesize Knowledge from Unstructured Business Texts into Causal Graphs. Information, 14(7).
- Poole, R., Gnann, A., & Hahn-Powell, G. (2019). Epistemic stance and the construction of knowledge in science writing: A diachronic corpus study. Journal of English for Academic Purposes, 42, 100784. doi:10.1016/j.jeap.2019.100784
- Lent, H., Hahn-Powell, G., Haug-Baltzell, A., Davey, S., Surdeanu, M., & Lyons, E. (2018). Science Citation Knowledge Extractor. Frontiers in Research Metrics and Analytics. doi:10.3389/frma.2018.00035
- Valenzuela-Escárcega, M. A., Babur, O., Hahn-Powell, G., Bell, D., Hicks, T., Noriega-Atala, E., Wang, X., Surdeanu, M., Demir, E., & Morrison, C. T. (2018). Large-scale Automated Machine Reading Discovers New Cancer Driving Mechanisms. Database: The Journal of Biological Databases and Curation. doi:10.1093/database/bay098
- Fried, D., Jansen, P., Hahn-Powell, G., Surdeanu, M., & Clark, P. (2015). Higher-order Lexical Semantic Models for Non-factoid Answer Reranking. Transactions of the Association for Computational Linguistics, 3, 197-210. doi:10.1162/tacl_a_00133
- Hahn-Powell, G. V., & Archangeli, D. (2014). AutoTrace: An automatic system for tracing tongue contours. The Journal of the Acoustical Society of America, 136(4_Supplement), 2104-2104. doi:10.1121/1.4899570
Proceedings Publications
- Vacareanu, R., Barbosa, G. C., Noriega-Atala, E., Hahn-Powell, G., Sharp, R., Valenzuela-Escárcega, M. A., & Surdeanu, M. (2022, July). A Human-machine Interface for Few-shot Rule Synthesis for Information Extraction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations. We propose a system that assists a user in constructing transparent information extraction models, consisting of patterns (or rules) written in a declarative language, through program synthesis. Users of our system can specify their requirements through the use of examples, which are collected with a search interface. The rule-synthesis system proposes rule candidates and the results of applying them on a textual corpus; the user has the option to accept the candidate, request another option, or adjust the examples provided to the system. Through an interactive evaluation, we show that our approach generates high-precision rules even in a 1-shot setting. On a second evaluation on a widely-used relation extraction dataset (TACRED), our method generates rules that considerably outperform manually written patterns. Our code, demo, and documentation are available at https://clulab.github.io/odinsynth.
- Issa, E., AlShakhori, M., Al-Bahrani, R., & Hahn-Powell, G. (2021, April). Country-level Arabic dialect identification using RNNs with and without linguistic features. In Proceedings of the Sixth Arabic Natural Language Processing Workshop. This work investigates the value of augmenting recurrent neural networks with feature engineering for the Second Nuanced Arabic Dialect Identification (NADI) Subtask 1.2: Country-level DA identification. We compare the performance of a simple word-level LSTM using pretrained embeddings with one enhanced using feature embeddings for engineered linguistic features. Our results show that the addition of explicit features to the LSTM is detrimental to performance. We attribute this performance loss to the bivalency of some linguistic items in some text, ubiquity of topics, and participant mobility.
- Sutiono, A., & Hahn-Powell, G. (2022, October). Syntax-driven Data Augmentation for Named Entity Recognition. In Proceedings of the First Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning.
- Vacareanu, R., Valenzuela-Escárcega, M. A., Barbosa, G., Sharp, R., Hahn-Powell, G., & Surdeanu, M. (2022). From Examples to Rules: Neural Guided Rule Synthesis for Information Extraction. In Proceedings of the 13th Language Resources and Evaluation Conference (LREC). While deep learning approaches to information extraction have had many successes, they can be difficult to augment or maintain as needs shift. Rule-based methods, on the other hand, can be more easily modified. However, crafting rules requires expertise in linguistics and the domain of interest, making it infeasible for most users. Here we attempt to combine the advantages of these two directions while mitigating their drawbacks. We adapt recent advances from the adjacent field of program synthesis to information extraction, synthesizing rules from provided examples. We use a transformer-based architecture to guide an enumerative search, and show that this reduces the number of steps that need to be explored before a rule is found. Further, we show that without training the synthesis algorithm on the specific domain, our synthesized rules achieve state-of-the-art performance on the 1-shot scenario of a task that focuses on few-shot learning for relation classification, and competitive performance in the 5-shot scenario.
- Hahn-Powell, G. V. (2020). Odinson: A Fast Rule-based Information Extraction Framework. In Proceedings of The 12th Language Resources and Evaluation Conference.
- Hahn-Powell, G. V. (2020, July). Exploring Interpretability in Event Extraction: Multitask Learning of a Neural Event Classifier and an Explanation Decoder. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop.
- Barbosa, G., Wong, Z., Hahn-Powell, G., Bell, D., Sharp, R., Valenzuela-Escárcega, M. A., & Surdeanu, M. (2019, June). Enabling Search and Collaborative Assembly of Causal Interactions Extracted from Multilingual and Multi-domain Free Text. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations).
- Forbes, A. G., Lee, K., Hahn-Powell, G., Valenzuela-Escárcega, M. A., & Surdeanu, M. (2018, May). Text Annotation Graphs: Annotating Complex Natural Language Phenomena. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018).
- Luo, F., Valenzuela-Escárcega, M. A., Hahn-Powell, G., & Surdeanu, M. (2018, June). Scientific Discovery as Link Prediction in Influence and Citation Graphs. In Proceedings of the Twelfth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-12).
- Hahn-Powell, G., Valenzuela-Escárcega, M. A., & Surdeanu, M. (2017, August). Swanson linking revisited: Accelerating literature-based discovery across domains using a conceptual influence graph. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics: Software Demonstrations.
- Valenzuela-Escárcega, M. A., Babur, O., Hahn-Powell, G., Bell, D., Hicks, T., Noriega-Atala, E., Wang, X., Surdeanu, M., Demir, E., & Morrison, C. T. (2017, October). Large-scale automated reading with Reach discovers new cancer driving mechanisms. In Proceedings of the Sixth BioCreative Challenge Evaluation Workshop.
- Bell, D., Hahn-Powell, G., Valenzuela-Escárcega, M. A., & Surdeanu, M. (2016, May). An investigation of coreference phenomena in the biomedical domain. In Proceedings of the 10th International Conference on Language Resources and Evaluation.
- Hahn-Powell, G., Bell, D., Valenzuela-Escárcega, M. A., & Surdeanu, M. (2016, August). This before That: Causal Precedence in the Biomedical Domain. In Proceedings of the 2016 Workshop on Biomedical Natural Language Processing.
- Valenzuela-Escárcega, M. A., Hahn-Powell, G., & Surdeanu, M. (2016, May). Odin's Runes: A Rule Language for Information Extraction. In Proceedings of the 10th International Conference on Language Resources and Evaluation.
- Valenzuela-Escárcega, M. A., Hahn-Powell, G., Bell, D., & Surdeanu, M. (2016, August). SnapToGrid: From Statistical to Interpretable Models for Biomedical Information Extraction. In Proceedings of the 15th Workshop on Biomedical Natural Language Processing.
- Hahn-Powell, G., Martin, B., & Archangeli, D. (2015, December). A method for automatically detecting problematic tongue traces. In Proceedings of Ultrafest VII.
- Valenzuela-Escárcega, M. A., Hahn-Powell, G., Hicks, T., & Surdeanu, M. (2015, July). A Domain-independent Rule-based Framework for Event Extraction. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing: Software Demonstrations.
- Hahn-Powell, G., & Archangeli, D. (2014, October). AutoTrace: An automatic system for tracing tongue contours. In Proceedings of the 168th Meeting of Acoustical Society of America, 136.
- Hahn-Powell, G., & Archangeli, D. (2014, October). Testing AutoTrace. In Proceedings of the 168th Meeting of Acoustical Society of America, 136.
- Archangeli, D., Mahdavi, M., Ellison, D., Hahn-Powell, G., Coto, R., Berry, J., & Boersma, P. (2013, November). UltraPraat Software & database for simultaneous acoustic and articulatory analysis. In Proceedings of Ultrafest VI.
- Dawson, C. R., Pero, L. D., Morrison, C. T., Surdeanu, M., Hahn-Powell, G., Chapman, Z., & Barnard, K. (2013, April). Bayesian modeling of scenes and captions. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; Workshop on Vision and Language.
- Sung, J., Berry, J., Cooper, M., Hahn-Powell, G., & Archangeli, D. (2013, November). Testing AutoTrace: A Machine-learning Approach to Automated Tongue Contour Data Extraction. In Proceedings of Ultrafest VI.
- Patton, E., Hahn-Powell, G., & Nelson, R. (2010, April). The 'Worthy of Attention' Collostruction: Frequency, synonymy, and learnability. In Southeastern Conference on Linguistics.
Presentations
- Hahn-Powell, G. V. (2020, April). Generating scientific hypotheses through machine reading.
- Hahn-Powell, G. V. (2020, May). Community-guided Hypothesis Generation.
- Hahn-Powell, G., & Bell, D. (2019, October). Bridging Non-interacting Research Communities Through Machine-guided Discovery Synthesis. INFORMS. Seattle, Washington, USA: INFORMS.
Others
- Hahn-Powell, G. (2018, August). Machine Reading for Scientific Discovery. https://repository.arizona.edu/handle/10150/630562
- Valenzuela-Escárcega, M. A., Hahn-Powell, G., & Surdeanu, M. (2015, September). Description of the Odin Event Extraction Framework and Rule Language. arXiv.org. https://arxiv.org/abs/1509.07513