- Professor, Management Information Systems
- Professor, Computer Science
- Professor, Remote Sensing / Spatial Analysis - GIDP
- Professor, BIO5 Institute
Sudha Ram is the Anheuser-Busch Endowed Professor of MIS, Entrepreneurship & Innovation in the Eller College of Management at the University of Arizona. She holds joint faculty appointments as Professor of Computer Science and as a member of the BIO5 Institute and the Institute for Environment. She is the director of the Advanced Database Research Group (ADRG) and co-director of INSITE: Center for Business Intelligence and Analytics at the University of Arizona. Dr. Ram received a Ph.D. from the University of Illinois at Urbana-Champaign in 1985. Her research is in the areas of Enterprise Data Management, Business Intelligence, Large Scale Networks, and Data Analytics. Her work uses methods such as machine learning, statistical approaches, ontologies, and conceptual modeling. Dr. Ram has published more than 200 research articles in refereed journals, conferences, and book chapters.
She has received more than $60 million in research funding from organizations such as IBM, Intel Corporation, SAP, Ford, Raytheon Missile Systems, US Army, NIST, NSF, NASA, and the Office of Research and Development of the CIA. Dr. Ram served as the senior editor for Information Systems Research, is on the editorial boards of many leading Information Systems journals, and is currently a co-editor-in-chief of the Journal on Data Semantics. She is a cofounder of the Workshop on Information Technology and Systems (WITS) and serves on the steering committees of many workshops and conferences, including the Entity Relationship Conference (ER). Dr. Ram has published articles in such journals as Communications of the ACM, IEEE Expert, IEEE Transactions on Knowledge and Data Engineering, Information Systems, Information Systems Research, Management Science, and MIS Quarterly.
Dr. Ram serves as a consultant to several global companies on Business Intelligence, Enterprise Data Management, and Social Media Analytics. She received the IBM Faculty Development Award and the UA Leading Edge Innovator in Research Award in 2007 and 2012. Her research has been highlighted in several media outlets, including Arizona Alumni Magazine, International Journalism Festival, and NPR News. She gave a TED talk in December 2013 on “Creating a Smarter World with Big Data”.
- Ph.D. Management Information Systems
- University of Illinois at Urbana-Champaign, Urbana, Illinois, United States
- M.B.A. MIS/Business
- Indian Institute of Management (IIMC), Calcutta, India
- B.S. Chemistry (with Mathematics and Physics minor)
- University of Madras, Madras, India
- University of Illinois, Champaign-Urbana, Illinois (1982 - 1985)
- Tata Consulting Services (1981 - 1982)
- WITS Best Paper Award nomination
- Workshop on Information Technology and Systems, Fall 2019 (Award Nominee)
- AIS Fellow Award
- Association for Information Systems, Fall 2018
- Best Paper Runner Up Award
- WITS 2016, Fall 2016
- Overall Conference Best Paper Award
- IEEE International Conference On Smart Cities, Fall 2016
- Best Paper Award
- ACM Digital Health Conference, Spring 2016
- Gobernarte Award for Smart Cities and Big Data
- InterAmerican Development Bank (IDB), Spring 2016
Big Data Analytics; Data Semantics and Data Integration; Enterprise Data Management.
Big Data Analytics; Network Science and Large Scale Network Analysis; Predictive Modeling; Semantic Data Integration; Health Care Analytics; Smart Cities.
Business Intelligence, MIS 587 (Fall 2020)
Dissertation, MIS 920 (Fall 2020)
Network Sci: Theory and Appl, MIS 615 (Fall 2020)
Business Intelligence, MIS 587 (Spring 2020)
Dissertation, MIS 920 (Spring 2020)
Business Intelligence, MIS 587 (Fall 2019)
Dissertation, MIS 920 (Fall 2019)
Business Intelligence, MIS 587 (Spring 2019)
Dissertation, MIS 920 (Spring 2019)
Honors Thesis, MIS 498H (Spring 2019)
Big Data Analytics, MIS 586 (Fall 2018)
Business Intelligence, MIS 587 (Fall 2018)
Dissertation, MIS 920 (Fall 2018)
Honors Thesis, MIS 498H (Fall 2018)
Business Intelligence, MIS 587 (Spring 2018)
Dissertation, MIS 920 (Spring 2018)
Big Data Analytics, MIS 586 (Fall 2017)
Business Intelligence, MIS 587 (Fall 2017)
Dissertation, MIS 920 (Fall 2017)
Strategic Ops & Technology, BNAD 507A (Summer I 2017)
Business Intelligence, MIS 587 (Spring 2017)
Dissertation, MIS 920 (Spring 2017)
Honors Thesis, MIS 498H (Spring 2017)
Independent Study, MIS 599 (Spring 2017)
Big Data Analytics, MIS 586 (Fall 2016)
Business Intelligence, MIS 587 (Fall 2016)
Dissertation, MIS 920 (Fall 2016)
Honors Independent Study, MIS 399H (Fall 2016)
Honors Thesis, MIS 498H (Fall 2016)
Business Intelligence, MIS 587 (Spring 2016)
Dissertation, MIS 920 (Spring 2016)
- Bhattacharya, D., Currim, F., & Ram, S. (2019). Evaluating Distributed Computing Infrastructures: An Empirical Study Comparing Hadoop Deployments on Cloud and Local Systems. IEEE Transactions on Cloud Computing, 1-1.
- Currim, F., & Ram, S. (2018). Understanding Semantic Completeness in Rule Frameworks for Modeling Cardinality Constraints. Enterprise Modeling and Information Systems Architecture: International Journal of Conceptual Modeling, 293-315. doi:10.18417/emisa.si.hcm.23
- Hashim, M. J., Ram, S., & Tang, Z. (2019). Uncovering the effects of digital movie format availability on physical movie sales. Decision Support Systems, 117, 75--86. doi:10.1016/j.dss.2018.10.016
- Jarke, M., Otto, B., & Ram, S. (2019). Data Sovereignty and Data Space Ecosystems. Business & Information Systems Engineering, 61(5), 549--550.
- Velichety, S., Ram, S., & Bockstedt, J. C. (2019). Quality Assessment of Peer-Produced Content in Knowledge Repositories using Development and Coordination Activities. J. of Management Information Systems, 36(2), 478--512.
- Delen, D., & Ram, S. (2018). Research challenges and opportunities in business analytics. Journal of Business Analytics, 1(1), 2-12.
- Jarke, M., Otto, B., & Ram, S. (2018). Data Sovereignty and Data Space Ecosystems. Business & Information Systems Engineering, 60(2), 191--192.
- Lindberg, C. M., Srinivasan, K., Gilligan, B., Razjouyan, J., Lee, H., Najafi, B., Canada, K. J., Mehl, M. R., Currim, F., Ram, S., Lunden, M. M., Heerwagen, J. H., Kampschroer, K., & Sternberg, E. M. (2018). Effects of office workstation type on physical activity and stress. Occupational and Environmental Medicine, 75(10), 689--695.
- Lismont, J., Ram, S., Vanthienen, J., Lemahieu, W., & Baesens, B. (2018). Predicting interpurchase time in a retail environment using customer-product networks: An empirical study and evaluation. Expert Systems with Applications, 104, 22--32.
- Liu, J., & Ram, S. (2018). Using big data and network analysis to understand Wikipedia article quality. Data and Knowledge Engineering, 115, 80--93.
- Ram, S., & Delen, D. (2018). Introduction to the inaugural issue of journal of business analytics. Journal of Business Analytics, 1(1), 1-1.
- Srinivasan, K., Currim, F., & Ram, S. (2018). Predicting High Cost Patients at Point of Admission using Network Science. IEEE Journal of Biomedical and Health Informatics, 22(6), 1970-1977. doi:10.1109/JBHI.2017.2783049
- Tang, Z., Ram, S., & Hashim, M. J. (2019). Uncovering the Effects of Digital Movie Format Availability on Physical Movie Sales. Decision Support Systems, 117, 75-86.
- Liu, J., & Ram, S. (2017). Developing data ontology of provenance based on the W7 model: A conceptual graph-based approach. Journal of Database Management, 28(1), 43-62. doi:10.4018/JDM.2017010104
- Liu, J., & Ram, S. (2017). Improving the Domain Independence of Data Provenance Ontologies: A Demonstration Using Conceptual Graphs and the W7 Model. J. Database Manag., 28(1), 43--62.
- Srinivasan, K., Currim, F., & Ram, S. (2017). Predicting High Cost Patients at Point of Admission using Network Science. IEEE Journal of Biomedical and Health Informatics. doi:10.1109/JBHI.2017.2783049
- Amarasingham, R., Escobar, G., & Ram, S. (2016). Consensus statement on Electronic Health Predictive Analytics: A Guiding Framework to Address Challenges. eGEMS Generating Evidence & Methods to improve patient outcomes, 4(1). Complete list of authors: Ruben Amarasingham, G. Cohen, M. Entwhistle, G. Escobar, V. Liu, B. Lo, S. Ram, S. Saria, and B. Xie.
- Zhang, K., Bhattacharya, S., & Ram, S. (2016). Large Scale Network Analysis for Online Social Brand Advertising. MIS Quarterly - Special Issue on Big Data Analytics, 40(4), 35.
- Bhattacharya, D., & Ram, S. (2015). RT @News: An Analysis of News Agency Ego Networks in a Micro-blogging Environment. ACM Transactions on MIS, 6(3).
- Ram, S., Zhang, W., Williams, M., & Pengetenze, Y. (2015). Predicting Asthma Related Emergency Department Visits Using Big Data. IEEE Journal on Biomedical and Health Informatics - Special Issue on Big Data Analytics for Health Care, 19(4), TBD. doi:10.1109/JBHI.2015.2404829
- Wang, Y., & Ram, S. (2015). Prediction of Location Based Sequential Purchasing Events Using Spatial, Temporal and Social Patterns. IEEE Intelligent Systems Special Issue on Predictive Analytics.
- Currim, S., Ram, S., Durcikova, A., & Currim, F. (2014). Using a knowledge learning framework to predict errors in database design. Information Systems, 40, 11-31. Abstract: Conceptual data modeling is a critical but difficult part of database development. Little research has attempted to find the underlying causes of the cognitive challenges or errors made during this stage. This paper describes a Modeling Expertise Framework (MEF) that uses modeler expertise to predict errors based on the revised Bloom's taxonomy (RBT). The utility of RBT is in providing a classification of cognitive processes that can be applied to knowledge activities such as conceptual modeling. We employ the MEF to map conceptual modeling tasks to different levels of cognitive complexity and classify current modeler expertise levels. An experimental exercise confirms our predictions of errors. Our work provides an understanding into why novices can handle entity classes and identifying binary relationships with some ease, but find other components like ternary relationships difficult. We discuss implications for data modeling training at a novice and intermediate level, which can be extended to other areas of Information Systems education and training. © 2013 Elsevier Ltd. All rights reserved.
- Khatri, V., Ram, S., Snodgrass, R. T., & Terenziani, P. (2014). Capturing Telic/Atelic Temporal Data Semantics: Generalizing Conventional Conceptual Models. IEEE Transactions on Knowledge and Data Engineering, 26(3), 528-549.
- Khatri, V., Ram, S., Terenziani, P., & Snodgrass, R. T. (2014). Capturing Telic/Atelic Temporal Data Semantics: Generalizing Conventional Conceptual Models. IEEE Transactions on Knowledge and Data Engineering, 26(3), 528-548.
- Zhang, K., Ram, S., & Bhattacharya, S. (2014). Empirical analysis of implicit brand networks on social media. Hypertext Conference Proceedings, 190-199.
- Atzeni, P., Jensen, C. S., Orsi, G., Ram, S., Tanca, L., & Torlone, R. (2013). The relational model is dead, SQL is dead, and I don't feel so good myself. SIGMOD Record, 42(2), 64-68. Abstract: We report the opinions expressed by well-known database researchers on the future of the relational model and SQL during a panel at the International Workshop on Non-Conventional Data Access (NoCoDa 2012), held in Florence, Italy in October 2012 in conjunction with the 31st International Conference on Conceptual Modeling. The panelists include: Paolo Atzeni (Universitá Roma Tre, Italy), Umeshwar Dayal (HP Labs, USA), Christian S. Jensen (Aarhus University, Denmark), and Sudha Ram (University of Arizona, USA). Quotations from movies are used as a playful though effective way to convey the dramatic changes that database technology and research are currently undergoing.
- Brocke, J. V., Hekkala, R., Ram, S., & Rossi, M. (2013). Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7939 LNCS, VI.
- Velichety, S., & Ram, S. (2013). Examining lists on twitter to uncover relationships between following, membership and subscription. WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web, 673-675. Abstract: We report on an exploratory analysis of pairwise relationships between three different forms of information consumption on Twitter viz., following, listing and subscribing. We develop a systematic framework to examine the relationships between these three forms. Using our framework, we conducted an empirical analysis of a dataset from Twitter. Our results show that people not only consume information by explicitly following others, but also by listing and subscribing to lists and that the people they list or subscribe to are not the same as the ones they follow. Our work has implications for understanding information propagation and diffusion via Twitter and for generating recommendations for adding users to lists, subscribing and merging or splitting them.
- Atzeni, P., Cheung, D., & Ram, S. (2012). Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7532 LNCS.
- Bhattacharya, D., & Ram, S. (2012). Sharing news articles using 140 characters: A diffusion analysis on twitter. Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012, 966-971. Abstract: Is it possible to effectively spread news articles to a large audience using 140 characters? How does the microblogging website Twitter get used as a platform for the news media agencies to create awareness about the articles they publish on a daily basis? Our study of the diffusion patterns of news articles from 12 popular news sources, including BBC, New York Times, and Mashable on Twitter reveals that a large number of users not only consume and comment on these news articles but also share them in different ways. Combining the methods of network and temporal analyses, we examine and report on how news articles diffuse on Twitter, and how different propagation mechanisms result in different lifespans for news articles. © 2012 IEEE.
- Currim, F., & Ram, S. (2012). Modeling spatial and temporal set-based constraints during conceptual database design. Information Systems Research, 23(1), 109-128. Abstract: From a database perspective, business constraints provide an accurate picture of the real world being modeled and help enforce data integrity. Typically, rules are gathered during requirements analysis and embedded in code during the implementation phase. We propose that the rules be explicitly modeled during conceptual design, and develop a framework for understanding and classifying spatiotemporal set-based (cardinality) constraints and an associated syntax. The constraint semantics are formally specified using first-order logic. Modeling rules in conceptual design ensures they are visible to designers and users and not buried in application code. The rules can then be semiautomatically translated into logical design triggers yielding productivity gains. Following the principles of design science research, we evaluate the framework's expressiveness and utility with a case study. © 2012 INFORMS.
- Velichety, S., & Ram, S. (2012). Common citation analysis and technology overlap factor: An empirical investigation of litigated patents using network analysis. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7286 LNCS, 287-293. Abstract: Companies incur huge costs in filing and defending patent lawsuits. A part of the problem arises from the fact that companies do not have a comprehensive understanding of the patents that they have cited and the patents that have cited their patents. By empirically analyzing the forward and backward citations of a set of litigated patents in the smart phone industry, we provide a method for profiling patents and identifying citation patterns. Our results show that while some patents share common forward and backward citations, others do not share any backward citations but share a lot of forward citations. We hypothesize that this may be an indication of the convergence of different types of technologies. We also propose a new metric - Technology Overlap Factor - that can help in identifying convergence. In doing so, we provide a preliminary framework for further investigation and for building a patent analysis software system. © 2012 Springer-Verlag.
- Wei, W., & Ram, S. (2012). Using a network analysis approach for organizing social bookmarking tags and enabling web content discovery. ACM Transactions on Management Information Systems, 3(3). Abstract: This article describes an innovative approach to reorganizing the tag space generated by social bookmarking services. The objective of this work is to enable effective search and discovery of Web content using social bookmarking tags. Tags are metadata generated by users for Web content annotation. Their potential as an effective Web search and discovery tool is hindered by challenges such as the tag space being untidy due to ambiguity, and hidden or implicit semantics. Using a novel analytics approach, we conducted network analyses on tags and discovered that tags are generated for different purposes and that there are inherent relationships among tags. Our approach can be used to extract the purposes of tags and relationships among the tags and this information can be used as facets to add structure and hierarchy to reorganize the flat tag space. The semantics of relationships and hierarchy in our proposed faceted model of tags enable searches on annotated Web content in an effective manner. We describe the implementation of a prototype system called FASTS to demonstrate feasibility and effectiveness of our approach. © 2012 ACM.
- Liu, J., & Ram, S. (2011). Who does what: Collaboration patterns in the wikipedia and their impact on article quality. ACM Transactions on Management Information Systems, 2(2). Abstract: The quality of Wikipedia articles is debatable. On the one hand, existing research indicates that not only are people willing to contribute articles but the quality of these articles is close to that found in conventional encyclopedias. On the other hand, the public has never stopped criticizing the quality of Wikipedia articles, and critics never have trouble finding low-quality Wikipedia articles. Why do Wikipedia articles vary widely in quality? We investigate the relationship between collaboration and Wikipedia article quality. We show that the quality of Wikipedia articles is not only dependent on the different types of contributors but also on how they collaborate. Based on an empirical study, we classify contributors based on their roles in editing individual Wikipedia articles. We identify various patterns of collaboration based on the provenance or, more specifically, who does what to Wikipedia articles. Our research helps identify collaboration patterns that are preferable or detrimental for article quality, thus providing insights for designing tools and mechanisms to improve the quality of Wikipedia articles. © 2011 ACM.
- Parsons, J., Olivé, A., Ram, S., Wagner, G., Wand, Y., & Yu, E. (2011). Panel: New directions for conceptual modeling. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6998 LNCS, 524-525. Abstract: This panel examines potential opportunities for conceptual modeling research in new domains. © 2011 Springer-Verlag.
- Zhao, J., & Ram, S. (2011). Examining the evolution of networks based on lists in Twitter. 2011 IEEE 5th International Conference on Internet Multimedia Systems Architecture and Application, IMSAA 2011 - Conference Proceedings. Abstract: In this research we analyze the evolution of online social networks based on the "List" feature in Twitter. Lists are connected implicitly by virtue of sharing common users, and the users are also connected implicitly by being added to the same lists. We focus on the evolution of these implicit networks in Twitter with users and their lists, and attempt to understand how triadic closures occur among users via lists. Using a dataset collected over a period of 6 months we show that closures exhibit many different patterns. We also analyze the pattern of dissolution of connections and show that it is closely related to the nature of the triadic structures and the roles of the nodes in the information sharing network. © 2011 IEEE.
- Currim, F., & Ram, S. (2010). When entities are types: Effectively modeling type-instantiation relationships. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6413 LNCS, 138-147. Abstract: Type-instantiation relationships (TIRs) appear in many application domains including RFID-based inventory tracking, securities markets, health care, incident-response management, travel, advertising, and academia. For example an emergency response (type) is instantiated in the actual incident, or an advertisement (type) serves impressions on a website. This kind of relationship has received little attention in literature notwithstanding its ubiquity. Conventional modeling does not properly capture its underlying semantics. This can lead to data redundancy, denormalized relations and loss of knowledge about constraints during implementation. Our work formally defines and discusses the semantics of the type-instantiation relationship. We also present an analysis of how TIRs affect other relationships in a conceptual database schema, and the relational implications of our approach. © 2010 Springer-Verlag.
- Ram, S., & Liu, J. (2010). Provenance management in BioSciences. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6413 LNCS, 54-64. Abstract: Data provenance is becoming increasingly important for biosciences with the advent of large-scale collaborative environments such as the iPlant collaborative, where scientists collaborate by using data that they themselves did not generate. To facilitate the widespread use and sharing of provenance, ontologies of provenance need to be developed to enable the capture and standardized representation of provenance for biosciences. Working with researchers from the iPlant Tree of Life (iPToL) Grand Challenge Project, we developed a domain ontology of provenance for phylogenetic analysis. Relying on the conceptual graph formalism, we describe the process of developing the provenance ontology based on the W7 model, a generic ontology of data provenance. This domain ontology provides a structured model for harvesting, storing and querying provenance. We also illustrate how the harvested data provenance based on our ontology can be used for different purposes. © 2010 Springer-Verlag.
- Ram, S., & Wei, W. (2010). FASTS: FAcets Structured Tag Space - A novel approach to organize and reuse social bookmarking tags. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6105 LNCS, 426-438. Abstract: Social bookmarking tools are generating an enormous pool of metadata describing and categorizing web resources. The value of these metadata in the form of tags can be fully realized only when they are shared and reused for web search and retrieval. The research described in this paper proposes a facet classification mechanism, and a tag relationship ontology to organize tags into a meaningful and intuitively useful structure. We have implemented a web-based prototype system to effectively search and browse bookmarked web resources using this approach. We collected real tag data from del.icio.us for a wide range of popular domains. We analyzed, processed, and organized these tags to demonstrate the effectiveness and utility of our approach for tag organization and reuse. © 2010 Springer-Verlag.
- Ram, S., & Wei, W. (2010). How social is social bookmarking? 2010 IEEE International Workshop on Business Applications of Social Network Analysis, BASNA 2010. Abstract: Social bookmarking services allow a user to make her personal collection of favorite web resources accessible by the public. The content of this collection can attract users of "similar minds" and therefore has tremendous potential to enable networking and collaboration. In this research, we analyzed a large dataset collected from one of the most popular social bookmarking services. To understand why there is a large gap between a user's explicit network and her implicit user-user association networks based on common resources or common tags, we compared a user's bookmark resources and tags to those of her explicit network members. Our results suggest that a typical social bookmarking service user does not create her explicit network based on common interests. We discuss the implications behind the gap between a user's explicit network and implicit network and propose solutions to enhance and improve the "social" functions of social bookmarking services. © 2010 IEEE.
- Ram, S., & Liu, J. (2009). A new perspective on semantics of data provenance. CEUR Workshop Proceedings, 526. Abstract: Data Provenance refers to the "origin", "lineage", and "source" of data. In this work, we examine provenance from a semantics perspective and present the W7 model, an ontological model of data provenance. In the W7 model, provenance is conceptualized as a combination of seven interconnected elements including "what", "when", "where", "how", "who", "which" and "why". Each of these components may be used to track events that affect data during its lifetime. The W7 model is general and extensible enough to capture provenance semantics for data in different domains. Using the example of the Wikipedia, we illustrate how the W7 model can capture domain or application specific provenance.
- Currim, F., & Ram, S. (2008). Conceptually modeling windows and bounds for space and time in database constraints. Communications of the ACM, 51(11), 125-129. Abstract: GIS, logistics, CAD/CAM, robotics, and medical imaging systems use spatial data, while systems for financial services, inventory management, professional sports, consumer research, and payroll use historical or temporal data. A Customer Relationship Management (CRM) application can track and keep information about customers and promotions. Some of the significant advantages of modeling constraint semantics during conceptual design are reduced cost of correcting errors, consistent application enforcement, and improved ability to validate requirements with users. Modeling constraints during conceptual design can provide an accurate representation of semantics. The model-driven architecture (MDA) method can automate the translation of constraints into the database.
- Ram, S., & Liu, J. (2008). Understanding the semantics of data provenance to support active conceptual modeling. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 4512 LNCS, 17-29. Abstract: Data Provenance refers to the lineage of data including its origin, key events that occur over the course of its lifecycle, and other details associated with data creation, processing, and archiving. We believe that tracking provenance enables users to share, discover, and reuse the data, thus streamlining collaborative activities, reducing the possibility of repeating dead ends, and facilitating learning. It also provides a mechanism to transition from static to active conceptual modeling. The primary goal of our research is to investigate the semantics or meaning of data provenance. We describe the W7 model that represents different components of provenance and their relationships to each other. We conceptualize provenance as a combination of seven interconnected elements including "what", "when", "where", "how", "who", "which" and "why". Each of these components may be used to track events that affect data during its lifetime. A homeland security example illustrates how current conceptual models can be extended to embed provenance. © 2008 Springer-Verlag Berlin Heidelberg.
- Ram, S., Zhang, K., & Wei, W. (2008). Linking biological databases semantically for knowledge discovery. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5232 LNCS, 22-32. Abstract: Many important life sciences questions are aimed at studying the relationships and interactions between biological functions/processes and biological entities such as genes. The answers may be found by examining diverse types of biological/genomic databases. Finding these answers, however, requires accessing, and retrieving data, from diverse biological data sources. More importantly, sophisticated knowledge discovery processes involve traversing through large numbers of inherent links among various data sources. Currently, the links among data are either implemented as hyperlinks without explicitly indicating their meanings and labels, or hidden in a seemingly simple text format. Consequently, biologists spend numerous hours identifying potentially useful links and following each lead manually, which is time-consuming and error-prone. Our research is aimed at constructing semantic relationships among all biological entities. We have designed a semantic model to categorize and formally define the links. By incorporating ontologies such as Gene or Sequence ontology, we propose techniques to analyze the links embedded within and among data records, to explicitly label their semantics, and to facilitate link traversal, querying, and data sharing. Users may then ask complicated and ad hoc questions and even design their own workflow to support their knowledge discovery processes. In addition, we have performed an empirical analysis to demonstrate that our method can not only improve the efficiency of querying multiple databases, but also yield more useful information. © 2008 Springer Berlin Heidelberg.
- Zhao, H., & Ram, S. (2008). Entity matching across heterogeneous data sources: An approach based on constrained cascade generalization. Data and Knowledge Engineering, 66(3), 368-381. Abstract: To integrate or link the data stored in heterogeneous data sources, a critical problem is entity matching, i.e., matching records representing semantically corresponding entities in the real world, across the sources. While decision tree techniques have been used to learn entity matching rules, most decision tree learners have an inherent representational bias, that is, they generate univariate trees and restrict the decision boundaries to be axis-orthogonal hyper-planes in the feature space. Cascading other classification methods with decision tree learners can alleviate this bias and potentially increase classification accuracy. In this paper, the authors apply a recently-developed constrained cascade generalization method in entity matching and report on empirical evaluation using real-world data. The evaluation results show that this method outperforms the base classification methods in terms of classification accuracy, especially in the dirtiest case. © 2008 Elsevier B.V. All rights reserved.
- Zhao, H., Sinha, A. P., & Ram, S. (2008). An empirical study of the effects of principal component analysis on symbolic classifiers. 14th Americas Conference on Information Systems, AMCIS 2008, 1, 563-569. Abstract: Classification is a frequently encountered data mining problem. While symbolic classifiers have high comprehensibility, their language bias may hamper their classification performance. Incorporating new features constructed based on the original features may relax such language bias and lead to performance improvement. Among others, principal component analysis (PCA) has been proposed as a possible method for enhancing the performance of decision trees. However, since PCA is an unsupervised method, the principal components may not represent the ideal projection directions for optimizing the classification performance. Thus, we expect PCA to have varying effects; it may improve classification performance if the projections enhance class differences, but may degrade performance otherwise. We also posit that the effects of PCA are similar on symbolic classifiers, including decision rules, decision trees, and decision tables. In this paper, we empirically evaluate the effects of PCA on symbolic classifiers and discuss the findings.
- Bouaziz, R., Chakhar, S., Mousseau, V., Ram, S., & Telmoudi, A. (2007). Database design and querying within the fuzzy semantic model. Information Sciences, 177(21), 4598-4620. Abstract: The fuzzy semantic model (FSM) is a data model that uses basic concepts of semantic modeling and supports handling fuzziness, uncertainty and imprecision of the real world at the attribute, entity and class levels. The paper presents the principles and constructs of the FSM. It proposes ways to define the membership functions within all the constructs of the FSM. In addition, it provides a proposal for specifying FSM schema and introduces a query language adapted to FSM-based databases. © 2007 Elsevier Inc. All rights reserved.
- Ram, S., & Khatri, V. (2007). Special issue theme: Defining, eliciting and using data semantics for emerging domains. Journal of Database Management, 18(1), i-iv.
- Ram, S., & Zhao, H. (2007). Special issue on "Semantic Web: Opportunities and challenges". Information Technology and Management, 8(3), 203-204.
- Zhao, H., & Ram, S. (2007). Combining schema and instance information for integrating heterogeneous data sources. Data and Knowledge Engineering, 61(2), 281-303. Abstract: Determining the correspondences among heterogeneous data sources, which is critical to integration of the data sources, is a complex and resource-consuming task that demands automated support. We propose an iterative procedure for detecting both schema-level and instance-level correspondences from heterogeneous data sources. Cluster analysis techniques are used first to identify similar schema elements (i.e., relations and attributes). Based on the identified schema-level correspondences, classification techniques are used to identify matching tuples. Statistical analysis techniques are then applied to a preliminary integrated data set to evaluate the relationships among schema elements more accurately. Improvement in schema-level correspondences triggers another iteration of the procedure. We have performed empirical evaluation using real-world heterogeneous data sources and report in this paper some promising results (i.e., incremental improvement in identified correspondences) that demonstrate the utility of the proposed iterative procedure. © 2006 Elsevier B.V. All rights reserved.
- Embley, D. W., Olive, A., & Ram, S. (2006). Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 4215 LNCS, v-vi.
- Khatri, V., Vessey, I., Ram, S., & Ramesh, V. (2006). Cognitive fit between conceptual schemas and internal problem representations: The case of geospatio-temporal conceptual schema comprehension. IEEE Transactions on Professional Communication, 49(2), 109-127. Abstract: Geospatio-temporal conceptual models provide a mechanism to explicitly represent geospatial and temporal aspects of applications. Such models, which focus on both "what" and "when/where," need to be more expressive than conventional conceptual models (e.g., the ER model), which primarily focus on "what" is important for a given application. In this study, we view conceptual schema comprehension of geospatio-temporal data semantics in terms of matching the external problem representation (that is, the conceptual schema) to the problem-solving task (that is, syntactic and semantic comprehension tasks), an argument based on the theory of cognitive fit. Our theory suggests that an external problem representation that matches the problem solver's internal task representation will enhance performance, for example, in comprehending such schemas. To assess performance on geospatio-temporal schema comprehension tasks, we conducted a laboratory experiment using two semantically identical conceptual schemas, one of which mapped closely to the internal task representation while the other did not. As expected, we found that the geospatio-temporal conceptual schema that corresponded to the internal representation of the task enhanced the accuracy of schema comprehension; comprehension time was equivalent for both. Cognitive fit between the internal representation of the task and conceptual schemas with geospatio-temporal annotations was, therefore, manifested in accuracy of schema comprehension and not in time for problem solution. Our findings suggest that the annotated schemas facilitate understanding of data semantics represented on the schema. © 2006 IEEE.
- Liginlal, D., Ram, S., & Duckstein, L. (2006). Fuzzy measure theoretical approach to screening product innovations. IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, 36(3), 577-591. Abstract: A variety of decision models have been proposed in contemporary literature to tackle the problem of screening product innovations. Although linear models have gained considerable attention and recommendation, contemporary literature contains strong evidence in support of nonlinear noncompensatory models. In this paper, the authors first demonstrate how fuzzy measures, which are defined on subsets of decision attributes, and their Choquet-integral formulation, which exhibits both compensatory and noncompensatory properties, have meaningful behavioral interpretations within the context of new-product screening. Then, they show how to address the complex problem of building such measures by applying a learning algorithm that relies on methods of judgment analysis. An accompanying case study demonstrates how organizations may customize a new product decision aid and fine tune their business strategy as actual results accrue. Finally, the authors present the results of analytical studies to compare the Choquet-integral model with other noncompensatory models, such as Martino's extended scoring model and Einhorn's conjunctive model, and heuristic approaches, such as Tversky's EBA and the lexicographic method. For the new-product-decision scenario considered in the study, the Choquet-integral model provided the best fit, measured by Pearson's rank order correlation coefficient, with all of the competing models. © 2006 IEEE.
- Zhao, H., Sinha, A. P., & Ram, S. (2006). Elitist and ensemble strategies for cascade generalization. Journal of Database Management, 17(3), 92-107. Abstract: Several methods have been proposed for cascading other classification algorithms with decision tree learners to alleviate the representational bias of decision trees and, potentially, to improve classification accuracy. Such cascade generalization of decision trees increases the flexibility of the decision boundaries between classes and promotes better fitting of the training data. However, more flexible models may not necessarily lead to more predictive power. Because of potential overfitting problems, the true classification accuracy on test data may not increase. Recently, a generic method for cascade generalization has been proposed. The method uses a parameter - the maximum cascading depth - to constrain the degree that other classification algorithms are cascaded with decision tree learners. A method for efficiently learning a collection (i.e., a forest) of generalized decision trees, each with other classification algorithms cascaded to a particular depth, also has been developed. In this article, we propose several new strategies, including elitist and ensemble (weighted or unweighted), for using the various decision trees in such a collection in the prediction phase. Our empirical evaluation using 32 data sets in the UCI machine learning repository shows that, on average, the elitist strategy outperforms the weighted full ensemble strategy, which, in turn, outperforms the unweighted full ensemble strategy. However, no strategy is universally superior across all applications. Since the same training process can be used to evaluate the various strategies, we recommend that several promising strategies be evaluated and compared before selecting the one to use for a given application. Copyright © 2006, Idea Group Inc.
- Ram, S. (2005). Toward semantic interoperability of heterogeneous biological data sources. Lecture Notes in Computer Science, 3520, 32-. Abstract: Genomic researchers use a number of heterogeneous data sources including nucleotides, protein sequences, 3-D protein structures, taxonomies, and research publications such as MEDLINE. This research aims to discover as much biological knowledge as possible about the properties and functions of the structures such as DNA sequences and protein structures and to explore the connections among all the data, so that the knowledge can be used to improve human lives. Currently it is very difficult to connect all of these data sources seamlessly unless all the data is transformed into a common format with an id connecting all of them. The state-of-the-art facilities for searching these data sources provide interfaces through which scientists can access multiple databases. Most of these searches are primarily text-based, requiring users to specify keywords, which the systems use to search each individual data source and return results. The user is then required to create the connections between the results from each source. This is a major problem because researchers do not always know how to create these connections. To solve this problem we propose a semantics-based mechanism for automatically linking and connecting the various data sources. Our approach is based on a model that explicitly captures the semantics of the heterogeneous data sources and makes them available for searching. In this talk I will discuss issues related to capturing the semantics of biological data and using these semantics to automate the integration of diverse heterogeneous sources. © Springer-Verlag Berlin Heidelberg 2005.
- Ram, S., & Khatri, V. (2005). A comprehensive framework for modeling set-based business rules during conceptual database design. Information Systems, 30(2), 89-118. Abstract: Business rules are the basis of any organization. From an information systems perspective, these business rules function as constraints on a database helping ensure that the structure and content of the real world - sometimes referred to as miniworld - is accurately incorporated into the database. It is important to elicit these rules during the analysis and design stage, since the captured rules are the basis for subsequent development of a business constraints repository. We present a taxonomy for set-based business rules, and describe an overarching framework for modeling rules that constrain the cardinality of sets. The proposed framework results in various types of constraints, i.e., attribute, class, participation, projection, co-occurrence, appearance and overlapping, on a semantic model that supports abstractions like classification, generalization/specialization, aggregation and association. We formally define the syntax of our proposed framework in Backus-Naur Form and explicate the semantics using first-order logic. We describe partial ordering in the constraints and define the concept of metaconstraints, which can be used for automatic constraint consistency checking during the design stage itself. We demonstrate the practicality of our approach with a case study and show how our approach to modeling business rules seamlessly integrates into existing database design methodology. Via our proposed framework, we show how explicitly capturing data semantics will help bridge the semantic gap between the real world and its representation in an information system. © 2003 Elsevier Ltd. All rights reserved.
- Ram, S., & Wei, . (2005). Semantics based operators for integrating heterogeneous biological data sources. Proceedings - 18th International Conference on Systems Engineering, ICSEng 2005, 2005, 16-21. Abstract: Advances in analytical techniques have drastically increased the amount of biological data available, which in turn has increased the difficulty and complexity of biological knowledge discovery. It is important to have mechanisms to take full advantage of the data not only for doing simple searches, but also for answering ad hoc and complex questions at the user's will by virtually connecting the heterogeneous data sources. In this paper we propose and formally define new operators that can be used for sequence comparison and alignment. Furthermore, cross-database operators can be developed based on work presented in this paper to dynamically link different types of databases. We believe this will liberate biological researchers from unproductive labor and allow them to focus on pattern seeking and interpretation of results. ©2005 IEEE.
- Zhao, H., & Ram, S. (2005). Entity identification for heterogeneous database integration - A multiple classifier system approach and empirical evaluation. Information Systems, 30(2), 119-132. Abstract: Entity identification, i.e., detecting semantically corresponding records from heterogeneous data sources, is a critical step in integrating the data sources. The objective of this research is to develop and evaluate a novel multiple classifier system approach that improves entity identification accuracy. We apply various classification techniques drawn from statistical pattern recognition, machine learning, and artificial neural networks to determine whether two records from different data sources represent the same real-world entity. We further employ a variety of ways to combine multiple classifiers for improved classification accuracy. In this paper, we report on some promising empirical results that demonstrate performance improvement by combining multiple classifiers. © 2003 Elsevier Ltd. All rights reserved.
- Hevner, A. R., March, S. T., Park, J., & Ram, S. (2004). Design science in information systems research. MIS Quarterly: Management Information Systems, 28(1), 75-105. Abstract: Two paradigms characterize much of the research in the Information Systems discipline: behavioral science and design science. The behavioral-science paradigm seeks to develop and verify theories that explain or predict human or organizational behavior. The design-science paradigm seeks to extend the boundaries of human and organizational capabilities by creating new and innovative artifacts. Both paradigms are foundational to the IS discipline, positioned as it is at the confluence of people, organizations, and technology. Our objective is to describe the performance of design-science research in Information Systems via a concise conceptual framework and clear guidelines for understanding, executing, and evaluating the research. In the design-science paradigm, knowledge and understanding of a problem domain and its solution are achieved in the building and application of the designed artifact. Three recent exemplars in the research literature are used to demonstrate the application of these guidelines. We conclude with an analysis of the challenges of performing high-quality design-science research in the context of the broader IS community.
- Park, J., & Ram, S. (2004). Information systems interoperability: What lies beneath?. ACM Transactions on Information Systems, 22(4), 595-632. Abstract: Interoperability is the most critical issue facing businesses that need to access information from multiple information systems. Our objective in this research is to develop a comprehensive framework and methodology to facilitate semantic interoperability among distributed and heterogeneous information systems. A comprehensive framework for managing various semantic conflicts is proposed. Our proposed framework provides a unified view of the underlying representational and reasoning formalism for the semantic mediation process. This framework is then used as a basis for automating the detection and resolution of semantic conflicts among heterogeneous information sources. We define several types of semantic mediators to achieve semantic interoperability. A domain-independent ontology is used to capture various semantic conflicts. A mediation-based query processing technique is developed to provide uniform and integrated access to the multiple heterogeneous databases. A usable prototype is implemented as a proof-of-concept for this work. Finally, the usefulness of our approach is evaluated using three cases in different application domains. Various heterogeneous datasets are used during the evaluation phase. The results of the evaluation suggest that correct identification and construction of both schema and ontology-schema mapping knowledge play very important roles in achieving interoperability at both the data and schema levels.
- Ram, S. (2004). Modeling the semantics of 3d protein structures. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3288, 696-708. Abstract: The post Human Genome Project era calls for reliable, integrated, flexible, and convenient data management techniques to facilitate research activities. Querying biological data that is large in volume and complex in structure such as 3D proteins requires expressive models to explicitly support and capture the semantics of the complex data. Protein 3D structure search and comparison not only enable us to predict unknown structures, but can also reveal distant evolutionary relationships that are otherwise undetectable, and perhaps suggest unsuspected functional properties. In this work, we model 3D protein structures by adding spatial semantics and constructs to represent the contributing forces such as hydrogen bonds and high-level structures such as protein secondary structures. This paper makes a contribution to modeling the specialty of life science data and develops methods to meet the novel challenges posed by such data. © Springer-Verlag 2004.
- Ram, S., & Park, J. (2004). Semantic Conflict Resolution Ontology (SCROL): An Ontology for Detecting and Resolving Data and Schema-Level Semantic Conflicts. IEEE Transactions on Knowledge and Data Engineering, 16(2), 189-202. Abstract: Establishing semantic interoperability among heterogeneous information sources has been a critical issue in the database community for the past two decades. Despite the critical importance, current approaches to semantic interoperability of heterogeneous databases have not been sufficiently effective. We propose a common ontology called Semantic Conflict Resolution Ontology (SCROL) that addresses the inherent difficulties in the conventional approaches, i.e., federated schema and domain ontology approaches. SCROL provides a systematic method for automatically detecting and resolving various semantic conflicts in heterogeneous databases. SCROL provides a dynamic mechanism of comparing and manipulating contextual knowledge of each information source, which is useful in achieving semantic interoperability among heterogeneous databases. We show how SCROL is used for detecting and resolving semantic conflicts between semantically equivalent schema and data elements. In addition, we present evaluation results to show that SCROL can be successfully used to automate the process of identifying and resolving semantic conflicts.
- Zhao, H., & Ram, S. (2004). Clustering schema elements for semantic integration of heterogeneous data sources. Journal of Database Management, 15(4), 88-106. Abstract: Interschema relationship identification (IRI), that is, determining the relationships among schema elements in heterogeneous data sources, is an important step in integrating the data sources. This article proposes a cluster analysis based approach to semi-automating the IRI process, which is typically very time-consuming and requires extensive human interaction. The authors apply multiple clustering techniques, including K-means, hierarchical clustering, and self-organizing map (SOM) neural network, to identify similar schema elements from heterogeneous data sources, based on a combination of features such as naming similarity, document similarity, schema specification, data patterns, and usage patterns. An SOM prototype the authors have developed provides users with a visualization tool for display of clustering results as well as for incremental evaluation of candidate similar elements.
- Zhao, H., & Ram, S. (2004). Constrained cascade generalization of decision trees. IEEE Transactions on Knowledge and Data Engineering, 16(6), 727-739. Abstract: While decision tree techniques have been widely used in classification applications, a shortcoming of many decision tree inducers is that they do not learn intermediate concepts, i.e., at each node, only one of the original features is involved in the branching decision. Combining other classification methods, which learn intermediate concepts, with decision tree inducers can produce more flexible decision boundaries that separate different classes, potentially improving classification accuracy. We propose a generic algorithm for cascade generalization of decision tree inducers with the maximum cascading depth as a parameter to constrain the degree of cascading. Cascading methods proposed in the past, i.e., loose coupling and tight coupling, are strictly special cases of this new algorithm. We have empirically evaluated the proposed algorithm using logistic regression and C4.5 as base inducers on 32 UCI data sets and found that neither loose coupling nor tight coupling is always the best cascading strategy and that the maximum cascading depth in the proposed algorithm can be tuned for better classification accuracy. We have also empirically compared the proposed algorithm and ensemble methods such as bagging and boosting and found that the proposed algorithm performs marginally better than bagging and boosting on the average.
- Bajaj, A., & Ram, S. (2003). IAIS: A methodology to enable inter-agency information sharing in eGovernment. Journal of Database Management, 14(4), 59-80. Abstract: Recently, there has been increased interest in information sharing among government agencies, with a view toward improving security, reducing costs and offering better quality service to users of government services. In this work, the authors complement earlier work by proposing a comprehensive methodology called IAIS (Inter Agency Information Sharing) that uses XML to facilitate the definition of information that needs to be shared, the storage of such information, the access to this information and finally the maintenance of shared information. The authors compare IAIS with two alternate methodologies to share information among agencies, and analyze the pros and cons of each. They also show how IAIS leverages the recently proposed XML (extensible markup language) standard to allow for inclusion of various groups' viewpoints when determining what information should be shared and how it should be structured.
- Madnick, S., Chu, W., Knoblock, C. A., Ram, S., & Wiederhold, G. (2003). Panel: Information and knowledge sharing - Successes and failures. Proceedings of the IASTED International Conference on Information and Knowledge Sharing, 3-5. Abstract: The successes and failures in information and knowledge sharing to increase the number of successes are discussed. The exploitation of information and knowledge sources requires 'logical connectivity' to obtain information from disparate sources which assimilates information meaningfully. The information used in Information Retrieval comes in different formats such as structured data, multimedia, semi-structured and unstructured data. The problems of integrating data from information sources with mixed success reviews four critical research areas such as getting access to information, planning to integrate data across sources, executing integration plans and resolving inconsistencies across sources.
- Ram, S., & Bajaj, A. (2003). Data management for e-business. Journal of Database Management, 14(4), i-iii.
- Bajaj, A., & Ram, S. (2002). SEAM: A state-entity-activity-model for a well-defined workflow development methodology. IEEE Transactions on Knowledge and Data Engineering, 14(2), 415-431. Abstract: Current conceptual workflow models use either informally defined conceptual models or several formally defined conceptual models that capture different aspects of the workflow, e.g., the data, process, and organizational aspects of the workflow. To the best of our knowledge, there are no algorithms that can amalgamate these models to yield a single view of reality. A fragmented conceptual view is useful for systems analysis and documentation. However, it fails to realize the potential of conceptual models to provide a convenient interface to automate the design and management of workflows. First, as a step toward accomplishing this objective, we propose SEAM (State-Entity-Activity-Model), a conceptual workflow model defined in terms of set theory. Second, no attempt has been made, to the best of our knowledge, to incorporate time into a conceptual workflow model. SEAM incorporates the temporal aspect of workflows. Third, we apply SEAM to a real-life organizational unit's workflows. In this work, we show a subset of the workflows modeled for this organization using SEAM. We also demonstrate, via a prototype application, how the SEAM schema can be implemented on a relational database management system. We present the lessons we learned about the advantages obtained for the organization and, for developers who choose to use SEAM, we also present potential pitfalls in using the SEAM methodology to build workflow systems on relational platforms. The information contained in this work is sufficient to allow application developers to utilize SEAM as a methodology to analyze, design, and construct workflow applications on current relational database management systems. The definition of SEAM as a context-free grammar, the definition of its semantics, and its mapping to relational platforms should also be sufficient to allow the construction of an automated workflow design and construction tool with SEAM as the user interface.
- March, S., Hevner, A., & Ram, S. (2000). Research Commentary: An Agenda for Information Technology Research in Heterogeneous and Distributed Environments. Information Systems Research, 11(4), 327-341. Abstract: Application-driven, technology-intensive research is critically needed to meet the challenges of globalization, interactivity, high productivity, and rapid adaptation faced by business organizations. Information systems researchers are uniquely positioned to conduct such research, combining computer science, mathematical modeling, systems thinking, management science, cognitive science, and knowledge of organizations and their functions. We present an agenda for addressing these challenges as they affect organizations in heterogeneous and distributed environments. We focus on three major capabilities enabled by such environments: Mobile Computing, Intelligent Agents, and Net-Centric Computing. We identify and define important unresolved problems in each of these areas and propose research strategies to address them.
- Ram, S., Khatri, V., Hwang, Y., & Yool, S. R. (2000). Semantic modeling and decision support in hydrology. Photogrammetric Engineering and Remote Sensing, 66(10), 1229-1239. Abstract: The current revolution in interconnectivity and online availability of Earth science data has enabled hydrology end users to access a wide variety of Earth science data through the World Wide Web (WWW). However, these distributed data sources have various data formats and numerous spatial and temporal resolutions, which limits the usability of the available datasets. In this paper, we describe how we have applied semantic modeling and ontology to achieve context-based information integration. We are developing a Hydrology Decision Support System (HyDSS), a prototype state-of-the-art web-based decision support system that provides a comprehensive environment for information integration and analysis. HyDSS is a part of the Hydrology Information System (HyDIS), an overall information system to support the requirements of hydrologic end users. It is aimed at supporting the entire decision-making process of hydrological end users, i.e., it helps in information collation and can provide an interface to third-party modeling and simulation tools.
- Ram, S. (1999). Semantic-model support for geographic information systems. Computer, 32(5), 74-81. Abstract: An understanding of geographic information system (GIS) databases will clarify some of the inherent problems in accessing and managing spatiotemporal data. Against this background, a semantic model that explicitly captures the spatial and temporal characteristics of GIS data and their interrelationships is described. The model is part of an overall system for solving some of the problems in accessing data from heterogeneous GIS sources.
- Ram, S., & Shankaranarayanan, G. (1999). Modeling and navigation of large information spaces: A semantics based approach. Proceedings of the Hawaii International Conference on System Sciences, 215-. Abstract: In this paper we present techniques for modeling the semantics of large information spaces and for navigating them. This information space represents heterogeneous data stored in different formats and distributed across multiple locations on the Internet. We also describe a prototype system called SEMQUEST (SEMantics based QUEry SysTem) that employs graph-based algorithms and allows users to interactively explore, manipulate, and relate data in large information spaces to their interests. It provides users with the flexibility to understand what is available in the information space, determine which parts are relevant, and query/retrieve underlying data using a visual framework. SEMQUEST also allows users to share software modules permitting them to reuse data analysis/visualization codes. We demonstrate system use with global climate change data collected by centers across the world. We believe this research serves as a foundation for future work in integrating information sources across the WWW.
- Ram, S., Park, J., & Lee, D. (1999). Digital Libraries for the Next Millennium: Challenges and Research Directions. Information Systems Frontiers, 1(1), 75-94. Abstract: The unprecedented growth of Internet technologies has made resources on the World Wide Web instantly accessible to various user communities through digital libraries. Since the early 1990s, there have been several digital library initiatives sponsored by government agencies and/or private organizations all over the world. A digital library is a networked system environment that provides diverse user communities with coherent, seamless and transparent access to large, organized, and digitized information resources. This article provides a comprehensive overview of major digital library projects that are currently being undertaken across the globe. We also identify and discuss major challenges and research issues to be addressed in the design and implementation of digital libraries for the next millennium. We believe that digital libraries are ripe with research opportunities, offer many challenges, and will continue to grow in the next several years.
- Timmermann, B. N., Wachter, G., Valcic, S., Hutchinson, B., Casler, C., Henzel, J., Ram, S., Currim, F., Manak, R., Franzblau, S., Maiese, W., Galinis, D., Suarez, E., Fortunato, R., Saavedra, E., Bye, R., Mata, R., & Montenegro, G. (1999). The Latin American ICBG: The first five years. Pharmaceutical Biology, 37(SUPPL.), 35-54. Abstract: The University of Arizona was awarded an International Cooperative Biodiversity Groups (ICBG) Program in 1993 for research into drug discovery from arid-adapted plants, biodiversity conservation and economic development in Latin America. The second phase of this program, initiated in 1998, will add the study of endophytic microorganisms as potential sources of new drugs. While biodiversity from arid lands is well known to produce a vast array of natural products as defensive agents and poisons, they have received much less attention than plants and microorganisms from the tropical rainforests as potential sources of useful biological agents. This experimental effort, funded by the U.S. government, is done in cooperation with universities and research institutions from the U.S.A., Argentina, Chile and Mexico, and U.S. pharmaceutical and agrochemical corporations. Detailed intellectual property agreements were fully executed among the participants defining work and funding commitments, ownership of materials, licensing rights and distribution of potential benefits. The collaborative model for bioprospecting employed in this ICBG has been studied with great enthusiasm by institutions developing access and benefit-sharing policies in Chile, Argentina, Mexico and other countries. Capacity building has been done to support the research and conservation efforts of the overall project by building laboratory infrastructure and information handling capabilities, and by promoting exchange of resources, information and ideas through formal links between the collaborating institutions. We currently summarize the goals, accomplishments, challenges, problems, and solutions encountered in the first five years of this ICBG.
- Konana, P., & Ram, S. (1998). Transaction management mechanisms for active and real-time databases: A comprehensive protocol and a performance study. Journal of Systems and Software, 42(3), 205-225. Abstract: Active and real-time databases (ARTDB) have a variety of applications in electronic brokerages in financial markets, stock trading, network management and manufacturing process control. Transaction processing (TP) in ARTDB is extremely complicated since transactions may trigger other real-time transactions to an arbitrary depth with various types of dependencies (coupling modes). Therefore, transaction processing must be cognizant of not only the time deadlines but also the types of semantic dependencies with other transactions. The conflict resolution between two transactions cannot be considered in isolation since affecting one transaction may affect every other semantically dependent transaction. Similarly, transaction scheduling needs to be compatible with the concurrency control to avoid unnecessary restarts. In this paper we argue that transaction pre-analysis using the pre-declaration paradigm is an efficient mechanism to integrate the various issues of transaction processing such as concurrency control, scheduling, and semantic dependencies. The pre-analysis is possible since in many applications transactions repeat from a set of transaction classes, and the conflicts can be easily determined at a logical level by partitioning relations into mutually exclusive subsets (e.g., by stock-id in financial applications). We develop a pre-analysis based transaction processing mechanism called OCCWB. OCCWB is an extended optimistic concurrency control protocol with blocking that combines the benefits of both optimistic and lock based protocols. Such an approach also has an implicit overload management mechanism required in many applications. OCCWB consists of four phases, namely, transaction pre-analysis, serialization ordering, priority adjustment and priority wait. Our protocol is validated using simulation and is shown to outperform existing protocols under various workload and parameter settings. © 1998 Elsevier Science Inc. All rights reserved.
- Ram, S., & Ramesh, V. (1998). Collaborative conceptual schema design: A process model and prototype system. ACM Transactions on Information Systems, 16(4), 347-371. Abstract: Recent years have seen an increased interest in providing support for collaborative activities among groups of users participating in various information systems design tasks such as requirements determination and process modeling. However, little attention has been paid to the collaborative conceptual database design process. In this article, we develop a model of the collaborative conceptual schema development process and describe the design and implementation of a graphical multiuser conceptual schema design tool that is based on the model. The system we describe allows a group of users to work collaboratively on the creation of database schemas in a synchronous (same-time) mode (either in a face-to-face or distributed setting). Extensive modeling support is provided to assist users in creating semantically correct conceptual schemas. The system also provides users with several graphical facilities such as a large drawing workspace with the ability to scroll or "jump" to any portion of this workspace, zooming capabilities, and the ability to move object(s) to any portion of the workspace. The unique component of the system, however, is its built-in support for collaborative schema design. The system supports a relaxed WYSIWIS environment, i.e., each user can control the graphical layout of the same set of schema objects. The system ensures that changes/additions made by any user are immediately reflected at other user workstations and that all users' schemas are consistent. Any conflicts that may compromise the integrity of the shared schema are flagged and resolved by the system. The results from a preliminary experiment suggest that the use of our system in a collaborative mode improved information sharing among users, minimized conflicts, and led to a more comprehensive schema definition. © 1998 ACM.
- Ram, S., & Ramesh, V. (1998). Information sharing among multiple heterogeneous data sources distributed across the Internet. Proceedings of the Hawaii International Conference on System Sciences, 4, 504-. Abstract: The technologies, techniques and protocols that can be used to facilitate the sharing of information from multiple heterogeneous data sources distributed across the Internet are presented. This minitrack consists of three papers addressing the various aspects of information sharing.
- Tamhankar, A. M., & Ram, S. (1998). Database fragmentation and allocation: An integrated methodology and case study. IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, 28(3), 288-305. Abstract: Distributed database design requires decisions on closely related issues such as fragmentation, allocation, degree of replication, concurrency control, and query processing. Integrated methodologies for distributed database design, therefore, tend to be very complex, predominantly theoretical, and limited in scope from a practical standpoint. Further, although the distribution options are interdependent, existing methodologies deal with fragmentation, replication, and allocation independent of one another. We develop an integrated methodology for fragmentation and allocation that is simple and practical and can be applied to real-life problems. The methodology also incorporates replication and concurrency control costs. At the same time, it is theoretically sound and comprehensive enough to achieve the objectives of efficiency and effectiveness. It distributes data across multiple sites such that design objectives in terms of response time and availability for transactions, and constraints on storage space, are adequately addressed. This work makes one of the first attempts at successfully combining fragmentation, allocation, and replication into a single step of distribution and applying the combination to a practical problem with positive results. This methodology has been used successfully in designing a distributed database system for a large geographically distributed organization. © 1998 IEEE.
- Konana, P., Lee, J., & Ram, S. (1997). Updating timestamp interval for dynamic adjustment of serialization order in Optimistic Concurrency Control-Time Interval (OCCTI) protocol. Information Processing Letters, 63(4), 189-193.
- Ram, S., Venkatsubramanyan, S., Marsh, S., & Ball, G. (1997). Resource discovery and intelligent image retrieval in a distributed environment. Proceedings of the Hawaii International Conference on System Sciences, 4, 340-349. Abstract: Remote sensing satellites are generating large amounts of (image) data at different levels of temporal, spatial and spectral resolution, with varying levels of detail, and at various stages of processing with a variety of algorithms. This bewildering array of data sources is going to mandate a new class of information retrieval systems to help end-users access the networked information resources effectively. In this paper, we propose intelligent agent based mechanisms to locate the appropriate sources of satellite data, processing algorithms, and ancillary data that may be required at various levels of processing to answer different types of temporal and spatial queries.
- Ramesh, V., & Ram, S. (1997). Integrity constraint integration in heterogeneous databases: An enhanced methodology for schema integration. Information Systems, 22(8), 423-446. Abstract: In today's technologically diverse corporate environment, it is common to find several different databases being used to accomplish the organization's operational data management functions. Providing interoperability among these databases is important to the successful operation of the organization. One approach to providing interoperability among heterogeneous database systems is to define one or more schemas which represent a coherent view of the underlying databases. In the past, most approaches have used schematic knowledge about the underlying databases to generate integrated representations of the databases. In this paper we present a seven-step methodology for utilizing integrity constraint knowledge from heterogeneous databases. Specifically, we describe how we can generate a set of integrity constraints applicable at the integrated level from constraints specified on local databases. We introduce the concept of constraint-based relationships between objects in heterogeneous databases and describe the role that these relationships play in integrity constraint integration. Finally, we describe how the integrated set of constraints generated using our methodology can be used to facilitate semantic query processing in a heterogeneous database environment. © 1997 Elsevier Science Ltd. All rights reserved.
- Choobineh, J., & Ram, S. (1996). Practical aspects of teaching an applied database course. Data Base for Advances in Information Systems, 27(3), 70-82. Abstract: The information systems curriculum recommendations proposed by ACM (Nunamaker et al., 1982), IFIP (Buckingham et al., 1987), DPMA (1991), and the recent ACM/AIS/DPMA (Couger et al., 1995, 1997) contain an overlapping core and several electives. Coverage of database concepts is required by all of these curriculum recommendations for the development of information systems professionals. An important component of this requirement is learning the practical aspects of developing information systems by using databases. In this paper we will present our experiences and recommendations in implementing this important aspect of teaching database courses within the information systems curriculum. In particular, we will a) discuss management of field projects, including their benefits and pitfalls; and b) make recommendations on tools and technologies available for implementation within a classroom environment.
- Ram, S., & Ram, S. (1996). Design and validation of a knowledge-based system for screening product innovations. IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, 26(2), 213-221. Abstract: Our research goal is to develop and validate an expert system that screens innovations prior to commercialization. This is an important research issue because business corporations are highly dependent on innovations for their growth and profitability, yet most corporations suffer from a high rate of new product failure. Few of the existing decision support systems have alleviated this problem, partly because of their inability to deal with nonmathematical (logical) relationships. An expert system for new product planning could save organizations tremendous amounts of resources (such as dollars, time and scientific talent) spent on product failures. The design of the proposed knowledge-based system is built upon our earlier work in this area. We have addressed several critical research issues in the development of such a system: choice of the appropriate sources of knowledge, resolution of conflict among human experts chosen for knowledge acquisition, use of knowledge programming techniques that can accommodate uncertainty, and multiple methods of system validation. The research makes several contributions to marketing theory and practice. Most notably, the development of such systems contributes to effective product planning in organizations and enhances resource efficiency. Further, it generates guidelines for capturing and using expertise in highly unstructured decision-making situations such as product management. © 1996 IEEE.
- Ram, S., & Ram, S. (1996). Screening products for international markets: An expert systems application. New Review of Applied Expert Systems and Emerging Technologies, 2, 19-31. Abstract: Although expert systems have been developed for several marketing applications, few have been developed in the context of international marketing. In this paper, we present the results of our efforts to design and implement the prototype of ADAPTOR, an expert system which assists firms in international market entry decisions. Specifically, the system can help U.S. multinational firms decide: (1) whether to enter the Indian market for consumer packaged goods; and, if so, (2) how to adapt the product for the Indian market. Knowledge for the system was acquired through in-depth interviews with country experts and product experts selected through well-defined criteria. The system has been implemented in a PC environment using a shell called VPEXPERT. We discuss our preliminary validation procedures, identify potential research contributions of this project, and discuss areas for future development.
- Ram, S., & Ram, S. (1996). Validation of expert systems for innovation management: Issues, methodology, and empirical assessment. Journal of Product Innovation Management, 13(1), 53-68. Abstract: Faced with the complexities of managing new product development, most of us would welcome the support of a computer-based system that captures the knowledge and the reasoning capabilities of experts in our field. Considerable effort has been focused on the design and development of expert systems for applications such as new product management. However, design and development are only two steps on the path to successful implementation of a useful expert system. A rigorous validation process is essential for ensuring that the expert system performs as intended. Using the INNOVATOR expert system as an example, Sundaresan Ram and Sudha Ram propose and test a framework for validating expert systems designed for new product management. The proposed validation framework considers three aspects of the expert system: its knowledge acquisition methodology, its performance, and its utility. Validation of an expert system's knowledge acquisition methodology involves assessment of the knowledge sources used, the criteria for selecting human experts, and the methods used for knowledge acquisition. Using multiple sources improves the likelihood that the expert system will capture the necessary core knowledge. Similarly, selection of the experts who are to supply the knowledge used by the expert system should be based on reliable measures of new product expertise rather than ad hoc measures. The system's performance is evaluated through formal tests of the accuracy and the completeness of the knowledge base, the consistency and the accuracy of the decisions made by the system, and the reasoning process by which the system reaches its decisions. Such tests may involve direct examination of the system by experts, and Turing tests, which compare both the recommendations and the reasoning process of the system with those of selected experts. Both types of tests may involve experts from whom knowledge was acquired during the development of the system as well as experts who were not involved in the design and development of the system. Assessment of an expert system's utility focuses on user perceptions of system performance and utility as well as the design of the user interface. First, end-users must evaluate the relevance of the chosen problem domain. In other words, the validation process must verify that the expert system addresses an important problem that requires decision support tools. Second, the expert system must provide a logical, systematic approach to solving the problem. Finally, the expert system must provide a consistent, intuitive user interface.
- Blanning, R. W., Ram, S., & Wang, R. Y. (1995). Information technologies and systems. Decision Support Systems, 13(3-4), 219-221.
- Hayne, S., & Ram, S. (1995). Group data base design: Addressing the view modeling problem. The Journal of Systems and Software, 28(2), 97-116. Abstract: Today's organizations increasingly depend on the use of data base technology to manage their operations. Advances in technology have resulted in increasing the number and complexity of these data bases. Despite their growing complexity, all data bases have one thing in common: each must have gone through either a formal or an informal design process. Data bases must mirror reality accurately, and thus the design process must better capture that reality. The heart of the design process is the conceptual design, data model mapping, and physical design. Our research focuses on providing automated support for the first of these, i.e., conceptual design. Conceptual design is known to be a very difficult and time-consuming phase in the development of data base applications. This article describes the architecture, implementation, and use of a distributed graphical group data base design system. The group view modeling system (GVMS) is implemented in Microsoft Windows for networked personal computers. The main purpose of GVMS is to allow multiple designers (or users) to share conceptual design information in real time and resolve design conflicts through the electronic medium. The underlying data model, semantic data model, is extended to include distribution information as well as transactions and is represented as an extended entity relationship model. Diagram management techniques are implemented to aid in simplifying large complex designs. A small study demonstrated that groups of data base designers who define their view collectively outperform individuals. © 1995.
- Ram, S. (1995). Intelligent database design using the unifying semantic model. Information and Management, 29(4), 191-206. Abstract: Research and development in the field of database systems has culminated in its widespread use. As usage has grown, the desire to link separate databases has resulted in substantial effort being directed towards the design of distributed database systems. A major research issue in designing such systems is the definition of a formal model that can be used to capture the semantics of the individual databases. This paper presents a semantic model called the Unifying Semantic Model (USM) and a software tool using it. The USM can be used for modeling the complex interrelationships and semantics found in a manufacturing environment. It is based on enhancements to existing semantic models. It can serve as a formal specification and documentation tool for databases. It can provide a means of specifying the Universe of Discourse for any interchange of information. © 1995.
- Ram, S., & Ramesh, V. (1995). Blackboard-based cooperative system for schema integration. IEEE Expert, 10(3), 56-62. Abstract: A comprehensive methodology for schema integration, a very important step towards creating information system interoperability, is presented. The methodology demonstrates how to employ a blackboard-based system to provide a cooperative environment that allows multiple computational human agents to interact. A successful prototype system has demonstrated the feasibility and utility of this approach.
- Storey, V. C., Thompson, C. B., & Ram, S. (1995). Understanding database design expertise. Data and Knowledge Engineering, 16(2), 97-124. Abstract: Database design is a complex and time-consuming process. In order to automate database design, an understanding of the nature of expertise that goes into the design process is needed. Although a number of expert systems have been developed to assist or replace a database designer, database design expertise has not been examined in any detail. This paper proposes a conceptual framework for explaining this type of expertise. The components of the framework are applied to each phase of the design process and used to provide guidelines for the level of expertise developers might strive to obtain. Several representative systems are analyzed, based on the framework, to explore the degree to which expertise is being captured. Implications for the future development of database design expert systems are discussed. © 1995.
- Ram, S., & Narasimhan, S. (1994). Database allocation in a distributed environment: incorporating a concurrency control mechanism and queuing costs. Management Science, 40(8), 969-983. Abstract: This research investigates the problem of allocating database fragments across a set of computers connected by a communication network. A mathematical model is presented to aid designers in the development of distributed database systems. The model takes into account the pattern of usage of the databases, communication costs in the network, delays due to queuing of data requests, costs for maintaining consistency among the various copies of a database, and storage costs. A solution procedure based on Lagrangian relaxation is proposed to solve the model. Computational results are reported along with several useful observations. The model is applicable to organizations that are considering migration from a centralized to a distributed computing environment.
- Ram, S., Hayne, S., & Carlson, D. (1992). Integrating information systems technologies to support consultation in an information center. Information and Management, 23(6), 331-343. Abstract: This paper presents an approach for integrating different types of information systems technologies to support the functions of an information center (IC). A knowledge-based system, information center expert/help service (ICE/H), has been developed to provide support for the help services of an IC. A general process model to represent the consultation process in an IC is described. Based on this model, an architecture has been developed to support the consultation process. The architecture depicts the use of a knowledge management system, a data management system and a communication (e-mail) system to emulate the consultation process. The ICE/H system has been implemented using this architecture to support an IC with 5000 users. © 1992 - Elsevier Science Publishers B.V. All rights reserved.
- Ram, S., & Marsten, R. E. (1991). A model for database allocation incorporating a concurrency control mechanism. IEEE Transactions on Knowledge and Data Engineering, 3(3), 389-395. Abstract: The impact of incorporating a specific concurrency control mechanism (CCM) into the file allocation problem (FAP) is discussed. Depending on the specific CCM used, the communication flows in a network will vary. To allocate data optimally, one must identify the exact communication flows in the network. It is this aspect that has been ignored in past research on the FAP. A linear mixed-integer programming model formulated for the FAP is given. The model incorporates the 'Write Locks All-Read Locks One' mechanism for concurrency control. A special algorithm based on the implicit representation of variable upper bounds is developed to solve the model. Detailed analysis for various configurations of a network is performed. Several potential applications for the model are identified.
- Carlson, D. A., & Ram, S. (1990). HyperIntelligence: The next frontier. Communications of the ACM, 33(3), 311-321. Abstract: The authors discuss how mental models may be used to organize an individual's thoughts while forming a plan. A case study from a strategic planning textbook is used to facilitate the examination of supporting mental models with an information system. A hypermedia system, SPRINT, which supports an explicit representation of a mental model as a network of associations among the elements of a strategic plan is described.
- Carlson, D. A., & Ram, S. (1990). Modeling organizations as a social network of distributed knowledge-based systems. Proceedings of the Hawaii International Conference on System Science, 4, 271-280. Abstract: An architecture is described that allows multiple, heterogeneous knowledge-base systems to cooperate in a partially structured social network. The design is conceived from social epistemological theory which studies the social influence on beliefs and evaluates social practices based on a truth-linked criterion. A planning theory of intention is reviewed as a mechanism for coordinating actions between agents. A design technique is proposed whereby information technology is applied to augment actual human processes that are deficient relative to the normative theory. A distributed knowledge-base management system architecture is described that models the practices and group methods of a social community. The individual knowledge-base systems model the processes and methods used by individual agents. Research issues are outlined for extending this architecture.
- Hayne, S., & Ram, S. (1990). Multi-user view integration system (MUVIS): An expert system for view integration. Proceedings - Sixth International Conference on Data Engineering, 402-409. Abstract: A description is given of the architecture and development of a knowledge-based system called MUVIS (multiuser view integration system) to support the design of distributed object-oriented databases. MUVIS is implemented using an object-oriented development environment. It assists database designers in representing user views and integrating these views into a global conceptual view. The view integration component is decoupled from the view modeling component. The underlying data model, the semantic data model, treats all parts of the design as objects, thereby reducing the complexity of the integration.
- Ram, S., & Curran, S. M. (1990). The synthesis approach for relational database design: an expanded perspective. Information Sciences, 52(1), 53-73. Abstract: This paper addresses relational database design using the concept of functional dependencies (FDs). The classical synthesis approach processes a given set of functional dependencies to produce one minimal cover. This cover is then used to develop a relational schema; however, a given set of FDs may have more than one minimal cover. In turn, different minimal covers may give rise to different relational schemata. An enhancement is proposed to the traditional synthesis algorithm that aids in efficiently determining all minimal covers for a given set of functional dependencies. The algorithm has been implemented using Turbo Pascal on an IBM PC/AT. The performance of this algorithm is compared with that of the traditional synthesis algorithm. © 1990.
- Ram, S., & Ram, S. (1990). Screening innovations in the telecommunications industry: An expert systems approach. Proceedings of the Hawaii International Conference on System Science, 3, 257-263. Abstract: Research is described whose goal is to develop and validate an expert system that will screen new products and services prior to commercialization. The authors have developed a model for screening telecommunications innovations and built a prototype expert system based on this model. They have addressed several critical research issues in the development of such a system: choice of the appropriate sources of knowledge, resolution of conflict among human experts chosen for knowledge acquisition, use of knowledge programming techniques that can accommodate uncertainty, and suggestions for multiple methods of system validation. It is noted that the development of the system will contribute to effective product planning in organizations and enhance resource efficiency. It will also generate guidelines for capturing and using expertise in highly unstructured decision-making situations in business such as new product screening.
- Ram, S., Carlson, D., & Jones, A. (1990). Distributed knowledge based systems for computer integrated manufacturing. NIST Special Publication, 334-352. Abstract: Efforts are being made in many organizations to use new technologies to automate and integrate the design, planning and manufacturing processes. The goal in developing these computer integrated manufacturing (CIM) systems is to increase productivity, improve product quality, and minimize wastage of resources. This paper describes the information required to carry out major manufacturing functions. It also examines some of the special characteristics of these functions and their inputs/outputs. Based on this examination, the paper proposes an architecture for integrating distributed knowledge based systems (DKBS) to support CIM. An object oriented design for the various components of the DKBS is briefly described. Issues requiring further research are outlined.
- Carlson, D. A., & Ram, S. (1989). Object-oriented design for distributed knowledge-based systems. Proceedings of the Hawaii International Conference on System Science, 3, 55-63. Abstract: Many management information systems have requirements that are inherently distributed, either logically or physically. An architecture for distributed knowledge-based systems (DKBS) is presented which satisfies these requirements. A distributed knowledge-base management system (DKBMS) is described which manages the metaknowledge necessary to coordinate multiple local KBSs. The authors propose an object-oriented design for implementing the DKBMS structure and the communication protocol which connects the DKBMS with each of the local KBSs. An example is described where the DKBS architecture is applied to planning research projects within a university. The resulting architecture is compared with previous research in related topics.
- Ram, S. (1989). A model for designing distributed database systems. Information and Management, 17(3), 169-180. Abstract: In designing distributed database systems, an important issue is the location of various copies of each database. This is known as the File Allocation Problem (FAP). This paper examines the impact of incorporating a specific concurrency control mechanism (CCM) into the FAP. Several mechanisms can be used, and, depending on the choice, the communication flows in the network will vary. In order to allocate data optimally, one must identify the exact communication flows. It is this that has been ignored in past research. Here a non-linear integer programming model has been formulated for the FAP. It incorporates the Central Node Locking mechanism for concurrency control. The model has been solved using an algorithm called ZOOM/XMP. Detailed analysis has been carried out for various configurations. Assumed values have been used for the various non-decision parameters that need to be entered. © 1989.
- Ram, S., & Chastain, C. L. (1989). Architecture of distributed data base systems. The Journal of Systems and Software, 10(2), 77-95. Abstract: Research and development over the last twenty years has culminated in the widespread use of data base management system (DBMS) software. As usage has grown, the desire to link and integrate separate data bases has resulted in substantial effort being directed towards the design of distributed data base systems. This paper presents the major architectures which have emerged for distributed data base systems. The architectures are compared and evaluated. Sixteen distributed data base management system (DDBMS) projects have been surveyed and classified according to the architectures. The various projects represent widely differing stages of effort: academic research, industrial testbeds, and commercial prototypes. The survey reviews important features of the DDBMSs. It does not attempt a qualitative performance comparison. The focus is instead on identification of overall architectural characteristics. The usefulness of the survey lies in the summary information which it imparts on current research, and in the classification scheme for generic distributed data base architectures which it provides. © 1989.
- Ram, S., & Curran, S. M. (1989). An automated tool for relational database design. Information Systems, 14(3), 247-259. Abstract: This paper addresses relational database design using the concept of functional dependencies (FDs). The classical synthesis approach processes a given set of functional dependencies to produce one minimal cover. This cover is then used to develop a relational schema; however, a given set of FDs may have more than one minimal cover. In turn, different minimal covers may give rise to different relational schemata. An enhancement is proposed to the traditional synthesis algorithm that aids in efficiently determining all minimal covers for a given set of functional dependencies. We have implemented a tool called SYNTHESIZER that uses the modified synthesis algorithm to produce relations in Third Normal Form. SYNTHESIZER not only preserves dependencies, but also enforces the lossless join property. SYNTHESIZER has been implemented in Turbo Pascal to operate on IBM-PC compatibles. Expert design heuristics have been incorporated into this tool. The tool supports the requirements collection, conceptual and logical design phases of database design. SYNTHESIZER has been extensively validated by designers in more than 10 different database design projects. © 1989.
- Ram, S., & Ram, S. (1989). Expert systems: An emerging technology for selecting new product winners. The Journal of Product Innovation Management, 6(2), 89-98. Abstract: Sundaresan Ram and Sudha Ram describe INNOVATOR, an expert system that they have developed to assess the success potential of new products in the financial services industry. They provide details on key aspects of the system: how knowledge is collected from experts and encoded; how a user interacts with the system to obtain recommendations on specific new products; and how the inference engine of the system arrives at the final recommendation. They discuss how INNOVATOR can, if necessary, be linked to external databases for additional data inputs. The authors also recommend alternative ways of eliciting the expertise of new product planners and different ways of programming their knowledge into the system. They highlight the key advantages that an expert system provides over conventional new product planning models. With its ability to detect potential winners early, an expert system like INNOVATOR can significantly contribute to the new product development process. © 1989.
- Ram, S., & Curran, S. M. (1988). SYNTHESIS APPROACH FOR RELATIONAL DATABASE DESIGN: AN EXPANDED PERSPECTIVE. Proceedings of the Hawaii International Conference on System Science, 571-580. Abstract: Relational database design using the concept of functional dependencies (FDs) is addressed. The classical synthesis approach processes a given set of FDs to produce one minimal cover, which is then used to develop a relational schema. However, a given set of FDs can have more than one minimal cover, and different minimal covers can give rise to different relational schemata. An enhancement is proposed to the traditional synthesis algorithm that aids in efficiently determining all minimal covers for a given set of FDs. The algorithm has been implemented using Turbo Pascal on an IBM PC AT. The performance of this algorithm is compared with that of the traditional synthesis algorithm.
- Ram, S., & Ram, S. (1988). INNOVATOR: An expert system for new product launch decisions. Applied Artificial Intelligence, 2(2), 129-148. Abstract: Expert systems have become an increasingly important area of research for marketing academicians and practitioners. This paper describes an expert system called INNOVATOR that screens new product ideas and provides an approve/reject/reevaluate decision. If desired, the system can explain the logical reasoning it used to arrive at the decision. INNOVATOR uses a rule base consisting of regular IF and fuzzy IF rules and a backward chaining strategy to make recommendations. The paper explains the knowledge (expertise) requirements of the system, as well as the knowledge elicitation process and the knowledge representation scheme. The system has been implemented on an IBM 4381 using a shell called Expert System Environment (ESE). INNOVATOR, in its current form, assists consultants in financial service organizations in deciding whether to offer specific product innovations to their customers. The system is operational and has been validated by comparing its decisions with those of experts in the field.
- Ram, S., & Wang, T. (1988). Expert system for implementing communication protocols. Array, 994-998. Abstract: The Protocol Machine Expert System (PMES) for implementing communication protocols is described. The objective is to develop and implement a general model for communication protocols based on the principles of finite-state machines. The authors have designed an inference method and knowledge representation method, based on semantic networks, for implementing this model. A semantic net consists of points called nodes connected by links called arcs that describe the relations between the nodes. The authors have added interactive capability and automatic error detection to check for invalid external events and other types of errors in the model. They have illustrated the use of PMES by applying it to the TCP (transmission control protocol) of the US Department of Defense Model for Communication.
- Davis, W. J., & Ram, S. (1987). DESIGN OF DISTRIBUTED DATABASES FOR AN AUTOMATED MANUFACTURING FACILITY. Computers in Engineering, Proceedings of the International Computers in Engineering Conference and, 2, 1-7. Abstract: A recent worldwide trend to improve productivity in manufacturing has centered around the adoption of computer controlled processing to provide the essential coordination of the manufacturing processes. This paper presents a decision making control hierarchy (DM/CH) for an automated manufacturing environment. The data requirements at each level in this hierarchy are identified in order to design a distributed database system. Various issues in the design of a distributed database system are outlined. The relevance of these issues with respect to an automated manufacturing environment is discussed. It is shown that there is a natural basis for partitioning data at various levels of the hierarchy.
- Ram, S. (1987). FILE ALLOCATION PROBLEM: AN EXPANDED PERSPECTIVE. Proceedings of the Hawaii International Conference on System Science, 3, 394-405. Abstract: In designing distributed database systems an important issue is where to store the various copies of each database. This issue is known as the File Allocation Problem (FAP). This research addresses FAP from a new perspective. The main objective is to examine the impact of incorporating a specific concurrency control mechanism (CCM) into the FAP. CCM is an integral part of distributed database systems. Several mechanisms can be used for concurrency control. Depending on the specific CCM used, the communication flows in a network will vary. In order to allocate data optimally, one must identify the exact communication flows in the network. This aspect has been ignored in past research on the FAP. In this research a nonlinear integer programming model has been formulated for the FAP. The model incorporates the central node locking mechanism for concurrency control. The model has been solved using an algorithm called ZOOM/XMP. An analysis for various configurations of a network has been carried out. Assumed values have been used for the various non-decision parameters that need to be fed into the model. Several practical implications have been identified that give insight into the FAP.
- Belford, G. G., Liu, J. W., Hwung, S. C., Hsu, C. Y., Kaufman, K. A., Kim, C. K., Kim, J. K., Leo, J., Ma, A., Ng, J., Neff, D., Quinn, J., Ram, S., Yan, Y. L., & Zhang, L. (1986). REPORT GENERATION FACILITY - A HIGH-LEVEL INTERFACE FOR COHERENT ACCESS TO HETEROGENEOUS DATABASE SYSTEMS. Array, 144-150. Abstract: The Report Generation Facility is a high-level interface designed and implemented to support coherent access of data stored in a number of stand-alone, independent, database systems served by it. Specifically, the facility provides a menu which allows a user to request one or more standard reports provided by the individual database systems and the Report Generation Facility, or to formulate a query in an English-like query language without having to specify where the requested data reside or how to retrieve them. The Report Generation Facility translates the user's request or query into subtransactions to the individual database systems and submits the subtransactions in order to retrieve and format the data requested by the user. Thus, the facility provides a standard way to produce reports based on data stored in the database systems with minimal user effort and expertise. The facility's capabilities and architecture are described.
- Ram, S., & Belford, G. G. (1985). MODEL FOR THE DESIGN OF DISTRIBUTED DATABASES. Array, 428-. Abstract: This research is addressed towards solving a problem which falls under the general class of File Allocation Problems (FAP). The main purpose here is to provide a tool for the designer of a distributed database system. With an increasing number of enterprises tending towards distributed systems, a major research question that has come up is how to allocate resources (databases and programs) in a distributed environment. This question has been addressed by formulating a nonlinear integer programming model. Though the same issue has been addressed several times in the past, there is a major difference between the model developed here and all the other models. None of the models in the past have tried to solve the FAP by taking into account the concurrency control mechanism being used. This is a major deficiency of these models because allocation of files is partly based on the communication flows in a network of computers. The communication flows in turn are based on the concurrency control mechanism. It is this weakness that the research here has attempted to overcome by designing a more realistic model.
- Dong, F., Faiz, C., & Ram, S. (2019, December). Modeling and Prediction of Bike Sharing Systems Using Heterogeneous Data Sources. In Proceedings of WITS 2019, Munich, Germany, December 2019.
- Kim, B., Srinivasan, K., & Ram, S. (2019, December). Robust Local Explanations for Healthcare Predictive Analytics: An Application to Fragility Fracture Risk Modeling. In Proceedings of the 40th International Conference on Information Systems, ICIS 2019, Munich, Germany, December 15-18, 2019.
- Kim, B., & Ram, S. (2019, December). Neural Collaborative Filtering with Content for Personalizing Mobile Healthcare Applications. In Workshop on Information Technology and Systems.
- Szep, A., Hashim, M., & Ram, S. (2019, Winter). Polarizing Virtual Water Cooler Chat? An Analysis of Bot Influence on Sentiment During US Elections. In Proceedings of WITS 2019 (Nominated for Best Paper Award).
- Yang, Z., Faiz, C., & Ram, S. (2019, Winter). Suggestion Mining for Online Health Forums. In Proceedings of WITS 2019.
- Lee, H., Razjouyan, J., Nguyen, H., Lindburg, C., Srinivasan, K., Gilligan, B., Canada, K., Sharafkhaneh, A., Mehl, M. R., Currim, F., Ram, S., Lunden, M., Heerwagen, J., Kampschroer, K., Sternberg, E. M., & Najafi, B. (2018). Sensor-Based Sleep Quality Index (SB-SQI): A New Metric to Examine the Association of Office Workstation Type on Stress and Sleep.
- Pries-Heje, J., Ram, S., & Rosemann, M. (2018, December). Proceedings of the International Conference on Information Systems - Bridging the Internet of People, Data, and Things, ICIS 2018, San Francisco, CA, USA, December 13-16, 2018. In ICIS 2018.
- Yang, Z., & Ram, S. (2018, November). Future Impact Prediction of Technology based on Patent Ranking in Heterogeneous Network. In INFORMS Conference on Information Systems and Technology (CIST).
- Yang, Z., Ram, S., & Faiz, C. (2018, December). Drug-Drug Interaction Mining and Interaction Terms Extraction using Deep Learning: A Word-level Attention Bi-Directional LSTM. In Proceedings of Workshop on Information Technology and Systems (WITS 2018).
- Canada, K., Kampschroer, K., Heerwagen, J., Gilligan, B., Andrews, S., Goebel, N., Lunden, M., Herzl, R., Herzl, D., Mehl, M. R., Lee, H., Razjouyan, J., Najafi, B., Skeath, P. R., Sternberg, E. M., Lindberg, C., Ram, S., Currim, F., & Srinivasan, K. (2017, July, 2-5). A Regularization Approach for Identifying Cumulative Lagged Effects in Smart Health Applications. In 7th International Conference on Digital Health, London, 2017, 99-103.
- Lee, K., & Ram, S. (2017, December). Identifying target audience for marketing on social media. In Workshop on Information Technology and Systems (WITS).
- Lee, K., & Ram, S. (2017, December). Reconsidering Measurement of Tie Strength in Online Social Networks. In International Conference on Information Systems (ICIS).
- Ram, S., Currim, F., & Srinivasan, K. (2017, December). Using Digital Health Wearable Devices to Understand the Relationship Between Sound levels and Wellbeing: A Segmented Mixed-effects Regression Approach. In 27th Workshop on Information Technology and Systems (WITS), Seoul, South Korea (WITS2017).
- Srinivasan, K., Faiz, C., & Ram, S. (2017, December). Using Digital Health Wearable Devices to Understand the Relationship Between Sound levels and Wellbeing: A Segmented Mixed-effects Regression Approach. In Workshop on Information Technology and Systems.
- Zhang, W., & Ram, S. (2017, December). Are E-Cigarettes Safer Substitutes for Cigarettes Among Asthmatic Patients: A Social Media Based Analysis. In Workshop on Information Technology and Systems (WITS).
- Zhang, W., & Ram, S. (2017, November). Domain Adaptation for Signal Extraction from Large Social Media Datasets. In Conference on Information Systems and Technology (CIST).
- Zhang, W., & Ram, S. (2017, October). A Machine Learning Approach for Understanding Population-Level Health Effects of E-Cigarettes. In 8th International Conference on Health IT and Analytics.
- Bhattacharya, D., & Ram, S. (2016, Spring). Understanding the Competitive Landscape of News Providers on Social Media. In Proceedings of the 25th International Conference on World Wide Web (WWW 2016), Montreal, Canada, April 11-15, 2016, Companion Volume, 719-724.
- Ram, S., Dong, F., Currim, F. A., Wang, Y., Dantas, E., & Sabóia, L. A. (2016, September). SMARTBIKE: Policy making and decision support for Bike Share systems. In IEEE 2nd International Smart Cities Conference.
- Ram, S., Wang, Y., Currim, F., Dong, F., Dantas, E., & Sabóia, L. A. (2016, April, 11-15). SMARTBUS: A Web Application for Smart Urban Mobility and Transportation. In AW4city 2nd International Smart City Workshop, Proceedings of World Wide Web Conference, Montreal, 2016 (WWW 2016).
- Srinivasan, K., & Ram, S. (2016, Spring). Indoor Environmental Effects and Well Being. In Proceedings of 6th International Conference on Digital Health, Montreal, Canada, April, 79-82.
- Srinivasan, K., Currim, F., Ram, S., Lindberg, C., Sternberg, E. M., Skeath, P. R., Najafi, B., Razjouyan, J., Lee, H., Foe-Parker, C., Goebel, N., Herzl, R., Mehl, M. R., Gilligan, B., Heerwagen, J., Kampschroer, K., & Canada, K. (2016, April, 11-15). Feature importance and predictive modeling for multi-source healthcare data with missing values (Best Paper Award). In 6th International Conference on Digital Health, Proceedings of World Wide Web Conference, Montreal, 2016 (WWW 2016).
- Wang, Y., Currim, F., & Ram, S. (2016, December). Deep Learning for Bus Passenger Demand Prediction Using Big Data (Best Paper Runner Up Award). In Proceedings of WITS, 2016.
- Wang, Y., Currim, F., & Ram, S. (2016, December). Deep Learning for Bus Passenger Demand Prediction Using Big Data. In 26th Workshop on Information Technology and Systems (WITS), Dublin, Ireland (WITS2016).
- Wang, Y., Ram, S., Currim, F. A., Dantas, E., & Sabóia, L. A. (2016, September). A Big Data Approach for Smart Transportation Management on Bus Network (Received Overall Conference Best Paper Award). In IEEE 2nd International Smart Cities Conference.
- Zhang, W., & Ram, S. (2016, April). Extracting Signals from Social Media for Chronic Disease Surveillance. In Proceedings of 6th International Conference on Digital Health, Montreal, Canada, April 2016.
- Ram, S., & Zhang, W. (2015, December). A Comprehensive Methodology for Extracting Signal from Social Media Text Using Natural Language Processing and Machine Learning. In 25th Workshop on Information Technology and Systems (WITS 2015).
- Ram, S., Currim, F., Currim, S., & Wang, Y. (2015, December). Using Big Data for Predicting Freshman Retention. In International Conference on Information Systems, ICIS 2015.
- Ram, S., Zhang, W., Pengetenze, Y., & Burkhart, M. (2015, October). Asthma Surveillance using Social Media Signals. In 6th Annual Workshop on Health Information and Economics.
- Wang, Y., Currim, F., Ram, S., & Currim, S. (2015, December). Class Imbalance Learning and Novel Feature Extraction Methods for Predicting Freshman Retention. In 25th Workshop on Information Technology and Systems (WITS), Dallas, TX (WITS2015).
- Ram, S. (2019, December). AI and Big Data in Computational Social Science. Workshop on Computational Social Science, College of SBS, University of Arizona.
- Ram, S. (2019, December). Editors Panel: Publishing in Journal of Business Analytics. SIGDSA Workshop.
- Ram, S. (2019, December). Is Theory Dead in the World of AI and Big Data. Workshop on Information Technology and Systems (WITS).
- Ram, S. (2019, January). Keynote address: Leveraging Data Science for Building a Smart Campus at Data Science in Education Symposium -4CAST Symposium. Data Science in Education Symposium -4CAST Symposium. University of Iowa, Ames.
- Ram, S. (2019, January). Leveraging AI and Big Data for Social Good at PAN IIT Symposium on Leveraging AI for INDIA. PAN IIT Symposium on Leveraging AI for INDIA. New Delhi, India.
- Ram, S. (2019, March). Prediction Models using Network Analytics in Business. Distinguished Seminar Series, Indian School of Business (ISB). Hyderabad.
- Ram, S. (2019, March). Using Network Science for Predicting Brand Adoption on Social Media: The Role of Indirect Social Influence. Distinguished Speaker Seminar Series, Seoul National University. Seoul, South Korea: SNU.
- Ram, S. (2019, November). AI for Prediction Modeling. Keynote Address, University of Queensland Information Systems Workshop. Brisbane, Australia: University of Queensland.
- Ram, S. (2019, November). Interpretable AI in HealthCare. International Conference on Digital Public Health. Marseilles, France: ACM.
- Ram, S. (2019, October). Leveraging AI and Big Data to create Enterprises of the Future. Keynote Address, IEEE Enterprise Computing Conference. Sorbonne University, France: IEEE.
- Ram, S. (2019, September). Unlocking the Potential of AI and Big Data. Guru Speak, Keynote Speech. Mumbai, India: Indian Institute of Management, Calcutta Alumni Association.
- Ram, S. (2019, September). Using Spatio Temporal Data and Network Science for Predicting Student Retention. Notre Dame University Distinguished Speaker Seminar Series. Notre Dame University, South Bend, Indiana.
- Ram, S. (2018, December). Big Data Analytics: Research Challenges and Opportunities at AIS SIG DSA workshop. AIS SIG DSA workshop. San Francisco.
- Ram, S. (2018, January). Understanding Asthma Triggers and Risk factors Using Machine learning and Heterogeneous Data Sources. MIS Quarterly Workshop for special issue on "Health Care Analytics". University of Texas, Dallas.
- Ram, S. (2018, June). Building a Smart Campus using Network Science and Machine Learning. ESSEC Business school, Research Seminar. ESSEC Business School, Paris, France.
- Ram, S. (2018, September). Business Analytics: Research Challenges at OR60: Conference of the OR Society. OR60: Conference of the OR Society. Lancaster, UK.
- Ram, S. (2018, Spring). Keynote address: Leveraging Data Science for extracting Value from Big Data at UC Merced Data Science Symposium. UC Merced Data Science Symposium. University of California, Merced.
- Ram, S. (2018, Spring). Using Big data and Network science for Prediction Modeling. Faculty Research Seminar at Seoul National University Business School. Seoul National University Business School.
- Ram, S. (2018, Summer). Keynote Address: Leveraging Big Data and Analytics for Enterprises of the Future at Research Conference on Information Systems 2018. Research Conference on Information Systems 2018. Nantes, France.
- Ram, S. (2017, April). Building Smart Cities Using Big Data. City of Tucson Workshop on Big Data and Analytics. Tucson, AZ: City of Tucson.
- Ram, S. (2017, December). Leveraging Big Data and Analytics for addressing Social Grand Challenges. Faculty Research Seminar. Sri Jayawardane University, Colombo, Sri Lanka.
- Ram, S. (2017, December). Keynote address: Big Data Analytics in Health Care: Challenges and Opportunities. Sri Lanka Association for Advancement of Science (SLAAS) International Theme Seminar and Workshop. Colombo, Sri Lanka.
- Ram, S. (2017, February). Using Big Data for Predictive Analytics in Health Care: The case of Asthma. Faculty Research Seminar, Arizona Cancer Center. Arizona Cancer Center, Tucson, AZ: University of Arizona Medical School.
- Ram, S. (2017, January). Leveraging Big Data and Analytics: INSITE Center Research Overview. Zipperman Seminar. Eller College, U of A, Tucson, AZ.
- Ram, S. (2017, March). Keynote Presentation on "Leveraging Big Data for actionable insights". Facility Source National Convention. Phoenix, Arizona.
- Ram, S. (2017, March). Using Big Data to Understand Movement and Behavior from Large Heterogeneous Datasets. Faculty Research Seminar. Seoul National University, Seoul, South Korea: SNU.
- Ram, S. (2017, November). Big Data for Conceptual Modeling. International Conference on Conceptual Modeling: ER 2017. Valencia, Spain.
- Ram, S. (2017, September). Big Data Analytics: Research and its Impact on Practice. Eller College Celebrate Research Faculty Workshop. Tucson, Arizona.
- Ram, S. (2017, September). Research on Big Data Analytics. Honor College Workshop for General Community and Parents Weekend. Tucson, AZ: Honors College, University of Arizona.
- Ram, S. (2017, September). Understanding behavior Using Smart Card Transactions to predict Freshman Retention. Faculty Research Seminar. Leuven, Belgium: Katholik University Leuven, Belgium.
- Ram, S. (2017, September). Understanding behavioural patterns from Big data for Predicting Freshman Retention. Faculty Research Seminar, Indiana University School of Business. Indiana University.
- Ram, S. (2016, Academic year). Keynote and Invited Presentations. See attached file for list.
- Velichety, S., Ram, S., & Bockstedt, J. C. (2015, March). Does Chatter Matter on Wikipedia? Winter Conference on Business Intelligence. Snowbird: University of Utah.
- Ram, S. (2014, October). Big Data and Predictive Analytics: A Research Agenda - Keynote Address at IEEE/ACM Mobile Big Data Symposium. Keynote Address at IEEE/ACM Mobile Big Data Symposium. Atlanta: IEEE/ACM.
- Ram, S. (2014, October). Population Level Risk modeling for Asthma Related Emergency Department visits Using Big Data. University of Illinois Research Seminar. Chicago.
- Ram, S. (2014, October). Predicting Asthma Related Emergency Department visits using Big Data. INSITE Big Data Symposium. Tucson.
- Ram, S. (2014, September). Improving the World Using Big Data Analytics. Keynote Address at YWCA Leadership Conference.