Eleventh Annual AAAI/SIGART Doctoral Consortium
July 16-17, 2006
Boston, MA (part of AAAI 2006 and UAI 2006)

Student Abstracts

Abraham Bagherjeiran

A Value Theory of Meta-Learning Algorithms
We use game theory to analyze meta-learning algorithms. The objective of meta-learning is to determine which algorithm to apply on a given task. This is an instance of a more general problem that consists of allocating knowledge consumers to learning producers. Solving this general problem in the field of meta-learning yields solutions for related fields such as information retrieval and recommender systems.

Fernanda Caropreso

Towards a Text Representation for Sentence Selection
Sentence Selection consists of identifying the sentences relevant to a particular topic, task, user, or linguistic structure. It is a necessary step in many document-processing tasks, such as Information Extraction (IE) and Text Summarization (TS). Sentence Selection is related to Automatic Text Categorization (ATC), but the two tasks do not share all of their characteristics. Because of these differences, we believe some variations on the text representations and techniques usually used for ATC may benefit Sentence Selection. We explore the benefits of a syntactically and semantically enriched text representation for the Sentence Selection task in technical domains. We also propose using relational learning techniques to further exploit this enriched representation. We have tested our approach in several experiments on three different datasets. In all the experiments our approach performed better than the classical Bag of Words text representation. The suggested relational representation and relational learner proved best in two of the three datasets.

Jinying Chen

Towards High-Performance Word Sense Disambiguation by Combining Rich Linguistic Knowledge and Machine Learning Approaches
Supervised word sense disambiguation (WSD) for truly polysemous words is a difficult problem for machine learning, mainly due to data sparseness (i.e., only a few training instances observed in a very high-dimensional feature space), which often causes overfitting of machine learning models. At the same time, high accuracy is necessary for WSD to be beneficial for high-level applications such as information retrieval, question answering, and machine translation. This paper presents our thesis work, which addresses the data sparseness problem for WSD by combining rich linguistic knowledge with machine learning methods (feature selection and generalization, active learning) in order to achieve high-accuracy verb sense disambiguation. We built a supervised WSD system using a smoothed maximum entropy (MaxEnt) model and linguistically motivated features (e.g., the semantic categories of a verb's noun phrase (NP) arguments). With three specific enhancements to the automatic extraction and generation of linguistically motivated features, our system achieved higher performance than the previously published best results in an evaluation using the SENSEVAL-2 English verbs with fine-grained senses (64.6% accuracy; 16.7 senses on average). We developed a clustering-based feature selection method to filter out noisy semantic features introduced by sense ambiguity of a verb's NP arguments, which improved system performance on verbs whose senses rely heavily on their NP arguments. To generalize semantic features, we developed a new clustering algorithm that automatically acquires semantically coherent noun groups from large text corpora. Using the acquired noun groups and their corresponding WordNet concepts (high-level WordNet synsets and hypernyms) as semantic features improved our system's performance further. These noun groups can also be used as features for other NLP applications such as machine translation, information retrieval, and question answering.
We also investigated the effectiveness of active learning in creating more labeled training data for supervised WSD. Our experiments showed that two uncertainty-based active learning methods, combined with the smoothed MaxEnt model, worked well on learning coarse-grained English verb senses (a reduction of 1/3 to 1/2 in the amount of training data).
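The uncertainty-based selection described above can be illustrated with a minimal sketch. This is not the authors' implementation; `predict_proba` here is a hypothetical stand-in for the smoothed MaxEnt model's predictive distribution, and all names are illustrative:

```python
import math

def entropy(probs):
    """Shannon entropy of a predictive distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(unlabeled, predict_proba, batch_size=2):
    """Rank unlabeled instances by predictive entropy and return the
    top batch_size instances to send to a human annotator."""
    ranked = sorted(unlabeled, key=lambda x: entropy(predict_proba(x)),
                    reverse=True)
    return ranked[:batch_size]

# Toy stand-in for a trained model's predictive distribution over two senses.
def predict_proba(x):
    p = min(max(x, 0.01), 0.99)
    return [p, 1.0 - p]

pool = [0.05, 0.5, 0.9, 0.45]
print(select_most_uncertain(pool, predict_proba))  # instances nearest 0.5 win
```

The labeling effort is then spent only on the instances the current model is least sure about, which is what yields the reported reduction in training data.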

Yun-Gyung Cheong

A Computational Model of Narrative Generation for Suspense
Although suspense contributes significantly to readers' enjoyment of a narrative, its role in dynamic story generation systems has been largely ignored. This paper presents Suspenser, a computational model of narrative generation that takes a given story world as input and constructs a narrative structure intended to evoke the desired level of suspense in the reader at a specific moment in the story. The system is based on the concepts that a) the reader's suspense level is affected by the number of solutions available to the problems faced by a narrative's protagonists, and b) story structure can influence the reader's narrative comprehension process. I use the Longbow planning algorithm to approximate the reader's planning-related reasoning in order to estimate the number of anticipated solutions that the reader builds at a specific point in the story.

Pedro A. Diaz-Gomez

Choosing Characteristics for Genetic Algorithms
One of the difficulties in working with Genetic Algorithms (GAs) is choosing the characteristics---the fitness function, the selection mechanism, the crossover and mutation probabilities, the population size---appropriate for solving a problem. Beyond the difficulty of the application problem itself, an additional difficulty arises because the quality of the solution found, or the computational resources required to find it, depends on the selection of these characteristics; that is, finding a correct fitness function and appropriate operators and parameters to solve a problem with GAs is itself a difficult problem. The goal of this dissertation, then, is to gain some insight into the difficult problem of finding a good fitness function and good operators to solve a problem with a GA, in the context of two specific applications: audit trail analysis of intrusions and the snake-in-the-box problem.
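For readers unfamiliar with the characteristics in question, a minimal GA sketch makes them concrete as keyword parameters. This is not code from the dissertation; the parameter values are illustrative defaults, and the example problem (OneMax, maximizing the number of 1 bits) is a standard toy:

```python
import random

def run_ga(fitness, genome_len=10, pop_size=20, p_cross=0.7, p_mut=0.01,
           generations=50, seed=0):
    """Minimal generational GA: tournament selection, one-point crossover,
    bitwise mutation. The keyword parameters are exactly the 'characteristics'
    whose choice the dissertation studies."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        def tournament():
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < p_cross:           # one-point crossover
                cut = rng.randrange(1, genome_len)
                p1 = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in p1]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# OneMax: the fitness of a bit string is simply the number of 1 bits.
best = run_ga(fitness=sum)
```

Changing any one of `p_cross`, `p_mut`, or `pop_size` can markedly change solution quality and runtime, which is the core difficulty the abstract describes.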

Laura Dietz

Probabilistic Topic Models to Support a Scientific Community
Excellent research requires a good understanding of the community---its publications, researchers, and topics---in order to quickly find potential research gaps as well as cooperation partners for specific tasks. Such community support is especially needed by researchers who are new to the field, as well as in cross-disciplinary areas. The contribution of my thesis will be a set of probabilistic models and algorithms for automatic recognition of thematic aspects in relations between actors and publications. Unsupervised, data-driven algorithms are required so that end users can apply them without expert knowledge. One example is the recognition of the topical aspect of a cites relation, which aims to distinguish background literature from publications that had a significant impact on the citing document. This makes it possible to give historiographic details on the flow of ideas within the research field. Another example is the clarification of the topical roles of co-authors in a joint authoring process, which will give rise to topic-based social network analysis.

Eric Eaton

Multi-Resolution Learning for Knowledge Transfer
Related objects may look similar at low resolutions; differences emerge naturally as the resolution increases. By learning across multiple resolutions of input, knowledge can be transferred between related objects. My dissertation develops this idea and applies it to the problem of multitask transfer learning.

Seyda Ertekin

Fast Automatic Classification with Online Support Vector Machines
In recent years, we have witnessed a significant increase in the amount of data in digital format, due to the widespread use of computers and advances in storage systems. As the volume of digital information increases, people need more effective tools to find, filter, and manage these resources. Classification---the assignment of instances (e.g., pictures, text documents, emails, Web sites) to one or more predefined categories based on their content---is an important component of many information organization and management tasks. Support Vector Machines (SVMs) are a popular machine learning algorithm for classification problems due to their theoretical foundation and good generalization performance. However, SVMs have not yet seen widespread adoption in communities working with very large datasets, due to the high computational cost of solving the quadratic programming (QP) problem in the training phase. This research presents an online SVM learning algorithm, LASVM, which matches the classification accuracy of state-of-the-art SVM solvers while requiring fewer computational resources: LASVM tolerates much smaller main memory and has a much faster training phase. We also show that not all examples in the training set are equally informative, and we present methods to select the most informative examples and exploit them to reduce the computational requirements of the learning algorithm.

Andrew Fast

Learning Models of Macrobehavior in Complex Adaptive Systems
Macrobehaviors in complex adaptive systems are high-level behaviors that arise from a combination of low-level interactions and characteristics of individuals. Over time, the actions of individuals participating in these systems influence the behavior of other individuals leading to the emergence of complex aggregate behavior. These aggregate behaviors are often not well understood and are difficult to identify and predict without automated tools; however, to date there are very few statistical tools designed for this purpose. In this work, I present new techniques, founded on existing relational learning algorithms, to learn the structure of the temporal and compositional dependencies present within complex systems. Initial models of the network among professional football coaches will be presented along with the challenges of structure learning for complex systems.

Janae Foss

Techniques for Generating Optimal, Robust Plans when Temporal Uncertainty is Present
Planning under uncertainty has been well studied, but usually the uncertainty is in action outcomes. This work instead investigates uncertainty in the amount of time that actions require to execute. In addition to handling this temporal uncertainty, solution plans for the problems being studied must be robust and optimized with respect to an objective function. This thesis summary details two iterative approaches that have been used to solve these types of problems and discusses future work, including state-based approaches.

Michel Galley

Automatic Summarization of Conversational Multi-party Speech
My thesis focuses on automatic summarization of conversational speech, a task facing many challenges not found in written texts, e.g., lack of formatting elements and punctuation. My analysis of human-generated extractive summaries demonstrated the existence of non-local dependencies between utterances, e.g., a question will generally not appear in the summary unless its answer is also included. This finding led me to investigate ways of incorporating non-local structure in sequence models such as conditional Markov models and conditional random fields, and was an integral element in my design of an extractive summarizer that achieves state-of-the-art performance in a meeting summarization task. In addition, I investigate the problem of rendering errorful and disfluent spoken utterances into concise sentences that are maximally readable. I investigate a number of trainable approaches, including a syntax-driven compression method that aims to delete uninformative and syntactically optional constituents based on a large number of grammatical, lexical, acoustic, and information retrieval features whose parameters are discriminatively trained to optimize summary quality.

Rachel Greenstadt

Privatizing Constraint Optimization
Distributed Constraint Optimization (DCOP) is rapidly emerging as a prominent technique for multiagent coordination. However, despite agent privacy being a key motivation for applying DCOPs in many applications, rigorous quantitative evaluations of privacy loss in DCOP algorithms have been lacking. My thesis research aims to determine accurate methods for analyzing privacy loss, evaluate existing algorithms according to those methods and design new algorithms that lose less privacy. My approach includes detailed experimental analysis of existing algorithms---to gain an intuition and understanding about how design parameters interact to achieve privacy or privacy loss. Understanding how these parameters interact will facilitate the development of new algorithms that are more suitable for use in settings where privacy matters.

Arnav Jhala

Darshak: An Intelligent Cinematic Camera Planning System
A virtual camera is a powerful communicative tool in virtual environments. For virtual environments with an underlying narrative component, there is a need for automated camera planning systems that account for the situational parameters of the interaction and not just the graphical arrangement of the virtual world. We propose a camera planning system called Darshak that takes as input a story in the form of a sequence of events and generates a sequence of camera actions based on cinematic idioms. The camera actions, when executed in the virtual environment, update a list of geometric constraints on the camera. A constraint solver then places the camera based on these constraints.

Ron Katz

Automated Negotiators Using Generic and Demographic Opponent Modeling
In my research I develop algorithms and heuristics for agents participating in negotiations in which the other participants are humans. I currently focus on environments in which agents conduct interactions repeatedly with different people, performing generic opponent modeling of their behavior. Specifically, I have focused on a set of negotiation environments named Cliff-Edge (CE), which includes common daily interactions such as sealed-bid auctions, dynamic pricing, and the Ultimatum Game. In all of these interactions the success probability decreases monotonically as the expected reward increases. Another aspect of my research considers modeling of human learning and decision making, using concepts from AI and machine learning.

Wenhui Liao

Dynamic and Active Information Fusion for Decision Making Under Uncertainty
In recent years, there has been a great deal of interest in information fusion for both military and civilian applications. This thesis aims to solve several technical challenges in active fusion for timely decision making. The first challenge is problem modeling. For that, we propose a unified framework based on Influence Diagrams for simultaneously modeling and performing sensor selection, sensor fusion, and decision making. The second challenge is model learning. We will improve existing learning methods to construct and parameterize the model from data and domain knowledge in order to handle real-world applications. The third challenge is active sensing and efficient information fusion, which involves three key issues: how to choose a sensor selection criterion and compute it efficiently; how to select optimal sensor sets given that criterion; and how to fuse information efficiently. For the first issue, we propose using value-of-information as a measure of sensory utility, and further propose an approximation algorithm to efficiently compute nonmyopic value-of-information for multiple sensors. For the second issue, two efficient sensor selection algorithms are proposed to decide the optimal sensor sets under two typical scenarios: the budget-limit case and the optimal trade-off case. For the third issue, a factor-tree inference (FTI) algorithm is proposed to efficiently fuse the information collected from the selected sensors. Finally, to demonstrate the proposed framework and the associated algorithms, we apply them to a human-computer interaction application. We also expect to apply them to a military decision-making application.

Bhaskar Mehta

Cross System Personalization by Learning Manifold Alignments
Today, personalization in digital libraries and other information systems occurs separately within each system that one interacts with. Such isolated approaches leave room for several improvements. The investment a user makes in personalizing a system, whether through explicit provision of information or through long and regular use, is not transferable to other systems. Moreover, users have little or no control over the information that defines their profile, since user profiles are deeply buried in personalization engines. Cross-system personalization, i.e., personalization that shares personalization information across different systems in a user-centric way, overcomes these problems. Information about users, originally scattered across multiple systems, is combined to obtain maximum leverage. The key idea is that when a large number of users cross over from one system to another, carrying their user profiles with them, a mapping between the user profiles of the two systems can be discovered. In this thesis, we discuss the use of vector-valued learning to compute recommendations for a new user crossing over from one system to another.

Martin Michalowski

A Generalized Query Framework for Geospatial Reasoning
The ability to reason over geospatial entities using publicly available information is greatly enhanced by the abundance of data sources on the Internet. Several key challenges must be addressed in order to fully realize the benefits of fusing these diverse types of sources. The large amount of data that can be exploited also increases the probability of encountering source failures or data inconsistencies, so it is imperative that any system working with real-world data sources be able to handle these potential issues. By introducing more data, we are also presented with a scaling problem: any reasoning algorithms and conflict resolution strategies need to scale to larger problem instances and support larger numbers of data sources. In the research I am proposing, I will create a generalized framework with the ability to gather the information required for information fusion, reason over the data produced during the gathering process, and produce answers that are accurate as well as consistent. This framework will deal with source failures and data inconsistencies in a novel and scalable way and will support a wide range of queries and a variety of data sources.

James Niehaus

Employing Cognitive Theories of Discourse Comprehension to Improve Learning in Interactive Narratives
A substantial body of research concerns education and training using narrative and virtual worlds. The interactivity of these virtual worlds means that, in order to use narrative effectively for teaching, computer systems must be able to make story-level decisions and create new stories on the fly. My proposed research addresses this problem, namely: how do we generate stories that teach? I hypothesize that by using cognitive models in automatic narrative generation, we can improve the recall and comprehension of curriculum material from generated narratives.

Peter Owotoki

Intelligent Monitoring for Preventive Maintenance of Aircraft Systems
My work develops and implements concepts for applying intelligent prediction and classification models to the preventive maintenance of aircraft cabin systems. I take a threefold approach to this problem. First, a framework has been developed for integrating the different components of the system with other cabin components; it enables each component to be embedded with an intelligent model that monitors its health states. The second step involves automatic selection of the models in every component by employing a meta-learning approach. The final step is to develop and implement constraints that make the embedded models transparent, so that they can also be used for the diagnosis of faults and for the discovery of new maintenance-relevant knowledge.

Ashwin Satyanarayana

Intelligent Data Selection Using Sampling and Filtering
In many real-world problems, machine learning algorithms have access to massive amounts of data, and mining all of the available data is prohibitive due to computational (time and memory) constraints. Much current research is concerned with scaling up data mining algorithms (i.e., improving existing algorithms to handle larger datasets). An alternative approach is to scale down the data. Thus, determining the smallest sufficient training-set size that obtains the same accuracy as the entire available dataset remains an important research question. Our research focuses on selecting how many (sampling) and which (filtering) instances to present to the machine learning algorithm. The goals of this thesis are to study and characterize the properties of learning curves, to integrate them with bounds such as Chernoff and Chebyshev to produce an efficient general-purpose adaptive sampling schedule and filtering technique, and to empirically validate our algorithm for scaling down the data.
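As a rough illustration of the bound-driven sampling ingredients mentioned above (a sketch, not the thesis algorithm; the geometric schedule is one common choice, and the function names are illustrative):

```python
import math

def chernoff_sample_size(epsilon, delta):
    """Chernoff/Hoeffding bound: number of i.i.d. samples needed so that the
    empirical accuracy is within epsilon of the true accuracy with probability
    at least 1 - delta. A worst-case bound, often loose in practice."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

def geometric_schedule(n0, total, factor=2):
    """A simple adaptive sampling schedule: train on geometrically growing
    subsets, stopping early if the learning curve has plateaued (the plateau
    test is omitted here), capped at the full dataset size."""
    sizes, n = [], n0
    while n < total:
        sizes.append(n)
        n *= factor
    sizes.append(total)
    return sizes

print(chernoff_sample_size(0.05, 0.05))  # -> 738
print(geometric_schedule(100, 5000))     # -> [100, 200, 400, 800, 1600, 3200, 5000]
```

The thesis's contribution is in combining such bounds with observed learning-curve behavior; the sketch shows only the two raw ingredients.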

Mohan Sridharan

Robust Autonomous Structure-based Color Learning on a Mobile Robot
To operate in the proverbial real world, autonomous robots depend predominantly on sensory information, but the ability to accurately sense the complex world is still missing. Visual input, in the form of color images from a camera, should be an excellent source of such information, but color, and images in general, have been used sparingly on mobile robots. This is because most sophisticated vision algorithms require substantial computational and memory resources, cannot account for the rapid camera motion characteristic of mobile robot domains, and are sensitive to illumination changes. This thesis focuses on designing algorithms that work within the robot's computational and memory constraints while making the best use of the available information. First, I develop a prototype vision system that solves challenging vision problems such as color segmentation and object recognition on board an autonomous robot, relying on manual training data and operating under constant and reasonably uniform illumination. Second, the robot exploits knowledge of the locations of unique objects in its world to autonomously learn the colors required for color segmentation. Third, I enable the robot to adapt to changes in illumination. In summary, the thesis provides a mobile robot vision system that autonomously learns colors from structure and adapts to illumination changes.

Gita Sukthankar

Activity Recognition for Physically-Embodied Agent Teams
This thesis focuses on the problem of activity recognition for physically-embodied agent teams. Examples of such teams include soldiers moving in formation, gamers controlling virtual bots in a simulated world, and robots playing a RoboCup soccer match. We define team activity recognition as the process of identifying team behaviors from traces of agents' positions and orientations as they evolve over time; the goal is to completely annotate agent traces with: 1) the correct sequence of low-level actions performed by each agent; 2) an assignment of agents to teams and subteams; and 3) the set of team plans consistent with the observed sequence. Activity traces are gathered from teams of humans or agents performing military tasks in urban environments. Team behavior annotations can be used for a wide variety of applications, including virtual training environments, visual monitoring systems, and commentator agents. For many physical domains, coordinated team behaviors create distinctive spatio-temporal patterns that can be used to identify low-level action sequences; we demonstrate that this can be done in a way that is robust to spatial variations in the environment and human deviations during behavior execution. This thesis addresses the novel problem of agent-to-team assignment for team tasks where team composition, the mapping of agents into teams, changes over time; this allows the analysis of more complicated tasks in which agents must periodically divide into subteams. To do this, we introduce a new algorithm, Simultaneous Team Assignment and Behavior Recognition (STABR), that generates low-level action annotations from spatio-temporal agent traces. Finally, we extend methods in symbolic plan recognition to exploit both temporal constraints between behaviors and agent role constraints in team plans, reducing the number of state history hypotheses that our system must consider.

Chaofan Sun

Adaptive Support Vector Machine Learning with Data Selection
This paper presents data selection procedures for support vector machines (SVMs). The purpose of data selection is to reduce the dataset by eliminating as many non-support vectors (non-SVs) as possible. Based on the fact that support vectors (SVs) are vectors close to the decision boundary, data selection keeps only the closest pairs of vectors from opposite classes. The selected dataset then replaces the full dataset as the training input for any standard SVM algorithm. For large datasets, we use a data structure to speed up finding closest pairs. A Spatial Approximation Sample Hierarchy (SASH) tree can be used to approximate k-nearest neighbors (k-NNs). The basic idea behind the SASH tree is an approximation based on the triangle inequality d(x,z) < d(x,y) + d(y,z), where d is the distance between two vectors: if y is the closest neighbor of x, and z is the closest neighbor of y, then z is most likely a closest neighbor of x. Inspired by this structure, we use a similar structure to approximate closest pairs, but from opposite classes, unlike k-NN search in a SASH, which makes no class distinction. Node connections between two adjacent levels are therefore established based on the top closest pairs of opposite classes. Each node also has one or more connections to its same-class closest pairs in order to keep indirect closest pairs. The time complexity of building this tree is O(dpcn log n), and querying is O(dpcn log n); the space used is O(n). As a result, searching for closest pairs of opposite classes is an order of magnitude faster, and 90% more SVs can be covered.
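The selection criterion itself, setting aside the SASH-like tree that makes it fast, can be sketched as a brute-force procedure. This is illustrative only; `select_boundary_pairs` and its parameters are hypothetical names, and the O(|pos|*|neg|) scan is exactly what the tree structure is meant to avoid:

```python
def select_boundary_pairs(pos, neg, k=2):
    """Naive version of the data selection step: keep the vectors involved in
    the k closest pairs of opposite-class points, since support vectors tend
    to lie near the decision boundary."""
    def dist(a, b):
        # Euclidean distance between two equal-length tuples.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Enumerate all cross-class pairs and sort by distance (the slow part).
    pairs = sorted(((dist(p, n), p, n) for p in pos for n in neg),
                   key=lambda t: t[0])
    keep = set()
    for _, p, n in pairs[:k]:
        keep.add(p)
        keep.add(n)
    return keep

pos = [(0.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
neg = [(0.0, -1.0), (2.0, 1.0), (-5.0, -5.0)]
print(select_boundary_pairs(pos, neg))  # points far from the boundary drop out
```

The reduced set returned here is what would be handed to a standard SVM solver in place of the full dataset.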

Jim Thomas

A Collaborative Authoring Environment for Plan-Based Interactive Narrative
I am interested in making automated planning technology more accessible and more widely used. Toward that end, I am developing a tool set that enables plan authors to quickly gain an understanding of the characteristics of an automated plan that are of particular interest. My tool employs a domain metatheory I call the Domain Elaboration Framework, or DEF, that allows authors to identify aspects of the plan space and individual plans over which they wish to have better control or knowledge. Although my initial motivation is to apply these insights toward the construction of plans in the domain of interactive narrative, I am attempting to maintain sufficient generality to allow this work to be of use to the broader planning community.

Rattapoom Tuchinda

Interactively Building Information Integration Agents
Almost any kind of data imaginable can be found on the Internet today: pricing and reviews of goods and services from multiple vendors, maps, and statistics about events around us. This data, when combined the right way, can give us interesting insights. The trend is evident in the proliferation of web applications called "mashups." However, the process of creating one is limited to those with programming knowledge, because of the sophistication required in three separate steps: data retrieval, data integration, and data display composition. My thesis work aims to investigate a framework that allows an everyday user to build information integration plans (a mashup is one type of such plan) without having to write any code.

Tao Wang

Action Selection in Bayesian Reinforcement Learning
My research attempts to address on-line action selection in reinforcement learning from a Bayesian perspective. The idea is to develop more effective action selection techniques by exploiting information in a Bayesian posterior, but in a computationally efficient manner. I have focused on developing efficient approximations based on stochastic simulation and value function approximation. The first approach I developed selects actions by growing an adaptive, sparse lookahead tree to estimate the future value of alternative actions. I further augment the approach by considering a new value function approximation strategy for the belief-state Markov decision processes induced by Bayesian learning.


Kiri Wagstaff <kiri.wagstaff@jpl.nasa.gov>
Last modified: Mon Jul 10 19:10:51 2006