Student Abstracts
Abraham Bagherjeiran
A Value Theory of Meta-Learning Algorithms
We use game theory to analyze meta-learning algorithms. The objective of
meta-learning is to determine which algorithm to apply on a given task. This
is an instance of a more general problem that consists of allocating knowledge
consumers to learning producers. Solving this general problem in the field of
meta-learning yields solutions for related fields such as information
retrieval and recommender systems.
Fernanda Caropreso
Towards a Text Representation for Sentence Selection
Sentence Selection consists of identifying the sentences relevant to a
particular topic, task, user, or linguistic structure. This is a necessary step
in many document-processing tasks, such as Information Extraction (IE) and
Text Summarization (TS). Sentence Selection is related to Automatic Text
Categorization (ATC), but the two tasks differ in several respects. Because
of their differences, we believe some variations to the text representations
and techniques usually used for ATC might be beneficial for Sentence
Selection. We explore the benefits of a syntactically and semantically
enriched text representation for the Sentence Selection task on technical
domains. We also propose to use relational learning techniques to further take
advantage of this enriched representation. We have tested our approach in
several experiments on three different datasets. In all the experiments our
approach outperformed the classical Bag of Words text representation. The
suggested relational representation and relational learner performed best
on two of the three datasets.
Jinying Chen
Towards High-Performance Word Sense Disambiguation by Combining Rich Linguistic Knowledge and Machine Learning Approaches
Supervised word sense disambiguation (WSD) for truly polysemous words
is a difficult problem for machine learning mainly due to the data
sparseness problem (i.e., few training instances observed in a
very high-dimensional feature space) that often causes overfitting of
machine learning models. At the same time, high accuracy is necessary
for WSD to be beneficial for high-level applications, such as
information retrieval, question answering, and machine translation. This
paper presents our thesis work that addresses the data sparseness
problem for WSD through combining rich linguistic knowledge and
machine learning methods (feature selection and generalization, active
learning) in order to achieve high accuracy verb sense
disambiguation. We built a supervised WSD system by using a smoothed
maximum entropy (MaxEnt) model and linguistically motivated features
(e.g., the semantic categories of a verb's noun phrase (NP)
arguments). With three specific enhancements to the automatic
extraction and generation of linguistically motivated features, our
system achieved higher performance than previously published best
results in an evaluation using the SENSEVAL2 English verbs with
fine-grained senses (64.6% accuracy; 16.7 senses on average). We
developed a clustering-based feature selection method to filter out
noisy semantic features introduced by sense ambiguity of a verb's NP
arguments, which improved the system performance on verbs whose senses
rely heavily on their NP arguments. To generalize semantic features,
we developed a new clustering algorithm that automatically acquires
semantically coherent noun groups from large text corpora. Using the
acquired noun groups and their corresponding WordNet concepts
(high-level WordNet synsets and hypernyms) as semantic features
improved our system's performance further. These noun groups can also be used
as features for other NLP applications such as machine translation,
information retrieval, and question answering. We also investigated
the effectiveness of active learning in the creation of more labeled
training data for supervised WSD. Our experiments showed that two
uncertainty-based active learning methods, combined with the smoothed
MaxEnt model, worked well on learning coarse-grained English verb
senses (a reduction of 1/3 to 1/2 in training data).
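The uncertainty-based active learning criterion mentioned above can be sketched in a few lines. This is a generic illustration, not the authors' system, and the toy classifier probabilities are hypothetical: the instance whose predicted sense distribution has the highest entropy is the one sent for labeling.

```python
import math

def entropy(dist):
    """Shannon entropy of a probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def most_uncertain(instances, predict_proba):
    """Uncertainty sampling: return the instance whose predicted
    class distribution has maximum entropy."""
    return max(instances, key=lambda x: entropy(predict_proba(x)))

# Toy example: hypothetical predicted sense distributions for
# three unlabeled instances of a 3-sense verb.
probs = {"a": [0.90, 0.05, 0.05],  # model is confident
         "b": [0.40, 0.35, 0.25],  # model is uncertain
         "c": [0.70, 0.20, 0.10]}
pick = most_uncertain(probs, lambda x: probs[x])
# "b" has the flattest distribution, so it is selected for labeling
```

In the authors' setting the distributions would come from the smoothed MaxEnt model, but any probabilistic classifier fits this scheme.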
Yun-Gyung Cheong
A Computational Model of Narrative Generation for Suspense
Although suspense contributes significantly to the enjoyment of a
narrative by its readers, its role in dynamic story generation systems has
been largely ignored. This paper presents Suspenser, a computational model
of narrative generation that takes as input a given story world and
constructs a narrative structure intended to evoke the desired level of
suspense in the reader at a specific moment in the story. This system is
based on the concepts that a) the reader's suspense level is affected by
the number of solutions available to the problems faced by a narrative's
protagonists, and b) story structure can influence the reader's narrative
comprehension process. I use the Longbow planning algorithm to approximate
the reader's planning-related reasoning in order to estimate the number of
anticipated solutions that the reader builds at a specific point in the
story.
Pedro A. Diaz-Gomez
Choosing Characteristics for Genetic Algorithms
One of the difficulties in working with Genetic Algorithms (GAs) is
choosing characteristics appropriate for solving a problem: the fitness
function, the selection mechanism, the crossover and mutation
probabilities, and the population size. Besides the difficulty of
the application problem to be solved, an additional difficulty arises
because the quality of the solution found, or the computational
resources required to find it, depends on the selection of the
characteristics, that is, finding a correct fitness function and
appropriate operators and parameters to solve a problem with GAs is
itself a difficult problem. The goal of this dissertation, then, is to
gain some insight into the difficult problem of finding a good fitness
function and good operators to solve a problem with a GA, in the
context of two specific applications: Audit trail analysis of
intrusions and the snake in the box problem.
Laura Dietz
Probabilistic Topic Models to Support a Scientific Community
Excellent research requires a good understanding of the community, in terms
of its publications, researchers, and topics, in order to quickly find
potential research gaps as well as cooperation partners for specific tasks.
Such community support is especially needed by researchers who are new to a
field, as well as by those working in cross-disciplinary areas.
The contribution of my thesis will be a set of probabilistic models and
algorithms for automatic recognition of thematic aspects in relations of
actors and publications. Unsupervised data-driven algorithms are required to
allow for application by the end user without expert knowledge.
One example is the recognition of the topical aspect in a cites relation,
which aims to distinguish background literature from publications that had a
significant impact on the citing document. This makes it possible to give
historiographic details on the flow of ideas within the research field. Another example is the
clarification of topical roles of co-authors in a joint authoring process
which will give rise to topic based social network analysis.
Eric Eaton
Multi-Resolution Learning for Knowledge Transfer
Related objects may look similar at low resolutions; differences begin to
emerge naturally as the resolution is increased. By learning across multiple
resolutions of input, knowledge can be transferred between related objects. My
dissertation develops this idea and applies it to the problem of multitask
transfer learning.
Seyda Ertekin
Fast Automatic Classification with Online Support Vector Machines
In recent years, we have witnessed a significant increase in the amount of
data in digital format, due to the widespread use of computers and advances
in storage systems. As the volume of digital information increases, people
need more effective tools to better find, filter, and manage these resources.
Classification, the assignment of instances (i.e., pictures, text documents,
emails, Web sites, etc.) to one or more predefined categories based on their
content, is an important component in many information organization and
management tasks. Support Vector Machines (SVMs) are a popular machine
learning method for classification problems due to their theoretical
foundation and good generalization performance. However, SVMs have not yet
seen widespread adoption in the communities working with very large datasets
due to the high computational cost of solving the quadratic programming
(QP) problem in the training phase. This research presents an online SVM
learning algorithm, LASVM, which matches the classification accuracy of
state-of-the-art SVM solvers while requiring fewer computational resources.
LASVM requires far less main memory and has a much faster training
phase. We also show that not all examples in the training set are equally
informative. We present methods to select the most informative examples and
exploit those to reduce the computational requirements of the learning
algorithm.
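One common informativeness criterion for SVM training, shown here as a generic sketch rather than the LASVM selection method itself, is distance to the current decision boundary: examples with small |f(x)| are the ones the classifier is least certain about. The decision function below is a hypothetical stand-in.

```python
def select_informative(examples, decision_fn, k):
    """Keep the k examples closest to the decision boundary,
    i.e., those with the smallest |f(x)|."""
    return sorted(examples, key=lambda x: abs(decision_fn(x)))[:k]

# Toy 1-D dataset with a hypothetical decision function f(x) = x - 2:
data = [0.0, 1.5, 2.1, 3.0, 5.0]
picked = select_informative(data, lambda x: x - 2.0, k=2)
# 2.1 and 1.5 lie closest to the boundary at x = 2
```

Training on only the selected examples shrinks the QP problem, at the cost of an extra selection pass over the data.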
Andrew Fast
Learning Models of Macrobehavior in Complex Adaptive Systems
Macrobehaviors in complex adaptive systems are high-level behaviors that arise
from a combination of low-level interactions and characteristics of
individuals. Over time, the actions of individuals participating in these
systems influence the behavior of other individuals leading to the emergence
of complex aggregate behavior. These aggregate behaviors are often not well
understood and are difficult to identify and predict without automated tools;
however, to date there are very few statistical tools designed for this
purpose. In this work, I present new techniques, founded on existing
relational learning algorithms, to learn the structure of the temporal and
compositional dependencies present within complex systems. Initial models of
the network among professional football coaches will be presented along with
the challenges of structure learning for complex systems.
Janae Foss
Techniques for Generating Optimal, Robust Plans when Temporal Uncertainty is Present
Planning under uncertainty has been well studied, but
usually the uncertainty is in action outcomes. This work
instead investigates uncertainty in the amount of time
that actions require to execute. In addition to this temporal
uncertainty, the problems being studied must have
robust solution plans that are optimized based on an objective
function. This thesis summary details two iterative
approaches that have been used to solve these types
of problems and discusses future work, including state-based
approaches.
Michel Galley
Automatic Summarization of Conversational Multi-party Speech
My thesis focuses on automatic summarization of conversational speech,
a task facing many challenges not found in written texts, e.g., lack
of formatting elements and punctuation. My analysis of human-generated
extractive summaries demonstrated the existence of non-local
dependencies between utterances, e.g., a question will generally not
appear in the summary unless its answer is also included. This
finding led me to investigate ways of incorporating non-local
structure in sequence models such as conditional Markov models and
conditional random fields, and was an integral element in my design of
an extractive summarizer that achieves state-of-the-art performance in
a meeting summarization task. In addition, I investigate the problem
of rendering errorful and disfluent spoken utterances into concise
sentences that are maximally readable. I investigate a number of
trainable approaches, including a syntax-driven compression method
that aims to delete uninformative and syntactically optional
constituents based on a large number of grammatical, lexical,
acoustic, and information retrieval features whose parameters are
discriminatively trained to optimize summary quality.
Rachel Greenstadt
Privatizing Constraint Optimization
Distributed Constraint Optimization (DCOP) is rapidly emerging as a
prominent technique for multiagent coordination. However, despite agent
privacy being a key motivation for applying DCOPs in many applications,
rigorous quantitative evaluations of privacy loss in DCOP algorithms have
been lacking. My thesis research aims to determine accurate methods for
analyzing privacy loss, evaluate existing algorithms according to those
methods and design new algorithms that lose less privacy. My approach
includes detailed experimental analysis of existing algorithms---to gain
an intuition and understanding about how design parameters interact to
achieve privacy or privacy loss. Understanding how these parameters
interact will facilitate the development of new algorithms that are more
suitable for use in settings where privacy matters.
Arnav Jhala
Darshak: An Intelligent Cinematic Camera Planning System
A virtual camera is a powerful communicative tool in virtual
environments. For virtual environments with an underlying narrative
component, there is a need for automated camera planning systems that
account for the situational parameters of the interaction and not just
the graphical arrangement of the virtual world. We propose a camera
planning system called Darshak that takes as input a story in the form
of sequence of events and generates a sequence of camera actions based
on cinematic idioms. The camera actions, when executed in the virtual
environment, update a list of geometric constraints on the camera. A
constraint solver then places the camera based on these constraints.
Ron Katz
Automated Negotiators Using Generic and Demographic Opponent Modeling
In my research I develop algorithms and heuristics for agents participating in
negotiations where the other participants are humans. I currently focus on
environments in which agents conduct interactions repeatedly with different
people, performing generic opponent modeling of their behavior. Specifically,
I have focused on a set of negotiation environments named Cliff-Edge (CE),
which include common daily interactions, such as sealed-bid auctions, dynamic
pricing, and the Ultimatum Game. In all of these interactions the success
probability decreases monotonically as the expected reward increases. Another
aspect of my research considers modeling of human learning and decision
making, using concepts from AI and machine learning.
Wenhui Liao
Dynamic and Active Information Fusion for Decision Making Under Uncertainty
In recent years, there has been a great deal of interest in
information fusion for both military and civilian applications. The thesis
aims to solve several technical challenges in active fusion for timely
decision-making. The first challenge is problem modeling. For that, we propose
a unified framework based on Influence Diagrams for simultaneously modeling
and performing sensor selection, sensor fusion, and decision making. The
second challenge is model learning. We will improve the existing learning
methods to construct and parameterize the model from data and domain knowledge
to deal with real-world applications. The third challenge is active sensing
and efficient information fusion. It includes three key issues: how to choose
a sensor selection criterion and compute the criterion efficiently; how to
select optimal sensor sets given a sensor selection criterion; and how to fuse
information efficiently. For the first issue, we propose using
value-of-information as a measure of sensory utility, and further propose an
approximation algorithm to efficiently compute nonmyopic value-of-information
for multiple sensors. For the second issue, two efficient sensor selection
algorithms are proposed to decide the optimal sensor sets under two typical
scenarios: the budget-limit case and the optimal trade-off case. For the third
issue, a factor-tree inference (FTI) algorithm is proposed to efficiently fuse
the information collected from the selected sensors. Finally, to demonstrate
the proposed framework and the associated algorithms, we apply them to an
application of human-computer interaction. We also expect to apply them to a
military decision-making application.
Bhaskar Mehta
Cross System Personalization by Learning Manifold Alignments
Today, personalization in digital libraries and other information
systems occurs separately within each system that one interacts
with. However, such isolated approaches leave room for several
improvements. Investments of users in personalizing a system,
either through explicit provision of information, or through long and
regular use are not transferable to other systems. Moreover, users
have little or no control over the information that defines their
profile, since user profiles are deeply buried in personalization
engines. Cross-system personalization, i.e. personalization that
shares personalization information across different systems in a
user-centric way, overcomes the aforementioned problems. Information
about users, which is originally scattered across multiple systems, is
combined to obtain maximum leverage. The key idea is that when a large
number of users cross over from one system to another, carrying their
user profiles with them, a mapping between the user profiles of the
two systems can be discovered. In this thesis, we discuss the use of
vector valued learning for the purpose of computing recommendations
for a new user crossing over from one system to another.
Martin Michalowski
A Generalized Query Framework for Geospatial Reasoning
The ability to reason over geospatial entities using publicly available
information is greatly enhanced by the abundance of data sources on the
Internet. There are several key challenges that need to be addressed in order
to fully realize the benefits of fusing these diverse types of sources. The
large amount of data that can be exploited also increases the probability of
encountering source failures or data inconsistencies. Therefore, it is
imperative that any systems that deal with real-world data sources have the
ability to deal with these potential issues. By introducing more data, we are
also presented with a scaling problem. Any reasoning algorithms and conflict
resolution strategies need to scale to larger problem instances and support
larger numbers of data sources. In the research I am proposing, I will create
a generalized framework with the ability to gather information required for
information fusion, reason over the data produced during the gathering
process, and produce answers that are accurate as well as consistent. This
framework will deal with source failures and data inconsistencies
in a novel and scalable way and will support a wide range of queries and a
variety of data sources.
James Niehaus
Employing Cognitive Theories of Discourse Comprehension to Improve Learning in Interactive Narratives
A substantial body of research concerns the topic of education and
training using narrative and virtual worlds. The interactivity of
these virtual worlds means that in order to effectively use narrative
for teaching, computer systems must be able to make story-level
decisions and create new stories on the fly. My proposed research
addresses this problem, namely: How do we generate stories that
teach? I hypothesize that by using cognitive models in automatic
narrative generation, we can improve the recall and comprehension of
curriculum material from generated narratives.
Peter Owotoki
Intelligent Monitoring for Preventive Maintenance of Aircraft Systems
My work is about developing and implementing concepts for applying
intelligent prediction and classification models to the task of
preventive maintenance of aircraft cabin systems. I take a threefold
approach to this problem. First, a framework for integrating the
different components of the system with other cabin components has
been developed. The framework enables each component to be embedded
with an intelligent model for monitoring of its health states. The
second step involves the automatic selection of the models in every
component by employing a meta-learning approach. A final step is to
develop and implement constraints that make the embedded models
transparent, so that they can be used as well for the diagnosis of
faults and for the discovery of new maintenance relevant knowledge.
Ashwin Satyanarayana
Intelligent Data Selection Using Sampling and Filtering
In many real world problems, machine learning algorithms have access to
massive amounts of data. Mining all the available data is prohibitive due
to computational (time and memory) constraints. Much of the current
research is concerned with scaling up data mining algorithms (i.e.
improving on existing data mining algorithms for larger datasets). An
alternative approach is to scale down the data. Thus, determining the
smallest sufficient training set size that achieves the same accuracy as
the entire available dataset remains an important research question. Our
research focuses on selecting how many (sampling) and which (filtering)
instances to present to the machine learning algorithm. The goals of this
thesis are to study and characterize the properties of learning curves,
integrate them with bounds such as Chernoff and Chebyshev to come up with
an efficient general purpose adaptive sampling schedule and filtering
technique, and to empirically validate our algorithm for scaling down the
data.
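As an illustration of the kind of bound involved, not the thesis algorithm itself, the Hoeffding/Chernoff bound gives a sample size sufficient for an empirical estimate of a [0, 1]-valued quantity (such as accuracy) to lie within eps of its true value with probability at least 1 - delta:

```python
import math

def chernoff_sample_size(eps, delta):
    """Hoeffding/Chernoff bound: n >= ln(2/delta) / (2 * eps**2)
    samples suffice for the empirical mean of a [0, 1] variable
    to deviate from the true mean by more than eps with
    probability at most delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

# Estimating accuracy to within 1% with 95% confidence:
n = chernoff_sample_size(eps=0.01, delta=0.05)  # 18445 samples
```

An adaptive schedule would grow the sample progressively and stop once the bound, or the observed learning curve, indicates no further accuracy gain is likely.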
Mohan Sridharan
Robust Autonomous Structure-based Color Learning on a Mobile Robot
To operate in the proverbial real world, autonomous robots are
predominantly dependent on sensory information but the ability to
accurately sense the complex world is still missing. Visual input, in
the form of color images from a camera, should be an excellent source
of such information, but color, and images in general, have been used
sparingly on mobile robots. This is because most sophisticated vision
algorithms require a substantial amount of computational and memory
resources and cannot account for the rapid camera motion
characteristic of mobile robot domains. They are also sensitive to
illumination changes. In this thesis the focus shall be on designing
algorithms that can work within the robot's computational and memory
constraints while making the best use of the available information.
First, I develop a prototype vision system that solves challenging
vision problems such as color segmentation and object recognition
on-board an autonomous robot, relying on manual training data and
operating under constant and reasonably uniform illumination
conditions. Second, the robot shall exploit the knowledge of the
location of unique objects in its world to autonomously learn the
desired colors required for color segmentation. Third, I shall enable
the robot to adapt to the changes in illumination. To summarize, the
thesis shall provide a mobile robot vision system that autonomously
learns colors from structure and adapts to illumination changes.
Gita Sukthankar
Activity Recognition for Physically-Embodied Agent Teams
This thesis focuses on the problem of activity recognition for
physically-embodied agent teams. Examples of
physically-embodied agent teams include: soldiers moving in formation,
gamers controlling virtual bots in a simulated world, or robots playing
a Robocup soccer match. We define team activity recognition as the
process of identifying team behaviors from traces of agents' positions
and orientations as they evolve over time; the goal is to completely
annotate agent traces with: 1) the correct sequence of low-level actions
performed by each agent; 2) an assignment of agents to teams and subteams;
3) the set of team plans consistent with the observed sequence. Activity
traces are gathered from teams of humans or agents performing military
tasks in urban environments. Team behavior annotations can be used
for a wide variety of applications including virtual training environments,
visual monitoring systems, and commentator agents.
For many physical domains, coordinated team behaviors create distinctive
spatio-temporal patterns that can be used to identify low-level action
sequences; we demonstrate that this can be done in a way that is robust to
spatial variations in the environment and human deviations during behavior
execution. This thesis addresses the novel problem of agent-to-team
assignment for team tasks where team composition, the mapping of agents into
teams, changes over time; this allows the analysis of more complicated
tasks in which agents must periodically divide into subteams. To do this,
we introduce a new algorithm, Simultaneous Team Assignment and Behavior
Recognition (STABR), that generates low-level action annotations from
spatio-temporal agent traces. Finally, we extend methods in symbolic plan
recognition to exploit both temporal constraints between behaviors and
agent role constraints in team plans to reduce the number of state history
hypotheses that our system must consider.
Chaofan Sun
Adaptive Support Vector Machine Learning with Data Selection
This paper presents data selection procedures for support vector
machines (SVMs). The purpose of data selection is to reduce the
dataset by eliminating as many non-support vectors (non-SVs) as
possible. Based on the fact that support vectors (SVs) are those
vectors close to the decision boundary, data selection keeps only the
closest pairs of vectors from opposite classes. The selected dataset
replaces the full dataset as the training set for any standard SVM
algorithm. For large datasets, we use a tree-based data structure to
speed up the search for closest pairs. A Spatial Approximation Sample
Hierarchy (SASH) tree can be used to approximate k-nearest neighbors
(k-NNs). The basic idea behind the SASH tree is an approximation based
on the triangle inequality d(x,z) <= d(x,y) + d(y,z), where d denotes
the distance between two vectors: if y is the closest neighbor of x,
and z is the closest neighbor of y, then z is most likely one of the
closest neighbors of x. Inspired by this structure, we use a similar
structure to approximate closest pairs, but from opposite classes,
unlike k-NN search in SASH, which makes no class distinction. Node
connections between two adjacent levels are therefore established
based on the top closest pairs of opposite classes. Each node also has
one or more connections to its closest same-class neighbors in order
to preserve indirect closest pairs. The time complexity of building
this tree is O(dpcn log n) and querying is O(dpcn log n). The space
used is O(n). Accordingly, searching for closest pairs of opposite
classes is an order of magnitude faster, and 90% more SVs can be
covered.
Jim Thomas
A Collaborative Authoring Environment for Plan-Based Interactive Narrative
I am interested in making automated planning technology more accessible
and more widely used. Toward that end, I am developing a tool set that
enables plan authors to quickly gain an understanding of the
characteristics of an automated plan that are of particular interest. My
tool employs a domain metatheory I call the Domain Elaboration Framework,
or DEF, that allows authors to identify aspects of the plan space and
individual plans over which they wish to have better control or knowledge.
Although my initial motivation is to apply these insights toward the
construction of plans in the domain of interactive narrative, I am
attempting to maintain sufficient generality to allow this work to be of
use to the broader planning community.
Rattapoom Tuchinda
Interactively Building Information Integration Agents
Any kind of data imaginable can be found on the Internet today: pricing and
reviews of goods and services from multiple vendors, maps, and statistics
about events around us. This data, when combined in the right way, can give
us interesting insights. The trend is evident in the proliferation of web
applications called "mashups." However, the process of creating one is
limited to those with programming knowledge, because of the sophistication
required in three separate steps: data retrieval, data integration, and data
display composition. My thesis work aims to investigate a framework that
allows an everyday user to build information integration plans (a mashup is
one type of such plan) without having to write any code.
Tao Wang
Action Selection in Bayesian Reinforcement Learning
My research attempts to address on-line action selection in
reinforcement learning from a Bayesian perspective. The idea is to
develop more effective action selection techniques by exploiting
information in a Bayesian posterior, but in a computationally
efficient manner. I have focused on developing efficient approximations
based on stochastic simulation and value function approximation.
The first approach I developed selects actions by growing an adaptive,
sparse lookahead tree to estimate the future value of alternative
actions. I further augment the approach by considering a new value
function approximation strategy for the belief-state Markov decision
processes induced by Bayesian learning.
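The sparse lookahead idea can be illustrated with a generic sparse-sampling sketch in the style of Kearns, Mansour, and Ng's algorithm; the adaptive tree growth and Bayesian belief-state of the thesis are not modeled here. Each action's value is estimated by drawing a few successor states from a generative model and recursing to a fixed depth.

```python
def q_estimate(sim, state, action, actions, depth, width, gamma):
    """Sparse-sampling estimate of Q(state, action): draw `width`
    successor samples from the generative model `sim`, recurse,
    and back up the discounted maximum over actions."""
    if depth == 0:
        return 0.0
    total = 0.0
    for _ in range(width):
        next_state, reward = sim(state, action)
        future = max(q_estimate(sim, next_state, a, actions,
                                depth - 1, width, gamma)
                     for a in actions)
        total += reward + gamma * future
    return total / width

def select_action(sim, state, actions, depth=3, width=4, gamma=0.95):
    """Greedy action with respect to the lookahead estimates."""
    return max(actions, key=lambda a: q_estimate(sim, state, a, actions,
                                                 depth, width, gamma))

# Toy generative model: action 1 always yields reward 1, action 0 none.
def sim(state, action):
    return state, float(action)

best = select_action(sim, 0, actions=[0, 1])  # chooses action 1
```

In the Bayesian setting, `sim` would sample transitions from the posterior over models, so the lookahead naturally trades off exploration and exploitation.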