In addition to content selection issues, I investigated approaches for
improving the readability of automatic summaries. Specifically, we
developed a rewrite module for generating appropriate references to
people in summaries. The module is based on a Markov model capturing
the dependence of the syntactic form of a reference on the syntax used
in the previous reference. The model was trained on journalistic
text. Human subjects were found to show a preference for the rewritten
texts.
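As a sketch of how such a model could work (an illustrative assumption, not the authors' implementation): treat the syntactic form of each successive mention of a person as a state in a first-order Markov chain, with transition probabilities of the kind that could be estimated from journalistic text.

```python
# Illustrative sketch only: a first-order Markov model over the
# syntactic form of successive references to a person. The form
# inventory and all probabilities below are hypothetical.

# P(next_form | previous_form); None marks the first mention.
TRANSITIONS = {
    None:              {"title+full_name": 0.80, "last_name": 0.15, "pronoun": 0.05},
    "title+full_name": {"title+full_name": 0.05, "last_name": 0.60, "pronoun": 0.35},
    "last_name":       {"title+full_name": 0.05, "last_name": 0.55, "pronoun": 0.40},
    "pronoun":         {"title+full_name": 0.10, "last_name": 0.50, "pronoun": 0.40},
}

def most_likely_forms(n):
    """Greedily pick the most probable form for each of n mentions."""
    forms, prev = [], None
    for _ in range(n):
        probs = TRANSITIONS[prev]
        prev = max(probs, key=probs.get)
        forms.append(prev)
    return forms
```

Under these assumed probabilities a first mention comes out as title plus full name and later mentions shorten, mirroring the journalistic convention such a rewrite module aims to reproduce.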
Jacob Beal
Leveraging Language into Learning
I am investigating learning as a byproduct of translation between
different perspectives. In previous work, I demonstrated agents
rapidly constructing a simple inflected language on the basis of
shared experiences. My current work posits that if there are
structured differences between the agents' perspectives, then more
complicated concepts can be learned, and reasoning can proceed as a
byproduct of translation attempting to bring the various agents' world
models into alignment. This may shed light on how human intelligence
functions, as well as enabling complex systems to organize their
communication interconnects without precision manufacture.
Ellen Campana
An Empirical Analysis of the Costs and Benefits of Naturalness in
Spoken Dialog Systems
My work is concerned with design of spoken interfaces, including
spoken dialog systems. All researchers who work on such systems seem
to agree that they should be designed in such a way that they are easy
for users to interact with. There are two main approaches to designing
systems that are easy to use. One approach is to develop standardized
systems, with the hope that in time users will learn how to interact
with them easily. Another approach is to develop natural systems that
approximate human-human interaction, with the hope that one day such
systems will become so natural that humans will find them easy to
interact with. Currently, there are no empirical methods in
widespread use for investigating the likely success of the two
approaches. The goal of my thesis work is to extend a classic tool of
cognitive psychology, the dual-task methodology, to answer this
question. As a test case I will focus on generation and understanding
of referring expressions because these are central to language use and
because the two design approaches make different predictions about how
they should be implemented in spoken systems.
Mark Carman
Learning Source Descriptions for Web Services
New Web Services are being made available on the internet all the
time, and while some provide completely new functionality, most are
slight variations on already existing themes. I am interested in the
problem of enabling systems to take advantage of these new services
without the need for reprogramming. An existing system can only make
use of a new service if it knows what functionality the service
provides. In the case of information producing services, this
functionality is described using Local-as-View (LAV) source
definitions. Such definitions can then be used to compose services or
incorporate them into existing workflows. I am developing a framework
for learning a service's LAV source definition automatically, by
actively invoking the service and comparing the output produced with
that of other known sources. The framework combines Inductive Logic
Programming (ILP) and Query Reformulation techniques in order to
systematically generate and test plausible source definitions. I have
tested the framework on a real Web Service implementation and discuss
some issues which arise in practice when trying to force the learning
system to converge on the correct source definitions.
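To make the idea of an LAV definition concrete, here is a minimal generate-and-test sketch; the relation `centre`, the black-box service, and all data are hypothetical illustrations, not Carman's actual framework.

```python
# Illustrative sketch only: testing a candidate LAV source definition
# by actively invoking a new service and comparing its output against
# known sources. All relations and data here are hypothetical.

# A known "global" relation: centre(code, latitude, longitude).
centre = {("LAX", 33.94, -118.41), ("JFK", 40.64, -73.78)}

def new_service(code):
    """A black-box Web Service: airport code -> latitude."""
    for c, lat, lon in centre:
        if c == code:
            return lat
    return None

def candidate(code):
    """Candidate definition: source($code, $lat) :- centre($code, $lat, _)."""
    return {lat for c, lat, lon in centre if c == code}

def consistent(defn, service, probes):
    """Does the candidate definition explain the service's outputs?"""
    return all(service(p) in defn(p) for p in probes)
```

An ILP-style search would enumerate many such candidate conjunctive queries and keep those that remain consistent as further probe invocations are compared.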
Vincent Conitzer
Computational Aspects of Mechanism Design
In a preference aggregation setting, a group of agents must jointly
make a decision, based on the individual agents' privately known
preferences. To do so, the agents need some protocol (or mechanism)
that will elicit this information from them, and make the decision.
Examples of such mechanisms include voting protocols, auctions, and
exchanges. In most real-world settings, preference aggregation is
confronted with the following three computational issues. First,
there is the complexity of executing the mechanism. Second, when
standard mechanisms do not apply to or are suboptimal for the setting
at hand, there is the complexity of designing the mechanism. Third,
the agents face the complexity of (strategically) participating in the
mechanism. My thesis statement is that by studying these computational
aspects of the mechanism design process, we can significantly improve
the generated mechanisms in a hierarchy of ways, leading to increased
economic welfare.
Li Ding
On Boosting Semantic Web Data Access
The Semantic Web has been deployed as millions of RDF documents on the
Web. In order to utilize the huge amount of knowledge in the Semantic
Web, effective data access and data quality evaluation mechanisms are
needed. This thesis proposes a metadata and search engine based
approach to support both objectives. Preliminary work has developed
(i) the WOB ontologies, which help build metadata about the
Semantic Web and its context; and (ii) the semantic web search and
navigation model, which models web-scale semantic web data access with
additional navigation paths. Both have been implemented in Swoogle,
which discovers, indexes, and ranks approximately 0.5M online RDF
documents and provides web search services (document search and term
search) to both machine and human agents. Ongoing and future work
will refine and evaluate existing theories and implementations, and
investigate the data quality issues by tracking knowledge provenance
and using context-based analysis.
Wolfgang Ketter
Dynamic Regime Identification and Prediction Based on Observed
Behavior in Electronic Marketplaces
We present a method for an autonomous agent to identify dominant
market conditions, such as oversupply or scarcity. The
characteristics of economic regimes are learned from historic data and
used, together with real-time observable information, to identify the
current market regime and to forecast market changes. The approach is
validated with data from the Trading Agent Competition for Supply
Chain Management.
Mykel Kochenderfer
Adaptive Modeling and Planning for Reactive Agents
This research is concerned with problems where an agent is situated in
a stochastic world without prior knowledge of the world's
dynamics. The agent must act so as to maximize its
expected discounted reward over time. The state and action spaces are
extremely large or infinite, and control decisions are made in
continuous time. The objective of this research is to create a system
capable of generating competent behavior in real time.
Xin Li
Self-Emergence of Structures in Gene Expression Programming
Automatically discovering and predicting hidden patterns and
relationships in the monitoring data produced by manufacturing and
design processes is pivotal to improving production quality. This
thesis work aims at applying Gene Expression Programming (GEP), a
recently developed evolutionary
algorithm, to fulfill these complex data mining tasks by preserving
and utilizing the self-emergence of solution structures during its
evolutionary process. The main contributions include an investigation
of constant-creation techniques for promoting the emergence of good
functional structures during evolution, an analysis of the limitations
of the current implementation of GEP, a new genotype representation
scheme for better inheritance of solution structures, and a novel use
of the emergent structures to achieve a flexible, higher-level search
for solutions.
Bhaskara Marthi
Concurrent Hierarchical Reinforcement Learning
The field of hierarchical reinforcement learning attempts to speed up
reinforcement learning with human prior knowledge about what good
policies look like. Existing HRL frameworks such as MAXQ and ALisp
work best in domains in which the agent has a single ``effector'' and
is engaged in a single task at any point. However, many domains
consist of multiple effectors and have multiple tasks in progress
simultaneously. For example, in the computer game Stratagus, the
player must control multiple ``units'' (effectors), and each unit may
be involved in a different high-level task. My collaborators and I
have worked on extending HRL to handle such domains. To this end, we
developed the language Concurrent ALisp, in which prior knowledge is
represented as a ``multithreaded partial program'', and solved the
algorithmic problems resulting from the exponentially large number of
joint choices the agent is faced with at each step. We also showed
how to use the threadwise and temporal structure of the program to
decompose the Q-function additively, and presented learning algorithms
that make use of this decomposition.
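The threadwise decomposition can be illustrated with a toy sketch (action names and values are hypothetical): if the joint Q-function is additive across threads, each thread's choice can be optimized separately rather than enumerating the exponentially many joint actions.

```python
# Illustrative sketch only: an additive, threadwise Q-decomposition
# Q(s, (a1, a2)) = Q1(s, a1) + Q2(s, a2), with hypothetical values
# for one fixed state s.
Q1 = {"attack": 1.0, "defend": 0.5}   # thread controlling unit 1
Q2 = {"mine": 0.8, "build": 0.3}      # thread controlling unit 2

def joint_q(a1, a2):
    return Q1[a1] + Q2[a2]

def best_joint_action():
    # This sketch assumes fully independent components, so each one
    # is maximized on its own: 2 + 2 evaluations instead of 2 * 2
    # joint actions (a sum rather than a product over threads).
    return (max(Q1, key=Q1.get), max(Q2, key=Q2.get))
```

In the actual algorithms the components may share choice variables, requiring more careful joint maximization; the sketch shows only the simplest, fully independent case.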
Ani Nenkova
Discourse Factors in Multi-Document Summarization
My thesis focuses on the study of the processes involved in
multi-document summarization of news articles. I have developed a
manual annotation scheme, called the pyramid method, that allows us to
analyze human content choices during summarization. The analysis of
multiple human summaries shows that the content units that appear in
human-authored summaries follow a power-law distribution, formally
confirming the intuition that there can be different summaries that
are equally good from a content perspective. This empirical analysis
led to the development of a frequency-based summarizer that performs
on par with
the state-of-the-art systems for generic multi-document summarization.
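A minimal sketch of a frequency-based extractive scorer in this spirit (the tokenization and scoring details are simplifying assumptions, not the author's exact system): score each sentence by the average corpus probability of its words and extract the top-scoring ones.

```python
# Illustrative sketch only: frequency-based sentence scoring for
# extractive summarization, using whitespace tokenization.
from collections import Counter

def summarize(sentences, n=1):
    words = [w for s in sentences for w in s.lower().split()]
    freq = Counter(words)
    total = sum(freq.values())
    prob = {w: c / total for w, c in freq.items()}

    def score(s):
        toks = s.lower().split()
        return sum(prob[w] for w in toks) / len(toks)

    return sorted(sentences, key=score, reverse=True)[:n]
```

Sentences dominated by high-frequency content words score highest, which is exactly the bias the power-law finding suggests human summarizers share.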
Jennifer Neville
Structure Learning for Statistical Relational Models
Many data sets are relational in nature (e.g., citation graphs, the
World Wide Web, genomic structures). These data offer unique
opportunities to improve model accuracy, and thereby decision-making,
if machine learning techniques can effectively exploit the relational
information. To date research on statistical relational models has
focused primarily on knowledge representation and inference---there
has been little attention paid to the challenges and opportunities
that are unique to learning in relational domains. This work will
consider in depth the issue of structure learning and focus on
developing accurate and efficient structure learning techniques for
statistical relational models.
Ozgur Simsek
Towards Competence in Autonomous Agents
My thesis aims to contribute towards building autonomous agents that
are able to develop competency over their environment---agents that
are able to achieve mastery over their domain and are able to solve
new problems as they arise using the knowledge and skills they
acquired in the past. I propose a number of methods for building
competence in autonomous agents using the reinforcement learning
framework, a computational approach to learning from
interaction. These methods allow an agent to autonomously develop a
set of skills---closed-loop policies over lower-level actions---that
allow the agent to interact effectively with its environment and
flexibly deal with new tasks.
Trey Smith
Rover Science Autonomy: Probabilistic Planning for Science-Aware Exploration
Future Mars rovers will have the ability to autonomously navigate for
distances of kilometers. In one sol a traverse may take a rover into
unexplored areas beyond its local horizon. The rover can explore these
areas more effectively if it is able to detect and react to science
opportunities on its own, a capability we call science autonomy. We are
studying science autonomy in two ways: first, by implementing a simple
science autonomy system on a rover in the field, and second, by
developing probabilistic planning technology that can enable more
principled autonomous decisionmaking in future systems.
Radu Soricut
Natural Language Generation for Text-Based Applications Using an Information-Slim Representation
My research interests are in the Natural Language Processing area,
focusing on natural language generation, language modeling, machine
translation, and automatic summarization. My current activity focuses
on devising representation formalisms and algorithms to be used for
natural language generation in the context of text-to-text
applications. In this context, the natural language generation process
is driven by salient words and phrases derived from the input text,
and also by general language knowledge captured by probabilistic
language models. This generic style of natural language generation
fits a variety of text-to-text applications, such as Machine
Translation and Automatic Summarization.
Snehal Thakkar
Planning for Geospatial Data Integration
Integration of geospatial data is an important problem that has
implications in applications such as response to unexpected events and
urban planning. In this article I describe my work on extending the
existing data integration techniques to support integration of
geospatial data. In particular, I describe how to represent available
sources and operations in a data integration system. I show that the
representations can be used with an existing query reformulation
technique called Inverse Rules to dynamically generate integration
plans to answer user queries. The article also describes a technique
called tuple-level filtering to optimize the dynamically generated plans.
Shimon Whiteson
Improving Reinforcement Learning Function Approximators via Neuroevolution
Temporal difference methods are theoretically grounded and empirically
effective methods for addressing sequential decision making problems
with delayed rewards. Most problems of real-world interest require
coupling TD methods with a function approximator to represent the
value function. However, using function approximators requires
manually making crucial representational decisions. This paper
introduces evolutionary function approximation, a novel approach to
automatically selecting function approximator representations that
enable efficient individual learning. Our method evolves individuals
that are better able to learn. We present a fully implemented
instantiation of evolutionary function approximation which combines
NEAT, a neuroevolutionary optimization technique, with Q-learning and
Sarsa, two popular TD methods. The resulting NEAT+Q and NEAT+Sarsa
algorithms automatically learn effective representations for neural
network function approximators. This paper also introduces on-line
evolution, which improves the on-line performance of evolutionary
computation by borrowing selection mechanisms used in TD methods to
choose individual actions and using them in evolution to select
policies for evaluation. We evaluate our contributions with an
extended empirical study in the autonomic computing domain of server
job scheduling. The results demonstrate that evolutionary function
approximation can substantially improve the performance of TD methods
and on-line evolution can significantly improve evolutionary methods.
This paper also presents additional tests that offer insight into what
factors can make function approximation difficult in practice.
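The borrowed selection mechanism can be sketched as follows (a hypothetical epsilon-greedy variant, one of the TD-style selection rules such work draws on): instead of evaluating each population member a fixed number of times, pick the next policy to evaluate the way a TD agent picks actions.

```python
import random

# Illustrative sketch only: epsilon-greedy selection of which
# population member (policy) to evaluate in the next episode,
# favoring members with high average fitness so far.
def select_member(avg_fitness, epsilon=0.1, rng=random):
    if epsilon > 0 and rng.random() < epsilon:
        return rng.choice(list(avg_fitness))      # explore a random policy
    return max(avg_fitness, key=avg_fitness.get)  # exploit the current best
```

Concentrating evaluations on promising policies in this way is what lets evolutionary computation accrue reward on-line instead of wasting episodes on clearly poor members.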