By Babak Hodjat
Adaptive Agent Oriented Software Architecture (AAOSA) technology
is a new approach to software engineering that enables rapid development
and deployment of powerful applications. AAOSA uses a community
of collaborative claim-based and message-driven components known
as agents.
The goal of AAOSA is to provide software designers with a software
development platform that contains support and mechanisms to coordinate
the community of AAOSA agents forming the application.
Agents make claims on parts of the input. The rules to claim parts
of the input and resolve conflicts between reported claims make
this system a unique Agent Oriented Software Engineering (AOSE)
paradigm. This is the intelligence of the system and is programmed
into the agent core, which contains the protocols and shared knowledge
that the agents need for collaboration. In addition, agents may
have unique application-specific capabilities implemented by application
designers. AAOSA applications are high-performance and memory-efficient.
Written in Java, AAOSA is device-, platform-, and input-method-independent.
Technological Foundation
Dejima’s proprietary and patented technology has been designed
for an Agent Oriented Software Engineering (AOSE) methodology as
opposed to standardizing existing Multi-Agent Systems (MAS) or Distributed
Artificial Intelligence (DAI) systems.
The underlying technology is based on DAI methods in which difficult
problems, such as trying to achieve human-like intelligence, are
approached by first breaking the problem down into simple sub-domains,
and then by assigning computational modules (agents) to each sub-domain.
This approach counts on the coordination between these modules and
the emergent behavior of the total system to derive a solution.
Objects vs. Agents
In mainstream computer science the notion of an agent is seen as
a natural development of the object-based concurrent programming
paradigm, where an agent is a self-contained, concurrently executing
software process that encapsulates some state and is able to
communicate with other agents via message passing. In many cases
agent architectures are designed and implemented using object orientation.
Messaging in objects is simply method invocation. In contrast,
agents distinguish between different types of messages and in many
cases model them and use complex protocols to negotiate. Agents
also analyze messages and decide whether to execute them, whereas
objects invoke their methods without question. The agent-oriented
approach helps designers visualize and model the interactions, the
communications, and the collaborative elements of the application
by emphasizing the social aspects of the unit of programming.
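The contrast between objects and agents can be sketched in Java. This is a minimal illustration, not AAOSA code: the names `ChartObject`, `ChartAgent`, `Message`, and `MessageType`, and the rule that the last word of the payload is the symbol, are all assumptions made for the example. The object runs its method whenever it is called, while the agent inspects the message first and may decline it.

```java
import java.util.Locale;
import java.util.Optional;

/** Illustrative sketch only; these types are assumptions, not AAOSA classes. */
final class AgentVsObjectSketch {

    /** Message kinds an agent might distinguish (assumed for this sketch). */
    enum MessageType { QUERY, COMMAND }

    /** An immutable message: its kind plus a free-text payload. */
    record Message(MessageType type, String payload) {}

    /** A plain object: invoking the method always executes it. */
    static final class ChartObject {
        String draw(String symbol) {
            return "chart:" + symbol;
        }
    }

    /** An agent: it analyzes each message and may decline to act on it. */
    static final class ChartAgent {
        private final ChartObject charts = new ChartObject();

        Optional<String> receive(Message message) {
            String text = message.payload().toLowerCase(Locale.ROOT);
            // Only commands that mention a chart fall within this agent's domain.
            if (message.type() != MessageType.COMMAND || !text.contains("chart")) {
                return Optional.empty();
            }
            // Hypothetical rule: treat the last word of the payload as the symbol.
            String symbol = text.substring(text.lastIndexOf(' ') + 1);
            return Optional.of(charts.draw(symbol));
        }
    }

    public static void main(String[] args) {
        ChartAgent agent = new ChartAgent();
        System.out.println(agent.receive(new Message(MessageType.COMMAND, "Gimme a chart for IBM")));
        System.out.println(agent.receive(new Message(MessageType.QUERY, "What is the weather?")));
    }
}
```

Returning an empty `Optional` is one simple way to model an agent that declines a message. A real agent community would more likely answer with a protocol message of its own.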

Figure 1: Sample transaction Direct Mobile
with SMS modality.
Applications
AAOSA can support a wide range of applications and software systems
that require complex classification of the input to facilitate appropriate
output. One of the areas in which AAOSA has been successfully applied
is the area of Natural Interaction Interfaces (NII). NIIs are an
extension of natural language interfaces in which multiple modes
of interaction are supported. Examples of successful NII applications
built on AAOSA include wireless enabled portals (Figure 1), eCRM
(Figure 2), and e-commerce. Some specific products based on AAOSA
include message/contacts applications, database interfaces, and
interfaces to home entertainment sets.

Figure 2: Sample transaction with Dejima® Direct™
SFA with the RIM PDA e-mail modality. The query was sent in the
message subject as: “Find opps closing this qtr w/ a prob
> 80%”.
Designing AAOSA Software
A visual programming environment facilitates the creation of NIIs
using AAOSA (Figure 3). The AAOSA core supports designers and developers
by managing the complexities of how to communicate between agents,
and how to coordinate the community of agents. These properties
enable designers that use AAOSA to focus on the task of modeling
the real application instead of spending time on basic support functions.
The AAOSA core applies powerful DAI techniques, including complete
encapsulation of responsibilities, extensibility, reusability,
distributability, and multiple entry and exit points.

Figure 3: A screen shot of the AAOSA SDK with a portion of a financial
agent sub-network. Green lines show path of claims for the input:
“Gimme a chart for IBM”. The resulting XML, combined
from agent claims is: “<chart></chart><companies><name><![CDATA[ibm]]></name></companies>”
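The XML string in the Figure 3 caption can be reproduced with a small sketch. The caption does not say how AAOSA assembles the output, so this assumes a simple model: each claim is a node with a tag, the input text it claimed, and child claims, and the result is the concatenation of the top-level claims. The `Claim` record and the `combine` method are hypothetical names.

```java
import java.util.List;

/** Illustrative sketch only; the claim model here is an assumption. */
final class ClaimXmlSketch {

    /** A claim node: a tag, the claimed input text (may be empty), and child claims. */
    record Claim(String tag, String text, List<Claim> children) {

        /** Renders this claim and its children as an XML fragment. */
        String toXml() {
            StringBuilder xml = new StringBuilder();
            xml.append('<').append(tag).append('>');
            if (!text.isEmpty()) {
                // No escaping is done here; text containing "]]>" would need special handling.
                xml.append("<![CDATA[").append(text).append("]]>");
            }
            for (Claim child : children) {
                xml.append(child.toXml());
            }
            return xml.append("</").append(tag).append('>').toString();
        }
    }

    /** Concatenates the fragments of the top-level claims in order. */
    static String combine(List<Claim> claims) {
        StringBuilder xml = new StringBuilder();
        for (Claim claim : claims) {
            xml.append(claim.toXml());
        }
        return xml.toString();
    }

    public static void main(String[] args) {
        Claim chart = new Claim("chart", "", List.of());
        Claim name = new Claim("name", "ibm", List.of());
        Claim companies = new Claim("companies", "", List.of(name));
        System.out.println(combine(List.of(chart, companies)));
    }
}
```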
AAOSA-based Natural Interaction Interfaces
Most human-computer interfaces in use today are complicated and
difficult to operate. This is due, in part, to the growing number
of features the interface provides. Users should be able to express
their intentions as freely and naturally as possible. Interaction
should be limited to avoid overwhelming the users with redundant
options at inappropriate times. Agent-oriented software is used
to pinpoint the semantic sub-domains responsible for responding
to input. The designer of an AAOSA-based Natural Interaction Interface
creates a semantic network of agents.
Extending Context-Sensitive Parsers
A grammatical parser can be implemented in software using AAOSA.
Practical problems force one to make some improvements. Creating
a grammar is a complicated task. Changing grammars based on learning
is also difficult. Grammars alone are not enough to fulfill the
requirements noted in the last section.
The interpretation policies should be much fuzzier than those of
a parser (i.e., the production rules). For instance, rather than
requiring the claims on which a new claim is based to be in sequence,
we can require them only to be exclusive. The interpretation policies
determine what the best reduction condition is, and each agent
computes a confidence factor for its claims based on the extent
to which the reduced claims differ from the desired ones. Using a
threshold, claims of higher confidence are used as query responses.
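The idea of a confidence factor combined with a threshold can be sketched as follows. The formula is an assumption chosen for illustration, not the one AAOSA uses: confidence is the fraction of an agent's desired tokens that appear anywhere in the input, ignoring order. The names `confidence`, `aboveThreshold`, and `Claim` are hypothetical.

```java
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; the scoring formula is an assumption. */
final class ClaimConfidenceSketch {

    /** A claim made by a named agent, with its confidence in [0, 1]. */
    record Claim(String agent, double confidence) {}

    /** Fraction of the desired tokens present in the input, regardless of order. */
    static double confidence(Set<String> desired, List<String> inputTokens) {
        if (desired.isEmpty()) {
            return 0.0;
        }
        long matched = desired.stream().filter(inputTokens::contains).count();
        return (double) matched / desired.size();
    }

    /** Keeps only the claims whose confidence meets the threshold. */
    static List<Claim> aboveThreshold(List<Claim> claims, double threshold) {
        return claims.stream().filter(c -> c.confidence() >= threshold).toList();
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("gimme", "a", "chart", "for", "ibm");
        List<Claim> claims = List.of(
                new Claim("chart", confidence(Set.of("chart", "ibm"), tokens)),
                new Claim("news", confidence(Set.of("news", "ibm"), tokens)));
        System.out.println(aboveThreshold(claims, 0.75));
    }
}
```

Because the score ignores token order, a scrambled input such as "IBM chart gimme" would score the same, which is the kind of fuzziness a strict production rule would not allow.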
Another main difference between the parser and the newly proposed
natural interaction system is that the context considered in the
reductions of a context-sensitive grammar is limited to the input.
In the real world, the decision to make a claim may be made based
on context information that is not necessarily present in the input.
AAOSA NII Features
Input may come from different sources, such as voice, the Web, PDAs,
WAP, and SMS, and users may switch between them quite easily.
The semantic modeling minimizes the development required to move
from one language to another and can even support more than
one language at a time or a mix of languages (e.g., English and
French).
Interacting with the user during coordination is a means to resolve
contradicting claims from agents. This also brings about the means
to carry out smart dialogues where users may deviate from answering
the system directly by adding extra information or changing the
context established in prior interactions of the dialogue. For example,
in a Movie Information application, dialogues such as the following
are quite possible:
User> Get me some comedies in Mountain View.
System> There are 15 comedies playing in theaters in Mountain
View. Would you like to see them all? You can also give me more
information about the movies or theaters.
User> Actually, get me a drama with Bruce Willis for tonight.
System> Here is a list of drama movies starring Bruce Willis
in Mountain View.
Ungrammatical, unpredicted input will be accepted by the system.
The system degrades gracefully to a dialogue system to get the
user to the desired functionality. Many different ways can be used
to accomplish the same functional result. Applications have a relatively
small memory footprint and can potentially be embedded. Agent
sub-networks can be added or removed while the system is running,
which changes the lingual and contextual scope of the application.
The system can respond to combinations of commands in a single input
without having been designed for them beforehand. The combination
heuristics in each agent are responsible for this behavior.
Contextual information can be used during coordination to resolve
contradicting claims from agents. Various methods can be used to
resolve ambiguities in agents. They include the recency of the request,
the functional status of the back-end application, and probability
based on the history of dialogues. The system can also interact with
the user to identify the user’s preference in the case of
ambiguities or unresolved interpretations.
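One of the methods listed above, recency, can be sketched like this. The design is an assumption for illustration: among claims that are otherwise tied, the agent used in the most recent dialogue turn wins, and if recency cannot separate them the method returns an empty result, which stands for asking the user. The names `resolve`, `Claim`, and `lastUsedTurn` are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch only; the tie-breaking rule is an assumption. */
final class AmbiguitySketch {

    /** A claim made by a named agent, with its confidence. */
    record Claim(String agent, double confidence) {}

    /**
     * Picks among tied claims by recency of the claiming agent.
     * An empty result means recency did not help and the user should be asked.
     */
    static Optional<Claim> resolve(List<Claim> tied, Map<String, Integer> lastUsedTurn) {
        Claim best = null;
        int bestTurn = Integer.MIN_VALUE;
        boolean stillTied = false;
        for (Claim claim : tied) {
            // Agents with no history get turn -1, so any used agent beats them.
            int turn = lastUsedTurn.getOrDefault(claim.agent(), -1);
            if (turn > bestTurn) {
                best = claim;
                bestTurn = turn;
                stillTied = false;
            } else if (turn == bestTurn) {
                stillTied = true;
            }
        }
        return stillTied ? Optional.empty() : Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<Claim> tied = List.of(new Claim("movies", 0.8), new Claim("theaters", 0.8));
        System.out.println(resolve(tied, Map.of("movies", 3, "theaters", 1)));
        System.out.println(resolve(tied, Map.of()));
    }
}
```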
Limitations of Current Natural Interaction Methods
Word Spotting Methods
Word spotting is used in many search engines and computer games.
These methods are fast and effective in small applications but are
not extendable to larger applications where users often need to
actuate different combinations of functions or when the same phrase
may be used to mean different things based on the context (e.g.,
presence of other words, history of the interaction, etc.).
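A minimal word spotter shows both the appeal and the limitation. The keyword table and action names below are invented for the example. Each known word triggers an action no matter what surrounds it, so the second input in `main` still triggers the chart action even though the user asked for no chart.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; keywords and action names are invented. */
final class WordSpottingSketch {

    /** Hypothetical keyword-to-action table. */
    private static final Map<String, String> KEYWORDS = Map.of(
            "chart", "SHOW_CHART",
            "news", "SHOW_NEWS");

    /** Returns the action for every known keyword, in input order, ignoring context. */
    static List<String> spot(String input) {
        List<String> actions = new ArrayList<>();
        for (String token : input.toLowerCase(Locale.ROOT).split("\\W+")) {
            String action = KEYWORDS.get(token);
            if (action != null) {
                actions.add(action);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        System.out.println(spot("Gimme a chart for IBM"));
        // The negation is invisible to the spotter, so the chart action still fires.
        System.out.println(spot("No chart, just news"));
    }
}
```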
Linguistic Methods
Grammatical methods are traditionally used to model the lingual
input to Natural Interaction Interfaces. These models are difficult
to create and usually require linguistic knowledge. The lexical
analysis phase feeds into the grammatical parser, which in turn
may feed into the semantic and contextual analysis modules, thus
making it difficult to use information from one module to help the
processing of another. For example, contextual information may come
in handy even at the lexical level. These models are heavily
language-dependent, and porting an application from one language to
another requires going through all the development steps again.
Statistical Methods
Statistical methods are usually too inaccurate to be used independently
of some kind of language model. These methods try to statistically
model and predict word occurrences and depend heavily on the
availability of a large corpus. It is difficult to tune the many
thresholds and parameters of these methods, and they usually require
tweaking by a human expert. These methods are based on the behavior
of a sampling of a target audience and have problems adapting to
individual users or different target audiences.
Frame-based Systems
For complex systems, both Linguistic and Statistical systems sometimes
use semantic frames to limit the scope of probable input and simplify
the models used for each frame. This simplification becomes more
important when the input modality is at risk of losing accuracy,
such as in systems that use speech recognition as their main input
modality. The main problem here is to help navigate the user through
these frames. This is usually accomplished by introducing interactions
and dialogs that, in many cases, reduce the system to a semi-menu-based
interface.