Multimodality is a powerful concept for dealing with dialogue cohesion in a human-computer natural language (NL)-centered system. This work is a modest step toward more effective exploitation of the potentially large bandwidth of communication that such systems afford. The relations between exploration, navigation, and NL-based communication are discussed in general and with reference to two prototypes. Light cognitive-load feedback and direct manipulation are proposed so that user and system can cooperate in mutually establishing the structure of the ongoing dialogue. The main points are: (i) use of an appropriate dialogue structure to constrain inference in the anaphora resolution process; (ii) use of a graphical representation of that structure, to limit the problem of opacity; (iii) allowance for direct manipulation of this representation, to avoid the necessity of operating linguistically at the metalevel. The context of the work is NL-centered multimodal information access systems, in which the basic entities are pairs (most commonly question and answer). A dialogue model is provided by a modified version of the centering model; it is both sufficiently simple to be displayed intuitively on screen and sufficiently powerful to give accurate results. An extension of the discourse model, oriented to the treatment of deixis, is also proposed. Finally, steps toward an overall approach to the integration of navigational and mediated aspects of interaction are discussed.
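The abstract does not spell out the modified centering model itself, but the classic centering machinery it builds on (backward-looking center Cb, forward-looking centers Cf, and transition types) can be sketched as follows. This is an illustrative background sketch, not the paper's algorithm; the entity names and the assumption that Cf lists are ranked by grammatical role are hypothetical.

```python
# Illustrative sketch of classic centering transitions (Grosz, Joshi,
# Weinstein style). NOT the paper's modified model: entity names and
# the salience ranking of the Cf lists are assumptions for this example.

def backward_center(prev_cf, cur_entities):
    """Cb(Un): the highest-ranked entity of Cf(Un-1) realized in Un."""
    for e in prev_cf:                # prev_cf is ordered by salience
        if e in cur_entities:
            return e
    return None                      # Cb undefined (e.g. topic break)

def classify_transition(prev_cb, prev_cf, cur_cf):
    """Return (Cb(Un), transition type) between consecutive utterances."""
    cb = backward_center(prev_cf, cur_cf)
    cp = cur_cf[0] if cur_cf else None   # preferred center Cp(Un)
    if cb is None:
        return cb, "NO-CB"
    if cb == cp:
        # Cb continues as the most salient entity
        return cb, "CONTINUE" if prev_cb in (None, cb) else "SMOOTH-SHIFT"
    return cb, "RETAIN" if prev_cb in (None, cb) else "ROUGH-SHIFT"

# "John saw Mary. He waved to her."  -> "He" keeps John as center.
cb, trans = classify_transition(None, ["John", "Mary"], ["John", "Mary"])
print(cb, trans)   # John CONTINUE
```

A resolver can then prefer antecedents that yield cheaper transitions (CONTINUE over RETAIN over the shifts), which is one way a dialogue structure can constrain inference in anaphora resolution, as point (i) of the abstract suggests.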