Research Challenges.

Knowledge Graph Representation in GIOs. The goal is to represent open-domain information for any information need. Current knowledge graph schemas limit the kinds of information that can be preserved. ▇▇▇▇▇▇▇▇▇▇▇ et al. found that many KG schemas are inappropriate for open information needs. OpenIE does not limit the schema, but it extracts only low-level (sub-sentence) information. In contrast, semi-structured knowledge graphs such as DBpedia offer a large amount of untyped relation information that currently cannot be utilized. A challenging question is how best to construct and represent knowledge graphs so that they are maximally useful for open-domain information retrieval tasks. This requires new approaches for the representation of knowledge graphs, the acquisition of knowledge graphs from raw sources, and the alignment of knowledge graph elements and text. Such a new representation in turn requires new approaches for indexing and retrieving relevant knowledge graph elements.

Adversarial GIOs. Not all GIOs are derived from trustworthy information. Some actors in the information ecosystem try to manipulate the economics or attention within the ecosystem. It is impossible to identify “fake” information in objects without good provenance. To gain the user’s trust, it is important to avoid bias in the representation, which can come from bias in the underlying resources or from the generation algorithm itself. To address the former, the GIO framework enables provenance tracing to raw sources. Additionally, contradictions of information units with respect to a larger knowledge base of accepted facts need to be identified. Because accepted facts differ across political and religious beliefs, such a knowledge base needs to be organized according to a range of these beliefs; otherwise it would itself contain contradictions. The research question is how to organize such a knowledge base and how to align it with harvested information units.
Finally, approaches for reasoning within a knowledge base of contradicting beliefs need to be developed. Equally important is quantifying bias that originates from machine learning algorithms, which may amplify existing bias.

Merging of Heterogeneous GIOs. To present pleasant responses, it is important to detect redundancy and to merge units of information such as sentences, images, paragraphs, and knowledge graph items. This includes detecting when two sentences state the same message (i.e., entailment). Surface identity alone is not sufficient, however: the sentence “the prime minister visited Paris” from a document about ▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇ and the identical sentence “the prime minister visited Paris” from a document about ▇▇▇▇ ▇▇▇▇▇ should not be conflated. Even more challenging is the detection of information units that are only vaguely related (according to a relation that is relevant for the information need). The availability of such approaches would allow for structuring and organizing content. Provenance references of information units and their textual context can potentially help the integration of information units. Record merging in databases is achieved by counting agreements versus disagreements. The research challenge is to perform such a merge in a multi-modal and open-domain setting.

Resource Location Across Turn-based Conversational Information Seeking. In multi-turn interactions, people engage with a GIO as a response. A user asks a question and is presented with a response that is generated from multiple parts. It is likely that the user would like to interact with one part of the response. For example, the user may ask a follow-up question about one part or may reconsider a part of an earlier response at a later time. The research challenge is to provide a representation of multi-part answers and an appropriate presentation so that the user can refer to a part. This is even more challenging for voice interactions, for which new kinds of anaphora resolution need to be developed.
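
To make the idea of a referable multi-part answer concrete, the following is a minimal sketch and not a proposed solution. It assumes that each part of a response carries a stable identifier and its provenance, and it uses a naive ordinal-word lookup as a stand-in for the much harder anaphora resolution problem. All names (ResponsePart, MultiPartResponse, resolve_reference, the identifiers such as "t1-p2" and "doc-17") are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical, deliberately tiny vocabulary of referring words.
ORDINALS = {"first": 0, "second": 1, "third": 2, "last": -1}


@dataclass
class ResponsePart:
    part_id: str                                       # stable handle for later turns
    text: str                                          # content shown or spoken to the user
    sources: List[str] = field(default_factory=list)   # provenance of this part


@dataclass
class MultiPartResponse:
    turn: int
    parts: List[ResponsePart]


def resolve_reference(history: List[MultiPartResponse],
                      utterance: str) -> Optional[ResponsePart]:
    """Return the part of the latest response named by an ordinal word, if any."""
    if not history:
        return None
    latest = history[-1]
    for token in utterance.lower().split():
        index = ORDINALS.get(token.strip(".,?!"))
        # Accept the index only if the latest response has that many parts.
        if index is not None and -len(latest.parts) <= index < len(latest.parts):
            return latest.parts[index]
    return None


history = [MultiPartResponse(turn=1, parts=[
    ResponsePart("t1-p1", "Summary of the event.", ["doc-17"]),
    ResponsePart("t1-p2", "Timeline of related events.", ["doc-17", "doc-42"]),
])]
follow_up = resolve_reference(history, "Tell me more about the second part.")
```

In this sketch the follow-up resolves to the part with identifier "t1-p2", whose sources remain attached, so a later turn can both address the part and trace it back to raw material. References such as "that one" or "the bit about Paris" are outside what this lookup can handle.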
Resource location is difficult even in point-and-click presentations, especially when the answer arises from summarization of different information units. We suspect that this requires aligning information units into groups that the user intuitively interprets as one concept. This research is related to previous approaches of identifying information “nuggets” as an input to summarization. The challenge is to identify nuggets without human involvement, by training algorithms that identify reusable informative components that make for useful GIO information units in different contexts.

Deriving Explanations from GIOs. The rationale of a generated response (with respect to the information need) needs to be explainable to the user. In search snippet generation, such explanations are typically identified through high-density regions of keyword matches. For complex generated information responses this will not be sufficient. We envision that such an explanation entails two parts: 1) explaining how the given information need was interpreted, and 2) explaining every part of the generated response. For example, in a system that jointly reasons about relevant entities and relevant text, the system may be asked why a particular entity is relevant. While showing the relevant text related to this entity may be a first approach, one can imagine that a user would rather hear a direct explanation such as “This entity is relevant because ... ”. In order to give such explanations, a system must be able to understand the concepts contained in the text and their relation to the information need. The research challenge is how to understand both the information need and the response text on a conceptual level that would allow such explanations.

Context and Personalization. Any information-seeking behavior depends on context: the user’s prior knowledge, the task to be accomplished, and the previous interaction.
The GIO framework allows for modeling, storing, and considering any user context that is available. Although user context has been studied for many decades, it is still open which representation of the user’s context is optimal for which task, and how to derive representations that are versatile across different retrieval tasks at once. Even more important is modeling shifts in the user’s context and interest over time. These shifts affect the selection of information units, the appropriate generation of the response object, and the effectiveness of the chosen presentation.
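
As one illustration of what modeling a shift in interest over time could look like, the sketch below keeps a per-topic weight that decays exponentially, so recent interactions outweigh older ones. This is an assumption made for illustration: the text does not prescribe exponential decay or any other method, and the class name DecayingInterestProfile, the half_life parameter, and the topic labels are all hypothetical.

```python
import math
from collections import defaultdict


class DecayingInterestProfile:
    """Per-topic interest weights that halve every `half_life` time units."""

    def __init__(self, half_life: float):
        self.decay_rate = math.log(2) / half_life
        self.weights = defaultdict(float)
        self.last_time = 0.0

    def _decay_to(self, time: float) -> None:
        # Shrink all weights by the time elapsed since the last update;
        # timestamps earlier than the last update are treated as "no time passed".
        elapsed = max(0.0, time - self.last_time)
        factor = math.exp(-self.decay_rate * elapsed)
        for topic in self.weights:
            self.weights[topic] *= factor
        self.last_time = max(self.last_time, time)

    def observe(self, topic: str, time: float, strength: float = 1.0) -> None:
        """Record an interaction with `topic` at `time`."""
        self._decay_to(time)
        self.weights[topic] += strength

    def top(self, time: float, k: int = 1) -> list:
        """Return the k topics with the highest decayed weight at `time`."""
        self._decay_to(time)
        return sorted(self.weights, key=self.weights.get, reverse=True)[:k]


profile = DecayingInterestProfile(half_life=10.0)
profile.observe("cooking", time=0.0)
profile.observe("cooking", time=1.0)
profile.observe("travel", time=30.0)
current_interest = profile.top(time=30.0)
```

Here two early interactions with one topic are outweighed by a single recent interaction with another, which is the kind of shift the paragraph above refers to. A single scalar weight per topic is a strong simplification; it does not capture the task or prior-knowledge aspects of context.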