
Research Challenges. Conversational IR systems can be seen as a federation of agents or subsystems, but they will also be inherently complex systems, with models that will reach beyond the boundaries of individual components. With that will arise challenges such as how to bootstrap such systems with reasonable effort, how to ensure they are responsive as a whole, how to perform component-wise diagnosis, and at what level to consider their robustness. Ethical challenges arising across the field of information retrieval, such as trust in information, biases, and transparency, will likely be exacerbated by the inherent narrowing of the communication channel between such systems and their users.
Research Challenges. These general research questions manifest themselves along the entire information retrieval “stack” and motivate a broad range of concrete research directions to be investigated: Does the desire to present fair answers to users necessitate different content acquisition methods? If traceability is essential, how can we make sure that basic normalization steps — such as content filtering, named entity normalization, etc. — do not obfuscate it? How can we give assurances in terms of fairness for novel retrieval paradigms (e.g., neural retrieval models being trained and evaluated on historic relevance labels obtained from pooling mainly exact term-matching systems)? How should we design an information retrieval system’s logging and experimental environment in a way that guarantees fair, confidential, and accurate offline and online evaluation and learning? Can exploration policies be designed such that they comply with guarantees on performance? How can system changes learned online be made explainable?

Indexing structures and practices need to be designed or revisited in terms of their ability to accommodate downstream fairness and transparency operations. This may pose novel requirements on compression and sharding schemes as fair retrieval systems begin requesting different aggregate statistics that go beyond what is currently required for ranking purposes (a small example of such a statistic is sketched at the end of this subsection).

Interface design is faced with the challenge of presenting the newly generated types of information (such as provenance, explanations, or audit material) in a useful manner while retaining effectiveness towards their original purpose.

Retrieval models are becoming more complex (e.g., deep neural networks for IR) and will require more sophisticated mechanisms for explainability and traceability. Models, especially in conversational interaction contexts, will need to be “interrogable”, i.e., make effective use of users’ queries about explainability (e.g., “why is this search result returned?”).

Recommender systems have a historic demand for explainability geared towards boosting adoption and conversion rates of recommendations. In addition to these primarily economic considerations, transparent and accountable recommender systems need to advance further and ensure fair and auditable recommendations that are robust to changes in product portfolio or user context. Such interventions may take a considerably different shape than those designed for explaining the results of ranked retrieval syst...
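To make the indexing requirement above more concrete, the following is a minimal sketch (not taken from any system described here) of one aggregate statistic a fairness-aware ranker might request: the position-discounted exposure each provider group receives in a ranked list. The group labels, document identifiers, and the logarithmic discount are illustrative assumptions.

```python
import math
from collections import defaultdict

def group_exposure(ranking, doc_group):
    """Total position-discounted exposure per group in a single ranked list.
    Discount: 1 / log2(rank + 1), an assumption borrowed from DCG-style gains."""
    exposure = defaultdict(float)
    for rank, doc_id in enumerate(ranking, start=1):
        exposure[doc_group[doc_id]] += 1.0 / math.log2(rank + 1)
    total = sum(exposure.values())
    # Normalize so exposures sum to 1, making lists of different lengths comparable.
    return {group: value / total for group, value in exposure.items()}

# Illustrative data: documents d1..d4 belonging to two provider groups.
doc_group = {"d1": "A", "d2": "A", "d3": "B", "d4": "B"}
print(group_exposure(["d1", "d2", "d3", "d4"], doc_group))
print(group_exposure(["d3", "d1", "d4", "d2"], doc_group))
```

Serving such group-level sums efficiently at query time, rather than recomputing them for every result list, is exactly the kind of new demand on index and sharding design that the paragraph above anticipates.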
Research Challenges. Challenges are faced in each of the areas that the proposed research covers. The proposed research touches on the collection over which the search engine operates, the user’s interaction with the search system, the user’s cognitive processes, and the evaluation of the changes to the user’s knowledge state or performance on tasks.

At the level of the collection, we are concerned with the mix of information that is available. For large-scale collections, such as the web, it is very difficult to understand the amount of material on a given topic, and thus it is hard to know what the existing biases are in the collection. For example, we might be interested in measuring the accuracy of decisions that users make after using a search engine. Collections of interest will contain a mix of correct and incorrect information, but the scale of the collection will make it difficult to understand the amount of correct and incorrect information in the collection a priori, before the user’s search session (a simple sampling-based illustration is given at the end of this subsection).

The field of IR is still in its infancy with respect to understanding user interaction and user cognitive processes. For us to be able to design systems that lead users to their desired knowledge state or decision, we will need to better understand how their cognitive processes affect their interaction with the system and how the stream of information that they consume changes their mental state. A challenge here will be a lack of expertise in cognitive science and psychology (how people learn, how people make decisions, biases). Progress in this area will likely require collaboration outside of IR and require input from and engagement of other communities, including cognitive science, human-computer interaction, psychology, behavioural economics, and application- or domain-specific communities (e.g., the intelligence community, the clinical community).

The envisioned systems may require radical changes to aspects of user interfaces. Uptake of new UI solutions, however, is often difficult and places an extra onus on users, thus creating a high barrier to entry for the proposed new systems.

Finally, evaluation ranges from the simple to the complex. We are interested both in simple measures such as decision accuracy, and in complex measures such as increases in curiosity. Evaluation is envisioned to embrace larger aspects of the user-system interaction than just the information-seeking phase, e.g., evaluation of the decisions users take given the information the systems provided. Given that almost a...
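As a small illustration of the collection-level challenge, the sketch below estimates the proportion of incorrect material on a topic by judging a random sample and attaching a normal-approximation confidence interval. The collection, the judging function, and the sample size are hypothetical placeholders; the point is only that the a priori mix can be bounded by sampling rather than measured exactly at web scale.

```python
import math
import random

def estimate_incorrect_fraction(collection, judge, sample_size=200, z=1.96, seed=0):
    """Estimate the fraction of incorrect documents on a topic from a random
    sample, with a normal-approximation 95% confidence interval.
    `judge(doc)` is a placeholder for a human or automatic correctness label."""
    rng = random.Random(seed)
    sample = rng.sample(collection, min(sample_size, len(collection)))
    p = sum(1 for doc in sample if not judge(doc)) / len(sample)
    half_width = z * math.sqrt(p * (1 - p) / len(sample))
    return p, (max(0.0, p - half_width), min(1.0, p + half_width))

# Illustrative use: a synthetic collection in which 30% of documents are incorrect.
docs = [{"id": i, "correct": (i % 10) >= 3} for i in range(10_000)]
print(estimate_incorrect_fraction(docs, judge=lambda d: d["correct"]))
```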
Research Challenges. Some may think that online evaluation is off limits to academia because of a need to ‘get’ live users. However, TREC, NTCIR, and CLEF have explored ways of making such a provision. In addition, smaller-scale evaluation in laboratory or live-laboratory settings, or in situ, could lead to advances in evaluation that take account of rich contextual and individual data. We believe that it may also be possible to simulate user bases with recordings of user interaction in conjunction with counterfactual logging. Such collections may include logs, crowd-sourced labels, and user engagement observations. Such data may be collected by means of user-as-a-service components that can provide IR systems with on-demand users who can interact with the system (e.g., given a persona description) to generate logs and the context in which online evaluations can be carried out.
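A minimal sketch of the counterfactual idea mentioned above is given below: the standard inverse propensity scoring (IPS) estimator evaluates a candidate policy offline from interactions logged by a randomized production system. The log fields and the toy policies are invented for illustration.

```python
def ips_estimate(logs, target_policy):
    """Inverse propensity scoring: estimate the expected reward (e.g., a click)
    that `target_policy` would obtain, using interactions logged under a
    different production policy that recorded its action propensities.

    Each log entry is assumed to contain:
      context    - query/user features
      action     - the result (or ranking) that was shown
      reward     - observed feedback, e.g. 1 for a click
      propensity - probability with which the logger chose that action
    """
    total = 0.0
    for entry in logs:
        # Probability that the candidate policy would have shown the logged action.
        p_new = target_policy(entry["context"], entry["action"])
        total += entry["reward"] * p_new / entry["propensity"]
    return total / len(logs)

# Illustrative logs from a randomized logger that showed one of two results.
logs = [
    {"context": "q1", "action": "docA", "reward": 1, "propensity": 0.5},
    {"context": "q1", "action": "docB", "reward": 0, "propensity": 0.5},
    {"context": "q2", "action": "docA", "reward": 0, "propensity": 0.5},
    {"context": "q2", "action": "docB", "reward": 1, "propensity": 0.5},
]

# A deterministic candidate policy that always shows docB.
always_b = lambda context, action: 1.0 if action == "docB" else 0.0
print(ips_estimate(logs, always_b))  # unbiased estimate of docB's click rate
```

The same logged propensities are what would let recorded or simulated user bases stand in for live traffic in both offline evaluation and off-policy learning.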
Research Challenges. There have been initial attempts to build explanatory models of performance based on linear models validated through ANOVA, but they are still far from satisfactory. Past approaches typically relied on the generation of all possible combinations of the components under examination, leading to an explosion in the number of cases to consider. Therefore, we need to develop greedy approaches that avoid such a combinatorial explosion. Moreover, the assumptions underlying IR models and methods, datasets, tasks, and metrics should be identified and explicitly formulated, in order to determine how much we are departing from them in a specific application and to leverage this knowledge to more precisely explain observed performance.

We also need a better understanding of evaluation metrics. Not all metrics may be equally good at detecting the effect of different components, and we need to be able to predict which metric best fits a given component and interaction. Sets of more specialized metrics representing different user standpoints should be employed, and the relationships between system-oriented and user-/task-oriented evaluation measures (e.g., satisfaction, usefulness) should be determined.

A related research challenge is how to exploit richer explanations of performance to design better and more reusable experimental collections in which the influence and bias of undesired and confounding factors is kept under control. Most importantly, we need to determine the features of datasets, systems, contexts, and tasks that affect the performance of a system. These features, together with the developed explanatory performance models, can eventually be exploited to train predictive models able to anticipate the performance of IR systems in new and different operational conditions.
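The kind of explanatory model discussed here can be illustrated with a toy additive (main-effects) model that attributes observed effectiveness to component choices, fitted by least squares. The components, runs, and scores below are fabricated; a real study would use ANOVA with interaction terms over many more factors, and a greedy scheme would add factors one at a time instead of enumerating the full grid.

```python
import numpy as np

# Fabricated grid-of-runs data: (stemmer, ranking model, MAP score).
runs = [
    ("none",    "bm25", 0.21), ("none",    "lm", 0.23),
    ("porter",  "bm25", 0.25), ("porter",  "lm", 0.27),
    ("krovetz", "bm25", 0.24), ("krovetz", "lm", 0.26),
]

stemmers = sorted({r[0] for r in runs})
models = sorted({r[1] for r in runs})

# One-hot design matrix for an additive main-effects model:
# score ~ intercept + stemmer effect + ranking-model effect.
X = np.array([
    [1.0]
    + [1.0 if s == r[0] else 0.0 for s in stemmers[1:]]
    + [1.0 if m == r[1] else 0.0 for m in models[1:]]
    for r in runs
])
y = np.array([r[2] for r in runs])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
names = ["intercept"] + [f"stemmer={s}" for s in stemmers[1:]] + [f"model={m}" for m in models[1:]]
for name, value in zip(names, coef):
    print(f"{name:>16}: {value:+.3f}")
```

The fitted coefficients play the role of the explanatory "features" mentioned above; the same model, applied to a new combination of components, acts as a crude predictor of its performance.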
Research Challenges. Existing high baselines: Over the long history of IR, we have developed models and approaches for ad hoc and other types of search. These models are based on human understanding of the search tasks, the languages, and the ways that users formulate queries. The models have been fine-tuned using test collections. The area has a set of models that work fairly well across different types of collections, search tasks, and queries. Compared to other areas such as image understanding, information retrieval has very high baselines. A key challenge in developing new models is to produce competitive or superior performance with respect to these baselines. In the learning setting, a great challenge is to use machine learning methods to automatically capture important features in representations, which have been manually engineered in traditional models. While great potential has been demonstrated in other areas such as computer vision, the advantage of automatically learned representations for information retrieval has yet to be confirmed in practice. Current representation learning methods offer a great opportunity for information retrieval systems to create representations for documents, queries, users, etc. in an end-to-end manner. The resulting representations are built to fit a specific task. Potentially, they could be better adapted to the search task than a manually designed representation. However, training such representations will require a large amount of training data.

Low data resources: Representation learning, and supervised machine learning in general, is based heavily on labeled training data. This poses an important challenge for using this family of techniques for IR: how can we obtain a sufficient amount of training data to train an information retrieval model? Large amounts of training data usually exist only in large search engine companies, and the obstacle to making the data available to the whole research community seems difficult to overcome, at least in the short term. A grand challenge for the community is to find ways to create proxy data that can be used for representation learning for IR. Examples include the use of anchor texts, and weak supervision by a traditional model. Data-hungry learning methods have inherent limitations in many practical application areas such as IR. A related challenge is to design learning methods that require less training data. This goal has much in common with that of the machine learning ...
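The weak-supervision route mentioned above can be sketched in a few lines: a traditional lexical scorer provides pseudo-labels on unlabeled query-document pairs, and a small learned model is trained on those labels. Everything below (corpus, query, features, and the stand-in scorer) is a toy assumption; in practice the learned model would use far richer representations and millions of weakly labeled pairs.

```python
import math

# Toy corpus and query; in practice this would be a large unlabeled query log.
docs = {
    "d1": "neural ranking models for ad hoc retrieval",
    "d2": "learning to rank with weak supervision",
    "d3": "cooking recipes for pasta and pizza",
    "d4": "ad hoc retrieval with neural networks",
}
query = "neural ad hoc retrieval"

def overlap(q, d):
    """Crude lexical score standing in for BM25 as the weak labeler."""
    q_terms, d_terms = set(q.split()), set(d.split())
    return len(q_terms & d_terms) / len(q_terms)

# Step 1: weak labels from the traditional model (top half positive, rest negative).
scored = sorted(docs.items(), key=lambda kv: overlap(query, kv[1]), reverse=True)
weak_labels = {doc_id: (1.0 if i < len(scored) // 2 else 0.0)
               for i, (doc_id, _) in enumerate(scored)}

# Step 2: train a tiny logistic-regression "ranker" on two hand-picked features
# (term overlap and document length), as a stand-in for a learned representation.
def features(q, d):
    return [overlap(q, d), len(d.split()) / 10.0]

w, b, lr = [0.0, 0.0], 0.0, 0.5
for _ in range(500):
    for doc_id, text in docs.items():
        x, y = features(query, text), weak_labels[doc_id]
        p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
        g = p - y
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g

for doc_id, text in docs.items():
    score = sum(wi * xi for wi, xi in zip(w, features(query, text))) + b
    print(doc_id, round(score, 2))
```

The circularity here is deliberate: the learned model is only as good as its weak teacher on these features, and the open question is whether richer learned representations can generalize beyond the teacher.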
Research Challenges. Knowledge Graph Representation in GIOs. The goal is to represent open-domain information for any information need. Current knowledge graph schemas impose limitations on the kinds of information that can be preserved. Xxxxxxxxxxx et al. found that many KG schemas are inappropriate for open information needs. OpenIE does not limit the schema, but only low-level (sub-sentence) information is extracted. In contrast, semi-structured knowledge graphs such as DBpedia offer a large amount of untyped relation information which is currently not utilizable. A challenging question is how to best construct and represent knowledge graphs so that they are maximally useful for open-domain information retrieval tasks. This requires new approaches for the representation of knowledge graphs, the acquisition of knowledge graphs from raw sources, and the alignment of knowledge graph elements and text. This new representation in turn requires new approaches for indexing and retrieval of relevant knowledge graph elements.

Adversarial GIOs. Not all GIOs are derived from trustworthy information. Some information ecosystem actors try to manipulate the economics or attention within the ecosystem. It is impossible to identify “fake” information in objects without good provenance. To gain the user’s trust, it is important to avoid bias in the representation, which can come from bias in the underlying resources or in the generation algorithm itself. To accommodate the former, the GIO framework enables provenance tracing to raw sources. Additionally, contradictions of information units with respect to a larger knowledge base of accepted facts need to be identified. Such a knowledge base needs to be organized according to a range of political and religious beliefs, which may otherwise lead to contradictions. The research question is how to organize such a knowledge base, and how to align it with harvested information units. Finally, approaches for reasoning within a knowledge base of contradicting beliefs need to be developed. Equally important is to quantify bias originating from machine learning algorithms, which may amplify existing bias.

Merging of Heterogeneous GIOs. To present pleasant responses, it is important to detect redundancy by merging units of information such as sentences, images, paragraphs, and knowledge graph items. For example, this includes detecting when two sentences are stating the same message (i.e., entailment). For example, “the prime minister visited Paris” from a document ab...
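A minimal sketch of the redundancy-merging step is shown below: OpenIE-style triples are normalized through a hypothetical alias table standing in for entity linking, and statements that collapse to the same canonical triple are merged while their source documents are kept as provenance. Entities, relations, and documents are invented for illustration; real entailment detection would go well beyond string-level canonicalization.

```python
from collections import defaultdict

# Hypothetical alias table standing in for entity linking against a knowledge graph.
ALIASES = {
    "the prime minister": "pm_of_x",
    "prime minister of x": "pm_of_x",
    "paris": "paris",
    "the french capital": "paris",
}

def normalize(triple):
    """Map subject/object surface forms to canonical entities and lowercase the
    relation, so equivalent statements collide on the same key."""
    s, r, o = (part.lower().strip() for part in triple)
    return (ALIASES.get(s, s), r, ALIASES.get(o, o))

def merge(extractions):
    """Group statements expressing the same fact, keeping provenance so every
    merged unit can still be traced back to its source documents."""
    merged = defaultdict(list)
    for source, triple in extractions:
        merged[normalize(triple)].append(source)
    return merged

extractions = [
    ("doc1", ("The prime minister", "visited", "Paris")),
    ("doc2", ("Prime Minister of X", "visited", "the French capital")),
    ("doc3", ("Prime Minister of X", "met", "the mayor of Paris")),
]
for fact, sources in merge(extractions).items():
    print(fact, "<-", sources)
```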
Research Challenges. Several interesting research challenges continue to exist when building traditional efficient and effective IR systems (such as compression, first-stage query resolution, and so on). In multi-stage retrieval systems the complexity is substantially higher, and new areas need addressing. For example, at present we do not even know where and why these systems are slow. As mentioned above, exciting new challenges exist in the areas of conversational IR and learned data structures. While the notion of combining learning with efficient indexing is not an entirely new idea, recent advances in neural IR models have shown that learned data structures can in fact be faster, smaller, and as effective as their exact-solution counterparts. However, enforcing performance guarantees in learned data structures is still an open research problem (a minimal sketch of the underlying idea appears at the end of this subsection). Likewise, as search becomes even more interactive, new opportunities for efficient indexing and ranking are emerging. For example, virtual assistants can leverage iterations on complex information in order to improve both effectiveness and efficiency in the interaction. But how to evaluate iterative changes for interactive search tasks is a significant challenge, and very few collections currently exist to test new approaches, let alone to test the end-to-end efficiency performance of such systems.
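The learned-data-structure idea, and the role of error bounds in recovering correctness guarantees, can be illustrated with a deliberately tiny sketch: a one-segment linear model predicts where a key sits in a sorted array, and the recorded worst-case prediction error bounds a local search. This is an assumption-laden toy, not the recursive multi-stage models used in the literature.

```python
import bisect

class LearnedIndex:
    """Minimal learned-index sketch: fit a straight line from key to position in
    a sorted array, record the worst-case prediction error, and answer lookups
    with a bounded local search inside [pred - err, pred + err]."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Least-squares line through (key, position): a crude one-segment model.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys) or 1.0
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var
        self.intercept = mean_p - self.slope * mean_k
        # The maximum prediction error is what gives the correctness guarantee.
        self.err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        pred = self._predict(key)
        lo = max(0, pred - self.err)
        hi = min(len(self.keys), pred + self.err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < len(self.keys) and self.keys[i] == key else None

# Illustrative use on synthetic document identifiers.
index = LearnedIndex(range(0, 100000, 7))
print(index.lookup(21), index.lookup(22))  # position of key 21, then None
```

The guarantee comes entirely from the stored maximum error: lookups are always correct, and the open question raised above is how to keep that bound small and stable as the underlying postings change.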
Research Challenges. Question answering is defined as a field where the main challenge consists of providing automatic answers to questions posed by humans in natural language. In order to develop a system capable of performing such actions, the system first has to understand the logic behind reasoning about, understanding, and extracting information from text.

Open-domain factoid question answering concerns questions about well-known and concise facts. Consider the question “When was Xxxxxx Xxxxx born?”, for which the answer is August 4th, 1961. Current systems are able to provide answers to such questions using already existing knowledge graphs. Such extraction is a rather straightforward process, in which the relation is extracted first and then matched against the structure of the graph (a minimal sketch of this lookup appears at the end of this subsection). On the other hand, a large number of questions asked today on search engines, approximately 70% of them1, still require users to perform a manual search through the provided search results. Consider the question “Who led the Polish army in the Siege of Warsaw?”, for which the answer is Xxxxxxxx Xxxxx. The information that supports this query can be directly extracted from one of the Wikipedia pages2: “The siege lasted until September 28, when the Polish xxxxxxxx, commanded under General Xxxxxxxx Xxxxx, officially capitulated.” Unfortunately, lexicon-based approaches would likely fail in locating the correct sentence among the ones from the Wikipedia page. The entire article is highly correlated with the words from the query, therefore more advanced syntactic and semantic analysis is needed.

Solving more complex factoid questions is a great challenge that is often approached by designing and training statistical models. Unfortunately, the more advanced the model is, the more data it requires for training. The source of such data could be search engines with their click data. Sadly, this type of information is almost impossible for researchers to access. Therefore, a manual creation process has to be developed to generate diverse, challenging, and realistic datasets for machine learning models. Question answering remains one of the major unsolved problems in natural language processing and has already attracted a large number of researchers.

1 xxxxx://xxx.xxx-xx.xxx/blog/how-and-why-to-set-your-site-up-for-googles-rich-answers/
2 xxxxx://xx.xxxxxxxxx.xxx/wiki/Siege_of_Warsaw_(1939)

In recent times, several corpora for various tasks have been published. It is reasonable to look at how th...
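To make the knowledge-graph lookup route concrete, here is a minimal sketch of pattern-based factoid answering: the relation is identified from the question's surface form and the (entity, relation) pair is looked up in a toy graph. The triples, patterns, and entity normalization below are illustrative assumptions; production systems rely on learned semantic parsers and large graphs such as Wikidata or DBpedia.

```python
import re

# A toy knowledge graph; a real system would query a large external graph.
KG = {
    ("barack_obama", "birth_date"): "August 4, 1961",
    ("warsaw", "country"): "Poland",
}

# Hand-written question patterns mapping surface forms to KG relations.
PATTERNS = [
    (re.compile(r"when was (.+?) born\??$", re.I), "birth_date"),
    (re.compile(r"(?:in )?which country is (.+?)(?: located)?\??$", re.I), "country"),
]

def answer(question):
    """Factoid QA over a knowledge graph: extract the relation and entity from
    the question, then look the (entity, relation) pair up in the graph."""
    for pattern, relation in PATTERNS:
        match = pattern.match(question.strip())
        if match:
            entity = match.group(1).strip().lower().replace(" ", "_")
            return KG.get((entity, relation), "unknown")
    return "unsupported question type"

print(answer("When was Barack Obama born?"))
print(answer("Which country is Warsaw located?"))
```

The brittleness of such hand-written patterns is precisely why the harder, non-factoid questions discussed above push the field towards trained statistical models and the datasets needed to build them.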
Research Challenges. Looking at the graph, only the research challenges “Participatory Sensing” and “Identity Management” have not been identified in the four cases and are therefore not connected to the graph in the same manner as all other research challenges. However, during discussion with experts and reviewers, it was decided to link “Participatory Sensing” to “Big Data”, as it obviously sits on top of it and the two are very closely related. This edge is therefore coloured black in order to show the difference from the relationships that were extracted from the analysis. On the other hand, “Identity Management” remains disconnected, though this does not mean it should be removed from the Roadmap or that it is not an important element of it (see Roadmap Recommendation #2 for further comments on this).