Relational and XML Data Exchange (Synthesis Lectures on Data Management)

Data exchange is the problem of finding an instance of a target schema, given an instance of a source schema and a specification of the relationship between the source and the target. Such a target instance should correctly represent information from the source instance under the constraints imposed by the target schema, and it should allow one to evaluate queries on the target instance in a way that is semantically consistent with the source data. Data exchange is an old problem that re-emerged as an active research topic recently, due to the increased need for exchange of data in various formats, often in e-business applications. In this lecture, we give an overview of the basic concepts of data exchange in both relational and XML contexts. We give examples of data exchange problems, and we introduce the main tasks that need to addressed. We then discuss relational data exchange, concentrating on issues such as relational schema mappings, materializing target instances (including canonical solutions and cores), query answering, and query rewriting. After that, we discuss metadata management, i.e., handling schema mappings themselves. We pay particular attention to operations on schema mappings, such as composition and inverse. Finally, we describe both data exchange and metadata management in the context of XML. We use mappings based on transforming tree patterns, and we show that they lead to a host of new problems that did not arise in the relational case, but they need to be addressed for XML. These include consistency issues for mappings and schemas, as well as imposing tighter restrictions on mappings and queries to achieve tractable query answering in data exchange. Table of Contents: Overview / Relational Mappings and Data Exchange / Metadata Management / XML Mappings and Data Exchange