A Generic and Flexible Framework for Selecting Correspondences in Matching and Alignment Problems

International audience The Web 2.0 and the inexpensive cost of storage have pushed towards an exponential growth in the volume of collected and produced data. However, the integration of distributed and heterogeneous data sources has become the bottleneck for many applications, and it therefore stil...

Bibliographic Details
Main Author: Duchateau, Fabien
Other Authors: Base de Données (BD), Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL), Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS), Markus Helfert, Chiara Francalanci, Joaquim Filipe eds
Format: Conference Object
Language: English
Published: HAL CCSD 2013
Subjects:
Online Access: https://hal.science/hal-01155475
https://hal.science/hal-01155475/document
https://hal.science/hal-01155475/file/duchateau-data13.pdf
id ftunivlyon2:oai:HAL:hal-01155475v1
record_format openpolar
spelling ftunivlyon2:oai:HAL:hal-01155475v1 2023-07-30T04:04:25+02:00 A Generic and Flexible Framework for Selecting Correspondences in Matching and Alignment Problems Duchateau, Fabien Base de Données (BD) Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS) Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL) Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL) Université de Lyon-Institut National des Sciences Appliquées de Lyon (INSA Lyon) Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL) Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS) Markus Helfert Chiara Francalanci Joaquim Filipe eds Reykjavik, Iceland 2013-07-29 https://hal.science/hal-01155475 https://hal.science/hal-01155475/document https://hal.science/hal-01155475/file/duchateau-data13.pdf en eng HAL CCSD SciTePress hal-01155475 https://hal.science/hal-01155475 https://hal.science/hal-01155475/document https://hal.science/hal-01155475/file/duchateau-data13.pdf http://creativecommons.org/licenses/by/ info:eu-repo/semantics/OpenAccess DATA 2013 2nd International Conference on Data Management Technologies and Applications (DATA) https://hal.science/hal-01155475 2nd International Conference on Data Management Technologies and Applications (DATA), Jul 2013, Reykjavik, Iceland. 
pp.129-137 Data Integration Schema Matching Ontology Alignment Entity Resolution Entity Matching Selection of Correspondences [INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB] info:eu-repo/semantics/conferenceObject Conference papers 2013 ftunivlyon2 2023-07-11T23:02:35Z International audience The Web 2.0 and the inexpensive cost of storage have pushed towards an exponential growth in the volume of collected and produced data. However, the integration of distributed and heterogeneous data sources has become the bottleneck for many applications, and it therefore still largely relies on manual tasks. One of these tasks, named matching or alignment, is the discovery of correspondences, i.e., semantically equivalent elements in different data sources. Most approaches that attempt to solve this challenge face the issue of deciding whether a pair of elements is a correspondence or not, given the similarity value(s) computed for this pair. In this paper, we propose a generic and flexible framework for selecting the correspondences by relying on the discriminative similarity values for a pair. Experiments on a public dataset demonstrate an improvement in quality, as well as robustness when adding new similarity measures without user intervention for tuning. Conference Object Iceland Portail HAL de l'Université Lumière Lyon 2
institution Open Polar
collection Portail HAL de l'Université Lumière Lyon 2
op_collection_id ftunivlyon2
language English
topic Data Integration
Schema Matching
Ontology Alignment
Entity Resolution
Entity Matching
Selection of Correspondences
[INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB]
spellingShingle Data Integration
Schema Matching
Ontology Alignment
Entity Resolution
Entity Matching
Selection of Correspondences
[INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB]
Duchateau, Fabien
A Generic and Flexible Framework for Selecting Correspondences in Matching and Alignment Problems
topic_facet Data Integration
Schema Matching
Ontology Alignment
Entity Resolution
Entity Matching
Selection of Correspondences
[INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB]
description International audience The Web 2.0 and the inexpensive cost of storage have pushed towards an exponential growth in the volume of collected and produced data. However, the integration of distributed and heterogeneous data sources has become the bottleneck for many applications, and it therefore still largely relies on manual tasks. One of these tasks, named matching or alignment, is the discovery of correspondences, i.e., semantically equivalent elements in different data sources. Most approaches that attempt to solve this challenge face the issue of deciding whether a pair of elements is a correspondence or not, given the similarity value(s) computed for this pair. In this paper, we propose a generic and flexible framework for selecting the correspondences by relying on the discriminative similarity values for a pair. Experiments on a public dataset demonstrate an improvement in quality, as well as robustness when adding new similarity measures without user intervention for tuning.
author2 Base de Données (BD)
Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS)
Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL)
Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL)
Université de Lyon-Institut National des Sciences Appliquées de Lyon (INSA Lyon)
Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL)
Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)
Markus Helfert
Chiara Francalanci
Joaquim Filipe eds
format Conference Object
author Duchateau, Fabien
author_facet Duchateau, Fabien
author_sort Duchateau, Fabien
title A Generic and Flexible Framework for Selecting Correspondences in Matching and Alignment Problems
title_short A Generic and Flexible Framework for Selecting Correspondences in Matching and Alignment Problems
title_full A Generic and Flexible Framework for Selecting Correspondences in Matching and Alignment Problems
title_fullStr A Generic and Flexible Framework for Selecting Correspondences in Matching and Alignment Problems
title_full_unstemmed A Generic and Flexible Framework for Selecting Correspondences in Matching and Alignment Problems
title_sort generic and flexible framework for selecting correspondences in matching and alignment problems
publisher HAL CCSD
publishDate 2013
url https://hal.science/hal-01155475
https://hal.science/hal-01155475/document
https://hal.science/hal-01155475/file/duchateau-data13.pdf
op_coverage Reykjavik, Iceland
genre Iceland
genre_facet Iceland
op_source DATA 2013
2nd International Conference on Data Management Technologies and Applications (DATA)
https://hal.science/hal-01155475
2nd International Conference on Data Management Technologies and Applications (DATA), Jul 2013, Reykjavik, Iceland. pp.129-137
op_relation hal-01155475
https://hal.science/hal-01155475
https://hal.science/hal-01155475/document
https://hal.science/hal-01155475/file/duchateau-data13.pdf
op_rights http://creativecommons.org/licenses/by/
info:eu-repo/semantics/OpenAccess
_version_ 1772815833857261568