Investigating the Image of Entities in Social Media: Dataset Design and First Results
The objective of this paper is to describe the design of a dataset that deals with the image (i.e., representation, web reputation) of various entities populating the Internet: politicians, celebrities, companies, brands, etc. Our main contribution is to build and provide an or...
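Main Authors: | Velcin, Julien; Brun, Caroline; Dormagen, Jean-Yves; Kim, Young-Min; Roux, Claude; Boyadjian, Julien; Bonnevay, Stephane; Neihouser, Marie; Sanjuan, Eric; Khouas, Leila; Peradotto, Anne; Molina, Alejandro |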
---|---|
Format: | Conference Object |
Language: | English |
Published: | HAL CCSD 2014 |
Subjects: | |
Online Access: | https://hal.science/hal-02052420 https://hal.science/hal-02052420/document https://hal.science/hal-02052420/file/LREC14_FINAL_VELCIN.pdf |
id |
ftunivlyon3:oai:HAL:hal-02052420v1 |
---|---|
record_format |
openpolar |
spelling |
ftunivlyon3:oai:HAL:hal-02052420v1 2023-06-18T03:41:23+02:00 Investigating the Image of Entities in Social Media: Dataset Design and First Results Velcin, Julien Brun, Caroline Dormagen, Jean-Yves Kim, Young-Min Roux, Claude Boyadjian, Julien Bonnevay, Stephane Neihouser, Marie Sanjuan, Eric Khouas, Leila Peradotto, Anne Molina, Alejandro Entrepôts, Représentation et Ingénierie des Connaissances (ERIC) Université Lumière - Lyon 2 (UL2)-Université Claude Bernard Lyon 1 (UCBL) Université de Lyon-Université de Lyon Penn Image Computing & Science Lab Philadelphia (PICSL) University of Pennsylvania Laboratoire Informatique d'Avignon (LIA) Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI Laboratoire d'Electrochimie et de Physico-chimie des Matériaux et des Interfaces (LEPMI ) Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut National Polytechnique de Grenoble (INPG)-Institut de Chimie du CNRS (INC)-Université Savoie Mont Blanc (USMB Université de Savoie Université de Chambéry )-Centre National de la Recherche Scientifique (CNRS) Sciences Po Lille - Institut d'études politiques de Lille (IEP Lille) Equipe de Recherche en Ingénierie des Connaissances (ERIC) Université Lumière - Lyon 2 (UL2) Reykjavik, Iceland 2014-05-26 https://hal.science/hal-02052420 https://hal.science/hal-02052420/document https://hal.science/hal-02052420/file/LREC14_FINAL_VELCIN.pdf en eng HAL CCSD hal-02052420 https://hal.science/hal-02052420 https://hal.science/hal-02052420/document https://hal.science/hal-02052420/file/LREC14_FINAL_VELCIN.pdf info:eu-repo/semantics/OpenAccess 9th International Conference on Language Resources and Evaluation https://hal.science/hal-02052420 9th International Conference on Language Resources and Evaluation, May 2014, Reykjavik, Iceland aspect-oriented opinion mining political data French corpus [INFO.INFO-WB]Computer Science [cs]/Web 
[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR] [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] [STAT.ML]Statistics [stat]/Machine Learning [stat.ML] info:eu-repo/semantics/conferenceObject Conference papers 2014 ftunivlyon3 2023-06-06T22:51:21Z International audience The objective of this paper is to describe the design of a dataset that deals with the image (i.e., representation, web reputation) of various entities populating the Internet: politicians, celebrities, companies, brands, etc. Our main contribution is to build and provide an original annotated French dataset. This dataset consists of 11,527 manually annotated tweets expressing the opinion on specific facets (e.g., ethics, communication, economic project) describing two French politicians over time. We believe that other researchers might benefit from this experience, since designing and implementing such a dataset has proven quite an interesting challenge. This design comprises different processes such as data selection, formal definition and instantiation of an image. We have set up a full open-source annotation platform. In addition to the dataset design, we present the first results that we obtained by applying clustering methods to the annotated dataset in order to extract the entity images. Conference Object Iceland Université Jean Moulin - Lyon 3: Publications scientifiques (HAL) |
institution |
Open Polar |
collection |
Université Jean Moulin - Lyon 3: Publications scientifiques (HAL) |
op_collection_id |
ftunivlyon3 |
language |
English |
topic |
aspect-oriented opinion mining political data French corpus [INFO.INFO-WB]Computer Science [cs]/Web [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR] [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] [STAT.ML]Statistics [stat]/Machine Learning [stat.ML] |
spellingShingle |
aspect-oriented opinion mining political data French corpus [INFO.INFO-WB]Computer Science [cs]/Web [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR] [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] [STAT.ML]Statistics [stat]/Machine Learning [stat.ML] Velcin, Julien Brun, Caroline Dormagen, Jean-Yves Kim, Young-Min Roux, Claude Boyadjian, Julien Bonnevay, Stephane Neihouser, Marie Sanjuan, Eric Khouas, Leila Peradotto, Anne Molina, Alejandro Investigating the Image of Entities in Social Media: Dataset Design and First Results |
topic_facet |
aspect-oriented opinion mining political data French corpus [INFO.INFO-WB]Computer Science [cs]/Web [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR] [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] [STAT.ML]Statistics [stat]/Machine Learning [stat.ML] |
description |
The objective of this paper is to describe the design of a dataset that deals with the image (i.e., representation, web reputation) of various entities populating the Internet: politicians, celebrities, companies, brands, etc. Our main contribution is to build and provide an original annotated French dataset. This dataset consists of 11,527 manually annotated tweets expressing the opinion on specific facets (e.g., ethics, communication, economic project) describing two French politicians over time. We believe that other researchers might benefit from this experience, since designing and implementing such a dataset has proven quite an interesting challenge. This design comprises different processes such as data selection, formal definition and instantiation of an image. We have set up a full open-source annotation platform. In addition to the dataset design, we present the first results that we obtained by applying clustering methods to the annotated dataset in order to extract the entity images. |
author2 |
Entrepôts, Représentation et Ingénierie des Connaissances (ERIC) Université Lumière - Lyon 2 (UL2)-Université Claude Bernard Lyon 1 (UCBL) Université de Lyon-Université de Lyon Penn Image Computing & Science Lab Philadelphia (PICSL) University of Pennsylvania Laboratoire Informatique d'Avignon (LIA) Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI Laboratoire d'Electrochimie et de Physico-chimie des Matériaux et des Interfaces (LEPMI ) Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut National Polytechnique de Grenoble (INPG)-Institut de Chimie du CNRS (INC)-Université Savoie Mont Blanc (USMB Université de Savoie Université de Chambéry )-Centre National de la Recherche Scientifique (CNRS) Sciences Po Lille - Institut d'études politiques de Lille (IEP Lille) Equipe de Recherche en Ingénierie des Connaissances (ERIC) Université Lumière - Lyon 2 (UL2) |
format |
Conference Object |
author |
Velcin, Julien Brun, Caroline Dormagen, Jean-Yves Kim, Young-Min Roux, Claude Boyadjian, Julien Bonnevay, Stephane Neihouser, Marie Sanjuan, Eric Khouas, Leila Peradotto, Anne Molina, Alejandro |
author_facet |
Velcin, Julien Brun, Caroline Dormagen, Jean-Yves Kim, Young-Min Roux, Claude Boyadjian, Julien Bonnevay, Stephane Neihouser, Marie Sanjuan, Eric Khouas, Leila Peradotto, Anne Molina, Alejandro |
author_sort |
Velcin, Julien |
title |
Investigating the Image of Entities in Social Media: Dataset Design and First Results |
title_short |
Investigating the Image of Entities in Social Media: Dataset Design and First Results |
title_full |
Investigating the Image of Entities in Social Media: Dataset Design and First Results |
title_fullStr |
Investigating the Image of Entities in Social Media: Dataset Design and First Results |
title_full_unstemmed |
Investigating the Image of Entities in Social Media: Dataset Design and First Results |
title_sort |
investigating the image of entities in social media: dataset design and first results |
publisher |
HAL CCSD |
publishDate |
2014 |
url |
https://hal.science/hal-02052420 https://hal.science/hal-02052420/document https://hal.science/hal-02052420/file/LREC14_FINAL_VELCIN.pdf |
op_coverage |
Reykjavik, Iceland |
genre |
Iceland |
genre_facet |
Iceland |
op_source |
9th International Conference on Language Resources and Evaluation https://hal.science/hal-02052420 9th International Conference on Language Resources and Evaluation, May 2014, Reykjavik, Iceland |
op_relation |
hal-02052420 https://hal.science/hal-02052420 https://hal.science/hal-02052420/document https://hal.science/hal-02052420/file/LREC14_FINAL_VELCIN.pdf |
op_rights |
info:eu-repo/semantics/OpenAccess |
_version_ |
1769006934363471872 |