Investigating the Image of Entities in Social Media: Dataset Design and First Results

The objective of this paper is to describe the design of a dataset that deals with the image (i.e., representation, web reputation) of various entities populating the Internet: politicians, celebrities, companies, brands, etc. Our main contribution is to build and provide an or...

Bibliographic Details
Main Authors: Velcin, Julien, Brun, Caroline, Dormagen, Jean-Yves, Kim, Young-Min, Roux, Claude, Boyadjian, Julien, Bonnevay, Stephane, Neihouser, Marie, Sanjuan, Eric, Khouas, Leila, Peradotto, Anne, Molina, Alejandro
Other Authors: Entrepôts, Représentation et Ingénierie des Connaissances (ERIC), Université Lumière - Lyon 2 (UL2)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-Université de Lyon, Penn Image Computing & Science Lab Philadelphia (PICSL), University of Pennsylvania, Laboratoire Informatique d'Avignon (LIA), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI, Laboratoire d'Electrochimie et de Physico-chimie des Matériaux et des Interfaces (LEPMI), Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut National Polytechnique de Grenoble (INPG)-Institut de Chimie du CNRS (INC)-Université Savoie Mont Blanc (USMB Université de Savoie Université de Chambéry)-Centre National de la Recherche Scientifique (CNRS), Sciences Po Lille - Institut d'études politiques de Lille (IEP Lille), Equipe de Recherche en Ingénierie des Connaissances (ERIC), Université Lumière - Lyon 2 (UL2)
Format: Conference Object
Language: English
Published: HAL CCSD 2014
Subjects: aspect-oriented opinion mining, political data, French corpus
Online Access: https://hal.science/hal-02052420
https://hal.science/hal-02052420/document
https://hal.science/hal-02052420/file/LREC14_FINAL_VELCIN.pdf
id ftunivnantes:oai:HAL:hal-02052420v1
record_format openpolar
institution Open Polar
collection Université de Nantes: HAL-UNIV-NANTES
op_collection_id ftunivnantes
language English
topic aspect-oriented opinion mining
political data
French corpus
[INFO.INFO-WB]Computer Science [cs]/Web
[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
[INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
[STAT.ML]Statistics [stat]/Machine Learning [stat.ML]
description International audience. The objective of this paper is to describe the design of a dataset that deals with the image (i.e., representation, web reputation) of various entities populating the Internet: politicians, celebrities, companies, brands, etc. Our main contribution is to build and provide an original annotated French dataset. This dataset consists of 11,527 manually annotated tweets expressing opinions on specific facets (e.g., ethics, communication, economic project) that describe two French politicians over time. We believe that other researchers might benefit from this experience, since designing and implementing such a dataset has proven quite an interesting challenge. This design comprises different processes such as data selection, formal definition and instantiation of an image. We have set up a full open-source annotation platform. In addition to the dataset design, we present the first results that we obtained by applying clustering methods to the annotated dataset in order to extract the entity images.
author2 Entrepôts, Représentation et Ingénierie des Connaissances (ERIC)
Université Lumière - Lyon 2 (UL2)-Université Claude Bernard Lyon 1 (UCBL)
Université de Lyon-Université de Lyon
Penn Image Computing & Science Lab Philadelphia (PICSL)
University of Pennsylvania
Laboratoire Informatique d'Avignon (LIA)
Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI
Laboratoire d'Electrochimie et de Physico-chimie des Matériaux et des Interfaces (LEPMI)
Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut National Polytechnique de Grenoble (INPG)-Institut de Chimie du CNRS (INC)-Université Savoie Mont Blanc (USMB Université de Savoie Université de Chambéry)-Centre National de la Recherche Scientifique (CNRS)
Sciences Po Lille - Institut d'études politiques de Lille (IEP Lille)
Equipe de Recherche en Ingénierie des Connaissances (ERIC)
Université Lumière - Lyon 2 (UL2)
format Conference Object
author Velcin, Julien
Brun, Caroline
Dormagen, Jean-Yves
Kim, Young-Min
Roux, Claude
Boyadjian, Julien
Bonnevay, Stephane
Neihouser, Marie
Sanjuan, Eric
Khouas, Leila
Peradotto, Anne
Molina, Alejandro
author_sort Velcin, Julien
title Investigating the Image of Entities in Social Media: Dataset Design and First Results
publisher HAL CCSD
publishDate 2014
url https://hal.science/hal-02052420
https://hal.science/hal-02052420/document
https://hal.science/hal-02052420/file/LREC14_FINAL_VELCIN.pdf
op_coverage Reykjavik, Iceland
genre Iceland
genre_facet Iceland
op_source 9th International Conference on Language Resources and Evaluation, May 2014, Reykjavik, Iceland
op_relation hal-02052420
op_rights info:eu-repo/semantics/OpenAccess
_version_ 1766040657445519360