Research Reports
Conference communications:
Building training sets for sentiment analysis in Twitter semi-automatically
Year: 2019

Research Areas
  • Physics, chemistry and mathematics

Information
Abstract
Standard sentiment analysis techniques usually rely either on sets of rules based on semantic and affective information or on supervised machine learning approaches whose quality depends heavily on the size and significance of a training set of pre-labeled text samples. In many situations, this labeling must be performed by hand, which potentially limits the size of the training set. To address this issue, we propose a methodology to retrieve text samples from Twitter and label them automatically. We then apply this methodology to a Twitter conversation and assess the quality of the resulting training set. We also tackle the situation in which the base rates of positive and negative sentiment samples in the training and test sets are biased with respect to the system in which the classifier is intended to be applied. The results presented in this respect are relevant beyond this particular application.
International
Yes
Congress
NetSci 2019 [https://vermontcomplexsystems.org/events/netsci]
970
Place
Burlington (USA)
Reviewers
Yes
ISBN/ISSN
0000-0000
Start Date
27/05/2019
End Date
31/05/2019
From page
0
To page
0
NetSci 2019
Participants

Related Research Groups, Departments and Institutes
  • Creator: Research Group: Grupo de Sistemas Complejos
  • Department: Ingeniería Agroforestal