Lexicography and Dialectology of Argentine Spanish: An Analysis from Social Media
Damián E. Aleman (Grandata) and Santiago Kalinowski (Academia Argentina de Letras)
The task of detecting regionalisms (expressions or words used in certain regions) has traditionally relied on questionnaires and surveys, and has depended heavily on the expertise and intuition of the surveyor. The rise of social media and its microblogging services has produced an unprecedented wealth of content, mainly informal user-generated text, opening new opportunities for linguists to extend their studies of language variation. In this work, we present three novel metrics based on Information Theory to detect regionalisms on Twitter. Our metrics take into account both the number of occurrences of a word in certain regions and the number of users who mention it. This tool has helped lexicographers discover several unregistered words of Argentinian Spanish, as well as different meanings assigned to registered words.
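The abstract does not define the three metrics, so the following is only an illustrative sketch of how an information-theoretic score could combine the two signals mentioned above (occurrences per region and users per region). It is not the authors' method. The function names, the use of Shannon entropy, and the equal-weight average are all assumptions made for illustration. A word spread evenly across regions has maximal entropy and scores 0, while a word confined to a single region scores log2 of the number of regions.

```python
import math


def entropy(weights):
    """Shannon entropy (in bits) of non-negative weights, normalized to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must contain at least one positive value")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def concentration(weights):
    """Maximum possible entropy minus observed entropy.

    0 when the weights are uniform across regions,
    log2(number of regions) when all mass sits in one region.
    """
    return math.log2(len(weights)) - entropy(weights)


def regionalism_score(occurrences, users):
    """Hypothetical score (an assumption, not taken from the paper):
    the average of the concentration of word occurrences and the
    concentration of distinct users, both listed per region."""
    if len(occurrences) != len(users):
        raise ValueError("occurrences and users must cover the same regions")
    return 0.5 * (concentration(occurrences) + concentration(users))


# Made-up counts for four regions, for illustration only.
print(regionalism_score([10, 10, 10, 10], [3, 3, 3, 3]))  # evenly spread word -> 0.0
print(regionalism_score([40, 0, 0, 0], [7, 0, 0, 0]))     # single-region word -> 2.0
```

Counting users separately from occurrences keeps one prolific account from making a word look regional by itself. A practical version would probably also normalize the counts by each region's overall volume of tweets and users, which this sketch omits.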