This is the data used for the experiments in the paper:
Núria Bel, Gabriel Bracons, and Sophia Anderberg. 2021. "Finding Evidence of Fraudster Companies in the CEO’s Letter to Shareholders with Sentiment Analysis" Information 12, no. 8: 307. https://doi.org/10.3390/info12080307
Please cite the paper if you use the corpus.
For our study, we compiled a corpus of annual financial reports from different companies. In Spain, the publication and wide dissemination of annual financial reports is a legal requirement intended to promote transparency. These reports are formal documents addressed to shareholders; they give detailed information about the financial activities of a company and are meant to be a justification of the management. The information is presented in figures with extensive explanations, together with a section devoted to management discussion and analysis. Traditionally, the reports also include, as a foreword, a letter addressed to shareholders and signed by the president or the CEO of the company.
We collected these letters by extracting them from publicly available annual reports. We used Spanish reference newspapers (for instance, El País, El Periódico, and El Mundo) and court decisions to identify companies that were sanctioned for accounting fraud or for misrepresentation of financial information during the period from 2011 to 2018. The selection was conservative: we included only the years in which the fraud was proved in a court judgment or was publicly accepted, and we excluded cases in which there were only suspicions. Additionally, we anonymized the texts so that the data can be shared with other researchers.