Sentiment Analysis on a long text

FinBERT (and many other sentiment analysis programs) seems to have a limitation of 512 characters. I am interested in the strategies people use to analyze text longer than 512 characters.

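Hello Tim,

One clarification first: for BERT-based models such as FinBERT, the 512 limit is counted in tokens (subword units, including the special tokens the model adds), not characters. People use several strategies to run sentiment analysis on text longer than that, including: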

  • Chunking: dividing the text into smaller chunks that each fit within the limit, analyzing each chunk separately, and then combining the per-chunk results (for example, by averaging the scores).
  • Sampling: selecting a representative subset of the text for analysis, such as the opening and closing passages, rather than analyzing the entire text.
  • Summarization: creating a summary of the text using text summarization techniques and analyzing the sentiment of the summary.
  • Pre-processing: removing irrelevant or redundant information from the text, such as boilerplate or non-relevant entities, before analyzing the sentiment.
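
As a rough illustration of the chunking strategy, here is a minimal sketch in Python. It assumes the text has already been tokenized into a list, and it takes the model as a caller-supplied function (`score_chunk`, a hypothetical name) that returns class probabilities for one chunk. The window of 510 tokens leaves room for the two special tokens BERT adds; the overlap of 128 and the length-weighted average are illustrative choices, not something FinBERT prescribes.

```python
from typing import Callable, List, Sequence


def chunk_tokens(tokens: Sequence, max_len: int = 510, stride: int = 128) -> List[list]:
    """Split tokens into windows of at most max_len items.

    Consecutive windows overlap by `stride` items so that a sentence cut
    at a window boundary still appears whole in one of the two windows.
    """
    if max_len <= 0 or not 0 <= stride < max_len:
        raise ValueError("need max_len > 0 and 0 <= stride < max_len")
    step = max_len - stride
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks


def aggregate(chunks: List[list],
              score_chunk: Callable[[list], Sequence[float]]) -> List[float]:
    """Combine per-chunk class probabilities into one document-level score.

    Each chunk's probabilities are weighted by the chunk's length, so a
    short final chunk counts less than a full one.
    """
    total = sum(len(c) for c in chunks)
    if total == 0:
        raise ValueError("no tokens to score")
    scores = [score_chunk(c) for c in chunks]
    n_classes = len(scores[0])
    return [
        sum(len(c) * s[k] for c, s in zip(chunks, scores)) / total
        for k in range(n_classes)
    ]
```

In practice `score_chunk` would wrap a call to the sentiment model on one chunk. Averaging is only one way to combine the results; taking the most frequent label or the most extreme score are alternatives, and which works best depends on the data.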
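
Alternatively, you can explore other transformer-based models, such as Longformer or Reformer, which are designed to handle longer sequences (the publicly released Longformer checkpoints accept up to 4,096 tokens). These are not trained for financial text out of the box, so they would need to be fine-tuned for financial sentiment analysis.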

I hope this helps!