Notes on:
Huang, C., Simpson, S., Ulybina, D., & Roitman, A. (2019): News-Based Sentiment Indicators

Author(s): Huang, C., Simpson, S., Ulybina, D., & Roitman, A.

Published: DEC 2019

url: https://papers.ssrn.com/abstract=3523146

What?

Build an early warning indicator (ewi) by analysing the sentiment of financial news. Sentiment keywords are determined by semantic clustering methods.

Why?

Using sentiment-based early warning signals (ews) to predict banking crises is not new. However, previous approaches often rely on surveys, which are costly and arrive with a delay.

How?

Data: 3 million daily news articles from the Financial Times.
Approx. 48 articles per month for each of the 20 countries.
The data go back to 1980, so many of the older articles had to be scanned.

Two main steps:

Build semantic cluster: Use word2vec to triangulate the top \(n\) terms most similar to a set of seed terms that represent a semantic concept of interest (e.g. “fear”, “hedge”).
Measure cluster frequency:
\[ \text{Index}_{ij} = \frac{\text{no. of specific words}_{ij}}{\text{no. of words}_{ij}} \times 1000 \]
for each country \(i\) in month \(j\).
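
The two steps can be sketched in a few lines of Python. This is only a minimal illustration, not the paper's implementation: the toy 2-dimensional vectors stand in for a trained word2vec model, ranking candidates by cosine similarity to the centroid of the seed vectors is my assumption about how "most similar to a set of seed terms" could be done, and the names `build_cluster` and `sentiment_index` are hypothetical.

```python
import math
import re

# Toy embeddings standing in for a trained word2vec model (assumption:
# in practice these vectors would be learned from the news corpus).
TOY_EMBEDDINGS = {
    "fear": [1.0, 0.0],
    "panic": [0.9, 0.1],
    "anxiety": [0.8, 0.2],
    "growth": [0.0, 1.0],
    "profit": [0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_cluster(embeddings, seeds, n):
    """Step 1: return the seed terms plus the n terms closest to the seed centroid."""
    seed_vecs = [embeddings[s] for s in seeds if s in embeddings]
    if not seed_vecs:
        return set(seeds)
    centroid = [sum(col) / len(seed_vecs) for col in zip(*seed_vecs)]
    candidates = [w for w in embeddings if w not in seeds]
    ranked = sorted(
        candidates, key=lambda w: cosine(embeddings[w], centroid), reverse=True
    )
    return set(seeds) | set(ranked[:n])


def sentiment_index(texts, cluster):
    """Step 2: cluster words per 1000 words, pooled over one country-month."""
    tokens = [t for text in texts for t in re.findall(r"[a-z']+", text.lower())]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in cluster)
    return 1000 * hits / len(tokens)


# Hypothetical articles for a single country i in a single month j.
fear_cluster = build_cluster(TOY_EMBEDDINGS, ["fear"], n=2)
articles = ["Panic spread as fear of default grew", "Profit growth was steady"]
index_ij = sentiment_index(articles, fear_cluster)
```

Here two of the eleven tokens fall in the cluster, so the index is 1000 × 2 / 11. Pooling all tokens of a country-month before dividing matches the formula above; averaging per-article ratios would be a different (and also plausible) design.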

To assess the reliability of the ewis, they computed their (a) precision, (b) recall and (c) F-score.
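
For reference, a minimal sketch of how these three metrics are computed from binary signals. It assumes each observation is one country-period with a 0/1 signal and a 0/1 crisis label, and that the F-score is the balanced F1; how the paper actually aligns signals with crisis windows is not covered in these notes. The function name `evaluate_signals` is hypothetical.

```python
def evaluate_signals(signals, crises):
    """Return (precision, recall, F1) for paired binary signals and crisis labels."""
    tp = sum(1 for s, c in zip(signals, crises) if s and c)      # correct warnings
    fp = sum(1 for s, c in zip(signals, crises) if s and not c)  # false alarms
    fn = sum(1 for s, c in zip(signals, crises) if not s and c)  # missed crises
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: 3 signals raised, 3 crises, 2 of them correctly flagged.
precision, recall, f1 = evaluate_signals([1, 1, 0, 0, 1], [1, 0, 1, 0, 1])
```

Because crises are rare, plain accuracy would look good even for an indicator that never fires, which is presumably why precision and recall are the metrics of interest here.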

Because crises are rare, they did not actually use the whole sample. Might this introduce a selection bias into their model?

“When assessing and computing prediction metrics, we do not consider the full sample (1980-2018), because we know that after 1999 there are no more crisis in the K&R sample. For example, the financial crisis in Argentina in 2001, is not taken into account to compute the evaluation metrics. However, it is important to note that most of our sentiment indicators would have successfully triggered an ews ahead of time (Appendix 3).”

And?

Indices based on general sentiment (e.g. positive or negative) perform better than those based on specific sentiments (e.g. fear), but not at the individual-country level.
The same index may work better for some countries than for others, and some indices may work better than others for a given country → Is this worthy of a takeaway? Sounds banal to me.
ews are triggered correctly for most countries.
Combining different sentiments increases performance (but beware of overfitting!!)
These indices perform better in recent crises → Interesting. What causes this? Is it data quality (sample size, digitalization)? Or is there a structural change?

It's noteworthy that their model uses a time-invariant set of words, so changes in language use over time are not accounted for.

This post is in the collection of my public reading notes.