English-language newspaper corpus containing around 650 million words in 1.5 million articles from 14 newspapers. The initial version of the corpus, containing UK broadsheets, was created in 2011. It was extended in 2017 to include UK tabloids and newspapers from other countries and regions, including India, the USA, Hong Kong, Nigeria and the Arab world.