AI determines CEFR level at 90% accuracy

In collaboration with experienced English Language Teaching (ELT) authors, EDIA created a machine learning[1] algorithm that classifies any text on the standard CEFR[2] language difficulty scale. The algorithm predicts the exact CEFR level with 90% accuracy, which is higher than human experts currently achieve.

Training data & ELT Experts

The training data consists of a variety of English texts ranging from academic papers and news articles to children’s books. Each text was rated by three of our eight ELT experts, all of whom have significant experience writing textbooks, graded readers, assessments and similar content for renowned ELT publishers.

Human Annotators

[Figure: human annotator charts. Panel captions: "A less reliable expert vs the median of two other experts per text" and "The most reliable expert vs the median of two other experts per text".]


The algorithm is trained to minimise the distance to the median of the domain experts, per text. It takes a variety of measures into account, ranging from role and frequency of individual words to the grammatical structure of sentences.
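To make the idea of word-level and sentence-level measures concrete, here is a minimal sketch in Python. It is not EDIA's feature set: the function `text_features`, the tiny `COMMON_WORDS` list and the three measures are hypothetical stand-ins, with sentence length used only as a crude proxy for grammatical structure.

```python
import re

# Hypothetical stand-in for a real word-frequency list (assumption, not EDIA's data).
COMMON_WORDS = {
    "the", "a", "an", "is", "are", "was", "to", "of", "and",
    "in", "on", "it", "cat", "sat", "mat",
}


def text_features(text):
    """Return a few crude difficulty signals for an English text."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    uncommon = [w for w in words if w not in COMMON_WORDS]
    return {
        # Longer sentences tend to carry more complex grammar (rough proxy only).
        "mean_sentence_length": len(words) / len(sentences),
        # Share of words that are not in the frequent-word list.
        "uncommon_word_ratio": len(uncommon) / len(words),
        "mean_word_length": sum(len(w) for w in words) / len(words),
    }


features = text_features("The cat sat on the mat. It was a cat.")
print(features)
```

A real system would replace the word list with corpus frequencies and the sentence-length proxy with proper syntactic analysis. The resulting numbers would then feed a model fitted against the experts' median rating.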


Algorithm annotator

To visualise the results, we had our algorithm label texts that it had not seen before and compared its labels to the median[3] of the labels given by the three ELT experts. We also included an overview of how the ELT experts perform when compared with each other.
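A comparison of this kind can be sketched in a few lines of Python. The scale in `CEFR_SCALE` (including the "+" levels), the function `compare_to_median` and the sample labels are assumptions made for illustration. They are not EDIA's evaluation code or results.

```python
# Assumed ordinal scale; which "+" levels the experts used is not stated in the article.
CEFR_SCALE = ["A1", "A2", "A2+", "B1", "B1+", "B2", "B2+", "C1", "C2"]


def compare_to_median(predicted, expert_median):
    """Compare predicted labels with the experts' median label, text by text."""
    if not predicted or len(predicted) != len(expert_median):
        raise ValueError("need two non-empty label lists of equal length")
    # Distance in scale steps between each prediction and the median label.
    distances = [
        abs(CEFR_SCALE.index(p) - CEFR_SCALE.index(m))
        for p, m in zip(predicted, expert_median)
    ]
    return {
        "exact_match": sum(d == 0 for d in distances) / len(distances),
        "mean_distance": sum(distances) / len(distances),
    }


# Made-up labels for illustration only; these are not EDIA's results.
predicted = ["A2", "B1+", "B2", "C1"]
expert_median = ["A2", "B1+", "B2+", "C1"]
result = compare_to_median(predicted, expert_median)
print(result)  # {'exact_match': 0.75, 'mean_distance': 0.25}
```

Reporting the mean distance alongside exact matches shows how far off the misses are, which matters on an ordinal scale where a one-step error is less serious than a three-step one.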

Classifies text
  • Accurately
  • Consistently
  • Faster
[Chart legend: EDIA's Algorithm, Author Average]

Multilingual Technology

We have also evaluated our algorithm on German and Italian examples from the MERLIN[4] dataset. The results were comparable for both languages, which suggests that our technology has multilingual potential.


[1] Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task effectively without explicit instructions, relying on patterns and inference instead. It is considered a subset of artificial intelligence.


[2] The Common European Framework of Reference for Languages (CEFR) is an international language learning standard developed by the Council of Europe. The CEFR scale has six main levels ranging from A1 (beginner) to C2 (proficient).


[3] The CEFR median is the middle value of multiple CEFR ratings. For example, if three annotators rate a text A1, B1+ and B2 respectively, the median is B1+, since it is both the second-highest and the second-lowest difficulty rating for the text.
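The same example can be worked through in Python. This is a minimal sketch: the ordering in `CEFR_SCALE` (with "+" levels) and the helper `cefr_median` are assumptions, and `median_low` is used so that an even number of ratings would resolve to the lower of the two middle labels.

```python
from statistics import median_low

# Assumed ordinal scale; the "+" levels are included because the example uses B1+.
CEFR_SCALE = ["A1", "A2", "A2+", "B1", "B1+", "B2", "B2+", "C1", "C2"]


def cefr_median(ratings):
    """Return the middle CEFR label of a list of ratings."""
    # Map each label to its position on the scale, then take the median position.
    positions = [CEFR_SCALE.index(r) for r in ratings]
    return CEFR_SCALE[median_low(positions)]


print(cefr_median(["A1", "B1+", "B2"]))  # B1+
```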


[4] The MERLIN dataset is an EU-funded corpus of learner texts in Czech, German and Italian, each rated on the CEFR scale.

