In case you had not noticed: we are in the middle of the Digital Decade, the current ten-year period in which the European Union is fully committed to digitalisation. This goes hand in hand with a large number of laws that must be taken into account.
All these rules serve clear objectives, as set out in the Digital Decade Policy Programme. For example, the Digital Services Act (DSA) offers EU citizens greater online protection, and the AI Act promotes better and safer use of artificial intelligence. It does not stop at rules alone. More than 207 billion euros has already been invested to achieve the stated digital objectives.
One of the objectives set by the European Commission is that everyone can participate in digital opportunities. It therefore seems logical that new European legislation, and the documents in which it appears, should be drafted in clear and understandable language. Yet anyone who opens the AI Act, the Data Act or the Digital Omnibus Regulation will quickly encounter texts that are not easy to read without an “AVI advanced” level. As a reader of blogs about the Digital Decade, your aim is probably to understand this complex material.
Fortunately, as you would expect, we are pretty good at mapping out these rules. The Digital Decade Roadmap helps every organisation to understand the applicable legislation.
The readability of legal texts is extremely important. Think of consent forms that must be written in clear and plain language, or the terms of use of an online platform. In this blog, I therefore discuss a number of theories on text readability and provide some practical tools to help you translate the Digital Decade into plain language.
Nowadays, a complex text is quickly thrown into ChatGPT with the prompt “explain this in Jip and Janneke language”. You then receive a text that looks suspiciously like an Annie M.G. Schmidt book, probably about a less exciting subject. Moreover, you cannot assume that ChatGPT produces a correct summary. The theory behind understandable texts is more extensive than a Jip and Janneke story. Knowledge of this theory can help when writing in plain language.
There are many ways to assess the readability of a text, and a fair amount of research has already been done on the subject, as reflected in this article by researchers at the University of Oxford. In the Netherlands, people are mainly familiar with the AVI method mentioned earlier and the Common European Framework of Reference. In the context of AI and software, it is also worth mentioning the Halstead complexity model, a method originally intended to determine how complex software code is. To keep this blog readable and enjoyable, I will explain two perhaps lesser-known models.
The first model also has the most appealing name. The SMOG formula stands for “Simple Measure of Gobbledygook”. This formula is mainly intended for English texts and produces a readability score indicating how many years of total education are needed to understand the text. It is particularly useful as a tool for clear communication in healthcare. The formula itself, in its simplified form, is quite simple: 3 plus the square root of the number of words with three or more syllables in 30 randomly selected sentences.
If we use the English version of the Data Act and select 30 random sentences (sample taken via ChatGPT and checked to ensure they actually appear in the text), the result is as follows: 3 + √234 ≈ 3 + 15.3 = 18.3, rounded to 18 years of education needed to understand the text. In practice, this seems plausible: a law student in the final years of their degree should be able to understand the regulation.
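The calculation above can be sketched in a few lines of Python. This is a minimal sketch of the simplified formula described in this blog, not a full readability tool: the function name smog_grade is hypothetical, and the polysyllable count (234) is simply the figure reported for the Data Act sample. Counting the syllables themselves is left out, because doing that reliably needs a dictionary or a dedicated library.

```python
import math


def smog_grade(polysyllable_count: int) -> float:
    """Simplified SMOG score: 3 plus the square root of the number of
    words with three or more syllables found in a 30-sentence sample."""
    if polysyllable_count < 0:
        raise ValueError("polysyllable_count must be non-negative")
    return 3 + math.sqrt(polysyllable_count)


# 234 is the count reported in this blog for the Data Act sample.
score = smog_grade(234)
print(f"Estimated years of education: {score:.1f}")
```

The function only performs the final arithmetic step, so the quality of the result depends entirely on how carefully the 30 sentences were sampled and the long words counted.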
A second method for determining the complexity of texts is the Lexical Density score (the LD score). This score indicates how information-dense a text is, in other words how much substantive content it contains compared to the total number of words. It clearly shows how complex a text can be, especially in legal and academic writing. The LD score is calculated as follows:
LD = (number of content words / total number of words) × 100%
Content words include nouns, verbs and adjectives. Articles, prepositions and conjunctions, among others, are not content words. If the score is 40% or lower, you can assume the text is simple. At 50% or higher, the text is likely to be academic in nature. One of the 30 randomly selected sentences from the English Data Act, taken from recital 38, is the following: “Dark patterns are design techniques that push or deceive consumers into decisions that have negative consequences for them.” This sentence contains 18 words, of which 10 are content words. (10/18) × 100% ≈ 55.6%, which means the sentence is likely to contain a significant amount of technical language. This makes sense, as it explains the concept of “dark patterns”.
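For those who want to try this themselves, the LD calculation can be approximated in Python. This is a rough sketch under clear assumptions: the function name lexical_density and the hand-picked FUNCTION_WORDS list are hypothetical and only cover enough words for the example sentence. In line with the count used above, the auxiliary-style verbs “are” and “have” are treated as function words. A serious analysis would use a part-of-speech tagger instead of a fixed word list.

```python
import re

# Hypothetical, hand-picked list of function words (not content words).
# It is deliberately small and only meant for this example.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "that", "which",
    "for", "into", "of", "to", "in", "on", "with",
    "them", "they", "it", "is", "are", "was", "were", "be",
    "have", "has", "had",
}


def lexical_density(text: str) -> float:
    """Return the share of content words in the text, as a percentage."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        raise ValueError("text contains no words")
    content_words = [w for w in words if w not in FUNCTION_WORDS]
    return 100 * len(content_words) / len(words)


sentence = (
    "Dark patterns are design techniques that push or deceive consumers "
    "into decisions that have negative consequences for them."
)
print(f"LD score: {lexical_density(sentence):.1f}%")  # prints "LD score: 55.6%"
```

Because the outcome depends directly on which words count as function words, two tools can give different LD scores for the same text. The score is best used to compare texts measured in the same way.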
All these different readability formulas do not immediately translate into practice. Simply knowing that a text is full of technical terminology is not enough. The challenge is to translate this into a more accessible text. To make texts as readable as possible, many organisations develop a style guide. Such a guide can include a number of points:
Structure: How you structure your text can have a major impact on its readability. The logical structure of introduction, main body and conclusion is often a good starting point.
Approach: Knowing who you are writing for is essential in determining whether your text will be understandable to the reader. Writing for the newsletter of your sports club requires a different approach from writing an academic paper.
Writing style: The use of old or fancy words has a significant effect on readability. This does not only apply to terms such as “interoperability specifications”, but also to words like “nevertheless”, “ergo” or “however”. Lawyers, myself included, often find it difficult to let go of these.
It may sound obvious, but to convey information clearly, knowledge of the subject matter is essential. The SMOG formula indicates how many years of education are needed to understand a text. A similar number of years of study, and often many more, is usually required to communicate that text clearly.
Readability is an important requirement for the texts and advice produced by ICTRecht. When a legal text is clear and understandable for all parties, fewer misunderstandings arise about its content. If everyone reads the text in the same way, there will be fewer disputes over interpretation under the Haviltex doctrine (named after the 1981 judgment).
The core values of expertise, practicality, commitment and an entrepreneurial mindset therefore remain central when writing. It does not matter whether it concerns an email, a DPIA or a blog about the EHDS.
Are you looking for practical and clear advice? We all read at “AVI Digital Decade” level. Feel free to get in touch.