Is an AI system detecting text generated by ChatGPT a high-risk AI system?
This is an English version of my newsletter - if you prefer to read in Polish just go to the main page of the newsletter.
Recently, the government's National Information Processing Institute provided Polish universities with a solution for detecting text generated by an artificial intelligence system in students' theses. The Ministry of Science and Higher Education announced this in a communication entitled 'JSA - the promoter will check if the student has used ChatGPT technology'.
The solution is described as follows:
"The detection method implemented in the JSA system is based on the hypothesis that the greater the regularity in a text, the more likely it is to have been produced by a language model. Such an assumption is rooted in the algorithm used, which is capable of generating regular and predictable text. - The model used in JSA was trained on a large text dataset. Its operation is most easily explained using the Perplexity measure."
The mention of training on a large text dataset suggests that the solution is itself an AI system.
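For readers who have not come across it, perplexity is, roughly speaking, a measure of how predictable a text is to a given language model: the lower the perplexity, the more 'expected' the text. The sketch below shows how such a score can be computed; it uses the publicly available GPT-2 model via the Hugging Face transformers library purely for illustration and is my own simplified reconstruction of the idea, not the actual JSA model or code, which have not been published.

```python
# Minimal sketch: scoring a text's perplexity with a small causal language model.
# GPT-2 is used here purely for illustration; the JSA model itself is not public.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity = exp(average negative log-likelihood per token)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing the input ids as labels makes the model return the mean
        # cross-entropy loss over the predicted tokens.
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

print(perplexity("The quick brown fox jumps over the lazy dog."))
```

In a detector built on this idea, a suspiciously low perplexity would be treated as a signal that a passage may have been machine-generated.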
Assuming, therefore, that this is an artificial intelligence system, let us consider how such a system would be classified under the Artificial Intelligence Act (AI Act). Admittedly, the regulation has not yet been published and most of its provisions will not apply until two years after publication, but in my opinion it is already worth exercising our minds by trying to apply this legislation now.
AI system classification
In my opinion, the system in question would be a high-risk AI system, as it is an AI system used in education and intended to be used to evaluate learning outcomes (Annex III, point 3(b) of the AI Act).
Its classification as a high-risk AI system can be avoided if it is deemed not to pose a significant risk of harm to the health, safety or fundamental rights of natural persons, including by not materially affecting the outcome of the decision-making process.
According to Article 6.2a of the AI Act, such a situation arises in particular where the AI system is intended to:
- perform a narrow procedural task;
- improve the result of a previously completed human activity;
- detect decision-making patterns or deviations from prior decision-making patterns, without replacing or influencing a previously completed human assessment without proper human review; or
- perform a preparatory task to an assessment relevant for the purposes of the use cases listed in Annex III.
One could argue that the system in question performs a narrow procedural task. In my opinion, however, such a classification would be unjustified.
The above-mentioned exemptions can be relied on only when the risks associated with the system under assessment are limited and 'do not materially influence the outcome of the decision making'.
The authors of the tool honestly acknowledge in its description that:
"It is currently virtually impossible to create a tool that would indicate whether or not the author of a text has used artificial intelligence. For such a solution to exist, it would be necessary to identify with greater accuracy all the characteristics of a text written by a human and those of a text generated using artificial intelligence tools. One of the obstacles to creating such an accurate tool is the constant development of the models on which text generators are based. Another problem is that both AI-generated text can be realistic and difficult to distinguish from natural text, as well as natural text can have automatically generated features (e.g. definitions, rules, diagrams)."
However, the mere indication in the user manual that the system can generate so-called false positives, i.e. "incorrectly identifying a text or part of a text as having been generated by artificial intelligence", seems insufficient, as there is still a risk that the person assessing the work may wrongly suspect the author of trying to 'cheat' with AI systems.
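To make the false-positive problem concrete, here is a purely illustrative sketch of a threshold-based decision rule. The threshold and the scores below are invented for illustration only; they bear no relation to JSA's actual values or internal logic.

```python
# Purely illustrative: how a fixed threshold over a perplexity score can turn a
# formulaic but human-written passage into a false positive. All numbers are
# hypothetical; the real JSA thresholds and scores are not public.

THRESHOLD = 30.0  # hypothetical cut-off: below this, a passage gets flagged

def flagged_as_ai(perplexity_score: float) -> bool:
    # Lower perplexity = more predictable text = flagged as possibly AI-generated.
    return perplexity_score < THRESHOLD

human_written_definition = 22.5  # hypothetical score of a human-written definition
chatgpt_paragraph = 18.0         # hypothetical score of an AI-generated paragraph

print(flagged_as_ai(human_written_definition))  # True -> a false positive
print(flagged_as_ai(chatgpt_paragraph))         # True -> a true positive
```

A definition, a legal rule or a standard proof can be just as predictable as machine output, which is exactly the obstacle the authors describe; the person interpreting the flag needs to understand this.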
The potential extent of harm that such a system can cause when used by someone who does not fully understand how it works and what its limitations are seems significant from the point of view of the student whose work is being assessed. In addition, the system will be used in a situation where there is an imbalance of power between the person negatively affected by it (the student) and the user of the AI system (a member of the academic staff), who has a higher status and the power to make decisions directly affecting the assessed person.
As a result, I believe that the system in question would be a high-risk AI system not benefiting from any of the exclusions mentioned above.
What are the implications of classifying an AI system as high-risk?
Such a system would have to be registered in the EU database of AI systems. Incidentally, the system would have to be registered in this database even if the provider of this system considered that it was not high risk based on the exclusion criteria mentioned above.
The provider of such a system would in addition have a number of responsibilities, including but not limited to:
- establishing and maintaining a risk management system;
- ensuring appropriate data governance for the training, validation and testing datasets;
- drawing up technical documentation and ensuring automatic recording of logs;
- providing deployers with instructions for use and ensuring transparency of the system's operation;
- designing the system so that it can be effectively overseen by humans;
- ensuring an appropriate level of accuracy, robustness and cybersecurity;
- implementing a quality management system and undergoing the conformity assessment procedure.
In turn, any deployer of the system (i.e. the university) will have to ensure (among other things) that:
- the system is used in accordance with its instructions for use;
- human oversight is assigned to persons with the necessary competence, training and authority;
- input data, to the extent the deployer controls it, is relevant and sufficiently representative in view of the system's intended purpose;
- the operation of the system is monitored and the provider is informed of relevant risks and serious incidents;
- logs generated by the system are kept for an appropriate period;
- affected persons (here: students) are informed that a high-risk AI system is being used in relation to them.
In addition, the university will have to carry out a fundamental rights impact assessment of the high-risk AI system. This obligation applies to bodies governed by public law and to private entities providing public services (i.e. services that are important to individuals, such as education or healthcare), and to some extent also to banks and insurers.
The purpose of the fundamental rights impact assessment is for the university, as the deployer of the AI system, to identify specific threats to the rights of natural persons and to determine the measures to be taken if these threats materialise.
As a reminder, fundamental rights are rights and freedoms enjoyed by every person in the European Union. A catalogue of these rights is contained in the Charter of Fundamental Rights of the European Union, which has been legally binding since 2009.
Summary
The above analysis is of course a theoretical exercise based on the assumption that the described system is an artificial intelligence system, which is not necessarily true. If this turns out not to be the case and I become aware of it, I will update this publication accordingly.
I have used the publication of a communication about the system as a pretext to show how the provisions of the AI Act will be applied and how they will affect a number of actors.
From the moment the regulation starts to apply (in stages between 2025 and 2027), the era of unconstrained development and use of AI systems will be over, and I suspect we will see less and less software that does not contain at least an AI-based module. This means that essentially every software project will have to include an analysis of whether it is an AI system and, if so, how such a system should be classified.
It is therefore a good idea to start applying these rules on a trial basis now, especially if you are planning to purchase IT systems that may turn out to be artificial intelligence systems.
If you enjoyed this newsletter, follow me on LinkedIn.