1. Introduction

Part I (Sunde & Sunde, 2021) described the concept of PrevBOT: a semi-automated tool based on machine learning and robotics, operating online under the command and responsibility of a police officer (the human operator). The purpose of PrevBOT is to prevent online child sexual exploitation and abuse (CSEA). Part I examined whether machine learning applied to Authorship Analysis was suitable for developing PrevBOT, and discussed the phenomenological, technological, and criminological foundations of the tool. It concluded that developing PrevBOT is achievable and that the tool would extend the capability of the police to prevent CSEA in public chat rooms more effectively than manual patrolling.

Part II discusses whether PrevBOT may be put into use by the police within the legal framework of fundamental rights to data protection, privacy, and fair trial. The analysis concerns a European context where the principal legal instruments regarding fundamental rights are the European Convention on Human Rights (1950) (ECHR) and the EU Charter of Fundamental Rights (2012) (CFREU). Use of PrevBOT raises issues regarding the right to private life and the protection against police provocation and self-incrimination, as covered by the right to a fair trial and to be presumed innocent. The right to privacy is laid down in ECHR Article 8 and CFREU Article 7, and the right to a fair trial in ECHR Article 6 and CFREU Articles 47 and 48. Benefitting from the rich ECtHR case law, the present analysis is limited to addressing the provisions of the ECHR.

Regarding the analysis concerning data protection, the focus is primarily on the EU Law Enforcement Directive (2016) (LED), which regulates personal data processing by law enforcement. The Council of Europe Convention 108+ (2018) (C108+) is also relevant. In contrast to the LED, it is open also to non-European states, thus being the only global convention for data protection. However, the LED corresponds to the EU General Data Protection Regulation (2016) (ʻGDPRʼ), save for the adjustments needed to cater for the needs of the police (Sajfert & Quintel, 2018), and is the most detailed legal framework in this area. Noting that efforts have been made to ensure "synchronization" between the GDPR/LED and the C108+ (Terwangne, 2021), the present analysis deals with the conditions of the LED.

The aim is, by unpacking the crucial issues, to provide guidance to nations wishing to use PrevBOT. This could be of interest to police in the Nordic countries who seek to develop concepts for online policing. Such an endeavor needs competence from multiple disciplines, including the law. The interest in PrevBOT is, i.a., demonstrated in a Norwegian research project which investigates the possibilities for using chat logs, collected as evidence in criminal investigations of CSEA cases, as training data for a PrevBOT (Sunde et al., 2022). The use of biometric technology is at the center of the analysis, making LED Article 10, concerning the processing of special categories of personal data, relevant. Extending the analysis, some issues related to the proposed European Artificial Intelligence Act are also discussed. The risk-based approach of this regulatory framework defines several of the policeʼs possible AI uses as ʻprohibitedʼ or ʻhigh-riskʼ. The prospects of using PrevBOT largely depend on the future configuration of these rules.

2. Placing PrevBOTʼs functions in a crime prevention context

2.1 Understanding PrevBOTʼs functions

Understanding PrevBOTʼs functions concerning categorization and identification is crucial to the legal analysis. This section describes their purpose and how they relate to the crime-preventive mandate of the police. As for the latter, a three-tiered crime-preventive model is applied. While the model originates in public health intervention, its broader relevance, i.a., to crime prevention in the police and criminal justice sector is increasingly recognized (Stokols, 2018; Muir, 2021).

2.2 PrevBOTʼs functions

ʻCategorizationʼ refers to PrevBOTʼs capability to predict online participantsʼ age and/or gender and the occurrence of sexualized speech in a public chat room. ʻIdentificationʼ refers to its capability to predict the presence of known CSEA offenders. In real time, it compares the writing style in chat conversations to ʻlinguistic fingerprintsʼ of known CSEA offenders in a police reference repository. The reference material is generated by computational processing of chat logs collected during investigations of previous offenses. Rather than suggesting a single match, PrevBOT produces a ranked list of candidate matches among the linguistic fingerprints, similar to the procedure applied to fingerprints and DNA samples. Both functions are enabled by machine learning and Authorship Analysis.
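
To make the identification step concrete, it can be sketched as a nearest-neighbour search over a repository of fingerprint vectors. The sketch below is a minimal illustration only: the feature dimension, the cosine-similarity measure, and the names used are assumptions, not part of the published PrevBOT design.

```python
# Illustrative sketch: repository layout, feature dimension and similarity
# measure are assumptions, not the actual PrevBOT design.
import numpy as np

# Hypothetical reference repository: offender ID -> fingerprint vector
# computed from chat logs collected in previous investigations.
FINGERPRINT_REPOSITORY: dict[str, np.ndarray] = {
    "offender_001": np.random.rand(300),
    "offender_002": np.random.rand(300),
    "offender_003": np.random.rand(300),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_candidates(query: np.ndarray, k: int = 5) -> list[tuple[str, float]]:
    """Return a ranked list of candidate matches rather than a single match,
    mirroring the procedure applied to fingerprints and DNA samples."""
    scores = [(offender_id, cosine_similarity(query, fingerprint))
              for offender_id, fingerprint in FINGERPRINT_REPOSITORY.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:k]
```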

PrevBOT is rooted in situational crime prevention (SCP) and extension theory. SCP takes the offenderʼs motivation for granted, "instead seeking to limit their scope for offending, and influence their perception and decisions regarding criminal action" (Part I, section 5.1). Further, with reference to extension theory, PrevBOT "extends the ability of [police] presence in more spaces than otherwise possible, allowing the entering and monitoring of more chatrooms per police officer" (Part I, section 5.5).

PrevBOTʼs categorizations are output in the form of predictions. Predictions of the existence and frequency of sexualized speech provide the police with information relevant to classifying specific forums as high-risk to children (ʻProblematic Spacesʼ). Problematic Spaces can then be prioritized for closer monitoring, enabling PrevBOT to detect persons who target children while pretending to be younger and/or of a different gender than indicated in the user profile or the chat (ʻProblematic Personʼ (PP)). The human operator may also use PrevBOT as a vehicle for sending a preventative message to the PP.
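
As a simplified illustration, a decision rule for flagging a Problematic Space could take the following form; the threshold and field names are assumptions made for the sketch, as the concept does not specify how the classification is operationalized.

```python
# Illustrative only: how per-message predictions of sexualized speech could
# feed a 'Problematic Space' flag. The 10% threshold is an assumption.
from dataclasses import dataclass

@dataclass
class ChatRoomStats:
    room_id: str
    messages_total: int
    messages_flagged_sexualized: int  # count of per-message model predictions

def is_problematic_space(stats: ChatRoomStats, threshold: float = 0.10) -> bool:
    """Flag the room for closer monitoring if the predicted share of
    sexualized speech exceeds the threshold."""
    if stats.messages_total == 0:
        return False
    return stats.messages_flagged_sexualized / stats.messages_total >= threshold
```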

By ʻunmaskingʼ adults who pose as youngsters and/or lie about their gender, PrevBOTʼs predictions make deception more difficult for would-be offenders, thus limiting their ʻscopeʼ for offending. Besides, the preventative message may influence their perceptions and decisions regarding CSEA. More generally, police ʻvisibilityʼ may be a psychological effect of the policeʼs improved ability to identify and monitor Problematic Spaces, detect PPs, and intervene against CSEA. Like police visibility in physical space, this may have a preventive effect.

PrevBOTʼs identifications (a linguistic fingerprint match) indicate that the PP is a known CSEA offender. In combination with the asymmetric age constellation between the parties to the conversation, plus the fact that online CSEA offenders often have many victims (Part I, section 2; Sunde 2019; 2020; Nilssen 2021), there might be reasonable grounds to open a criminal investigation where the PP is suspected of still abusing children sexually. In line with SCP, identification limits the offenderʼs possibilities for reoffending, while criminal investigation provides an entry point for intervention that could end the sexual abuse of other children.

2.3 Placing the functions in the preventive model

In the preventive model, primary and tertiary prevention respectively refer to prevention addressing causes and symptoms, while secondary prevention refers to early intervention (Muir, 2021, p. 9). Application of the model must be adapted to the nature of the problem at hand and the actors involved in preventing it. Crime prevention often requires multi-agency collaboration, involving the efforts of more actors than the police.

As a rule, the police should stay out of primary prevention, as other actors regularly are better placed to address the cause of a problem (Muir, 2021, p. 30–33). There is also the risk that the police “[become] intrusively pervasive across much of social life” (Loader, 2020, p. 12), resulting in increased surveillance and interference with personsʼ fundamental rights.

PrevBOTʼs categorizations aim at early intervention. Although online CSEA became a manifest problem of worrying dimensions many years ago, the ʻwindowʼ for secondary prevention is open with respect to the occurrence of each criminal offense. It is highly relevant that the police target social groups deemed to be at risk of committing certain crimes. The European Code of Police Ethics (2001) (ECPE) states that crime prevention is a main objective of the police (Article I.1), and a mandate to proactively intervene may be unequivocally expressed in law, e.g., section 7(1)(3) of the Norwegian Police Act (1995).

Muir takes CSEA as a use case to illustrate the model. Muir relies, like PrevBOT, on SCP theory, proposing to implement processes of "identity authentication and age verification in the online environment" to "design out" the CSEA problem (2021, p. 14). This would "reduce the opportunities to offend" by making it more difficult for would-be offenders to give false information about their age, thus making online spaces safer for children.

Addressing causes, Muir places responsibility on ʻthose best placedʼ, i.e., the internet and software industry. PrevBOTʼs intervention, however, is not primary like Muirʼs, but secondary: the purpose is to intervene early, before the PP gains control of the victim. In this situation, the police are clearly mandated to intervene.

The identifying function is mainly reactive, resting on the preventive effects of criminal investigation and punishment. Incapacitation, combined with the renewed possibility to approach the offender and the general deterrent effect of the criminal justice process, is deemed to have a (tertiary) preventive effect. Furthermore, private vigilantism against CSEA offenders is an international phenomenon (Lomell, 2020) and is cautioned against (Halnes & Ugelvik, 2019; Lomell, 2020). The phenomenon could be reduced through application of PrevBOT, as increased effectiveness of the police may reinforce public trust in the rule of law (Muir, 2021, p. 31).

There is also a proactive dimension to criminal investigation and prosecution: CSEA victims regularly keep the sexual abuse secret due to shame, and because they are silenced by threats from the offender to publish sexualized images on social media for schoolmates and parents to see (Part I, section 2; Sunde, 2019; 2020; Nilssen, 2021). Terre des Hommes (2013) emphasized that proactive investigation by the police is a precondition for coming to grips with the CSEA problem. PrevBOT may provide information necessary for investigating known CSEA offenders who resume the criminal conduct. This can proactively end the continued victimization of children who do not report, thus reducing harm.

In sum, PrevBOTʼs categorization and identification functions answer to an evidence-based, problem-oriented approach anchored in sound theoretical foundations, enabling interventions at the secondary and tertiary levels, where the police may properly exercise their powers.

3. Comparison of manual and automated patrolling

To identify the legal issues special to PrevBOT, it is useful to compare manual and automated methods for online observation, infiltration, and preventative intervention.

3.1 Efficiency and effectiveness

PrevBOTʼs autonomous properties enable policing of online forums that is superior to manual patrolling against CSEA in terms of both efficiency and effectiveness. As PrevBOT can be put into operation simultaneously in several chat rooms and monitor significantly more chat conversations than would be possible for a police officer, it is more efficient than manual patrolling. Moreover, in public chat rooms, the conversations are open for everyone to see. However, CSEA offenders are risk-sensitive and make quick moves (Part I, section 2.2). To minimize the risk of detection, they are likely to try to move the chat to a private channel so that the conversation can continue one-to-one (Broome, Izura & Davies, 2020; Lorenzo-Dus, Kinzel & Di Cristofaro, 2020). This significantly reduces the possibility of visually observing CSEA risk indications in public chat rooms, which is the option available to a police officer.

PrevBOT is better equipped for the task through automated processing of available data, thus categorizing participants according to relevant risk indicators and identifying former CSEA offenders present in the chat room. Furthermore, PrevBOT is superior in collecting the digital traces needed to retrace offenders. This makes PrevBOT a more effective alternative to manual patrolling, improving the rate at which the police may submit preventative messages or initiate a criminal investigation.

3.2 The preventative message

PrevBOT may engage in secondary intervention by submitting a preventative message to the PP. The message informs the PP that s/he has chatted with the police and that contacting a child for sexual purposes is a punishable offense. In addition, the message may motivate the PP to seek professional help and inform about relevant programs for sex offender treatment. To tackle the risk of messaging an innocent person (a false positive), the message can facilitate easy feedback, e.g., through a link providing information on how to file a complaint, together with contact information for a complaint service in the police, should s/he feel unfairly targeted.
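
A minimal sketch of how such a message could be assembled is given below; the wording and the complaint URL are placeholders rather than an endorsed formulation.

```python
# Placeholder wording and URL: the actual content of a preventative message
# would be determined by the police, not by this sketch.
def build_preventative_message(complaint_url: str) -> str:
    return (
        "You have been chatting with the police. Contacting a child for "
        "sexual purposes is a punishable offense. Help is available through "
        "sex offender treatment programs. If you believe you have been "
        f"unfairly targeted, you may file a complaint here: {complaint_url}"
    )

message = build_preventative_message("https://police.example/complaints")
```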

The police officer and PrevBOT are equally positioned concerning the preventative message: as participants in the chat, both can submit a message to the conversation partner (the PP). However, since predictions of age and/or gender are needed to single out the PP, in practice only PrevBOT is in a position to submit such a message effectively.

3.3 Criminal investigation of serial CSEA offenders

As per criminal procedural law, a criminal investigation is opened pursuant to a decision by a duly authorized person. The decision must be based on a reasonable suspicion that a crime was committed, substantiated by concrete indications. PrevBOT is a tool for gaining such information through linguistic fingerprinting, age/gender classifications, and information obtained in the conversation with the PP. In addition, it collects electronic traces that are important in the initial phase of the investigation. PrevBOT does not play a role in the investigation per se. The investigation relies on traditional methods, retracing the suspect and following up with, e.g., house and computer searches and police interviews.

Some options are equally open to police officers and PrevBOT, such as posting a general warning in the chat room or alerting the platform owner/moderator of PP presence. Yet the important point remains that the police officer is less efficient, less likely to gain actionable information in these situations, and thus less effective.

4. Structure and approach of the legal analysis

PrevBOT will access a public chat room, remain, and observe the chats. It will not actively seek contact but may respond to initiatives from others. This may be deemed infiltration, especially if the chat continues one-to-one in a private channel. The legal issues that arise concern the lawfulness of accessing a public forum, passively observing the chats, and engaging in infiltration. These are analyzed in Section 5.

Observation and infiltration by PrevBOT involve automated processing of personal data. The data protection issues are analyzed in relation to the LED and the proposed European Artificial Intelligence Act in Sections 6 and 7.

Technology-neutral criminal provisions and rules regarding policing may be applicable both in physical space and online. The law, however, is more developed concerning acts in physical space, thus making this a suitable starting point for the analysis, assuming that, insofar as police methods can be regarded as similar in their respective physical and online manifestations, the legal application is the same.

5. Issues relating to privacy and fair trial

5.1 Accessing the public chat room

The first question concerns PrevBOT entering a publicly accessible online forum. PrevBOT may enter the forum with a user profile indicating a child, for instance ʻLisa12ʼ or ʻRoger14ʼ. Another option is using a profile indicating an adult. There is even the option to use a profile that discloses the police identity. It is uncontroversial that the police may enter open sources on the internet, as expressed in Article 32(a) of the Cybercrime Convention, further clarified in T-CY (2014) section 3 (see also Lentz, 2019, p. 32). The right to access is not conditional on the disclosure of police identity.

As noted by Lentz (2018; 2019, p. 244) and Rønn (2022, p. 159), the distinction between public and private forums is not always clear due to the semi-private character of many online forums and groups. This might introduce uncertainty in concrete situations as to whether the police may enter and observe. A discussion of these problems is beyond the scope of this article; hence, the analysis assumes that the forum is an open source within the meaning of the law. Insofar as the police officer lawfully could have entered the forum, so may PrevBOT.

5.2 Observation

5.2.1 Observation by the police officer

A police officer present in a public chat forum gains information visually. In terms of police methods, this is ʻobservationʼ. The question is whether police observation of persons in the chatroom interferes with their right to private life as per ECHR Article 8(1): "Everyone has the right to respect for his private and family life, his home and his correspondence." Action by the police must exceed a minimum level of intensity to activate Article 8(1) (Kjølbro, 2017, p. 773). If exceeded, the method is an interference which, to be lawful, must have a legitimate aim and a legal basis, and be necessary and proportionate (Article 8(2)).

The right to private life encompasses, i.a., a right to identity and personal development, and to establish and develop relationships with other human beings and the outside world (Kjølbro, 2017, p. 773). The ECtHR case law has clarified that the protection of private life may also extend to public places, see P.G. and J.H. v UK (2001) where the Court noted that “there is a zone of interaction of a person with others, even in a public context, which may fall within the scope of ʻprivate lifeʼ” (para 56, emphasis added).

ʻPrivate lifeʼ in a public space is closely linked to the personʼs reasonable expectation of privacy, which is a “significant, although not necessarily conclusive, factor” (P.G. and J.H., para 56). However, according to the Court, general observation in physical space does not interfere with the right to private life, for instance,

[a] person who walks down the street will, inevitably, be visible to any member of the public who is also present. Monitoring of the same public scene (for example a security guard viewing through closed-circuit television) is of a similar character, (P.G. and J.H., para 57)

reiterated in Peck v the UK (2003), where CCTV monitoring of an individualʼs movements in a public place was found not to be an interference. Conversely, there may be an interference if the data is recorded systematically or permanently (P.G. and J.H., para 59), or used to follow a personʼs movements (Uzun v Germany (2010); Kjølbro, 2017, pp. 781–782).

The present issue concerns general observation of the forum per se, in contrast to following a personʼs movements (cf. Uzun). By default, participation in the forum exposes everyone to one another. As this is evident to everyone present, any expectation of privacy is effectively eliminated. Moreover, it is common knowledge that user profiles may be misleading, meaning that those present could be anyone, including the police. This leads to the conclusion that a police officerʼs visual observation of such fora does not exceed the threshold for activating Article 8; hence, the conditions of its second paragraph are not applicable.

5.2.2 Observation by PrevBOT

ʻObservationʼ performed by PrevBOT is not visual, as PrevBOT gains information by grabbing and processing the data in the chat room. It was concluded above that, in a chat room, one cannot expect the chat to be visually unobservable, but it does not follow that individuals expect the police to extract more information from their chat than they voluntarily share. The data processed by PrevBOT consist of the content of the chat conversations, user profiles, and electronic traces, such as IP addresses and timestamps. These data are ʻpersonalʼ (LED Article 2(1)) and subject to data protection rules. The objective of data protection rules is "to put individuals in a position to know about, to understand and to control the processing of their personal data by others" (C108+, Explanatory Report, Recital 10); furthermore, "persons should be made aware of risks, rules, safeguards and rights in relation to the processing of their personal data and how to exercise their rights in relation to the processing" (LED, Preamble, Recital 26). However, through biometrics applied to the personal data, PrevBOT extracts information that participants have not voluntarily shared. This raises issues to be discussed in relation to the more detailed rules of the LED in Section 6, although the protection of ECHR Article 8 also extends to personal and biometric data retained by the police (S and Marper v UK (2008), DNA and fingerprints; Gaughran v UK (2020), facial images).

5.3 Infiltration

ʻInfiltrationʼ means a police officer has engaged in a conversation without revealing the police identity (the General Attorney, 2018). At present, the assumption is that PrevBOT uses a profile indicating a child. This could attract the interest of would-be CSEA offenders.

Should the PP perform criminal speech acts, e.g., ʻlascivious speechʼ or ʻgroomingʼ, or attempt to obtain sexualized child images from PrevBOT, these are punishable offenses. The fact that PrevBOT is a computer system and not a child is not exonerating; the acts are still punishable as criminal attempts (Schermer et al., 2016, section 3.5; Sunde, 2019).

The PP may suggest moving to a private channel to improve the chances of succeeding in obtaining sexual ʻservicesʼ from the ʻchildʼ. Depending on the circumstances, the activity in the private chat room could constitute a criminal CSEA attempt against PrevBOT, perhaps also revealing indications that the PP is a serial offender.

5.3.1 Protection against police provocation

One question is whether PrevBOT acts as a ʻlureʼ and whether the legal protection against police provocation is contravened. That protection is activated when the police cause others to commit crimes that would otherwise not be committed (Kjølbro, 2017, p. 606; Ramanauskas v Lithuania (2008), para 55). Police-provoked crimes may not be charged; if charges are nonetheless brought, the court should dismiss the case or acquit the defendant. Evidence obtained by unlawful provocation should be barred from trial.

The question of police provocation becomes relevant only to offenses committed against PrevBOT. This analysis has no bearing on a criminal investigation of the PPʼs crimes against other children, even if opened on the basis of information gained by PrevBOT. Evidence collected in the investigation must be assessed on its own merits.

The protection against police provocation does not preclude the use of undercover methods, including a ʻlureʼ, for obtaining evidence of criminal conduct (Kjølbro, 2017, pp. 637–638; in the context of Sweetie, see Schermer et al. (2016), section 4.8.2). The legal assessment is contextual, but according to the ECtHR case law, there is no provocation if the police officer remains essentially passive, not inducing the crime (Ramanauskas v Lithuania (2008), para 55; Harris et al. (2018), pp. 428–429). The method should be applied only where there is reasonable suspicion of criminal activity, so as to avoid targeting innocent persons. In Eurofinacom v France (2004), the police, to obtain evidence, acted as customers of a website they reasonably suspected of offering prostitution services. As the business was ongoing, the police could not be said to have incited the offense, and there was no violation of Article 6 (see also Volkov and Adamsky v Russia (2015)). A similar rationale might be applied to PrevBOT: PrevBOT first detects Problematic Spaces, and proactive use of PrevBOT in such spaces builds on a reasonable suspicion of CSEA, resembling the Eurofinacom situation.

The crucial question is whether PrevBOT acts in an essentially passive manner. This requires PrevBOT to leave the initiative to the PP and limit itself to answering neutrally. Under such circumstances, an initiative by a PP against PrevBOT could be deemed as being made randomly, and there is no provocation. Put differently, unless PrevBOTʼs presence substantially changes the original situation in the chat room, the approach by the PP was not incited by PrevBOT. Still, a question could be raised as to whether the mere fact that PrevBOT incorrectly poses as a child is provocative, as it opens the door to sexualized initiatives that PPs otherwise would not take. The assessment must take the online forum into account. If the forum is designed for adults, the presence of a ʻchildʼ could be provocative, and the forum should not have been singled out as ʻproblematicʼ in the first place. If children nonetheless frequently participate, the forum may still be ʻproblematicʼ, legitimating the presence of PrevBOT within the limits explained.

Questions may also be raised as to whether the protection against police provocation applies to crime prevention as well. Considering that investigation resources are limited, the use of a preventative message could be favored over prosecution, thus reserving those resources for CSEA cases of a more serious nature. Implicitly, the question concerns whether the legal protection extends to a preventative intervention prompted by a crime that was committed.

The scope of Article 6 is limited to securing that the PP is not unfairly subjected to "the determination … of any criminal charge against him" (para. 1). Direct preventative intervention thus falls outside the scope of Article 6 (Kjølbro, 2017, p. 465; Harris et al., 2018, p. 376). Instead, ECHR Article 8 is applicable, probably affording protection against police provocation along the same lines as Article 6. If these limits are respected, a preventative message may be submitted. Beyond these limits, the method undoubtedly interferes with ʻprivate lifeʼ, which, i.a., includes a right to be free from unwanted attention from the police (Kjølbro, 2017, p. 773). Such arbitrary practice is not justified under Article 8(2) and hence would violate the right to private life.

5.3.2 Protection against self-incrimination

Offenses against PrevBOT are documented in the chat logs. If the case moves towards a criminal trial, it is vital that the chat logs can be used as evidence; without them, the case collapses because the burden of proof cannot be met. The question is whether the protection against self-incrimination (Article 6) prevents the chat logs from being used as evidence in a criminal trial because they are deemed to constitute covert police interviews. The legal protection against self-incrimination is essentially a right to remain silent and not to assist in oneʼs own conviction (Kjølbro, 2017, pp. 637–638; Harris et al., 2018, p. 423). This underpins the presumption of innocence and reduces the risk of pressure and misunderstandings in the imbalanced situation of a police interview. Hence, a police interview shall not be performed without the suspectʼs knowledge and without the suspect having been informed of the procedural rights.

Contextual elements are important to the assessment. A PP who, unprovoked, commits the crimes described in this section has not been ʻinterviewedʼ, as an interview requires some manipulation or initiative on the part of the police. Assuming PrevBOT acts passively, the interaction is not a covert police interview, and the chat logs can be used as evidence.

The protection against self-incrimination is not an issue with respect to information from the PP that indicates offenses committed against other children. Such information is a lead that can be acted upon by the police in the further investigation of the offenses. The evidence would then have to be obtained by other investigative steps, including a police interview where the suspect is informed of his or her rights.

6. Data protection issues relating to biometrics

PrevBOT is a computer system for automated processing of personal data, falling under the scope of the LED as transposed into national legal systems. PrevBOT must comply with the legal framework of the LED as a whole, but at present only issues concerning PrevBOT as a biometric technology are discussed.

6.1 The problem

The LED Article 10 imposes stricter conditions for processing of special categories of personal data than personal data in general. The categories in Article 10 include, i.a., “biometric data for the purpose of uniquely identifying a natural person”, further discussed below in relation to PrevBOT.

6.2 Biometric data

In a study for the European Parliament, Fuster and Peeters (2021, p. 6) note that biometric technologies utilize various body characteristics. These are ʻsourcesʼ or ʻreference measuresʼ of biometric data because they “can be used for the collection of personal data” (Fuster & Peeters 2021, p. 6, with reference to Opinion 3/2012 of the Article 29 Working Party). Further, the fact that biometric data are “by definition enabling the unique identification of individuals, is of particular value” (Fuster & Peeters 2021, p. 6).

For PrevBOT, the relationship between personal/biometric data and the source needs conceptual clarification. As a start, biological material such as blood samples or human tissue is illustrative: the material is the source, affording a possibility to collect personal data from it. Because the data are extracted from unique bodily characteristics, they are also biometric. The biological material per se is not personal data (Fuster & Peeters, 2021, p. 6, with reference to the Article 29 Working Party, 2012: "sources of biometric data shall not be considered biometric data themselves"). In the case of PrevBOT, the chat logs are the source from which the biometric data are collected through technical processing. The chat logs are also personal data because the content relates to a person who can be identified. The content probably provides personal details as well, as the purpose is to get in contact with others. The chat logs in visual form are thus personal data.

The source of the biometric data is not the visual content but the writing style of the participant. This is a behavioral characteristic, like gait, a handwritten signature, or the way of using a keyboard or mouse. PrevBOT analyses writing style by means of machine learning applied to Authorship Analysis, and the output is biometric data.
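
As a simplified illustration of a writing-style feature, consider character n-gram frequencies, a common baseline in stylometry. Real Authorship Analysis models combine far richer features and learned representations, so the sketch below should not be read as the PrevBOT method.

```python
# A baseline stylometric feature, not the PrevBOT model: relative
# frequencies of the most common character trigrams in a chat text.
from collections import Counter

def char_ngram_profile(text: str, n: int = 3, top: int = 50) -> dict[str, float]:
    """Map the `top` most frequent character n-grams to their relative
    frequencies; such profiles can serve as crude writing-style vectors."""
    text = text.lower()
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values()) or 1
    return {gram: count / total for gram, count in grams.most_common(top)}
```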

The LED qualifies the definition of biometric data to comprise only such data that “allow or confirm the unique identification” of a person (Article 3(13); see also Article 10). Hence for Article 10 to be relevant to PrevBOT, the output must allow or confirm the unique identification of the participant to the chat conversation. The question is whether this is the case or not.

6.3 The linguistic fingerprint

PrevBOTʼs linguistic fingerprints do not ʻconfirmʼ identity, as the identities of participants in chat rooms are unknown to the police, but a match may ʻallowʼ identification of a person. However, the identification must also be ʻuniqueʼ, which raises the question of whether the match must be 100 percent accurate. False positives may “identify” more persons than one, in which case the data do not uniquely identify the correct person. Does this render Article 10 inapplicable? Arguably, the application of such logic would contradict the very purpose of the directive, i.e., to provide “a high level of protection of personal data” (Preamble, Recitals 4 and 7). Motivated by the great potential for person-oriented mass surveillance and risk of harm to fundamental rights by police use of biometric data, the LED aims at securing an even higher level of protection of such data. Against the backdrop of these risks, highlighted in the general debate relating to AI and biometrics and thoroughly discussed by Fuster & Peeters (2021), to interpret the directive as providing weaker protection because the technology cannot eliminate false positives is genuinely illogical. In comparison, physical fingerprints also involve a risk of false positives (see interesting examples offered by Searston (2019)) yet are nonetheless included in Article 3(13) as an example of biometric data and hence protected by Article 10. As PrevBOTʼs identification function is comparable to physical fingerprints, the protection must be the same. Consequently, it may be used only when “strictly necessary” for the purposes mentioned in Article 10 litra a to c, subject to “appropriate safeguards”.

The conditions for processing special categories of personal data in Article 10 are addressed rather cursorily here, as a detailed examination is beyond the scope of the article. First, strict necessity must be assessed relative to the characteristics of the problem at hand. The PrevBOT concept was born in recognition of the shortcomings of current police strategies against CSEA, and, so far, less intrusive methods promising to achieve the same goal are not available. Importantly, it was demonstrated that PrevBOT affords the police capabilities that are not available through personal faculties (vision), without which a police officer is not in a position to recognize a former CSEA convict who has resumed the activity of approaching children for sexual purposes. Second, appropriate safeguards are of course needed, some of which are indicated in the LED, Preamble, Recital 37. Third, the use of linguistic fingerprints is limited to the purposes mentioned in Article 10 litra a to c, and of these, litra b seems to fit: "to protect the vital interests … of another natural person," e.g., children at risk of online CSEA.

6.4 Categorization of age and/or gender

The purpose of PrevBOTʼs categorizations of age/gender is to detect indications of CSEA risk, not to uniquely identify a person. PrevBOTʼs intervention strategy shows this, the plan being to submit preventative messages to PPs who remain anonymous. Yet, in the age of AI and high-powered data computations, the possibility that the classification could narrow down a group of people and thus indirectly help ʻallowʼ the unique identification of a person cannot be excluded. Still, the data are not biometric within the meaning of the LED, as in this regard the original purpose for which the data are processed is decisive (Fuster & Peeters, 2021, pp. 24–25). Put differently, for the data to be biometric, the purpose of the processing must be to uniquely identify a person. As this is not the case for PrevBOTʼs categorizations, the data are not covered by Article 10.

Article 10 also includes the processing of ʻdata concerning a natural personʼs … sexual orientationʼ. A PP is targeted because of the risk of committing a CSEA crime. Indirectly, the police have assumed something about the PPʼs sexual orientation, although this is not expressed by the data extracted by PrevBOT, which concern only age and gender and as such fall outside the scope of Article 10. Inferences about the PPʼs sexual orientation require contextual data, such as participation in a Problematic Space, seeking contact with children, and misrepresentation of age/gender. These data are processed not by PrevBOT but by the police officer. It is therefore unclear whether PrevBOTʼs categorization falls under the scope of Article 10. If it does, the assessment should be the same as set out above.

6.5 Automated processing of personal data

As PrevBOTʼs output is produced automatically by machine learning, the applicability of LED Article 11 becomes an issue. This can be addressed quite briefly, as the provision only prohibits decisions based "solely" on automated processing (where the decision produces "an adverse legal effect concerning the data subject or significantly affects him or her"). PrevBOT is not conceptualized as fully autonomous, neither in the decision to submit a preventative message nor in the decision to open a criminal investigation. Instead, it acts at the order of a human operator whose decision is based on the merits of the available information (see Section 3.2). By securing meaningful human involvement in the decision-making, PrevBOT does not fall under the scope of Article 11 (cf. Enarsson et al. (2021), section 3.1, in relation to GDPR Article 22, which applies mutatis mutandis to LED Article 11; and Sajfert & Quintel (2018)).
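
The decisive design point, namely that PrevBOT proposes while the human operator decides, can be expressed schematically as follows; the interface is hypothetical.

```python
# Schematic human-in-the-loop gate (hypothetical interface): PrevBOT's
# output is only a proposal; no intervention occurs without the operator.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Prediction:
    subject: str       # pseudonymous chat handle, not an identified person
    kind: str          # e.g. "identification" or "categorization"
    confidence: float

def decide_intervention(prediction: Prediction,
                        operator_review: Callable[[Prediction], bool]) -> bool:
    """Return True only if the human operator, reviewing the prediction on
    the merits of the available information, approves an intervention."""
    return operator_review(prediction)
```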

The analysis has shown that the LED does not preclude the use of PrevBOT but only prescribes conditions for its lawful use. Importantly, a purpose-specific legal framework for the repository of reference linguistic fingerprints and for the biometric data processing must be established. The framework could be inspired by the corresponding regulation of DNA and fingerprints. Furthermore, safeguards must be established to ensure that no data are retained other than those giving reason for intervention. All data processed but not acted upon should be immediately discarded. With a suitable legal framework along these lines in place, PrevBOT may be realized within the LED framework.
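
The retention safeguard suggested here could be operationalized along the following lines; the record structure is assumed for the purpose of illustration.

```python
# Sketch of the proposed safeguard: keep only records that gave reason for
# intervention; all other processed data are discarded immediately.
def apply_retention(records: list[dict]) -> list[dict]:
    retained = [r for r in records if r.get("intervention_decided")]
    # Everything not acted upon is dropped here and never written to storage.
    return retained
```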

7. The proposed AIA

The European Union is moving progressively towards a regulation laying down harmonized rules on AI (the Artificial Intelligence Act, 2021 (AIA)). The AIA takes a ʻrisk-based approachʼ, according to which certain uses of AI systems are prohibited, others are ʻhigh-riskʼ and subject to "specific requirements", and the rest are ordinary AI systems (Article 1).

7.1 The ban on real-time biometric identification systems in publicly accessible spaces

The AIA prohibits the use of “ʻreal-timeʼ biometric identification systems in publicly accessible spaces by law enforcement,” except in three “exhaustively listed and narrowly defined situations” (Preamble, Recital 19) (Article 5.1.d (i)–(iii)). Recital 18 explains the reasons for the prohibition:

[Such use] is considered particularly intrusive in the rights and freedoms of the concerned persons, to the extent that it may affect the private life of a large part of the population, evoke a feeling of constant surveillance and indirectly dissuade the exercise of the freedom of assembly and other fundamental rights. In addition, the immediacy of the impact and the limited opportunities for further checks or corrections in relation to the use of such systems operating in ʻreal-timeʼ carry heightened risks for the rights and freedoms of the persons that are concerned by law enforcement activities.

PrevBOTʼs linguistic fingerprint function is a ʻreal-time biometric identification systemʼ within the meaning of the AIA (see Article 3(33), (36) and (37)) and is thus encompassed by the prohibition. However, the prohibition concerns ʻpublicly accessible spacesʼ, a notion which comprises only "physical" places (Article 3(39)), leaving out online spaces. This is surprising, since the reasons for the prohibition (above) arguably are relevant to online forums as well. However, the Preamble, Recital 9, emphasizes that "[o]nline spaces are not covered as they are not physical spaces." As per the current text of the AIA, PrevBOTʼs identification function thus falls outside the scope of the prohibition. What remains, then, is to consider PrevBOT in relation to the provisions on ʻhigh-riskʼ AI systems.

7.2 High-risk AI systems

Pursuant to Article 6 of the proposed AIA, the AI systems referred to in Annex III shall be considered ʻhigh-riskʼ. Any function of PrevBOT covered by Annex III must comply with a comprehensive set of conditions pertaining to ʻhigh-riskʼ AI systems in the AIA Title III. Annex III, Points 1 and 6, are relevant to PrevBOT. Point 1 concerns biometric identification systems "intended to be used for the ʻreal-timeʼ … biometric identification of natural persons without their agreement." PrevBOTʼs real-time identification function fits this description and is hence high-risk.

Biometric categorization systems per se were originally included in Point 1 but were later removed by the Council. For PrevBOTʼs categorization function to be ʻhigh-riskʼ, it must be covered by an alternative in Point 6 litra a to f (law enforcement), and of these, litra a, e and f may be relevant. The "intended use" is decisive, and, as one may recall, one purpose is to predict the existence of sexualized speech in a chat room, relevant for classifying a forum as a Problematic Space; another is to assign persons participating in Problematic Spaces to age and/or gender categories, which (depending on the context) may define them as PPs who may be targeted by a preventative message.

As Point 6 litra a, e and f address uses of AI by the police intended to expose persons, they seem not applicable to the categorization of sexualized speech. Although speech is generated by persons, this is not decisive, the point being that the processing does not target any person but rather helps the operator to assess whether the online forum is a Problematic Space. The categorization of speech is therefore not high-risk.

Categorizations of age and gender, on the other hand, concern persons. Point 6 litra a addresses AI systems "intended to be used … for making individual risk assessment of natural persons in order to assess the risk for … offending … ." The categorization may lead the person to be regarded as ʻproblematicʼ. This means that the person has been subjected to an "individual risk assessment" concerning "the risk for offending", entailing the conclusion that this part of the categorization function is covered by litra a and is thus ʻhigh-riskʼ. There is therefore no need to consider litra e and f.

8. Conclusion

The analysis has shown that PrevBOT may be used by the police for preventing online CSEA. An important point is that neither the LED nor the proposed AIA (as put forward by the Commission) hinders the implementation of PrevBOT. What is required is a comprehensive legal framework that defines and clearly delimits its purpose and the conditions for use, while also establishing adequate safeguards guaranteeing the fundamental rights of exposed persons.

As expected, PrevBOT must comply with a detailed regulatory framework at the national level. Currently, that framework might not be in place, but this article has clarified how the legislator is at liberty to develop the law for the purpose of preventing CSEA through the use of PrevBOT. The analysis also shows that using PrevBOT does not contravene the fundamental rights to private life and a fair trial. Provided the scientific research relating to authorship analysis and AI delivers on its promise, the police may lawfully be given a new tool in their efforts against the CSEA problem.

The Council of Europe

C108+ Protocol amending the Convention for the protection of individuals with regard to automatic processing of personal data. 10.10.2018 (CETS 223)
Cybercrime Convention (2001) Convention on Cybercrime. 23.11.2001 (ETS 185).
ECHR The European Convention on Human Rights. 4.11.1950 (ETS 5).
ECPE (2001) The European Code of Police Ethics. Recommendation Rec (2001)10, the Council of Europe (Committee of Ministers). 19.9.2001.
T-CY (2014) T-CY Guidance Note # 3 Transborder access to data (Article 32). The Cybercrime Convention Committee. 3.12.2014.

The European Union

CFREU The EU Charter of Fundamental Rights. 26.10.12 (2012/C 326/02).
GDPR The General Data Protection Regulation (2016/679).
LED The EU Law Enforcement Directive (2016/680).
AIA Proposal for a Regulation of the European Parliament and of the Council laying down harmonized rules on artificial intelligence – 21.4.2021 COM(2021) 206 final.

Norwegian legal sources

The Norwegian Police Act, 4.8.1995/ 53.

The General Attorney (2018) Circular 2/2018 of the Norwegian General Attorney in Criminal Matters concerning ʻInfiltration and provocation as methods in criminal investigationʼ.

References

Broome, L. J., Izura, C., & Davies, J. (2020). A psycho-linguistic profile of online grooming conversations: A comparative study of prison and police staff considerations. Child Abuse & Neglect, 109. https://doi.org/10.1016/j.chiabu.2020.104647
Enarsson, T., Enqvist, L. & Naarttijärvi, M. (2021). Approaching the human in the loop – legal perspectives on hybrid human/algorithmic decision-making in three contexts. Information & Communications Technology Law. https://doi.org/10.1080/13600834.2021.1958860
Fuster, G. G. & Peeters, M. N. (2021). Person identification, human rights and ethical principles – rethinking biometrics in the era of artificial intelligence. European Parliamentary Research Service, PE 697.191, December 2021. https://doi.org/10.2861/384495
Halnes, Y. & Ugelvik, S. (2019). Privat provokasjon: Strafferettslige og straffeprosessuelle konsekvenser av privatpersoners utfordrende etterforskingsvirksomhet. Tidsskrift for strafferett, 19(3), pp. 258–287. https://doi.org/10.18261/issn.0809-9537-2019-03-02
Harris, D., OʼBoyle, M. & Warbrick, C. M. (2018). Law of the European Convention on Human Rights. Fourth edition. OUP.
Kjølbro, J. F. (2017). Den europæiske menneskerettighedskonvention for praktikere. Fourth edition. DJØF.
Lentz, L. W. (2018). ʻHackingʼ og det digitale privatliv. Juristen, 4/2018, pp. 141–153.
Lentz, L. W. (2019). Politiets hemmelige efterforskning på internettet. PhD thesis. Aalborg University.
Loader, I. (2020). Revisiting the Police Mission. Insight Paper 2. The Police Foundation, UK.
Lomell, H. M. (2020). Selvoppnevnte rettshåndhevere: Om fremveksten av "pedojegere" på nett. Tidsskrift for Rettsvitenskap, 133(5), pp. 660–690. https://doi.org/10.18261/issn.1504-3096-2020-05-03
Lorenzo-Dus, N., Kinzel, A., & Di Cristofaro, M. (2020). The communicative modus operandi of online child sexual groomers: Recurring patterns in their language use. Journal of Pragmatics, 155, pp. 15–27. https://doi.org/10.1016/j.pragma.2019.09.010
Muir, R. (2021). Taking Prevention Seriously: The Case for a Crime and Harm Prevention System. The Police Foundation, UK.
Nilssen, I. D. (2021). Den strafferettslige reguleringen av seksuell utpressing. Bør vi ha et straffebud om utpressing med seksuelt motiv? Masterʼs thesis (unpublished), 25.11.2021. Universitetet i Oslo.
Rønn, K. V. (2022). A Professional Code of Ethics. In: Stenslie, S., Haugom, L., & Vaage, B. H. (eds.) Intelligence Analysis in the Digital Age. Routledge, pp. 151–162.
Sajfert, J. & Quintel, T. (2018). Data Protection Directive (EU) 2016/680 for Police and Criminal Justice Authorities. In: Cole, M. & Boehm, F. (eds.) GDPR Commentary. Edward Elgar Publishing Ltd.
Schermer, B. W., Georgieva, I., Van der Hof, S. & Koops, B.-J. (2016). Legal Aspects of Sweetie 2.0. Report, Leiden, 3 October 2016. Leiden and Tilburg universities.
Stokols, D. (2018). Social Ecology in the Digital Age: Solving Complex Problems in a Globalized World. London: Academic Press.
Sunde, N. & Sunde, I. M. (2021). Conceptualizing an AI-based Police Robot for Preventing Online Child Sexual Exploitation and Abuse: Part I – The theoretical and technical foundations of PrevBOT. Nordic Journal of Studies in Policing, 2/2021. https://doi.org/10.18261/issn.2703-7045-2021-02-01
Sunde, I. M. (2019). Sweetie, et politibarn eller en politistyrke på nett? In: Sunde, I. M. & Sunde, N. (eds.) Det digitale er et hurtigtog – Vitenskapelige perspektiver på politiarbeid, digitalisering og teknologi, pp. 177–205. Bergen, 2019.
Sunde, I. M. (2020). Fra grooming til seksuell utpressing: Behov for mer effektivt vern mot nettovergrep. Tidsskrift for strafferett, 2/2020, pp. 129–145. https://doi.org/10.18261/issn.0809-9537-2020-02-01
Sunde, I. M. et al. (2022). Nettprat-prosjektet – bruk av nettprat-bevis som treningsdata til en lærende KI-modell. PHS Forskning 5:2022. https://hdl.handle.net/11250/3023474
Terre des Hommes (2013). Webcam Child Sex Tourism – Becoming Sweetie: A Novel Approach to Stopping the Global Rise of Webcam Child Sex Tourism. Terre des Hommes, The Netherlands.
Terwangne, C. de (2021). Council of Europe Convention 108: A modernised international treaty for the protection of personal data. Computer Law & Security Review, 40, April 2021. DOI: 10.1016.

Notes

1. C108+ enters into force in 2023, provided it has at least 38 Parties.
2. In the conceptual universe of biometrics, linguistic fingerprints are behavioral biometric data, whereas ordinary fingerprints are physiological biometric data (https://justaskthales.com/en/what-are-physiological-biometrics).
3. The procedural solution could be dismissal of the case or acquittal. In Norway, doctrine has it that the defendant should be acquitted (the Supreme Courtʼs decision HR-2011-2105-A, and the General Attorney (2018), section III.2.1).
4. The Article 29 Working Party is the forerunner of the European Data Protection Board (EDPB).