CNIL recommendations #135

Open
DimitriPapadopoulos opened this issue Jul 4, 2022 · 16 comments

Comments

@DimitriPapadopoulos
Contributor

DimitriPapadopoulos commented Jul 4, 2022

For what it's worth, we have tried to apply these Open Brain Consent documents to an AI challenge. From what I gather:

  • The French CNIL disagrees with the very first sentence of the DUA, and believes recipients should be data processors, not data controllers:

    I become the data controller (as defined under the GDPR)

  • The rest is more relevant to the PIA and outside the strict scope of the Open Brain Consent.
  • They refer to the different guidelines of the EDPB. Do you know which ones are relevant to brain imaging?
DimitriPapadopoulos changed the title from "CNIL" to "CNIL recommendations" on Jul 4, 2022
@DimitriPapadopoulos
Contributor Author

DimitriPapadopoulos commented Jul 13, 2022

As far as I can see, the idea that recipients cannot be data controllers stems from the MR-001 frame of reference:

Responsables de traitement concernés [Data controllers concerned]

Le promoteur de la recherche. [The sponsor of the research.]

Taken in a strict and restrictive sense, it means that only the sponsor of a study can be a data controller for all data processed in the study. No one else. I do find this restrictive interpretation excessive, but I am not a lawyer, and most lawyers I know are cautious and tend to interpret it in this restrictive way to be on the safe side.

@yarikoptic
Member

@con/gdpr savvy folks, WDYT?

@CPernet
Contributor

CPernet commented Jul 15, 2022

In this context, I can see why the CNIL said 'processor' and I'd agree with that.
When your DUA gives people data controller power, they can do all sorts of things within the limits of the DUA. As data processors, they can only do what the controller specifies, which in your case is analysing the data for the AI challenge. Totally logical IMO.

@CPernet
Contributor

CPernet commented Jul 15, 2022

'it means that only the sponsor of a study can be a data controller for all data processed in the study' is not correct: if you are going to share in a controlled-access repository, transferring data controller rights will work.

@robertoostenveld
Contributor

The EU clarification states that the processor works on behalf of the controller. That might be applicable in the case of the AI challenge, where the stated challenge defines what the recipients of the data can and cannot do.

Open Access implies that the recipient of the information has the freedom to do whatever they want (under certain restrictions). The OBC-GDPR-DUA 1.0.0 template aims to pass as many Open Access rights to the recipient of the data as possible, except for the right to identify and the right to redistribute.

I should add that at my university the legal and data management teams are also struggling with the transfer of data from one party to another, especially when the parties are different legal entities. I now tend to think of it as a chain of contracts: the informed consent is on the left side of the chain (at the start) and the data use agreement for the final recipient (a person who downloads) is on the right side, but additional contracts might sit in between, for example to transfer the data from the department listed on the ethics application (i.e., the legal entity representing the researcher who acquired the data) to an external entity that is responsible for the archive. Along the way, contractual constraints can be added, but not removed. The OBC-GDPR-DUA 1.0.0 tries to pass data to the end point with as few constraints as possible, except for those defined in the informed consent.

@CPernet
Contributor

CPernet commented Jul 15, 2022

@robertoostenveld note that in the private sector, it happens quite often that one becomes one of the data controllers (of part of the data, like here, as the authors usually keep a pseudonymization key) and at the same time is a data processor (i.e. does a specific job for someone). An example of that would be companies that deal with your credit info: as you do a transaction they validate stuff (= processor), but they also keep records for fraud detection (= controller).

In the AI challenge discussed above, this also means that once the challenge is done, people should get rid of the data because the processing is finished, which is not what scientific data sharing is about (but again, in that context it made sense).

@DimitriPapadopoulos
Contributor Author

DimitriPapadopoulos commented Jul 15, 2022

Yet, the CNIL and the lawyers I have talked to seem to agree that the following sentence means that the sponsor of a study is the only data controller, not only in the context of this specific AI challenge, but in general in all studies following the MR-001 frame of reference, which in France means almost all neuroimaging studies:

Responsables de traitement concernés [Data controllers concerned]

Le promoteur de la recherche. [The sponsor of the research.]

For what it's worth, in most if not all research studies we participate in these days, the sponsors try to ensure that they are the only data controller, with other research partners being data processors. That's of course a way for them to "own" all the data, including data acquired or processed by other research partners/sites.

@robertoostenveld
Contributor

In that case the sponsor of the study is the only one who can share data with the participants of the AI challenge. If the challenge organizers are not part of the sponsor organization, they cannot share.

@DimitriPapadopoulos
Contributor Author

DimitriPapadopoulos commented Jul 15, 2022

That's not the point. In this case the sponsor is part of the challenge organizers.

The point is that the CNIL appears to disagree with the DUA not only for this specific AI challenge, but also for other data transfer situations.

@robertoostenveld
Contributor

Please note that the "OBC-GDPR-DUA 1.0.0" is in itself not a finalized DUA, it is a template for a DUA. The idea of a template is that it can be adjusted to the needs of the situation and the parties involved, which in this case seems quite feasible with sed s/controllers/processors/g.

The CNIL having its own opinions and views in general is something that I don't think can be addressed through this GitHub issue.
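As a rough illustration of the substitution suggested above, here is a minimal Python sketch. The function name `swap_role` is hypothetical and not part of the Open Brain Consent repository. Unlike the plain sed one-liner, it also handles the singular form and capitalised occurrences. It only rewrites wording: under the GDPR a processor arrangement also has to set out things like the purpose and duration of the processing, so the resulting text would still need legal review.

```python
import re

# Hypothetical helper, for illustration only: swaps the role name in a DUA text.
_REPLACEMENTS = {"controller": "processor", "controllers": "processors"}


def swap_role(text: str) -> str:
    """Replace 'controller(s)' with 'processor(s)', preserving capitalisation."""

    def repl(match: re.Match) -> str:
        word = match.group(0)
        new = _REPLACEMENTS[word.lower()]
        if word.isupper():
            return new.upper()
        if word[0].isupper():
            return new.capitalize()
        return new

    # \b keeps the match to whole words only.
    return re.sub(r"\bcontrollers?\b", repl, text, flags=re.IGNORECASE)


sentence = "I become the data controller (as defined under the GDPR)"
print(swap_role(sentence))
# prints: I become the data processor (as defined under the GDPR)
```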

@yarikoptic
Member

yarikoptic commented Jul 15, 2022

Please note that the "OBC-GDPR-DUA 1.0.0" is in itself not a finalized DUA, it is a template for a DUA.

hm... then we might want to make it more explicit and formalize instructions on how people should extend the version (and maybe keep a CHANGELOG) to describe custom changes they introduced to the DUA (beyond filling in the "editable fields" we allocated). E.g. the version could become OBC-GDPR-DUA 1.0.0+cnil1 or similar.

edit: it is somewhat important to ease identification of the document. E.g. whenever people refer to the BSD-3 license, they know that it is not modified in any way beyond that formulation, except maybe for the copyright owner named in its wording etc. The same point applies here. Anecdotal case: there once was a piece of software which said "We use an open-source license (I forgot if it was MIT or BSD-3)!" but when I looked at it, every permitted use was prepended with "NOT", so it really was the opposite of the originally used license ;-)
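To make the identifier idea concrete, here is a small Python sketch of how such identifiers could be parsed. It assumes the "+suffix" convention from the example above, which is only a suggestion in this thread and has not been adopted. The names `DuaId`, `parse_dua_id` and `is_unmodified` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class DuaId(NamedTuple):
    name: str             # template name, e.g. "OBC-GDPR-DUA"
    version: str          # upstream template version, e.g. "1.0.0"
    local: Optional[str]  # local modification tag, e.g. "cnil1", or None


# Assumed format: "<NAME> <MAJOR.MINOR.PATCH>[+<local tag>]"
_PATTERN = re.compile(
    r"^(?P<name>[A-Z][A-Z-]*) "
    r"(?P<version>\d+\.\d+\.\d+)"
    r"(?:\+(?P<local>[0-9A-Za-z.-]+))?$"
)


def parse_dua_id(identifier: str) -> DuaId:
    """Split a DUA identifier into template name, version and local tag."""
    match = _PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a recognised DUA identifier: {identifier!r}")
    return DuaId(match["name"], match["version"], match["local"])


def is_unmodified(identifier: str) -> bool:
    """True if the identifier carries no local modification tag."""
    return parse_dua_id(identifier).local is None


print(parse_dua_id("OBC-GDPR-DUA 1.0.0+cnil1"))
print(is_unmodified("OBC-GDPR-DUA 1.0.0"))
```

With a convention like this, a reader could tell at a glance whether a DUA is the stock template or a locally adapted variant whose CHANGELOG should be consulted.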

@CPernet
Contributor

CPernet commented Jul 18, 2022

+1 with Yarik, I'll have to think about that and have some updates running.
I do not think that the CNIL is against having other people become controllers in general, BUT the referenced document, which deals with biomedical data (i.e. what we do here), does indicate that whatever you want to do, including sharing with the US, is possible but the sponsor is always responsible, which is indeed not compatible with transferring controller rights.

tagging @cmaumet for additional thoughts ...

@cmaumet

cmaumet commented Jul 18, 2022

Thanks for tagging me @CPernet, I'll check with Elise and Anne, who worked on the French translation, to get their views.

@DimitriPapadopoulos
Contributor Author

DimitriPapadopoulos commented Jul 18, 2022

Also, if recipients of the data are only data processors, the data controller must produce a DPIA (Data Protection Impact Assessment, which the CNIL calls a PIA). It enforces strict guidelines, which may not be entirely compatible with whatever IT security rules are applied locally by the recipients' organisations, and may prevent recipients from processing data in the cloud or in computing centres. While that sounds reasonable from a privacy point of view, in my limited experience, the sheer complexity of the situation may put an end to projects.

Additionally, I have no clear written proof of that, but after discussing our project with the CNIL, I believe they want to prevent publication of biomedical data, in the following sense: they don't want data to be downloaded by recipients ("send data to the algorithms"); instead they want recipients to have restricted access to secure systems hosting the data ("bring the algorithms to the data"). Only aggregate or fully anonymised data may be exported. They even have a name, and very strict regulatory rules, for such systems: entrepôt de données de santé (health data warehouse). Again, that sounds reasonable from a privacy point of view, but in practice there are issues with such systems:

  • limited set of existing tools to process data (specific version of very standard tools),
  • auditing costs of bringing new tools, even in the form of sandboxed containers,
  • intrinsic limits to the system: specific operating system, storage and processing capacity, storage and processing type (CPU vs. GPU for example),
  • difficulty of including external clouds and computing centres in the regulated ecosystem,
  • current rules require that datasets be copied to a distinct workspace for each user/project, which can be a problem if datasets are even a fraction of the size of UK Biobank, at least with standard storage systems,
  • ...

I'm not saying it is impossible to share biomedical data, but the sheer complexity of it is a major obstacle, and the issues listed above are far from having been addressed at a national, EU, or international level. A solution might be the advent of national nodes for health data (such as the Health Data Hub in France), and EU and international infrastructures (such as the European Health Data Space). However, I am concerned that constraining research data to such a limited set of platforms will hinder research.

@annahespel

I do not understand the position of the CNIL, unless they considered that the split between controllers seemed artificial and was perhaps motivated by an attempt to circumvent a point of regulation.
Some recent alerts have been issued to warn about this. A typical and unfortunately frequent case: an industrial company asks a health care institution to act as controller in order to screen and inform patients of interest, then to transfer the data and treat the company as a new controller. The industrial company then carries out the statistical analyses, interprets them and shares the data with its own partners. Of course, this is hidden subcontracting.

Otherwise, out of context, it's hard to understand. The data controller is the entity that determines the purposes and the means of processing personal data. In this case, as verified with the CNIL, the initial data controller transfers data to a new data controller and is responsible for informing the data subjects, for the security of the transfer and for verifying the legality of the re-use by the new data controller. It follows from this threefold obligation that the initial controller imposes constraints such as those proposed in our document on the new controller, who nevertheless becomes free to choose the purposes and means of the new use, provided those constraints are respected.

@CPernet
Contributor

CPernet commented Jul 26, 2022

@DimitriPapadopoulos can you share your DUA paragraph with 'data processor'? To be able to do that, according to the GDPR, you would have had to add the purpose of processing and a period after which data should be deleted (or state that deletion happens once processing is done).

I also found that strange, but if that's what the CNIL said, then it's up to French people to complain ...


6 participants