Personal data design considerations #31
Comments
@ellisonbg, thank you for opening this discussion!
personal data
I think #30 handles both of these criteria (ignoring the arbitrary choice of sensitivity level names for now).
lawful basis
This is a great point! I see two ways we could address this with the current design:
|
Thanks, this helps. I can imagine two use cases that you are getting at here:
It sounds like your current approach is doing (1). Do you think it is viable to use the idea here and cover both use cases: allow the operator to pick the sensitivity level and which "mode" they want? If they pick mode (2), filter out entire messages rather than properties. Does this make sense? In terms of having templates for different lawful bases, I like that mental model; the operator configuration would boil down to:
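As a rough illustration of the two modes mentioned in this comment (not the configuration list the comment refers to), here is a minimal sketch. It assumes each property carries a sensitivity level. The level names, the mode names `"properties"` and `"messages"`, and the function `filter_event` are all hypothetical and are not taken from #30.

```python
# Hypothetical sensitivity levels, ordered from least to most sensitive.
LEVELS = ["unclassified", "confidential"]


def _rank(level):
    """Return the position of a level in the ordering above."""
    return LEVELS.index(level)


def filter_event(event, property_levels, allowed_level, mode):
    """Filter one event according to the operator's chosen level and mode.

    mode "properties": drop only the properties above the allowed level.
    mode "messages":   drop the whole event if any property is above it.
    Properties with no declared level are treated as the most sensitive.
    """
    limit = _rank(allowed_level)
    too_sensitive = {
        key for key in event
        if _rank(property_levels.get(key, LEVELS[-1])) > limit
    }
    if mode == "properties":
        return {k: v for k, v in event.items() if k not in too_sensitive}
    if mode == "messages":
        return None if too_sensitive else dict(event)
    raise ValueError(f"unknown mode: {mode!r}")


event = {"action": "open", "notebook_name": "taxes.ipynb"}
levels = {"action": "unclassified", "notebook_name": "confidential"}

print(filter_event(event, levels, "unclassified", "properties"))
print(filter_event(event, levels, "unclassified", "messages"))
```

Under these assumptions, the operator's configuration reduces to two choices: a sensitivity level and a mode.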
|
This looks great! There was one thing I had a question about. I'm imagining two sets of events: one at the "unclassified" level (which everyone could see) and another at the "confidential" level where access is restricted to a set of administrators. How will the permissions be set such that only the administrators can see the confidential information? More generally, how should permissions be controlled on the events output by the handlers? |
#30 addresses your point exactly. Operators can set the sensitivity level in each handler by setting an
In your example case, you can set up two handlers: one for administrators and one for everyone to see. The administrator event log includes confidential data by setting its event_level to 'confidential'. |
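The two-handler setup described here could look roughly like the sketch below. It is not the API from #30: the `ListHandler` class, the `redact` helper, and the level ordering are hypothetical stand-ins, and only the `event_level` name and the 'confidential' value come from the comment. Access control on each handler's destination (for example, file permissions on the administrator log) is outside the scope of this sketch.

```python
# Hypothetical sensitivity levels, ordered from least to most sensitive.
LEVELS = ["unclassified", "confidential"]


def redact(event, property_levels, event_level):
    """Keep only the properties at or below the handler's event_level."""
    allowed = LEVELS.index(event_level)
    return {
        key: value
        for key, value in event.items()
        # Properties with no declared level are treated as most sensitive.
        if LEVELS.index(property_levels.get(key, LEVELS[-1])) <= allowed
    }


class ListHandler:
    """Toy handler that stores redacted events in memory."""

    def __init__(self, event_level):
        self.event_level = event_level
        self.records = []

    def emit(self, event, property_levels):
        self.records.append(redact(event, property_levels, self.event_level))


# One handler for everyone, one for administrators.
public_handler = ListHandler(event_level="unclassified")
admin_handler = ListHandler(event_level="confidential")

event = {"action": "open", "notebook_name": "taxes.ipynb"}
levels = {"action": "unclassified", "notebook_name": "confidential"}

for handler in (public_handler, admin_handler):
    handler.emit(event, levels)

print(public_handler.records)
print(admin_handler.records)
```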
@ellisonbg Absolutely. If we merge #30, we can easily craft a PR for "modes". I think #30 provides finer-grained sensitivity control, while "modes" offer higher-level sensitivity control.
👍 |
@ellisonbg, check out #46—I think that design is a good first step to handling personal data. |
In reviewing some PRs/issues on telemetry, a couple of things have come up for me that I want to capture.
Personal data
Not all telemetry data is the same. The GDPR does a good job of describing personal data:
https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/key-definitions/what-is-personal-data/
I want to make sure that we are designing the system in a manner that 1) forces schemas to declare that they collect personal data and 2) enables an operator to filter out personal data easily. An example:
Let's say there is a schema that records a user opening a notebook. The name of the notebook is personal data, but the mere act of opening the notebook is not (unless the event is tagged with a username). An operator shouldn't have to dive into the details of that schema and worry about removing the notebook names, but should instead be able to filter out the personal data with a single flag. Additionally, an operator should have a simple way of enabling or disabling the logging of usernames with events. If we don't make it easy for operators to reason about and configure these things, they will end up collecting personal data even when they don't need or want to.
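To make the notebook example concrete, here is a minimal sketch of what "declare personal data in the schema, filter with a single flag" could look like. Every name in it is hypothetical and chosen for illustration only: the `personal_data` schema key, the `OperatorConfig` fields, and `record_event`. It is not an existing Jupyter API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorConfig:
    """Hypothetical operator settings; both default to the safer choice."""
    collect_personal_data: bool = False
    log_usernames: bool = False


# Hypothetical schema: every property must say whether it is personal data.
NOTEBOOK_OPENED = {
    "name": "notebook-opened",
    "properties": {
        "notebook_name": {"type": "string", "personal_data": True},
        "opened_at": {"type": "string", "personal_data": False},
    },
}


def validate_schema(schema):
    """Reject schemas that do not declare personal data for each property."""
    for prop, spec in schema["properties"].items():
        if "personal_data" not in spec:
            raise ValueError(f"{prop!r} must declare 'personal_data'")


def record_event(schema, payload, config, username=None):
    """Build the event that would be logged under the operator's config."""
    validate_schema(schema)
    out = {}
    for key, value in payload.items():
        is_personal = schema["properties"][key]["personal_data"]
        if is_personal and not config.collect_personal_data:
            continue  # dropped by the single operator flag
        out[key] = value
    # Tagging an event with a username makes the whole event personal data,
    # so it is a separate, explicit opt-in.
    if config.log_usernames and username is not None:
        out["username"] = username
    return out


payload = {"notebook_name": "taxes.ipynb", "opened_at": "2019-08-01T12:00:00Z"}
print(record_event(NOTEBOOK_OPENED, payload, OperatorConfig(), username="alice"))
```

With the defaults, the operator never has to read the schema: the notebook name and the username are both left out unless explicitly enabled.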
Lawful basis
The GDPR is also helpful in describing a range of different "lawful bases" for processing personal data. Because Jupyter is deployed across a wide range of situations, we have to design a system that is quickly configurable for these different lawful bases:
https://gdpr-info.eu/art-6-gdpr/
Yes, sometimes the lawful basis is consent, and we need to make it really easy for operators to get consent and inform the users. But we shouldn't overfit for that lawful basis in a manner that makes it difficult for other bases. Users still have rights (possibly different ones) under the other lawful bases, and we want to make sure users get those rights as needed. I don't want an operator to have to choose between offering full consent, or no protections at all.
I realize that GDPR isn't a universal code or international law. But it is nonetheless a good starting point for understanding the different questions.