diff --git a/index.html b/index.html
index d0458b67..92ca83c3 100644
--- a/index.html
+++ b/index.html
@@ -1216,37 +1216,92 @@ ### Ancillary uses
-In order to uphold the principle of [[[#data-minimization]]], [=sites=] and
-[=user agents=] should seek to understand and respect people's goals and preferences about
-use of data about them.
-
 [=Sites=] sometimes use data in ways that aren't needed for the user's immediate
-goals. These uses are known as ancillary uses,
-and data that is primarily useful for [=ancillary uses=] is ancillary data.
+goals. For example, they might bill advertisers, measure site performance, or
+tell developers about bugs. These uses are known as ancillary uses.
-
+[=Sites=] can get the data they want for [=ancillary uses=] from a variety of places:
-Different [=users=] will want to share different kinds and amounts of
-[=ancillary data=] with [=sites=]. Some [=people=] will not want to share any
-[=ancillary data=] at all.
+
+
Non-ancillary APIs
+
+ Web APIs that were designed to support users' immediate goals, like DOM events and element position
+ observers.
+
-Users may be willing to share [=ancillary data=] if it is aggregated with
-the data of other users, or [=de-identified=]. This can be useful
-when [=ancillary data=] contributes to a collective benefit in a way
-that reduces privacy threats to individuals (see collective
-privacy).
+
Ancillary APIs computed from existing information
+
+ APIs that filter, summarize, or time-shift information available from
+ [=non-ancillary APIs=], like the [[[event-timing]]] and IntersectionObserver. See
+ [[[#information]]] for restrictions on how existing non-ancillary APIs can
+ be used to justify new ancillary APIs.
+
-
- There is ongoing work on these kinds of technologies in the sensitive or be used as part of browser
+fingerprinting to recognize people
+across contexts. In order to uphold the principle of [[[#data-minimization]]], [=sites=] and
+[=user agents=] should seek to understand and respect people's goals and preferences about
+use of this data.
+
+The task force does not have consensus about how [=user agents=] should handle
+[=ancillary APIs computed from existing information=].
+Advocates of these APIs argue that they're hard to use to
+extract [=personal data=], they're more efficient than collecting the same
+information through [=non-ancillary APIs=], sites are less likely to adopt these
+APIs if a significant number of people turn them off, and that the act of
+turning them off can contribute to [=browser fingerprinting=].
+Opponents argue that if data's easier or cheaper to collect, more sites will
+collect it, and because there's still some risk, users should be able
+to turn off this group of APIs that probably won't directly break a site's
+functionality.
+
+Because different users are likely to have different preferences:
+
+
+Specifications
+for [=ancillary APIs computed from existing information=] and [=ancillary APIs
+that provide new information=] should identify them as such, so that [=user
+agents=] can provide appropriate choices for their users.
+
+
+#### Designing ancillary APIs that provide new information {#designing-ancillary-apis-with-new-information}
+
+
+
+[=Ancillary APIs that provide new information=] should not reveal any [=personal
+data=] that isn't already available through other APIs, without an indication
+that doing so aligns with the user's wishes and interests.
+
+
+
+Most [=ancillary uses=] don't require that a site learn any [=personal data=].
+For example, site performance measurements and ad billing involve averaging or
+summing data across many users such that any individual's contribution is
+obscured. Private aggregation techniques can often allow an API to serve its use
+case without exposing [=personal data=], by preventing any of the people
+involved from being identifiable.
+
+
-[=User agents=] should aggressively minimize [=ancillary
-data=] and should avoid burdening the user with additional [=privacy labor=]
-when deciding what [=ancillary data=] to expose. To that end, user agents may
-employ user research, solicitation of general preferences, and heuristics about
-sensitivity of data or trust in a particular [=context=].
+Some [=ancillary uses=] don't require their data to be related to a person, but
+the useful aggregations across many people are difficult to design into a web
+API, or they might require new technologies to be invented. API designers have a
+few choices in this situation:
+
+* Sometimes an API can [=de-identify=] the data instead, but this is difficult
+ if a web page has any input into the data that's collected.
+* API designers can check carefully that the API doesn't reveal _new_ [=personal
+ data=], as described by [[[#information]]]. For example, the API might reveal
+ that a person has a fast graphics card, that they click slowly, or that they
+ use a certain proxy, but the fact that they click slowly is already
+ unavoidably revealed
+ by DOM event timing.
+* [=User agents=] can ask their users' permission to enable this class of API.
+ To reduce [=privacy labor=], a [=user agent=] could use a first-run dialog to
+ ask the user whether they generally support sharing this data, rather than
+ asking for each use of the APIs.
+
+If an API had to make one of these choices, and then something else about the
+API needs to change, designers should consider replacing the whole API with one
+that avoids exposing [=personal data=].
+
+Some other [=ancillary uses=] do require that a person be connected to their
+data. For example, a person might want to file a bug report that a website
+breaks on their particular computer, and be able to get follow-up communication
+from the developers while they fix the bug. This is an appropriate time to ask
+the person's permission.
-To help [=sites=] understand user preferences, user agents can provide
-browser-configurable signals to directly communicate common user preferences
-(such as a [=global opt-out=]).
-
-Data exposed for the [=ancillary uses=] of telemetry and analytics may reveal
-information about user configuration, device, environment, or behavior that
-could be used as part of browser fingerprinting to identify users across
-sites. Revealing user preferences or other heuristics in providing or disabling
-functionality could also contribute to a browser fingerprint.
-
-Functionality for telemetry and analytics should be explicitly noted by
-specification authors, to help [=user agents=] provide configuration options
-to their users.
-
-
+
+
+User agents should provide a way to disable [=ancillary APIs that provide new
+information=].
+
+
+Some people may want to save processing time or bandwidth that's not necessary
+to achieve their immediate goals, or they might know something about their
+specific situation that makes the API designers' general decisions inappropriate
+for them. Because the information provided by [=ancillary APIs that provide new
+information=] isn't
+available in any other way, [=user agents=] should let people turn them off,
+despite the additional risk of [=browser fingerprinting=].
 ## Information access {#information}
@@ -1508,7 +1577,7 @@
-Data is de-identified when there exists a high level of confidence
+Data is de-identified when there exists a high level of confidence
 that no [=person=] described by the data can be identified, directly or indirectly (e.g. via association with an [=identifier=], user agent, or device), by that data alone or in combination with other available information. Note