No boundaries: data exfiltration by third parties embedded on web pages
In a series of blog posts in 2017 and 2018, we revealed how web trackers exfiltrate personal information from web pages, browser password managers, form inputs, and the Facebook Login API. Our findings resulted in many fixes and privacy improvements to browsers, websites, third parties, and privacy protection tools. However, the root causes of these privacy failures remain unchanged, because they stem from the design of the web itself.
In a paper at the 2020 Privacy Enhancing Technologies Symposium, we recap our findings and propose two potential paths forward. This companion website provides an overview of the attacks and summarizes the response to the initial release of our findings.
Paper (PDF) »
The study is a collaboration between Gunes Acar (KU Leuven), Steven Englehardt (Mozilla), and Arvind Narayanan (Princeton University).
Findings
We discovered three broad classes of vulnerabilities during our 2017 measurements: the misuse of browser login managers by trackers, data exfiltration from social login APIs, and the leaking of personal data to trackers due to whole-DOM exfiltration. All three classes are explored in depth in our paper. As an example, we provide a brief summary of login manager misuse below.
Example finding: Trackers misuse browser login managers
We found two tracking scripts misusing browser login managers to collect email addresses. These scripts were present on 1,100 of the Alexa top 1M sites in September 2017.
The mechanism
First, a user fills out a login form on the page and asks the browser to save the login credentials. The third-party script does not need to be present on the login page. Then, the user visits another page on the same website that includes the third-party script. The script inserts an invisible login form, which the browser's login manager fills in automatically. The third-party script reads the user's email address from the populated form, hashes it, and sends the hashes to third-party servers.
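From the measurement side, a leak of this kind can be spotted by planting a known bait email address and then searching outgoing requests for that address or its hashes. The following is a minimal sketch of that idea, not the detection code used in the study: the function names, the choice of MD5, SHA-1, and SHA-256, and the example URLs are all assumptions made for illustration.

```python
import hashlib
from urllib.parse import unquote


def email_fingerprints(email: str) -> dict[str, str]:
    """Return hex digests of the normalized email (assumed hash set)."""
    data = email.strip().lower().encode("utf-8")
    return {
        name: hashlib.new(name, data).hexdigest()
        for name in ("md5", "sha1", "sha256")
    }


def find_leaks(email: str, request_urls: list[str]) -> list[tuple[str, str]]:
    """List (url, encoding) pairs where the bait email appears in a URL."""
    fingerprints = email_fingerprints(email)
    # Also look for the address itself, in case it is sent unhashed.
    fingerprints["plaintext"] = email.strip().lower()
    leaks = []
    for url in request_urls:
        # Undo percent-encoding so "bait%40example.com" still matches.
        haystack = unquote(url).lower()
        for name, value in fingerprints.items():
            if value in haystack:
                leaks.append((url, name))
    return leaks
```

A real crawl would also need to inspect request bodies, cookies, and other encodings; this sketch checks URLs only.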
Demo »
Responses
The publication of our findings resulted in several privacy fixes and improvements to browsers, websites, third parties and privacy protection tools. The table below provides an overview. A full description of the responses is available in Section 7.1 of our paper.
Category | Party | Login Manager | Social Integration | DOM Exfiltration
---|---|---|---|---
Browsers | Chrome | Considering restrictions | No fix | No fix
Browsers | Firefox | Considering restrictions | Proposal[1] | Proposal[1]
Browsers | Safari | Require interaction | No fix | No fix
Browsers | Brave | Require interaction | Proposal[2] | Proposal[2]
Blocklists | EasyList & EasyPrivacy | Already Blocked[3] | Not blocked | Blocked
Blocklists | Disconnect | Blocked | Not blocked[4] | Blocked
Third parties | Adthink | Stopped | N/A | N/A
Third parties | OnAudience | Stopped | Stopped | N/A
Third parties | FullStory | N/A | N/A | Fixed PW leak bug
Third parties | | N/A | N/A | Limited
Third parties | Smartlook | N/A | N/A | Limited
Third parties | Yandex | N/A | N/A | Limited
First parties | Walgreens | N/A | N/A | Stopped
First parties | Bonobos | N/A | N/A | Stopped
First parties | Gradescope | N/A | N/A | Stopped
Source code — A modified version of OpenWPM
The code for our modified version of the OpenWPM web privacy measurement tool can be found on GitHub.
Data
The data from our 2017 measurements is available for download in bzip2 archives:
- Spoofed Facebook Login Crawl (31.1 GB)—Used to detect Social API misuse.
- Chunk Injection Crawl 1 - with injection (25.6 GB)—Used to detect Whole-DOM Scraping.
- Chunk Injection Crawl 2 - no injection (13.8 GB)—Used to detect Whole-DOM Scraping.
- Identity Injection Crawl (31.8 GB)—Used to detect Login Manager misuse and Whole-DOM Scraping.
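Because each archive is tens of gigabytes, it is safer to decompress in a streaming fashion than to load a whole file into memory. The sketch below uses Python's standard `bz2` module; the function name and the file paths in the usage note are hypothetical, and it assumes each download is a single bzip2-compressed file.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(src: Path, dst: Path, chunk_size: int = 1024 * 1024) -> int:
    """Stream-decompress src into dst and return the decompressed size in bytes."""
    with bz2.open(src, "rb") as compressed, open(dst, "wb") as out:
        # copyfileobj reads and writes in fixed-size chunks, so memory use
        # stays bounded regardless of the archive size.
        shutil.copyfileobj(compressed, out, chunk_size)
    return dst.stat().st_size
```

For example, `decompress_bz2(Path("crawl.bz2"), Path("crawl"))` would write the decompressed data next to the archive, where `crawl.bz2` stands in for whichever file was downloaded.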
Acknowledgements
We thank our shepherd Oleksii Starov for his valuable feedback, Mihir Kshirsagar for his help with questions related to regulations, and Dimitar Bounov and Sorin Lerner for bringing the Gradescope leaks to our attention. This study is supported by an NSF grant (CNS 1526353). Some of our measurements were funded by an Amazon AWS Cloud Credits for Research grant. Gunes Acar holds a postdoctoral fellowship from the Research Foundation Flanders (FWO). This work was supported by CyberSecurity Research Flanders with reference number VR20192203.
Reference
Gunes Acar, Steven Englehardt, Arvind Narayanan. No boundaries: data exfiltration by third parties embedded on web pages. In Proceedings of the 20th Privacy Enhancing Technologies Symposium (PETS), July 13–16, 2020.
BibTeX:
@inproceedings{acar2020noboundaries,
author = {Gunes Acar and Steven Englehardt and Arvind Narayanan},
title = {No boundaries: data exfiltration by third parties embedded on web pages},
booktitle = {Proceedings of the 20th Privacy Enhancing Technologies Symposium (PETS)},
year = 2020,
month = jul,
publisher = {Sciendo},
doi = {10.2478/popets-2020-0070},
url = {https://petsymposium.org/2020/files/papers/issue4/popets-2020-0068.pdf}
}
Contact
Gunes Acar | gunes.acar@esat.kuleuven.be |
Steven Englehardt | senglehardt@mozilla.com |
Arvind Narayanan | arvindn@cs.princeton.edu |