No boundaries: data exfiltration by third parties embedded on web pages

In a series of blog posts in 2017 and 2018, we revealed how web trackers exfiltrate personal information from web pages, browser password managers, form inputs, and the Facebook Login API. Our findings resulted in many fixes and privacy improvements to browsers, websites, third parties, and privacy protection tools. However, the root causes of these privacy failures remain unchanged, because they stem from the design of the web itself.

In a paper at the 2020 Privacy Enhancing Technologies Symposium, we recap our findings and propose two potential paths forward. This companion website provides an overview of the attacks and summarizes the response to the initial release of our findings.

Paper (PDF) »

The study is a collaboration between Gunes Acar (KU Leuven), Steven Englehardt (Mozilla), and Arvind Narayanan (Princeton University).

We discovered three broad classes of vulnerabilities during our 2017 measurements: the misuse of browser login managers by trackers, data exfiltration from social login APIs, and the leaking of personal data to trackers due to whole-DOM exfiltration. All three classes are explored in depth in our paper. As an example, we provide a brief summary of login manager misuse below.

We found two tracking scripts misusing browser login managers to collect email addresses. These scripts were present on 1,100 of the Alexa top 1M sites in September 2017.

The mechanism

First, a user fills out a login form on the page and asks the browser to save the login credentials. The third-party script does not need to be present on the login page. Then, the user visits another page on the same website that includes the third-party script. The script inserts an invisible login form, which the browser's login manager fills in automatically. The third-party script retrieves the user's email address by reading the populated form and sends hashes of the address to third-party servers.
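The steps above can be sketched as a toy model in Python. This is an illustration under stated assumptions, not the code of any real tracker or browser: `LoginManager`, `third_party_script`, the `example` origins, and the collection URL are hypothetical names, and SHA-256 with lowercasing is an assumed choice, since the text above says only that email hashes are sent. The model returns the request it would make instead of sending anything.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class LoginManager:
    """Toy stand-in for a browser login manager: credentials are keyed by site origin."""
    saved: dict = field(default_factory=dict)

    def save(self, origin: str, email: str, password: str) -> None:
        self.saved[origin] = (email, password)

    def autofill(self, origin: str, form: dict) -> None:
        # Autofill here looks only at the page's origin, not at which script
        # created the form or whether the form is visible to the user.
        if origin in self.saved:
            form["email"], form["password"] = self.saved[origin]


def third_party_script(page_origin: str, login_manager: LoginManager) -> list:
    """Hypothetical tracker script embedded on any page of the site."""
    form = {"email": "", "password": "", "visible": False}  # injected invisible form
    login_manager.autofill(page_origin, form)  # the browser populates the form
    if not form["email"]:
        return []  # nothing saved for this site, nothing to read
    email = form["email"].strip().lower()  # assumed normalization
    digest = hashlib.sha256(email.encode("utf-8")).hexdigest()  # assumed hash function
    # Return the request that would be made instead of performing it.
    return [("https://tracker.example/collect", digest)]


manager = LoginManager()
manager.save("https://shop.example", "Alice@Example.com", "hunter2")  # user saves a login
sent = third_party_script("https://shop.example", manager)  # later visit to another page
```

In this model the digest depends only on the email address, so the same user would produce the same value on every site where such a script runs. That is why a hashed address can still serve as a cross-site identifier.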

Demo »

The publication of our findings resulted in several privacy fixes and improvements to browsers, websites, third parties and privacy protection tools. The table below provides an overview. A full description of the responses is available in Section 7.1 of our paper.

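| Party type | Party | Login Manager attack | Social Integration attack | DOM Exfiltration attack |
|---|---|---|---|---|
| Browsers | Chrome | Considering restrictions | No fix | No fix |
| Browsers | Firefox | Considering restrictions | Proposal[1] | Proposal[1] |
| Browsers | Safari | Require interaction | No fix | No fix |
| Browsers | Brave | Require interaction | Proposal[2] | Proposal[2] |
| Blocklists | EasyList & EasyPrivacy | Already Blocked[3] | Not blocked | Blocked |
| Blocklists | Disconnect | Blocked | Not blocked[4] | Blocked |
| Third parties | Adthink | Stopped | N/A | N/A |
| Third parties | OnAudience | Stopped | Stopped | N/A |
| Third parties | FullStory | N/A | N/A | Fixed PW leak bug |
| Third parties | Facebook | N/A | N/A | Limited |
| Third parties | Smartlook | N/A | N/A | Limited |
| Third parties | Yandex | N/A | N/A | Limited |
| First parties | Walgreens | N/A | N/A | Stopped |
| First parties | Bonobos | N/A | N/A | Stopped |
| First parties | Gradescope | N/A | N/A | Stopped |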

The code for our modified version of the OpenWPM web privacy measurement tool can be found on GitHub.

The data from our 2017 measurements is available for download in bzip2 archives.
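For readers who want to inspect a downloaded archive without unpacking it first, the following is a minimal sketch using Python's standard `bz2` module. It assumes the archive is a single bzip2-compressed text file; the function name `preview_bz2` is hypothetical, and the actual file names and internal format of the archives are not stated here. If an archive turns out to be a tarball, `tarfile.open(path, "r:bz2")` would be the analogous entry point.

```python
import bz2
from pathlib import Path


def preview_bz2(path: Path, max_lines: int = 5) -> list:
    """Return the first few decoded lines of a bzip2-compressed text file."""
    lines = []
    # Text mode ("rt") decompresses incrementally while iterating, so the
    # whole archive is never loaded into memory at once.
    with bz2.open(path, mode="rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            lines.append(line.rstrip("\n"))
            if len(lines) >= max_lines:
                break
    return lines
```

Calling `preview_bz2(Path("some-archive.bz2"))` with the path of a downloaded file (the name here is a placeholder) returns up to five lines, which is usually enough to identify the format before committing to a full decompression.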

We thank our shepherd Oleksii Starov for his valuable feedback, Mihir Kshirsagar for his help with questions related to regulations, and Dimitar Bounov and Sorin Lerner for bringing the Gradescope leaks to our attention. This study is supported by an NSF grant (CNS 1526353). Some of our measurements were funded by an Amazon AWS Cloud Credits for Research grant. Gunes Acar holds a Postdoctoral fellowship from the Research Foundation Flanders (FWO). This work was supported by CyberSecurity Research Flanders with reference number VR20192203.

Reference: Gunes Acar, Steven Englehardt, Arvind Narayanan. No boundaries: data exfiltration by third parties embedded on web pages. In Proceedings of the 20th Privacy Enhancing Technologies Symposium (PETS), July 13–16, 2020.

BibTeX:

@inproceedings{acar2020noboundaries,
  author    = {Gunes Acar and Steven Englehardt and Arvind Narayanan},
  title     = {No boundaries: data exfiltration by third parties embedded on web pages},
  booktitle = {Proceedings of the 20th Privacy Enhancing Technologies Symposium (PETS)},
  year      = 2020,
  month     = jul,
  publisher = {Sciendo},
  doi       = {10.2478/popets-2020-0070},
  url       = {https://petsymposium.org/2020/files/papers/issue4/popets-2020-0068.pdf}
}