"The Prevalence of Single Sign-On on the Web: Towards the Next Generation of Web Content Measurement" presented at ACM IMC 2023

October 24, 2023

We presented our paper, "The Prevalence of Single Sign-On on the Web: Towards the Next Generation of Web Content Measurement," at the ACM Internet Measurement Conference (IMC 2023) in Montréal, Canada.


Much of the content and structure of the Web remains inaccessible to evaluate at scale because it is gated by user authentication. This limitation restricts researchers to examining only a superficial layer of a website: the landing page or public, search-indexable pages. Since it is infeasible to create individual accounts across thousands of webpages, we examine the prevalence of Single Sign-On (SSO) on the web to explore the feasibility of using a few accounts to authenticate to many sites. We find that 58% of the top 10K websites with logins are accessible with popular 3rd-party SSO providers, such as Google, Facebook, and Apple, indicating that leveraging SSO offers a scalable solution to access a large volume of user-gated content.

PDF   Code (GitHub)

Calvin Ardi and Matt Calder. 2023. The Prevalence of Single Sign-On on the Web: Towards the Next Generation of Web Content Measurement. In Proceedings of the 2023 ACM on Internet Measurement Conference (IMC '23). Association for Computing Machinery, New York, NY, USA, 124–130. https://doi.org/10.1145/3618257.3624841
author = {Ardi, Calvin and Calder, Matt},
title = {The Prevalence of Single Sign-On on the Web: Towards the Next
Generation of Web Content Measurement},
year = {2023},
isbn = {9798400703829},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3618257.3624841},
doi = {10.1145/3618257.3624841},
abstract = {Much of the content and structure of the Web remains
inaccessible to evaluate at scale because it is gated by user
authentication. This limitation restricts researchers to examining only
a superficial layer of a website: the landing page or public,
search-indexable pages. Since it is infeasible to create individual
accounts across thousands of webpages, we examine the prevalence of
Single Sign-On (SSO) on the web to explore the feasibility of using a
few accounts to authenticate to many sites. We find that 58\% of the top
10K websites with logins are accessible with popular 3rd-party SSO
providers, such as Google, Facebook, and Apple, indicating that
leveraging SSO offers a scalable solution to access a large volume of
user-gated content.},
booktitle = {Proceedings of the 2023 ACM on Internet Measurement Conference},
pages = {124--130},
numpages = {7},
keywords = {web measurement, web authentication, top lists, single
location = {Montreal QC, Canada},
series = {IMC '23}