"The Prevalence of Single Sign-On on the Web: Towards the Next Generation of Web Content Measurement" presented at ACM IMC 2023

October 24, 2023

We presented our paper, "The Prevalence of Single Sign-On on the Web: Towards the Next Generation of Web Content Measurement," at the ACM Internet Measurement Conference (IMC 2023) in Montréal, Canada.

Abstract:

Much of the content and structure of the Web remains inaccessible to evaluate at scale because it is gated by user authentication. This limitation restricts researchers to examining only a superficial layer of a website: the landing page or public, search-indexable pages. Since it is infeasible to create individual accounts across thousands of webpages, we examine the prevalence of Single Sign-On (SSO) on the web to explore the feasibility of using a few accounts to authenticate to many sites. We find that 58% of the top 10K websites with logins are accessible with popular 3rd-party SSO providers, such as Google, Facebook, and Apple, indicating that leveraging SSO offers a scalable solution to access a large volume of user-gated content.


PDF   Code (GitHub)


Calvin Ardi and Matt Calder. 2023. The Prevalence of Single Sign-On on the Web: Towards the Next Generation of Web Content Measurement. In Proceedings of the 2023 ACM on Internet Measurement Conference (IMC '23). Association for Computing Machinery, New York, NY, USA, 124–130. https://doi.org/10.1145/3618257.3624841
@inproceedings{10.1145/3618257.3624841,
author = {Ardi, Calvin and Calder, Matt},
title = {The Prevalence of Single Sign-On on the Web: Towards the Next
Generation of Web Content Measurement},
year = {2023},
isbn = {9798400703829},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3618257.3624841},
doi = {10.1145/3618257.3624841},
abstract = {Much of the content and structure of the Web remains
inaccessible to evaluate at scale because it is gated by user
authentication. This limitation restricts researchers to examining only
a superficial layer of a website: the landing page or public,
search-indexable pages. Since it is infeasible to create individual
accounts across thousands of webpages, we examine the prevalence of
Single Sign-On (SSO) on the web to explore the feasibility of using a
few accounts to authenticate to many sites. We find that 58\% of the top
10K websites with logins are accessible with popular 3rd-party SSO
providers, such as Google, Facebook, and Apple, indicating that
leveraging SSO offers a scalable solution to access a large volume of
user-gated content.},
booktitle = {Proceedings of the 2023 ACM on Internet Measurement
Conference},
pages = {124--130},
numpages = {7},
keywords = {web measurement, web authentication, top lists, single
sign-on},
location = {Montreal, QC, Canada},
series = {IMC '23}
}