Single Sign-On (SSO)

Measuring the Prevalence of Single Sign-On Providers

This project studies how prevalent logins are on websites, as a step toward large-scale measurement of user-gated content on the Web.

We develop two techniques to count first-party and third-party login mechanisms (the latter provided through Single Sign-On, or SSO) on the top 10K websites from the Chrome User Experience Report (CrUX). We find that:

  • 51% of the top 10K sites have a login, and more than half of those (30% of the top 10K) offer third-party SSO login.

  • The most popular SSO providers are Google, Facebook, and Apple. These three enable sign-in for 47% of all sites with login and 24% of the top 10K sites.
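
The two bases used in these findings (all top 10K sites versus only sites with a login) are related by a simple division. The sketch below reproduces that conversion from the rounded percentages quoted above. The function and constant names are illustrative, not taken from the project's code, and the paper's own figures presumably come from unrounded counts, so the last digit can differ slightly (the abstract reports 58% for the SSO share).

```python
def share_of_login_sites(pct_of_top_10k: float, pct_with_login: float) -> float:
    """Convert a share of the whole top 10K into a share of sites that have a login."""
    if pct_with_login <= 0:
        raise ValueError("pct_with_login must be positive")
    return 100.0 * pct_of_top_10k / pct_with_login


# Rounded figures quoted above, each as a percent of the top 10K (hypothetical names).
WITH_LOGIN = 51.0      # sites with any login
WITH_SSO = 30.0        # sites offering third-party SSO login
WITH_BIG_THREE = 24.0  # sites offering Google, Facebook, or Apple sign-in

sso_share = share_of_login_sites(WITH_SSO, WITH_LOGIN)
big_three_share = share_of_login_sites(WITH_BIG_THREE, WITH_LOGIN)

print(f"SSO among sites with login: {sso_share:.1f}%")              # 58.8%
print(f"Big three among sites with login: {big_three_share:.1f}%")  # 47.1%
```

Both results agree with the text: "more than half" of login sites offer SSO, and the three largest providers cover about 47% of login sites.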

This page covers our research, code, and dataset.

Research Paper

We published our initial results at ACM IMC 2023 in Montréal, Canada:

Calvin Ardi and Matt Calder. 2023. The Prevalence of Single Sign-On on the Web: Towards the Next Generation of Web Content Measurement. In Proceedings of the 2023 ACM on Internet Measurement Conference (IMC '23). Association for Computing Machinery, New York, NY, USA, 124–130. https://doi.org/10.1145/3618257.3624841
@inproceedings{10.1145/3618257.3624841,
author = {Ardi, Calvin and Calder, Matt},
title = {The Prevalence of Single Sign-On on the Web: Towards the Next
Generation of Web Content Measurement},
year = {2023},
isbn = {9798400703829},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3618257.3624841},
doi = {10.1145/3618257.3624841},
abstract = {Much of the content and structure of the Web remains
inaccessible to evaluate at scale because it is gated by user
authentication. This limitation restricts researchers to examining only
a superficial layer of a website: the landing page or public,
search-indexable pages. Since it is infeasible to create individual
accounts across thousands of webpages, we examine the prevalence of
Single Sign-On (SSO) on the web to explore the feasibility of using a
few accounts to authenticate to many sites. We find that 58\% of the top
10K websites with logins are accessible with popular 3rd-party SSO
providers, such as Google, Facebook, and Apple, indicating that
leveraging SSO offers a scalable solution to access a large volume of
user-gated content.},
booktitle = {Proceedings of the 2023 ACM on Internet Measurement
Conference},
pages = {124--130},
numpages = {7},
keywords = {web measurement, web authentication, top lists, single
sign-on},
location = {Montreal QC, Canada},
series = {IMC '23}
}

The code and data used in the paper can be found at https://github.com/webmeasurements/imc2023-sso.

Documentation for the code (TODO) and dataset (TODO) will be added here.

Applications

TODO

Contributing

TODO

Contact