During Wikimedia’s Mailman3 migration, we discovered and fixed a security issue that would have disclosed the contents of private list archives during the import process. This post explains the issue, how we discovered it and how it was fixed.
How people use search engines to access Wikipedia is a question researchers commonly ask. Until now, however, little data has been available about this relationship. To help address these questions, the Wikimedia Foundation is releasing a new, faceted dataset on search engine traffic to Wikipedia so you can ask questions like “What is the most common search engine in my country?” or “Which search engine is most-used by Android users?”
Wikimedia’s participation in Outreachy Round 21 focused on projects related to data science and engineering. In this post, the interns share the outcomes and experiences of their projects.
The Wikimedia Analytics Engineering team manages multiple systems, all gravitating around a big (by our standards) Hadoop cluster. This post describes our path to changing our Hadoop distribution in a single day, together with the lessons learned while doing it.
This is the first of three posts about the implementation of Single Sign On (SSO). It looks at the original landscape of Wikimedia’s web-based services, summarizes the requirements for a new SSO provider, surveys existing FLOSS solutions, and explains why Apereo CAS was chosen.