The Internet Should Be More Like Wikipedia

Over 20 years, Wikipedia has built a reputation most websites would envy. What can the rest of us learn from that?
Stephen Gossett
October 13, 2020
Updated: October 14, 2020

It’s not too difficult to imagine an alternate timeline in which Katherine Maher, the executive director of Wikipedia’s parent organization, is paraded alongside Big Tech leaders in Congressional grillings over digital misinformation.

Indeed, if you had asked Brian Keegan years ago about potential interference targets, he would have guessed that Wikipedia’s combination of scrappy self-governance and popularity would have made it prime prey for the Russian troll factory.

“Your first bet wouldn’t be Twitter, Facebook, YouTube — it would probably be something like Wikipedia, because it’s just a hardscrabble, bare-bones crew of people who are kind of keeping the wheels from falling off the thing,” said Keegan, a professor at University of Colorado–Boulder’s Department of Information Science who studies peer production and social computing.

Keegan contributed an essay to the new book Wikipedia @ 20, which considers the site’s legacy and future as it hits its two-decade anniversary. (A print edition is forthcoming via MIT Press on October 13, but — in proper knowledge-commons style — the full text of each essay is available free online.) The book covers topics ranging from dismissive early media coverage of Wikipedia to current efforts to diversify editorial ranks.

But Keegan, examining the site’s history of handling breaking news, most directly explores the site’s relative ability to fend off misinformation and manipulation — what he aptly dubs “sociotechnical sludge.” Indeed, as Wikipedia approaches 20, there’s been a small proliferation of articles celebrating the site as “a bastion of the Good Internet” and “the last best place on the internet.”

So how exactly has Wikipedia kept its good name while newer platforms have failed? And what transferable lessons does it offer?

How has Wikipedia avoided the “sludge”?

  • Peer production over ad- and engagement maximization.
  • Robust editorial oversight.
  • No reliance on personalization or amplification algorithms.

 


Wikipedia Isn’t Trying to Optimize for Engagement

One obvious potential reason for its “sludge” resistance, according to Keegan, is that Wikipedia is built on a fundamentally different foundation from that of popular social platforms. The site’s model is “a relic of a simpler time on the internet,” he told Built In. Specifically, it’s a not-for-profit, social production model rather than engagement- and ad-driven.

Anyone can suggest edits to a Wikipedia page, but the foundation isn’t incentivized to maximize participation — there are no shareholders closely tracking monthly active users or click-through rates. Various barriers to entry — the many-layered editing process, revision displays that can be confusing to newcomers — become, in practice, the opposite of optimizing for engagement.

Wikipedia may not be actively throwing up roadblocks to participation, but it’s “also OK with not everyone being an editor — the way a Mark Zuckerberg or Jack Dorsey would like all seven billion people in the world to be users of their platforms,” Keegan said.

“I think that’s a good first step toward keeping a basic level of civility and content quality,” he added.

Few are naive enough to expect the major social platforms to fundamentally push away from IPO- or acquisition-driven models. But that doesn’t mean people aren’t building roadmaps.

Among the most notable is Keegan’s University of Colorado–Boulder colleague, Nathan Schneider, who helped pioneer the concept of “exit to community.” That model envisions paths to shift from investor ownership to user and/or employee ownership and governance.

Schneider’s ideas are also making waves in the open-source community, as Built In previously reported.

Social platforms could also potentially take a lesson from Wikipedia’s (kinda-sorta) decentralization. Wikipedia operates on a central database, but its peer production model and edit transparency bear some similarities to decentralized networks. And this option feels decidedly less radical a step for the major social platforms than, say, exit to community. In fact, Jack Dorsey announced first steps toward building “an open and decentralized standard for social media” last year, which Twitter would ultimately incorporate, seemingly as a step toward a more user-empowered vision of content moderation.

Related: The Key to a Better Social Media Ecosystem? Friction.

 

The Site Has a ‘Strong Editorial Identity’

A second trait that helps Wikipedia stand relatively further above the muck? Wikipedia has a built-in editorial lodestar that social media platforms lack. Whereas sites like Facebook have recently invested more heavily in moderation (with limited results), their sense of neutrality remains “passive” — compared to Wikipedia’s “active” sense of neutrality, Keegan writes.

Wikipedia operates according to five fundamental “pillars,” one of which is that the site maintains a neutral point of view. That also includes a mandate to strive for reliable citations and “verifiable accuracy.”

Of course, what constitutes neutrality is its own fraught conversation, and not just in terms of “post-truth” misinformation and intentional water-muddying. Journalism, for instance, continues to grapple with how ideals of impartiality have sometimes marginalized underrepresented groups.

That’s not dissimilar to what Wikipedia editors sometimes face. As Slate reported in June, heated debates took place over whether an entry should be titled “Death,” “Murder” or “Killing of George Floyd,” with the issue of neutral perspective driving the back-and-forth. (Editors ultimately settled on “Killing of George Floyd.”)

“People sometimes use neutrality to silence voices.”

Jackie Koerner, co-editor of Wikipedia @ 20, said it’s not uncommon for issues of race and gender to become flashpoints. She pointed to civil rights-related articles, where an entry may have the facts correct but fail to fully illustrate the gravity of police violence against demonstrators — in other words, the true nature of the event is actually obfuscated, not clarified. “People sometimes use neutrality to silence voices,” she said.

All that said, the foundational commitment to neutrality does seem to have steered the site well in its first two decades, even as the community wrestles with its complications going forward.

 


Wikipedia Uses Machine Learning to Flag Suspicious Updates

Wikipedia is also a holdover from simpler times in how it eschews the algorithmic personalization and amplification mechanisms that govern so much of online existence in 2020. Most users find their way to a Wiki entry via old-fashioned search-and-click, and the closest thing you’ll find to amplified content are homepage features like “In the News” and “On This Day” links, Keegan writes.

The ranks of experienced editors are undeniably integral. “The responsiveness of Wikipedia editors to current events also provides an important counter-factual to claims from engineering culture that human-in-the-loop systems lack the scalability, speed and accuracy of automated systems, despite accumulating evidence of automated systems’ multiple liabilities,” Keegan writes.

Still, it’s incorrect and counterproductive to characterize Wikipedia as a bastion of algorithm-free people power. In a 2019 op-ed, Katherine Maher cautioned against empty Web 1.0 nostalgia and underscored the important role that artificial intelligence plays in spotting potential vandalism and directing editors to pages that might need evaluation.

“In short, AI that is deployed by and for humans can improve the experience of both people consuming information and those producing it,” she wrote.

“AI that is deployed by and for humans can improve the experience of both people consuming information and those producing it.”

The most notable example of AI assistance is Objective Revision Evaluation Service (ORES), created by R. Stuart Geiger and Aaron Halfaker. The machine-learning API can automatically generate scores for whether a Wikipedia contribution is potentially damaging and the likelihood an edit was done in good faith.

“If you have a sequence of bad scores, then you’re probably a vandal. If you have a sequence of good scores with a couple of bad ones, you’re probably a good faith contributor,” Halfaker told the Observer when the tool was introduced.
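For the curious, ORES exposes these scores through a public REST API. The sketch below shows how a client might build a request for the “damaging” and “goodfaith” models and read the probabilities out of a response. The URL shape follows ORES’s documented v3 scores endpoint, but the revision ID, the probability values and the response payload here are made up for illustration — a real client would fetch the JSON over HTTP rather than hardcode it.

```python
import json

# Base of ORES's public v3 scores endpoint.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def build_ores_url(wiki: str, rev_ids: list, models: list) -> str:
    """Assemble a v3 scores request URL for the given wiki, revisions and models."""
    revids = "|".join(str(r) for r in rev_ids)
    model_param = "|".join(models)
    return f"{ORES_BASE}/{wiki}/?models={model_param}&revids={revids}"


def extract_scores(payload: dict, wiki: str, rev_id: int) -> dict:
    """Pull each model's probability of 'true' out of a v3-style response."""
    scores = payload[wiki]["scores"][str(rev_id)]
    return {
        model: result["score"]["probability"]["true"]
        for model, result in scores.items()
    }


# A made-up response in the documented v3 shape, for illustration only.
sample = json.loads("""
{
  "enwiki": {
    "scores": {
      "123456": {
        "damaging": {"score": {"prediction": false,
                               "probability": {"false": 0.92, "true": 0.08}}},
        "goodfaith": {"score": {"prediction": true,
                                "probability": {"false": 0.05, "true": 0.95}}}
      }
    }
  }
}
""")

url = build_ores_url("enwiki", [123456], ["damaging", "goodfaith"])
probs = extract_scores(sample, "enwiki", 123456)
# probs -> {"damaging": 0.08, "goodfaith": 0.95}: a low "damaging" probability
# and a high "goodfaith" probability suggest a constructive edit, per
# Halfaker's description above.
```

Counter-vandalism tools built on ORES apply exactly this kind of reading: a run of high “damaging” scores from one account flags a likely vandal, while an isolated bad score from an otherwise good-faith contributor gets gentler treatment.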

There have been other machine-learning breakthroughs too. Koerner pointed to Quicksilver, natural-language-processing software that in 2018 scanned millions of Wikipedia pages, news articles and scientific papers, then came up with a list of 40,000 notable scientists that had been missing from Wikipedia. It even generated first drafts for each entry.

Related: AI Has an Ethics Problem. Gen Z Is Poised to Help Fix It.

 


Challenges Ahead

For all of Wikipedia’s successes, it still suffers from blind spots and biases that negatively inform its content. Just ask Donna Strickland.

The optical physicist spearheaded groundbreaking research around laser intensity, with implications for medicine, manufacturing and beyond. Despite her pioneering work, Strickland for years did not have a Wikipedia page — an oversight corrected only after she was awarded a Nobel Prize. The editorial make-up is roughly 80 percent men to 20 percent women, Maher told Marketplace earlier this year. But the gender gap is just one of several on the minds of equity-focused community members.

Koerner addresses Wikipedia’s (sometimes unintentional) bias problems in her essay. “Wikipedia materialized through predominantly westernized cisgender male voices, opinions, and biases,” she writes. “The awareness in the community, at that time, illustrated a rather singular point-of-view and developed policies and practices accordingly. This foundation is difficult to break.”

At the same time, internal conflicts often belie Wikipedia’s harmonious reputation. Aggressive users and a perceived lack of recourse have been cited as reasons for Wikipedia’s decline in contributor numbers. That contentious culture can manifest in everything from “relentless” gender harassment to needlessly difficult editing rounds. (Keegan retired as a regular contributor after a protracted fight against incorrect photo alignment on philosopher/chemist Joseph Priestley’s entry. The talk page is a must-read for fans of pedantic bureaucracy.)

Wikipedia’s parent organization is taking steps to address these issues. In a push to “eliminate harassment, toxicity, and incivility,” Wikimedia announced in May it would launch a universal code of conduct. The draft is set to be delivered to the board of trustees for review on October 13. And the fact that official Wikipedias now exist for 313 languages is a testament to the international outreach of projects like Wikimedia Incubator.

Even Wikipedia’s recently announced redesign — the site’s first significant visual overhaul in a decade — is, in a sense, an inclusivity measure. The stated goal is to make it more intuitive for all newcomers.

Challenges acknowledged, the fact remains that Wikipedia has not, in fact, found itself on the receiving end of legislative and user ire. To what degree its various successes — peer production and governance, editorial oversight, humans and algorithms together in the loop, an increasing willingness to counteract bias — are more broadly transferable depends on your perspective. And any such transference is hardly simple. “It’s literally a trillion-dollar question,” Keegan said.

Even Keegan, who notes in his essay the integrity-preserving value of non-personalized shared digital experiences (“Every English Wikipedia user’s ‘Abraham Lincoln’ article is the same regardless of their geography, gender, browsing history or social graph”), also pointed out that stated preferences against personalization don’t necessarily align with users’ true behavior.

“Twitter has very smart data scientists,” he told Built In. “I’m sure they run A/B experiments and have evidence that shows algorithmically mediated ones lead to greater engagement, more time on site — probably something around an overall improved user experience metric — compared to just the naive chronological feed.”

That said, he believes studying the lessons of Wikipedia’s past can be helpful for the future. “Those core values of deliberation, stewardship, and common ownership and common good seem old-fashioned, but that retro-ness is worth reconsidering and re-centering as we think about what values we want from the platforms we invest our data, time and attention in.”

Related: Data Belong to Everyone
