Discussion:
[Wikimedia-l] [Wikidata] Knowledge Integrity: A proposed Wikimedia Foundation cross-departmental program for 2018-2019
David Cuenca Tudela
2018-04-17 08:39:41 UTC
Hi Dario&Jake,

Thanks for sharing the plan. Any possibility to include in the plan a
system to archive all reference URLs and external identifiers linked from
Wikidata?
https://phabricator.wikimedia.org/T143488

Additionally I think it would be interesting to have some research done on
which references are DISPLAYED or CLICKED the most on several Wikipedias.
We know already which sources are cited the most, but on which sources do
users hover their mouse the most? Can we also identify which statements are
involved? It could be used to expand them, improve them, or add more
context.

Finally, I believe that a tool to assess the
openness/accessibility of the sources of any given article could be really
interesting.

Regards,
Micru


On Tue, Apr 17, 2018 at 2:32 AM, Dario Taraborelli wrote:
Hey all,
(apologies for cross-posting)
We’re sharing a proposed program
<https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/CDP3:_Knowledge_Integrity>
for the Wikimedia Foundation’s upcoming fiscal year
<https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2018-2019/Draft>
(2018-19) and *would love to hear from you*. This plan builds
extensively on projects and initiatives driven by volunteer contributors
and organizations in the Wikimedia movement, so your input is critical.
Why a “knowledge integrity” program?
Increased global attention is directed at the problem of misinformation
and how media consumers are struggling to distinguish fact from fiction.
Meanwhile, thanks to the sources they cite, Wikimedia projects are uniquely
positioned as a reliable gateway to accessing quality information in the
broader knowledge ecosystem. How can we mobilize these citations as a
resource and turn them into a broader, linked infrastructure of trust to
serve the entire internet? Free knowledge grounds itself in verifiability
and transparent attribution policies. Let’s look at 4 data points as examples:
- Wikipedia sends tens of millions of people to external sources each
year. We want to conduct research to understand why and how readers leave
our site.
- The Internet Archive has fixed over 4 million dead links on
Wikipedia. We want to enable instantaneous archiving of every link on all
Wikipedias to ensure the long-term preservation of the sources Wikipedians
cite.
- #1Lib1Ref reaches 6 million people on social media. We want to bring
#1Lib1Ref to Wikidata and more languages, spreading the message that
references improve quality.
- 33% of Wikidata items represent sources (journals, books, works). We
want to strengthen community efforts to build a high-quality, collaborative
database of all cited and citable sources.
A 5-year vision
Our 5-year vision for the Knowledge Integrity program is to establish
Wikimedia as the hub of a federated, trusted knowledge ecosystem. We plan
- A roadmap to a mature, technically and socially scalable, central
repository of sources.
- Developed network of partners and technical collaborators to
contribute to and reuse data about citations.
- Increased public awareness of Wikimedia’s vital role in information
literacy and fact-checking.
5 directions for 2018-2019
We have identified 5 levers of Knowledge Integrity: research,
infrastructure and tooling, access and preservation, outreach, and awareness.
1. Continue to conduct research to understand how readers access
sources and how to help contributors improve citation quality.
2. Improve tools for linking information to external sources,
catalogs, and repositories.
3. Ensure resources cited across Wikimedia projects are accessible in
perpetuity.
4. Grow outreach and partnerships to scale community and technical
efforts to improve the structure and quality of citations.
5. Increase public awareness of the processes Wikimedians follow to
verify information and articulate a collective vision for a trustable web.
Who is involved?
- Wikimedia Foundation Technology’s Research Team
- Wikimedia Foundation Community Engagement’s Programs team (Wikipedia
Library)
- Wikimedia Deutschland Engineering’s Wikidata team
The initiative also spans across an ecosystem of possible partners
including the Internet Archive, ContentMine, Crossref, OCLC, OpenCitations,
and Zotero. It is further made possible by funders including the Sloan,
Gordon and Betty Moore, and Simons Foundations who have been supporting the
WikiCite initiative to date.
How you can participate
You can read the fine details of our proposed year-1 plan, and provide
your feedback, on mediawiki.org:
https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/CDP3:_Knowledge_Integrity
We’ve also created a brief introductory slidedeck about our motivation and
goals:
https://commons.wikimedia.org/wiki/File:Knowledge_Integrity_CDP_proposal_%E2%80%93_FY2018-19.pdf
WikiCite has laid the groundwork for many of these efforts. Read last
WikiCite_2017_report.pdf
Recent initiatives like the just released citation dataset foreshadow the
work we want to do:
https://medium.com/freely-sharing-the-sum-of-all-knowledge/what-are-the-ten-most-cited-sources-on-wikipedia-lets-ask-the-data-34071478785a
Lastly, this April we’re celebrating Open Citations Month; it’s right in
the spirit of Knowledge Integrity:
https://blog.wikimedia.org/2018/04/02/initiative-for-open-citations-birthday/
--
*Dario Taraborelli*, Director, Head of Research, Wikimedia Foundation
<http://twitter.com/readermeter>
_______________________________________________
Wikidata mailing list
https://lists.wikimedia.org/mailman/listinfo/wikidata
--
Etiamsi omnes, ego non
geni
2018-04-17 21:34:37 UTC
Post by David Cuenca Tudela
Additionally I think it would be interesting to have some research done on
which references are DISPLAYED or CLICKED the most on several Wikipedias.
We know already which sources are cited the most, but on which sources do
users hover their mouse the most? Can we also identify which statements are
involved?
Absolutely not. Leave that kind of spying to advertising companies and
three letter agencies. We have standards.
Post by David Cuenca Tudela
Finally, I believe that a tool to assess the
openness
Look for https://en.wikipedia.org/wiki/Template:Open_access
Post by David Cuenca Tudela
/accessibility of the sources of any given article could be really
interesting.
That would turn into an argument over definitions. For example, is the
Mabinogion accessible? It is public domain and copies can be found on
various websites, but I don't speak Welsh. Limiting it to English gets to
the next problem. Is this accessible:

https://www.aanda.org/articles/aa/full_html/2014/11/aa21621-13/aa21621-13.html

It's in English but I don't have a degree in physics.

Where there are more obvious limits it gets more complicated. You may
be tempted to lump all paywalls together, but is it really fair to lump
something that costs €1 for total access in with something that
charges $40 for one article? Does the currency it charges in make a
difference?

Books too have their fun aspects. Try automating judging the relative
accessibility of Birmingham's Electric Dustcarts and 7000 years of
jewelry.
--
geni