- How are statistics generated in Denelezh?
- How can I get the number of biographies across all Wikipedias?
- What is the history of Denelezh?
- What is the origin of the name Denelezh?
- Where can I find the source code of Denelezh?
- Do similar tools exist?
- Who are the authors and what are the licenses of the images and logos used on Denelezh?
How are statistics generated in Denelezh?
Denelezh uses Wikidata, the centralized knowledge base of Wikimedia projects, to generate statistics.
With each weekly dump of Wikidata, statistics are produced following these rules:
- A Wikidata item is a human if at least one of its best values for the property instance of is human.
- A human whose unique best value for the property gender is female is a female; if it is male, a male; if it is any other value, an other. A human with zero or several best values for the property gender has no gender in Denelezh.
- A human has a year of birth if the property date of birth has a best value with sufficient precision. If several best values are available, the year is used only if it is the same for all of them; otherwise the human has no year of birth.
- All normal and preferred values of the property country of citizenship are used as country of citizenship for each human.
- All normal and preferred values of the property occupation are used as occupation for each human.
- Parent occupations are deduced using the property subclass of. An occupation has to be used directly at least once to appear in Denelezh.
- Countries used fewer than 200 times are discarded.
- Occupations used (directly or indirectly) fewer than 1,000 times are discarded.
- A sitelink represents a page in a Wikimedia project (including all Wikipedias, but also Wikisource, Wikiquote, ...) for a given item.
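The rules above can be sketched in Python. This is a minimal illustration, not the actual Denelezh code: the function names are hypothetical, the inputs are assumed to be lists of best values already extracted from each item, and `year_of_birth` assumes its caller passed only years taken from values judged precise enough. Counting each human once per occupation (direct or inherited) is also an assumption, since the rules do not say how uses are counted. The identifiers Q5, Q6581072 and Q6581097 are the Wikidata items for human, female and male.

```python
from collections import Counter

HUMAN = "Q5"          # Wikidata item for "human"
FEMALE = "Q6581072"   # Wikidata item for "female"
MALE = "Q6581097"     # Wikidata item for "male"


def is_human(best_instance_of):
    """True if any best value of 'instance of' is 'human'."""
    return HUMAN in best_instance_of


def gender_of(best_genders):
    """Map the best values of 'gender' to female/male/other, or None."""
    if len(best_genders) != 1:
        return None  # zero or several best values: no gender
    return {FEMALE: "female", MALE: "male"}.get(best_genders[0], "other")


def year_of_birth(best_years):
    """Return the year if all the given best values agree, else None."""
    distinct = set(best_years)
    return distinct.pop() if len(distinct) == 1 else None


def with_parents(occupations, parents):
    """Expand occupations with every ancestor reachable via 'subclass of'."""
    seen = set(occupations)
    stack = list(occupations)
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def kept_occupations(humans, parents, threshold=1000):
    """Count humans per occupation (direct or inherited), then apply the cutoff.

    An occupation is kept only if it is used directly at least once
    and reaches the threshold once inherited uses are included.
    """
    direct, total = set(), Counter()
    for occupations in humans:
        direct.update(occupations)
        total.update(with_parents(occupations, parents))
    return {o: n for o, n in total.items() if o in direct and n >= threshold}
```

Here `humans` is a list with one list of occupation identifiers per human, and `parents` maps an occupation to the occupations it is a subclass of. The country cutoff of 200 could be applied the same way, with a plain `Counter` and no hierarchy.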
What are the ranks in Wikidata?
To sum up, each statement in Wikidata has a rank:
- deprecated: the value is incorrect;
- normal (the default rank): the value is correct;
- preferred: the value is the best among the correct values.
The best values for a property in an item are the ones with the preferred rank if any exist, and the ones with the normal rank otherwise.
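As a small illustration, the selection of best values could be written as follows. This is only a sketch: `best_values` is a hypothetical helper, and each statement is assumed to be a simplified dictionary with a `rank` and a `value` key rather than the full structure found in a Wikidata dump.

```python
def best_values(statements):
    """Return the preferred values if there are any, else the normal ones.

    Deprecated statements are never returned.
    """
    preferred = [s["value"] for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s["value"] for s in statements if s["rank"] == "normal"]
```

With one normal, one preferred and one deprecated statement, only the preferred value is returned; if the preferred statement is absent, the normal value is returned instead.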
How can I get the number of biographies across all Wikipedias?
At the moment, Denelezh only provides statistics about a specific Wikimedia project or all Wikimedia projects together, not a subset of them.
What is the history of Denelezh?
The first version of Denelezh was released in March 2017. The second version, aka Denelezh 2.0, was released in April 2018, including many improvements:
- The gender gap by Wikimedia project is available.
- Statistics are made with all humans in Wikidata (and not only the ones with gender + year of birth + country of citizenship + occupation).
- There is no limit to the year of birth (statistics about humans born before 1600 are available).
- Occupations are deduced using the property subclass of.
Each major version was detailed in a blog post:
- First version, March 2017: A tool to estimate the gender gap in Wikidata and Wikipedia.
- Second version, April 2018: Denelezh 2.0, a transitional version.
Since March 2019, Denelezh has been hosted by Wikimédia France.
What is the origin of the name Denelezh?
Denelezh means Humanity (the sum of all humans) in Breton.
Where can I find the source code of Denelezh?
Denelezh is a free and open source project released under the AGPLv3 license. It is made up of two sub-projects:
Do similar tools exist?
Several tools providing statistics about the content of Wikimedia projects exist (sorted here by year of release):
- 2014: Wikidata Human Gender Indicators (WHGI), state of the art from an academic point of view; tracks the number of biographies in Wikidata:
  - gender gap by culture
  - gender gap by country of birth
  - gender gap by date of birth and by date of death
  - gender gap by Wikipedia language
- 2017: Denelezh, provides statistics about Wikidata items depicting human beings (all, with at least one sitelink, and the number of sitelinks):
  - gender gap by Wikimedia project
  - gender gap by country of citizenship
  - gender gap by occupation
  - gender gap by date of birth
  - multidimensional analysis (the ability to combine the previous axes)
  - evolution of gender gap
- 2018: WDCM Biases Dashboard, tracks the usage of Wikidata items (in how many pages each item is used in Wikimedia projects, not the number of sitelinks):
  - gender gap by Wikimedia project
  - gender gap by occupation
  - gender gap by North-South divide
Who are the authors and what are the licenses of the images and logos used on Denelezh?
Images and logos used on Denelezh have various authors and licenses. The full list is available in the NOTICE file in the Git repository of the project.