[R] Gender balance in R
skostysh at princeton.edu
Wed Nov 26 07:19:42 CET 2014
On Tue, Nov 25, 2014 at 1:15 PM, Martin Morgan <mtmorgan at fredhutch.org> wrote:
> On 11/25/2014 04:11 AM, Scott Kostyshak wrote:
>> On Mon, Nov 24, 2014 at 12:34 PM, Sarah Goslee <sarah.goslee at gmail.com>
>>> I took a look at apparent gender among list participants a few years ago:
>>> Same general thing: very few regular participants on the list were
>>> women. I don't see any sign that that has changed in the last three
>>> years. The bar to participation in the R-help list is much, much lower
>>> than that to become a developer.
>> I plotted the gender of posters on r-help over time. The plot is here:
>> The code to reproduce that plot is here:
>> The R file there will call devtools::install_github to install a
>> package from Github used for guessing the gender based on the first
>> name (https://github.com/scottkosty/gender).
> It would be great to include in your package the script that scraped author
> names from R-help archives (I guess that's what you did?). Presumably it
> easily applies to other mailing lists hosted at the same location (R-devel,
> further along the ladder from user to developer, and Bioconductor /
> Bioc-devel, in a different domain and perhaps confounded with a different
> 'feel' to the list). Also the R community is definitely international, so
> finding more versatile gender-assignment approaches seems important.
I just put the script up on https://github.com/scottkosty/genderAnalysis
I don't have much time at the moment to generalize it, but a pull
request is always welcome. Alternatively, anyone is welcome (at least
as far as I'm concerned) to take the script and modify it for any
> it might be interesting to ask about participation in mailing list forums
> versus other, and in particular the recent Bioconductor transition from
> mailing list to 'StackOverflow' style support forum
> (https://support.bioconductor.org) -- on the one hand the 'gamification'
> elements might seem to only entrench male participation, while on the other
> we have already seen increased (quantifiable) and broader (subjective)
> participation from the Bioconductor community. I'd be happy to make support
> site usage data available, and am interested in collaborating in an
> academically well-founded analysis of this data; any interested parties
> please feel free to contact me off-list.
I would be interested in collaborating on such a project in the future also.
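
For anyone who wants to try a name-based tally like the one discussed
earlier in this thread on another list, here is a rough sketch in base R.
It is not the script from the genderAnalysis repository and it does not use
the gender package. The posts, the lookup table name_gender, and the helper
guess_gender are all made up for illustration, and a first-name lookup like
this will misclassify or miss many names, especially on an international
list.

```r
# Made-up posts for illustration only; a real analysis would read
# author names and dates from the list archives.
posts <- data.frame(
  author = c("Alice Example", "Bob Example", "Carol Example",
             "Bob Example", "Bob Example", "Dana Example",
             "Alice Example"),
  year   = c(2012, 2012, 2013, 2013, 2013, 2013, 2014),
  stringsAsFactors = FALSE
)

# Tiny hypothetical lookup table (an assumption, not real data).
# A usable table would be far larger and still imperfect.
name_gender <- c(alice = "female", bob = "male", carol = "female")

# Guess gender from the first whitespace-separated token of the name.
# Names missing from the table are reported as "unknown".
guess_gender <- function(author) {
  first <- tolower(sub("[[:space:]].*$", "", trimws(author)))
  guess <- unname(name_gender[first])
  guess[is.na(guess)] <- "unknown"
  guess
}

posts$gender <- guess_gender(posts$author)

# Count distinct posters (not messages) per year and guessed gender.
posters <- unique(posts[, c("author", "year", "gender")])
counts <- table(posters$year, posters$gender)
print(counts)
```

Counting distinct posters rather than messages keeps one very active
participant from dominating a year's numbers; keeping the "unknown" column
visible shows how much of the list the lookup could not classify.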
Economics PhD Candidate
> Martin Morgan
>> Note also on that tweet that Gabriela de Queiroz posted it, who is the
>> founder of R-ladies; and that David Smith showed interest in
>> discussing the topic. So there is definitely demand for some data
>> analysis and discussion on the topic.
>>> It would be interesting to look at the stats for CRAN packages as well.
>>> The very low percentage of regular female participants is one of the
>>> things that keeps me active on this list: to demonstrate that it's not
>>> only men who use R and participate in the community.
>> Thank you for that!
>> Scott Kostyshak
>> Economics PhD Candidate
>> Princeton University
>>> (If you decide to do the stats for 2014, be aware that I've been out
>>> on medical leave for the past two months, so the numbers are even
>>> lower than usual.)
>>> On Mon, Nov 24, 2014 at 10:10 AM, Maarten Blaauw
>>> <maarten.blaauw at qub.ac.uk> wrote:
>>>> Hi there,
>>>> I can't help but notice that the gender balance among R developers and
>>>> ordinary members is extremely skewed (as it is with open source software).
>>>> Have a look at http://www.r-project.org/foundation/memberlist.html - at
>>>> most a handful of women are listed among the 'supporting members', and none
>>>> at all among the 29 'ordinary members'.
>>>> On the other hand I personally know many happy R users of both genders.
>>>> My questions are thus: Should R developers (and users) be worried that
>>>> the 'other half' is excluded? If so, how could female R users/developers be
>>>> persuaded to become more visible (e.g. added as supporting or ordinary
>>>> members)?
>>> Sarah Goslee
>>> R-help at r-project.org mailing list
>>> PLEASE do read the posting guide
>>> and provide commented, minimal, self-contained, reproducible code.
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793