IMO, in my area at least, if the code for a paper isn't available, or nobody has replicated the paper with code that is available, it's hard to get invested (i.e. "this paper is interesting... oh, the code isn't available").

Speaking of which, I was impressed by this repository: https://github.com/MaximeVandegar/Papers-in-100-Lines-of-Code (yeah ok, it's code golf, but nice to see a minimal representation of different papers).

There's also a point to note that open data and data sovereignty seem (?) to have a lot of friction.

Cheers,

Matthew

On Fri, Nov 26, 2021 at 11:11 AM Lawrence D'Oliveiro <ldo@geek-central.gen.nz> wrote:

The fruits of scientific research are supposed to be open to all. A key
part of this is the need for reproducibility -- the idea that somebody
else can repeat the same experiments and analysis and (hopefully) come
to the same conclusions. It has long been a common expectation that
researchers will make their raw data available to others for this
purpose, but nowadays even that is likely no longer enough. The analysis
of the data usually requires some particular piece of computer software,
even if this was just some in-house scripting done on top of a
commonly-available toolkit or package.

Two different reports on this subject have come out recently, this one
<https://www.theregister.com/2021/11/25/research_software_inquiry/>
from the UK and this
<https://arstechnica.com/science/2021/11/keeping-science-reproducible-in-a-world-of-custom-code-and-data/>
with examples from the US and elsewhere. The latter goes into a lot more
detail, including good news (the rise of publicly-available data sets
which get heavily used for many different analyses), and bad:

  From 2017 through 2019, Tsuyoshi Miyakawa, the editor-in-chief of
  the journal Molecular Brain, replied to 41 article submissions by
  requesting that the authors provide their complete source data for
  review, as per the stated policy of the journal. Only one author
  did so.

  ...

  Based on his efforts to replicate papers from other statisticians,
  Thomas Lumley, a professor of biostatistics at the University of
  Auckland in New Zealand, says of the phrase "data available upon
  request": "When people put it in their papers, what they typically
  mean is 'data not available.'"

As for making code available, that has its own challenges: often the
scripts/programs are hastily thrown together, and the creators may be
embarrassed to have others see them in this state. Or the code is not
likely to work properly anyway, outside of the original systems where
it was developed.

The good news is, bodies that fund the research and the journals that
publish the results are becoming more aware of such issues, and
increasingly trying to ensure that procedures for dealing with them are
built into the projects from the beginning.
_______________________________________________
wlug mailing list -- wlug@list.waikato.ac.nz | To unsubscribe send an email to wlug-leave@list.waikato.ac.nz
Unsubscribe: https://list.waikato.ac.nz/postorius/lists/wlug.list.waikato.ac.nz