Abstract:
As large AI models demonstrate increasingly human-like performance
on complex tasks, many scientists are developing
or adapting these models to empower their research and
applications. Because of the substantial costs involved in
building, training, and running large AI models, closed-source
models can often offer performance that cannot be
matched by open-source counterparts, making them tempting
tools for researchers even if they are not transparent or
accessible according to conventional academic standards.
Moreover, even researchers who are developing their own AI
models may face special challenges when trying to publish
their work in an open and reproducible manner. In particular,
the very large datasets required to train AI models are often
inherently hard to share, for reasons ranging from sheer size
to tricky copyright and privacy issues. In this editorial, we share some insights
and tips that we hope will help researchers in this field understand
our journal’s policies and prepare submissions for the
journal.