Paper title
PreprintMatch: a tool for preprint publication detection applied to analyze global inequities in scientific publishing
Paper authors
Paper abstract
Preprints, versions of scientific manuscripts that precede peer review, are growing in popularity. They offer an opportunity to democratize and accelerate research, as they involve neither publication costs nor a lengthy peer review process. Preprints are often later published in peer-reviewed venues, but these publications and the original preprints are frequently not linked in any way. To address this, we developed a tool, PreprintMatch, to find matches between preprints and their corresponding published papers, if they exist. This tool outperforms existing techniques for matching preprints and papers in both matching performance and speed. We applied PreprintMatch to search for matches between preprints (from bioRxiv and medRxiv) and PubMed. The preliminary nature of preprints offers a unique perspective on scientific projects at a relatively early stage, and with better matching between preprint and paper, we explored questions related to research inequity. We found that preprints from low-income countries are published as peer-reviewed papers at a lower rate than those from high-income countries (39.6\% and 61.1\%, respectively), and our data are consistent with previous work that cites a lack of resources, lack of stability, and policy choices to explain this discrepancy. Preprints from low-income countries were also found to be published more quickly (178 vs 203 days) and with less title, abstract, and author similarity to the published version than those from high-income countries. Preprints from low-income countries gain more authors between the preprint and the published version than those from high-income countries (0.42 vs 0.32 authors, respectively), a practice that is significantly more frequent in China than in similar countries. Finally, we found that some publishers publish work with authors from lower-income countries more frequently than others. PreprintMatch is available at \url{https://github.com/PeterEckmann1/preprint-match}.