Posted on 2014-08-12, 00:00, authored by Neil R. Smalheiser, Jennifer L. D'Souza
We have created several novel journal metrics related directly or indirectly to author publication behavior. Our original motivation was to identify different ways of capturing the similarity of two journals, in a way that helps answer the following question: given any two articles in PubMed that share the same author name (last name, first initial), how well does knowing only the journals in which the articles were published predict the relative likelihood that they were written by the same person rather than by different persons? We employed the 2009 Author-ity author name disambiguation dataset as a gold standard for estimating the author odds ratio, which gives a straightforward, intuitive answer to this question. However, the author odds ratio is subject to several minor limitations, so we also devised two complementary journal metrics. The MeSH odds ratio measures the topical similarity of any pair of journals, based on the major MeSH headings assigned to articles in MEDLINE. The article pair odds ratio detects the tendency of authors to publish repeatedly in the same journal, as well as in specific pairs of journals. The metrics can be applied not only to estimate the similarity of journal pairs, but also to provide novel profiles of individual journals. For example, for each journal, one can define the MeSH cloud as the number of other journals that are topically more similar to it than expected by chance, and the author cloud as the number of other journals that share more authors with it than expected by chance. These metrics for journal pairs and individual journals are provided as public datasets that can be readily studied and used by others.
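The description above does not state the formulas behind the metrics, so the following Python sketch illustrates only the general idea of an odds ratio for a journal pair and of a "cloud" count. It rests on several assumptions: the input is a list of gold-standard article pairs labeled as same person or different persons, the odds for a journal pair are compared against the odds over all pairs, and a small smoothing constant avoids division by zero. The function names (`author_odds_ratios`, `author_cloud`) and the `smoothing` parameter are hypothetical. The definitions actually used in the dataset are given in the README file.

```python
from collections import Counter


def pair_key(journal_a, journal_b):
    """Return an order-independent key for a journal pair."""
    return tuple(sorted((journal_a, journal_b)))


def author_odds_ratios(labeled_pairs, smoothing=0.5):
    """Estimate an odds ratio for each journal pair (illustrative only).

    labeled_pairs: iterable of (journal_a, journal_b, same_person), one entry
    per pair of articles sharing an author name, where same_person is True if
    the gold standard says both articles are by the same individual.
    smoothing: assumed additive constant that avoids division by zero.
    """
    same, diff = Counter(), Counter()
    for journal_a, journal_b, same_person in labeled_pairs:
        key = pair_key(journal_a, journal_b)
        if same_person:
            same[key] += 1
        else:
            diff[key] += 1

    # Baseline odds of "same person" over all article pairs.
    baseline = (sum(same.values()) + smoothing) / (sum(diff.values()) + smoothing)

    ratios = {}
    for key in set(same) | set(diff):
        odds = (same[key] + smoothing) / (diff[key] + smoothing)
        ratios[key] = odds / baseline
    return ratios


def author_cloud(ratios, journal):
    """Count the other journals whose pair with `journal` has a ratio above 1."""
    return sum(
        1
        for (a, b), ratio in ratios.items()
        if ratio > 1.0 and a != b and journal in (a, b)
    )
```

Under these assumptions, a ratio above 1 for a journal pair means that two articles in those journals are more likely than average to be by the same person, and the cloud of a journal is the number of other journals it is linked to in this way.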
Funding
National Institutes of Health grants R01LM010817 and P01AG039347
Publisher Statement
This dataset contains journal pair metrics for pairs of journals in PubMed that share at least one author. See the README file for details.