Albuquerque, NM, United States

Ioannidis J.P.A.,Stanford University | Boyack K.W.,SciTech Strategies Inc. | Klavans R.,SciTech Strategies Inc.
PLoS ONE | Year: 2014

Background: The ability of a scientist to maintain a continuous stream of publication may be important, because research requires continuity of effort. However, there is no data on what proportion of scientists manages to publish each and every year over long periods of time. Methodology/Principal Findings: Using the entire Scopus database, we estimated that there are 15,153,100 publishing scientists (distinct author identifiers) in the period 1996-2011. However, only 150,608 (<1%) of them have published something in each and every year in this 16-year period (uninterrupted, continuous presence [UCP] in the literature). This small core of scientists with UCP are far more cited than others, and they account for 41.7% of all papers in the same period and 87.1% of all papers with >1000 citations in the same period. Skipping even a single year substantially affected the average citation impact. We also studied the birth and death dynamics of membership in this influential UCP core, by imputing and estimating UCP-births and UCP-deaths. We estimated that 16,877 scientists would qualify for UCP-birth in 1997 (no publication in 1996, UCP in 1997-2012) and 9,673 scientists had their UCP-death in 2010. The relative representation of authors with UCP was enriched in Medical Research, in the academic sector and in Europe/North America, while the relative representation of authors without UCP was enriched in the Social Sciences and Humanities, in industry, and in other continents. Conclusions: The proportion of the scientific workforce that maintains a continuous uninterrupted stream of publications each and every year over many years is very limited, but it accounts for the lion's share of researchers with high citation impact. This finding may have implications for the structure, stability and vulnerability of the scientific workforce. © 2014 Ioannidis et al.
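The UCP criterion described in the abstract above can be illustrated with a minimal sketch. It assumes the input is a list of hypothetical `(author_id, year)` publication records; the function name `authors_with_ucp` and the toy data are illustrative only and are not the authors' Scopus pipeline.

```python
from collections import defaultdict


def authors_with_ucp(records, first_year, last_year):
    """Return the author ids that published in every year of the window.

    `records` is an iterable of (author_id, year) pairs (hypothetical format).
    """
    years_by_author = defaultdict(set)
    for author_id, year in records:
        # Only publications inside the window count toward UCP.
        if first_year <= year <= last_year:
            years_by_author[author_id].add(year)
    window_length = last_year - first_year + 1
    return {
        author_id
        for author_id, years in years_by_author.items()
        if len(years) == window_length
    }


# Toy data: a1 publishes every year, a2 skips 1997, a3 starts in 1997.
records = [
    ("a1", 1996), ("a1", 1997), ("a1", 1998),
    ("a2", 1996), ("a2", 1998),
    ("a3", 1997), ("a3", 1998),
]
ucp_full_window = authors_with_ucp(records, 1996, 1998)
ucp_late_window = authors_with_ucp(records, 1997, 1998)
```

In this toy data only `a1` has UCP over 1996-1998, while `a3` has UCP over 1997-1998 with no publication in 1996, which loosely mirrors the "UCP-birth" pattern mentioned in the abstract.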


Boyack K.W.,SciTech Strategies Inc. | Klavans R.,SciTech Strategies Inc.
Journal of the American Society for Information Science and Technology | Year: 2013

The majority of the effort in metrics research has addressed research evaluation. Far less research has addressed the unique problems of research planning. Models and maps of science that can address the detailed problems associated with research planning are needed. This article reports on the creation of an article-level model and map of science covering 16 years and nearly 20 million articles using cocitation-based techniques. The map is then used to define discipline-like structures consisting of natural groupings of articles and clusters of articles. This combination of detail and high-level structure can be used to address planning-related problems such as identification of emerging topics and the identification of which areas of science and technology are innovative and which are simply persisting. In addition to presenting the model and map, several process improvements that result in greater accuracy structures are detailed, including a bibliographic coupling approach for assigning current papers to cocitation clusters and a sequential hybrid approach to producing visual maps from models. © 2013 ASIS&T.


Boyack K.W.,SciTech Strategies Inc. | Small H.,SciTech Strategies Inc. | Klavans R.,SciTech Strategies Inc.
Journal of the American Society for Information Science and Technology | Year: 2013

Historically, co-citation models have been based only on bibliographic information. Full-text analysis offers the opportunity to significantly improve the quality of the signals upon which these co-citation models are based. In this work we study the effect of reference proximity on the accuracy of co-citation clusters. Using a corpus of 270,521 full text documents from 2007, we compare the results of traditional co-citation clustering using only the bibliographic information to results from co-citation clustering where proximity between reference pairs is factored into the pairwise relationships. We find that accounting for reference proximity from full text can increase the textual coherence (a measure of accuracy) of a co-citation cluster solution by up to 30% over the traditional approach based on bibliographic information. © 2013 ASIS&T.


Boyack K.W.,SciTech Strategies Inc. | Klavans R.,SciTech Strategies Inc.
Journal of the Association for Information Science and Technology | Year: 2014

The majority of the effort in metrics research has addressed research evaluation. Far less research has addressed the unique problems of research planning. Models and maps of science that can address the detailed problems associated with research planning are needed. This article reports on the creation of an article-level model and map of science covering 16 years and nearly 20 million articles using cocitation-based techniques. The map is then used to define discipline-like structures consisting of natural groupings of articles and clusters of articles. This combination of detail and high-level structure can be used to address planning-related problems such as identification of emerging topics and the identification of which areas of science and technology are innovative and which are simply persisting. In addition to presenting the model and map, several process improvements that result in greater accuracy structures are detailed, including a bibliographic coupling approach for assigning current papers to cocitation clusters and a sequential hybrid approach to producing visual maps from models. © 2013 ASIS&T.


Klavans R.,SciTech Strategies Inc. | Boyack K.W.,SciTech Strategies Inc.
Journal of the American Society for Information Science and Technology | Year: 2011

We describe two general approaches to creating document-level maps of science. To create a local map, one defines and directly maps a sample of data, such as all literature published in a set of information science journals. To create a global map of a research field, one maps "all of science" and then locates a literature sample within that full context. We provide a deductive argument that global mapping should create more accurate partitions of a research field than does local mapping, followed by practical reasons why this may not be so. The field of information science is then mapped at the document level using both local and global methods to provide a case illustration of the differences between the methods. Textual coherence is used to assess the accuracies of both maps. We find that document clusters in the global map have significantly higher coherence than do those in the local map, and that the global map provides unique insights into the field of information science that cannot be discerned from the local map. Specifically, we show that information science and computer science have a large interface and that computer science is the more progressive discipline at that interface. We also show that research communities in temporally linked threads have a much higher coherence than do isolated communities, and that this feature can be used to predict which threads will persist into a subsequent year. Methods that could increase the accuracy of both local and global maps in the future also are discussed. © 2010 ASIS&T.


Boyack K.W.,SciTech Strategies Inc. | Klavans R.,SciTech Strategies Inc.
Journal of the American Society for Information Science and Technology | Year: 2010

In the past several years studies have started to appear comparing the accuracies of various science mapping approaches. These studies primarily compare the cluster solutions resulting from different similarity approaches, and give varying results. In this study we compare the accuracies of cluster solutions of a large corpus of 2,153,769 recent articles from the biomedical literature (2004-2008) using four similarity approaches: co-citation analysis, bibliographic coupling, direct citation, and a bibliographic coupling-based citation-text hybrid approach. Each of the four approaches can be considered a way to represent the research front in biomedicine, and each is able to successfully cluster over 92% of the corpus. Accuracies are compared using two metrics - within-cluster textual coherence as defined by the Jensen-Shannon divergence, and a concentration measure based on the grant-to-article linkages indexed in MEDLINE. Of the three pure citation-based approaches, bibliographic coupling slightly outperforms co-citation analysis using both accuracy measures; direct citation is the least accurate mapping approach by far. The hybrid approach improves upon the bibliographic coupling results in all respects. We consider the results of this study to be robust given the very large size of the corpus, and the specificity of the accuracy measures used. © 2010 ASIS&T.
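The Jensen-Shannon divergence named in the abstract above can be sketched for two word-probability distributions. This is a minimal illustration of the standard divergence only; the dictionary format, the function names, and the use of base-2 logarithms are assumptions, and the paper's full coherence measure is not reproduced here.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits, for dicts word -> probability."""
    return sum(pv * math.log2(pv / q[word]) for word, pv in p.items() if pv > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two word distributions (base-2 logs)."""
    vocabulary = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2; it is nonzero wherever p or q is nonzero.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocabulary}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical word distributions for a document and for its cluster.
document = {"citation": 0.5, "cluster": 0.5}
cluster = {"citation": 0.5, "cluster": 0.5}
unrelated = {"protein": 1.0}

jsd_same = jensen_shannon_divergence(document, cluster)
jsd_disjoint = jensen_shannon_divergence(document, unrelated)
```

With base-2 logarithms the divergence ranges from 0 for identical distributions to 1 for distributions with no words in common, so lower document-to-cluster values indicate a more textually coherent cluster.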


Small H.,Thomson Reuters | Small H.,SciTech Strategies Inc
Scientometrics | Year: 2011

It is proposed that citation contexts, the text surrounding references in scientific papers, be analyzed in terms of an expanded notion of sentiment, defined to include attitudes and dispositions toward the cited work. Maps of science at both the specialty and global levels are used as the basis of this analysis. Citation context samples are taken at these levels and contrasted for the appearance of cue word sets, analyzed with the aid of methods from corpus linguistics. Sentiments are shown to vary within a specialty and can be understood in terms of cognitive and social factors. Within-specialty and between-specialty co-citations are contrasted and in some cases suggest a correlation of sentiment with structural location. For example, the sentiment of "uncertainty" is important in interdisciplinary co-citation links, while "utility" is more prevalent within the specialty. Suggestions are made for linking sentiments to technical terms, and for developing sentiment "baselines" for all of science. © 2011 Akadémiai Kiadó, Budapest, Hungary.


Klavans R.,SciTech Strategies Inc. | Boyack K.W.,SciTech Strategies Inc.
Scientometrics | Year: 2010

We compare a new method for measuring research leadership with the traditional method. Both methods are objective and reliable, utilize standard citation databases, and are easily replicated. The traditional method uses partitions of science based on journal categories, and has been extensively used to measure national leadership patterns in science, including those appearing in the NSF Science & Engineering Indicators Reports and in prominent journals such as Science and Nature. Our new method is based on co-citation techniques at the paper level. It was developed with the specific intent of measuring research leadership at a university, and was then extended to examine national patterns of research leadership. A comparison of these two methods provides compelling evidence that the traditional method grossly underestimates research leadership in most countries. The new method more accurately portrays the actual patterns of research leadership at the national level. © 2010 Akadémiai Kiadó, Budapest, Hungary.


Grant
Agency: NSF | Branch: Standard Grant | Program: | Phase: STAR Metrics | Award Amount: 55.45K | Year: 2015

Understanding the outcomes associated with R&D funding requires accurate linking of funding with the papers produced by the funded work. This proposal develops a methodology to accurately link grants with topics, an important challenge for the development of a rigorous, quantitative understanding and analysis of science policy. The research will provide methods to accurately and consistently identify coherent research areas and systematically link those topic areas to research funding and research output, such as scientific publications, using text and other features.

The ability to accurately link grants and topics in a consistent way would be an important advance in the usefulness of STAR METRICS data. In addition to developing a methodology, this project also establishes the accuracy of existing NIH grant-to-article linkage data, and develops grant-to-article linkage data for non-NIH grants -- data which are not readily available. More accurate grant-article and grant-topic linkages will facilitate other research and be important to the STAR METRICS platform.


Grant
Agency: NSF | Branch: Standard Grant | Program: | Phase: | Award Amount: 200.00K | Year: 2011

This exploratory project (EMERG) is aimed at developing the capability to predict emerging topics in science from a highly detailed global model of the scientific literature. Identification of emergent opportunities in science is a central issue in academia and practice. Applications range from a simple understanding of the broader context in which individual research is conducted to the direction of research funds toward emerging topics. Previous studies of emergence have had the following shortcomings: they are retrospective (the area of emergence is identified after the fact), narrowly defined (lacking the context of related scientific topics) and/or highly aggregated (field level rather than topic level).

The approach is based on a highly detailed global model of science, consisting of hundreds of thousands of micro-communities over a period of nine years (2000-2008). The average size of these micro-communities is 15 papers per year. Micro-communities are linked from year to year using co-citation methods. Some micro-communities are part of long thread-like structures while others may be isolated. At the micro-structure level, science appears to have a high level of discontinuity. The mixture of continuity and discontinuity makes it possible to see emergence at the topical level. A variety of indicators, some structural, some based on the micro-community contents (articles, authors, ages, etc.), and some based on full text analysis, are calculated for each micro-community. The hypothesis is that several, if not all, of the proposed indicators will correlate with emergence.

To test this hypothesis, data from research funding agencies and foundations that identified emergent micro-communities will be collected, and the key articles responsible for emergence in those areas will be identified and tracked. This history will be compared with the results of indicators from the model of science. If successful, the indicators can be applied to a current (rather than retrospective) model, suggesting the particular current micro-communities in science that are emerging or that are likely to emerge in the next year or two.

This project provides a completely new method for developing useful knowledge about the micro-structure and dynamics of science and technology from literature databases, whether of scientific literature, patents, grants, or web resources. This work has the potential to transform the way the structure and dynamics of science and technology are understood, and to impact the conduct and management of research at the level of scientists, students, the general public, and policy makers. Project results will be disseminated via web site (http://www.mapofscience.com/emerg.html) and publications.
