Bibliometrics is the use of statistical methods to analyse books, articles and other publications. Bibliometric methods are frequently used in the field of library and information science. The sub-field of bibliometrics which concerns itself with the analysis of scientific publications is called scientometrics. Citation analysis is a commonly used bibliometric method which is based on constructing the citation graph, a network or graph representation of the citations between documents. Many research fields use bibliometric methods to explore the impact of their field, the impact of a set of researchers, the impact of a particular paper, or to identify particularly impactful papers within a specific field of research. Bibliometrics also has a wide range of other applications, such as in descriptive linguistics, the development of thesauri, and evaluation of reader usage.
Historically, bibliometric methods have been used to trace relationships amongst academic journal citations. Citation analysis, which involves examining an item's referring documents, is used in searching for materials and analyzing their merit. Citation indexes, such as the Institute for Scientific Information's Web of Science, allow users to search forward in time from a known article to more recent publications which cite the known item.
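The forward search described above can be illustrated with a minimal sketch in Python. The four papers and their reference lists are hypothetical toy data, and the function names `build_cited_by` and `citation_counts` are illustrative only; real citation indexes work at a far larger scale and with much messier records.

```python
from collections import defaultdict

# Hypothetical toy data: each paper maps to the list of papers it cites.
references = {
    "A": [],
    "B": ["A"],
    "C": ["A", "B"],
    "D": ["B"],
}

def build_cited_by(references):
    """Invert the reference lists: map each paper to the set of papers citing it."""
    cited_by = defaultdict(set)
    for paper, cited in references.items():
        cited_by[paper]  # make sure uncited papers still get an (empty) entry
        for target in cited:
            cited_by[target].add(paper)
    return dict(cited_by)

def citation_counts(cited_by):
    """Count how many papers cite each paper (its in-degree in the citation graph)."""
    return {paper: len(citing) for paper, citing in cited_by.items()}

cited_by = build_cited_by(references)

# Forward search: which later papers cite the known item "A"?
print(sorted(cited_by["A"]))      # ['B', 'C']
print(citation_counts(cited_by))  # {'A': 2, 'B': 2, 'C': 0, 'D': 0}
```

Storing the inverted index alongside the reference lists is what makes the forward search cheap: a paper's own reference list only points backward in time, so finding its citing documents otherwise requires scanning every other paper.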
Data from citation indexes can be analyzed to determine the popularity and impact of specific articles, authors, and publications. Using citation analysis to gauge the importance of one's work, for example, is a significant part of the tenure review process. Information scientists also use citation analysis to quantitatively assess the core journal titles and watershed publications in particular disciplines; interrelationships between authors from different institutions and schools of thought; and related data about the sociology of academia. More pragmatic applications of this information include the planning of retrospective bibliographies, "giving some indication both of the age of material used in a discipline, and of the extent to which more recent publications supersede the older ones"; indicating through high frequency of citation which documents should be archived; and comparing the coverage of secondary services, which can help publishers gauge their achievements and competition and can aid librarians in evaluating "the effectiveness of their stock". Citation data also have limitations. They are often incomplete or biased; data have largely been collected by hand, which is expensive, although citation indexes can also be used; and sources are frequently cited incorrectly. Further investigation into the rationale behind citing is therefore required before citation data can be applied with confidence.
Bibliometrics are now used in quantitative research assessment exercises of academic output, a development that is starting to threaten practice-based research. The UK government has considered using bibliometrics as a possible auxiliary tool in its Research Excellence Framework, a process which will assess the quality of the research output of UK universities and, on the basis of the assessment results, allocate research funding. This has met with significant scepticism and, after a pilot study, looks unlikely to replace the current peer review process. Furthermore, excessive use of bibliometrics in assessing the value of academic research encourages gaming the system in various ways, including publishing large quantities of work with little new content (see least publishable unit), publishing premature research to satisfy the numbers, and focusing on the popularity of a topic rather than on scientific value and the author's interest, often to the detriment of research. Some of these phenomena are addressed in a number of recent initiatives, including the San Francisco Declaration on Research Assessment.
Guidelines have been written on the use of bibliometrics in academic research, in disciplines such as management, education and information science. Other applications of bibliometrics include: creating thesauri; measuring term frequencies; serving as metrics in scientometric analysis; exploring grammatical and syntactical structures of texts; measuring usage by readers; quantifying the value of online media of communication; cluster analysis based on Jaccard distance; and text mining based on binary logistic regression.
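As an illustration of the Jaccard distance mentioned above, the following Python sketch compares two documents represented as sets of terms. The function name `jaccard_distance` and the two term sets are hypothetical, and treating two empty sets as identical is a convention assumed here rather than something stated in the text. The distance is one minus the size of the intersection divided by the size of the union.

```python
def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention: two empty sets count as identical
    return 1.0 - len(a & b) / len(union)

# Hypothetical term sets for two documents.
doc1 = {"citation", "graph", "impact"}
doc2 = {"citation", "impact", "journal", "index"}

# Two shared terms out of five distinct terms gives a distance of 1 - 2/5.
print(jaccard_distance(doc1, doc2))
```

A distance of 0 means the two sets are identical and a distance of 1 means they share nothing, so a matrix of pairwise distances can be passed to a clustering routine to group similar documents.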
In the context of big deal cancellations by several library systems around the world, libraries use data analysis tools such as Unpaywall Journals to help decide which subscriptions to cancel: they can avoid paying for materials that are already available through instant open access via open archives like PubMed Central.
The term bibliométrie was first used by Paul Otlet in 1934 and defined as "the measurement of all aspects related to the publication and reading of books and documents." The anglicised version bibliometrics was first used by Alan Pritchard in a paper published in 1969, titled "Statistical Bibliography or Bibliometrics?" He defined the term as "the application of mathematics and statistical methods to books and other media of communication".
Citation analysis has a long history: the Science Citation Index began publication in 1961, and Derek J. de Solla Price discussed the citation graph describing the network of citations in his 1965 article "Networks of Scientific Papers". However, this was initially done manually, until large-scale electronic databases and associated computer algorithms were able to cope with the vast numbers of documents in most bibliometric collections. The first such algorithm for automated citation extraction and indexing was by CiteSeer. Google's PageRank is based on the principle of citation analysis. Patent citation maps are also based upon citation analysis (in this case, the citation of one patent by another). It should be kept in mind, however, that humans have been publishing and citing since very early in history, with individual works containing citations that date back as far as antiquity.
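The link between PageRank and citation analysis can be made concrete with a simplified sketch in Python. It applies a basic power iteration to a hypothetical toy citation graph; the function name `pagerank`, the damping factor of 0.85, the fixed iteration count and the handling of papers without references are all assumptions made for illustration, and the sketch should not be read as Google's actual implementation. It assumes every cited paper also appears as a key in the input.

```python
def pagerank(references, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration over a citation graph.

    `references` maps each paper to the list of papers it cites.
    Assumes every cited paper is itself a key of `references`.
    """
    papers = list(references)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Every paper starts each round with the same baseline share.
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for paper, cited in references.items():
            if cited:
                # A paper passes its score equally to the papers it cites.
                share = damping * rank[paper] / len(cited)
                for target in cited:
                    new_rank[target] += share
            else:
                # A paper with no references spreads its score over all papers.
                share = damping * rank[paper] / n
                for target in papers:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical toy data: each paper maps to the papers it cites.
references = {
    "A": [],
    "B": ["A"],
    "C": ["A", "B"],
    "D": ["B"],
}

ranks = pagerank(references)
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 3))
```

In this toy graph papers A and B are each cited twice, yet A ends up with the higher score because one of its citations comes from B, which is itself well cited. Weighting a citation by the standing of the citing document is what separates this approach from a plain citation count.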