Type of site | Search engine |
---|---|
Created by | Allen Institute for Artificial Intelligence |
URL | semanticscholar |
Launched | November 2, 2015[1] |
Semantic Scholar is an artificial intelligence-powered research tool for scientific literature developed at the Allen Institute for AI and publicly released in November 2015.[2] It uses advances in natural language processing to provide summaries for scholarly papers.[3] The Semantic Scholar team is actively researching the use of artificial intelligence in natural language processing, machine learning, human-computer interaction, and information retrieval.[4]
Semantic Scholar began as a database covering computer science, geoscience, and neuroscience.[5] In 2017, the system began including biomedical literature in its corpus.[5] As of September 2022, it includes more than 200 million publications from all fields of science.[6]
Semantic Scholar provides a one-sentence summary of scientific literature. One of its aims was to address the challenge of reading numerous titles and lengthy abstracts on mobile devices.[7] It also seeks to ensure that the three million scientific papers published yearly reach readers, since it is estimated that only half of this literature is ever read.[8]
Artificial intelligence is used to capture the essence of a paper, generating the summary through an "abstractive" technique.[3] The project uses a combination of machine learning, natural language processing, and machine vision to add a layer of semantic analysis to the traditional methods of citation analysis, and to extract relevant figures, tables, entities, and venues from papers.[9][10]
In contrast with Google Scholar and PubMed, Semantic Scholar is designed to highlight the most important and influential elements of a paper.[11] The AI technology is designed to identify hidden connections and links between research topics.[12] Like the previously cited search engines, Semantic Scholar also exploits graph structures, which include the Microsoft Academic Knowledge Graph, Springer Nature's SciGraph, and the Semantic Scholar Corpus.[13]
Each paper hosted by Semantic Scholar is assigned a unique identifier called the Semantic Scholar Corpus ID (abbreviated S2CID). The following entry is an example:
Liu, Ying; Gayle, Albert A; Wilder-Smith, Annelies; Rocklöv, Joacim (March 2020). "The reproductive number of COVID-19 is higher compared to SARS coronavirus". Journal of Travel Medicine. 27 (2). doi:10.1093/jtm/taaa021. PMID 32052846. S2CID 211099356.
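As a rough illustration of how the identifiers in such an entry can be separated out, the following Python sketch pulls the DOI, PMID, and S2CID from the citation string above. The function name `extract_identifiers` and the regular expression are assumptions made for this example; they are not part of any Semantic Scholar tool or API, and the pattern only handles citations laid out like the one shown.

```python
import re

# Hypothetical helper for illustration only; not a Semantic Scholar API.
# Matches a label (doi, PMID, S2CID) followed by ":" or whitespace and a value.
ID_PATTERN = re.compile(r"\b(doi|PMID|S2CID)[:\s]\s*(\S+)")


def extract_identifiers(citation: str) -> dict:
    """Return a mapping such as {"DOI": ..., "PMID": ..., "S2CID": ...}."""
    identifiers = {}
    for label, value in ID_PATTERN.findall(citation):
        # Drop the sentence-ending period that follows each identifier.
        identifiers[label.upper()] = value.rstrip(".")
    return identifiers


citation = (
    "Liu, Ying; Gayle, Albert A; Wilder-Smith, Annelies; Rocklöv, Joacim "
    '(March 2020). "The reproductive number of COVID-19 is higher compared '
    'to SARS coronavirus". Journal of Travel Medicine. 27 (2). '
    "doi:10.1093/jtm/taaa021. PMID 32052846. S2CID 211099356."
)

identifiers = extract_identifiers(citation)
print(identifiers["S2CID"])
```

The S2CID is kept as a string rather than converted to an integer, since it serves as an identifier and not as a quantity.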
Semantic Scholar is free to use and, unlike similar search engines (e.g., Google Scholar), does not search for material that is behind a paywall.[14][5]
One study evaluated the search abilities of Semantic Scholar using a systematic approach and found the search engine to be 98.88% accurate when attempting to uncover the data.[14] The same study examined other Semantic Scholar functions, including tools to survey metadata as well as several citation tools.[14]
As of January 2018, following a 2017 project that added biomedical papers and topic summaries, the Semantic Scholar corpus included more than 40 million papers from computer science and biomedicine.[15] In March 2018, Doug Raymond, who developed machine learning initiatives for the Amazon Alexa platform, was hired to lead the Semantic Scholar project.[16] As of August 2019, the number of papers with included metadata (not the actual PDFs) had grown to more than 173 million[17] after the addition of the Microsoft Academic Graph records.[18] In 2020, a partnership between Semantic Scholar and the University of Chicago Press Journals made all articles published under the University of Chicago Press available in the Semantic Scholar corpus.[19] At the end of 2020, Semantic Scholar had indexed 190 million papers.[20]
In 2020, users of Semantic Scholar reached seven million a month.[7]
...the publicly available corpus compiled by Semantic Scholar -- a tool set up in 2015 by the Allen Institute for Artificial Intelligence in Seattle, Washington -- amounting to around 200 million articles, including preprints.