MEDLINE: Difference between revisions

From Citizendium
imported>Robert Badgett
mNo edit summary
 
(56 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{subpages}}
{{subpages}}
{{TOC|right}}
According to the U.S. [[National Library of Medicine]], "MEDLINE® (Medical Literature Analysis and Retrieval System Online) is the U.S. National Library of Medicine's® (NLM) premier bibliographic database that contains over 16 million references to journal articles in life sciences with a concentration on biomedicine. A distinctive feature of MEDLINE is that the records are indexed with NLM's [[Medical Subject Headings]] ([[MeSH]]®)."<ref name="titleMEDLINE Fact Sheet">{{cite web |url=http://www.nlm.nih.gov/pubs/factsheets/medline.html |title=MEDLINE Fact Sheet |accessdate=2008-01-22 |author= |authorlink= |coauthors= |date= |format= |work= |publisher=National Library of Medicine |pages= |language= |archiveurl= |archivedate= |quote=}}</ref>
According to the U.S. [[National Library of Medicine]], "MEDLINE® (Medical Literature Analysis and Retrieval System Online) is the U.S. National Library of Medicine's® (NLM) premier bibliographic database that contains over 16 million references to journal articles in life sciences with a concentration on biomedicine. A distinctive feature of MEDLINE is that the records are indexed with NLM's [[Medical Subject Headings]] ([[MeSH]]®)."<ref name="titleMEDLINE Fact Sheet">{{cite web |url=http://www.nlm.nih.gov/pubs/factsheets/medline.html |title=MEDLINE Fact Sheet |accessdate=2008-01-22 |author= |authorlink= |coauthors= |date= |format= |work= |publisher=National Library of Medicine |pages= |language= |archiveurl= |archivedate= |quote=}}</ref>


[[PubMed]] is the National Library of Medicine's free online search system for MEDLINE.
[[PubMed]] is the National Library of Medicine's free online search system for MEDLINE.
PubMed provides relevance feedback through its "See related" feature.<ref name="pmid17971238">{{cite journal| author=Lin J, Wilbur WJ| title=PubMed related articles: a probabilistic topic-based model for content similarity. | journal=BMC Bioinformatics | year= 2007 | volume= 8 | issue=  | pages= 423 | pmid=17971238 | doi=10.1186/1471-2105-8-423 | pmc=PMC2212667 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=17971238  }} </ref><ref>Anonymous (2011). [http://www.ncbi.nlm.nih.gov/books/NBK3827/ PubMed Help]: [http://www.ncbi.nlm.nih.gov/books/NBK3827/#pubmedhelp.Computation_of_Related_Citati Computation of Related Citations]</ref>


==Structure==
==Structure==
MEDLINE® (Medical Literature Analysis and Retrieval System Online) is a database of predominantly biomedical bibliographic citations maintained by the U.S. [[National Library of Medicine]] (NLM).<ref>{{cite web |url=http://www.nlm.nih.gov/pubs/factsheets/medline.html |title=MEDLINE Fact Sheet |author=National Library of Medicine|accessdate=2007-11-09 |format= |work=}}</ref> The process sofr selecting journals is described.<ref name="urlMEDLINE® Journal Selection Fact Sheet">{{cite web |url=http://www.nlm.nih.gov/pubs/factsheets/jsel.html |title=MEDLINE® Journal Selection Fact Sheet |author=Anonymous |authorlink= |coauthors= |date=2007 |format= |work= |publisher=National Library of Medicine |pages= |language= |archiveurl= |archivedate= |quote= |accessdate=2010-04-04}}</ref> Each citation includes bibliographic data, abstract if available, links to full text of the article and keywords. The keywords are indexed with the NLM's Medical Subject Headings (MeSH®)<ref>{{cite web |url=http://www.nlm.nih.gov/pubs/factsheets/mesh.html |title=Medical Subject Headings (MESH®) Fact Sheet |author=National Library of Medicine |accessdate=2007-11-09 |format= |work=}}</ref> and subheadings<ref name="titleQualifiers - 2008">{{cite web |url=http://www.nlm.nih.gov/mesh/topsubscope2008.html |title=Qualifiers - 2008 |accessdate=2008-03-19 |author=Anonymous |authorlink= |coauthors= |date=2008 |format= |work= |publisher=National Library of Medicine |pages= |language= |archiveurl= |archivedate= |quote=}}</ref>.
MEDLINE® (Medical Literature Analysis and Retrieval System Online) is a database of predominantly biomedical bibliographic citations maintained by the U.S. [[National Library of Medicine]] (NLM).<ref>{{cite web |url=http://www.nlm.nih.gov/pubs/factsheets/medline.html |title=MEDLINE Fact Sheet |author=National Library of Medicine|accessdate=2007-11-09 |format= |work=}}</ref> Each citation includes bibliographic data, abstract if available, links to full text of the article and keywords.


The important MeSH terms “Randomized Controlled Trial” and “Clinical Controlled Trial” were introduced in 1991 and 1995, respectively.<ref name="pmid16636704">{{cite journal| author=Glanville JM, Lefebvre C, Miles JN, Camosso-Stefinovic J| title=How to identify randomized controlled trials in MEDLINE: ten years on. | journal=J Med Libr Assoc | year= 2006 | volume= 94 | issue= 2 | pages= 130-6 | pmid=16636704
The process for selecting journals is described.<ref name="urlMEDLINE® Journal Selection Fact Sheet">{{cite web |url=http://www.nlm.nih.gov/pubs/factsheets/jsel.html |title=MEDLINE® Journal Selection Fact Sheet |author=Anonymous |authorlink= |coauthors= |date=2007 |format= |work= |publisher=National Library of Medicine |pages= |language= |archiveurl= |archivedate= |quote= |accessdate=2010-04-04}}</ref>  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&email=badgett@uthscdsa.edu&retmode=ref&cmd=prlinks&id=16636704 | pmc=PMC1435857 }} <!--Formatted by http://sumsearch.uthscsa.edu/cite/--></ref> The [[Cochrane Collaboration]] helps MEDLINE correctly retag articles with these terms.<ref name="pmid16636704"/>


The National Library of Medicine's [http://ii.nlm.nih.gov/ Indexing Initiative] is trying to automate assignment of MeSH terms.
===MeSH terms===
The keywords are indexed with the NLM's Medical Subject Headings (MeSH®)<ref>{{cite web |url=http://www.nlm.nih.gov/pubs/factsheets/mesh.html |title=Medical Subject Headings (MESH®) Fact Sheet |author=National Library of Medicine |accessdate=2007-11-09 |format= |work=}}</ref> and subheadings<ref name="titleQualifiers - 2008">{{cite web |url=http://www.nlm.nih.gov/mesh/topsubscope2008.html |title=Qualifiers - 2008 |accessdate=2008-03-19 |author=Anonymous |authorlink= |coauthors= |date=2008 |format= |work= |publisher=National Library of Medicine |pages= |language= |archiveurl= |archivedate= |quote=}}</ref>.  


The National Library of Medicine is investigated whether indexing MeSH terms can be either fully or semi-automated.<ref name="titleIndexing Initiative">{{cite web |url=http://ii.nlm.nih.gov/ |title=Indexing Initiative |author=National Library of Medicine |accessdate=2007-11-25 |format= |work=}}</ref>
The important MeSH terms "Randomized Controlled Trial" and "Controlled Clinical Trial" were introduced in 1991 and 1995, respectively.<ref name="pmid16636704">{{cite journal| author=Glanville JM, Lefebvre C, Miles JN, Camosso-Stefinovic J| title=How to identify randomized controlled trials in MEDLINE: ten years on. | journal=J Med Libr Assoc | year= 2006 | volume= 94 | issue= 2 | pages= 130-6 | pmid=16636704 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&email=badgett@uthscdsa.edu&retmode=ref&cmd=prlinks&id=16636704 | pmc=PMC1435857 }}</ref> The [[Cochrane Collaboration]] helps MEDLINE correctly retag articles with these terms.<ref name="pmid16636704"/>
 
The National Library of Medicine's [http://ii.nlm.nih.gov/ Indexing Initiative] is investigating whether the assignment of MeSH terms can be fully or semi-automated.<ref name="titleIndexing Initiative">{{cite web |url=http://ii.nlm.nih.gov/ |title=Indexing Initiative |author=National Library of Medicine |accessdate=2007-11-25 |format= |work=}}</ref> Indexing of MeSH terms by humans is assisted by the Medical Text Indexer (MTI).<ref>Anonymous. [http://ii.nlm.nih.gov/mti.shtml Medical Text Indexer (MTI)]. National Library of Medicine</ref>
 
Studies have found inconsistency in the assignment of MeSH terms.<ref name="pmid10106507">{{cite journal| author=Booth A| title=How consistent is MEDLINE indexing? A few reservations. | journal=Health Libr Rev | year= 1990 | volume= 7 | issue= 1 | pages= 22-6 | pmid=10106507 | doi= | pmc= | url= }} </ref><ref name="pmid6344946">{{cite journal| author=Funk ME, Reid CA| title=Indexing consistency in MEDLINE. | journal=Bull Med Libr Assoc | year= 1983 | volume= 71 | issue= 2 | pages= 176-83 | pmid=6344946 | doi= | pmc=PMC227138 | url= }} </ref><ref name="pmid18075808">{{cite journal| author=Portaluppi F| title=Consistency and accuracy of the Medical Subject Headings thesaurus for electronic indexing and retrieval of chronobiologic references. | journal=Chronobiol Int | year= 2007 | volume= 24 | issue= 6 | pages= 1213-29 | pmid=18075808 | doi=10.1080/07420520701791570 | pmc= | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=18075808  }} </ref>


==Methods to improve searching MEDLINE==
==Methods to improve searching MEDLINE==
There is much ongoing research into improving MEDLINE search results.
 
{| class="wikitable"
|+ Studies of searching MEDLINE.
<ref name="pmid19272463">{{cite journal| author=Tanaka LY, Herskovic JR, Iyengar MS, Bernstam EV| title=Sequential result refinement for searching the biomedical literature. | journal=J Biomed Inform | year= 2009 | volume= 42 | issue= 4 | pages= 678-84 | pmid=19272463 | doi=10.1016/j.jbi.2009.02.009 | pmc=PMC2722929 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=19272463  }} </ref>
<ref name="pmid20841667">{{cite journal| author=Bekhuis T, Demner-Fushman D| title=Towards automating the initial screening phase of a systematic review. | journal=Stud Health Technol Inform | year= 2010 | volume= 160 | issue= Pt 1 | pages= 146-50 | pmid=20841667 | doi= | pmc= | url= }} </ref>
<ref name="pmid20102628">{{cite journal| author=Wallace BC, Trikalinos TA, Lau J, Brodley C, Schmid CH| title=Semi-automated screening of biomedical citations for systematic reviews. | journal=BMC Bioinformatics | year= 2010 | volume= 11 | issue=  | pages= 55 | pmid=20102628 | doi=10.1186/1471-2105-11-55 | pmc=PMC2824679 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=20102628  }} </ref>
<ref name="pmid18952929">{{cite journal| author=Kilicoglu H, Demner-Fushman D, Rindflesch TC, Wilczynski NL, Haynes RB| title=Towards automatic recognition of scientifically rigorous clinical research evidence. | journal=J Am Med Inform Assoc | year= 2009 | volume= 16 | issue= 1 | pages= 25-31 | pmid=18952929 | doi=10.1197/jamia.M2996 | pmc=PMC2605595 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=18952929  }} </ref>
<ref name="pmid17600104">{{cite journal| author=Lin Y, Li W, Chen K, Liu Y| title=A document clustering and ranking system for exploring MEDLINE citations. | journal=J Am Med Inform Assoc | year= 2007 | volume= 14 | issue= 5 | pages= 651-61 | pmid=17600104 | doi=10.1197/jamia.M2215 | pmc=PMC1975797 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=17600104  }} </ref>
<ref name="pmid17911810" />
<ref name="pmid16622165">{{cite journal|  author=Aphinyanaphongs Y, Statnikov A, Aliferis CF| title=A comparison  of citation metrics to machine learning filters for the identification  of high quality MEDLINE documents. | journal=J Am Med Inform Assoc |  year= 2006 | volume= 13 | issue= 4 | pages= 446-55 | pmid=16622165 |  doi=10.1197/jamia.M2031 | pmc=PMC1513679 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=16622165  }} </ref>
<ref name="pmid16357352">{{cite journal| author=Cohen AM, Hersh WR, Peterson K, Yen PY| title=Reducing workload in systematic review preparation using automated citation classification. | journal=J Am Med Inform Assoc | year= 2006 | volume= 13 | issue= 2 | pages= 206-19 | pmid=16357352 | doi=10.1197/jamia.M1929 | pmc=PMC1447545 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=16357352  }} </ref>
<ref name="pmid16221938">{{cite journal |author=Bernstam EV, Herskovic JR, Aphinyanaphongs Y, Aliferis CF, Sriram MG, Hersh WR |title=Using citation data to improve retrieval from MEDLINE |journal=J Am Med Inform Assoc |volume=13 |issue=1 |pages=96–105 |year=2006 |pmid=16221938 |doi=10.1197/jamia.M1909}} ''This  study may have been biased towards ranking systems because 1) all  retrieval methods analyzed a "preliminary result set using simple PubMed  queries, 2) the boolean filters were developed in 1994 as the authors  probably completed the study prior to the 2005 update of PubMed filters"''</ref>
<ref name="pmid15561789">{{cite journal|  author=Aphinyanaphongs Y, Tsamardinos I, Statnikov A, Hardin D,  Aliferis CF| title=Text categorization models for high-quality article  retrieval in internal medicine. | journal=J Am Med Inform Assoc | year=  2005 | volume= 12 | issue= 2 | pages= 207-16 | pmid=15561789 |  doi=10.1197/jamia.M1641 | pmc=PMC551552 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=15561789  }} </ref>
 
! Study !! Setting !! Method !! Results !! Comments
|-
| Tanaka<ref name="pmid19272463"/><br/>2009|| &nbsp; || &nbsp;|| A score based on MeSH terms, [[journal impact factor]], and number of authors can predict future citation patterns|| &nbsp;
|-
| Bekhuis et al.<ref name="pmid20841667"/><br/>2010|| Locating ''relevant'' studies for systematic reviews|| Supervised [[machine learning]] with evolutionary SVM<br/>&bull;&nbsp;Ensemble of four SVM classifiers (title;abstract;metadata)|| &nbsp;|| Mean precision ranges from 26% to 37%
|-
| Wallace et al.<ref name="pmid20102628"/><br/>2010|| Locating ''relevant'' studies for systematic reviews|| Supervised [[machine learning]] with SVM<br/>&bull;&nbsp;Ensemble of four SVM classifiers (title;abstract;MeSH;UMLS)|| &nbsp;|| Reduced the number of articles to manually review by 40% to 50%.
|-
| Kilicoglu<ref name="pmid18952929"/><br/>2009 || Identifying ''high quality'' studies of interventions ([[randomized controlled trial]]s) in Internal Medicine||Supervised [[machine learning]] with SVM<br/>&bull;&nbsp;Naïve Bayes<br/>&bull;&nbsp;Boosting<br/>&bull;&nbsp;Ensemble learning method (stacking) || 82.5% precision<br/>84.3% recall ||
|-
| Lin<ref name="pmid17600104"/><br/>2007|| Oncology<br/>Trained with “annotated bibliography of important literature on common problems in surgical oncology” (SSO-AB)<br/>Last updated 2001<br/>(16% of SSO-AB are [[randomized controlled trial]]s)|| &nbsp;|| &nbsp;|| Citation count per year (CCPY) outperformed citation count (CC) and [[journal impact factor]] (JIF)
|-
| Fu<ref name="pmid17911810"/><br/>2007|| &nbsp;|| &nbsp;|| &nbsp;|| "impact factor and clinical query filters are unstable for different  topics while a topic-specific impact factor and machine learning-based  filter models appear more robust"
|-
| Cohen<ref name="pmid16357352"/><br/>2006|| Locating ''relevant'' studies for updating systematic reviews|| &nbsp;|| &nbsp;|| &nbsp;
|-
| Aphinyanaphongs<ref name="pmid16622165"/><br/>2006|| Identifying ''high quality'' studies in Internal Medicine<br/>Trained with articles cited by ACP Journal Club|| Supervised [[machine learning]] with SVM<br/>&bull;&nbsp;Citation metrics|| &nbsp;|| "machine learning filters outperform standard citation metrics...the filter models have to be built specifically for this task and gold standard...Previous research that claimed better performance of citation metrics than machine learning in one of the corpora examined here is attributed to using machine learning filters built for a different gold standard and task"
|-
| Bernstam<ref name="pmid16221938"/><br/>2006|| Oncology<br/>“annotated bibliography of important literature on common problems in surgical oncology” (SSO-AB)<br/>Last updated 2001<br/>(16% of SSO-AB are [[randomized controlled trial]]s)<br/>Trained with ACP Journal Club high quality studies||Supervised [[machine learning]] with SVM<br/>versus<br/>[[Journal impact factor]], citation count per article, PageRank per article<br/>versus<br/>PubMed's Clinical Queries<br/>(searches performed in 2004?)|| Precision of first 50 citations:<br/>PageRank 9%<br/>Citation count 8%<br/>Impact factor 2%|| "Citation-based algorithms were more effective than noncitation-based algorithms"
|-
| Aphinyanaphongs<ref name="pmid15561789"/><br/>2005|| Identifying ''high quality'' studies in Internal Medicine<br/>Trained with articles cited by ACP Journal Club ||Supervised [[machine learning]] with SVM<br/>&bull;&nbsp;Naïve Bayes<br/>&bull;&nbsp;Boosting<br/>versus<br/>1994 PubMed Clinical Queries|| &nbsp;|| Machine learning was more precise.
|-
| &nbsp;|| &nbsp;|| &nbsp;|| &nbsp;|| &nbsp;
|-
| colspan="5"|Notes:<br/>SVM: support vector machine (see [[machine learning]])
|}
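Several studies in the table compare ranking methods by the precision of the top-ranked citations. The following is a minimal Python sketch of that kind of comparison, not a reproduction of any study's method. The article names, citation counts, and gold standard are hypothetical, and defining citation count per year as citations divided by years since publication is an assumption made for illustration.

```python
from typing import Dict, List, Set, Tuple


def citations_per_year(citations: int, pub_year: int, as_of_year: int) -> float:
    """Citation count divided by years since publication (at least 1).

    This definition is an assumption made for illustration.
    """
    return citations / max(as_of_year - pub_year, 1)


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the first k ranked items that are in the gold standard."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


# Hypothetical articles: id -> (citation count, publication year).
articles: Dict[str, Tuple[int, int]] = {
    "old_review": (120, 1990),
    "recent_trial": (60, 2004),
    "case_report": (5, 2003),
    "landmark_trial": (300, 1998),
}
important = {"recent_trial", "landmark_trial"}  # hypothetical gold standard

# Rank once by raw citation count and once by citations per year.
by_count = sorted(articles, key=lambda a: articles[a][0], reverse=True)
by_ccpy = sorted(
    articles,
    key=lambda a: citations_per_year(articles[a][0], articles[a][1], 2006),
    reverse=True,
)

print("by count:", by_count, precision_at_k(by_count, important, 2))
print("by CCPY: ", by_ccpy, precision_at_k(by_ccpy, important, 2))
```

The toy data are constructed so that the two rankings differ; they are not evidence for either method.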
 
There is much ongoing research into improving MEDLINE search results.<ref name="pmid21245076">{{cite journal| author=Lu Z| title=PubMed and beyond: a survey of web tools for searching biomedical literature. | journal=Database (Oxford) | year= 2011 | volume= 2011 | issue=  | pages= baq036 | pmid=21245076 | doi=10.1093/database/baq036 | pmc=PMC3025693 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=21245076  }} </ref><ref name="pmid18660511">{{cite journal| author=Kim JJ, Rebholz-Schuhmann D| title=Categorization of services for seeking information in biomedical literature: a typology for improvement of practice. | journal=Brief Bioinform | year= 2008 | volume= 9 | issue= 6 | pages= 452-65 | pmid=18660511 | doi=10.1093/bib/bbn032 | pmc= | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=18660511  }} </ref>


===Citation tracking===
===Citation tracking===
Line 21: Line 70:


===Clustering===
===Clustering===
Clustering search results may help.<ref name="pmid17600104">{{cite journal |author=Lin Y, Li W, Chen K, Liu Y |title=A document clustering and ranking system for exploring MEDLINE citations |journal=J Am Med Inform Assoc |volume=14 |issue=5 |pages=651–61 |year=2007 |pmid=17600104 |doi=10.1197/jamia.M2215}}</ref>
Clustering search results may help.<ref name="pmid17600104" />


===Filters (hedges)===
===Filters (hedges)===
MEDLINE filters, also called hedges, are an optimal Boolean combination of search terms, both textword and MeSH terms, to search articles. Many filters have been made by the [http://hiru.mcmaster.ca/hiru/HIRU_Hedges_home.aspx Hedges Team] and are available as [http://www.ncbi.nlm.nih.gov/corehtml/query/static/clinical.shtml Clinical Queries] at [[PubMed]]. Filters have been criticized for being imperfect.<ref name="pmid16488353">{{cite journal| author=Leeflang MM, Scholten RJ, Rutjes AW, Reitsma JB, Bossuyt PM| title=Use of methodological search filters to identify diagnostic accuracy studies can lead to the omission of relevant studies. | journal=J Clin Epidemiol | year= 2006 | volume= 59 | issue= 3 | pages= 234-40 | pmid=16488353  
MEDLINE filters, also called hedges, are optimized Boolean combinations of search terms, both text words and MeSH terms, used to retrieve articles. Many filters have been made by the [http://hiru.mcmaster.ca/hiru/HIRU_Hedges_home.aspx Hedges Team] and are available as [http://www.ncbi.nlm.nih.gov/corehtml/query/static/clinical.shtml Clinical Queries] at [[PubMed]]. Filters may improve the efficiency (precision) of searches by physicians.<ref name="pmid22249990">{{cite journal| author=Shariff SZ, Sontrop JM, Haynes RB, Iansavichus AV, McKibbon KA, Wilczynski NL et al.| title=Impact of PubMed search filters on the retrieval of evidence by physicians. | journal=CMAJ | year= 2012 | volume= 184 | issue= 3 | pages= E184-90 | pmid=22249990 | doi=10.1503/cmaj.101661 | pmc=PMC3281182 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=22249990  }} </ref> The Clinical Queries at PubMed may improve the quality of articles retrieved.<ref name="pmid21680559">{{cite journal| author=Lokker C, Haynes RB, Wilczynski NL, McKibbon KA, Walter SD| title=Retrieval of diagnostic and treatment studies for clinical use through PubMed and PubMed's Clinical Queries filters. | journal=J Am Med Inform Assoc | year= 2011 | volume=  | issue=  | pages=  | pmid=21680559 | doi=10.1136/amiajnl-2011-000233 | pmc= | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=21680559  }} </ref>
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&retmode=ref&cmd=prlinks&id=16488353 | doi=10.1016/j.jclinepi.2005.07.014 }} <!--Formatted by http://sumsearch.uthscsa.edu/cite/--></ref>
 
Filters have been criticized for being imperfect.<ref name="pmid16488353">{{cite journal| author=Leeflang MM, Scholten RJ, Rutjes AW, Reitsma JB, Bossuyt PM| title=Use of methodological search filters to identify diagnostic accuracy studies can lead to the omission of relevant studies. | journal=J Clin Epidemiol | year= 2006 | volume= 59 | issue= 3 | pages= 234-40 | pmid=16488353  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&retmode=ref&cmd=prlinks&id=16488353 | doi=10.1016/j.jclinepi.2005.07.014 }}</ref>
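Filter performance is usually summarized as sensitivity, specificity, and precision, all of which follow from a two-by-two count of target and non-target articles. The sketch below shows the arithmetic only. The class name <code>FilterCounts</code> and the counts are hypothetical and are not taken from any cited study.

```python
from dataclasses import dataclass


@dataclass
class FilterCounts:
    """Two-by-two counts for a filter judged against a gold standard."""
    true_pos: int   # target articles the filter retrieved
    false_neg: int  # target articles the filter missed
    false_pos: int  # non-target articles the filter retrieved
    true_neg: int   # non-target articles the filter excluded

    def sensitivity(self) -> float:
        # Share of target articles that the filter retrieves (recall).
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Share of non-target articles that the filter excludes.
        return self.true_neg / (self.true_neg + self.false_pos)

    def precision(self) -> float:
        # Share of retrieved articles that are target articles.
        return self.true_pos / (self.true_pos + self.false_pos)


# Hypothetical counts: 100 target articles among 10,000 in total.
counts = FilterCounts(true_pos=90, false_neg=10, false_pos=300, true_neg=9600)

print(f"sensitivity = {counts.sensitivity():.3f}")
print(f"specificity = {counts.specificity():.3f}")
print(f"precision   = {counts.precision():.3f}")
```

With these made-up counts the filter excludes most non-target articles, yet fewer than a quarter of the retrieved articles are target articles, because target articles are rare in the collection. This is one way a filter can be imperfect even when its specificity is high.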


====Filters for article types====
====Filters for article types====
One filter is for identifying [[randomized controlled trial]]s. Many MEDLINE filters have been developed by the Hedges team<ref name="titleSearch Strategies">{{cite web |url=http://hiru.mcmaster.ca/hedges/ |title=Search Strategies |author=Hedges Team|accessdate=2007-11-25 |format= |work=}}</ref> supported by a grant from the National Library of Medicine.<ref name="5R01LM006866-07">{{cite web |url=http://projectreporter.nih.gov/project_info_details.cfm?aid=7286387 |title=Project Information - NIH RePORTER – NIH Research Portfolio Online Reporting Tool Expenditures and Results |accessdate=2007-11-25 |format= |work=}}</ref> Examples include filters for [[randomized controlled trial]]s<ref name="pmid19712211">{{cite journal| author=McKibbon KA, Wilczynski NL, Haynes RB| title=Retrieving randomized controlled trials from MEDLINE: a comparison of 38 published search filters. | journal=Health Info Libr J | year= 2009 | volume= 26 | issue= 3 | pages= 187-202 | pmid=19712211  
{| class="wikitable" border="1" align="right"
|+ Evolution of search filters
! Purpose category!! Strategy with<br/>high sensitivity!!Strategy with<br/>high specificity
|-
|colspan="3" style="text-align:center"|1994 (developed with articles from 10 major journals)<ref name="pmid7850570">{{cite journal| author=Haynes RB, Wilczynski N, McKibbon KA, Walker CJ, Sinclair JC| title=Developing optimal search strategies for detecting clinically sound studies in MEDLINE. | journal=J Am Med Inform Assoc | year= 1994 | volume= 1 | issue= 6 | pages= 447-58 | pmid=7850570 | doi= | pmc=PMC116228 | url= }} </ref>
|-
| Treatment||randomized controlled trial[Publication Type] OR drug therapy[MeSH Subheading] OR therapeutic use[MeSH Subheading] OR random*[Title/Abstract]<br/>
&bull;&nbsp;Sensitivity = 99%<br/>
&bull;&nbsp;Specificity = 74%<br/>
|| placebo*[Title/Abstract] OR (double[Title/Abstract] AND blind*[Title/Abstract])
|-
| Diagnosis|| ||
|-
|colspan="3" style="text-align:center"|2005 (developed with articles from 160 journals)<ref>{{cite web |url= http://www.ncbi.nlm.nih.gov/books/NBK3827/#pubmedhelp.Clinical_Queries_Filters |title=PubMed Help |first= |last=Anonymous |work=National Center for Biotechnology Information |year=2011 [last update] |accessdate=August 15, 2011}}</ref><ref name="titleSearch Strategies">{{cite web |url=http://hiru.mcmaster.ca/hiru/HIRU_Hedges_MEDLINE_Strategies.aspx |title=Search Strategies |author=Hedges Team|accessdate=2011-03-15 |format= |work=}}</ref>
|-
| Treatment<ref name="pmid15894554">{{cite journal| author=Haynes RB, McKibbon KA, Wilczynski NL, Walter SD, Werre SR, Hedges Team| title=Optimal search strategies for retrieving scientifically strong studies of treatment from Medline: analytical survey. | journal=BMJ | year= 2005 | volume= 330 | issue= 7501 | pages= 1179 | pmid=15894554 | doi=10.1136/bmj.38446.498542.8F | pmc=PMC558012 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=15894554  }} </ref>||(clinical[Title/Abstract] AND trial[Title/Abstract]) OR clinical trials[MeSH Terms] OR clinical trial[Publication Type] OR random*[Title/Abstract] OR random allocation[MeSH Terms] OR therapeutic use[MeSH Subheading]<br/>
&bull;&nbsp;Sensitivity = 99%<br/>
&bull;&nbsp;Specificity = 70%<br/>
|| randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract])<br/>
&bull;&nbsp;Sensitivity = 93%<br/>
&bull;&nbsp;Specificity = 97%
|-
| Diagnosis||sensitiv*[Title/Abstract] OR sensitivity and specificity[MeSH Terms] OR diagnos*[Title/Abstract] OR diagnosis[MeSH:noexp] OR diagnostic * [MeSH:noexp] OR diagnosis,differential[MeSH:noexp] OR diagnosis[Subheading:noexp]|| specificity[Title/Abstract]
|}
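A filter is applied by combining it with a topic query using AND. The following Python sketch does this with the 2005 high-specificity treatment strategy from the table and builds, without sending, a request URL for the NCBI E-utilities ESearch service. The helper names <code>apply_filter</code> and <code>esearch_url</code> and the example topic are hypothetical.

```python
from urllib.parse import urlencode

# 2005 high-specificity treatment strategy, copied from the table.
TREATMENT_SPECIFIC = (
    "randomized controlled trial[Publication Type] OR "
    "(randomized[Title/Abstract] AND controlled[Title/Abstract] "
    "AND trial[Title/Abstract])"
)

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def apply_filter(topic: str, filter_query: str) -> str:
    """AND a topic query with a filter, parenthesizing both sides."""
    return f"({topic}) AND ({filter_query})"


def esearch_url(query: str, retmax: int = 20) -> str:
    """Build, but do not send, an ESearch request URL for PubMed."""
    params = {"db": "pubmed", "term": query, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical topic query, for illustration only.
query = apply_filter("hypertension[MeSH Terms]", TREATMENT_SPECIFIC)
print(query)
print(esearch_url(query))
```

Parenthesizing both sides keeps the OR terms inside the filter from binding to the topic query.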
 
One filter is for identifying [[randomized controlled trial]]s. Many MEDLINE filters have been developed by the Hedges team<ref name="titleSearch Strategies">{{cite web |url=http://hiru.mcmaster.ca/hiru/HIRU_Hedges_MEDLINE_Strategies.aspx |title=Search Strategies |author=Hedges Team|accessdate=2011-03-15 |format= |work=}}</ref> supported by a grant from the National Library of Medicine.<ref name="5R01LM006866-07">{{cite web |url=http://projectreporter.nih.gov/project_info_details.cfm?aid=7286387 |title=Project Information - NIH RePORTER – NIH Research Portfolio Online Reporting Tool Expenditures and Results |accessdate=2007-11-25 |format= |work=}}</ref> The filters were initially published in 1994<ref name="pmid7850570">{{cite journal| author=Haynes RB, Wilczynski N, McKibbon KA, Walker CJ, Sinclair JC| title=Developing optimal search strategies for detecting clinically sound studies in MEDLINE. | journal=J Am Med Inform Assoc | year= 1994 | volume= 1 | issue= 6 | pages= 447-58 | pmid=7850570 | doi= | pmc=PMC116228 | url= }} </ref> and then revised and published in 2005<ref name="pmid15894554">{{cite journal| author=Haynes RB, McKibbon KA, Wilczynski NL, Walter SD, Werre SR, Hedges Team| title=Optimal search strategies for retrieving scientifically strong studies of treatment from Medline: analytical survey. | journal=BMJ | year= 2005 | volume= 330 | issue= 7501 | pages= 1179 | pmid=15894554 | doi=10.1136/bmj.38446.498542.8F | pmc=PMC558012 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=15894554  }} </ref>.
 
Examples include filters for [[randomized controlled trial]]s<ref name="pmid19712211">{{cite journal| author=McKibbon KA, Wilczynski NL, Haynes RB| title=Retrieving randomized controlled trials from MEDLINE: a comparison of 38 published search filters. | journal=Health Info Libr J | year= 2009 | volume= 26 | issue= 3 | pages= 187-202 | pmid=19712211  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&retmode=ref&cmd=prlinks&id=19712211 | doi=10.1111/j.1471-1842.2008.00827.x  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&retmode=ref&cmd=prlinks&id=19712211 | doi=10.1111/j.1471-1842.2008.00827.x  
}} </ref> and [[systematic review]]s<ref name="pmid19712212">{{cite journal| author=Wilczynski NL, Haynes RB| title=Consistency and accuracy of indexing systematic review articles and meta-analyses in MEDLINE. | journal=Health Info Libr J | year= 2009 | volume= 26 | issue= 3 | pages= 203-10 | pmid=19712212  
}} </ref> and [[systematic review]]s<ref name="pmid19712212">{{cite journal| author=Wilczynski NL, Haynes RB| title=Consistency and accuracy of indexing systematic review articles and meta-analyses in MEDLINE. | journal=Health Info Libr J | year= 2009 | volume= 26 | issue= 3 | pages= 203-10 | pmid=19712212  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&retmode=ref&cmd=prlinks&id=19712212 | doi=10.1111/j.1471-1842.2008.00823.x  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&retmode=ref&cmd=prlinks&id=19712212 | doi=10.1111/j.1471-1842.2008.00823.x  
}} </ref>.
}} </ref>. Of note, the filter for [[randomized controlled trial]]s also retrieves [[systematic review]]s very well.<ref name="pmid21775104">{{cite journal| author=Wilczynski NL, McKibbon KA, Haynes RB| title=Sensitive Clinical Queries retrieved relevant systematic reviews as well as primary studies: an analytic survey. | journal=J Clin Epidemiol | year= 2011 | volume=  | issue=  | pages=  | pmid=21775104 | doi=10.1016/j.jclinepi.2011.04.007 | pmc= | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=21775104  }} </ref>
 
Filters for studies of [[diagnostic test]] accuracy may<ref name="pmid19230607">{{cite journal| author=Kastner M, Wilczynski NL, McKibbon AK, Garg AX, Haynes RB| title=Diagnostic test systematic reviews: bibliographic search filters ("Clinical Queries") for diagnostic accuracy studies perform well. | journal=J Clin Epidemiol | year= 2009 | volume= 62 | issue= 9 | pages= 974-81 | pmid=19230607 | doi=10.1016/j.jclinepi.2008.11.006 | pmc=PMC2737707 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=19230607  }} </ref> or may not<ref name="pmid21075596">{{cite journal| author=Whiting P, Westwood M, Beynon R, Burke M, Sterne JA, Glanville J| title=Inclusion of methodological filters in searches for diagnostic test accuracy studies misses relevant studies. | journal=J Clin Epidemiol | year= 2011 | volume= 64 | issue= 6 | pages= 602-7 | pmid=21075596 | doi=10.1016/j.jclinepi.2010.07.006 | pmc= | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=21075596  }} </ref> perform well. Studies may be missed because of incomplete indexing of articles by databases such as MEDLINE.<ref name="pmid22118266">{{cite journal| author=Kastner M, Haynes RB, Wilczynski NL| title=Inclusion of methodological filters in searches for diagnostic test accuracy studies misses relevant studies. | journal=J Clin Epidemiol | year= 2012 | volume= 65 | issue= 1 | pages= 116-7 | pmid=22118266 | doi=10.1016/j.jclinepi.2011.02.011 | pmc= | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=22118266  }} </ref>


====Filters for subject types====
====Filters for subject types====
A filter have been developed for articles about kidney disease.<ref name="pmid19767336">{{cite journal| author=Garg AX, Iansavichus AV, Wilczynski NL, Kastner M, Baier LA, Shariff SZ et al.| title=Filtering Medline for a clinical discipline: diagnostic test assessment framework. | journal=BMJ | year= 2009 | volume= 339 | issue=  | pages= b3435 | pmid=19767336  
Filters have been developed for articles about kidney disease<ref name="pmid19767336">{{cite journal| author=Garg AX, Iansavichus AV, Wilczynski NL, Kastner M, Baier LA, Shariff SZ et al.| title=Filtering Medline for a clinical discipline: diagnostic test assessment framework. | journal=BMJ | year= 2009 | volume= 339 | issue=  | pages= b3435 | pmid=19767336 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&retmode=ref&cmd=prlinks&id=19767336 | doi=10.1136/bmj.b3435 }}</ref>, dentistry<ref>Niederman R, Chen L, Murzyn L, Conway S. [http://www.nature.com/ebd/journal/v3/n1/abs/6400095a.html Benchmarking the dental randomised controlled literature on MEDLINE]. Evidence-Based Dentistry. 2002;3:5-9 {{doi|10.1038/sj/ebd/4600095}}</ref>, and specific age ranges<ref name="pmid17213044">{{cite journal| author=Kastner M, Wilczynski NL, Walker-Dilks C, McKibbon KA, Haynes B| title=Age-specific search strategies for Medline. | journal=J Med Internet Res | year= 2006 | volume= 8 | issue= 4 | pages= e25 | pmid=17213044 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&retmode=ref&cmd=prlinks&id=17213044 | doi=10.2196/jmir.8.4.e25 | pmc=PMC1794003 }} </ref> such as [[geriatrics]]<ref name="pmid21946235">{{cite journal| author=van de Glind EM, van Munster BC, Spijker R, Scholten RJ, Hooft L| title=Search filters to identify geriatric medicine in Medline. | journal=J Am Med Inform Assoc | year= 2012 | volume= 19 | issue= 3 | pages= 468-72 | pmid=21946235 | doi=10.1136/amiajnl-2011-000319 | pmc= | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=21946235  }} </ref>.
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&retmode=ref&cmd=prlinks&id=19767336 | doi=10.1136/bmj.b3435 }} <!--Formatted by http://sumsearch.uthscsa.edu/cite/--></ref>


===Relevancy ranking===
===Relevancy ranking===
Line 42: Line 121:


[http://etblast.org eTBLAST] uses text mining to search for similar publications.<ref name="pmid17452348">{{cite journal| author=Errami M, Wren JD, Hicks JM, Garner HR| title=eTBLAST: a web server to identify expert reviewers, appropriate journals and similar publications. | journal=Nucleic Acids Res | year= 2007 | volume= 35 | issue= Web Server issue | pages= W12-5 | pmid=17452348  
[http://etblast.org eTBLAST] uses text mining to search for similar publications.<ref name="pmid17452348">{{cite journal| author=Errami M, Wren JD, Hicks JM, Garner HR| title=eTBLAST: a web server to identify expert reviewers, appropriate journals and similar publications. | journal=Nucleic Acids Res | year= 2007 | volume= 35 | issue= Web Server issue | pages= W12-5 | pmid=17452348  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&email=badgett@uthscdsa.edu&retmode=ref&cmd=prlinks&id=17452348 | doi=10.1093/nar/gkm221 | pmc=PMC1933238 }} <!--Formatted by http://sumsearch.uthscsa.edu/cite/--></ref><ref name="pmid16926219">{{cite journal| author=Lewis J, Ossowski S, Hicks J, Errami M, Garner HR| title=Text similarity: an alternative way to search MEDLINE. | journal=Bioinformatics | year= 2006 | volume= 22 | issue= 18 | pages= 2298-304 | pmid=16926219
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&email=badgett@uthscdsa.edu&retmode=ref&cmd=prlinks&id=17452348 | doi=10.1093/nar/gkm221 | pmc=PMC1933238 }} <!--Formatted by http://sumsearch.uthscsa.edu/cite/--></ref><ref name="pmid16926219" />
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&email=badgett@uthscdsa.edu&retmode=ref&cmd=prlinks&id=16926219 | doi=10.1093/bioinformatics/btl388 }} <!--Formatted by http://sumsearch.uthscsa.edu/cite/--></ref>


===Citation analysis or PageRank===
===Citation analysis or PageRank===
There are conflicting results over the role of ranking results based on citation counts or [[PageRank]]. A study using [[Google]]'s own [[PageRank]] found PubMed's clinical queries to be better.<ref name="pmid17603909">{{cite journal |author=Haase A, Follmann M, Skipka G, Kirchner H |title=Developing search strategies for clinical practice guidelines in SUMSearch and Google Scholar and assessing their retrieval performance |journal=BMC Med Res Methodol |volume=7 |issue= |pages=28 |year=2007 |pmid=17603909 |doi=10.1186/1471-2288-7-28}}</ref> However, a comparative study found better results for a metric analogous to PageRank for biomedical journals based on:<ref name="pmid16221938">{{cite journal |author=Bernstam EV, Herskovic JR, Aphinyanaphongs Y, Aliferis CF, Sriram MG, Hersh WR |title=Using citation data to improve retrieval from MEDLINE |journal=J Am Med Inform Assoc |volume=13 |issue=1 |pages=96–105 |year=2006 |pmid=16221938 |doi=10.1197/jamia.M1909}}</ref><ref name="pmid16779053">{{cite journal |author=Herskovic JR, Bernstam EV |title=Using incomplete citation data for MEDLINE results ranking |journal=AMIA Annu Symp Proc |volume= |issue= |pages=316–20 |year=2005 |pmid=16779053 |doi=}} [http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=citizendium&pubmedid=16779053 PubMed Central]</ref>
There are conflicting results over the role of ranking results based on citation counts or [[PageRank]]. A study using [[Google]]'s own [[PageRank]] found PubMed's clinical queries to be better.<ref name="pmid17603909">{{cite journal |author=Haase A, Follmann M, Skipka G, Kirchner H |title=Developing search strategies for clinical practice guidelines in SUMSearch and Google Scholar and assessing their retrieval performance |journal=BMC Med Res Methodol |volume=7 |issue= |pages=28 |year=2007 |pmid=17603909 |doi=10.1186/1471-2288-7-28}}</ref> However, a comparative study found better results for a metric analogous to PageRank for biomedical journals based on:<ref name="pmid16221938" /><ref name="pmid16779053">{{cite journal |author=Herskovic JR, Bernstam EV |title=Using incomplete citation data for MEDLINE results ranking |journal=AMIA Annu Symp Proc |volume= |issue= |pages=316–20 |year=2005 |pmid=16779053 |doi=}} [http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=citizendium&pubmedid=16779053 PubMed Central]</ref>


:<math>\text{PageRank for the index article} = \frac{\text{the number of articles citing the index article }}{\text{the number of articles cited by the index article}}</math>
:<math>\text{PageRank for the index article} = \frac{\text{the number of articles citing the index article }}{\text{the number of articles cited by the index article}}</math>
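The ratio in the formula above can be computed directly from two citation counts. The following is a minimal sketch, not the method used in the cited studies; the function name <code>citation_ratio</code> and the example counts are hypothetical, and the handling of an article with no references is an assumption.

```python
def citation_ratio(citing_count: int, cited_count: int) -> float:
    """Articles citing the index article divided by articles it cites."""
    if citing_count < 0:
        raise ValueError("citing_count must not be negative")
    if cited_count <= 0:
        # Assumption: the ratio is undefined for an empty reference list.
        raise ValueError("cited_count must be positive")
    return citing_count / cited_count


# Hypothetical counts, for illustration only.
example = citation_ratio(citing_count=30, cited_count=20)
```

Under this formula, an article cited more often than it cites others scores above 1.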


===Machine learning===
===Machine learning===
Machine learning methods in which the search engine seeks articles that more resemble the included articles, may be more accurate than Boolean methods (see EBMSearch below).<ref name="pmid15561789">{{cite journal |author=Aphinyanaphongs Y, Tsamardinos I, Statnikov A, Hardin D, Aliferis CF |title=Text categorization models for high-quality article retrieval in internal medicine |journal=J Am Med Inform Assoc |volume=12 |issue=2 |pages=207–16 |year=2005 |pmid=15561789 |doi=10.1197/jamia.M1641}}</ref><ref name="pmid18952929">{{cite journal |author=Kilicoglu H, Demner-Fushman D, Rindflesch TC, Wilczynski NL, Haynes RB |title=Towards automatic recognition of scientifically rigorous clinical research evidence |journal=J Am Med Inform Assoc |volume=16 |issue=1 |pages=25–31 |year=2009 |pmid=18952929 |pmc=2605595 |doi=10.1197/jamia.M2996 |url=http://www.jamia.org/cgi/pmidlookup?view=long&pmid=18952929 |issn=}}</ref>
Machine learning methods in which the search engine seeks articles that more resemble the included articles, may be more accurate than Boolean methods (see EBMSearch below).<ref name="pmid15561789"/><ref name="pmid17911810">{{cite journal| author=Fu LD, Wang L, Aphinyanagphongs Y, Aliferis CF| title=A comparison of impact factor, clinical query filters, and pattern recognition query filters in terms of sensitivity to topic. | journal=Stud Health Technol Inform | year= 2007 | volume= 129 | issue= Pt 1 | pages= 716-20 | pmid=17911810 | doi= | pmc= | url= }} ''This study may be biased due to using 1994 version of clinical filters.''</ref> However, the study by Aphinyanaphongs compared machine learning to the 1994 Boolean filters.<ref name="pmid15561789"/>
 
Machine learning may be improved by an ensemble learning method that uses stacked generalization (stacking) to emphasize the role of UMLS concepts and title words.<ref name="pmid18952929" />
 
Machine learning may<ref name="pmid16622165" /><ref name="pmid17911810" /> or may not<ref name="pmid16221938" /> be more accurate than citation based strategies. Citation or link strategies may improve upon text categorization.<ref name="pmid18538027">{{cite journal| author=Lin J| title=PageRank without hyperlinks: reranking with PubMed related article networks for biomedical text retrieval. | journal=BMC Bioinformatics | year= 2008 | volume= 9 | issue= | pages= 270 | pmid=18538027 | doi=10.1186/1471-2105-9-270 | pmc=PMC2442104 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=18538027  }} </ref>
 
Machine learning built for categorizing one gold standard may not work as well in another setting.<ref name="pmid16622165" />


==Research methods for comparative studies==
==Research methods for comparative studies==
Line 58: Line 142:


# If a complete test collection of articles is available that is already divided into articles meeting inclusion criteria and articles not meeting criteria, then each strategy is compared for its ability to identify the articles meeting criteria (sensitivity) and to exclude the articles not meeting criteria (specificity). Sensitivity is also called "''[[Information retrieval#Precision and recall|recall]]''".<ref name="isbn0-387-78702-X">{{cite book |author=Hersh, William R. |authorlink= |editor= |others= |title=Information Retrieval: A Health and Biomedical Perspective (Health Informatics) |edition= |language= |publisher=Springer |location=Berlin |year=2008 |origyear= |pages= |quote= |isbn=0-387-78702-X |oclc= |doi= |url= |accessdate=}} [http://books.google.com/books?id=H3f9xsW0a_8C Google books]</ref>
# If a complete test collection of articles is available that is already divided into articles meeting inclusion criteria and articles not meeting criteria, then each strategy is compared for its ability to identify the articles meeting criteria (sensitivity) and to exclude the articles not meeting criteria (specificity). Sensitivity is also called "''[[Information retrieval#Precision and recall|recall]]''".<ref name="isbn0-387-78702-X">{{cite book |author=Hersh, William R. |authorlink= |editor= |others= |title=Information Retrieval: A Health and Biomedical Perspective (Health Informatics) |edition= |language= |publisher=Springer |location=Berlin |year=2008 |origyear= |pages= |quote= |isbn=0-387-78702-X |oclc= |doi= |url= |accessdate=}} [http://books.google.com/books?id=H3f9xsW0a_8C Google books]</ref>
# If a partial test collection is available that consists ''only'' of articles meeting inclusion criteria (for example, articles meeting inclusion criteria for [http://www.acpjc.org/shared/purpose_and_procedure.htm ACP Journal Club]<ref name="pmid15561789">{{cite journal |author=Aphinyanaphongs Y, Tsamardinos I, Statnikov A, Hardin D, Aliferis CF |title=Text categorization models for high-quality article retrieval in internal medicine |journal=J Am Med Inform Assoc |volume=12 |issue=2 |pages=207–16 |year=2005 |pmid=15561789 |doi=10.1197/jamia.M1641}}</ref>, articles included in a [[systematic review]] of a clinical topic, or articles in an annotated bibliography<ref name="pmid16779053">{{cite journal |author=Herskovic JR, Bernstam EV |title=Using incomplete citation data for MEDLINE results ranking |journal=AMIA Annu Symp Proc |volume= |issue= |pages=316–20 |year=2005 |pmid=16779053 |doi=}}</ref>), then the sensitivity is again the proportion of relevant articles identified by the strategy. However, the specificity is not computable. Instead, one of several related measures is calculated. These measures are all based on the positive predictive value (PPV) of the strategy. Analogous to PPV used in diagnostic testing, the PPV directly correlates with the prevalence of relevant articles in the collection and thus is not stable across prevalences.<ref name="pmid12386115">{{cite journal |author=Bachmann LM, Coray R, Estermann P, Ter Riet G |title=Identifying diagnostic studies in MEDLINE: reducing the number needed to read |journal=J Am Med Inform Assoc |volume=9 |issue=6 |pages=653–8 |year=2002 |pmid=12386115 |doi=}}</ref>
# If a partial test collection is available that consists ''only'' of articles meeting inclusion criteria (for example, articles meeting inclusion criteria for [http://www.acpjc.org/shared/purpose_and_procedure.htm ACP Journal Club]<ref name="pmid15561789" />, articles included in a [[systematic review]] of a clinical topic, or articles in an annotated bibliography<ref name="pmid16779053" />), then the sensitivity is again the proportion of relevant articles identified by the strategy. However, the specificity is not computable. Instead, one of several related measures is calculated. These measures are all based on the positive predictive value (PPV) of the strategy. Analogous to PPV used in diagnostic testing, the PPV directly correlates with the prevalence of relevant articles in the collection and thus is not stable across prevalences.<ref name="pmid12386115">{{cite journal |author=Bachmann LM, Coray R, Estermann P, Ter Riet G |title=Identifying diagnostic studies in MEDLINE: reducing the number needed to read |journal=J Am Med Inform Assoc |volume=9 |issue=6 |pages=653–8 |year=2002 |pmid=12386115 |doi=}}</ref>
## ''[[Information retrieval#Precision and recall|Precision]]'' is "the proportion of retrieved articles that meet criteria" and thus is the same as the PPV.<ref name="pmid15073027">{{cite journal |author=Haynes RB, Wilczynski NL |title=Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey |journal=BMJ |volume=328 |issue=7447 |pages=1040 |year=2004 |pmid=15073027 |doi=10.1136/bmj.38068.557998.EE}}</ref><ref name="pmid16684359">{{cite journal |author=Zhang L, Ajiferuke I, Sampson M |title=Optimizing search strategies to identify randomized controlled trials in MEDLINE |journal=BMC Med Res Methodol |volume=6 |issue= |pages=23 |year=2006 |pmid=16684359 |pmc=1488863 |doi=10.1186/1471-2288-6-23 |url=http://www.biomedcentral.com/1471-2288/6/23 |issn=}}</ref>
## ''[[Information retrieval#Precision and recall|Precision]]'', also called efficiency<ref name="pmid22249990">{{cite journal| author=Shariff SZ, Sontrop JM, Haynes RB, Iansavichus AV, McKibbon KA, Wilczynski NL et al.| title=Impact of PubMed search filters on the retrieval of evidence by physicians. | journal=CMAJ | year= 2012 | volume= 184 | issue= 3 | pages= E184-90 | pmid=22249990 | doi=10.1503/cmaj.101661 | pmc=PMC3281182 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=22249990  }} </ref>, is "the proportion of retrieved articles that meet criteria" and thus is the same as the PPV.<ref name="pmid15073027">{{cite journal |author=Haynes RB, Wilczynski NL |title=Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey |journal=BMJ |volume=328 |issue=7447 |pages=1040 |year=2004 |pmid=15073027 |doi=10.1136/bmj.38068.557998.EE}}</ref><ref name="pmid16684359">{{cite journal |author=Zhang L, Ajiferuke I, Sampson M |title=Optimizing search strategies to identify randomized controlled trials in MEDLINE |journal=BMC Med Res Methodol |volume=6 |issue= |pages=23 |year=2006 |pmid=16684359 |pmc=1488863 |doi=10.1186/1471-2288-6-23 |url=http://www.biomedcentral.com/1471-2288/6/23 |issn=}}</ref>
##''Hit curve'' "is the number of important articles among the first n results."<ref name="pmid16469545">{{cite journal |author=Herskovic JR, Iyengar MS, Bernstam EV |title=Using hit curves to compare search algorithm performance |journal=J Biomed Inform |volume=40 |issue=2 |pages=93–9 |year=2007 |pmid=16469545 |doi=10.1016/j.jbi.2005.12.007}}</ref><ref name="pmid16221938">{{cite journal |author=Bernstam EV, Herskovic JR, Aphinyanaphongs Y, Aliferis CF, Sriram MG, Hersh WR |title=Using citation data to improve retrieval from MEDLINE |journal=J Am Med Inform Assoc |volume=13 |issue=1 |pages=96–105 |year=2006 |pmid=16221938 |doi=10.1197/jamia.M1909}}</ref>
##''Number Needed to Read'' (NNR) is 1/precision and is "how many papers in a journal have to be read to find one of adequate clinical quality and relevance."<ref name="pmid15910578">{{cite journal |author=Toth B, Gray JA, Brice A |title=The number needed to read-a new measure of journal value |journal=Health Info Libr J |volume=22 |issue=2 |pages=81–2 |year=2005 |pmid=15910578 |doi=10.1111/j.1471-1842.2005.00568.x}}</ref><ref name="pmid15350200">{{cite journal |author=McKibbon KA, Wilczynski NL, Haynes RB |title=What do evidence-based secondary journals tell us about the publication of clinically important articles in primary healthcare journals? |journal=BMC Med |volume=2 |issue= |pages=33 |year=2004 |pmid=15350200 |doi=10.1186/1741-7015-2-33}}</ref><ref name="pmid12386115">{{cite journal |author=Bachmann LM, Coray R, Estermann P, Ter Riet G |title=Identifying diagnostic studies in MEDLINE: reducing the number needed to read |journal=J Am Med Inform Assoc |volume=9 |issue=6 |pages=653–8 |year=2002 |pmid=12386115 |doi=}}</ref><ref name="pmid17603909">{{cite journal |author=Haase A, Follmann M, Skipka G, Kirchner H |title=Developing search strategies for clinical practice guidelines in SUMSearch and Google Scholar and assessing their retrieval performance |journal=BMC Med Res Methodol |volume=7 |issue= |pages=28 |year=2007 |pmid=17603909 |doi=10.1186/1471-2288-7-28}}</ref> Of note, the NNR has been proposed as a metric to help libraries to decide which journals to subscribe to.<ref name="pmid15910578"/>
##''Number Needed to Read'' (NNR) is "how many papers in a journal have to be read to find one of adequate clinical quality and relevance."<ref name="pmid15910578">{{cite journal |author=Toth B, Gray JA, Brice A |title=The number needed to read-a new measure of journal value |journal=Health Info Libr J |volume=22 |issue=2 |pages=81–2 |year=2005 |pmid=15910578 |doi=10.1111/j.1471-1842.2005.00568.x}}</ref><ref name="pmid15350200">{{cite journal |author=McKibbon KA, Wilczynski NL, Haynes RB |title=What do evidence-based secondary journals tell us about the publication of clinically important articles in primary healthcare journals? |journal=BMC Med |volume=2 |issue= |pages=33 |year=2004 |pmid=15350200 |doi=10.1186/1741-7015-2-33}}</ref><ref name="pmid12386115">{{cite journal |author=Bachmann LM, Coray R, Estermann P, Ter Riet G |title=Identifying diagnostic studies in MEDLINE: reducing the number needed to read |journal=J Am Med Inform Assoc |volume=9 |issue=6 |pages=653–8 |year=2002 |pmid=12386115 |doi=}}</ref><ref name="pmid17603909">{{cite journal |author=Haase A, Follmann M, Skipka G, Kirchner H |title=Developing search strategies for clinical practice guidelines in SUMSearch and Google Scholar and assessing their retrieval performance |journal=BMC Med Res Methodol |volume=7 |issue= |pages=28 |year=2007 |pmid=17603909 |doi=10.1186/1471-2288-7-28}}</ref> Of note, the NNR has been proposed as a metric to help libraries to decide which journals to subscribe to.<ref name="pmid15910578"/>
##''Hit curve'' "is the number of important articles among the first n results."<ref name="pmid16469545">{{cite journal |author=Herskovic JR, Iyengar MS, Bernstam EV |title=Using hit curves to compare search algorithm performance |journal=J Biomed Inform |volume=40 |issue=2 |pages=93–9 |year=2007 |pmid=16469545 |doi=10.1016/j.jbi.2005.12.007}}</ref><ref name="pmid16221938" />
## An ''11-point precision-recall graph'' is similar to a [[receiver operating characteristic curve]].<ref name="pmid15561789" />
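The measures listed above can be illustrated with a small worked example. This is a sketch under stated assumptions: articles are represented by hypothetical string identifiers, the sets are assumed to be non-empty, and the function names are illustrative rather than taken from any cited study. The number needed to read is computed as 1/precision.

```python
from typing import List, Sequence, Set


def sensitivity(retrieved: Set[str], relevant: Set[str]) -> float:
    """Proportion of relevant articles that the strategy retrieved (recall)."""
    return len(retrieved & relevant) / len(relevant)


def specificity(retrieved: Set[str], relevant: Set[str], collection: Set[str]) -> float:
    """Proportion of non-relevant articles that the strategy excluded.

    Requires a complete test collection.
    """
    non_relevant = collection - relevant
    correctly_excluded = non_relevant - retrieved
    return len(correctly_excluded) / len(non_relevant)


def precision(retrieved: Set[str], relevant: Set[str]) -> float:
    """Proportion of retrieved articles that are relevant (the PPV)."""
    return len(retrieved & relevant) / len(retrieved)


def number_needed_to_read(retrieved: Set[str], relevant: Set[str]) -> float:
    """Articles that must be read to find one relevant article (1/precision)."""
    return 1 / precision(retrieved, relevant)


def hit_curve(ranked: Sequence[str], relevant: Set[str]) -> List[int]:
    """Number of relevant articles among the first n results, for each n."""
    hits = 0
    curve = []
    for article_id in ranked:
        if article_id in relevant:
            hits += 1
        curve.append(hits)
    return curve


# Hypothetical test collection of ten articles, four of them relevant.
collection = set("abcdefghij")
relevant = {"a", "b", "c", "d"}
ranked = ["a", "e", "b", "f", "c"]  # results in the order returned
retrieved = set(ranked)
```

With these hypothetical data, the strategy retrieves three of the four relevant articles among five results, so sensitivity is 0.75 and precision is 0.6.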


==Methods to access MEDLINE==
==Methods to access MEDLINE==
There are many third-party interfaces for searching MEDLINE, such as OVID<ref>{{cite web |url=http://www.ovid.com/site/catalog/DataBase/901.jsp | author=Anonymous |title=MEDLINE® - Ovid's MEDLINE |accessdate=2007-11-09 |format= |work=}}</ref>. The National Library of Medicine's own search interface is PubMed (http://pubmed.gov).
There are many third-party interfaces for searching MEDLINE, such as OVID<ref>{{cite web |url=http://www.ovid.com/site/catalog/DataBase/901.jsp | author=Anonymous |title=MEDLINE® - Ovid's MEDLINE |accessdate=2007-11-09 |format= |work=}}</ref>. The National Library of Medicine's own search interface is PubMed (http://pubmed.gov). The National Library of Medicine maintains a list of search engines at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/search/.
 
===PubMed===
===PubMed===
{{main|PubMed}}
{{main|PubMed}}
Line 73: Line 159:


===EBMSearch===
===EBMSearch===
EBMSearch (http://ebmsearch.org/) maintains its own copy of MEDLINE and uses machine learning to rank articles.<ref name="pmid15561789">{{cite journal |author=Aphinyanaphongs Y, Tsamardinos I, Statnikov A, Hardin D, Aliferis CF |title=Text categorization models for high-quality article retrieval in internal medicine |journal=J Am Med Inform Assoc |volume=12 |issue=2 |pages=207–16 |year=2005 |pmid=15561789 |doi=10.1197/jamia.M1641}}</ref>
EBMSearch (http://ebmsearch.org/) maintains its own copy of MEDLINE and uses machine learning to rank articles.<ref name="pmid15561789" />


===eTBLAST===
===eTBLAST===
[http://etblast.org eTBLAST] uses text mining to search for similar publications.<ref name="pmid17452348">{{cite journal| author=Errami M, Wren JD, Hicks JM, Garner HR| title=eTBLAST: a web server to identify expert reviewers, appropriate journals and similar publications. | journal=Nucleic Acids Res | year= 2007 | volume= 35 | issue= Web Server issue | pages= W12-5 | pmid=17452348  
[http://etblast.org eTBLAST] uses text mining to search for similar publications.<ref name="pmid17452348">{{cite journal| author=Errami M, Wren JD, Hicks JM, Garner HR| title=eTBLAST: a web server to identify expert reviewers, appropriate journals and similar publications. | journal=Nucleic Acids Res | year= 2007 | volume= 35 | issue= Web Server issue | pages= W12-5 | pmid=17452348  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&email=badgett@uthscdsa.edu&retmode=ref&cmd=prlinks&id=17452348 | doi=10.1093/nar/gkm221 | pmc=PMC1933238 }} <!--Formatted by http://sumsearch.uthscsa.edu/cite/--></ref><ref name="pmid16926219">{{cite journal| author=Lewis J, Ossowski S, Hicks J, Errami M, Garner HR| title=Text similarity: an alternative way to search MEDLINE. | journal=Bioinformatics | year= 2006 | volume= 22 | issue= 18 | pages= 2298-304 | pmid=16926219  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&email=badgett@uthscdsa.edu&retmode=ref&cmd=prlinks&id=17452348 | doi=10.1093/nar/gkm221 | pmc=PMC1933238 }} <!--Formatted by http://sumsearch.uthscsa.edu/cite/--></ref><ref name="pmid16926219">{{cite journal| author=Lewis J, Ossowski S, Hicks J, Errami M, Garner HR| title=Text similarity: an alternative way to search MEDLINE. | journal=Bioinformatics | year= 2006 | volume= 22 | issue= 18 | pages= 2298-304 | pmid=16926219  
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&email=badgett@uthscdsa.edu&retmode=ref&cmd=prlinks&id=16926219 | doi=10.1093/bioinformatics/btl388 }} <!--Formatted by http://sumsearch.uthscsa.edu/cite/--></ref>
| url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=clinical.uthscsa.edu/cite&email=badgett@uthscdsa.edu&retmode=ref&cmd=prlinks&id=16926219 | doi=10.1093/bioinformatics/btl388 }}</ref>


===GoPubMed===
===GoPubMed===
Line 85: Line 171:
===HubMed===
===HubMed===
HubMed (http://www.hubmed.org/) does not maintain its own copy of MEDLINE, but rather uses PubMed's  [http://www.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html EUtils web service] to retrieve MEDLINE records stored at PubMed.<ref name="pmid16845111">{{cite journal |author=Eaton AD |title=HubMed: a web-based biomedical literature search interface |journal=Nucleic acids research |volume=34 |issue=Web Server issue |pages=W745–7 |year=2006 |month=July |pmid=16845111 |pmc=1538859 |doi=10.1093/nar/gkl037 |url=http://nar.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=16845111 |issn=}}</ref>
HubMed (http://www.hubmed.org/) does not maintain its own copy of MEDLINE, but rather uses PubMed's  [http://www.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html EUtils web service] to retrieve MEDLINE records stored at PubMed.<ref name="pmid16845111">{{cite journal |author=Eaton AD |title=HubMed: a web-based biomedical literature search interface |journal=Nucleic acids research |volume=34 |issue=Web Server issue |pages=W745–7 |year=2006 |month=July |pmid=16845111 |pmc=1538859 |doi=10.1093/nar/gkl037 |url=http://nar.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=16845111 |issn=}}</ref>
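A client that relies on the Entrez Programming Utilities, rather than on a local copy of MEDLINE, sends its query to NCBI and receives matching PubMed identifiers in return. The sketch below only builds a request URL for the URL-based ESearch utility; it is not HubMed's implementation, and the helper name <code>build_esearch_url</code> and the example search term are hypothetical. No network request is made.

```python
from urllib.parse import urlencode

# Base address of the ESearch utility of the NCBI E-utilities.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """Return an ESearch URL that asks PubMed for up to retmax matching PMIDs."""
    params = {
        "db": "pubmed",     # search the PubMed database
        "term": term,       # the query, written in PubMed search syntax
        "retmax": retmax,   # maximum number of identifiers to return
        "retmode": "json",  # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)


# Hypothetical query, for illustration only.
url = build_esearch_url("kidney diseases[MeSH Terms]", retmax=5)
```

The resulting URL can be fetched with any HTTP client; NCBI's usage guidelines on request rates apply to real traffic.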
===Medline Ranker===
[http://cbdm.mdc-berlin.de/~medlineranker/cms/medline-ranker Medline Ranker] uses [[machine learning]].<ref name="pmid19429696">{{cite journal| author=Fontaine JF, Barbosa-Silva A, Schaefer M, Huska MR, Muro EM, Andrade-Navarro MA| title=MedlineRanker: flexible ranking of biomedical literature. | journal=Nucleic Acids Res | year= 2009 | volume= 37 | issue= Web Server issue | pages= W141-6 | pmid=19429696 | doi=10.1093/nar/gkp353 | pmc=PMC2703945 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=19429696  }} </ref>
===MScanner===
MScanner uses [[machine learning]] with a Naïve Bayes classifier over two feature spaces: Medical Subject Headings (MeSH) and the journal of publication.<ref name="pmid18284683">{{cite journal| author=Poulter GL, Rubin DL, Altman RB, Seoighe C| title=MScanner: a classifier for retrieving Medline citations. | journal=BMC Bioinformatics | year= 2008 | volume= 9 | issue=  | pages= 108 | pmid=18284683 | doi=10.1186/1471-2105-9-108 | pmc=PMC2263023 | url=http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=18284683  }} </ref>
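To show how a Naïve Bayes classifier can rank citations from MeSH and journal features, the sketch below implements a generic Bernoulli Naïve Bayes log-odds score. It is an illustration only and does not reproduce MScanner's actual model, smoothing, or feature handling. The feature strings, the add-one smoothing, and the function names are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple

Model = Tuple[float, Dict[str, Tuple[float, float]]]


def train(relevant_docs: List[Set[str]], background_docs: List[Set[str]]) -> Model:
    """Estimate per-feature probabilities in relevant and background citations."""
    rel_counts = Counter(f for doc in relevant_docs for f in doc)
    bg_counts = Counter(f for doc in background_docs for f in doc)
    probabilities = {}
    for feature in set(rel_counts) | set(bg_counts):
        # Add-one smoothing keeps every probability strictly between 0 and 1.
        p_rel = (rel_counts[feature] + 1) / (len(relevant_docs) + 2)
        p_bg = (bg_counts[feature] + 1) / (len(background_docs) + 2)
        probabilities[feature] = (p_rel, p_bg)
    log_prior_odds = math.log(len(relevant_docs) / len(background_docs))
    return log_prior_odds, probabilities


def score(doc: Set[str], model: Model) -> float:
    """Log-odds that a citation is relevant; higher scores rank first."""
    log_odds, probabilities = model
    for feature, (p_rel, p_bg) in probabilities.items():
        if feature in doc:
            log_odds += math.log(p_rel / p_bg)
        else:
            log_odds += math.log((1 - p_rel) / (1 - p_bg))
    return log_odds


# Hypothetical training citations: each is a set of MeSH and journal features.
relevant_docs = [
    {"mesh:Neoplasms", "journal:J Clin Oncol"},
    {"mesh:Neoplasms", "journal:Cancer"},
]
background_docs = [
    {"mesh:Hypertension", "journal:BMJ"},
    {"mesh:Asthma", "journal:BMJ"},
]
model = train(relevant_docs, background_docs)
on_topic = score({"mesh:Neoplasms", "journal:Cancer"}, model)
off_topic = score({"mesh:Asthma", "journal:BMJ"}, model)
```

In this toy example the citation that shares features with the relevant training set receives the higher score.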


===Ovid===
===Ovid===
Line 90: Line 182:


===SUMSearch===
===SUMSearch===
SUMSearch (http://sumsearch.uthscsa.edu/) is a federated medical search engine. It does not maintain its own copy of MEDLINE, but rather queries PubMed and revises searches if too few or too many citations are retrieved. At the same time, SUMSearch queries the National Guidelines Clearinghouse, DARE, WikiPedia, and other resources.
SUMSearch (http://sumsearch.org/) is a federated medical search engine. It does not maintain its own copy of MEDLINE, but rather queries PubMed and revises searches if too few or too many citations are retrieved. At the same time, SUMSearch queries the [http://guidelines.gov National Guidelines Clearinghouse] and other resources.


==References==
==References==
<references/>
<references/>


==External links==
[[Category:CZ Live]] [[Category:Health Sciences Workgroup]][[Category:Suggestion Bot Tag]]
* [http://www.ncbi.nlm.nih.gov/sites/entrez/ PubMed]
* [http://pubmedhh.nlm.nih.gov/nlm/ PubMed for Handhelds]
* [http://www.ncbi.nlm.nih.gov/About/tools/restable_stat_pubmed.html PubMed usage statistics]
* [http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html Entrez Programming Utilities]
* [http://spore.swmed.edu/dejavu/ Déjà vu: a Database of Duplicate Citations in the Scientific] Literature (See Déjà vu--a study of duplicate citations in Medline PMID 18056062)
 
 
[[Category:CZ Live]] [[Category:Health Sciences Workgroup]]

Latest revision as of 12:00, 14 September 2024

This article is developing and not approved.

According to the U.S. National Library of Medicine, "MEDLINE® (Medical Literature Analysis and Retrieval System Online) is the U.S. National Library of Medicine's® (NLM) premier bibliographic database that contains over 16 million references to journal articles in life sciences with a concentration on biomedicine. A distinctive feature of MEDLINE is that the records are indexed with NLM's Medical Subject Headings (MeSH®)."[1]

PubMed is the National Library of Medicine's free online search system for MEDLINE.

PubMed provides feedback relevance with its "See related" feature.[2][3]

Structure

MEDLINE® (Medical Literature Analysis and Retrieval System Online) is a database of predominantly biomedical bibliographic citations maintained by the U.S. National Library of Medicine (NLM).[4] Each citation includes bibliographic data, abstract if available, links to full text of the article and keywords.

The process for selecting journals is described.[5]

MeSH terms

The keywords are indexed with the NLM's Medical Subject Headings (MeSH®)[6] and subheadings[7].

The important MeSH terms “Randomized Controlled Trial” and “Controlled Clinical Trial” were introduced in 1991 and 1995, respectively.[8] The Cochrane Collaboration helps MEDLINE correctly retag articles with these terms.[8]

The National Library of Medicine's Indexing Initiative is investigating whether the assignment of MeSH terms can be fully or semi-automated.[9] Indexing of MeSH terms by humans is assisted by the Medical Text Indexer (MTI).[10]

There is some inconsistency in assignment of MeSH terms. [11][12][13]

Methods to improve searching MEDLINE

Studies of searching MEDLINE. [14] [15] [16] [17] [18] [19] [20] [21] [22] [23]
Study Setting Method Results Comments
Tanaka[14]
2011
    A score based on MeSH terms, journal impact factor, and number of authors can predict future citation patterns  
Bekhuis et al.[15]
2010
Locating relevant studies for systematic reviews Supervised machine learning with evolutionary SVM
• Ensemble of four SVM classifiers (title;abstract;metadata)
  Mean precision ranges from 26% to 37%
Wallace et al.[16]
2010
Locating relevant studies for systematic reviews Supervised machine learning with SVM
• Ensemble of four SVM classifiers (title;abstract;MeSH;UMLS)
  Reduced the number of articles to manually review by 40% to 50%.
Kilicoglu[17]
2009
Identifying high quality studies of interventions (randomized controlled trials) in Internal Medicine Supervised machine learning with SVM
• Naïve Bayes
• Boosting
• Ensemble learning method (stacking)
82.5% precision
84.3% recall
Lin[18]
2007
Oncology
Trained with “annotated bibliography of important literature on common problems in surgical oncology”(SSO-AB)
Last updated 2001
(16% of SSO-AB are randomized controlled trials)
    citation count per year (CCPY) outperformed citation count (CC) and journal impact factor (JIF)
Fu[19]
2007
      "impact factor and clinical query filters are unstable for different topics while a topic-specific impact factor and machine learning-based filter models appear more robust"
Cohen[21]
2006
Locating relevant studies for updating systematic reviews      
Aphinyanaphongs[20]
2006
Identifying high quality studies in Internal Medicine
Trained with articles cited by ACP Journal Club
Supervised machine learning with SVM
• Citation metrics
  "machine learning filters outperform standard citation metrics...the filter models have to be built specifically for this task and gold standard...Previous research that claimed better performance of citation metrics than machine learning in one of the corpora examined here is attributed to using machine learning filters built for a different gold standard and task"
Bernstam[22]
2006
Oncology “annotated bibliography of important literature on common problems in surgical oncology”(SSO-AB)
Last updated 2001
(16% of SSO-AB are randomized controlled trials)
Trained with ACP Journal Club high quality studies
Supervised machine learning with SVM
versus
Journal impact factor, citation count per article, PageRank per article
versus
Pubmed's Clinical Queries
(searches performed in 2004?)
Precision of first 50 citations:
PageRank 9%
Citation count 8%
Impact factor 2%
"Citation-based algorithms were more effective than noncitation-based algorithms"
Aphinyanaphongs[23]
2005
Identifying high quality studies in Internal Medicine
Trained with articles cited by ACP Journal Club
Supervised machine learning with SVM
• Naïve Bayes
• Boosting
versus
1994 PubMed Clinical Queries
Machine learning was more precise.
         
Notes:
SVM. Support vector machine (see machine learning)

There is much ongoing research into improving MEDLINE search results.[24][25]

Citation tracking

Citation tracking may help identify relevant studies in MEDLINE.[26][27]

Clustering

Clustering search results may help.[18]

Filters (hedges)

MEDLINE filters, also called hedges, are optimized Boolean combinations of search terms, both text words and MeSH terms, used to retrieve articles. Many filters have been made by the Hedges Team and are available as Clinical Queries at PubMed. Filters may improve the efficiency (precision) of searches by physicians.[28] The Clinical Queries at PubMed may improve the quality of articles retrieved.[29]

Filters have been criticized because they can omit relevant studies.[30]

Filters for article types

Evolution of search filters

1994 (developed with articles from 10 major journals)[31]

Treatment
  Strategy with high sensitivity:
    randomized controlled trial[Publication Type] OR drug therapy[MeSH Subheading] OR therapeutic use[MeSH Subheading] OR random*[Title/Abstract]
    • Sensitivity = 99%
    • Specificity = 74%
  Strategy with high specificity:
    placebo*[Title/Abstract] OR (double[Title/Abstract] AND blind*[Title/Abstract])

Diagnosis
  (no strategies listed)

2005 (developed with articles from 160 journals)[32][33]

Treatment[34]
  Strategy with high sensitivity:
    (clinical[Title/Abstract] AND trial[Title/Abstract]) OR clinical trials[MeSH Terms] OR clinical trial[Publication Type] OR random*[Title/Abstract] OR random allocation[MeSH Terms] OR therapeutic use[MeSH Subheading]
    • Sensitivity = 99%
    • Specificity = 70%
  Strategy with high specificity:
    randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract])
    • Sensitivity = 93%
    • Specificity = 97%

Diagnosis
  Strategy with high sensitivity:
    sensitiv*[Title/Abstract] OR sensitivity and specificity[MeSH Terms] OR diagnos*[Title/Abstract] OR diagnosis[MeSH:noexp] OR diagnostic * [MeSH:noexp] OR diagnosis,differential[MeSH:noexp] OR diagnosis[Subheading:noexp]
  Strategy with high specificity:
    specificity[Title/Abstract]
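
A filter is normally combined with a topic search using AND. The following is a minimal sketch of that step, assuming the 2005 treatment strategies listed above; the helper name apply_filter and the example topic are hypothetical and are not part of PubMed.

```python
# Filter strings copied from the 2005 treatment strategies listed above.
TREATMENT_SENSITIVE = (
    "(clinical[Title/Abstract] AND trial[Title/Abstract]) "
    "OR clinical trials[MeSH Terms] OR clinical trial[Publication Type] "
    "OR random*[Title/Abstract] OR random allocation[MeSH Terms] "
    "OR therapeutic use[MeSH Subheading]"
)
TREATMENT_SPECIFIC = (
    "randomized controlled trial[Publication Type] "
    "OR (randomized[Title/Abstract] AND controlled[Title/Abstract] "
    "AND trial[Title/Abstract])"
)


def apply_filter(topic: str, hedge: str) -> str:
    """AND a topic search with a methodological filter (hypothetical helper)."""
    # Parenthesize both sides so the ORs inside the filter are grouped
    # before the AND is applied.
    return f"({topic}) AND ({hedge})"


# Example topic chosen only for illustration.
query = apply_filter("heart failure[MeSH Terms]", TREATMENT_SPECIFIC)
print(query)
```

Swapping TREATMENT_SPECIFIC for TREATMENT_SENSITIVE trades precision for sensitivity, as the figures above suggest.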

One filter is for identifying randomized controlled trials. Many MEDLINE filters have been developed by the Hedges team[33] supported by a grant from the National Library of Medicine.[35] The filters were initially published in 1994[31] and then revised and published in 2005[34].

Examples include filters for randomized controlled trials[36] and systematic reviews[37]. Of note, the filter for randomized controlled trials also retrieves systematic reviews very well.[38]

Filters for studies of diagnostic test accuracy may[39] or may not[40] perform well. Studies may be missed because of incomplete indexing of articles by databases such as MEDLINE.[41]

Filters for subject types

Filters have been developed for articles about kidney disease[42], dentistry[43], and specific age ranges[44] such as geriatrics[45].

Relevancy ranking

Although MEDLINE is usually searched for exact matches using Boolean terms, relevancy ranking has been studied. In an early comparison, relevancy ranking performed well; however, the Boolean version of MEDLINE did not fully use MeSH terms.[46][47]

eTBLAST uses text mining to search for similar publications.[48][49]

Citation analysis or PageRank

There are conflicting results over the role of ranking results based on citation counts or PageRank. A study using Google's own PageRank found PubMed's clinical queries to be better.[50] However, a comparative study found better results for a metric analogous to PageRank for biomedical journals based on:[22][51]

Machine learning

Machine learning methods, in which the search engine seeks articles that more closely resemble the included articles, may be more accurate than Boolean methods (see EBMSearch below).[23][19] However, the study by Aphinyanaphongs compared machine learning to the 1994 Boolean filters.[23]

Machine learning may be improved by an ensemble learning method that uses stacked generalization (or stacking) to emphasize the role of UMLS concepts and title words.[17]

Machine learning may[20][19] or may not[22] be more accurate than citation-based strategies. Citation or link strategies may improve upon text categorization.[52]

Machine learning built for categorizing one gold standard may not work as well in another setting.[20]

Research methods for comparative studies

For more information, see: Information retrieval.

In comparing the information retrieval of search strategies, there are two experimental methods.

  1. If a complete test collection of articles is available that is already divided into articles meeting inclusion criteria and articles not meeting criteria, then each strategy is compared for its ability to identify the articles meeting criteria (sensitivity) and to exclude the articles not meeting criteria (specificity). Sensitivity is also called "recall".[53]
  2. If only a partial test collection is available, consisting solely of articles meeting inclusion criteria (for example, articles meeting inclusion criteria for ACP Journal Club[23], articles included in a systematic review of a clinical topic, or articles in an annotated bibliography[51]), then the sensitivity is again the proportion of relevant articles identified by the strategy. However, the specificity is not computable. Instead, one of several related measures is calculated. These measures are all based on the positive predictive value (PPV) of the strategy. As with the PPV used in diagnostic testing, the PPV varies directly with the prevalence of relevant articles in the collection and thus is not stable across prevalences.[54]
    1. Precision, also called efficiency[28], is "the proportion of retrieved articles that meet criteria" and thus is the same as the PPV.[55][56]
    2. Number Needed to Read (NNR) is 1/precision and is "how many papers in a journal have to be read to find one of adequate clinical quality and relevance."[57][58][54][50] Of note, the NNR has been proposed as a metric to help libraries to decide which journals to subscribe to.[57]
    3. Hit curve "is the number of important articles among the first n results."[59][22]
    4. The 11-point precision-recall graph is similar to a receiver operating characteristic curve.[23]
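
The measures above can be sketched in a few lines. This is an illustrative example only: the function names and the article identifiers are made up, and it assumes a complete test collection and a strategy that retrieves at least one article.

```python
def retrieval_metrics(retrieved: set, relevant: set, collection: set) -> dict:
    """Compare one search strategy against a complete test collection."""
    tp = len(retrieved & relevant)                # relevant articles found
    fp = len(retrieved - relevant)                # irrelevant articles retrieved
    fn = len(relevant - retrieved)                # relevant articles missed
    tn = len(collection - retrieved - relevant)   # irrelevant articles excluded
    precision = tp / (tp + fp)
    return {
        "sensitivity": tp / (tp + fn),   # also called recall
        "specificity": tn / (tn + fp),
        "precision": precision,          # same as PPV
        "nnr": 1 / precision,            # number needed to read
    }


def hit_curve(ranked: list, relevant: set) -> list:
    """Number of relevant articles among the first n results, for each n."""
    hits, curve = 0, []
    for article in ranked:
        hits += article in relevant
        curve.append(hits)
    return curve


# Made-up article identifiers, for illustration only.
collection = set(range(1, 21))   # 20 articles in the test collection
relevant = {1, 2, 3, 4}          # 4 meet the inclusion criteria
ranked = [1, 7, 2, 9, 3]         # results returned by a strategy, in order

m = retrieval_metrics(set(ranked), relevant, collection)
print(m)
print(hit_curve(ranked, relevant))
```

In this toy collection the strategy finds 3 of the 4 relevant articles (sensitivity 0.75) and 3 of its 5 results are relevant (precision 0.6), so the NNR is about 1.7. With a partial test collection, the true negatives are unknown and the specificity line would have to be dropped.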

Methods to access MEDLINE

There are many third party interfaces to search MEDLINE such as OVID[60]. The National Library of Medicine's own search interface is PubMed (http://pubmed.gov). The National Library of Medicine maintains a list of search engines at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/search/.

PubMed

For more information, see: PubMed.

PubMed (http://pubmed.gov) is the National Library of Medicine's own free Internet access to MEDLINE. PubMed has been freely available since 1997.

EBM Search

EBM Search (http://www.ahsl.arizona.edu/ebmsearch/) is a federated medical search engine.[61]

EBMSearch

EBMSearch (http://ebmsearch.org/) maintains its own copy of MEDLINE and uses machine learning to rank articles.[23]

eTBLAST

eTBLAST uses text mining to search for similar publications.[48][49]

GoPubMed

GoPubMed (http://www.GoPubMed.org/) applies social networking to MEDLINE.[62]

HubMed

HubMed (http://www.hubmed.org/) does not maintain its own copy of MEDLINE, but rather uses PubMed's EUtils web service to retrieve MEDLINE records stored at PubMed.[63]
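
As a rough illustration of what a request to the EUtils web service looks like, the sketch below builds an ESearch URL, which asks PubMed for the identifiers of citations matching a query. It is not HubMed's code; the helper name esearch_url and the example query are assumptions, and no network request is made.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch request URL for PubMed (hypothetical helper)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-encodes brackets and slashes and turns spaces into "+".
    return f"{ESEARCH}?{urlencode(params)}"


# Example query chosen only for illustration.
url = esearch_url("specificity[Title/Abstract] AND troponin")
print(url)
```

A client such as the one described would then fetch the full records for the returned identifiers in a second request.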

Medline Ranker

Medline Ranker uses machine learning.[64]

MScanner

MScanner uses machine learning with a Naïve Bayes classifier with two feature spaces (Medical Subject Headings (MeSH) and the journal of publication).[65]
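
The general idea of a Naïve Bayes classifier over two feature spaces can be sketched as follows. This is not MScanner's implementation: the smoothing scheme, the function names, and the four training citations are assumptions made up for illustration.

```python
import math
from collections import Counter


def train(docs):
    """Count feature occurrences per class; docs = [(features, label), ...]."""
    counts = {True: Counter(), False: Counter()}
    totals = Counter()
    for features, label in docs:
        counts[label].update(features)
        totals[label] += 1
    return counts, totals


def log_odds(features, counts, totals):
    """Log odds of relevance, summing evidence from the features present.

    Uses add-one smoothing so unseen features do not give zero probabilities.
    """
    score = math.log(totals[True] / totals[False])   # prior odds
    for f in features:
        p_rel = (counts[True][f] + 1) / (totals[True] + 2)
        p_irr = (counts[False][f] + 1) / (totals[False] + 2)
        score += math.log(p_rel / p_irr)
    return score


# Made-up training citations. Each feature is tagged with its feature space:
# ("mesh", heading) or ("journal", title abbreviation).
training = [
    ({("mesh", "Pharmacogenetics"), ("journal", "Pharmacogenomics J")}, True),
    ({("mesh", "Pharmacogenetics"), ("journal", "Clin Pharmacol Ther")}, True),
    ({("mesh", "Dental Caries"), ("journal", "J Dent Res")}, False),
    ({("mesh", "Kidney Diseases"), ("journal", "Kidney Int")}, False),
]
counts, totals = train(training)

candidate = {("mesh", "Pharmacogenetics"), ("journal", "Kidney Int")}
other = {("mesh", "Dental Caries"), ("journal", "J Dent Res")}
print(log_odds(candidate, counts, totals), log_odds(other, counts, totals))
```

Citations with a positive score resemble the relevant training examples more than the irrelevant ones; ranking by this score gives an ordered result list.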

Ovid

SUMSearch

SUMSearch (http://sumsearch.org/) is a federated medical search engine. It does not maintain its own copy of MEDLINE, but rather queries PubMed and revises searches if too few or too many citations are retrieved. At the same time, SUMSearch queries the National Guidelines Clearinghouse and other resources.
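
One way such a revision loop could work is sketched below. This is a guess at the general approach, not SUMSearch's actual algorithm: the function adaptive_search, the thresholds, and the hit counts are all hypothetical, and the count function is supplied by the caller rather than querying PubMed.

```python
from typing import Callable, Sequence


def adaptive_search(
    topic: str,
    filters: Sequence[str],
    count: Callable[[str], int],
    lo: int = 1,
    hi: int = 50,
) -> str:
    """Narrow a search step by step until the hit count is at most `hi`.

    `filters` is ordered from least to most restrictive. If a filter leaves
    fewer than `lo` citations, the previous, broader query is kept.
    """
    candidates = [topic] + [f"({topic}) AND ({f})" for f in filters]
    best = candidates[0]
    for query in candidates:
        n = count(query)
        if n < lo:
            break          # too few citations: keep the broader query
        best = query
        if n <= hi:
            break          # within range: stop narrowing
    return best


SENSITIVE = "random*[Title/Abstract]"
SPECIFIC = "randomized controlled trial[Publication Type]"

# Made-up hit counts standing in for a live search service.
fake_counts = {
    "asthma": 900,
    "(asthma) AND (random*[Title/Abstract])": 120,
    "(asthma) AND (randomized controlled trial[Publication Type])": 30,
}
chosen = adaptive_search("asthma", [SENSITIVE, SPECIFIC], fake_counts.__getitem__)
print(chosen)
```

With these invented counts the unfiltered and sensitive searches return too many citations, so the most restrictive filter is chosen.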

References

  1. MEDLINE Fact Sheet. National Library of Medicine. Retrieved on 2008-01-22.
  2. Lin J, Wilbur WJ (2007). "PubMed related articles: a probabilistic topic-based model for content similarity". BMC Bioinformatics 8: 423. DOI:10.1186/1471-2105-8-423. PMID 17971238. PMC 2212667.
  3. Anonymous (2011). PubMed Help: Computation of Related Citations
  4. National Library of Medicine. MEDLINE Fact Sheet. Retrieved on 2007-11-09.
  5. Anonymous (2007). MEDLINE® Journal Selection Fact Sheet. National Library of Medicine. Retrieved on 2010-04-04.
  6. National Library of Medicine. Medical Subject Headings (MESH®) Fact Sheet. Retrieved on 2007-11-09.
  7. Anonymous (2008). Qualifiers - 2008. National Library of Medicine. Retrieved on 2008-03-19.
  8. Glanville JM, Lefebvre C, Miles JN, Camosso-Stefinovic J (2006). "How to identify randomized controlled trials in MEDLINE: ten years on". J Med Libr Assoc 94 (2): 130-6. PMID 16636704. PMC 1435857.
  9. National Library of Medicine. Indexing Initiative. Retrieved on 2007-11-25.
  10. Anonymous. Medical Text Indexer (MTI). National Library of Medicine
  11. Booth A (1990). "How consistent is MEDLINE indexing? A few reservations". Health Libr Rev 7 (1): 22-6. PMID 10106507.
  12. Funk ME, Reid CA (1983). "Indexing consistency in MEDLINE". Bull Med Libr Assoc 71 (2): 176-83. PMID 6344946. PMC 227138.
  13. Portaluppi F (2007). "Consistency and accuracy of the Medical Subject Headings thesaurus for electronic indexing and retrieval of chronobiologic references". Chronobiol Int 24 (6): 1213-29. DOI:10.1080/07420520701791570. PMID 18075808.
  14. Tanaka LY, Herskovic JR, Iyengar MS, Bernstam EV (2009). "Sequential result refinement for searching the biomedical literature". J Biomed Inform 42 (4): 678-84. DOI:10.1016/j.jbi.2009.02.009. PMID 19272463. PMC 2722929.
  15. Bekhuis T, Demner-Fushman D (2010). "Towards automating the initial screening phase of a systematic review". Stud Health Technol Inform 160 (Pt 1): 146-50. PMID 20841667.
  16. Wallace BC, Trikalinos TA, Lau J, Brodley C, Schmid CH (2010). "Semi-automated screening of biomedical citations for systematic reviews". BMC Bioinformatics 11: 55. DOI:10.1186/1471-2105-11-55. PMID 20102628. PMC 2824679.
  17. Kilicoglu H, Demner-Fushman D, Rindflesch TC, Wilczynski NL, Haynes RB (2009). "Towards automatic recognition of scientifically rigorous clinical research evidence". J Am Med Inform Assoc 16 (1): 25-31. DOI:10.1197/jamia.M2996. PMID 18952929. PMC 2605595.
  18. Lin Y, Li W, Chen K, Liu Y (2007). "A document clustering and ranking system for exploring MEDLINE citations". J Am Med Inform Assoc 14 (5): 651-61. DOI:10.1197/jamia.M2215. PMID 17600104. PMC 1975797.
  19. Fu LD, Wang L, Aphinyanaphongs Y, Aliferis CF (2007). "A comparison of impact factor, clinical query filters, and pattern recognition query filters in terms of sensitivity to topic". Stud Health Technol Inform 129 (Pt 1): 716-20. PMID 17911810. This study may be biased due to using the 1994 version of the clinical filters.
  20. Aphinyanaphongs Y, Statnikov A, Aliferis CF (2006). "A comparison of citation metrics to machine learning filters for the identification of high quality MEDLINE documents". J Am Med Inform Assoc 13 (4): 446-55. DOI:10.1197/jamia.M2031. PMID 16622165. PMC 1513679.
  21. Cohen AM, Hersh WR, Peterson K, Yen PY (2006). "Reducing workload in systematic review preparation using automated citation classification". J Am Med Inform Assoc 13 (2): 206-19. DOI:10.1197/jamia.M1929. PMID 16357352. PMC 1447545.
  22. Bernstam EV, Herskovic JR, Aphinyanaphongs Y, Aliferis CF, Sriram MG, Hersh WR (2006). "Using citation data to improve retrieval from MEDLINE". J Am Med Inform Assoc 13 (1): 96–105. DOI:10.1197/jamia.M1909. PMID 16221938. This study may have been biased towards ranking systems because 1) all retrieval methods analyzed a "preliminary result set using simple PubMed queries", and 2) the Boolean filters were developed in 1994, as the authors probably completed the study prior to the 2005 update of PubMed filters.
  23. Aphinyanaphongs Y, Tsamardinos I, Statnikov A, Hardin D, Aliferis CF (2005). "Text categorization models for high-quality article retrieval in internal medicine". J Am Med Inform Assoc 12 (2): 207-16. DOI:10.1197/jamia.M1641. PMID 15561789. PMC 551552.
  24. Lu Z (2011). "PubMed and beyond: a survey of web tools for searching biomedical literature". Database (Oxford) 2011: baq036. DOI:10.1093/database/baq036. PMID 21245076. PMC 3025693.
  25. Kim JJ, Rebholz-Schuhmann D (2008). "Categorization of services for seeking information in biomedical literature: a typology for improvement of practice". Brief Bioinform 9 (6): 452-65. DOI:10.1093/bib/bbn032. PMID 18660511.
  26. Bakkalbasi N, Bauer K, Glover J, Wang L (2006). "Three options for citation tracking: Google Scholar, Scopus and Web of Science". Biomed Digit Libr 3: 7. DOI:10.1186/1742-5581-3-7. PMID 16805916.
  27. Kuper H, Nicholson A, Hemingway H (2006). "Searching for observational studies: what does citation tracking add to PubMed? A case study in depression and coronary heart disease". BMC Med Res Methodol 6: 4. DOI:10.1186/1471-2288-6-4. PMID 16483366.
  28. Shariff SZ, Sontrop JM, Haynes RB, Iansavichus AV, McKibbon KA, Wilczynski NL et al. (2012). "Impact of PubMed search filters on the retrieval of evidence by physicians". CMAJ 184 (3): E184-90. DOI:10.1503/cmaj.101661. PMID 22249990. PMC 3281182.
  29. Lokker C, Haynes RB, Wilczynski NL, McKibbon KA, Walter SD (2011). "Retrieval of diagnostic and treatment studies for clinical use through PubMed and PubMed's Clinical Queries filters". J Am Med Inform Assoc. DOI:10.1136/amiajnl-2011-000233. PMID 21680559.
  30. Leeflang MM, Scholten RJ, Rutjes AW, Reitsma JB, Bossuyt PM (2006). "Use of methodological search filters to identify diagnostic accuracy studies can lead to the omission of relevant studies". J Clin Epidemiol 59 (3): 234-40. DOI:10.1016/j.jclinepi.2005.07.014. PMID 16488353.
  31. Haynes RB, Wilczynski N, McKibbon KA, Walker CJ, Sinclair JC (1994). "Developing optimal search strategies for detecting clinically sound studies in MEDLINE". J Am Med Inform Assoc 1 (6): 447-58. PMID 7850570. PMC 116228.
  32. Anonymous (2011 [last update]). PubMed Help. National Center for Biotechnology Information. Retrieved on August 15, 2011.
  33. Hedges Team. Search Strategies. Retrieved on 2011-03-15.
  34. Haynes RB, McKibbon KA, Wilczynski NL, Walter SD, Werre SR, Hedges Team (2005). "Optimal search strategies for retrieving scientifically strong studies of treatment from Medline: analytical survey". BMJ 330 (7501): 1179. DOI:10.1136/bmj.38446.498542.8F. PMID 15894554. PMC 558012.
  35. Project Information - NIH RePORTER – NIH Research Portfolio Online Reporting Tool Expenditures and Results. Retrieved on 2007-11-25.
  36. McKibbon KA, Wilczynski NL, Haynes RB (2009). "Retrieving randomized controlled trials from MEDLINE: a comparison of 38 published search filters". Health Info Libr J 26 (3): 187-202. DOI:10.1111/j.1471-1842.2008.00827.x. PMID 19712211.
  37. Wilczynski NL, Haynes RB (2009). "Consistency and accuracy of indexing systematic review articles and meta-analyses in MEDLINE". Health Info Libr J 26 (3): 203-10. DOI:10.1111/j.1471-1842.2008.00823.x. PMID 19712212.
  38. Wilczynski NL, McKibbon KA, Haynes RB (2011). "Sensitive Clinical Queries retrieved relevant systematic reviews as well as primary studies: an analytic survey". J Clin Epidemiol. DOI:10.1016/j.jclinepi.2011.04.007. PMID 21775104.
  39. Kastner M, Wilczynski NL, McKibbon AK, Garg AX, Haynes RB (2009). "Diagnostic test systematic reviews: bibliographic search filters ("Clinical Queries") for diagnostic accuracy studies perform well". J Clin Epidemiol 62 (9): 974-81. DOI:10.1016/j.jclinepi.2008.11.006. PMID 19230607. PMC 2737707.
  40. Whiting P, Westwood M, Beynon R, Burke M, Sterne JA, Glanville J (2011). "Inclusion of methodological filters in searches for diagnostic test accuracy studies misses relevant studies". J Clin Epidemiol 64 (6): 602-7. DOI:10.1016/j.jclinepi.2010.07.006. PMID 21075596.
  41. Kastner M, Haynes RB, Wilczynski NL (2012). "Inclusion of methodological filters in searches for diagnostic test accuracy studies misses relevant studies". J Clin Epidemiol 65 (1): 116-7. DOI:10.1016/j.jclinepi.2011.02.011. PMID 22118266.
  42. Garg AX, Iansavichus AV, Wilczynski NL, Kastner M, Baier LA, Shariff SZ et al. (2009). "Filtering Medline for a clinical discipline: diagnostic test assessment framework". BMJ 339: b3435. DOI:10.1136/bmj.b3435. PMID 19767336.
  43. Niederman R, Chen L, Murzyn L, Conway S. Benchmarking the dental randomised controlled literature on MEDLINE. Evidence-Based Dentistry. 2002;3:5-9 DOI:10.1038/sj/ebd/4600095
  44. Kastner M, Wilczynski NL, Walker-Dilks C, McKibbon KA, Haynes B (2006). "Age-specific search strategies for Medline". J Med Internet Res 8 (4): e25. DOI:10.2196/jmir.8.4.e25. PMID 17213044. PMC 1794003.
  45. van de Glind EM, van Munster BC, Spijker R, Scholten RJ, Hooft L (2012). "Search filters to identify geriatric medicine in Medline". J Am Med Inform Assoc 19 (3): 468-72. DOI:10.1136/amiajnl-2011-000319. PMID 21946235.
  46. Hersh WR, Hickam DH (1992). "A comparison of retrieval effectiveness for three methods of indexing medical literature". Am. J. Med. Sci. 303 (5): 292–300. PMID 1580316.
  47. Hersh WR, Hickam DH, Haynes RB, McKibbon KA (1994). "A performance and failure analysis of SAPHIRE with a MEDLINE test collection". J Am Med Inform Assoc 1 (1): 51–60. PMID 7719787.
  48. Errami M, Wren JD, Hicks JM, Garner HR (2007). "eTBLAST: a web server to identify expert reviewers, appropriate journals and similar publications". Nucleic Acids Res 35 (Web Server issue): W12-5. DOI:10.1093/nar/gkm221. PMID 17452348. PMC 1933238.
  49. Lewis J, Ossowski S, Hicks J, Errami M, Garner HR (2006). "Text similarity: an alternative way to search MEDLINE". Bioinformatics 22 (18): 2298-304. DOI:10.1093/bioinformatics/btl388. PMID 16926219.
  50. Haase A, Follmann M, Skipka G, Kirchner H (2007). "Developing search strategies for clinical practice guidelines in SUMSearch and Google Scholar and assessing their retrieval performance". BMC Med Res Methodol 7: 28. DOI:10.1186/1471-2288-7-28. PMID 17603909.
  51. Herskovic JR, Bernstam EV (2005). "Using incomplete citation data for MEDLINE results ranking". AMIA Annu Symp Proc: 316–20. PMID 16779053.
  52. Lin J (2008). "PageRank without hyperlinks: reranking with PubMed related article networks for biomedical text retrieval". BMC Bioinformatics 9: 270. DOI:10.1186/1471-2105-9-270. PMID 18538027. PMC 2442104.
  53. Hersh, William R. (2008). Information Retrieval: A Health and Biomedical Perspective (Health Informatics). Berlin: Springer. ISBN 0-387-78702-X.
  54. Bachmann LM, Coray R, Estermann P, Ter Riet G (2002). "Identifying diagnostic studies in MEDLINE: reducing the number needed to read". J Am Med Inform Assoc 9 (6): 653–8. PMID 12386115.
  55. Haynes RB, Wilczynski NL (2004). "Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey". BMJ 328 (7447): 1040. DOI:10.1136/bmj.38068.557998.EE. PMID 15073027.
  56. Zhang L, Ajiferuke I, Sampson M (2006). "Optimizing search strategies to identify randomized controlled trials in MEDLINE". BMC Med Res Methodol 6: 23. DOI:10.1186/1471-2288-6-23. PMID 16684359. PMC 1488863.
  57. Toth B, Gray JA, Brice A (2005). "The number needed to read-a new measure of journal value". Health Info Libr J 22 (2): 81–2. DOI:10.1111/j.1471-1842.2005.00568.x. PMID 15910578.
  58. McKibbon KA, Wilczynski NL, Haynes RB (2004). "What do evidence-based secondary journals tell us about the publication of clinically important articles in primary healthcare journals?". BMC Med 2: 33. DOI:10.1186/1741-7015-2-33. PMID 15350200.
  59. Herskovic JR, Iyengar MS, Bernstam EV (2007). "Using hit curves to compare search algorithm performance". J Biomed Inform 40 (2): 93–9. DOI:10.1016/j.jbi.2005.12.007. PMID 16469545.
  60. Anonymous. MEDLINE® - Ovid's MEDLINE. Retrieved on 2007-11-09.
  61. Bracke PJ, Howse DK, Keim SM (April 2008). "Evidence-based Medicine Search: a customizable federated search engine". J Med Libr Assoc 96 (2): 108–13. DOI:10.3163/1536-5050.96.2.108. PMID 18379665. PMC 2268222.
  62. Doms A, Schroeder M (July 2005). "GoPubMed: exploring PubMed with the Gene Ontology". Nucleic acids research 33 (Web Server issue): W783–6. DOI:10.1093/nar/gki470. PMID 15980585. PMC 1160231.
  63. Eaton AD (July 2006). "HubMed: a web-based biomedical literature search interface". Nucleic acids research 34 (Web Server issue): W745–7. DOI:10.1093/nar/gkl037. PMID 16845111. PMC 1538859.
  64. Fontaine JF, Barbosa-Silva A, Schaefer M, Huska MR, Muro EM, Andrade-Navarro MA (2009). "MedlineRanker: flexible ranking of biomedical literature". Nucleic Acids Res 37 (Web Server issue): W141-6. DOI:10.1093/nar/gkp353. PMID 19429696. PMC 2703945.
  65. Poulter GL, Rubin DL, Altman RB, Seoighe C (2008). "MScanner: a classifier for retrieving Medline citations". BMC Bioinformatics 9: 108. DOI:10.1186/1471-2105-9-108. PMID 18284683. PMC 2263023.