Information about XplorMed
About XplorMed

The authors of XplorMed are Carolina Perez-Iratxeta, Peer Bork, and Miguel A. Andrade

Introduction

The XplorMed server allows you to explore a set of abstracts derived from a MEDLINE search. The system shows you the main associations between the words in groups of abstracts. You can then select a subset of your abstracts based on selected groups of related words and iterate the analysis on them.

XplorMed is recommended for cases in which you do not know exactly what you expect to find. Your interests may change in light of the results obtained, or you may want to ask new questions as the analysis develops. The results may also suggest additional words that should be used to expand your query in MEDLINE (e.g., unexpected abbreviations of a protein name, or synonyms of a disease).

The XplorMed server is running at OHRI (Ottawa, Canada) and can be accessed via the URL http://xplormed.ogic.ca/. The only input needed is a set of abstracts (in one of the currently accepted input formats) or a definition of how to obtain them.

Generate input for XplorMed

There are three ways of entering XplorMed:
  • At the yellow gate you simply type a MEDLINE query (such as "mip AND protein AND 1998 [Entrez Date]").
  • At the red gate you have to provide the identifier of one entry from a database with links to MEDLINE, for example "TETX_CLOTE" from SwissProt. Alternatively, you can provide multiple MEDLINE identifiers, for example "9278503 8366047 9298646".
  • At the green gate you have the maximum control over the input because you are asked to provide a file containing the abstracts of your choice. This file must conform to one of the input formats that we are currently considering. Check out our tutorial for an example of how to prepare such a file using the NCBI Entrez server.
Minimum number of abstracts recommended. We recommend a minimum of 20 abstracts for the computation of relations. XplorMed computes statistics on word co-occurrence, which require that words appear together a minimum number of times. The less data you have, the fewer relations can be reliably detected (in the limit, nothing can be computed from a single abstract). Obviously, the more structured your query is, the fewer documents you will need to get meaningful results.
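
To illustrate why small sets give few reliable relations, here is a minimal Python sketch of counting word co-occurrence across abstracts. It is not the XplorMed algorithm (which is described in the literature listed below). The function name `cooccurrence_counts`, the `min_count` parameter, and the sample abstracts are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(abstracts, min_count=2):
    """Count in how many abstracts each pair of words appears together.

    Hypothetical helper: only pairs found in at least `min_count`
    abstracts are returned.
    """
    counts = Counter()
    for text in abstracts:
        # Distinct lowercase words, sorted so each pair has one canonical order.
        words = sorted(set(text.lower().split()))
        counts.update(combinations(words, 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Made-up sample abstracts, for illustration only.
abstracts = [
    "tetanus toxin cleaves synaptobrevin",
    "synaptobrevin cleavage by tetanus toxin",
    "botulinum toxin blocks release",
]
pairs = cooccurrence_counts(abstracts, min_count=2)
single = cooccurrence_counts(abstracts[:1], min_count=2)
```

With three abstracts only three word pairs reach the minimum count, and with a single abstract (`single`) no pair does, which is the limiting case mentioned above.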

Limitations of use of the server. The number of abstracts that can be submitted to the server is limited (because of computational resource constraints) and is currently set at 500 abstracts. If you need more, contact us.

Limiting your search. If your results contain too many abstracts, you can either try a more specific query in PubMed by adding more words to the query, or use the limit options (e.g., search only papers published in one year). A restricted search of this kind may also give you a good overview.

If your results contain too few abstracts, you can make your query in PubMed less specific by using fewer terms. You can also add alternative terms using Boolean operators, e.g., `(mobile OR cellular) AND (phone OR telephone)'.
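
As a small illustration of broadening a query with alternatives, the following Python sketch assembles a Boolean query string of the form shown above. The helper names `or_group` and `build_query` are hypothetical and are not part of XplorMed or PubMed. The sketch only builds the text you would type at the yellow gate or in PubMed.

```python
def or_group(terms):
    """Join alternative terms with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*groups):
    """Combine several groups of alternatives with AND."""
    return " AND ".join(or_group(group) for group in groups)


# Each list holds synonyms; adding a synonym to a list broadens the search.
query = build_query(["mobile", "cellular"], ["phone", "telephone"])
print(query)  # (mobile OR cellular) AND (phone OR telephone)
```

Adding a synonym to a group makes the query less specific, whereas adding a whole new group makes it more specific.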

Use the XplorMed server

Open the URL http://xplormed.ogic.ca in your web browser. Select your entry point (Yellow/Red/Green) and provide, accordingly, a query to MEDLINE, a database entry identifier, or a file with abstracts. Hit the `Sort abstracts by MeSH categories' button. The next window displays the classification of your abstracts based on the top hierarchy of the MeSH terms associated with each MEDLINE entry. Select the categories of interest (or all of them if you do not know). Then hit the `Compute relationships between words in abstracts' button. This step takes the longest, since it computes the relationships between words.

In the next window you will get the list of important words, that is, words that are significantly related to others and are therefore likely to form part of ensembles of words repeated across several of your abstracts. You can already begin to explore the context of those words by clicking on the word itself. This leads you to a new window listing all sentences containing the word. Alternatively, you can follow the [R] links that show related words. For a more general contextual analysis, you can open an exploration window by hitting the `Explore context of any word' button. From there you can find the context and dependencies of any word.

Before performing the next step, you can control two variables that modulate how focused your analysis will be: alpha (which controls the cut on the fuzzy relations used in the following steps) and `Score' (which is applied as a threshold on the K score for word selection). Low alpha values produce more and longer word chains. High K score thresholds limit the number of words used for producing the word chains.
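
The effect of the two thresholds can be sketched in Python as follows. This is only an illustration under stated assumptions. The relation strengths and K scores below are invented, the function names `alpha_cut` and `select_words` are hypothetical, and the use of "greater than or equal to" for both cuts is an assumption, since the exact definitions are given in the publications listed below rather than on this page.

```python
def alpha_cut(relations, alpha):
    """Keep only the word relations whose strength is at least alpha."""
    return {pair: s for pair, s in relations.items() if s >= alpha}


def select_words(k_scores, min_score):
    """Keep only the words whose K score is at least min_score."""
    return {word for word, k in k_scores.items() if k >= min_score}


# Invented fuzzy relation strengths in [0, 1], for illustration only.
relations = {
    ("tetanus", "toxin"): 0.9,
    ("toxin", "release"): 0.4,
    ("release", "neuron"): 0.6,
}
# Invented K scores, for illustration only.
k_scores = {"tetanus": 5.0, "toxin": 8.0, "release": 2.0, "neuron": 1.0}

strict = alpha_cut(relations, alpha=0.5)     # 2 relations survive
relaxed = alpha_cut(relations, alpha=0.3)    # all 3 relations survive
few_words = select_words(k_scores, min_score=4.0)
```

Lowering alpha lets more relations through, so more and longer chains can be built from them, while raising the score threshold leaves fewer words to build chains from.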

By clicking the `Compute chains of related words' button, you will be shown chains of related words extracted according to the relations and the words selected. Select one or more of those chains and hit `Rank abstracts by word-chain usage' to rank the abstracts according to the presence of those words in them. Alternatively, you can type a chain of words in a window.
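
One simple way to think about ranking by word-chain usage is sketched below in Python. The scoring rule (fraction of chain words present in an abstract) is an assumption chosen for illustration, not the measure XplorMed applies. The function name `rank_by_chain` and the sample texts are likewise hypothetical.

```python
def rank_by_chain(abstracts, chain):
    """Rank abstracts by the fraction of chain words they contain.

    Returns a list of (fraction, index) tuples, best match first.
    Ties keep the original order of the abstracts.
    """
    chain_words = {word.lower() for word in chain}
    if not chain_words:
        raise ValueError("chain must contain at least one word")
    scored = []
    for index, text in enumerate(abstracts):
        present = chain_words & set(text.lower().split())
        scored.append((len(present) / len(chain_words), index))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored


# Made-up sample abstracts and chain, for illustration only.
abstracts = [
    "botulinum toxin blocks release",
    "tetanus toxin cleaves synaptobrevin",
    "membrane protein structure",
]
ranking = rank_by_chain(abstracts, ["tetanus", "toxin", "synaptobrevin"])
order = [index for _, index in ranking]
```

Here the second abstract contains every word of the chain and is ranked first, while the third contains none and is ranked last.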

In the next window you will obtain the list of abstracts ordered by their relation to the selected chains of words. You can select a subset of abstracts; the `Compute relations between words in abstracts' button then leads you to the next iteration of the analysis, which is performed on the selected subset. You can also expand your selection with a number of MEDLINE neighbours of the selected abstracts.

Additional Information

We have used a list of stop-words to filter out words that are considered non-informative. This list is based on the one used at NCBI, with some additions (e.g., units of measurement).
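
Stop-word filtering of this kind can be sketched in Python as below. The tiny `STOP_WORDS` set is a made-up stand-in (a few common words plus two units) and is not the list used by XplorMed or NCBI. The function name `remove_stop_words` is also hypothetical.

```python
# Hypothetical stop-word list: a few common words plus some measurement units.
STOP_WORDS = {"the", "of", "and", "in", "mg", "ml"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the lowercase words of `text` that are not stop-words."""
    return [word for word in text.lower().split() if word not in stop_words]


kept = remove_stop_words("The dose of 5 mg toxin in mice")
print(kept)  # ['dose', '5', 'toxin', 'mice']
```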

Literature about XplorMed

Description of the web server.
    Perez-Iratxeta C, Bork P, Andrade MA. (2001)
    XplorMed: a tool for exploring MEDLINE abstracts.
    Trends Biochem. Sci. 26, 573-575.

Detailed description of the algorithm and a benchmark.

    Perez-Iratxeta C, Keer HS, Bork P, Andrade MA. (2002)
    Computing fuzzy associations for the analysis of biological literature.
    Biotechniques. 32, 1380-1385.

Detailed description of the usage.

    Perez-Iratxeta C, Bork P, Andrade MA. (2002)
    Exploring MEDLINE abstracts with XplorMed.
    Drugs of Today. 38, 381-389.

New XplorMed features and an example of usage.

    Perez-Iratxeta C, Pérez AJ, Bork P, Andrade MA. (2003)
    Update on XplorMed: a web server for exploring scientific literature.
    Nucleic Acids Research. 31, 3866-3868.

Acknowledgements

Thanks to Dr. Helmut Schmid (Institute of Natural Language Processing, Stuttgart University) for providing TreeTagger. TreeTagger is used for part-of-speech disambiguation and stemming of the free text from the abstracts submitted.

Thanks to Brigitte Boeckmann from the SwissProt group at Geneva for suggestions to improve XplorMed.

Contact

Send e-mail to Carolina Perez-Iratxeta cperez-iratxeta@ohri.ca
