kb/data/en.wikipedia.org/wiki/Biomedical_text_mining-2.md

---
title: "Biomedical text mining"
chunk: 3/3
source: "https://en.wikipedia.org/wiki/Biomedical_text_mining"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T14:01:43.230057+00:00"
instance: "kb-cron"
---

=== Search engines ===
Search engines designed to retrieve biomedical literature relevant to a user-provided query frequently rely on text mining approaches. Publicly available tools specific to research literature include PubMed search, Europe PubMed Central search, GeneView, and APSE. Similarly, search engines and indexing systems specific to biomedical data have been developed, including DataMed and OmicsDI.
Some search engines, such as Essie, OncoSearch, PubGene, and GoPubMed, were previously public but have since been discontinued, rendered obsolete, or integrated into commercial products.
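
As a rough illustration of how a text mining approach can rank literature against a query, the sketch below scores a toy corpus with a simple TF-IDF weighting. It is not how PubMed or any engine named above works; the document titles, identifiers, and function names (`tokenize`, `build_index`, `search`) are invented for this example.

```python
import math
import re
from collections import Counter

# Toy corpus; the titles are invented for illustration only.
DOCS = {
    "doc1": "BRCA1 mutations and breast cancer risk",
    "doc2": "Insulin signaling in type 2 diabetes",
    "doc3": "Breast cancer screening with mammography",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return per-document term counts and document frequencies."""
    tf = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())
    return tf, df


def search(query, tf, df):
    """Rank documents by summed TF-IDF of the query terms they contain."""
    n_docs = len(tf)
    scores = {}
    for doc_id, counts in tf.items():
        score = 0.0
        for term in tokenize(query):
            if term in counts:
                # Rare terms (low document frequency) contribute more.
                score += counts[term] * math.log(1 + n_docs / df[term])
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


TF, DF = build_index(DOCS)
RESULTS = search("BRCA1 breast cancer", TF, DF)
print(RESULTS)
```

Here the document mentioning all three query terms ranks first, and the unrelated diabetes title is not returned. Production engines typically add much more, such as synonym expansion and controlled vocabularies.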

=== Medical record analysis systems ===
Electronic medical records (EMRs) and electronic health records (EHRs) are collected by clinical staff in the course of diagnosis and treatment. Though these records generally include structured components with predictable formats and data types, the remainder of each report is often free text that is difficult to search, which creates challenges for patient care. Numerous complete systems and tools have been developed to analyse these free-text portions. The MedLEE system was originally developed for analysis of chest radiology reports but was later extended to other report topics. The clinical Text Analysis and Knowledge Extraction System, or cTAKES, annotates clinical text using a dictionary of concepts. The CLAMP system offers similar functionality with a user-friendly interface.
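
The general idea of dictionary-based annotation of free text can be sketched as follows. This is a minimal illustration under stated assumptions, not the cTAKES, MedLEE, or CLAMP implementation: the `CONCEPTS` dictionary, its placeholder identifiers, and the `annotate` function are hypothetical.

```python
import re

# Hypothetical mini-dictionary: surface form -> concept identifier.
# The identifiers are placeholders, not real UMLS codes.
CONCEPTS = {
    "chest pain": "CONCEPT:0001",
    "pneumonia": "CONCEPT:0002",
    "shortness of breath": "CONCEPT:0003",
}


def annotate(text, concepts):
    """Return sorted (start, end, matched_text, concept_id) tuples.

    Longer dictionary terms are matched first so that a short term
    cannot claim text already covered by a longer one.
    """
    annotations = []
    taken = []  # character spans that already carry an annotation
    for term in sorted(concepts, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            if any(m.start() < end and start < m.end() for start, end in taken):
                continue  # overlaps an earlier, longer match
            taken.append((m.start(), m.end()))
            annotations.append((m.start(), m.end(), m.group(0), concepts[term]))
    return sorted(annotations)


NOTE = "Patient reports chest pain and shortness of breath; no pneumonia."
ANNOTATIONS = annotate(NOTE, CONCEPTS)
for start, end, matched, concept_id in ANNOTATIONS:
    print(start, end, matched, concept_id)
```

Note that this sketch still annotates "pneumonia" even though the note says "no pneumonia". Handling negation, abbreviations, and spelling variants is part of what makes complete clinical systems considerably more involved than a plain dictionary lookup.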

=== Frameworks ===
Computational frameworks have been developed to rapidly build tools for biomedical text mining tasks. SwellShark is a framework for biomedical NER that requires no human-labeled data but does make use of resources for weak supervision (e.g., UMLS semantic types). The SparkText framework uses Apache Spark data streaming, a NoSQL database, and basic machine learning methods to build predictive models from scientific articles.
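
Weak supervision of the kind mentioned above can be sketched with a few heuristic labeling functions whose votes are combined into training labels without manual annotation. The sketch below uses a plain majority vote as a simplification, which may differ from how SwellShark combines its sources. The lexicon, the suffix rule, and all function names are hypothetical stand-ins for resources such as UMLS semantic types.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion

# Hypothetical lexicon standing in for an ontology-derived term list.
DISEASE_LEXICON = {"asthma", "diabetes", "pneumonia"}


def lf_lexicon(token):
    """Vote DISEASE when the token appears in the lexicon."""
    return "DISEASE" if token.lower() in DISEASE_LEXICON else ABSTAIN


def lf_suffix(token):
    """Vote DISEASE for tokens with common disease-like suffixes."""
    return "DISEASE" if token.lower().endswith(("itis", "oma")) else ABSTAIN


def lf_stopword(token):
    """Vote O (outside any entity) for a few function words."""
    return "O" if token.lower() in {"the", "with", "and", "of", "has"} else ABSTAIN


LABELING_FUNCTIONS = [lf_lexicon, lf_suffix, lf_stopword]


def weak_label(tokens, lfs=LABELING_FUNCTIONS, default="O"):
    """Assign each token the majority label among non-abstaining functions."""
    labels = []
    for token in tokens:
        votes = Counter(v for v in (lf(token) for lf in lfs) if v is not ABSTAIN)
        labels.append(votes.most_common(1)[0][0] if votes else default)
    return labels


TOKENS = "The patient has asthma and bronchitis".split()
LABELS = weak_label(TOKENS)
print(list(zip(TOKENS, LABELS)))
```

The resulting noisy labels could then be used to train a conventional NER model, which is the step that replaces human-labeled data in this style of framework.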

=== APIs ===
Some biomedical text mining and natural language processing tools are available through application programming interfaces, or APIs. NOBLE Coder performs concept recognition through an API.

== Conferences ==
Academic conferences and workshops host discussions and presentations on advances in biomedical text mining. Most publish proceedings.

== Journals ==

A variety of academic journals that publish manuscripts on biology and medicine cover topics in text mining and natural language processing software. Some journals, including the Journal of the American Medical Informatics Association (JAMIA) and the Journal of Biomedical Informatics, are popular venues for these topics.

== External links ==
Bio-NLP resources, systems and application database collection Archived 2009-05-04 at the Wayback Machine
The BioNLP mailing list archives
Corpora for biomedical text mining Archived 2011-07-24 at the Wayback Machine
The BioCreative evaluations of biomedical text mining technologies
Directory of people involved in BioNLP Archived 2011-08-09 at the Wayback Machine