Baseline frequencies of MeSH descriptors, computed from a local PostgreSQL mirror of PubMed (April 2026). For each descriptor, the count is the number of distinct PMIDs indexed with that term, and the proportion divides that count by the full PubMed corpus of 39,703,112 PMIDs. Descriptor UIs and canonical names are joined from the NLM MeSH thesaurus. Intended as a baseline for MeSH term enrichment analyses of arbitrary PubMed subsets.
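As a rough illustration of the enrichment use case, the sketch below compares a descriptor's share within a PubMed subset to its corpus-wide share, giving a simple observed/expected ratio (fold enrichment). It is written in Python with toy numbers: the subset size, the hit counts, and the baseline proportions are all made-up placeholders rather than values from this dataset, and `fold_enrichment` is a hypothetical helper, not part of any package.

```python
def fold_enrichment(subset_hits, subset_size, baseline_prop):
    """Ratio of a descriptor's share in the subset to its corpus-wide share."""
    if subset_size <= 0 or baseline_prop <= 0:
        raise ValueError("subset_size and baseline_prop must be positive")
    return (subset_hits / subset_size) / baseline_prop


# Hypothetical subset of 2,000 PMIDs and the number of them indexed
# with each descriptor (placeholder counts).
subset_size = 2000
subset_counts = {"D000001": 40, "D000002": 1}

# Placeholder stand-ins for the prop_total column, keyed by DescriptorUI.
baseline_prop = {"D000001": 0.004, "D000002": 0.0005}

enrichment = {
    ui: fold_enrichment(hits, subset_size, baseline_prop[ui])
    for ui, hits in subset_counts.items()
}

for ui, ratio in enrichment.items():
    print(ui, round(ratio, 2))
```

A ratio above 1 means the descriptor is more common in the subset than in PubMed as a whole. The ratio alone says nothing about statistical significance, which would need a separate test.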

data_mesh_frequencies

Format

A data.table with 30,521 rows and 4 columns:

DescriptorUI

MeSH descriptor unique identifier (e.g., D000001)

DescriptorName

Canonical MeSH descriptor name

n_pmids

Number of distinct PubMed records indexed with this descriptor

prop_total

Share of the 39,703,112 PMIDs in the PubMed corpus that are indexed with this descriptor (n_pmids / 39,703,112)
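The relationship between n_pmids and prop_total can be sketched on toy data as follows. This is a minimal Python illustration, not the pipeline that produced the dataset: the (PMID, DescriptorUI) pairs are invented, and only the corpus size of 39,703,112 comes from the documentation above.

```python
from collections import defaultdict

TOTAL_PMIDS = 39_703_112  # corpus size stated in this documentation

# Hypothetical (PMID, DescriptorUI) indexing pairs. One pair is repeated
# on purpose to show that a PMID is counted once per descriptor.
pairs = [
    (101, "D000001"),
    (101, "D000001"),  # duplicate pair, collapsed by the set below
    (102, "D000001"),
    (102, "D000002"),
]

# Collect the distinct PMIDs seen for each descriptor.
pmids_by_descriptor = defaultdict(set)
for pmid, ui in pairs:
    pmids_by_descriptor[ui].add(pmid)

# One row per descriptor: distinct-PMID count and its share of the corpus.
rows = [
    {
        "DescriptorUI": ui,
        "n_pmids": len(pmids),
        "prop_total": len(pmids) / TOTAL_PMIDS,
    }
    for ui, pmids in sorted(pmids_by_descriptor.items())
]

for row in rows:
    print(row)
```

Because a single PMID can carry several descriptors (PMID 102 above has two), the prop_total values across descriptors are not expected to sum to 1.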

Source

Computed from the mesh_descriptor table in a local PubMed PostgreSQL mirror; descriptor metadata from the NLM MeSH Thesaurus (April 2026).