healthsim

HealthSim generates realistic synthetic healthcare data for testing EMR systems, claims processing, pharmacy benefits, and analytics. Use for ANY request involving: (1) synthetic patients, clinical data, or medical records, (2) healthcare claims, billing, or adjudication, (3) pharmacy prescriptions, formularies, or drug utilization, (4) HL7v2, FHIR, X12, or NCPDP formatted output, (5) healthcare testing scenarios or sample data generation.

Install

git clone https://github.com/mark64oswald/healthsim-workspace ~/.claude/skills/healthsim-workspace

Tip: Run this command in your terminal to install the skill.


HealthSim - Synthetic Healthcare Data Generation

Overview

HealthSim generates realistic synthetic healthcare data through natural conversation. Rather than writing code or configuration files, describe what you need and Claude generates appropriate data.

Products:

| Product | Domain | What It Generates | Status |
| --- | --- | --- | --- |
| PatientSim | Clinical/EMR | Patients, encounters, diagnoses, procedures, labs, vitals, medications | Active |
| MemberSim | Payer/Claims | Members, professional claims, facility claims, payments, accumulators | Active |
| RxMemberSim | Pharmacy/PBM | Prescriptions, pharmacy claims, formularies, DUR alerts, prior auths | Active |
| TrialSim | Clinical Trials | Studies, sites, subjects, visits, adverse events, efficacy, CDISC output | Active |
| PopulationSim | Demographics/SDOH | Population profiles, cohort specifications, health disparities, SVI/ADI analysis | Active |
| NetworkSim | Provider Networks | Providers, facilities, pharmacies, networks, benefit structures | Active |

Quick Start

Generate Clinical Data

Request: "Generate a 65-year-old diabetic patient with hypertension"

Claude will produce a patient with:

  • Demographics (age 65, realistic name/address)
  • Diagnoses (E11.9 Type 2 diabetes, I10 hypertension)
  • Medications (metformin, lisinopril)
  • Labs (A1C, BMP with values in expected ranges)
  • Comorbidities (likely hyperlipidemia, possible obesity)

Generate Claims

Request: "Create a professional claim for an office visit"

Claude will produce:

  • Claim header (provider NPI, member ID, service date)
  • Service lines (CPT 99213/99214, charges)
  • Diagnoses (ICD-10 codes)
  • Adjudication (allowed, paid, patient responsibility)

Generate Pharmacy Data

Request: "Generate a pharmacy claim that triggers a drug interaction alert"

Claude will produce:

  • Prescription details (NDC, quantity, days supply)
  • Pharmacy claim (BIN, PCN, cardholder ID)
  • DUR alert (DD code, clinical significance, recommendation)
  • Claim response (approved with warning or rejected)

Scenario Skills

PatientSim Scenarios

Load these for clinical data generation:

| Scenario | Use When | Key Elements |
| --- | --- | --- |
| ADT Workflow | admission, discharge, transfer, patient movement | A01/A02/A03 events, bed management, census |
| Diabetes Management | diabetic, A1C, glucose, metformin, insulin | Disease progression, medication escalation, complications |
| Heart Failure | CHF, HFrEF, BNP, ejection fraction | NYHA classification, GDMT therapy, decompensation |
| Chronic Kidney Disease | CKD, eGFR, dialysis, nephrology | CKD staging, progression, comorbidities |
| Sepsis/Acute Care | sepsis, infection, ICU, critical | Sepsis criteria, antibiotic protocols, ICU stay |
| Orders & Results | lab order, radiology, ORM, ORU, results | Orders, specimens, lab panels, radiology reports |
| ED Chest Pain | chest pain, emergency, ACS, troponin | Risk stratification, HEART score, workup |
| Elective Joint | hip replacement, knee replacement, arthroplasty | Pre-op, surgery, recovery, PT |
| Maternal Health | pregnancy, prenatal, L&D, postpartum | Prenatal visits, GDM, preeclampsia, delivery |
| Oncology | cancer, tumor, chemotherapy, breast/lung/colorectal | Staging, treatment protocols, tumor markers |

See: skills/patientsim/ for detailed skills

MemberSim Scenarios

Load these for claims and payer data:

| Scenario | Use When | Key Elements |
| --- | --- | --- |
| Plan & Benefits | plan, benefit plan, HMO, PPO, HDHP | Plan types, cost sharing, pharmacy tiers |
| Enrollment & Eligibility | enrollment, eligibility, 834, 270/271 | Member add/term, coverage verification, QLE |
| Professional Claims | office visit, 837P, physician claim | E&M coding, place of service, adjudication |
| Facility Claims | hospital, inpatient, 837I, DRG | Revenue codes, DRG assignment, LOS |
| Prior Authorization | prior auth, pre-cert, authorization | Request/response workflow, approval criteria |
| Accumulator Tracking | deductible, OOP, accumulator | Year-to-date tracking, family vs individual |
| Value-Based Care | quality measures, VBC, HEDIS | Attribution, measure compliance, incentives |
| Behavioral Health | mental health, psychiatry, SUD, therapy | Psychotherapy, medication management, PHP/IOP |

See: skills/membersim/ for detailed skills

RxMemberSim Scenarios

Load these for pharmacy and PBM data:

| Scenario | Use When | Key Elements |
| --- | --- | --- |
| Retail Pharmacy | prescription fill, retail, copay | New/refill, pricing, patient pay |
| Specialty Pharmacy | specialty drug, biologics, hub | Limited distribution, PA, patient support |
| DUR Alerts | drug interaction, DUR, therapeutic dup | Alert types, severity, override |
| Formulary Management | formulary, tier, coverage | Tier structure, PA requirements, alternatives |
| Rx Enrollment | rx enrollment, pharmacy member, BIN PCN | Pharmacy benefit eligibility, plan assignment |
| Rx Prior Auth | pharmacy PA, step therapy | Clinical criteria, approval workflow |
| Rx Accumulators | rx deductible, rx OOP, Part D phases | Pharmacy cost sharing tracking, TrOOP |
| Manufacturer Programs | copay card, patient assistance | Copay cards, PAPs, hub services |

See: skills/rxmembersim/ for detailed skills

TrialSim Scenarios

Load these for clinical trial data generation:

| Scenario | Use When | Key Elements |
| --- | --- | --- |
| Clinical Trials Domain | trial concepts, phases, CDISC | Phase definitions, regulatory, standards |
| Recruitment & Enrollment | screening, enrollment, consent | Screening funnel, I/E criteria, randomization |
| Phase 1 Dose Escalation | Phase 1, FIH, MTD, 3+3, BOIN, CRM | Dose escalation, DLT, PK sampling |
| Phase 2 Proof-of-Concept | Phase 2, Simon's, MCP-Mod | Futility stopping, dose-response |
| Phase 3 Pivotal | Phase 3, pivotal, registration trial | Multi-site, endpoints, safety monitoring |
| Oncology Trials | oncology trial, tumor endpoints | RECIST, survival endpoints, biomarkers |
| Cardiovascular Trials | CV outcomes, MACE | Cardiac events, biomarkers |
| CNS Trials | CNS, Alzheimer's, MS | Cognitive scales, imaging |
| Cell & Gene Therapy | CAR-T, gene therapy, CGT | Long-term follow-up, CRS, ICANS |
| Dimensional Analytics | trial analytics, star schema, dashboard | fact/dim tables, DuckDB, Databricks |

See: skills/trialsim/ for detailed skills

PopulationSim Scenarios

Load these for population intelligence and cohort definition:

| Scenario | Use When | Key Elements |
| --- | --- | --- |
| Geographic Profile | county profile, demographics for, MSA | County/tract/metro demographics, health indicators |
| Health Patterns | diabetes rate, prevalence, disparities | CDC PLACES measures, age-adjusted rates |
| SDOH Analysis | SVI, ADI, social vulnerability, deprivation | SVI themes, ADI rankings, barriers |
| Cohort Definition | define cohort, population segment | CohortSpecification for generation products |
| Trial Support | diversity planning, site selection, feasibility | FDA diversity, site ranking, enrollment projections |

Key Differentiator: PopulationSim analyzes real population data (Census, CDC) and outputs specifications, not synthetic records. These specs drive realistic generation in PatientSim, MemberSim, and TrialSim.

See: skills/populationsim/ for detailed skills

NetworkSim Scenarios

Load these for provider network knowledge and entity generation:

| Scenario | Use When | Key Elements |
| --- | --- | --- |
| Network Types | HMO, PPO, EPO, POS, HDHP | Network definitions, cost/flexibility tradeoffs |
| Plan Structures | deductible, copay, coinsurance, OOP | Benefit design, cost sharing, accumulators |
| Pharmacy Benefits | tier structure, formulary, PBM | Tier design, formulary types, pharmacy networks |
| PBM Operations | BIN, PCN, claims processing, rebates | Claim flow, adjudication, manufacturer rebates |
| Utilization Management | prior auth, step therapy, QL | PA process, step requirements, quantity limits |
| Specialty Pharmacy | specialty drugs, hub model, REMS | Limited distribution, specialty services |
| Network Adequacy | access standards, time distance | Time/distance, provider ratios, ECPs |
| Provider Generation | generate provider, NPI, physician | Synthetic providers with taxonomy, credentials |
| Facility Generation | generate hospital, facility, CCN | Synthetic facilities with beds, services |
| Pharmacy Generation | generate pharmacy, NCPDP | Synthetic pharmacies with type, chain |

See: skills/networksim/ for detailed skills

Generative Framework

Load these for batch generation and specification-driven data creation:

| Scenario | Use When | Key Elements |
| --- | --- | --- |
| Profile Builder | batch generation, cohort profile, population spec | Demographics, conditions, coverage definitions |
| Journey Builder | temporal patterns, event sequences, care pathways | Timelines, triggers, branching |
| Quick Generate | simple single-entity generation | Fast single patient/member/claim |
| Distributions | customize statistical patterns | Age, cost, utilization distributions |
| Templates | start from common patterns | Pre-built profiles and journeys |

Key Differentiator: The Generative Framework enables specification-driven generation at scale. Build a profile, define a journey, then execute to generate hundreds or thousands of correlated entities across all products.

See: skills/generation/ for detailed skills

Common Skills

Cross-product infrastructure skills:

| Skill | Use When | Key Elements |
| --- | --- | --- |
| State Management | save scenario, load scenario, list scenarios | DuckDB persistence, scenario naming |
| Identity Correlation | link patient, find member, cross-product | SSN correlation, entity linking |
| DuckDB Skill | query database, SQL, schema | Direct database operations |

See: skills/common/ for detailed skills

Output Formats

Default: JSON

By default, Claude outputs data as JSON objects that match the canonical data model.

Healthcare Standards

Request specific formats:

| Format | Request Phrases | Use Case |
| --- | --- | --- |
| FHIR R4 | "as FHIR", "FHIR bundle", "FHIR resources" | Interoperability, modern APIs |
| C-CDA | "as C-CDA", "as CCD", "discharge summary", "referral note" | Clinical documents, HIE |
| HL7v2 | "as HL7", "ADT message", "HL7v2" | Legacy EMR integration |
| X12 834 | "as 834", "X12 enrollment", "enrollment file" | Benefit enrollment |
| X12 270/271 | "as 270", "eligibility inquiry", "eligibility check" | Eligibility verification |
| X12 837 | "as 837", "X12 claim", "EDI format" | Claims submission |
| X12 835 | "as 835", "remittance", "ERA" | Payment posting |
| NCPDP D.0 | "as NCPDP", "pharmacy claim format" | Pharmacy transactions |
| CDISC SDTM | "as SDTM", "SDTM domains" | Clinical trial regulatory submission |
| CDISC ADaM | "as ADaM", "analysis datasets" | Clinical trial statistical analysis |

See: formats/ for transformation skills

Analytics Formats

| Format | Request Phrases | Use Case |
| --- | --- | --- |
| Dimensional (DuckDB) | "star schema for DuckDB", "dimensional model", "for analytics" | Local BI development |
| Dimensional (Databricks) | "star schema for Databricks", "load to Databricks", "Unity Catalog" | Enterprise analytics |

See: formats/dimensional-analytics.md for star schema details

Export Formats

| Format | Request Phrases |
| --- | --- |
| CSV | "as CSV", "save to CSV", "spreadsheet" |
| Parquet | "as Parquet", "for analytics" |
| SQL INSERT | "as SQL", "INSERT statements" |
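To illustrate what a CSV export involves, the sketch below flattens nested JSON records into one row each using only the Python standard library. The dotted column names (for example `name.given_name`) and the helper names `flatten` and `to_csv` are assumptions made for this illustration; the project's own CSV skill may lay out columns differently.

```python
import csv
import io


def flatten(record, prefix=""):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def to_csv(records):
    """Return CSV text for a non-empty list of records that share the same keys."""
    rows = [flatten(r) for r in records]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


patient = {
    "mrn": "MRN00000001",
    "name": {"given_name": "John", "family_name": "Smith"},
    "gender": "M",
}
print(to_csv([patient]))
```

List-valued fields such as diagnoses or claim lines do not fit a single row and would typically go into their own CSV files keyed by the parent identifier.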

Generation Parameters

Demographics

| Parameter | Default | Options |
| --- | --- | --- |
| age_range | 18-90 | Any range, e.g., "pediatric (0-17)", "senior (65+)" |
| gender | weighted (49% M, 51% F) | "male", "female", specific distribution |
| count | 1 | Any number; large counts are generated in batches |

Clinical (PatientSim)

| Parameter | Options |
| --- | --- |
| conditions | diabetes, heart failure, CKD, hypertension, COPD, etc. |
| severity | mild, moderate, severe, well-controlled, poorly-controlled |
| complications | with/without specific complications |

Claims (MemberSim)

| Parameter | Options |
| --- | --- |
| claim_type | professional, institutional, dental |
| claim_status | paid, denied, pending, partial |
| network_status | in-network, out-of-network |

Pharmacy (RxMemberSim)

| Parameter | Options |
| --- | --- |
| fill_type | new, refill |
| drug_type | generic, brand, specialty |
| dur_alerts | none, warning, reject |

Reproducibility

For consistent results across sessions:

Request: "Generate 10 patients using seed 42"

Claude will:

  1. Use seed 42 for all random selections
  2. Generate identical output when the same parameters are used
  3. Note the seed in output for reference
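The source does not say how the seed is applied internally, so the following is only a sketch of the general idea: a seeded pseudo-random generator makes every "random" choice repeatable. The function name `generate_demographics` and the record fields are hypothetical; the age range and gender weights reuse the documented defaults (18-90, 49% M / 51% F).

```python
import random


def generate_demographics(count, seed, age_range=(18, 90)):
    """Return `count` synthetic demographic records, reproducible for a given seed."""
    rng = random.Random(seed)  # private generator; leaves the global random state alone
    records = []
    for i in range(count):
        records.append({
            "mrn": f"MRN{i + 1:08d}",
            "age": rng.randint(age_range[0], age_range[1]),
            # weights mirror the documented default gender split
            "gender": rng.choices(["M", "F"], weights=[49, 51])[0],
            "seed": seed,  # recorded so the run can be repeated later
        })
    return records


first = generate_demographics(10, seed=42)
second = generate_demographics(10, seed=42)
print(first == second)  # True: same seed and parameters give identical records
```

Changing any parameter (count, age range, weights) changes the sequence of draws, which is why identical output requires identical parameters as well as the same seed.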

Validation

Claude automatically validates generated data for:

  • Structural: Required fields, data types, formats
  • Temporal: Date ordering (discharge after admission, etc.)
  • Referential: Foreign key relationships
  • Clinical: Age- and gender-appropriate conditions
  • Business: Valid code combinations, realistic pricing

Request explicit validation: "Validate this patient data"
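As a rough illustration of the structural and temporal categories above, the sketch below checks one encounter record. The field names `admit_date` and `discharge_date` and the function `validate_encounter` are assumptions for this example, not names taken from the canonical data model.

```python
from datetime import date


def validate_encounter(encounter):
    """Return a list of validation errors (empty when the record passes)."""
    errors = []
    # Structural: required fields must be present
    for field in ("mrn", "admit_date", "discharge_date"):
        if field not in encounter:
            errors.append(f"missing required field: {field}")
    if errors:
        return errors
    # Temporal: discharge must not precede admission
    admit = date.fromisoformat(encounter["admit_date"])
    discharge = date.fromisoformat(encounter["discharge_date"])
    if discharge < admit:
        errors.append("discharge_date is before admit_date")
    return errors


bad = {"mrn": "MRN00000001", "admit_date": "2025-01-10", "discharge_date": "2025-01-08"}
print(validate_encounter(bad))  # ['discharge_date is before admit_date']
```

Referential, clinical, and business checks follow the same pattern: each rule appends a message rather than raising, so one pass reports every problem in the record.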

Reference Data

For code lookups and documentation:

| Reference | Description |
| --- | --- |
| Code Systems | ICD-10, CPT, HCPCS, LOINC, NDC, RxNorm |
| Terminology | Healthcare terminology and abbreviations |
| Clinical Rules | Clinical business rules and guidelines |
| Validation Rules | All validation rules and constraints |
| HL7v2 Segments | HL7v2 segment definitions (MSH, PID, OBR, OBX, etc.) |

Format Transformations

Transform generated data to healthcare standards:

Healthcare Standards

| Format | Skill | Use Case |
| --- | --- | --- |
| FHIR R4 | formats/fhir-r4.md | Modern interoperability, REST APIs |
| C-CDA | formats/ccda-format.md | CCD, Discharge Summary, clinical documents |
| HL7v2 ADT | formats/hl7v2-adt.md | Admit/Discharge/Transfer messages |
| HL7v2 ORM | formats/hl7v2-orm.md | Order messages (lab, rad, meds) |
| HL7v2 ORU | formats/hl7v2-oru.md | Results/observation messages |
| X12 837 | formats/x12-837.md | Claims submission (P/I) |
| X12 835 | formats/x12-835.md | Remittance/ERA |
| NCPDP D.0 | formats/ncpdp-d0.md | Pharmacy transactions |

Export Formats

| Format | Skill | Use Case |
| --- | --- | --- |
| CSV | formats/csv.md | Spreadsheets, data analysis, bulk export |
| SQL | formats/sql.md | Database INSERT statements, data loading |
| Dimensional | formats/dimensional-analytics.md | Star schema for DuckDB, Databricks, BI tools |

Integration & Testing

| Guide | Description |
| --- | --- |
| Integration Guide | Cross-skill data flows and consistency |
| Testing Patterns | Validation and test patterns |

Examples

Example 1: Simple Patient

Request: "Generate a patient"

Output:

{
  "mrn": "MRN00000001",
  "name": {
    "given_name": "John",
    "family_name": "Smith"
  },
  "birth_date": "1975-03-15",
  "gender": "M",
  "address": {
    "street_address": "123 Main Street",
    "city": "Springfield",
    "state": "IL",
    "postal_code": "62701"
  }
}
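To show how the JSON above relates to one of the healthcare formats, here is a minimal sketch that renders the same patient as an HL7v2 PID segment. The function `to_pid_segment` is hypothetical and is not the logic in formats/hl7v2-adt.md; it fills only a handful of PID fields, and a complete message would also need an MSH header and other segments.

```python
def to_pid_segment(patient):
    """Render a patient dict as a pipe-delimited HL7v2 PID segment."""
    name = patient["name"]
    addr = patient["address"]
    fields = [
        "PID",
        "1",                                             # PID-1 set ID
        "",                                              # PID-2 (left empty)
        patient["mrn"],                                  # PID-3 patient identifier
        "",                                              # PID-4 (left empty)
        f'{name["family_name"]}^{name["given_name"]}',   # PID-5 family^given
        "",                                              # PID-6 (left empty)
        patient["birth_date"].replace("-", ""),          # PID-7 date of birth, YYYYMMDD
        patient["gender"],                               # PID-8 administrative sex
        "",                                              # PID-9 (left empty)
        "",                                              # PID-10 (left empty)
        # PID-11 address: street^other designation^city^state^zip
        f'{addr["street_address"]}^^{addr["city"]}^{addr["state"]}^{addr["postal_code"]}',
    ]
    return "|".join(fields)


patient = {
    "mrn": "MRN00000001",
    "name": {"given_name": "John", "family_name": "Smith"},
    "birth_date": "1975-03-15",
    "gender": "M",
    "address": {
        "street_address": "123 Main Street",
        "city": "Springfield",
        "state": "IL",
        "postal_code": "62701",
    },
}
print(to_pid_segment(patient))
# PID|1||MRN00000001||Smith^John||19750315|M|||123 Main Street^^Springfield^IL^62701
```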

Example 2: Clinical Scenario

Request: "Generate a 58-year-old female with poorly controlled Type 2 diabetes"

Output:

{
  "patient": {
    "mrn": "MRN00000002",
    "name": { "given_name": "Maria", "family_name": "Garcia" },
    "birth_date": "1967-08-22",
    "gender": "F"
  },
  "diagnoses": [
    { "code": "E11.65", "description": "Type 2 diabetes with hyperglycemia" },
    { "code": "I10", "description": "Essential hypertension" },
    { "code": "E78.5", "description": "Hyperlipidemia" }
  ],
  "medications": [
    { "name": "Metformin", "dose": "1000 mg", "frequency": "BID" },
    { "name": "Glipizide", "dose": "10 mg", "frequency": "BID" },
    { "name": "Lisinopril", "dose": "20 mg", "frequency": "QD" }
  ],
  "labs": [
    { "test": "HbA1c", "value": "9.8", "unit": "%", "flag": "H" },
    { "test": "Glucose", "value": "245", "unit": "mg/dL", "flag": "H" },
    { "test": "Creatinine", "value": "1.2", "unit": "mg/dL" }
  ]
}

Example 3: Professional Claim

Request: "Generate a paid professional claim for an office visit"

Output:

{
  "claim": {
    "claim_id": "CLM20250115000001",
    "claim_type": "PROFESSIONAL",
    "member_id": "MEM001234",
    "provider_npi": "1234567890",
    "service_date": "2025-01-15",
    "place_of_service": "11",
    "principal_diagnosis": "E11.9",
    "claim_lines": [
      {
        "line_number": 1,
        "procedure_code": "99214",
        "charge_amount": 175.00,
        "units": 1
      }
    ]
  },
  "adjudication": {
    "status": "paid",
    "allowed_amount": 125.00,
    "paid_amount": 100.00,
    "deductible": 0.00,
    "copay": 25.00,
    "coinsurance": 0.00
  }
}
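In the adjudication block above, the amounts balance: the payer payment (100.00) plus the copay (25.00) equals the allowed amount (125.00). A minimal sketch of that consistency check follows, assuming the convention that paid amount plus patient cost sharing equals the allowed amount on a paid claim; `check_adjudication` is a hypothetical helper, not part of HealthSim.

```python
def check_adjudication(adj):
    """Return True when payer payment plus patient cost sharing equals the allowed amount."""
    patient_responsibility = adj["deductible"] + adj["copay"] + adj["coinsurance"]
    # Compare in whole cents to avoid floating-point rounding surprises
    total_cents = round((adj["paid_amount"] + patient_responsibility) * 100)
    return total_cents == round(adj["allowed_amount"] * 100)


adjudication = {
    "status": "paid",
    "allowed_amount": 125.00,
    "paid_amount": 100.00,
    "deductible": 0.00,
    "copay": 25.00,
    "coinsurance": 0.00,
}
print(check_adjudication(adjudication))  # True
```

Denied or partially paid claims may follow different rules, so a check like this is best limited to claims with status "paid".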

Example 4: Pharmacy Claim with DUR

Request: "Generate a pharmacy claim that gets rejected for early refill"

Output:

{
  "claim": {
    "claim_id": "RX20250115000001",
    "transaction_code": "B1",
    "ndc": "00071015523",
    "drug_name": "Atorvastatin 20mg",
    "quantity": 30,
    "days_supply": 30,
    "service_date": "2025-01-15"
  },
  "response": {
    "status": "rejected",
    "reject_code": "79",
    "reject_message": "Refill Too Soon",
    "dur_alert": {
      "type": "ER",
      "message": "Refill 11 days early (before 80% used)",
      "previous_fill_date": "2024-12-27",
      "days_early": 11
    }
  }
}
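The early-refill logic behind a rejection like this can be sketched as a date calculation. This is an illustration under stated assumptions: the 80% threshold comes from the alert message above, `days_early` is taken to mean the days remaining before the prior supply runs out, and `check_refill` is a hypothetical helper. The dates in the usage lines are made up for the example.

```python
from datetime import date


def check_refill(previous_fill_date, days_supply, service_date, threshold=0.80):
    """Flag a refill as too soon when less than `threshold` of the prior supply has elapsed."""
    previous = date.fromisoformat(previous_fill_date)
    service = date.fromisoformat(service_date)
    days_elapsed = (service - previous).days
    return {
        "too_soon": days_elapsed < days_supply * threshold,
        "days_elapsed": days_elapsed,
        # days remaining before the prior supply runs out (never negative)
        "days_early": max(days_supply - days_elapsed, 0),
    }


result = check_refill("2025-03-01", 30, "2025-03-19")
print(result)  # {'too_soon': True, 'days_elapsed': 18, 'days_early': 12}
```

With a 30-day supply, the 80% threshold allows a refill from day 24 onward, so a fill attempted on day 18 is rejected.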

Tips

  1. Be specific: "diabetic patient with A1C of 9.5" beats "sick patient"
  2. Request format early: "Generate as FHIR..." rather than converting afterward
  3. Use seeds: For reproducible test data across sessions
  4. Batch large requests: "Generate 100 in batches of 20"
  5. Validate sensitive data: Request validation for production-like scenarios
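For the batching tip, the arithmetic is simple but worth making explicit when the total is not a multiple of the batch size. A minimal sketch follows; `plan_batches` is a hypothetical helper used only to illustrate the split.

```python
def plan_batches(total, batch_size):
    """Split `total` records into batch sizes no larger than `batch_size`."""
    if total < 0 or batch_size <= 0:
        raise ValueError("total must be >= 0 and batch_size must be > 0")
    full, remainder = divmod(total, batch_size)
    # full-size batches first, then one smaller batch for any leftover records
    return [batch_size] * full + ([remainder] if remainder else [])


print(plan_batches(100, 20))  # [20, 20, 20, 20, 20]
print(plan_batches(45, 20))   # [20, 20, 5]
```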

Disclaimer

HealthSim generates synthetic test data only. It is not a clinical decision support system and does not provide medical advice, diagnosis recommendations, or treatment guidance.

Intended uses:

  • Software development and testing
  • System integration validation
  • Training and educational demonstrations
  • Performance and load testing

Not intended for:

  • Clinical decision support
  • Medical advice or treatment recommendations
  • Actual patient care
  • Processing real PHI

The clinical patterns, medication regimens, and lab values reflect general healthcare conventions suitable for test data. They do not account for individual patient circumstances or the full complexity of clinical practice.