Demo Data Script

Hospital Patient Visits

Generate 1,500 realistic patient records with age-correlated vitals, 10 departments, and 9 diagnosis groups. No external data needed.

GENERATED DATA | 1,500 ROWS | HEALTHCARE | SCATTER + CITY VIEW

The Script

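This script generates synthetic hospital data entirely in DuckDB, with no input files. Vitals are loosely tied to age: older patients trend toward higher systolic blood pressure and more variable heart rates. Length of stay is drawn from four duration bands (outpatient, short stay, multi-day, extended) that together approximate a right-skewed, log-normal-like distribution.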

-- demo_medical.lua — Hospital patient visits (1,500 visits over 3 months)
--
-- Realistic vital signs with age-correlated patterns. Excellent for
-- scatter plots (age vs heart_rate) and City View (height=length_of_stay_hrs,
-- color=department, district=diagnosis_group).

ds.log("=== Hospital Patient Visits Demo ===")

ds.query([[
    SELECT
        date '2025-09-01' + INTERVAL (floor(random() * 90)::int) DAY   AS visit_date,
        10000 + i                                                        AS patient_id,
        age,

        -- Sex
        CASE WHEN random() < 0.48 THEN 'Male' ELSE 'Female' END         AS sex,

        -- Department (10 departments; thresholds are cumulative shares of r_dept)
        CASE
            WHEN r_dept < 0.18 THEN 'Emergency'
            WHEN r_dept < 0.32 THEN 'Internal Medicine'
            WHEN r_dept < 0.44 THEN 'Cardiology'
            WHEN r_dept < 0.54 THEN 'Orthopedics'
            WHEN r_dept < 0.63 THEN 'Neurology'
            WHEN r_dept < 0.71 THEN 'Oncology'
            WHEN r_dept < 0.79 THEN 'Pediatrics'
            WHEN r_dept < 0.86 THEN 'Pulmonology'
            WHEN r_dept < 0.93 THEN 'Surgery'
            ELSE                    'Psychiatry'
        END AS department,

        -- Diagnosis group (9 groups; thresholds are cumulative shares of r_diag)
        CASE
            WHEN r_diag < 0.15 THEN 'Cardiovascular'
            WHEN r_diag < 0.28 THEN 'Respiratory'
            WHEN r_diag < 0.40 THEN 'Musculoskeletal'
            WHEN r_diag < 0.50 THEN 'Infectious'
            WHEN r_diag < 0.60 THEN 'Neurological'
            WHEN r_diag < 0.70 THEN 'Gastrointestinal'
            WHEN r_diag < 0.80 THEN 'Endocrine'
            WHEN r_diag < 0.88 THEN 'Mental Health'
            ELSE                    'Trauma'
        END AS diagnosis_group,

        -- Vitals: loosely correlated with age
        -- Heart rate: spread widens with age (about +/-12 bpm at 18, +/-20 bpm
        -- at 80); roughly 30% of visits get an extra +15 bpm
        LEAST(GREATEST(
            round(72 + (random() - 0.5) * (20 + age * 0.25)
                  + CASE WHEN random() < 0.3 THEN 15 ELSE 0 END)::int,
            50), 140)                                                    AS heart_rate,

        -- Blood pressure systolic: rises 0.5 mmHg per year of age above 40
        LEAST(GREATEST(
            round(110 + random() * 30
                  + (age - 40) * 0.5
                  + (random() - 0.5) * 20)::int,
            85), 195)                                                    AS bp_systolic,

        -- Blood pressure diastolic (independent of age)
        LEAST(GREATEST(round(65 + random() * 25 + (random() - 0.5) * 15)::int,
            50), 110)                                                    AS bp_diastolic,

        -- Temperature (about 75% normal, 15% mild fever, 10% high fever)
        round(
            CASE
                WHEN r_temp < 0.75 THEN 36.4 + random() * 0.8
                WHEN r_temp < 0.90 THEN 37.5 + random() * 1.5
                ELSE                    38.5 + random() * 2.0
            END, 1
        )                                                                AS temperature_c,

        -- Length of stay in hours (four uniform bands, log-normal-ish overall)
        round(
            CASE
                WHEN r_stay < 0.30 THEN 1 + random() * 5        -- outpatient (30%)
                WHEN r_stay < 0.65 THEN 6 + random() * 42       -- short stay (35%)
                WHEN r_stay < 0.90 THEN 48 + random() * 120     -- multi-day (25%)
                ELSE                    168 + random() * 336    -- extended (10%)
            END, 1
        )                                                                AS length_of_stay_hrs,

        -- Readmission within 30 days
        CASE WHEN random() < 0.12 THEN 'Yes' ELSE 'No' END             AS readmitted

    FROM (
        -- Draw age and one uniform value per categorical column exactly once
        -- per row. Calling random() inside every WHEN re-rolls at each branch,
        -- which skews the shares toward the first categories, and re-drawing
        -- age inside bp_systolic breaks its link to the age column.
        SELECT
            i,
            -- Age: centered on 45, clamped to 18-95 (the formula tops out at 80)
            LEAST(GREATEST(
                round(45 + (random() - 0.5) * 50 + (random() - 0.5) * 20)::int,
                18), 95)  AS age,
            random()      AS r_dept,
            random()      AS r_diag,
            random()      AS r_temp,
            random()      AS r_stay
        FROM generate_series(0, 1499) AS t(i)
    ) AS draws
    ORDER BY visit_date
]])

ds.log("Generated " .. ds.data.row_count .. " patient visits")
ds.log("Columns: " .. table.concat(ds.data.column_names, ", "))

-- Scatter: age vs heart rate
ds.chart.type  = "scatter"
ds.chart.x     = 3      -- age
ds.chart.y     = {7}    -- heart_rate
ds.chart.title = "Heart Rate by Patient Age"

ds.log("=== Chart ready — switch to City View for 3D exploration ===")
ds.log("    Height: length_of_stay_hrs | Color: department | District: diagnosis_group")

Columns

12 fields: visit_date, patient_id, age, sex, department, diagnosis_group, heart_rate, bp_systolic, bp_diastolic, temperature_c, length_of_stay_hrs, readmitted.
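The chart settings in the script refer to columns by 1-based position (3 for age, 7 for heart_rate). A small lookup makes that mapping explicit. The sketch below is plain Lua: `column_index` is a hypothetical helper name, and the list is copied by hand from the query's SELECT order. Inside ColumnLens the same list is presumably what `ds.data.column_names` holds, since the script already reads it for its log line.

```lua
-- Column names in SELECT order, copied by hand from the script's query.
local column_names = {
    "visit_date", "patient_id", "age", "sex", "department",
    "diagnosis_group", "heart_rate", "bp_systolic", "bp_diastolic",
    "temperature_c", "length_of_stay_hrs", "readmitted",
}

-- Hypothetical helper: return the 1-based position of a column name,
-- or nil when the name is not in the list.
local function column_index(names, wanted)
    for position, name in ipairs(names) do
        if name == wanted then
            return position
        end
    end
    return nil
end

print(column_index(column_names, "age"))         -- 3
print(column_index(column_names, "heart_rate"))  -- 7
```

Looking positions up by name keeps the chart settings correct if a column is later added or reordered in the query.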

Realistic Patterns

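Age clusters around 45, within an 18-95 clamp. Systolic blood pressure rises with age. Temperature is mostly normal, with some fevers. Length of stay is right-skewed (log-normal-like): mostly short visits with a long tail of extended stays.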
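The department and diagnosis columns are defined by ascending thresholds (0.18, 0.32, 0.44, and so on), which read as cumulative probabilities. A common way to apply such thresholds is to compare one uniform draw per row against each threshold in turn. The sketch below shows that idea in plain Lua. `pick_department` is a hypothetical name, and this illustrates the technique rather than the code ColumnLens runs.

```lua
-- Cumulative thresholds and labels, copied from the script's department CASE.
local departments = {
    { 0.18, "Emergency" },
    { 0.32, "Internal Medicine" },
    { 0.44, "Cardiology" },
    { 0.54, "Orthopedics" },
    { 0.63, "Neurology" },
    { 0.71, "Oncology" },
    { 0.79, "Pediatrics" },
    { 0.86, "Pulmonology" },
    { 0.93, "Surgery" },
    { 1.00, "Psychiatry" },
}

-- Hypothetical helper: map ONE uniform draw r in [0, 1) to a department.
local function pick_department(r)
    for _, entry in ipairs(departments) do
        if r < entry[1] then
            return entry[2]
        end
    end
    return departments[#departments][2]  -- guard in case r is exactly 1.0
end

-- Each department's share is the gap between consecutive thresholds.
local previous = 0
for _, entry in ipairs(departments) do
    print(string.format("%-18s %.2f", entry[2], entry[1] - previous))
    previous = entry[1]
end

print(pick_department(math.random()))
```

The important detail is that `r` is drawn once. Calling the random function again in every comparison would turn the thresholds into conditional probabilities and starve the later categories.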

Chart

Scatter: heart rate (Y) vs age (X). You can see the wider variance in older patients and the occasional tachycardia spike.
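To check the spread numerically rather than by eye, heart rates can be grouped into age bands and the range compared per band. The sketch below is plain Lua over a handful of made-up sample rows (not output from the script). `range_by_band` and the 20-year band width are assumptions chosen for illustration.

```lua
-- Made-up sample rows for illustration only: { age, heart_rate }.
local sample = {
    { 22, 70 }, { 25, 78 }, { 31, 66 },
    { 47, 64 }, { 52, 85 }, { 58, 73 },
    { 71, 58 }, { 76, 96 }, { 79, 81 },
}

-- Hypothetical helper: heart-rate range (max - min) per age band.
-- Bands are keyed by their lower bound, e.g. 20 covers ages 20-39.
local function range_by_band(rows, band_width)
    local lows, highs = {}, {}
    for _, row in ipairs(rows) do
        local age, hr = row[1], row[2]
        local band = math.floor(age / band_width) * band_width
        lows[band] = math.min(lows[band] or hr, hr)
        highs[band] = math.max(highs[band] or hr, hr)
    end
    local ranges = {}
    for band, low in pairs(lows) do
        ranges[band] = highs[band] - low
    end
    return ranges
end

local ranges = range_by_band(sample, 20)
for _, band in ipairs({ 20, 40, 60 }) do
    print(string.format("ages %d-%d: range %d bpm", band, band + 19, ranges[band]))
end
```

A range that grows from one band to the next is the numeric counterpart of the widening cloud in the scatter plot. With real data, a standard deviation per band would be less sensitive to single outliers than the range.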

City View

Set height to length_of_stay_hrs, color to department, district to diagnosis_group. Extended stays tower over outpatient visits.


Run this script in ColumnLens

Paste it into the Script Console. No data download needed.
