QMidiGen 0.1.2: Fifty-Five Styles and Several Hard Lessons

#rust #midi #qmidigen #fluidsynth #cxx-qt #procedural-generation #music

TL;DR: QMidiGen 0.1.2 is out today. The JRPG preset grew from roughly 30 styles to 55, calibrated against a serious study of classic and modern JRPG soundtracks. The generator got a motif system, a call-and-response voice, and an instrument coherence filter. Three FluidSynth bugs — one making seeks silently land at tick 0, one occasionally corrupting the C heap, one quietly breaking loop sections — turned out to have been there since 0.1.0. And the soundfont profile system learned to disqualify programs that sound categorically wrong, not just loud. The wins are real; a few of the pain points were educational in the worst way.

| Section | Summary |
| --- | --- |
| The Measurement Project | Why I studied actual JRPG soundtracks instead of guessing |
| Fifty-Five Styles | What the preset expansion actually looks like |
| Generator Quality Pass | Motifs, call-and-response, coherence filters |
| Soundfont Profiles Got Teeth | `avoid_programs` and the Timbres of Heaven overhaul |
| Three FluidSynth Bugs | Seek ordering, heap corruption, and what the C source says |
| The cxx-qt Signal Problem | Property setters from Rust that refuse to wake QML |
| The UI Additions Nobody Will Notice | File → Save, Import MIDI, soundfont persistence |
| Lessons | What I’d do differently |

The Measurement Project

The JRPG preset in 0.1.0 and 0.1.1 had reasonable-sounding values — tempos around where you’d expect them, density factors that produced something recognizable. “Reasonable-sounding” and “actually calibrated” are different things, though, and expanding the preset to cover 30+ more styles made that difference impossible to ignore.

For 0.1.2 I stopped guessing and started actually studying source material. The process: pick a representative cross-section of classic and modern JRPG soundtracks, work through them carefully noting tempo, density, and structural choices, and compare those observations against what the generator was producing.

Some of what came back was not what I expected.

Last Dungeon was supposed to be slow and oppressive. That’s the genre convention: you’re in the final area, dread is mounting, everything is slower and heavier. Actual JRPG soundtracks disagreed — endgame areas trend toward some of the fastest, densest tracks in their respective games. The intuition “final area = slow and foreboding” is a feeling, not a musical fact. The generator had been producing Last Dungeon at 92 BPM with density 0.80. Corrected to 136 BPM and 1.10.

Victory Fanfare was the busiest thing I found. By note density, victory fanfares are consistently some of the most active cues in the genre — which makes sense if you think about it for five seconds, but I hadn’t. The generator was producing it at density 0.85, lower than a standard battle theme. Corrected to 1.35. Now it actually sounds like something happened.

Final Boss is not always minor. Some of the most iconic JRPG final boss themes are in major keys — the “vast and hopeful” finale rather than the “oppressive and dark” one. I had the Final Boss scale palette excluding major entirely. Intuition overridden.

The pattern is consistent: intuitions about what JRPG music sounds like are shaped by memory and genre convention, and memory is not calibrated. Studying the actual music and correcting against it produced a preset that generates things you’d recognize as belonging in a game, rather than a generator’s approximation of that.

Fifty-Five Styles

The preset went from the original general-purpose styles (Pop, Jazz, Blues, Classical, Funk, Ambient) plus roughly 20 JRPG subtypes to 55 JRPG subtypes across nine categories. The new subtypes fill in gaps the original set left obvious.

Emotional gained Desolation (near-static, drumless, grief-register) and Theme of Love (88 BPM, A minor — the slow, warm romantic ballad archetype). Reminiscence and Drifter were already there; the expansion completes the palette.

Uplifting is a new category. AscendingDawn (86 BPM, C major) is the slow-grand hopeful finale — the major-key “this is actually enormous and triumphant” archetype rather than the minor-key dread variant. DragonCalling covers mythic and destiny-fulfilled moments. The category got its own UI group because these styles don’t fit under any of the existing ones without feeling wrong.

Airship & Space split off from Overworld to give CelestialVoyage its own home. Density 0.18 — space is the texture. Sustained pads, near-zero attack instruments, silence as a compositional element. Getting this to sound like something rather than nothing was more work than the numbers suggest.

Tension & Escape arrived for FleetingEscape — the urgent-chase subtype, the classic “you are running and the game will not let you forget it” archetype. High tempo, dense rhythm, short sections, no resolution. The category also houses the existing Tension style, which had been floating without a parent.

Each new subtype required calibrating five core parameters: default tempo, tempo range, density factor, drum intensity, and scale palette. The new UI category groups keep the picker navigable as the list grows.
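As a rough illustration of what "five core parameters" could look like in code, here is a minimal sketch. The struct name, field names, and types are hypothetical and not taken from QMidiGen's source; the only point is that a cheap consistency check catches a default tempo that drifts outside its own range during recalibration.

```rust
/// Hypothetical container for the five per-subtype parameters listed above.
/// Names and types are illustrative, not QMidiGen's actual definitions.
#[derive(Debug, Clone)]
pub struct StyleParams {
    /// Default tempo in BPM.
    pub default_tempo: u16,
    /// Allowed tempo span in BPM.
    pub tempo_range: std::ops::RangeInclusive<u16>,
    /// Note density multiplier (the post mentions values from 0.18 to 1.35).
    pub density_factor: f32,
    /// Drum intensity; the scale of this value is an assumption.
    pub drum_intensity: f32,
    /// Scale names the subtype may draw from.
    pub scale_palette: Vec<&'static str>,
}

impl StyleParams {
    /// Sanity check to run whenever a subtype is added or recalibrated.
    pub fn is_consistent(&self) -> bool {
        self.tempo_range.contains(&self.default_tempo)
            && self.density_factor > 0.0
            && !self.scale_palette.is_empty()
    }
}
```

With a check like this, changing Last Dungeon's default tempo from 92 to 136 without also widening a range that stops short of 136 would fail fast instead of producing an out-of-range default.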

Generator Quality Pass

Adding styles without improving the underlying generation would just produce more variations of the same mediocrity. Several significant changes went into the generator this cycle.

Motif system

The generator now picks a seed motif at the start of each song — a short sequence of pitch intervals and durations. Across sections, it reappears in two variations: augmentation (durations doubled, the shape stretched out) and retrograde (intervals and durations reversed). This isn’t subtle compositional theory; it’s the basic move that ties a piece together. Without it, sections generate independently and stack — they share instrumentation and key, but they don’t share anything that a listener would recognize as a common thread. With the motif system, you hear the same shape recurring in different forms, which is what musical coherence actually is.

The motif is fixed for a given generation seed, so regenerating the full song gives you the same motif in a different arrangement. Regenerating a single section inherits the motif from the rest of the song rather than picking a new one and breaking the thread.
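The two variations are simple enough to sketch. This is a minimal illustration, assuming the motif is stored as parallel lists of intervals (in semitones) and durations (in ticks); the `Motif` type and its method names are made up for the example and are not the generator's real API.

```rust
/// A seed motif: pitch intervals between successive notes (semitones)
/// and note durations (ticks). This layout is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct Motif {
    pub intervals: Vec<i8>,
    pub durations: Vec<u32>,
}

impl Motif {
    /// Augmentation: same interval shape, every duration doubled.
    pub fn augmented(&self) -> Motif {
        Motif {
            intervals: self.intervals.clone(),
            durations: self.durations.iter().map(|d| d * 2).collect(),
        }
    }

    /// Retrograde as described above: intervals and durations in reverse order.
    /// (A stricter retrograde of the pitch line would also flip each interval's sign.)
    pub fn retrograde(&self) -> Motif {
        Motif {
            intervals: self.intervals.iter().rev().copied().collect(),
            durations: self.durations.iter().rev().copied().collect(),
        }
    }
}
```

Because both transforms are pure functions of the seed motif, regenerating a single section can reuse the stored motif and still produce the same variations as the rest of the song.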

Call-and-response voice (channel 7)

Non-battle styles now generate a response voice on MIDI channel 7. While the primary melody (channel 0) is resting between phrases, the response voice plays an inverted version of the motif in a contrasting register. Six rotating response templates per phrase window prevent it from settling into a predictable pattern.

This solved a problem I’d been aware of since 0.1.0 but hadn’t named clearly: non-battle styles had a melody, a harmony, and a bass, but the space between melody phrases was just pad sustain and drums ticking away. The response voice fills that space without competing when the melody is active.
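A sketch of the three pieces involved: inverting the motif, finding the gaps where the melody rests, and rotating through six templates. All function names are hypothetical, the melody is assumed to be a start-sorted list of (start, end) tick spans, and the round-robin rotation is a guess at how the templates cycle.

```rust
/// Number of response templates, per the description above.
pub const RESPONSE_TEMPLATES: usize = 6;

/// Inversion: mirror each interval so rises become falls.
pub fn invert(intervals: &[i8]) -> Vec<i8> {
    intervals.iter().map(|i| i.saturating_neg()).collect()
}

/// Round-robin template choice. The real rotation order is an assumption.
pub fn template_for_phrase(phrase_index: usize) -> usize {
    phrase_index % RESPONSE_TEMPLATES
}

/// Return the tick spans inside one phrase window where the melody is silent.
/// `melody` holds (start, end) spans sorted by start tick.
pub fn rest_windows(melody: &[(u32, u32)], phrase_len: u32) -> Vec<(u32, u32)> {
    let mut gaps = Vec::new();
    let mut cursor = 0u32;
    for &(start, end) in melody {
        if start > cursor {
            gaps.push((cursor, start));
        }
        // Overlapping notes never move the cursor backwards.
        cursor = cursor.max(end);
    }
    if cursor < phrase_len {
        gaps.push((cursor, phrase_len));
    }
    gaps
}
```

The response voice would then place its inverted motif only inside the spans returned by `rest_windows`, which is what keeps it from competing with an active melody.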

Instrument coherence filter

The programs_compatible function blocks jarring instrument pairings — synth leads over orchestral strings, rock guitar alongside a woodwind section, that kind of thing. Implemented as a set of GM program ranges that don’t belong in the same arrangement:

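fn programs_compatible(melody: u8, harmony: u8) -> bool {
    // GM programs are 0-indexed here: 80..=87 are synth leads, 88..=95 synth pads.
    let synth_lead = 80..=95_u8;
    // 40..=47 is the GM strings family (violin through timpani).
    let orchestral_strings = 40..=47_u8;
    // A synth lead or pad melody over orchestral strings is a known-bad pairing.
    if synth_lead.contains(&melody) && orchestral_strings.contains(&harmony) {
        return false;
    }
    // additional known-bad pairings
    true
}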

The generator picks new harmony instruments until it finds one that passes, with a fallback to the original pick if the pool is exhausted. The blocklist is deliberately conservative — it blocks things that are actively bad, not things that are merely unusual.
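The retry-with-fallback behavior can be sketched as follows. This is an illustration only: `pick_harmony` is a hypothetical name, the candidate pool is walked in order rather than sampled randomly, and the compatibility check is reduced to the single pairing shown above.

```rust
/// Reduced version of the compatibility check: only the one pairing shown above.
fn programs_compatible(melody: u8, harmony: u8) -> bool {
    let synth_lead = 80..=95_u8;
    let orchestral_strings = 40..=47_u8;
    !(synth_lead.contains(&melody) && orchestral_strings.contains(&harmony))
}

/// Keep the first pick if it is compatible; otherwise try the pool in order.
/// If nothing in the pool passes, fall back to the original pick.
pub fn pick_harmony(melody: u8, first_pick: u8, pool: &[u8]) -> u8 {
    if programs_compatible(melody, first_pick) {
        return first_pick;
    }
    pool.iter()
        .copied()
        .find(|&candidate| programs_compatible(melody, candidate))
        .unwrap_or(first_pick)
}
```

Falling back to the original pick means the filter can only ever improve a pairing; it never leaves a song without a harmony instrument.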

Sustained instrument handling

Strings, woodwinds, and horns now generate differently from attack instruments (piano, plucked strings, percussion-adjacent programs). Sustained instruments get notes held 1.4× longer at lower cell density. A string section playing thirty-second notes sounds like a malfunction, not orchestral writing. The distinction is made by GM program number range, not by category inference.
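A minimal sketch of the range-based split. The exact ranges below are an assumption based on the General MIDI patch map (0-indexed), not QMidiGen's real table, and the lower cell density for sustained instruments is not modeled here.

```rust
/// Rough "sustained" test by GM program number (0-indexed). Assumed ranges:
/// 40..=44 bowed solo strings, 48..=51 string ensembles,
/// 56..=63 brass, 64..=71 reeds, 72..=79 pipes.
/// Pizzicato strings (45), harp (46), and timpani (47) are deliberately left out.
pub fn is_sustained(program: u8) -> bool {
    matches!(program, 40..=44 | 48..=51 | 56..=79)
}

/// Hold sustained instruments 1.4x longer; attack instruments keep the base length.
/// Integer math: multiplying by 7 and dividing by 5 is the same as multiplying by 1.4.
pub fn note_length_ticks(program: u8, base_ticks: u32) -> u32 {
    if is_sustained(program) {
        base_ticks * 7 / 5
    } else {
        base_ticks
    }
}
```

A range table like this will misclassify a few programs, which is acceptable here: the goal is to stop strings from playing piano-style figures, not to model every instrument.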

Songs sound better than they did in 0.1.0. This happens every cycle and it’s still slightly surprising every cycle. I don’t know what I expected.

Soundfont Profiles Got Teeth

The soundfont profile system launched with velocity_scale (a global volume multiplier) and velocity_ceilings (per-program max velocity). These are loudness controls — they fix instruments that are too hot relative to the rest of the font.

What they can’t fix is a timbre problem. If a soundfont’s violin samples sound shrill regardless of velocity, capping their velocity makes them quieter and still shrill. The right answer is not to use that program.

0.1.2 adds avoid_programs to the profile format. From assets/soundfonts/profiles/timbres_of_heaven.yaml:

# assets/soundfonts/profiles/timbres_of_heaven.yaml
avoid_programs: [25, 40, 56, 57, 61, 80, 81]
  # 25 = Steel Guitar — bizarre timbre in this font
  # 40 = Violin       — shrill at any velocity
  # 56 = Trumpet      — clips and distorts
  # 57 = Trombone     — same
  # 61 = Brass Sect.  — clips and distorts
  # 80 = Square Lead  — wrong sonic category for orchestral arrangements
  # 81 = Saw Wave     — same

The generator checks avoid_programs when picking instruments and routes around the listed programs, picking an alternative from the same pool. If the entire pool is on the avoid list, it falls back to the full pool rather than produce silence.
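The routing rule is small enough to show directly. A sketch, assuming instrument pools and the avoid list are plain slices of GM program numbers; `usable_programs` is a made-up name for the example.

```rust
/// Drop avoided programs from a candidate pool. If that would leave nothing,
/// return the full pool so the generator never ends up with no instrument.
pub fn usable_programs(pool: &[u8], avoid: &[u8]) -> Vec<u8> {
    let filtered: Vec<u8> = pool
        .iter()
        .copied()
        .filter(|program| !avoid.contains(program))
        .collect();
    if filtered.is_empty() {
        pool.to_vec()
    } else {
        filtered
    }
}
```

With the Timbres of Heaven list above, a pool of `[40, 41, 42]` narrows to `[41, 42]`, while a pool made up only of avoided programs such as `[56, 57]` is returned unchanged.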

Timbres of Heaven (XGM) 4.00(G) got the most thorough overhaul: velocity_scale down to 0.90, 18 per-program velocity ceilings, and seven programs added to avoid_programs. Before this, generating with Timbres of Heaven landed on something that made you want to immediately regenerate roughly 20% of the time. After: it’s a legitimately usable font with known problem areas automatically routed around.

DSoundFontV4 and Roland SC-55 Up got more targeted fixes. Brass Section 61 clips and distorts in DSoundFontV4 at any scale — sample defect, not a loudness issue — so it’s avoided entirely. Roland SC-55 Up’s Synth Brass 1 runs 3× louder than everything else in the font; a single velocity ceiling of 80 handles it.
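For the loudness side, here is a sketch of how velocity_scale and velocity_ceilings might combine. The struct, the scale-then-cap order, and the clamp to 1..=127 are all assumptions made for the example; the post does not say how the real profile code orders these steps.

```rust
/// Hypothetical in-memory form of a profile's loudness controls.
pub struct LoudnessProfile {
    /// Global multiplier applied to every note-on velocity.
    pub velocity_scale: f32,
    /// Per-program maximum velocity, keyed by 0-indexed GM program.
    pub velocity_ceilings: std::collections::HashMap<u8, u8>,
}

impl LoudnessProfile {
    /// Scale first, then apply the program's ceiling if it has one.
    /// Assumes incoming velocities are in 1..=127.
    pub fn apply(&self, program: u8, velocity: u8) -> u8 {
        let scaled = (f32::from(velocity) * self.velocity_scale)
            .round()
            .clamp(1.0, 127.0) as u8;
        match self.velocity_ceilings.get(&program) {
            Some(&ceiling) => scaled.min(ceiling),
            None => scaled,
        }
    }
}
```

In a hypothetical profile with a scale of 0.90 and a ceiling of 80 on one program, a velocity of 127 on that program comes out as 80, while a velocity of 100 on any uncapped program comes out as 90.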

Three FluidSynth Bugs

Two of these had been present since 0.1.0. One was intermittent enough to blame on the environment until it wasn’t.

Bug 1: seek always landed at tick 0

The original seek implementation:

1. Start playback: fluid_player_play(player)
2. Seek to position: fluid_player_seek(player, tick)

This is backwards. FluidSynth’s audio driver runs on a separate thread. The moment fluid_player_play is called, that thread is live and processing ticks. Calling fluid_player_seek after that creates a race: the audio thread may have already processed the first callback before the seek arrives, at which point FluidSynth drops the seek silently and plays from tick 0. No error. No warning. Audio just starts from the beginning.

Calling fluid_player_seek while the player is still in FLUID_PLAYER_READY state — before fluid_player_play — is what actually works. In that state there’s no audio thread competing for the position (src/audio/engine.rs):

pub unsafe fn play_midi_bytes(&mut self, bytes: &[u8], start_tick: i64) -> Result<()> {
    // ... load bytes into player ...
    if start_tick > 0 {
        fluid_player_seek(self.player, start_tick as i32);  // must be while READY
    }
    fluid_player_play(self.player);                          // THEN start
    Ok(())
}

Every “seek to position and play” action — section switching, scrubbing, jumping to a section in the list — had been silently rewinding to the beginning. The UI showed the right position. The audio played from the start. These two facts coexisted for the entire 0.1.0–0.1.1 cycle without producing an error.

Bug 2: stopping caused occasional heap corruption

The crash message, when it appeared:

malloc(): unsorted double linked list corrupted

That’s a C heap report with no meaningful pointer to the actual site. What was happening: fluid_player_stop() signals the player to stop, but the audio driver thread is still running. It keeps processing for a few milliseconds while the player’s state winds down. Deleting the player while the audio thread is still active means the audio thread continues writing into freed memory.

Silencing the synth before stopping the player is the move (src/audio/engine.rs):

pub unsafe fn stop_player(&mut self) {
    for chan in 0..16 {
        fluid_synth_all_sounds_off(self.synth, chan);  // kill voices on audio thread's synth
    }
    fluid_player_stop(self.player);
    fluid_player_join(self.player);   // block until player thread confirms stopped
    delete_fluid_player(self.player);
}

fluid_synth_all_sounds_off kills active voices immediately on the synth side, so by the time the player is deleted the audio thread is writing silence and the synth state is stable. fluid_player_join makes the “stopped” part actually synchronous.

The corruption was ornery: intermittent, and more common on faster machines, where the delete could land before the audio thread had finished its last callbacks. It’s the kind of bug that’s easy to attribute to the wrong thing (a QML state issue, a build flag, memory elsewhere) before you read the FluidSynth source and understand what the threads are actually doing.

Bug 3: BPM set after play (not a crash, just quietly wrong)

A smaller one. Calling fluid_player_set_bpm after fluid_player_play in FluidSynth 2.3+ can trigger an internal re-seek on the first processed tick, interfering with where loop sections restart. Same root cause: the audio thread is live, and the BPM change arrives mid-stream. The fix follows the same pattern as Bug 1: call fluid_player_set_bpm before fluid_player_play, while the player is still in FLUID_PLAYER_READY. No crash, but section-loop behavior was quietly wrong in ways that looked like a position-tracking bug until the real cause turned up.
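Bugs 1 and 3 share one rule: configure while READY, then play. One way to make that rule hard to break is a typestate wrapper, sketched here with a mock that only records call names. The types are hypothetical and no real FluidSynth calls are made; in a real wrapper each method would forward to the matching C function.

```rust
/// Records the order in which the (mock) FluidSynth calls would be issued.
#[derive(Debug, Default)]
pub struct CallLog(pub Vec<&'static str>);

/// A loaded player that has not started yet (FLUID_PLAYER_READY).
pub struct ReadyPlayer {
    log: CallLog,
}

/// A player whose audio thread is live. It has no seek or set_bpm methods,
/// so calling them after play is a compile error rather than a silent race.
pub struct PlayingPlayer {
    log: CallLog,
}

impl ReadyPlayer {
    pub fn new() -> Self {
        ReadyPlayer { log: CallLog::default() }
    }

    /// Stand-in for fluid_player_seek; only callable before play.
    pub fn seek(mut self, _tick: i32) -> Self {
        self.log.0.push("fluid_player_seek");
        self
    }

    /// Stand-in for fluid_player_set_bpm; only callable before play.
    pub fn set_bpm(mut self, _bpm: i32) -> Self {
        self.log.0.push("fluid_player_set_bpm");
        self
    }

    /// Stand-in for fluid_player_play; consumes the READY handle.
    pub fn play(mut self) -> PlayingPlayer {
        self.log.0.push("fluid_player_play");
        PlayingPlayer { log: self.log }
    }
}

impl PlayingPlayer {
    /// The calls issued so far, in order.
    pub fn calls(&self) -> &[&'static str] {
        &self.log.0
    }
}
```

Usage reads as `ReadyPlayer::new().set_bpm(136).seek(1920).play()`. Because `play` takes `self` by value, there is no handle left on which to call `seek` or `set_bpm` afterwards.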

The cxx-qt Signal Problem

This is the one that cost the most time.

cxx-qt generates C++ QObject bindings from annotated Rust structs. When a field is exposed as a Qt property, changes to it emit a NOTIFY signal, which QML Connections handlers listen for. The contract seems clear: Rust setter emits signal, QML handler fires.

In practice:

Setting a property from QML fires the Connections handler reliably. QML does controller.someProperty = newValue, handler fires.

Setting a property from Rust often does not fire the Connections handler. A Rust method — invoked from QML — calls self.as_mut().set_some_property(value). The NOTIFY signal IS emitted. But the Connections { function onSomePropertyChanged() { ... } } handler may not fire. “May not” meaning “usually doesn’t, with no error, no warning, and no indication that anything went wrong.”

This turned up in three separate places during 0.1.2:

onDefaultSoundfontPathChanged set from Rust — fired correctly when set from QML, silently did nothing when set from Rust
Two other handlers in the soundfont and preset systems — neither fired when set from Rust

The pattern, once you’ve seen it enough times, is consistent enough to be a rule: never rely on a QML Connections handler reacting to a property set from Rust. Alternatives that actually work:

Fire a different signal at the method boundary — signals emitted explicitly from the invoked method seem to propagate more reliably than property-change notifications from setters called inside that method
Call a QML function directly after the Rust method returns, rather than waiting for a signal
Read the property directly from QML at call time instead of caching via a signal handler

Why does this happen? My best current understanding: the generated C++ emits the property-change signal through a queued connection when called from a Rust method executing on the Qt thread. The event loop needs to process the queued signal before the Connections handler fires. If there’s a phase mismatch between when the signal is enqueued and when the Connections binding is evaluated, the handler is skipped. I haven’t read enough of cxx-qt’s generated C++ to be fully confident in that explanation. The workarounds work regardless.

The UI Additions Nobody Will Notice

File → Save (Ctrl+S): if a file is already open, saves in place; otherwise opens Save As. This is the behavior everyone expects from every application that handles files, and it was missing in 0.1.1. That’s on me.

File → Save As… (Ctrl+Shift+S): always prompts. Separately bound from Save so saving a copy doesn’t overwrite the working file.

File → Import MIDI…: takes an external MIDI file and adds its tracks to the current song. Useful for importing reference tracks, combining separately-generated sections, or bringing in something you want to edit. It appends, it doesn’t replace.

Default soundfont persistence: the active soundfont is now written to settings on change and restored on next launch. Previously, every launch required re-selecting your soundfont. The app also probes a list of standard system soundfont paths on first launch so it has a reasonable chance of finding something useful automatically. What the scanner finds can quietly affect more than the available calibration profiles.
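The restore-then-probe order can be sketched like this. Both function names are hypothetical, and the list of system paths is supplied by the caller because the post does not list the paths the app actually probes.

```rust
/// Return the first candidate path that exists as a regular file.
pub fn first_existing_soundfont(
    candidates: &[std::path::PathBuf],
) -> Option<std::path::PathBuf> {
    candidates.iter().find(|path| path.is_file()).cloned()
}

/// Prefer the soundfont saved in settings if it is still on disk;
/// otherwise fall back to probing the caller-supplied system paths.
pub fn resolve_soundfont(
    saved: Option<std::path::PathBuf>,
    system_candidates: &[std::path::PathBuf],
) -> Option<std::path::PathBuf> {
    saved
        .filter(|path| path.is_file())
        .or_else(|| first_existing_soundfont(system_candidates))
}
```

Checking that the saved path still exists matters in practice: a soundfont that was moved or deleted since the last launch should trigger the probe rather than an error.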

None of these are interesting to write about, which is roughly proportional to how much it matters that they work.

Lessons

Study the source material. “I know what this should sound like” is not calibration. The Last Dungeon correction (92 → 136 BPM) came directly from listening to a lot of JRPG endgame tracks and noticing they’re consistently among the fastest in their respective games — the opposite of the assumption baked into the original preset. That’s not something intuition produces, and the intuition pointed in the wrong direction. If you’re modeling a genre, study actual examples and correct against what you find.

FluidSynth’s threading model is not optional. It’s a C library with an audio thread that is live the moment you call fluid_player_play. Every operation that touches player or synth state after that point is a potential race. The documentation mentions this, but “mentions” is different from “makes you feel the consequence” — the seek bug was there for the full 0.1.0–0.1.1 cycle because it was consistent enough to look like a feature and not obviously broken until you measured the behavior. Read the source for any C audio library you’re wrapping in Rust. The safety boundary you build in Drop is only as good as your understanding of what the C side is doing on its threads.

cxx-qt signals set from Rust are unreliable; work around them from the QML side. This may be a design constraint rather than a bug, but either way: if your UI depends on reacting to Rust-initiated property changes via Connections handlers, you will spend time debugging something that produces no error output. Pull state from QML at the point you need it; don’t wait for Rust to push it through a property-change signal.

Sustained instruments are a different problem than attack instruments. A piano plays a note and the decay handles itself. A violin plays a note and holds it for as long as the note lasts, and possibly longer if the release is slow. A generator tuned for piano-family instruments produces string writing that sounds like a pianist who got stuck holding the keys down. GM program number ranges are a good enough proxy for the distinction; they don’t need to be perfect to be useful.

Still Unanswered

The cxx-qt signal explanation I landed on is still a guess: I described a queued-connection phase mismatch, but I haven’t read enough of the generated C++ to know if that’s actually what’s happening or just a story that fits the symptoms. Anyone who knows whether this is a documented constraint or a bug worth filing, I’m curious. On the calibration side: studying real soundtracks fixed the obvious intuition errors, but how much of “what JRPG music sounds like” is measurable versus things I’m still getting wrong because I don’t know to measure them? And avoid_programs routes around bad timbres per soundfont. Is hand-curating that list per font sustainable, or does it become something the analysis pipeline generates automatically at some point?
