QMidiGen 0.1.3: The Generator Got a Stress Test
TL;DR: QMidiGen 0.1.3 is out. The headlining addition is a rotation test tool that generates all 51 JRPG subtypes across as many as 1000 seeds (no UI, no audio hardware) and reports structural quality metrics, finishing in about 103 seconds for 51k songs. Battle generation got the most substantive overhaul yet: stinger outro archetypes, a fix for a double-channel-3 bug that had been quietly duplicating tracks, and a two-tier velocity ceiling that stopped flat dynamics from washing out the driving energy of 16th-note combat cells. Raising the base velocity from 65 to 82 to match measured JRPG source material was the right call, and it also immediately broke all 27 soundfont profiles, each of which then needed individual recalibration. The build system also hit a Cargo 1.77 regression that took some creative build.rs surgery to survive.
| Section | Summary |
|---|---|
| The Rotation Test | Generating thousands of songs fast to find structural bugs |
| The Velocity Problem | Why 65 was wrong, and the cascade that followed |
| Battle Generation Overhaul | Outro archetypes, the ch3 double bug, guitars that don’t belong, and texture additions |
| Soundfont Profile Recalibration | Recalibrating all 27 profiles after the base velocity change |
| Build System Pain | Cargo 1.77+, lib+bin conflicts, and lib_stub.rs |
| UI Additions | Per-track randomize buttons and a transport visualizer |
| CI/CD: SHA256 Checksums | Artifact checksums across three platforms |
| The Future: LLM Audio Analysis | The idea behind the rotation test, taken one step further |
The Rotation Test
The fundamental problem with a procedural music generator is that it has a combinatorially large output space and you can only listen to one song at a time. Manual testing means: generate a song, notice something weird, trace it, fix it, generate another song. That works fine when the bug is obvious and reproducible. It works badly when the bug only shows up in certain subtypes, certain seed ranges, or only after a specific combination of stanza roles fires. By the time 0.1.2 shipped, there were 51 subtypes and “does this still generate correctly” had become a genuinely open question for any generator change.
The rotation_test tool solves this by generating every subtype across N seeds (no UI, no audio driver, no FluidSynth) and analyzing each song structurally. It lives in tools/rotation_test/ as a standalone workspace member, pulls the pure-Rust music modules directly via #[path], and runs via cargo run -p rotation_test. At 1000 seeds (51k songs), it finishes in about 103 seconds. At 10 seeds, the quick sanity check, it takes a few seconds.
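As a rough illustration of the sweep, here is a minimal sketch of a panic-tolerant loop over seeds. Only std::panic::catch_unwind comes from the tool itself; Song, SubtypeReport, sweep, and the generate callback are hypothetical stand-ins, not the real rotation_test API.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical stand-in for a generated song: just the melody's MIDI pitches.
pub struct Song {
    pub melody: Vec<u8>,
}

/// Outcome of generating one subtype across a range of seeds.
#[derive(Debug, PartialEq)]
pub struct SubtypeReport {
    pub generated: usize, // songs that came back without panicking
    pub panics: usize,    // generator calls that unwound
    pub empty: usize,     // songs with a zero-note melody
}

/// Run `generate(subtype, seed)` for seeds 0..seeds, catching panics so one
/// bad seed cannot abort the whole sweep.
pub fn sweep<F>(subtype: &str, seeds: u64, generate: F) -> SubtypeReport
where
    F: Fn(&str, u64) -> Song,
{
    let mut report = SubtypeReport { generated: 0, panics: 0, empty: 0 };
    for seed in 0..seeds {
        // AssertUnwindSafe: the closure only reads `generate`, and any
        // half-built song is discarded when a panic unwinds.
        match panic::catch_unwind(AssertUnwindSafe(|| generate(subtype, seed))) {
            Ok(song) => {
                report.generated += 1;
                if song.melody.is_empty() {
                    report.empty += 1;
                }
            }
            Err(_) => report.panics += 1,
        }
    }
    report
}
```

Calling sweep once per subtype and summing the reports gives the crash and empty-melody counts without ever touching an audio device.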
What it actually checks, per stanza:
- Crashes: caught via std::panic::catch_unwind. Zero panics across 51k songs at 0.1.3 ship time.
- Missing melody track or zero notes: turned up one subtype where a density edge case was silently producing empty bars.
- Melody 2 presence and pitch similarity (battle only): melody_similarity() computes per-bar Jaccard overlap between the primary melody and the second voice. A score above 65% triggers a warning. Before fixing the ch3 double bug (more below), octave-double mode was occasionally landing on the same register and reading as high similarity.
- Bar repetition rate: fraction of non-empty melody bars whose pitch fingerprint appears more than once. Above 75% triggers a warning. The 0.1.2 generator was sitting around 40% for most subtypes; measured JRPG source material sits around 76%. Raising the repetition rate to match source meant deliberately returning to literal motif restatement more often, which felt counterintuitive but turned out to be correct.
- Velocity range: max minus min velocity across the melody. Below 5 with 8+ notes is the “flat dynamics” warning. This is what flagged the two-tier velocity ceiling bug (discussed in the battle section).
The output for a clean run looks like this:
```
[OK] Combat 10/10 m2:22% rep:54% vel:31
[OK] BossFight 10/10 m2:19% rep:61% vel:28
[!!] ModernBattle 10/10 m2:18% rep:48% vel: 3
     → flat dynamics (vel range = 3)
```

That vel: 3 on ModernBattle was real. The unconditional velocity ceiling on all notes, including the fast 16th-note cells, was compressing everything to the same value. The rotation test caught it; I would not have caught it in manual listening without specifically generating a lot of ModernBattle and paying close attention to that specific problem.
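The three melody checks listed above can be sketched roughly as follows. The thresholds and definitions come from the list; the Bar layout, the use of absolute MIDI pitches rather than pitch classes, and averaging the Jaccard score over bars are assumptions, so treat this as an illustration rather than the tool's actual code.

```rust
use std::collections::{BTreeSet, HashMap};

/// One bar of a track as (pitch, velocity) pairs; a hypothetical layout.
pub type Bar = Vec<(u8, u8)>;

/// Mean per-bar Jaccard overlap of the pitch sets of two voices, in 0.0..=1.0.
/// Bars where both voices are silent are skipped.
pub fn melody_similarity(lead: &[Bar], second: &[Bar]) -> f64 {
    let mut total = 0.0;
    let mut counted = 0usize;
    for (a, b) in lead.iter().zip(second) {
        let sa: BTreeSet<u8> = a.iter().map(|n| n.0).collect();
        let sb: BTreeSet<u8> = b.iter().map(|n| n.0).collect();
        let union = sa.union(&sb).count();
        if union == 0 {
            continue;
        }
        total += sa.intersection(&sb).count() as f64 / union as f64;
        counted += 1;
    }
    if counted == 0 { 0.0 } else { total / counted as f64 }
}

/// Fraction of non-empty bars whose pitch sequence occurs more than once.
pub fn repetition_rate(bars: &[Bar]) -> f64 {
    let mut seen: HashMap<Vec<u8>, usize> = HashMap::new();
    for bar in bars.iter().filter(|b| !b.is_empty()) {
        *seen.entry(bar.iter().map(|n| n.0).collect()).or_insert(0) += 1;
    }
    let non_empty: usize = seen.values().sum();
    let repeated: usize = seen.values().filter(|&&c| c > 1).sum();
    if non_empty == 0 { 0.0 } else { repeated as f64 / non_empty as f64 }
}

/// Max minus min velocity, plus whether the flat-dynamics warning fires
/// (range below 5 with at least 8 notes).
pub fn velocity_range(bars: &[Bar]) -> (u8, bool) {
    let vels: Vec<u8> = bars.iter().flatten().map(|n| n.1).collect();
    let (Some(&max), Some(&min)) = (vels.iter().max(), vels.iter().min()) else {
        return (0, false);
    };
    let range = max - min;
    (range, range < 5 && vels.len() >= 8)
}
```

Using absolute pitches means a second voice doubled an octave away scores near zero, while one that lands in the lead's register scores high, which is the symptom the similarity warning is meant to surface.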
The Velocity Problem
A comparison of 51k generated songs against 228 source JRPG MIDIs — Final Fantasy IV, V, and VI, Chrono Trigger, Suikoden II, Lufia 2, Breath of Fire III — turned up a stark discrepancy:
| Metric | Generated | Source |
|---|---|---|
| Mean velocity | 67 | 101 |
| Bar repetition | 40% | 76% |
| Tracks per song | ~8 | ~13 |
| Counter melody threshold | 8 bars | — |
Generated songs were about 34 velocity points too quiet across the board. The source composers were hitting hard, repeating their hooks far more than felt comfortable, and stacking more voices than the generator was building. These aren’t aesthetic preferences — they’re measurable facts about what the source material actually sounds like.
The fix was straightforward in concept: raise phrase_velocity base from 65 to 82. The measured output mean of 67 is higher than the base because velocity variation pushes some notes above the floor — the base is what actually needed moving. In practice, raising it immediately invalidated every soundfont profile that had velocity ceilings calibrated against the old 65 base. The GeneralUser GS profile went from 5 ceilings to 47. All 27 included profiles needed to be remeasured and adjusted to avoid double-compression — a ceiling of 80 that was meaningfully capping velocity at the old base was now catching notes too aggressively and squashing dynamics in the wrong direction.
The repetition fix meant changing melody generation to return to literal motif restatement more often, even within a section. The counter melody threshold dropped from 8 bars to 4 so more songs actually get the second voice. Neither of these felt right intuitively but both matched source behavior.
Battle Generation Overhaul
Battle was the most-changed area in 0.1.3, which is fitting because battle generation had accumulated the most technical debt from the earlier “just get something working” era.
Outro archetypes
Battle stingers (the intro-role stanza that fires when combat starts) now have an OutroArchetype enum that mirrors the wind-up gesture system introduced in 0.1.2. Three archetypes:
- QuickCutoff (1–2 bars): melody stabs once then stops. Default for non-boss combat.
- Decay (2–3 bars): melody steps down from its peak, getting quieter.
- TensionHold (2–4 bars): melody freezes on a suspension — the dramatic boss pause.
The bias rules matter: non-boss subtypes are 60% QuickCutoff and capped hard at 2 bars, because a long outro drains urgency from the loop. Boss subtypes get the full range. Pursuit always QuickCutoffs — it literally can’t afford to breathe. The render_outro_anacrusis seam fix runs on the last outro bar to build the rising gesture back into the next wind-up, so the loop transition doesn’t clunk.
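A minimal sketch of those bias rules, assuming the caller supplies two random rolls in 0.0..1.0. The archetype names, bar ranges, 60% QuickCutoff bias, 2-bar cap, and the Pursuit rule come from the text; pick_outro, the is_boss and is_pursuit flags, and every split other than the 60% figure are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OutroArchetype {
    QuickCutoff,
    Decay,
    TensionHold,
}

impl OutroArchetype {
    /// Inclusive (min, max) outro length in bars.
    pub fn bar_range(self) -> (u8, u8) {
        match self {
            OutroArchetype::QuickCutoff => (1, 2),
            OutroArchetype::Decay => (2, 3),
            OutroArchetype::TensionHold => (2, 4),
        }
    }
}

/// Pick an archetype and a length from two rolls in 0.0..1.0.
pub fn pick_outro(is_boss: bool, is_pursuit: bool, roll: f32, len_roll: f32) -> (OutroArchetype, u8) {
    let archetype = if is_pursuit {
        // Pursuit always cuts off quickly.
        OutroArchetype::QuickCutoff
    } else if is_boss {
        // Assumption: an even three-way split for boss subtypes.
        match roll {
            r if r < 1.0 / 3.0 => OutroArchetype::QuickCutoff,
            r if r < 2.0 / 3.0 => OutroArchetype::Decay,
            _ => OutroArchetype::TensionHold,
        }
    } else if roll < 0.60 {
        OutroArchetype::QuickCutoff
    } else if roll < 0.80 {
        // Assumption: the remaining 40% is split evenly.
        OutroArchetype::Decay
    } else {
        OutroArchetype::TensionHold
    };

    let (min, max) = archetype.bar_range();
    // Non-boss subtypes are capped hard at 2 bars.
    let max = if is_boss { max } else { max.min(2) };
    let span = (max - min + 1) as f32;
    let bars = min + ((len_roll * span) as u8).min(max - min);
    (archetype, bars)
}
```

Passing the rolls in rather than an RNG keeps the sketch deterministic, so the bias table can be checked without seeding anything.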
The channel 3 double bug
Channel 3 hosts either the Melody 2 double (battle) or the Counter Melody (non-battle) — never both. Before 0.1.3, battle subtypes that happened to have a counter: pool entry in jrpg.yaml were generating both tracks, with both trying to write to the same MIDI channel. The result was instrument-change conflicts mid-song and, occasionally, a doubled lead that read as absurdly loud in the velocity analysis.
The fix is one guard: add_counter is gated by !use_battle_melody. If battle melody is active, counter melody doesn’t run. The rotation test’s M2 similarity metric is what made the symptom measurable rather than just a vague sense that something sounded off.
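In isolation, the guard might look like the sketch below. The add_counter and use_battle_melody names come from the text; Ch3Voice, channel3_voice, and has_counter_pool are hypothetical, and the sketch assumes an active battle melody always claims channel 3 for the Melody 2 double.

```rust
/// Which voice, if any, owns MIDI channel 3 for a given song.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Ch3Voice {
    Melody2Double,
    CounterMelody,
    Unused,
}

pub fn channel3_voice(use_battle_melody: bool, has_counter_pool: bool) -> Ch3Voice {
    // The 0.1.3 guard: counter melody never runs while battle melody is active.
    let add_counter = has_counter_pool && !use_battle_melody;
    if use_battle_melody {
        Ch3Voice::Melody2Double
    } else if add_counter {
        Ch3Voice::CounterMelody
    } else {
        Ch3Voice::Unused
    }
}
```

Returning a single enum value makes the "never both" rule structural: there is no state in which two tracks are assigned to the channel.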
Two-tier velocity ceiling
The velocity ceiling system was applying the per-program profile cap unconditionally to all notes. This is correct for sustained tones — a Trumpet held for two beats at 112 velocity will blow the face off the mix. It’s wrong for short attack notes. A 16th-note driving cell at 90 BPM with every note capped at the same value produces what the rotation test correctly calls “flat dynamics”: velocity range of 3, which sounds robotic and lifeless.
The fix:
```rust
let vceil = velocity_ceiling_for(program, &cfg.soundfont_profile);
let vel = if d >= 0.75 {
    vel.min(vceil)
} else {
    vel.min(vceil.saturating_add(20))
};
```

Sustained notes (d ≥ 0.75) get the hard ceiling. Short attack notes get ceiling+20 headroom. The rotation test’s flat dynamics warning is what surfaced this; without it, the only way to catch it was staring at a piano roll and noticing that all velocities were converging on the same number.
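To see the two tiers on concrete numbers, here is a small sketch that wraps the same branch in a function. cap_velocity and duration_beats are hypothetical names, the unit of d is assumed to be beats, and the final clamp to 127 is an added safety assumption rather than something the original snippet does.

```rust
/// Two-tier cap: sustained notes get the hard ceiling, short notes get
/// 20 points of headroom above it.
pub fn cap_velocity(vel: u8, vceil: u8, duration_beats: f32) -> u8 {
    let cap = if duration_beats >= 0.75 {
        vceil
    } else {
        vceil.saturating_add(20)
    };
    // 127 is the MIDI velocity maximum.
    vel.min(cap).min(127)
}

/// Apply the cap to a run of equal-length notes and report max minus min.
pub fn capped_range(vels: &[u8], vceil: u8, duration_beats: f32) -> u8 {
    let capped: Vec<u8> = vels
        .iter()
        .map(|&v| cap_velocity(v, vceil, duration_beats))
        .collect();
    match (capped.iter().max(), capped.iter().min()) {
        (Some(max), Some(min)) => max - min,
        _ => 0,
    }
}
```

With a ceiling of 80, the velocities 88, 95, 102, 91 all collapse to 80 when treated as sustained notes (range 0), but come out as 88, 95, 100, 91 when treated as 16th notes (range 12), which is enough to clear the flat-dynamics warning.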
Guitars do not belong here
Every guitar program (24–31) was removed from all battle subtype melody pools. Guitars return false from is_sustained_instrument, which means battle melody rendering doesn’t apply the 1.4× note-length extension for them. On a dense 8th-note driving cell at 160 BPM, that produces rapid staccato clicks rather than anything resembling a melody. This is the kind of thing that sounds wrong immediately when you hear it but is easy to miss when you’re staring at instrument list indexes. Replacements: Choir (52), Synth Brass (62), Violin (40), Harmonica (22) for Showdown’s western flavor.
The BossFight bass pool also had Slap Bass in it. The FinalBoss bass pool had Distortion Guitar. Both were removed without ceremony.
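A pool audit along these lines could keep guitars from creeping back in. This is only a sketch: guitars_in_pool is a hypothetical helper, and it assumes pools are plain lists of 0-indexed General MIDI program numbers as used throughout this post.

```rust
/// Return any guitar programs (24..=31) still present in an instrument pool.
pub fn guitars_in_pool(pool: &[u8]) -> Vec<u8> {
    pool.iter()
        .copied()
        .filter(|program| (24..=31).contains(program))
        .collect()
}
```

Run against every battle melody pool, an empty result means the pool is clean; anything else names the offending programs directly.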
Phrygian cadences
EpicConfrontation and AncientEvil always promised Phrygian color in their design notes. They now actually deliver it. Chord progressions [1, 0, 5, 4] and [1, 0, 4, 0] implement the ♭II→i cadence that makes that characteristic dark-theatrical resolution work. These are the only two subtypes with Phrygian cadences; putting them anywhere else would be wrong for those subtypes’ intended sound.
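To make the cadence concrete, the sketch below maps progression entries to chord roots, assuming the numbers are 0-based scale-degree indices into the Phrygian mode (so 1 is ♭II and 0 is i). That reading is consistent with the ♭II→i description but is an assumption about the generator's encoding; chord_roots is a hypothetical helper.

```rust
/// Semitone offsets of the Phrygian mode from its tonic.
pub const PHRYGIAN: [u8; 7] = [0, 1, 3, 5, 7, 8, 10];

/// Map 0-based scale-degree indices to MIDI note numbers for the chord roots.
pub fn chord_roots(tonic: u8, progression: &[usize]) -> Vec<u8> {
    progression
        .iter()
        .map(|&degree| tonic.saturating_add(PHRYGIAN[degree % 7]))
        .collect()
}
```

With E as the tonic (MIDI 52), [1, 0, 5, 4] yields roots F, E, C, B: the opening F to E motion is the half-step drop onto the tonic that gives the mode its color.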
Texture additions
A few smaller generator changes worth noting:
Per-section seed — each stanza now draws its chord progression from song_seed.wrapping_add(i * GOLDEN) so sections don’t repeat the same root sequence. Previously all sections were pulling from the same seed position, which meant verse, chorus, and bridge could end up with identical harmonic motion.
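A minimal sketch of the per-section derivation. The value of GOLDEN is not given in this post, so the 64-bit golden-ratio constant below is an assumption, as is the u64 seed type; section_seed is a hypothetical name. The sketch uses wrapping_mul for the product so that debug builds do not panic on overflow.

```rust
/// Assumption: the 64-bit golden-ratio mixing constant.
pub const GOLDEN: u64 = 0x9E37_79B9_7F4A_7C15;

/// Derive a distinct seed for section `i` of a song.
pub fn section_seed(song_seed: u64, i: u64) -> u64 {
    song_seed.wrapping_add(i.wrapping_mul(GOLDEN))
}
```

Because the constant is odd, distinct section indices always produce distinct seeds modulo 2^64, so verse, chorus, and bridge never share a seed position.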
Rhythmic pad pulse — ModernBattle, EpicConfrontation, and GrandFinale pad tracks now use a beat-synced 8th-note pulse pattern instead of holding a static chord. This matters more than it sounds; a sustained pad underneath a driving 16th-note battle line reads as glue, but the same pad in a modern/orchestral context was sitting there doing nothing interesting.
Grace notes, ghost notes, neighbor tones, anticipations — added to melody and harmony tracks for textural variety. Not dramatic, but the difference between a generator that places pitches and one that phrases them.
Soundfont Profile Recalibration
All 27 bundled profiles were remeasured after the base velocity change. The GeneralUser GS profile went from 5 ceilings to 47, calibrated against newly-pooled bright programs that were now hitting at 82 instead of 65. Two new profiles were added: colombo_mt32.yaml and phoenix_mt32.yaml.
The Timbres of Heaven overhaul from 0.1.2 also needed a follow-up: the velocity_scale: 0.90 multiplier had been a dead field — the code that applied it didn’t exist yet. It now actually runs in 0.1.3, which meant Timbres of Heaven’s effective velocities dropped by 10% globally, requiring ceiling adjustments to avoid double-compressing the programs that already had caps. Trumpet (56) had been in avoid_programs but turned out to be fine once the scale was active and the ceiling was tuned — it’s available again for Castle and VictoryFanfare use.
The pattern for all profiles: velocity_scale multiplies vel_boost at generation time; ceilings are applied after scaling. If both exist on the same program, they stack, which is almost always too aggressive. The recalibration work was mostly finding and removing that double-compression.
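The stacking is easy to see with numbers. The sketch below simplifies by applying velocity_scale directly to a note velocity (the real code scales vel_boost, which is not shown here) and then applying the optional ceiling; effective_velocity is a hypothetical helper.

```rust
/// Scale first, then cap: the order described above.
pub fn effective_velocity(vel: u8, velocity_scale: f32, ceiling: Option<u8>) -> u8 {
    let scaled = (vel as f32 * velocity_scale).round().clamp(1.0, 127.0) as u8;
    match ceiling {
        Some(cap) => scaled.min(cap),
        None => scaled,
    }
}
```

A note at 110 with a 0.90 scale lands at 99 on its own; add a ceiling of 80 that was tuned before the scale was live and it drops to 80, while a note at 80 falls to 72 and never reaches the ceiling at all. The cap is now cutting against an already-reduced signal, which is the double-compression the recalibration removed.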
Build System Pain
Adding a src/lib.rs for the rotation test crate created a lib target alongside the existing bin. In Cargo 1.77+, when a build script emits any cargo:: (double-colon) directive, cargo: (single-colon) directives from cxx-qt-build stop reaching the linker. The symptom is undefined symbol: cxx_qt_init_qml_module_com_qmidigen at link time — not a compile error, a link error, which takes slightly longer to trace back to “oh, Cargo changed how it handles mixed directive styles.”
The fix in build.rs is to explicitly re-emit the critical archives using cargo::rustc-link-arg-bins after calling CxxQtBuilder::build(). Platform-conditional:
```rust
// Linux
"-Wl,--push-state,--whole-archive",
"-lcxx_qt_lib_qml_plugin",
"-Wl,--pop-state",
// Windows MSVC
"/WHOLEARCHIVE:cxx_qt_lib_qml_plugin.lib",
// macOS
"-Wl,-force_load,libcxx_qt_lib_qml_plugin.a",
```

lib_stub.rs is the companion piece: the actual src/lib.rs points at the stub with required-features = ["_internal_lib_target"], which suppresses the lib target from every normal build invocation. All playbook cargo build calls use --bin qmidigen for the same reason.
This is not elegant. It’s the kind of thing you write when Cargo changes behavior on you and you need the build to work on Monday.
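For reference, the platform-conditional re-emission could be organized roughly like this. The linker arguments are the ones listed above and cargo::rustc-link-arg-bins is the directive named in the text; link_directives is a hypothetical helper, and the real build.rs may be structured differently.

```rust
/// Build the `cargo::rustc-link-arg-bins` lines for one target.
/// `target_os` and `target_env` mirror CARGO_CFG_TARGET_OS and
/// CARGO_CFG_TARGET_ENV, which Cargo sets for build scripts.
pub fn link_directives(target_os: &str, target_env: &str) -> Vec<String> {
    let args: &[&str] = match (target_os, target_env) {
        ("linux", _) => &[
            "-Wl,--push-state,--whole-archive",
            "-lcxx_qt_lib_qml_plugin",
            "-Wl,--pop-state",
        ],
        ("windows", "msvc") => &["/WHOLEARCHIVE:cxx_qt_lib_qml_plugin.lib"],
        ("macos", _) => &["-Wl,-force_load,libcxx_qt_lib_qml_plugin.a"],
        // Unknown targets get no extra arguments.
        _ => &[],
    };
    args.iter()
        .map(|arg| format!("cargo::rustc-link-arg-bins={arg}"))
        .collect()
}
```

In build.rs, the two values would be read with std::env::var after CxxQtBuilder::build() returns, and each returned line printed with println! so Cargo picks it up.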
UI Additions
Two UI additions in 0.1.3:
Per-track randomize buttons in the piano roll header — ↻ randomizes the instrument, ?♪ randomizes the notes. Both are lock-aware: locked tracks skip randomization silently. The motivation was making it easier to explore variations on a generated song without regenerating the entire stanza.
Audio visualizer on the transport bar — an 8-bar beat-synchronized canvas animation. Uses fillRect, not roundRect — the latter was added in Qt 6.4 and silently does nothing on earlier versions. Uses a Connections block to react to isPlayingChanged from outside the Canvas rather than a local onIsPlayingChanged property handler, which would be unreliable under cxx-qt’s signal model. Both of these are documented gotchas at this point.
CI/CD: SHA256 Checksums
Release artifacts now ship with checksums. Each build job generates per-artifact .sha256 files using the platform-native tool (sha256sum on Linux, shasum on macOS, Get-FileHash on Windows). Both create-release and update-latest-release concatenate them into a single SHA256SUMS file in standard format, upload to the Generic Package Registry, and link it from the release page.
The pipeline also got deduplication rules to stop double-builds on push+tag events, and the latest tag is now suppressed from pipeline triggers so a latest tag update doesn’t immediately re-trigger its own release pipeline. Neither of these was a showstopper, but both were producing confusing CI behavior that was worth cleaning up.
The Future: LLM Audio Analysis
The rotation test is one half of an idea. The other half isn’t implemented yet, but it’s worth stating: the longer-term goal is to render each generated song as a WAV file at a significantly elevated BPM — fast enough that hundreds of songs can be processed quickly — slow the audio back down to listenable speed, and feed the result to an LLM to scan for repetitive melodic patterns, structural bugs, and quality regressions that structural metrics alone can’t catch.
The rotation test as it stands catches crashes, missing tracks, flat dynamics, and samey bar content. What it can’t catch is “this counter melody pattern keeps landing in the wrong register” or “the call-and-response voice sounds like it’s answering the wrong question.” Those are listening problems, not counting problems.
There’s an intermediate version that doesn’t need an LLM at all: render 200 WAVs, run a pitch tracker and onset detector to extract audio-domain features, and compare those against the same statistical targets used for the rotation test. The WAV rendering path already exists — FluidSynth’s export API is functional and the export path is exercised in production. Dumping hundreds of WAV files in under a second is feasible today. The analysis layer is the unsolved part.