SKILL.md (8955B)
---
name: fluidstudio
description: Programmable music notation with audio+video frame sync — write music as text, compile to MIDI/WAV, generate per-frame CSVs for video sync with PIL, ffmpeg, Blender, and more.
category: media
---

# FluidStudio — Interactive Music + Visual Sync

## What This Is

FluidStudio is a programmable music notation system with **frame-accurate audio+video synchronization**. You write music as text (`.flsp` files), compile it to MIDI and WAV stems, then generate per-frame CSVs that drive visual output — shapes, text, glitch effects, 3D scenes — all locked to the beat.

**The goal:** interactive, real-time sync between audio and visuals. Not pre-rendered video — a system where every frame knows what every instrument is doing.

## Setup

```bash
# System deps (Debian)
sudo apt install fluidsynth ffmpeg

# Python
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Requires: `mido`, `numpy`, `Pillow`. Optional: Blender with MCP for 3D scenes.
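
Before compiling anything, it can help to confirm that the setup above actually took effect. The following is a minimal sketch and not part of FluidStudio: the helper name `check_setup` is hypothetical, and it assumes only the executable names `fluidsynth` and `ffmpeg` and the import names `mido`, `numpy`, and `PIL` (the import name Pillow installs under).

```python
import importlib.util
import shutil


def check_setup(executables=("fluidsynth", "ffmpeg"),
                modules=("mido", "numpy", "PIL")):
    """Return the names of executables/modules that cannot be found.

    Hypothetical helper, not part of the FluidStudio tools package.
    """
    # shutil.which returns None when the program is not on PATH.
    missing = [exe for exe in executables if shutil.which(exe) is None]
    # find_spec returns None when a top-level module is not installed.
    missing += [mod for mod in modules
                if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    missing = check_setup()
    print("missing:", ", ".join(missing) if missing else "nothing")
```

Run it inside the activated virtual environment, since the module check only sees packages installed for the interpreter that executes it.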

## The Project (You Are Here)

```
fluidstudio/
├── asset/soundfonts/    # SF2 files — check INDEX.md for ranges
├── tools/               # Python package
│   ├── __init__.py      # from tools import compile_project
│   ├── parser.py        # .flsp notation parser
│   ├── midi_gen.py      # MIDI generation + FluidSynth rendering
│   ├── compiler.py      # partition → stems → mix
│   ├── clips.py         # per-frame CSV generator (the sync engine)
│   ├── fps.py           # BPM → FPS calculator
│   └── video.py         # basic video from clips (reference impl)
├── projects/            # songs go here
└── SKILL.md             # this file
```

## Project Structure (Per Song)

```
projects/<song_name>/
├── partition/
│   ├── header.json      # bpm, key, octave, mode, channels
│   └── <voice>.flsp     # one file per stem (filename = channel name)
├── stems/               # compiled WAV stems (exact beat duration)
├── mix/                 # mixed WAV
├── midi/                # compiled MIDI files
├── clip/                # per-frame CSVs (the sync data)
└── render/              # final output (video, etc.)
```

## Header Format (`partition/header.json`)

```json
{
  "project": "my_song",
  "bpm": 120,
  "key": 0,
  "octave": 4,
  "mode": "ionian",
  "channels": [
    {"name": "piano", "sf2": "rkhive/jrhodes3a-looped/jrhodes3a-looped.sf2", "octave": 5},
    {"name": "drums", "sf2": "rkhive/ghosdrum/ghosdrum.sf2", "octave": 1}
  ]
}
```

- `key`: semitone root (0=C, 7=G). Combined with `mode` → chord suggestions.
- `octave`: base octave (4 = C4 = MIDI 60).
- `mode`: chord suggestion only — fully chromatic, no restrictions.
- `sf2`: path relative to `asset/soundfonts/`. Check `INDEX.md` for ranges.

**Choosing instruments:** Check `asset/soundfonts/INDEX.md` for optimal ranges. Some SF2s only sound good in specific octaves. Always verify before assigning.

## Notation Format (`.flsp`)

Chromatic semitone system — `0=C, 1=C#, 2=D, ..., 11=B` relative to octave in header.
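
To make the pitch arithmetic concrete: the header section states that octave 4 puts semitone 0 at C4 = MIDI 60, which matches the common convention `midi = 12 × (octave + 1) + semitone`. The sketch below assumes that convention. The names `to_midi` and `note_name` are hypothetical, and how the real parser treats semitone values outside 0..11 is not documented here.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def to_midi(semitone, octave=4):
    """Map a chromatic semitone (0=C .. 11=B) and an octave to a MIDI note.

    Assumes the C4 = MIDI 60 convention given for the header's `octave`.
    """
    midi = 12 * (octave + 1) + semitone
    if not 0 <= midi <= 127:
        raise ValueError(f"MIDI note {midi} is outside 0..127")
    return midi


def note_name(midi):
    """Return a readable name such as 'C4' for a MIDI note number."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


if __name__ == "__main__":
    # Semitone 7 (G) at the header's base octave 4.
    print(to_midi(7), note_name(to_midi(7)))
    # Semitone 0 (C) on a channel whose octave is overridden to 5.
    print(to_midi(0, octave=5), note_name(to_midi(0, octave=5)))
```

Under this assumption a channel with `"octave": 1` places semitone 0 at MIDI 24, which is why checking each soundfont's usable range matters before picking an octave.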

```
# melody
- 0:b/2 (legato: on)   // C, portamento
- 4:b/2                // E
---
- 7:b                  // G
---
- 9:b/2                // A
- 11:b/2               // B
---
- 0:b (legato: off)    // C
```

- `---` = beat divider. Lines between dividers = subdivisions.
- Duration: `b` = 1 beat, `b/2` = half, `b/3` = triplet, `2b` = 2 beats.
- Chords: `0xm69` = C as m69 chord. Velocity: `(v:60)`. Legato: `(legato: on)`.
- Rests: empty section between `---` = full beat rest.

## Compiling

```bash
python -m tools.compiler projects/my_song
```

Or from Python:
```python
from tools.compiler import compile_project
compile_project("projects/my_song")
```

## The Frame Sync System

This is the core of FluidStudio. The frame sync system generates per-frame CSVs that tell you exactly what every instrument is doing at every frame.

```bash
python -m tools.clips projects/my_song
# Output: projects/my_song/clip/*.csv (one per bar)
```

**Each CSV row = one frame. Columns = instruments. Cell values = MIDI notes playing at that frame.**

### Frame Rate

Auto-calculated from BPM for clean sync:

```
fps × 60 = frames per minute
frames_per_beat = (fps × 60) / BPM
```

For clean sync, `frames_per_beat` must be an integer. `tools/fps.py` calculates this:

```bash
python -m tools.fps 120  # → fps=24, frames/beat=12
python -m tools.fps 100  # → fps=40, frames/beat=24
```

Override with `--fps N` if you need a specific rate.

### What You Can Do With Frame Data

The clip CSVs are the bridge between audio and visuals. Each frame knows which notes are active, which instruments are playing, which beats are hitting. **Use this to drive anything:**

**PIL / Pillow** — Generate frames programmatically. Draw shapes that pulse with instruments, colors that shift with chords, particles that spawn on note-on events.
`tools/video.py` is a reference implementation (disks on note-on, fade over 1s) — go beyond it.

**ffmpeg filters** — Apply glitch effects at precise moments. `rgbashift` on snare hits, `curves` shifts on chord changes, `zoompan` on bass drops. The frame CSV tells you exactly which frames to target.

**Text / Matrix effects** — Scroll text at beat subdivisions. Spawn characters on note-on. Change font size with velocity. Glitch text on off-beats.

**Blender + MCP** — Set Blender's frame rate to match the project's fps. Read the clip CSVs to keyframe object positions, material colors, light intensities, camera moves. Flash lights on kick drums. Rotate objects on chord changes. Spawn geometry on note-on events.

**Any visual tool** — TouchDesigner, Processing, p5.js, shader uniforms, LED strips, projection mapping. The CSVs are just data — (frame, instrument, notes). Map it to anything.

### Blender Workflow

1. Set Blender scene fps to match project fps (from `tools/fps.py`)
2. Read clip CSVs in a script or via MCP
3. Keyframe objects per frame: position, rotation, scale, material, visibility
4. Sync camera moves to beat structure
5. Render frames → combine with audio in ffmpeg

## Music Theory — Be Creative

**Key + mode are suggestions, not rules.** The notation is fully chromatic — any note works anywhere. Key and mode exist to suggest which chords feel "at home" together, but dissonance, tension, and outside notes are powerful tools.

**Don't default to basic major/minor.** If the user hasn't specified a key/scale preference, explore:

- **Misheberak** — Ukrainian Dorian (Dorian with a raised 4th), Eastern European flavor, great for tension
- **Freygish** — Phrygian dominant, Jewish folk scale, Middle Eastern flavor, haunting
- **Hungarian minor** — Double harmonic minor, dramatic, cinematic
- **Whole tone** — Dreamy, unresolved, floating
- **Lydian #4** — Bright but strange, sci-fi feel
- **Persian** — Exotic, contemplative
- **Diminished / augmented** — Symmetrical, unsettling, great for transitions

Use `tools/chromatic_check.py` to list scale-pure chords per degree:

```bash
python -m tools.chromatic_check 0 hungarian_minor
```

**As the user works on more projects, learn their preferences.** Do they favor minor keys? Modal interchange? Chromatic runs? Adjust your suggestions over time.

## Available Chord Types

M, M6, M69, M7m, M7M, M9, Madd2, Madd9, m, m6, m7m, m7M, m9, m69, madd2, madd9, sus2, sus26, sus27, sus27M, sus4, sus46, sus47, sus47M, dimb3, dim, dim7, b5, aug, aug7, p4, TT, p5

## Available Modes

ionian, dorian, phrygian, lydian, mixolydian, aeolian, locrian, misheberak, freygish, hungarian_minor, hungarian_major, persian, whole_tone

## 48-Tick Resolution

48 ticks per bar = LCM(16, 12):
- Binary (16th notes): 3 ticks each
- Ternary (triplets): 4 ticks each
- Both land on integer ticks → polyrhythm locked

With 4 beats per bar, 48 ticks/bar = 12 ticks/beat. At 120 BPM: 120 × 12 = 1440 ticks/min = 24 fps × 60. One tick = one frame at 24fps/120BPM.

## Pitfalls

- **Stem length mismatch:** `amix` truncates to the shortest input when run with `duration=shortest` (its default is `longest`). Loop patterns to match longest part.
- **WAV for stitching:** MP3 adds padding (~0.05s). Keep stems as WAV until final delivery.
- **Trim accuracy:** Use `ffmpeg -af "atrim=end=N"` (sample-accurate), not `-t` (can be off by milliseconds).
- **Relative imports:** All tools use relative imports.
Use `python -m tools.compiler`, not `python tools/compiler.py`.

## Design Philosophy

**Simplification over special cases.** Chromatic semitones + octave offset eliminated the need for separate percussion channels — drums use the same system.

**Unified systems.** All channels (melodic, chord, percussion) use the same notation and resolution logic.

**Interactive, not pre-rendered.** Frame sync is a live data feed. The CSVs know what's happening at every frame. Use that to drive any visual system — don't just render video, build an instrument.
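
As one illustration of treating the clip data as a feed, the sketch below turns per-frame rows into note-on events, the trigger most of the visual ideas above rely on. The cell encoding is an assumption: a header row of instrument names, one row per frame, and each cell holding zero or more MIDI note numbers separated by spaces. The real output of `tools/clips.py` may be laid out differently, and `note_on_events` is a hypothetical name.

```python
import csv
import io

# Made-up sample in the ASSUMED layout; real clip CSVs may differ.
SAMPLE_CLIP = """piano,drums
60 64,36
60 64,
67,
,36
"""


def note_on_events(csv_text):
    """Yield (frame, instrument, note) each time a note starts sounding."""
    reader = csv.DictReader(io.StringIO(csv_text))
    names = reader.fieldnames
    # Notes that were active on the previous frame, per instrument.
    previous = {name: set() for name in names}
    for frame, row in enumerate(reader):
        for name in names:
            current = {int(n) for n in (row[name] or "").split()}
            # A note-on is a note active now that was not active before.
            for note in sorted(current - previous[name]):
                yield frame, name, note
            previous[name] = current


if __name__ == "__main__":
    for frame, instrument, note in note_on_events(SAMPLE_CLIP):
        print(f"frame {frame}: {instrument} note-on {note}")
```

Because this compares consecutive frames, a note that is re-struck with no silent frame in between is not reported as a new event. If the real CSVs carry explicit onset information, prefer that over this difference-based approach.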