fluidstudio

Programmable music notation — write music as text, compile to MIDI + WAV stems + mix

---
name: fluidstudio
description: Programmable music notation with audio+video frame sync — write music as text, compile to MIDI/WAV, generate per-frame CSVs for video sync with PIL, ffmpeg, Blender, and more.
category: media
---

# FluidStudio — Interactive Music + Visual Sync

## What This Is

FluidStudio is a programmable music notation system with **frame-accurate audio+video synchronization**. You write music as text (`.flsp` files), compile it to MIDI and WAV stems, then generate per-frame CSVs that drive visual output — shapes, text, glitch effects, 3D scenes — all locked to the beat.

**The goal:** interactive, real-time sync between audio and visuals. Not pre-rendered video — a system where every frame knows what every instrument is doing.

## Setup

```bash
# System deps (Debian)
sudo apt install fluidsynth ffmpeg

# Python
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Requires: `mido`, `numpy`, `Pillow`. Optional: Blender with MCP for 3D scenes.

## The Project (You Are Here)

```
fluidstudio/
├── asset/soundfonts/        # SF2 files — check INDEX.md for ranges
├── tools/                   # Python package
│   ├── __init__.py          # from tools import compile_project
│   ├── parser.py            # .flsp notation parser
│   ├── midi_gen.py          # MIDI generation + FluidSynth rendering
│   ├── compiler.py          # partition → stems → mix
│   ├── clips.py             # per-frame CSV generator (the sync engine)
│   ├── fps.py               # BPM → FPS calculator
│   └── video.py             # basic video from clips (reference impl)
├── projects/                # songs go here
└── SKILL.md                 # this file
```

## Project Structure (Per Song)

```
projects/<song_name>/
├── partition/
│   ├── header.json          # bpm, key, octave, mode, channels
│   └── <voice>.flsp         # one file per stem (filename = channel name)
├── stems/                   # compiled WAV stems (exact beat duration)
├── mix/                     # mixed WAV
├── midi/                    # compiled MIDI files
├── clip/                    # per-frame CSVs (the sync data)
└── render/                  # final output (video, etc.)
```

## Header Format (`partition/header.json`)

```json
{
  "project": "my_song",
  "bpm": 120,
  "key": 0,
  "octave": 4,
  "mode": "ionian",
  "channels": [
    {"name": "piano", "sf2": "rkhive/jrhodes3a-looped/jrhodes3a-looped.sf2", "octave": 5},
    {"name": "drums", "sf2": "rkhive/ghosdrum/ghosdrum.sf2", "octave": 1}
  ]
}
```

- `key`: semitone root (0=C, 7=G). Combined with `mode` → chord suggestions.
- `octave`: base octave (4 = C4 = MIDI 60).
- `mode`: chord suggestion only — fully chromatic, no restrictions.
- `sf2`: path relative to `asset/soundfonts/`. Check `INDEX.md` for ranges.

**Choosing instruments:** Check `asset/soundfonts/INDEX.md` for optimal ranges. Some SF2s only sound good in specific octaves. Always verify before assigning.
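
The `octave` rule in the list above (4 = C4 = MIDI 60) corresponds to the usual MIDI numbering. The following is a minimal sketch of that mapping, not the code in `tools/`: `semitone_to_midi` is a hypothetical helper, and it assumes that a channel's `octave` overrides the header default and that `key` does not transpose written notes.

```python
import json

# Trimmed copy of the header example; the sf2 paths are omitted for brevity.
HEADER_JSON = """
{
  "bpm": 120, "key": 0, "octave": 4, "mode": "ionian",
  "channels": [
    {"name": "piano", "octave": 5},
    {"name": "drums", "octave": 1}
  ]
}
"""


def semitone_to_midi(semitone, octave):
    """Map a chromatic semitone (0=C ... 11=B) in an octave to a MIDI note.

    Octave 4, semitone 0 is C4 = MIDI 60, so C of octave n is 12 * (n + 1).
    """
    note = 12 * (octave + 1) + semitone
    if not 0 <= note <= 127:
        raise ValueError(f"MIDI note {note} is outside 0-127")
    return note


header = json.loads(HEADER_JSON)
for channel in header["channels"]:
    # ASSUMPTION: a per-channel octave overrides the header-level default.
    octave = channel.get("octave", header["octave"])
    print(channel["name"], "C =", semitone_to_midi(0, octave))
```

With the example header this prints MIDI 72 for the piano's C and MIDI 24 for the drums' C, which is a quick way to compare a channel's range against `INDEX.md`.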

## Notation Format (`.flsp`)

Chromatic semitone system — `0=C, 1=C#, 2=D, ..., 11=B` relative to octave in header.

```
# melody
- 0:b/2 (legato: on)  // C, portamento
- 4:b/2               // E
---
- 7:b                 // G
---
- 9:b/2               // A
- 11:b/2              // B
---
- 0:b (legato: off)   // C
```

- `---` = beat divider. Lines between dividers = subdivisions.
- Duration: `b` = 1 beat, `b/2` = half, `b/3` = triplet, `2b` = 2 beats.
- Chords: `0xm69` = C as m69 chord. Velocity: `(v:60)`. Legato: `(legato: on)`.
- Rests: empty section between `---` = full beat rest.
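
The duration tokens listed above can be read as exact fractions of a beat. This sketch is illustrative only and is not the parser in `tools/parser.py`: `parse_duration` and `duration_ticks` are hypothetical names, the regular expression covers only the four documented shapes, and 12 ticks per beat is assumed from the 48-tick bar.

```python
import re
from fractions import Fraction

# Optional integer multiplier, the letter b, optional "/divisor": b, b/2, b/3, 2b.
DURATION_RE = re.compile(r"^(\d+)?b(?:/(\d+))?$")


def parse_duration(token):
    """Return the length of a duration token in beats as an exact Fraction."""
    match = DURATION_RE.match(token.strip())
    if match is None:
        raise ValueError(f"not a duration token: {token!r}")
    multiplier = int(match.group(1) or 1)
    divisor = int(match.group(2) or 1)
    if divisor == 0:
        raise ValueError(f"zero divisor in {token!r}")
    return Fraction(multiplier, divisor)


def duration_ticks(token, ticks_per_beat=12):
    """Convert a duration token to ticks (ASSUMPTION: 12 ticks per beat)."""
    return parse_duration(token) * ticks_per_beat


for token in ("b", "b/2", "b/3", "2b"):
    print(token, parse_duration(token), duration_ticks(token))
```

Using `Fraction` instead of floats keeps triplets exact, so three `b/3` notes add up to exactly one beat.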

## Compiling

```bash
python -m tools.compiler projects/my_song
```

Or from Python:
```python
from tools.compiler import compile_project
compile_project("projects/my_song")
```

## The Frame Sync System

This is the core of FluidStudio. The frame sync system generates per-frame CSVs that tell you exactly what every instrument is doing at every frame.

```bash
python -m tools.clips projects/my_song
# Output: projects/my_song/clip/*.csv (one per bar)
```

**Each CSV row = one frame. Columns = instruments. Cell values = MIDI notes playing at that frame.**
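
A reader for this layout could look like the sketch below. The exact cell encoding is not specified here, so the sample data is hypothetical: it assumes a header row of instrument names, space-separated MIDI notes for simultaneous notes, and an empty cell for silence. Check a real file from `clip/` before relying on it.

```python
import csv
import io

# Hypothetical clip content: one row per frame, one column per instrument.
# ASSUMPTION: simultaneous notes are space-separated; an empty cell is silence.
SAMPLE_CLIP = """piano,drums
60 64,36
60 64,
67,
,36
"""


def read_clip(text):
    """Return a list of frames, each mapping instrument -> set of MIDI notes."""
    frames = []
    for row in csv.DictReader(io.StringIO(text)):
        frames.append(
            {name: {int(n) for n in (cell or "").split()} for name, cell in row.items()}
        )
    return frames


def note_ons(frames, instrument):
    """Yield (frame_index, note) when a note is present but was absent one frame earlier."""
    previous = set()
    for index, frame in enumerate(frames):
        current = frame[instrument]
        for note in sorted(current - previous):
            yield index, note
        previous = current


frames = read_clip(SAMPLE_CLIP)
print(list(note_ons(frames, "piano")))
print(list(note_ons(frames, "drums")))
```

Comparing each frame with the previous one turns "notes playing" into note-on events. A note that is re-struck with no silent frame in between is not detected by this approach.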

### Frame Rate

Auto-calculated from BPM for clean sync:

```
fps × 60 = frames per minute
frames_per_beat = (fps × 60) / BPM
```

For clean sync, `frames_per_beat` must be an integer. `tools/fps.py` calculates this:

```bash
python -m tools.fps 120    # → fps=24, frames/beat=12
python -m tools.fps 100    # → fps=40, frames/beat=24
```

Override with `--fps N` if you need a specific rate.
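
The formula above is easy to check by hand. In the sketch below, `frames_per_beat` is the documented formula; `suggest_fps` is a hypothetical selection rule (frames per beat a multiple of 12, fps at least 24) that happens to reproduce both documented outputs. It is an inference from those two examples, not necessarily what `tools/fps.py` does, and it expects an integer BPM.

```python
from fractions import Fraction


def frames_per_beat(fps, bpm):
    """Exact frames per beat: (fps * 60) / BPM."""
    return Fraction(fps * 60, bpm)


def suggest_fps(bpm, step=12, min_fps=24):
    """Smallest integer fps >= min_fps whose frames-per-beat is a multiple of `step`.

    ASSUMPTION: this rule is inferred from the two documented examples only.
    """
    multiple = 1
    while True:
        fps = Fraction(bpm * step * multiple, 60)
        if fps.denominator == 1 and fps >= min_fps:
            return int(fps), step * multiple
        multiple += 1


print(suggest_fps(120))          # (24, 12)
print(suggest_fps(100))          # (40, 24)
print(frames_per_beat(24, 100))  # 72/5: not an integer, so 24 fps drifts at 100 BPM
```

The last line shows why an arbitrary `--fps` override can cost clean sync: at 100 BPM, 24 fps gives 14.4 frames per beat.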

### What You Can Do With Frame Data

The clip CSVs are the bridge between audio and visuals. Each frame knows which notes are active, which instruments are playing, which beats are hitting. **Use this to drive anything:**

**PIL / Pillow** — Generate frames programmatically. Draw shapes that pulse with instruments, colors that shift with chords, particles that spawn on note-on events. `tools/video.py` is a reference implementation (disks on note-on, fade over 1s). Go beyond it.

**ffmpeg filters** — Apply glitch effects at precise moments. `rgbashift` on snare hits, `curves` shifts on chord changes, `zoompan` on bass drops. The frame CSV tells you exactly which frames to target.

**Text / Matrix effects** — Scroll text at beat subdivisions. Spawn characters on note-on. Change font size with velocity. Glitch text on off-beats.

**Blender + MCP** — Set Blender's frame rate to match the project's fps. Read the clip CSVs to keyframe object positions, material colors, light intensities, camera moves. Flash lights on kick drums. Rotate objects on chord changes. Spawn geometry on note-on events.

**Any visual tool** — TouchDesigner, Processing, p5.js, shader uniforms, LED strips, projection mapping. The CSVs are just data — (frame, instrument, notes). Map it to anything.
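
For the ffmpeg route, one way to target exact frames is ffmpeg's timeline editing: filters that support it accept an `enable` expression, and `n` in that expression is the input frame number. The sketch below builds such a filter string from a list of hit frames. `enable_expr` and `rgbashift_filter` are hypothetical helpers; it assumes the video is rendered at the project fps so frame numbers line up, and that your ffmpeg build marks `rgbashift` with `T` (timeline support) in `ffmpeg -filters`.

```python
def enable_expr(hit_frames, hold=2):
    """Build an ffmpeg timeline expression that is true for `hold` frames from each hit."""
    starts = sorted(set(hit_frames))
    if not starts:
        return "0"  # never enabled
    return "+".join(f"between(n,{start},{start + hold - 1})" for start in starts)


def rgbashift_filter(hit_frames, shift=5, hold=2):
    """Return a -vf argument that shifts red and blue apart only on the hit frames."""
    return f"rgbashift=rh={shift}:bh={-shift}:enable='{enable_expr(hit_frames, hold)}'"


# Hypothetical snare hits on beats 2 and 4 at 12 frames per beat.
print(rgbashift_filter([12, 36]))
```

The resulting string can be passed as `ffmpeg -i in.mp4 -vf "<filter>" out.mp4`. For long songs the expression grows with the number of hits, so applying it per bar keeps it manageable.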

### Blender Workflow

1. Set Blender scene fps to match project fps (from `tools/fps.py`)
2. Read clip CSVs in a script or via MCP
3. Keyframe objects per frame: position, rotation, scale, material, visibility
4. Sync camera moves to beat structure
5. Render frames → combine with audio in ffmpeg
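
Steps 1 to 3 can be sketched as a small script. The keyframe planning is plain Python and runs anywhere; the `bpy` part only runs inside Blender. `pulse_keyframes` and `apply_to_object` are hypothetical helpers, the object name `"Cube"` and the hit frames are placeholders, and the hit list would normally come from the clip CSVs.

```python
try:
    import bpy  # only available inside Blender
except ImportError:
    bpy = None


def pulse_keyframes(hit_frames, decay=6, rest=1.0, peak=1.5):
    """Return sorted (frame, scale) pairs: jump to `peak` on each hit, ease back to `rest`."""
    hits = sorted(set(hit_frames))
    keys = {}
    for i, hit in enumerate(hits):
        next_hit = hits[i + 1] if i + 1 < len(hits) else None
        if hit > 0:
            keys.setdefault(hit - 1, rest)  # hold the rest value until just before the hit
        keys[hit] = peak
        if next_hit is None or hit + decay < next_hit:
            keys[hit + decay] = rest  # finish the decay only if no new hit interrupts it
    return sorted(keys.items())


def apply_to_object(obj, keyframes):
    """Insert uniform-scale keyframes on a Blender object."""
    for frame, scale in keyframes:
        obj.scale = (scale, scale, scale)
        obj.keyframe_insert(data_path="scale", frame=frame)


plan = pulse_keyframes([0, 12, 24, 36])  # hypothetical kick on every beat at 12 frames/beat
print(plan[:5])

if bpy is not None:
    bpy.context.scene.render.fps = 24  # step 1: match the project fps
    apply_to_object(bpy.data.objects["Cube"], plan)  # "Cube" is a placeholder name
```

Separating the plan from the `bpy` calls makes the timing logic testable outside Blender. If your scene starts at frame 1 rather than 0, add an offset before inserting keyframes.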

## Music Theory — Be Creative

**Key + mode are suggestions, not rules.** The notation is fully chromatic — any note works anywhere. Key and mode exist to suggest which chords feel "at home" together, but dissonance, tension, and outside notes are powerful tools.

**Don't default to basic major/minor.** If the user hasn't specified a key/scale preference, explore:

- **Misheberak** — Dorian with a raised 4th (Ukrainian Dorian), klezmer flavor, great for tension
- **Freygish** — Phrygian dominant, closely related to misheberak, Jewish folk scale, haunting
- **Hungarian minor** — Double harmonic minor, dramatic, cinematic
- **Whole tone** — Dreamy, unresolved, floating
- **Lydian (#4)** — Bright but strange, sci-fi feel
- **Persian** — Exotic, contemplative
- **Diminished / augmented** — Symmetrical, unsettling, great for transitions

Use `tools/chromatic_check.py` to list scale-pure chords per degree:

```bash
python tools/chromatic_check.py 0 hungarian_minor
```
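
To make "scale-pure" concrete: a chord is scale-pure on a degree when every chord tone belongs to the scale. The sketch below illustrates the idea with standard interval tables for two modes and four triads. The tables and the function names are assumptions for illustration, and the output format of `tools/chromatic_check.py` may differ.

```python
# Semitone steps above the root (standard definitions, not copied from tools/).
MODES = {
    "ionian": (0, 2, 4, 5, 7, 9, 11),
    "hungarian_minor": (0, 2, 3, 6, 7, 8, 11),
}
CHORDS = {
    "M": (0, 4, 7),
    "m": (0, 3, 7),
    "dim": (0, 3, 6),
    "aug": (0, 4, 8),
}


def scale_pitch_classes(key, mode):
    """Pitch classes (0-11) of the scale built on semitone root `key`."""
    return {(key + step) % 12 for step in MODES[mode]}


def scale_pure_chords(key, mode):
    """Map each scale degree's root to the chord types whose tones all lie in the scale."""
    scale = scale_pitch_classes(key, mode)
    result = {}
    for step in MODES[mode]:
        root = (key + step) % 12
        result[root] = [
            name
            for name, intervals in CHORDS.items()
            if all((root + i) % 12 in scale for i in intervals)
        ]
    return result


for root, names in scale_pure_chords(0, "hungarian_minor").items():
    print(root, names)
```

In C Hungarian minor this gives `m` and `dim` on C (0) and `M` and `aug` on G (7), while D (2) supports none of these four triads, which is part of what makes the scale sound unstable.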

**As the user works on more projects, learn their preferences.** Do they favor minor keys? Modal interchange? Chromatic runs? Adjust your suggestions over time.

## Available Chord Types

M, M6, M69, M7m, M7M, M9, Madd2, Madd9, m, m6, m7m, m7M, m9, m69, madd2, madd9, sus2, sus26, sus27, sus27M, sus4, sus46, sus47, sus47M, dimb3, dim, dim7, b5, aug, aug7, p4, TT, p5

## Available Modes

ionian, dorian, phrygian, lydian, mixolydian, aeolian, locrian, misheberak, freygish, hungarian_minor, hungarian_major, persian, whole_tone

## 48-Tick Resolution

48 ticks per bar = LCM(16, 12):
- Binary (16th notes): 3 ticks each
- Ternary (triplets): 4 ticks each
- Both land on integer ticks → polyrhythm locked

At 120 BPM with four beats per bar: 48 ticks/bar = 12 ticks/beat. 120 × 12 = 1440 ticks/min = 24 fps × 60. One tick = one frame at 24fps/120BPM.
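
The arithmetic above can be verified in a few lines. This is a sketch under one stated assumption, four beats per bar; `frames_per_tick` is a hypothetical helper name.

```python
from fractions import Fraction
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


TICKS_PER_BAR = lcm(16, 12)  # 16 sixteenths and 12 triplet eighths share one grid
BEATS_PER_BAR = 4            # ASSUMPTION: 4/4 time
TICKS_PER_BEAT = TICKS_PER_BAR // BEATS_PER_BAR


def frames_per_tick(fps, bpm):
    """Frames per tick: frames per minute divided by ticks per minute."""
    return Fraction(fps * 60, bpm * TICKS_PER_BEAT)


print(TICKS_PER_BAR, TICKS_PER_BEAT)           # 48 12
print(TICKS_PER_BAR // 16, TICKS_PER_BAR // 12)  # 3 4
print(frames_per_tick(24, 120))                # 1
print(frames_per_tick(40, 100))                # 2
```

At 100 BPM and 40 fps each tick spans exactly two frames, so the tick grid still lands on frame boundaries.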

## Pitfalls

- **Stem length mismatch:** `amix` cuts the mix at the shortest input when run with `duration=shortest` (ffmpeg's default is `duration=longest`). Loop patterns so every stem matches the longest part.
- **WAV for stitching:** MP3 adds padding (~0.05s). Keep stems as WAV until final delivery.
- **Trim accuracy:** Use `ffmpeg -af "atrim=end=N"` (sample-accurate), not `-t` (can be off by a few milliseconds).
- **Relative imports:** All tools use relative imports. Use `python -m tools.compiler`, not `python tools/compiler.py`.
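
For the trim pitfall, the target length can be computed in samples rather than seconds, since `atrim` also accepts `end_sample`. The sketch below only builds the command; `stem_samples` and `trim_command` are hypothetical helpers, the file names are placeholders, and a 44100 Hz sample rate is assumed.

```python
from fractions import Fraction


def stem_samples(beats, bpm, sample_rate=44100):
    """Number of samples in `beats` beats at `bpm` (ASSUMPTION: 44100 Hz stems)."""
    return round(Fraction(beats) * 60 * sample_rate / bpm)


def trim_command(src, dst, beats, bpm, sample_rate=44100):
    """Build an ffmpeg argument list that keeps exactly the first `beats` beats."""
    end = stem_samples(beats, bpm, sample_rate)
    # end_sample is the first sample that gets dropped, so `end` samples remain.
    return ["ffmpeg", "-y", "-i", src, "-af", f"atrim=end_sample={end}", dst]


print(" ".join(trim_command("stems/piano_raw.wav", "stems/piano.wav", beats=16, bpm=120)))
```

Returning a list rather than a shell string means the command can be passed straight to `subprocess.run` without quoting issues.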

## Design Philosophy

**Simplification over special cases.** Chromatic semitones + octave offset eliminated the need for separate percussion channels — drums use the same system.

**Unified systems.** All channels (melodic, chord, percussion) use the same notation and resolution logic.

**Interactive, not pre-rendered.** Frame sync is a live data feed. The CSVs know what's happening at every frame. Use that to drive any visual system — don't just render video, build an instrument.