gary v4 is stable: stable-audio-3 is now inside the DAW.
a VST3/AU plugin for musicians who want AI to meet them where they actually live. for me, that's ableton. for you, it might be fl studio or, if you're insane, reaper lol
https://thepatch.gumroad.com/l/gary4juce
latest stable releases:
recommended local companions:
- windows: gary4local v0.1.17
- macOS: gary4local mac v0.1.10
localhost backend source repos:
- windows: gary4local
- macOS: gary4local mac
videos about it here when i can: https://youtube.com/@thepatch_dev
If you just want pure text-to-music, you probably do not need a VST. This project exists for people who want AI to sit inside the session with them.
gary4juce now gives you seven AI music models directly in your DAW:
- sa3 (stable-audio-3) - text-to-audio, loops, transforms, continuations, LoRA blending, seed recall, key/BPM-aware prompting
- gary (musicgen) - continuation/anti-looper. Extends your audio in creative directions
- jerry (stable-audio-open-small) - BPM-aware 12-second loop generation in under a second
- rc-jerry (foundation-1) - BPM and key-aware 4/8-bar loop generation with structured prompt assembly
- carey (ace-step) - stem generation, extraction, audio continuation, and remix/cover with lyrics and multilingual support
- terry (melodyflow) - audio transformation. Turn your guitar into an orchestra
- darius (magenta-realtime) - high-quality 48 kHz continuations with style control
Put it on your master, press play, record some audio, and start iterating.
SA3 now runs on the remote backend, the Windows gary4local SA3 service, and gary4local mac.
v4.0.3 adds reproducible seed controls to Carey so supported Carey workflows can reuse a known seed and show the last seed returned by the backend.
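A minimal sketch of the seed-recall idea, in Python for readability (the plugin itself is built with JUCE). `generate_take` is a hypothetical stand-in for a backend call, not the real Carey or SA3 API: it only shows why returning the seed that was actually used makes a random take reproducible.

```python
import random


def generate_take(seed=None, length=8):
    """Hypothetical stand-in for a generation request.

    Returns (seed_used, fake_samples). If no seed is given, one is picked
    at random and reported back so the caller can reuse it later.
    """
    if seed is None:
        seed = random.randrange(2**32)
    rng = random.Random(seed)  # all randomness flows from this one seed
    samples = [rng.uniform(-1.0, 1.0) for _ in range(length)]
    return seed, samples


# Random take: keep the seed the "backend" reports.
last_seed, first_take = generate_take()
# Recall: passing the same seed gives the same result.
_, recalled_take = generate_take(seed=last_seed)
```

Real models only reproduce a take when the prompt and every other generation setting also match, so treat the seed as one part of the recipe.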
This release also fixes two SA3 UI edge cases. Restored sessions that reopen directly to the SA3 subtab now refresh available LoRAs correctly, and dragging audio in with SA3 active now keeps the longer SA3/Carey-style selection window instead of falling back to the shorter model limit.
The Windows VST3 ZIP now nests license and Corresponding Source files inside
gary4juce.vst3, so Windows Extract All can target a VST3 folder without
leaving loose files beside the plugin.
v4.0.2 explicitly releases gary4juce under the GNU Affero General Public License v3.0 only. The repository now includes the canonical AGPLv3 text, copyright and SPDX notices, pinned JUCE 8.0.8 licensing information, and an in-plugin About dialog with direct access to the source and license.
Release packages now include the applicable license and third-party notice files plus exact links to the Corresponding Source used for the build. This release does not change any music models, request formats, or backend requirements.
v4.0.1 is a focused maintenance release. It does not add or change any music models. Its purpose is to make the plugin remember its UI settings when the editor is closed and reopened, including when a DAW temporarily removes the editor while navigating between plugins.
Settings now persist across editor reopens throughout Gary, Jerry, SA3, Terry, Carey, Darius, and Foundation-1. This includes prompts, generation controls, selected tabs and models, advanced sections, SA3 seeds and LoRAs, and the shared recording/output source selector.
The local service status also survives editor recreation. On Windows, local health checks now update per service and bypass the slower shared HTTP path, so a running Gary, Terry, Jerry, Carey, Foundation-1, or SA3 service is reflected in the UI immediately instead of waiting for every offline port to time out.
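For the curious, here is a rough Python sketch of the per-service idea: probe each local port in parallel with a short timeout, so one offline service never delays the status of a running one. This is not the plugin's actual code. The port map is an assumption: only SA3 on 8006 appears in this README, and the other ports are placeholders.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Hypothetical service-to-port map. Only SA3 on 8006 comes from this README;
# the other entries are placeholders for illustration.
SERVICE_PORTS = {"sa3": 8006, "gary": 8000, "terry": 8002}


def is_port_open(port, host="127.0.0.1", timeout=0.25):
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(ports=SERVICE_PORTS, timeout=0.25):
    """Probe every service in parallel and return {name: is_running}."""
    with ThreadPoolExecutor(max_workers=max(1, len(ports))) as pool:
        futures = {
            name: pool.submit(is_port_open, port, "127.0.0.1", timeout)
            for name, port in ports.items()
        }
        # Each result is ready after at most one timeout, not one per service.
        return {name: future.result() for name, future in futures.items()}
```

An open port only proves that something is listening there. A real health check would also confirm that the service answers correctly.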
gary4juce has entered v4 with a new sa3 sub-tab inside Jerry, positioned alongside the original SAOS and Foundation-1 workflows.
SA3 currently includes:
- generate - text-to-audio up to 300 seconds, plus 4/8/16-bar loop mode
- transform - restyle the recording buffer or current output audio
- continue - continue the recording buffer or current output audio to a target total duration
- seed recall - random generations show the backend-returned seed so a take can be reproduced
- key/scale prompting - optional Carey-style key and mode dropdown appended to the final prompt
- LoRA sliders - one strength slider per available SA3 LoRA, defaulting to 0
- smart dice - prompt rolls come from the default pool or from every LoRA whose slider is above 0
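The smart dice rule can be sketched in a few lines of Python. The pool contents and LoRA names here are made up for illustration, and the plugin's real logic may differ in detail: the sketch only follows the rule stated above (default pool unless at least one LoRA slider is above 0).

```python
import random

# Hypothetical prompt pools; real pools come from the backend and the loaded LoRAs.
DEFAULT_POOL = ["warm analog pad", "dusty drum break"]
LORA_POOLS = {
    "lora_a": ["glitchy idm percussion"],
    "lora_b": ["lofi tape piano", "detuned rhodes chords"],
}


def dice_pool(lora_strengths, default_pool=DEFAULT_POOL, lora_pools=LORA_POOLS):
    """Return candidate prompts for a dice roll.

    Every LoRA whose slider is above 0 contributes its pool; if none is
    active, the default pool is used instead.
    """
    active = [
        name for name, strength in lora_strengths.items()
        if strength > 0 and name in lora_pools
    ]
    if not active:
        return list(default_pool)
    pool = []
    for name in active:
        pool.extend(lora_pools[name])
    return pool


def roll_prompt(lora_strengths, rng=random):
    """Pick one prompt from the pool selected by the current slider values."""
    return rng.choice(dice_pool(lora_strengths))
```

For example, `dice_pool({"lora_a": 0.0, "lora_b": 0.7})` returns only the `lora_b` prompts, while all-zero sliders fall back to the default pool.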
Read the practical guide: SA3.md
Launch notes:
- SA3 outputs can be hot, especially with LoRAs. Treat gain staging like part of the instrument for now.
- Continue results can leave a quiet/fading tail near the end of longer continuations. This is being audited against the upstream SA3 UI.
- Local SA3 is available in gary4local on Windows and macOS, including LoRAs and both continuation modes.
v3 brought Carey, Foundation-1, and the first pass at the modern multi-model workflow:
- Carey joined with lego, complete, cover, extract, lyrics, language, key/scale, time signature, LoRA selection, LoRA dice captions, and caption popouts.
- Foundation-1 became rc-jerry, a structured BPM/key-aware loop generator inside the Jerry tab.
- Foundation-1 landed in gary4local mac on Apple silicon.
- Plugin-safe update checks and editor lifecycle hardening made the app much harder to crash during in-flight requests.
Carey guide: CAREY.md
- add SA3 to gary4local on Windows
- add SA3 to gary4local mac on macOS
- ship local SA3 training on Windows and macOS using underfit as the source of truth
- release the mac AU/VST3 build
- add the first SA3 usage guide: SA3.md
- add SA3's experimental latent_prefix continuation mode
- clean up and release a proper standalone app
- improve SA3 LoRA loudness handling on the backend
- add ACE-Step training directly into the UI by vendoring Side-Step
- revisit Carey complete mode so it can do the upstream-style accompaniment workflow
- enable the Carey xl-sft model on the remote backend
- Close your DAW.
- Use Extract All on the ZIP.
- Choose C:\Program Files\Common Files\VST3\ as the destination, or extract somewhere else and copy the gary4juce.vst3 folder there.
- Reopen your DAW and rescan plugins if needed.
You can put the VST3 literally anywhere as long as your DAW scans that
location. C:\Program Files\Common Files\VST3\ is just the default path most
DAWs already check.
If permission errors appear, run Command Prompt as admin:
xcopy "path\to\extracted\gary4juce.vst3" "C:\Program Files\Common Files\VST3\gary4juce.vst3" /E /I /Y
LMMS support is not working yet. We did an initial VST2/LV2 compatibility pass and documented the exact Windows alpha environment here: LMMS compatibility notes.
- Quit your DAW.
- Open the DMG.
- Drag to the matching folder:
  - Gary4Juce.component -> Components (Audio Unit)
  - Gary4Juce.vst3 -> VST3
- Reopen your DAW and rescan.
GarageBand and Logic use AU. Ableton, FL, Reaper, Cubase, and Bitwig can use VST3.
The plugin can use either:
- remote backend - my server, free, on a spot VM, limited to the models I have loaded
- localhost - your machine, requires GPU, full control
Remote base URL: https://g4l.thecollabagepatch.com
Use the dedicated apps for localhost:
- windows: gary4local
- macOS: gary4local mac
They manage local envs for gary, terry, jerry, carey, foundation-1, and SA3. Model coverage varies by platform, but SA3 is available in both companion apps.
Recommended hardware:
- 10 GB+ GPU VRAM minimum
- 16 GB+ recommended for heavier local models
- 24 GB+ recommended for Darius-style separate backends
SA3 can run on the remote backend:
https://g4l.thecollabagepatch.com/sa3
or the local gary4local service on Windows/macOS:
http://localhost:8006
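If you script against these backends yourself, a "prefer local, fall back to remote" check might look like the sketch below. The two URLs come from this README; the helper names are hypothetical, and the check only tests that the TCP port accepts connections. It does not call any SA3 endpoint, since request paths are not documented here.

```python
import socket
from urllib.parse import urlparse

REMOTE_SA3 = "https://g4l.thecollabagepatch.com/sa3"
LOCAL_SA3 = "http://localhost:8006"


def is_reachable(url, timeout=0.5):
    """Return True if the host and port behind `url` accept a TCP connection."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_sa3_base_url(local=LOCAL_SA3, remote=REMOTE_SA3):
    """Use the local gary4local SA3 service when it is up, else the remote backend."""
    return local if is_reachable(local) else remote
```

Only the local URL is probed, so the fallback adds no network round trip when gary4local is running.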
Public upstream repo: https://github.com/stability-ai/stable-audio-3
Local SA3 training source of truth: https://github.com/dada-bots/underfit
Darius is too heavy to run alongside the other gary4local services.
Easiest path: duplicate this Hugging Face Space:
https://huggingface.co/spaces/thecollabagepatch/magenta-retry
Use an L40s or A100-class runtime. Enter the duplicated Space URL in the Darius tab.
Local Docker option:
git clone https://github.com/betweentwomidnights/magenta-rt
cd magenta-rt
docker build -f Dockerfile.cuda -t magenta-rt .
docker run --gpus all -p 7860:7860 magenta-rt
- Put gary4juce on your master track, a bus, or any track you want to listen to.
- Press play in your DAW.
- Save the recording buffer when you have audio you want a model to react to.
- Generate, transform, continue, crop, drag, and repeat.
The Jerry tab now has three sub-tabs:
- Generate text-to-audio up to 300 seconds.
- Toggle loop mode for 4/8/16-bar loop generation.
- Transform either the saved recording buffer or the current output audio.
- Continue either the saved recording buffer or the current output audio.
- Use optional key/scale and automatic DAW BPM prompting.
- Use seed recall for reproducible takes.
- Use LoRA sliders in advanced settings. Sliders default to 0.
- Use dice prompts from the default pool, or from selected LoRA pools when one or more LoRA sliders are above 0.
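To picture what key/scale and BPM prompting does, here is a small Python sketch that appends those hints to a text prompt. The comma-separated wording is an assumption for illustration: the exact string the plugin sends is not documented in this README.

```python
def build_prompt(text, bpm=None, key=None, mode=None):
    """Append optional key/mode and BPM hints to a text prompt.

    The output format is a guess for illustration, not the plugin's
    documented prompt format.
    """
    cleaned = text.strip()
    parts = [cleaned] if cleaned else []
    if key and mode:
        parts.append(f"{key} {mode}")
    if bpm is not None:
        # ":g" drops a trailing ".0" so 120.0 reads as "120".
        parts.append(f"{bpm:g} BPM")
    return ", ".join(parts)


prompt = build_prompt("dusty drum break", bpm=120, key="C", mode="minor")
# prompt == "dusty drum break, C minor, 120 BPM"
```

Leaving key and mode unset keeps the prompt untouched apart from the BPM hint, which matches the "optional" wording above.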
Full guide: SA3.md
- Generates short loops aligned to your DAW BPM.
- Smart loop mode can bias toward drums or instruments.
- Localhost supports custom finetunes through the model picker.
Learn more: https://huggingface.co/stabilityai/stable-audio-open-small
- Generates 4 or 8 bar loops synced to BPM and key.
- Uses structured prompt assembly through knobs, toggles, and tag controls.
- Randomize builds coherent presets through the backend prompt engine.
- Presets can be saved/loaded as .f1preset files.
- Available on remote, gary4local windows, and gary4local mac on Apple silicon.
Learn more: https://huggingface.co/RoyalCities/Foundation-1
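Bar-based loop lengths follow directly from the tempo. A quick Python sketch of the arithmetic, assuming 4/4 time unless told otherwise (the function name is made up for this example):

```python
def loop_seconds(bars, bpm, beats_per_bar=4):
    """Length in seconds of a loop of `bars` bars at `bpm` beats per minute."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    # One beat lasts 60 / bpm seconds.
    return bars * beats_per_bar * 60.0 / bpm


four_bars = loop_seconds(4, 120)   # 8.0 seconds
eight_bars = loop_seconds(8, 120)  # 16.0 seconds
```

So a 4-bar loop at 120 BPM is 8 seconds of audio and an 8-bar loop is 16, and slower tempos stretch both.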
Gary uses MusicGen continuation models.
- Uses the first selected seconds of your recording buffer.
- Generates continuation audio.
- Output can be cropped, continued, retried, or dragged to the DAW timeline.
Model lists are fetched dynamically from the backend, so the menu depends on what is loaded locally or remotely.
Learn more: https://github.com/facebookresearch/audiocraft
Carey uses ACE-Step.
- lego - generate vocals/backing vocals over your audio
- complete - extend audio into a full continuation
- cover - remix/restyle with caption guidance
- extract - attempt target stem extraction from your recording buffer
Shared lyrics editor, 50-language support, key/scale/time signature selection, caption popouts, and LoRA support live here.
Full guide: CAREY.md
Learn more: https://github.com/ace-step/ACE-Step-1.5
- Transforms audio with style presets or custom prompts.
- Can use either the recording buffer or current output audio.
- Undo is available after transforms.
Learn more: https://huggingface.co/spaces/facebook/Melodyflow
- High-quality 48 kHz continuations.
- Style steering with prompts, base model, or custom weights.
- Works on the recording buffer or current output audio.
- Requires a separate backend.
Learn more: https://github.com/magenta/magenta-realtime
gary4juce gets better as more finetunes exist.
Train through Audiocraft:
https://github.com/facebookresearch/audiocraft
As of late 2025, Google Colab is painful for Audiocraft training due to dependency conflicts. Local training is the practical path.
Train with stable-audio-tools:
https://github.com/Stability-AI/stable-audio-tools
Encode, train, select checkpoints, upload to Hugging Face, then load through the Jerry localhost finetune picker.
Local SA3 training for both gary4local companions now uses dadabots' underfit as the source of truth.
SA3 LoRA workflows are new and still settling. The v4 plugin UI is already shaped around multi-LoRA strength sliders and LoRA-aware dice pools so the backend can grow into that workflow cleanly.
Upstream repo: https://github.com/stability-ai/stable-audio-3
Magenta Realtime is one of the friendlier finetuning paths:
https://github.com/magenta/magenta-realtime
Upload weights to a Hugging Face model repo and point the Darius tab at it.
gary4juce/
+-- Source/
| +-- PluginProcessor.cpp/h
| +-- PluginEditor.cpp/h
| +-- Components/
| | +-- Gary/GaryUI.cpp/h
| | +-- Jerry/JerryUI.cpp/h
| | +-- Jerry/SA3UI.cpp/h
| | +-- Foundation/FoundationUI.cpp/h
| | +-- Carey/CareyUI.cpp/h
| | +-- Terry/TerryUI.cpp/h
| | \-- Darius/DariusUI.cpp/h
| \-- Utils/
| +-- Theme.h
| +-- IconFactory.cpp/h
| \-- BarTrim.cpp/h
+-- CAREY.md
+-- SA3.md
+-- docs/
\-- gary4juce.jucer
Requirements:
- JUCE 8.0.8
- Visual Studio 2022 on Windows
- Xcode on macOS
Steps:
- Open gary4juce.jucer in Projucer.
- Save the project to regenerate build files.
- Open the generated IDE project.
- Build release configuration.
- SA3 launch notes: output loudness and continuation tails are still being tuned.
- SA3 local backend: available through gary4local on Windows and gary4local mac.
- Windows Defender: not codesigned, so Windows may complain.
- Darius hardware: 24 GB+ VRAM is strongly recommended.
- Terry variability: Melodyflow is experimental and can be wonderfully strange.
- Carey complete mode: useful, but the upstream-style accompaniment workflow still needs a proper UI/backend pass.
discord: https://discord.gg/VECkyXEnAd
musicgen community: https://discord.gg/Mxd3nYQre9
email: kev@thecollabagepatch.com
twitter/x: https://twitter.com/@thepatch_kev
- stable-audio-3: Stability AI (repo)
- underfit: dadabots (repo)
- musicgen: Meta AI / Audiocraft team
- stable-audio-open-small: Stability AI
- foundation-1: RoyalCities (model, tools)
- ace-step: ACE-Step team (repo)
- melodyflow: Meta AI / Audiocraft team
- magenta-realtime: Google Magenta team
- JUCE: JUCE framework
- community finetunes: lyra, vanya, hoenn, CZ-84, and everyone contributing models
Special thanks to Zach and the guys at Stability for letting me be part of the Stable Audio 3 beta while this integration came together.
The source code and other original material in this repository, including
earlier gary4juce versions authored by the copyright holder, are free software
licensed under the
GNU Affero General Public License v3.0 only (AGPL-3.0-only).
You may use, study, modify, and redistribute that material under the license.
Redistribution and network use of modified versions are subject to the AGPLv3's
notice and Corresponding Source requirements.
Copyright (C) 2025-2026 Kevin Griffing. Developed and published by the collabage patch, inc.
This project is built with JUCE 8.0.8 under JUCE's AGPLv3 option. See Third-Party Notices for the pinned JUCE source and license information.
This license statement is limited to this repository. It does not set the license for the separate gary4local applications or installers, model weights, hosted services, or any other separately distributed part of the broader Gary ecosystem. Consult each project's own license before using or redistributing it.
The AI models and backend services used with gary4juce are separate works and are not included in this repository. Code and model weights may use different licenses:
- stable-audio-3: MIT code; weights use the Stability AI Community License
- musicgen: MIT code and CC-BY-NC-4.0 weights
- stable-audio-open-small: Stability AI Community License
- foundation-1: Stability AI Community License
- ace-step 1.5: MIT
- melodyflow: MIT code and CC-BY-NC-4.0 weights
- magenta-realtime: Apache-2.0 code; model-weight terms depend on the selected version
Always consult the exact upstream code and model version before commercial use or redistribution.
If gary4juce is useful to you:
- share your creations and tag @thepatch_kev
- contribute finetunes to the community
- help improve docs
- Gumroad: https://thepatch.gumroad.com/l/gary4juce (this is one of the only ways to support the project monetarily rn...working on some easier paths like github sponsors)
