MiniMax Music 2.6: generate full songs with lyrics
Learn how to use MiniMax Music 2.6 on OmniArt to turn a style prompt and lyrics into a complete song — vocals, layered instruments, genre-aware mixing.

Most AI music tools give you a loop. MiniMax Music 2.6 gives you a song: verse, chorus, bridge, and a vocal performance that carries real dynamics. Released in April 2026, version 2.6 improves on its predecessor with richer low-end, more natural vocal delivery, and faster initial generation. On OmniArt it costs 40 credits per track and lives alongside the image and video tools you already use. This guide shows you how to write the style prompt and lyrics that turn a blank text box into a finished track you can actually use.
How MiniMax Music 2.6 works
The model takes two inputs: a style prompt that describes the sonic world you want, and an optional lyrics block that provides the words, structured into sections. It processes them together and outputs a complete piece — not a loop — with vocals, layered instrumentation, and genre-aware mixing baked in.
The 2.6 generation improves the areas that matter most in practice: bass is warmer and more defined, vocal performances use natural vibrato and emotional shaping rather than a flat delivery, and the model reaches a usable draft faster. The MiniMax family has built a reputation for realistic AI vocals; 2.6 extends that with phrasing that responds to the lyric structure you provide.
Lyrics are optional. Leave them out, and the model generates instrumental music from the style prompt alone. Both paths are covered below.
Style prompt vocabulary
The style prompt is where you set the sonic direction. MiniMax Music 2.6 responds well to precise, layered descriptions. Build yours from five dimensions:
Genre and sub-genre
Start specific. "Lo-fi hip-hop" is better than "hip-hop"; "cinematic orchestral" lands differently from "orchestra." Working terms: indie pop, dark ambient, synthwave, R&B ballad, neo-soul, folk acoustic, Latin trap, jazz fusion, post-rock, chillout electronic.
Mood and emotional direction
Name the feeling you want the listener to arrive at. Terms that work: melancholic, uplifting, tense, nostalgic, euphoric, intimate, cinematic, playful, brooding, hopeful, anthemic.
Tempo and energy
You do not have BPM controls, so describe tempo in language: slow-burning, mid-tempo groove, driving rhythm, relaxed pulse, pulsing and urgent, gentle and unhurried.
Instrumentation and texture
List the sounds that anchor the piece. Terms: warm electric piano, fingerpicked acoustic guitar, punchy drum machine, lush string pads, sub bass, muted trumpet, shimmering reverb guitar, 808 kick, close-mic piano, airy synth pads.
Vocal style
Describe what you want from the performance: female lead with warm alto, breathy indie vocal, raspy male lead, harmonized background vocals, conversational delivery, belted chorus, whispered verse.
A style prompt that combines all five gives the model a clear brief. A vague one ("relaxing music") produces a vague result.
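One way to keep every dimension in view is to fill in a small template before pasting the result into the prompt box. The sketch below is only a local text helper: `StyleBrief` and `to_prompt` are hypothetical names invented for this illustration, not part of OmniArt or MiniMax.

```python
from dataclasses import dataclass


@dataclass
class StyleBrief:
    """The five style prompt dimensions (hypothetical helper, not an OmniArt API)."""
    genre: str
    mood: str
    tempo: str
    instrumentation: list[str]
    vocal_style: str = ""  # leave empty for an instrumental brief

    def to_prompt(self) -> str:
        parts = [self.genre, self.mood, self.tempo, *self.instrumentation]
        if self.vocal_style:
            parts.append(self.vocal_style)
        # Drop blank fields and join the rest into one comma-separated sentence.
        cleaned = [p.strip() for p in parts if p.strip()]
        if not cleaned:
            return ""
        text = ", ".join(cleaned)
        return text[0].upper() + text[1:] + "."


brief = StyleBrief(
    genre="indie pop",
    mood="nostalgic and warm",
    tempo="mid-tempo groove",
    instrumentation=["fingerpicked acoustic guitar", "soft drum machine"],
    vocal_style="female lead with breathy delivery",
)
print(brief.to_prompt())
```

An empty field shows up as a gap in the template, which is an early hint that the brief is drifting toward "relaxing music" territory.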
Structuring lyrics with section tags
MiniMax Music 2.6 uses section tags to understand where a song changes structure. Put each tag in square brackets on its own line, directly above the lyrics for that section:
- [verse]: narrative sections, lower energy, sets context
- [chorus]: the hook, highest emotional intensity, repeats
- [bridge]: a section that breaks the verse/chorus pattern, adds contrast
- [pre-chorus]: builds into the chorus, optional
- [outro]: closing section, can repeat chorus or wind down
Write lyrics the way you would for a real song. Rhyme scheme, line length, and density all affect how the model performs them. Denser verse lyrics suit a slower, more deliberate delivery; short punchy lines in a chorus drive momentum.
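A quick local check can catch a mistyped tag before you spend credits on a take. The sketch below assumes the five tags listed above are the full set and that each tag sits on its own line; `split_sections` is a hypothetical helper, and the model itself may accept tags this check rejects.

```python
import re

# Tags described in this guide; treated here as the complete set (an assumption).
KNOWN_TAGS = {"verse", "chorus", "bridge", "pre-chorus", "outro"}
TAG_LINE = re.compile(r"^\[(.+)\]$")


def split_sections(lyrics: str) -> list[tuple[str, list[str]]]:
    """Split a lyrics block into (tag, lines) pairs, in order of appearance."""
    sections: list[tuple[str, list[str]]] = []
    for raw in lyrics.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines carry no structure
        match = TAG_LINE.match(line)
        if match:
            tag = match.group(1).strip().lower()
            if tag not in KNOWN_TAGS:
                raise ValueError(f"unknown section tag: [{tag}]")
            sections.append((tag, []))
        elif not sections:
            raise ValueError("lyrics must start with a section tag")
        else:
            sections[-1][1].append(line)
    return sections


example = """[verse]
Coffee going cold beside the window seat
Morning light is slipping through the leaves
[chorus]
We were golden, we were almost right
"""
for tag, lines in split_sections(example):
    print(tag, len(lines))
```

Printing the line count per section also makes it easy to spot a chorus that has grown denser than the verse around it.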
Worked examples
Example 1: indie pop for a short-form video intro
Style prompt:
Indie pop, nostalgic and warm, mid-tempo groove, fingerpicked acoustic guitar with light electric piano, soft drum machine, female lead with breathy delivery, harmonized background vocals in the chorus, airy reverb tail throughout.
Lyrics:
[verse]
Coffee going cold beside the window seat
Morning light is slipping through the leaves
I keep the photos in a box below the bed
Hold onto the versions of us I never said
[chorus]
We were golden, we were almost right
Dancing slow through an ordinary night
Golden, almost right
I'd do it all again if I could
[bridge]
Maybe that's enough, to have held it for a while
Maybe that's enough, to have meant it when I smiled
This combination gives you a mellow, nostalgic track suitable for a montage, product intro, or podcast opener. The sparse instrumentation leaves room for dialogue or voiceover layered on top.
Example 2: brand energy cue for social edits
Style prompt:
Upbeat electronic pop, euphoric and driving, pulsing synth bass, punchy four-on-the-floor kick, shimmering synth pads, short instrumental drops, anthemic energy, no lead vocals — instrumental only.
Lyrics: (leave empty — instrumental mode)
Use this for reels, product reveal cuts, or highlight edits where the music carries energy without competing with on-screen text. The "no lead vocals — instrumental only" note in the style prompt reinforces the model's instrumental path even without lyrics.
Example 3: R&B track for a creator project
Style prompt:
Contemporary R&B, intimate and late-night, slow-burning mid-tempo, warm sub bass, Rhodes electric piano, brushed snare, male lead with smooth tenor delivery, conversational verse and belted chorus, lush string pads in the bridge.
Lyrics:
[verse]
Caught me off guard with a message at midnight
Said you've been thinking and you don't know why
I've been here doing the same thing, you know
Watching the city lights flicker and go
[pre-chorus]
Tell me what you're holding back
I've got time, I've got patience, and I've got your back
[chorus]
Stay a little longer in the conversation
Don't rush the feeling, let it find its way
Stay a little longer
We don't need a reason
Just you, just me, just the end of the day
[bridge]
There's something quiet in the space between us
Something neither of us wants to name
But here we are
Here we are
The structured section tags give the model clear cues: low-energy verse, building pre-chorus, an open repeating chorus hook, and a bridge with short punchy lines for contrast. This yields a track usable as a background for short film content, brand storytelling, or a standalone creator release.
Instrumental mode
Leaving the lyrics field empty (or toggling instrumental mode) tells the model to generate a full piece from the style prompt alone. Instrumental-only tracks are well suited to:
- Podcast intros and outros — set tone without lyrics fighting with speech
- Video background beds — sit under dialogue or narration without distraction
- Brand and product reels — kinetic cuts and transitions where lyrics read as noise
- Ambient and lo-fi content — long-form listening without vocal fatigue
In instrumental mode, the style prompt does all the compositional work, so spend extra time on it. Name the specific instruments, the texture, and the arc you want — "builds from sparse piano to full arrangement" or "stays minimal throughout, no percussion." The model respects these directional cues.
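If you prepare briefs in a script or a notes file, the instrumental rule can be made explicit: when there are no lyrics, state the instrumental intent in the style prompt. This is a minimal sketch; `prepare_inputs` and its dictionary keys are hypothetical and do not describe an OmniArt request format.

```python
def prepare_inputs(style_prompt: str, lyrics: str = "") -> dict[str, str]:
    """Return the two text inputs; the dict keys are illustrative only."""
    style = style_prompt.strip().rstrip(".")
    if not style:
        raise ValueError("style prompt must not be empty")
    lyrics = lyrics.strip()
    # With no lyrics, spell out the instrumental intent in the prompt itself.
    if not lyrics and "instrumental" not in style.lower():
        style += ", no lead vocals, instrumental only"
    return {"style_prompt": style + ".", "lyrics": lyrics}


inputs = prepare_inputs("Dark ambient, slow-burning, airy synth pads, no percussion.")
print(inputs["style_prompt"])
```

When lyrics are supplied, the style prompt passes through unchanged apart from the trailing period.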
MiniMax Music 2.6 in a creator workflow
Social content and short-form
Generate a track per video batch rather than reusing library music. A 40-credit track that matches the brief — right genre, right energy, right length — lands better than stock audio that almost fits. Use the instrumental path for reels where you're overlaying text or a voiceover.
Video and podcast production
Pair music generation with OmniArt's other audio and video tools in the same session. Generate a voiceover with a MiniMax Speech model, generate a background score with MiniMax Music 2.6, and cut both to the video clip — without leaving the platform. See AI voiceover for YouTube videos for the voiceover half of that workflow.
Brand audio
Brand music cues — a 5-second intro sting, a 15-second loop for a landing page, a 30-second track for an ad — follow the same process. Write a style prompt that describes the brand character (not just the genre), generate three or four takes, and pick the one that fits. You're not committing to a single library track; you can regenerate any time the brief shifts.
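Generating several takes per brief multiplies the credit cost, so it helps to estimate a batch up front. The arithmetic below uses the 40-credit price quoted in this guide; `batch_cost` is a hypothetical helper, and the three-brief example is made up for illustration.

```python
CREDITS_PER_TRACK = 40  # MiniMax Music 2.6 price quoted in this guide


def batch_cost(briefs: int, takes_per_brief: int,
               credits_per_track: int = CREDITS_PER_TRACK) -> int:
    """Total credits for generating several takes of several briefs."""
    if briefs < 0 or takes_per_brief < 0 or credits_per_track < 0:
        raise ValueError("counts and prices must be non-negative")
    return briefs * takes_per_brief * credits_per_track


# Intro sting, landing-page loop, and ad track, four takes each.
print(batch_cost(briefs=3, takes_per_brief=4))  # 480
```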
How it compares to other music models on OmniArt
OmniArt's audio workspace includes three music models. Each wins at a different brief:
| Model | Lyrics support | Credits | Best for |
|---|---|---|---|
| MiniMax Music 2.6 | Yes | 40 | Full songs with vocals; any genre; instrumental also |
| ElevenLabs Music | Yes | 150 | Structured, section-aware music with rich arrangement |
| Google Lyria 3 Pro | No | 20 | High-quality instrumental and cinematic scoring |
MiniMax Music 2.6 is the default starting point for any brief that involves a vocal performance or a full song arc. Lyria 3 Pro is the right call for a cinematic instrumental score where you want high quality at low cost. ElevenLabs Music fits briefs where the arrangement structure and section fidelity matter more than the credit count.
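The cost side of that choice can be sketched as a lookup over the table above. The code filters only on lyrics support and credits; it says nothing about arrangement quality, which is the reason to pick ElevenLabs Music despite the higher price. `MusicModel` and `cheapest` are hypothetical names for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MusicModel:
    name: str
    lyrics: bool   # whether the model accepts a lyrics block
    credits: int   # price per track, from the comparison table


MODELS = [
    MusicModel("MiniMax Music 2.6", True, 40),
    MusicModel("ElevenLabs Music", True, 150),
    MusicModel("Google Lyria 3 Pro", False, 20),
]


def cheapest(needs_lyrics: bool) -> MusicModel:
    """Lowest-credit model that satisfies the lyrics requirement."""
    candidates = [m for m in MODELS if m.lyrics or not needs_lyrics]
    return min(candidates, key=lambda m: m.credits)


print(cheapest(needs_lyrics=True).name)   # MiniMax Music 2.6
print(cheapest(needs_lyrics=False).name)  # Google Lyria 3 Pro
```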
For sound effects, ambience, and voiceover alongside music, see the full audio model overview.
Getting started on OmniArt
Open the audio workspace and select the Music tab. Pick MiniMax Music 2.6, write a style prompt from the vocabulary above, and paste in structured lyrics if you want a vocal track. Generate two or three takes, audition them, and refine the prompt for the next pass. The gap between a rough brief and a usable track is usually one or two iterations — the model's output with a well-written prompt is close enough to final that the main work is choosing, not fixing.