This demonstration page illustrates the generation diversity of the proposed consistency text-to-audio (TTA) model.
The clips below are generated from the first 50 AudioCaps test prompts,
each with four different random seeds using our consistency model.
For quantitative evidence, we standardize each generated Mel spectrogram,
calculate the standard deviation across the four seeds at each Mel spectrogram point,
and average these standard deviations over all points of the 50 examples.
The resulting average is 0.871, indicating non-trivial generation diversity.
Please listen to the following audio clips to confirm the generation quality of these seeds.
Since the model is not trained on speech data, we do not expect it to produce meaningful speech.
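
The diversity statistic described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' evaluation script: the function name `seed_diversity`, the array layout `(prompts, seeds, mel_bins, frames)`, per-clip standardization to zero mean and unit variance, and the use of the population standard deviation are all assumptions, since the page does not specify these details.

```python
import numpy as np


def seed_diversity(specs: np.ndarray, eps: float = 1e-8) -> float:
    """Average across-seed standard deviation of standardized Mel spectrograms.

    specs: hypothetical array of shape (prompts, seeds, mel_bins, frames).
    """
    # Standardize each spectrogram on its own statistics
    # (assumption: zero mean and unit variance per generated clip).
    mean = specs.mean(axis=(2, 3), keepdims=True)
    std = specs.std(axis=(2, 3), keepdims=True)
    z = (specs - mean) / (std + eps)

    # Standard deviation across seeds at every time-frequency point
    # (assumption: population standard deviation, i.e. ddof=0).
    per_point = z.std(axis=1)

    # Average over all points of all prompts.
    return float(per_point.mean())


if __name__ == "__main__":
    # Synthetic stand-in data: 50 prompts, 4 seeds, 64 Mel bins, 100 frames.
    rng = np.random.default_rng(0)
    fake_specs = rng.normal(size=(50, 4, 64, 100))
    print(f"diversity: {seed_diversity(fake_specs):.3f}")
```

If all four seeds produced the same spectrogram for every prompt, this statistic would be zero, so larger values mean the seeds differ more at the level of individual spectrogram points.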
Prompt 0
A machine is making clicking sound as people talk in the background.
Seed I
Seed II
Seed III
Seed IV
Prompt 1
A missile launching followed by an explosion and metal screeching as a motor hums in the background.
Seed I
Seed II
Seed III
Seed IV
Prompt 2
A toy train running as a young boy talks followed by plastic clanking then a child laughing.
Seed I
Seed II
Seed III
Seed IV
Prompt 3
Clattering of a train is ongoing, a railroad crossing bell rings, and a train horn blows.
Seed I
Seed II
Seed III
Seed IV
Prompt 4
Food sizzling with some knocking and banging followed by a woman speaking.
Seed I
Seed II
Seed III
Seed IV
Prompt 5
A man talks while several animals make noises in the background.
Seed I
Seed II
Seed III
Seed IV
Prompt 6
An emergency siren ringing with car horn honking.
Seed I
Seed II
Seed III
Seed IV
Prompt 7
An infant yelling as a young boy talks while a hard surface is slapped several times.
Seed I
Seed II
Seed III
Seed IV
Prompt 8
A bus engine running followed by a bus horn honking.
Seed I
Seed II
Seed III
Seed IV
Prompt 9
A man speaking followed by snoring.
Seed I
Seed II
Seed III
Seed IV
Prompt 10
Rolling thunder with lightning strikes.
Seed I
Seed II
Seed III
Seed IV
Prompt 11
A woman and a baby are having a conversation.
Seed I
Seed II
Seed III
Seed IV
Prompt 12
Water trickling with man speaking.
Seed I
Seed II
Seed III
Seed IV
Prompt 13
Female speech, a toilet flushing and then more speech.
Seed I
Seed II
Seed III
Seed IV
Prompt 14
Loud high humming and croaking sound.
Seed I
Seed II
Seed III
Seed IV
Prompt 15
A cuckoo bird coos followed by a train running on railroad tracks as a bell dings in the background.
Seed I
Seed II
Seed III
Seed IV
Prompt 16
A man talking then meowing and hissing.
Seed I
Seed II
Seed III
Seed IV
Prompt 17
Water flowing through pipes.
Seed I
Seed II
Seed III
Seed IV
Prompt 18
An infant crying followed by a man laughing.
Seed I
Seed II
Seed III
Seed IV
Prompt 19
A man speaking, followed by a door shutting, and then the man speaks some more.
Seed I
Seed II
Seed III
Seed IV
Prompt 20
The wind is blowing, and a person is whistling a tune.
Seed I
Seed II
Seed III
Seed IV
Prompt 21
Motor vehicles are driving with loud engines and a person whistles.
Seed I
Seed II
Seed III
Seed IV
Prompt 22
Bubbles gurgling and water spraying as a man speaks softly while crowd of people talk in the background.
Seed I
Seed II
Seed III
Seed IV
Prompt 23
Metal clacking followed by a man talking then a metal bang as footsteps shuffle on dirt and a group of men laugh.
Seed I
Seed II
Seed III
Seed IV
Prompt 24
Ducks quack and water splashes with some animal screeching in the background.
Seed I
Seed II
Seed III
Seed IV
Prompt 25
Multiple gun shots woman screaming.
Seed I
Seed II
Seed III
Seed IV
Prompt 26
An aircraft engine runs and vibrates, metal spinning and grinding occur, and the engine accelerates and fades into the distance.
Seed I
Seed II
Seed III
Seed IV
Prompt 27
A man is talking as tap water is running.
Seed I
Seed II
Seed III
Seed IV
Prompt 28
Woman speaking, plastic container opening.
Seed I
Seed II
Seed III
Seed IV
Prompt 29
A male speaking.
Seed I
Seed II
Seed III
Seed IV
Prompt 30
A vehicle engine revving followed by tires skidding as a group of people talk in the background.
Seed I
Seed II
Seed III
Seed IV
Prompt 31
A woman talking followed by a plate rattling as food and oil sizzle.
Seed I
Seed II
Seed III
Seed IV
Prompt 32
Humming of an idling engine.
Seed I
Seed II
Seed III
Seed IV
Prompt 33
A train running on railroad tracks as a train horn whistle blows several times while railroad crossing warning signals are ringing.
Seed I
Seed II
Seed III
Seed IV
Prompt 34
Several varying hisses.
Seed I
Seed II
Seed III
Seed IV
Prompt 35
A motorboat driving by as water splashes followed by wind blowing into a microphone.
Seed I
Seed II
Seed III
Seed IV
Prompt 36
A bus engine slowing down then accelerating.
Seed I
Seed II
Seed III
Seed IV
Prompt 37
A woman talks as a baby cries.
Seed I
Seed II
Seed III
Seed IV
Prompt 38
Kids laughing then talking followed by a young man talking as wind blows into a microphone.
Seed I
Seed II
Seed III
Seed IV
Prompt 39
A woman delivers a speech.
Seed I
Seed II
Seed III
Seed IV
Prompt 40
Clicking followed by humming noise.
Seed I
Seed II
Seed III
Seed IV
Prompt 41
Electronic beeping followed by a cat singing then meowing as paper shuffles and a man talks with music playing in the background.
Seed I
Seed II
Seed III
Seed IV
Prompt 42
A high frequency motor hums loudly and splashes water.
Seed I
Seed II
Seed III
Seed IV
Prompt 43
An adult male speaks, followed by another adult male speaking.
Seed I
Seed II
Seed III
Seed IV
Prompt 44
A horn and then an engine revving.
Seed I
Seed II
Seed III
Seed IV
Prompt 45
Man speaking while insects buzz around.
Seed I
Seed II
Seed III
Seed IV
Prompt 46
A motorboat engine running as water splashes and a man shouts followed by birds chirping in the background.
Seed I
Seed II
Seed III
Seed IV
Prompt 47
A man speaks and a machine runs with a continued speech.
Seed I
Seed II
Seed III
Seed IV
Prompt 48
Man speaks followed by whistling.
Seed I
Seed II
Seed III
Seed IV
Prompt 49
Warning bells ring and a train passes with a honking horn.