Riffusion
- Seth Forsgren
- Hayk Martiros
Riffusion is a neural network, designed by Seth Forsgren and Hayk Martiros, that generates music from images of sound rather than from audio directly.[1] It was created by fine-tuning Stable Diffusion, an existing open-source model for generating images from text prompts, on spectrograms.[1] The resulting model turns a text prompt into a spectrogram image, which can be converted into an audio file by applying an inverse Fourier transform.[2] Although each clip is only several seconds long, the model can interpolate between outputs in latent space to blend different clips together.[1][3] This is accomplished using a feature of Stable Diffusion known as img2img.[4]
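The image-to-audio step can be illustrated with a minimal Python sketch. This is not Riffusion's code: it applies a naive discrete Fourier transform to a single hypothetical 64-sample frame (a 440 Hz tone at an assumed 8000 Hz sample rate), and the names `dft`, `inverse_dft`, and `SAMPLE_RATE` are introduced here for illustration only. A spectrogram image stores magnitudes but not phase, so a real pipeline would have to estimate the phase. The sketch keeps the original phase to show that magnitude plus phase is enough to recover the samples.

```python
import cmath
import math


def dft(frame):
    """Naive O(n^2) forward discrete Fourier transform of one audio frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Naive inverse DFT; returns the real-valued samples of the frame."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


# Hypothetical input: 64 samples of a 440 Hz sine at an assumed sample rate.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(64)]

spectrum = dft(frame)
magnitudes = [abs(c) for c in spectrum]      # what one spectrogram column encodes
phases = [cmath.phase(c) for c in spectrum]  # not stored in a spectrogram image

# Recombine magnitude and phase, then invert to get audio samples back.
rebuilt = inverse_dft([cmath.rect(m, p) for m, p in zip(magnitudes, phases)])
max_error = max(abs(a - b) for a, b in zip(frame, rebuilt))
print(f"largest reconstruction error: {max_error:.2e}")
```

A full spectrogram repeats this per overlapping frame (a short-time Fourier transform), and production code would normally use a fast Fourier transform from a numerical library instead of the quadratic loops shown here.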
The resulting music has been described as "de otro mundo" (otherworldly),[5] although it is considered unlikely to replace human-made music.[5] The model was released on December 15, 2022, and its code is freely available on GitHub.[2] It is one of many models derived from Stable Diffusion.[4]
Riffusion is one of several AI text-to-music generators. In December 2022, Mubert[6] similarly used Stable Diffusion to turn descriptive text into music loops. In January 2023, Google published a paper on its own text-to-music generator, MusicLM.[7][8]
References
- ^ a b c Coldewey, Devin (December 15, 2022). "Try 'Riffusion,' an AI model that composes music by visualizing it".
- ^ a b Nasi, Michele (December 15, 2022). "Riffusion: creare tracce audio con l'intelligenza artificiale". IlSoftware.it.
- ^ "Essayez "Riffusion", un modèle d'IA qui compose de la musique en la visualisant". December 15, 2022.
- ^ a b "文章に沿った楽曲を自動生成してくれるAI「Riffusion」登場、画像生成AI「Stable Diffusion」ベースで誰でも自由に利用可能". GIGAZINE.
- ^ a b Llano, Eutropio (December 15, 2022). "El generador de imágenes AI también puede producir música (con resultados de otro mundo)".
- ^ "Mubert launches Text-to-Music interface – a completely new way to generate music from a single text prompt". December 21, 2022.
- ^ "MusicLM: Generating Music From Text". January 26, 2023.
- ^ "5 Reasons Google's MusicLM AI Text-to-Music App is Different". January 27, 2023.