[10:26 Thu, 2 February 2023 by Thomas Richter]
Just how fast AI development is progressing can be seen, among other places, in the field of "text-to-music", i.e. AIs that generate music from a text description: Google had just presented MusicLM (we
In addition, the AudioLDM team wants to make the programme and its model available online as open source, which means that it could not only be used freely on one's own computer, but could also be improved by others and integrated into other programmes. For example, it could be used as a plug-in in video editing programmes such as Adobe Premiere or Blackmagic's DaVinci Resolve to generate sound backdrops. Another argument in favour of using AudioLDM at home is that it is said to be very efficient (i.e. it requires relatively little computing power), and training, for example on your own sound samples, can be done with a single GPU (such as an NVIDIA RTX 3090).

AudioLDM also offers practical functions already known from image AIs, such as InPainting (part of an audio recording is replaced, via a text prompt, by another sound that matches the rest) or Style Transfer (a melody is played by another instrument).

Here is an example of style transfer, trumpet to children's singing: audioldm.github.io/samples/2_style_transfer/audio/Untitled%20Session%202_mixdown.wav

In addition to the description of the sounds to be generated, other parameters that affect the sound can be entered, such as the type of acoustic environment (reverberation), the material of the objects making the sound, and the temporal order.
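
To illustrate how such a description might be put together, here is a minimal Python sketch that assembles a text prompt from a sound, a material, an acoustic environment and a follow-up sound. The helper `build_prompt`, its parameter names and the exact wording it produces are assumptions made for this example; they are not part of AudioLDM, which, as described above, simply takes free text as input.

```python
def build_prompt(sound, material=None, environment=None, then=None):
    """Hypothetical helper: compose a free-text prompt from optional parts.

    sound       -- what should be heard (required)
    material    -- material of the object making the sound (optional)
    environment -- acoustic environment, e.g. a reverberant room (optional)
    then        -- a sound that should follow, expressing temporal order (optional)
    """
    parts = [sound.strip()]
    if material:
        parts.append(f"made of {material.strip()}")
    if environment:
        parts.append(f"in {environment.strip()}")
    prompt = " ".join(parts)
    if then:
        # Temporal order is expressed in plain language; this phrasing is an assumption.
        prompt += f", followed by {then.strip()}"
    return prompt


print(build_prompt(
    "a hammer hitting a surface",
    material="wood",
    environment="a large reverberant hall",
    then="footsteps",
))
```

The resulting string would then be passed to the model as an ordinary text prompt; how strongly each part influences the generated audio depends on the model itself.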
Audio examples: the sound of a steam engine; cutting meat on a wooden table.

For more complex soundscapes, the researchers enlist the help of the text AI ChatGPT, which, for example, responds to the prompt "Describe the sound of the universe" with a detailed description ("Radio emissions from stars, planets, galaxies and other celestial bodies, high fidelity, as well as the sounds of solar winds and cosmic rays"), which can then be used as a prompt for AudioLDM.

Figure: Model of AudioLDM

The source code was actually supposed to be published together with the research paper on Monday, but the team is still reluctant to put the model (i.e. the result of the training process) online because of the recently announced copyright infringement lawsuits against several image AIs, since the well-known

Examples of music generation:

More Audio AI Projects

The following demonstrates just how rapidly development in the field of audio AIs is progressing.

Figure: Audio AI Timeline

Within a few days, several text-to-audio AIs of very different quality have been developed, such as

German version of this page: Neue Audio KI generiert neben Musik auch beliebige Soundeffekte (New audio AI generates arbitrary sound effects in addition to music)