Description
Hiya guys, Miko here, and today I wanted to share a few quick and easy tips on how to humanize an UTAU's voice.
Note that some UTAU voice banks are lower in quality because of the way they were recorded - this can range from a cheaper microphone to not knowing how to properly record one's voice.
Oftentimes people will also record a voice bank at a pitch that isn't natural for their own voice, which results in an UTAU sounding like its throat is constantly being choked.
For the best result, you should keep your UTAU's voice as close to your natural voice as possible, since the pitch and depth of your natural vocal range will generally stay the same.

Now, as previously mentioned, if an UTAU is recorded with a cheaper microphone that lacks proper filters and range, there will be a greater risk of said voice turning out to sound robotic in nature.

But that shouldn't be an immediate death sentence, since a proper OTO edit can still work miracles. Think about Zipperloid Teto Kasane, for example: her first voice bank was rather choppy and robotic, but the OTO was done perfectly, and with the right genre of song, the quality could easily be masked. And don't look down on robotic-sounding UTAUs; they are wonderful for techno-themed genres of music. (NEL/NERU or Someone Who Sings are great examples of robotic voices finding a moment to shine.)

Now, usually the creator of a UST will do everything within their power to ensure the settings suit and match the range and pitch of the song. But the downside to pre-made USTs is that they are most often created with a certain type of voice in mind.
This is a natural occurrence - think about the music industry: most pop artists sing songs that are written specifically for their vocal range and vocal sound. So sometimes a UST needs a little tinkering to ensure your UTAU of choice still shines bright like a diamond.

If a UST's vocal pitch does not suit the (natural) pitch of your UTAU's voice, there's more you need to do than simply drag the vocals up by an octave. (Always keep in mind that you should never lower or raise pre-made vocals by more than a single octave. This means that if a UST is mostly set on a G octave, you can drag it to a G#, but you absolutely cannot move it over to an F - the simple reason being that it will make the UST off key in combination with the pitch of the instrumentals.)
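
The one-octave limit can be pictured with a few lines of Python. This is only a sketch: it assumes the notes are written as MIDI-style numbers (where +1 is one semitone and +12 is one octave), and the `transpose` helper and the sample melody are made up for illustration, not something UTAU itself provides.

```python
OCTAVE = 12  # semitones in one octave


def transpose(note_numbers, semitones):
    """Shift every note by the same number of semitones (hypothetical helper)."""
    if abs(semitones) > OCTAVE:
        raise ValueError("shifting by more than one octave is not advised")
    return [note + semitones for note in note_numbers]


melody = [67, 69, 71, 72]          # G4, A4, B4, C5 as MIDI-style numbers
one_up = transpose(melody, 1)      # every note moves up one semitone
octave_down = transpose(melody, -OCTAVE)
print(one_up, octave_down)
```

Anything that has to stay in key with the vocals would need the same `semitones` value applied to it.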

1. If you decide to raise/lower the UST vocals, it might be best to raise/lower the instrumentals by the same amount. It might sound a tad weird at first, but it will keep the voice and the instruments fitting together smoothly.

2. Manually editing the UST presets themselves. One option is to click on the OPT button (you can find this button on the far right of the secondary taskbar in the UTAU program itself) - this will optimize crossfading. Most USTs are already optimized in this department, so normally there's no need for that.
A second option is to click the 'RESET' button right next to the OPT button. This one will reset the envelopes in the selected area of the UST. And lastly, click on the P2P3 button - this will connect the selected notes according to how you've edited them in the oto.ini.

3. Manual editing in post-production. This means that you render the .wav file and take it to an audio-editing software tool. Most hobbyist UTAU users make use of Audacity - including myself, with some exceptions on the side if everything else fails.
Nifty tools to use in Audacity to humanize an UTAU voice are:

- Bass and treble.
- Echo.
- Fade in/out.
- Reverb.
- Noise Reduction. (use with caution)
- GSnap (an additional plugin you can download for Audacity, that's a simplistic version of autotune)

-- Coming back to the UST itself - select all the notes (vocals) in the UST, right-click and choose 'property'.
Then adjust the settings as follows:

Modulation: 20 - 30%
Preutterance and Overlap: Clear. (You likely want to keep this one untouched, since it will rely on your oto.ini settings, unless it is your intent to change and modify individual notes.)
Consonant Velocity: this depends on the voice itself and the length of its consonants. If the consonants are on the long side, you'd want to keep this between 80 and 150. If the consonants are on the short side, you do the opposite and keep it between 50 and 100. If you think your UTAU's consonants are fine the way they are, leave this section alone.
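
To make the ranges above a bit more concrete, here is a small Python sketch. It is not how UTAU stores or applies these values: the note dictionaries, the `suggest_settings` helper, the "long"/"short" labels, and the default velocity of 100 are all assumptions made for the example.

```python
# Consonant Velocity ranges from the text; which label goes with which
# range is an assumption (80-150 for long consonants, 50-100 for short).
VELOCITY_RANGES = {"long": (80, 150), "short": (50, 100)}


def suggest_settings(notes, consonant_length=None, modulation=25):
    """Return copies of the notes with Modulation set and velocity kept in range."""
    if not 20 <= modulation <= 30:
        raise ValueError("the text suggests a Modulation between 20 and 30")
    tuned = []
    for note in notes:
        new_note = dict(note)  # leave the original note untouched
        new_note["modulation"] = modulation
        if consonant_length is not None:
            low, high = VELOCITY_RANGES[consonant_length]
            current = note.get("velocity", 100)  # 100 is an assumed default
            new_note["velocity"] = max(low, min(high, current))
        tuned.append(new_note)
    return tuned


notes = [{"lyric": "ta", "velocity": 40}, {"lyric": "ka", "velocity": 200}]
print(suggest_settings(notes, consonant_length="long"))
```

Passing `consonant_length=None` leaves the velocity alone, which matches the "fine the way they are" case.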

-- Onto manual audio editing. Open Audacity, align the vocals with the instrumentals, and then open the 'Effect' option in the taskbar. There you select 'Bass and Treble' - enter 4 for treble, hit OK, and repeat it once or twice (according to your own liking).
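
Repeating the treble boost stacks up quicker than you might expect, since decibels add up with every pass. A quick Python sketch of the arithmetic, assuming the 4 is a gain in dB and that each pass adds the same boost to the treble range (`db_to_gain` is just a helper name picked for the example):

```python
def db_to_gain(db):
    """Convert a decibel change into a linear amplitude factor."""
    return 10 ** (db / 20)


BOOST_PER_PASS_DB = 4  # the treble value used in the text

for passes in (1, 2, 3):
    total_db = BOOST_PER_PASS_DB * passes
    factor = db_to_gain(total_db)
    print(f"{passes} pass(es): +{total_db} dB, roughly x{factor:.2f} amplitude")
```

So one pass is roughly 1.58 times the original treble level, two passes about 2.51 times, and three passes about 3.98 times. This is why it pays to listen after every pass.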

Next move on to the 'Reverb' option, again select the vocals and apply these changes:

Roomsize: 90 (or 100, I tend to use 100, but to each their own)
Reverb time: 5 (or 4)

Extra options are:
Damping: 0115
Dry signal level: 0
Early reflection level: -25
Tail level: -50

The remaining options are best left in their default setting.

--If done correctly, the UTAU should sound perfectly tuned. If you're not yet satisfied with the results, you could always add a low echo effect on a chorus part of a song. I tend to use it to add power to my UTAU's voice without having to turn up the velocity or actual volume of the vocals.
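
For the curious, an echo is basically a quieter, delayed copy of the sound mixed back in. A minimal Python sketch of that idea, assuming the audio is a plain list of sample values (the `add_echo` helper and its `delay`/`decay` parameters are illustrative and not Audacity's actual code):

```python
def add_echo(samples, delay, decay):
    """Mix a delayed, quieter copy of the signal back into itself.

    delay is measured in samples; decay is how loud each repeat is
    compared to the one before it (keep it low for a subtle echo).
    """
    out = list(samples)
    for i in range(delay, len(out)):
        out[i] += decay * out[i - delay]
    return out


click = [1.0, 0.0, 0.0, 0.0, 0.0]
print(add_echo(click, delay=2, decay=0.5))
```

A single click comes back at half volume two samples later and at a quarter volume two samples after that, which is the kind of low-key tail that adds weight to a chorus without raising its volume.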

The Fade in/out effect comes in handy when your UTAU has a hard-sounding voice, and helps to smooth out ending vowels the way 'softer' sounding UTAU voices usually do on their own. (For example, imagine Miku singing taaaaaaaaa at the end of a lyric, and then imagine Ritsu singing taaaaa at the end of a lyric - where Miku's softer voice would smooth out the vowel more easily, a harder-sounding voice like Ritsu's could benefit from a low-key fade-out effect on the vowel.) I personally use it more than I can keep track of with my own UTAU, MiKO, since she also has a rather hard and powerful voice.
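
What a fade-out does to that last vowel is simple: the volume is turned down step by step until it reaches silence. A small Python sketch, assuming a straight-line (linear) fade over a plain list of samples (`fade_out` and `fade_len` are names chosen for the example):

```python
def fade_out(samples, fade_len):
    """Linearly fade the last fade_len samples down to silence."""
    out = list(samples)
    fade_len = min(fade_len, len(out))
    start = len(out) - fade_len
    for i in range(fade_len):
        gain = 1.0 - (i + 1) / fade_len  # ends at exactly 0.0
        out[start + i] *= gain
    return out


held_vowel = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(fade_out(held_vowel, fade_len=4))
```

Only the tail of the note is touched, so the start of the vowel keeps its full power.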

-Noise Reduction is a tricky one, and I do not recommend using this effect on clear-sounding or airy UTAU voices. It works best for low-quality voice banks that have leftover background static in their voice. It is also a nifty tool for calmer songs that rely heavily on the sentiment and pronunciation of the voice, rather than its power.

-GSnap is something I personally use for about 97% of my covers. It might look a bit intimidating at first, but it really is an easy tool to use to fine-tune a voice. It doesn't work miracles, but the changes it applies to vocals are noticeable.

And last but not least - the Compression and Normalization tools. I don't really recommend these, but when properly used, they are known to add power to a voice (I admit to using them with MiKO's calm append). But again, use with caution, since the Compression effect .. well, the name speaks for itself: it compresses the dynamic range of the voice, meaning the gap between its loudest and quietest parts. So unless this is used on an UTAU that has perfect pronunciation and clarity, it will garble up the voice beyond belief. You can compare it to online chats in video games, and having to listen to people scream into a very cheap headset. Audio clipping beyond belief..
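
In rough terms, a compressor squeezes the loud parts of a sound closer to the quiet parts. The Python sketch below is a heavily simplified take on that idea and not what Audacity actually does: it works on a plain list of samples, uses a linear `threshold` instead of decibels, and has no attack or release timing. The `compress` helper and its parameters are assumptions made for the example.

```python
def compress(samples, threshold, ratio):
    """Shrink everything above the threshold by the given ratio."""
    out = []
    for sample in samples:
        level = abs(sample)
        if level > threshold:
            # only the part above the threshold is scaled down
            level = threshold + (level - threshold) / ratio
        out.append(level if sample >= 0 else -level)
    return out


voice = [0.2, 0.9, -1.0]
print(compress(voice, threshold=0.5, ratio=4))
```

The quiet sample is left alone while the loud ones are pulled down toward the threshold. Turning the whole thing back up afterwards is what makes a voice feel more powerful, and also what brings up any noise or muddiness along with it.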

The Normalization tool sets the peak amplitude of the audio to a chosen level (it is recommended to keep the setting at -1.0 dB);
it can also remove the DC offset and normalize the stereo channels independently.
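
As a rough picture of what normalizing involves, here is a Python sketch for a single channel. It assumes the audio is a plain list of samples between -1.0 and 1.0; the `normalize` helper and its parameters are illustrative and not Audacity's own code.

```python
def normalize(samples, target_db=-1.0, remove_dc=True):
    """Center the audio around zero and scale its peak to target_db."""
    out = list(samples)
    if remove_dc and out:
        offset = sum(out) / len(out)        # DC offset = average sample value
        out = [s - offset for s in out]
    peak = max((abs(s) for s in out), default=0.0)
    if peak == 0.0:
        return out                          # pure silence, nothing to scale
    target_peak = 10 ** (target_db / 20)    # -1.0 dB is about 0.891
    return [s * target_peak / peak for s in out]


quiet_take = [0.30, 0.10, 0.20, 0.40]
print(normalize(quiet_take))
```

Depending on the material, this can turn a quiet take up just as easily as it turns a loud one down. Running it separately on the left and right sample lists would be the "independent stereo channels" case.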

And there we have it. I hope this was somewhat helpful.
Let me know if any of these tips worked for you!

- Miko.
