Quote Originally Posted by ChrisCall
Just speaking from my own experience here, I have found it handles spoken word exceptionally well in general.
I've heard some fantastic results on rap/hip-hop tracks etc. The flip side is that when the instrumentation is bare, every tiny missed detail stands out. It's easier to scrub in spectral editing since there isn't much sound on the spectrum to dig through ... but it all sticks out.

But really, since it's trained on voice, so much depends on how well it recognizes a specific type of voice sound, and how much it mistakes certain instrumentation for voice. That's why having multiple models could potentially prove very useful. Reverb kinda fits into that category as well. This model is trained on different music than the primary AI over in the other thread. The primary one can't handle reverb nearly as well as this one.
I agree with this 100%. I just got my new PC parts today, so I'll be ready to start doing some tests this week! Regarding rkeanes' question, I put together a dataset of trailer music with and without movie dialogue to see how well it learns to separate spoken word. I found a YouTube channel that has official instrumentals for trailers, so I'm hoping it pans out! I will probably include rap in that dataset as well, because the trailer dataset isn't quite big enough on its own. For anyone curious, the basic idea is just pairing each dialogue mixture with its matching official instrumental; a rough sketch of how that pairing might look is below.
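
To be clear, this is just a minimal illustration of the pairing step, not my actual pipeline. The folder names, the assumption that mixtures and instrumentals share identical filenames, and the CSV manifest format are all made up for the example.

    # Sketch: pair trailer mixtures (music + dialogue) with their official
    # instrumentals and write a training manifest. All paths and the
    # filename-matching convention below are hypothetical.
    import csv
    from pathlib import Path

    MIX_DIR = Path("trailers/with_dialogue")   # mixtures: music + spoken word
    INST_DIR = Path("trailers/instrumental")   # official instrumentals (targets)

    pairs = []
    for mix in sorted(MIX_DIR.glob("*.wav")):
        inst = INST_DIR / mix.name             # assumes identical filenames
        if inst.exists():
            pairs.append((mix, inst))
        else:
            print(f"no instrumental match for {mix.name}, skipping")

    with open("dataset_manifest.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["mixture", "instrumental"])
        writer.writerows((str(m), str(i)) for m, i in pairs)

    print(f"paired {len(pairs)} mixture/instrumental tracks")

The real work is in verifying the pairs actually line up (same edit, same length, same mastering), but a manifest like this is the starting point before any training run.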

I will be doing A LOT of experiments. This will take some time!