Monday 12 October 2015

waves - Frequency shift without affecting signal length


Non-physicist here.


From what I learned at university, and from what common sense says, a shift in the frequency of a signal results in a change in its length in time. For example, if a sinusoidal signal of frequency $f$ and length $t$ is transformed to the frequency domain, its frequency is divided by $2$, and it is then converted back to the time domain, the length of the signal would be $2t$.


Correct me if I'm wrong! But this is quite intuitive. If you make a signal slower, it would take more time to finish and vice versa.
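To make this intuition concrete, here is a minimal sketch assuming NumPy and a made-up 440 Hz test tone. The helper names `resample_linear` and `dominant_frequency` are hypothetical, chosen only for this illustration. Slowing the samples down by a factor of two doubles the length and halves the frequency at the same time:

```python
import numpy as np

def resample_linear(signal, factor):
    """Play `signal` back `factor` times faster (factor < 1 slows it down).

    Uses plain linear interpolation; no anti-aliasing filter is applied.
    """
    n_out = int(round(len(signal) / factor))
    positions = np.arange(n_out) * factor  # fractional read positions in the input
    return np.interp(positions, np.arange(len(signal)), signal)

def dominant_frequency(signal, sample_rate):
    """Frequency (Hz) of the strongest bin in the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

sample_rate = 8000                           # assumed sample rate in Hz
t = np.arange(sample_rate) / sample_rate     # one second of time stamps
tone = np.sin(2 * np.pi * 440.0 * t)         # 440 Hz test tone

slowed = resample_linear(tone, 0.5)          # half speed

print(len(tone), dominant_frequency(tone, sample_rate))
print(len(slowed), dominant_frequency(slowed, sample_rate))
```

The second line of output should show twice as many samples and a peak near half the original frequency, which is exactly the coupling between speed and pitch described above.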


This is a reason why I've been told it's impossible to change frequency of a streaming signal and output it with the same speed. For example, it would be impossible to take a man's speech and change it to sound like a woman without making it faster.


Now I've had various observations:




  • In Coursera, you can increase the speed of course lectures, and the sound is of course sped up too. However, there is no rise in the pitch of the speaker's voice. In fact, the speaker sounds very similar to how they sound at normal speed. How is it possible that changing the speed of the signal didn't affect its frequency?

  • My amplifier has a dial for shifting the pitch. So, while playing, you can hear the output with a different pitch. Again, how is this possible? If the amplifier is raising the pitch, shouldn't its output be faster than its input? (i.e. a contradiction!) I suspect some trick though, as the output sounds rather "artificial".


It seems that it is somewhat possible to change the frequency of a signal without affecting its length. One (theoretical) way I can think of would be to understand the current "notes" playing (i.e. the different frequencies making the signal) in very small intervals, change the note and replay them for the duration of that interval.


My question is, first, whether my understanding is at all correct. Either way, is there a mathematical way of changing the pitch of a signal without affecting its length in time? If not, how is it being done in practice?



Answer



You are correct that it's impossible to change the frequency of every component of a waveform while (a) preserving all the phase relationships between the frequencies and (b) preserving the length of the waveform. However, while it's mathematically impossible to do this precisely, it is possible to do things that sound quite a lot like it to the human ear. The most basic algorithms for this are fairly simple, but doing it well is pretty difficult, and the algorithms are being improved all the time. It is referred to as "pitch shifting" if the goal is to change the pitch but not the time, and "time stretching" if you want to change the time while keeping the pitch.


The reason this is possible is that the ear distinguishes rhythm and pitch as quite distinct things. A sequence of clicks will sound like a sequence of clicks until its frequency passes about 20-30 Hz, at which point it transitions into a buzzing sound that's perceived as having a pitch. Because of this, as DumpsterDoofus said, you can chop the sound up into little segments of short duration, change the pitch of each one individually, and then stitch them back together again, repeating things or leaving gaps where necessary.


By itself this doesn't sound so good, because you get a click at the end of each segment, where the waveforms don't line up. It can be improved a lot by overlapping "windows" of sound, which each have a fade-in and fade-out applied. That way you don't hear the clicks at the ends of the segments, but you do tend to hear phase-cancellation effects where the windows are added back together again. For this reason this technique requires quite a bit of tuning to get it to sound good on any given input sound.
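The overlapping-window idea can be sketched roughly as follows, assuming NumPy. This is only an illustration under simplifying assumptions, not a production algorithm: the function names `ola_time_stretch` and `pitch_shift`, the Hann window, the window size of 1024 and the 75% overlap are all choices made for this example. Windows are read from the input at one spacing and written to the output at another, which changes the length but roughly keeps the pitch; resampling the result back to the original length then changes the pitch instead.

```python
import numpy as np

def ola_time_stretch(signal, stretch, window_size=1024):
    """Make `signal` about `stretch` times longer by overlap-adding windows.

    Requires len(signal) >= window_size. No phase alignment is attempted,
    so cancellation artefacts between overlapping windows are expected.
    """
    synth_hop = window_size // 4             # window spacing in the output
    analysis_hop = synth_hop / stretch       # window spacing in the input
    window = np.hanning(window_size)         # fade-in / fade-out shape
    n_frames = int((len(signal) - window_size) / analysis_hop) + 1
    out_len = (n_frames - 1) * synth_hop + window_size
    output = np.zeros(out_len)
    norm = np.zeros(out_len)                 # running sum of window weights
    for i in range(n_frames):
        start_in = int(round(i * analysis_hop))
        start_out = i * synth_hop
        segment = signal[start_in:start_in + window_size]
        output[start_out:start_out + window_size] += segment * window
        norm[start_out:start_out + window_size] += window
    return output / np.maximum(norm, 1e-8)   # undo the overlapping gain

def pitch_shift(signal, ratio, window_size=1024):
    """Raise the pitch by `ratio` while keeping roughly the same length.

    Stretches in time first, then reads the result back `ratio` times faster
    with linear interpolation (no anti-aliasing filter, a simplification).
    """
    stretched = ola_time_stretch(signal, ratio, window_size)
    n_out = int(len(stretched) / ratio)
    positions = np.arange(n_out) * ratio
    return np.interp(positions, np.arange(len(stretched)), stretched)

sample_rate = 8000                                   # assumed sample rate in Hz
t = np.arange(2 * sample_rate) / sample_rate         # two seconds
tone = np.sin(2 * np.pi * 440.0 * t)                 # 440 Hz test tone

longer = ola_time_stretch(tone, 2.0)                 # about twice as long
higher = pitch_shift(tone, 2.0)                      # about an octave up

print(len(tone), len(longer), len(higher))
```

Because nothing here lines up the phases of neighbouring windows, the pitch of the output is only approximately preserved, and on real audio the phase-cancellation effects mentioned above are audible. That is the part that needs tuning.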


There are also Fourier-domain techniques, as Brandon Enright said. These also usually use overlapping windows; I guess the advantage of using the Fourier domain is that you have more control over the phase relationships. (I don't know very much about these techniques.)
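One common Fourier-domain approach is the phase vocoder. The following is a minimal sketch of that technique, assuming NumPy. It is not necessarily what any particular product uses, and the function name `phase_vocoder_stretch`, the FFT size and the hop sizes are assumptions made for this example. Each window is transformed, the phase advance between consecutive windows is used to estimate the true frequency in each bin, and the phases are then re-accumulated at the output spacing so that overlapping windows stay coherent.

```python
import numpy as np

def phase_vocoder_stretch(signal, stretch, n_fft=1024):
    """Time-stretch `signal` by roughly `stretch` using a basic phase vocoder.

    The actual stretch is hop_s / hop_a after rounding the analysis hop.
    Requires len(signal) >= n_fft.
    """
    hop_s = n_fft // 4                                # synthesis (output) hop
    hop_a = max(1, int(round(hop_s / stretch)))       # analysis (input) hop
    window = np.hanning(n_fft)
    n_frames = (len(signal) - n_fft) // hop_a + 1
    # Centre frequency of each FFT bin in radians per sample.
    bin_freqs = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft
    output = np.zeros((n_frames - 1) * hop_s + n_fft)
    prev_phase = None
    synth_phase = None
    for i in range(n_frames):
        frame = signal[i * hop_a:i * hop_a + n_fft] * window
        spectrum = np.fft.rfft(frame)
        mag, phase = np.abs(spectrum), np.angle(spectrum)
        if i == 0:
            synth_phase = phase.copy()
        else:
            # Deviation of the measured phase advance from the bin-centre value.
            delta = phase - prev_phase - bin_freqs * hop_a
            delta = (delta + np.pi) % (2 * np.pi) - np.pi   # wrap to [-pi, pi)
            true_freq = bin_freqs + delta / hop_a           # radians per sample
            # Advance the output phase by the same frequency over the output hop.
            synth_phase = synth_phase + true_freq * hop_s
        prev_phase = phase
        out_frame = np.fft.irfft(mag * np.exp(1j * synth_phase), n=n_fft)
        start = i * hop_s
        output[start:start + n_fft] += out_frame * window
    # Constant gain of overlapping squared Hann windows (about 1.5 at 75% overlap).
    return output / (np.sum(window ** 2) / hop_s)

sample_rate = 8000                                    # assumed sample rate in Hz
t = np.arange(2 * sample_rate) / sample_rate          # two seconds
tone = np.sin(2 * np.pi * 440.0 * t)                  # 440 Hz test tone

stretched = phase_vocoder_stretch(tone, 2.0)
print(len(tone), len(stretched))
```

The output fades in and out at the very ends because the first and last windows have no neighbours to overlap with. For a steady tone the pitch is preserved closely; transients and noisy sounds are where this simple version tends to sound smeared.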



A fairly recent development is the "pitch synchronous overlap-add" (PSOLA) algorithm. This uses overlapping windows as well, but it synchronises the length of the windows to the pitch of the input sound. This makes it much harder for the ear to perceive the individual windows. I call it "the algorithm that ruined music", because it's responsible for that nasty "autotune" effect that's been overused on the vocals of every pop record for the last ten years or so. However, it does also have peaceful applications, and the pitch dial on your amplifier probably uses some version of it.


You can find more details about these algorithms on Wikipedia.

