This concept grew out of the need to have a computer produce a sequence of digits for a short-term memory experiment that Clive Frankish, then a graduate student, was trying to set up (Frankish, 1984). Clive was looking at the effects of grouping on serial recall. The basic rate of presentation of the digits was 2 items per second, but the experiment required an extra 50 milliseconds (ms) between groups of 3. This is why we decided to use computer-controlled sequences rather than recording a basic sequence on tape and then splicing in blank tape at the appropriate points: there would have been too much splicing to do. Steve Marcus was a Cambridge undergraduate who was doing a project with me, using the computer to generate speech, and he joined in. We loaded the spoken digits into the computer, each one in its own file. A control program then simply called up the files and output them in the order we wanted, one every 500 ms. However, when we tried it, the digits sounded irregular, with the “eight” in particular sounding rushed and the “six” sounding late. Listen for yourselves by pressing the button [awaiting audio].
Hearing this, we imagined that there was something the matter with the program, so we took it to pieces and redid the whole thing. The result was the same. Could it be a hardware failure? But when we repeated the same digit a number of times it sounded absolutely regular – and that worked for all the digits. It didn’t make sense. Finally one of us made a mental sideways move and asked the question:
“When you hear the digits coming at regular intervals, what has to be regular?”
This is, of course, the first question that we should have asked, but the assumption that you align things by their onsets was too strong for us to question.
If it was not the beginning of the sounds, then what could it be? We tried aligning the digits by vowel onset, which sounded better but was still not right. So we ran a little experiment in which we alternated a pair of digits, adjusting their onsets until they sounded regular. We discovered that the “perceptual centre” was not the onset, the offset or the vowel onset but some complex function of them all. Figure 1 shows the waveforms for the digits 1 to 9, aligned so that they sound regular. As you can see, there is a difference of about 80 ms between the onsets of “six” and “eight”. Listen to the adjusted digits here.
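The adjustment can be illustrated with a minimal Python sketch. It is not the original control program, only a reconstruction of the arithmetic under stated assumptions: each digit has a single fixed P-centre offset measured from its acoustic onset, and a digit is started early by that amount so that its P-centre, rather than its onset, falls on the beat. The function names, the `start_ms` lead-in and the offset values are all hypothetical; the offsets are chosen only so that “six” and “eight” differ by the 80 ms mentioned above.

```python
def beat_times_ms(n_items, interval_ms=500, group_size=3, group_gap_ms=50, start_ms=0):
    """Target beat for each item: one every interval_ms, plus an
    extra group_gap_ms after every complete group of group_size."""
    return [
        start_ms + i * interval_ms + (i // group_size) * group_gap_ms
        for i in range(n_items)
    ]


def onset_times_ms(sequence, p_centre_offset_ms, beats):
    """Playback onset for each item, shifted earlier by that item's
    P-centre offset so the P-centre coincides with the beat."""
    return [beat - p_centre_offset_ms[item] for item, beat in zip(sequence, beats)]


# Hypothetical offsets in ms from acoustic onset to P-centre.
# Only the 80 ms difference between the two reflects the text.
P_CENTRE_OFFSET_MS = {"six": 100, "eight": 20}

sequence = ["eight", "six", "eight", "six"]
beats = beat_times_ms(len(sequence), start_ms=200)
onsets = onset_times_ms(sequence, P_CENTRE_OFFSET_MS, beats)

for item, beat, onset in zip(sequence, beats, onsets):
    print(f"{item:>5}: beat at {beat} ms, start playback at {onset} ms")
```

With these assumed values the beats are evenly spaced within a group while the onsets are not, which is the pattern visible in the aligned waveforms. Since the P-centre turned out to be a complex function of several properties of the sound, a fixed per-digit offset like this would have to be found by listening, as in the alternation experiment, rather than computed from the onset alone.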
This was an interesting finding because it had an instant application, although unfortunately not a patentable one. P-centre adjustment has become standard in all automated announcements, on the telephone or on the train, for example, and where it has not been used, the phrasing of the words sounds unnatural. It also turns out that P-centres are important in speech recognition and have been implicated in one theory of dyslexia. The idea itself can be taken beyond speech. A dancer is supposed to make a certain movement, a step, say, on the beat. We can ask what part of the movement coincides with the beat. Is it the initiation of the movement, the completion, or the instant the foot first makes contact with the floor? For another example, if a double bass and a piano play a note “at the same time”, the double bass player will have to start his movement well in advance, because the double bass note takes time to build up. The production centres and the perceptual centres do not correspond.