I've been mulling some ideas over, and I find that the best way to clear my own mind is to put words on a page. Here, I'd like to speculate on why we stutter, and why we stutter when we do. In this case, my question does not refer to ultimate causes, such as a possible mis-wiring in the brain, or proximate causes, like a learned anxiety over saying one's own name. I'm looking at a level of cause somewhere in between the two extremes.
I'll start with the observation that slowing down speech decreases the occurrence of stutter blocks, and a dramatic slowdown can eliminate them. So I'll take it that a certain rate of speech can trigger stutter blocks. Much research has looked at the origin of blocks. Linguistic sources have been examined (the accessing of words from memory) as well as motor plan failures (the instructions to carry out coordinated muscle movements to generate words).
Rather than choosing among theoretically possible sources of the error and exploring them, I prefer to start from what I know. When stutter blocks occur (at least in chronic, adult-stage stutterers), they typically occur as failures of coarticulation within syllables. Whatever the nature of the original failure, the basic pathology of stutter is this temporary loss of the ability to coarticulate phonemes (sounds) within a syllable.
Whatever the neurological origin of the stutter block, or the psycho-social trigger that cues it to occur, we know two things: that the failure is a failure of coarticulation, and that the failure doesn't occur when the rate of speech is slowed down sufficiently. So what can we do with these two facts?
First, we need to examine the process of coarticulation. In generating syllables and words out of individual sounds, human speech does not simply connect the sounds in sequence, like beads on a string. The articulation of an initial sound will be modified by putting the articulators (lips, tongue, jaw) in position to immediately generate the following sound. My example from a previous entry was Seesaw (or See-Saw). When the speaker begins to produce the 's' sound in the syllable 'See,' the articulators are already in place to produce the 'ee' sound. The 's' and 'ee' sounds are not produced through independent actions - they have become a single, simplified unit. The same is true in the production of the syllable 'saw.'
Coarticulation allows us to speak significantly faster than independent, sequential production of individual sounds would allow. We can imagine how this might have been important to our ancient ancestors on the African plains. The ability to transmit the information "Hey, Nog, there's a lion in the grass right behind you!" would be critical to the survival of the individual and the family group. Somewhere along the line, human speech was turbo-charged by this sound-blending, articulation-modifying process.
In speeding up speech, coarticulation also puts stress on the rest of the speech production process. Words must be accessed faster, and proper grammatical structures built at the same time. While the average rate of speech measured in syllables per second varies somewhat across languages, the differences are not great. Speech production is a complicated, rapid process, like nothing else we do until we take up musical instruments and practice for many years.
Native English speakers have been found to average in the range of 5-6 syllables per second. If we take 5 syllables per second as a working number, each syllable takes an average of a fifth of a second to produce. But this is an average taken over many sentences. When we look closer, we see that if we account for pauses between sentences, the rate within sentences would actually be slightly higher. And we need to account for the fact that the syllables of multi-syllabic words are spoken faster than mono-syllabic words. In the sentence "She's a beautiful girl," the three-syllable word 'beautiful' is spoken in approximately the same time as 'girl.' So while an average gives us a useful approximation of syllable rate, in actual performance some sequential syllables are generated faster than others.
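To make the arithmetic concrete, here is a rough sketch in Python. Only the 5 syllables per second figure and the 'beautiful'/'girl' comparison come from the discussion above. The 10% pause share and the 0.4 second word duration are numbers I made up purely for illustration, not measurements, and the function names are my own.

```python
def syllable_duration(rate_per_second: float) -> float:
    """Average seconds per syllable at a given syllable rate."""
    return 1.0 / rate_per_second


def within_sentence_rate(overall_rate: float, pause_fraction: float) -> float:
    """Syllable rate during actual speech, if pause_fraction of the
    total time is silence between sentences."""
    return overall_rate / (1.0 - pause_fraction)


def per_syllable_time(word_duration: float, syllables: int) -> float:
    """Seconds available for each syllable of a single word."""
    return word_duration / syllables


overall = 5.0            # working number from the text (syllables/second)
pause_fraction = 0.10    # ASSUMPTION: hypothetical share of time spent pausing
word_duration = 0.4      # ASSUMPTION: hypothetical duration shared by both words

print("average syllable duration:", syllable_duration(overall), "s")
print("rate within sentences:", round(within_sentence_rate(overall, pause_fraction), 2), "syll/s")
print("'girl' per syllable:", per_syllable_time(word_duration, 1), "s")
print("'beautiful' per syllable:", round(per_syllable_time(word_duration, 3), 3), "s")
```

Whatever values are plugged in for the two assumed numbers, the pattern is the same: any time spent pausing pushes the within-sentence rate above the overall average, and each syllable of 'beautiful' gets about a third of the time that 'girl' gets.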
Within syllables that are produced sequentially in a fifth of a second or faster, the coarticulation process must not only produce multiple sounds, but must modify the motor plan for producing them into new, hybrid forms. I have no training in speech science, but this seems to me to be the most difficult part of speech production to execute.
So my answer to the question 'why does stuttering happen?' would be that coarticulation happens at the most rapid time-scale of all speech processes. That makes it the weakest link in the speech production process, and the most likely to fail at normal (rapid) speech rates. Rate-controlled slow speech eliminates stutter blocks because it relieves the break-neck speed requirement of the normal speech process. That's speculation through and through, but speculation based on a reasoned examination of what we know.
One puzzle is that while adults tend to block more during (longer) content words, children beginning to stutter are reported to block more on (shorter) function words. One explanation for this difference is that children might block on the word before the longer content word that is actually triggering the block. This is possible, but if this tendency of early stutterers to block on short, often single-syllable words is correct, then it tends to go against my proposal. For now, I'll leave it up in the air as merely a thought-provoking suggestion.