Friday, November 2, 2018 | Session B, Conference Auditorium | 2:30pm

A Rhythm Account of Word Segmentation Tasks
F. Wang, J. Trueswell, J. Zevin, T. Mintz

Statistical learning accounts posit that infants can compute statistics over input that lacks prosodic cues [1, henceforth ASN]. However, there is a growing body of findings that cannot be explained by the tracking of transitional probabilities. For example, infants fail to segment words in a mixed-length artificial language that contains both 2-syllable and 3-syllable words [2, henceforth JT]. We offer a novel explanation for these findings and propose that the perception of rhythm plays a crucial role in studies of word segmentation. We developed a computational model that explains word segmentation in terms of rhythm perception, without the computation of transitional probabilities.

The model has two parameters, interval and phase, and segments a continuous sequence based on its rhythmic structure. If we select any element in the sequence and record the positions of its successive occurrences as p1, p2, p3, …, a rhythmic sequence has the property that the positional increments of that element are all multiples of a certain interval. The interval can be computed as the greatest common factor (gcf) of the positional increments between successive occurrences of any element (see Figure 1):

gcf(p2-p1, p3-p2, …)
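
The interval computation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `estimate_interval` and the toy syllable stream are assumptions made for the example, and the stream is not the actual ASN stimulus set. It is built so that one syllable recurs with increments of 6 and 9.

```python
from functools import reduce
from math import gcd

def estimate_interval(sequence, element):
    """Return the gcf of the positional increments between successive
    occurrences of `element` (hypothetical helper, for illustration only)."""
    positions = [i for i, s in enumerate(sequence) if s == element]
    increments = [b - a for a, b in zip(positions, positions[1:])]
    # gcd(0, x) == x, so starting from 0 leaves the first increment unchanged;
    # with fewer than two occurrences the result is 0 (interval undefined).
    return reduce(gcd, increments, 0)

# Toy stream of three-syllable words (illustrative syllables only).
toy_words = ["tu pi ro", "go la bu", "tu pi ro", "bi da ku", "go la bu", "tu pi ro"]
toy_stream = " ".join(toy_words).split()

# "tu" occurs at positions 0, 6 and 15, so the increments are 6 and 9.
print(estimate_interval(toy_stream, "tu"))  # 3
```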

The phase parameter is selected by minimizing the number of unique subsequences that result from segmenting the sequence at that phase (see Figure 2). Below we provide simulations showing that the model captures patterns of human data from multiple experiments.
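
Phase selection can be sketched as follows. The abstract does not specify how incomplete chunks at the edges or ties between phases are handled, so this sketch assumes that incomplete chunks are dropped and that ties go to the smallest phase. The names `segment` and `select_phase` and the toy stream are hypothetical.

```python
def segment(sequence, interval, phase):
    """Cut `sequence` into chunks of length `interval`, starting at offset
    `phase`. Incomplete chunks at either edge are dropped (an assumption)."""
    return [tuple(sequence[i:i + interval])
            for i in range(phase, len(sequence) - interval + 1, interval)]

def select_phase(sequence, interval):
    """Return the phase in 0..interval-1 that yields the fewest unique chunks.
    Ties are resolved in favor of the smallest phase (an assumption)."""
    return min(range(interval),
               key=lambda phase: len(set(segment(sequence, interval, phase))))

# Toy language of three three-syllable words (illustrative syllables only).
A, B, C = ("tu", "pi", "ro"), ("go", "la", "bu"), ("bi", "da", "ku")
order = [A, B, C, A, C, B, A, B, C, B, A, C]
stream = [syllable for word in order for syllable in word]

print(select_phase(stream, 3))      # 0: cutting at word onsets gives 3 unique chunks
print(select_phase(stream[1:], 3))  # 2: dropping one syllable shifts the best phase
```

In this toy stream, the misaligned phases produce six unique chunks (one per adjacent word pair), while the aligned phase recovers only the three words.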

Simulation 1. Figure 1 shows how ASN can be rhythmically segmented without tracking transitional probabilities. The positional difference between the first and second occurrences of the syllable tu is 6 (p2-p1), and that between the second and third occurrences is 9 (p3-p2). The interval is thus gcf(6, 9) = 3. The phase is then selected to minimize the number of unique subsequences (Figure 2); for ASN, phase = 0 yields the minimum.

Simulation 2. JT created a mixed-word-length language sequence by concatenating two-syllable and three-syllable words (such that transitional probabilities (TPs) within and between words are the same as in ASN). Figure 3 shows that the greatest common factor settles at 1, indicating no rhythmicity in the input. The model therefore predicts that this language has no rhythmic cues to segmentation and should fail to be segmented. The fact that infants failed to learn in the mixed-length condition is consistent with our proposal that these studies can be explained by rhythm perception.
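
To illustrate why mixed word lengths drive the gcf to 1, the sketch below tracks the running gcf as more occurrences of a syllable are observed. The helper `running_interval` and the toy words are assumptions for illustration; they are not JT's stimuli or the authors' code.

```python
from itertools import accumulate
from math import gcd

def running_interval(sequence, element):
    """Return the gcf of the positional increments of `element` after each
    new occurrence is observed (hypothetical helper, for illustration only)."""
    positions = [i for i, s in enumerate(sequence) if s == element]
    increments = [b - a for a, b in zip(positions, positions[1:])]
    return list(accumulate(increments, gcd))

# Toy mixed-length stream: two- and three-syllable words (illustrative only).
mixed_words = ["tu pi ro", "bi da", "go la bu", "tu pi ro", "ku ti", "bi da", "tu pi ro"]
mixed_stream = " ".join(mixed_words).split()

# "tu" occurs at positions 0, 8 and 15, so the increments are 8 and 7.
print(running_interval(mixed_stream, "tu"))  # [8, 1]
```

Once two increments share no common factor, the running gcf stays at 1 however much more input is added, which corresponds to the absence of a rhythmic interval.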

Lastly, we conducted a replication of JT with adult participants, asking whether adults can compute TPs in the absence of rhythm or, like infants, must rely on rhythm for segmentation. Adult participants (N=60) were exposed to either a uniform-word-length or a mixed-word-length language and were asked to rate the familiarity of words vs. part-words (using a scale from “Definitely” to “Definitely Not”). There was no significant learning in the mixed-word-length condition (β=-0.333, z=-1.65, p=0.100) and significant learning in the uniform-word-length condition (β=-0.933, z=-4.52, p<0.001), with a reliable interaction (β=-0.600, z=-2.07, p=0.038, Figure 4). Thus, in the absence of rhythmic cues, even adults fail to segment stimuli that could in principle be segmented using transitional probabilities.

We argue that the perception of rhythm, rather than the computation of transitional probabilities, holds explanatory power for many statistical learning studies.