Schnee (schnee) wrote,

Shy gypsy, slyly spryly tryst by my crypt

A few days ago, I came across the word "mystifyingly", and couldn't help but notice how unusual its vowels were: a total of five, alternating between "y" and "i", with no others present.

I wondered if there were any others with the same vowel pattern (as I'd like to call it), so when I remembered again tonight, I grabbed a wordlist[1], slapped together a quick Perl script and checked[2]. The result? Nope, "mystifyingly" is the only one.


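use strict;
use warnings;
use feature qw/say/;

my %freq = ();

while(<>) {
    chomp;    # drop the trailing newline so the words print cleanly
    # Strip everything that is not a vowel; "y" always counts as one.
    (my $vowels = $_) =~ s/[^aeiouy]//g;
    push @{ $freq{$vowels}->{'words'} }, $_;
    $freq{$vowels}->{'count'}++;    # the later snippets sort and filter on this
}

say join ", ", @{ $freq{'yiyiy'}->{'words'} };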
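
To make the extraction step concrete, here's a minimal sketch that wraps the same substitution in a little helper and runs it on three words from this post. The name vowel_pattern is made up for illustration, and the lc call is an extra assumption (the wordlist appears to be all lowercase anyway).

```perl
use strict;
use warnings;
use feature qw/say/;

# Hypothetical helper: delete everything that is not a, e, i, o, u or y.
# Like the script, it treats "y" as a vowel unconditionally.
sub vowel_pattern {
    my ($word) = @_;
    (my $vowels = lc $word) =~ s/[^aeiouy]//g;
    return $vowels;
}

say "$_: ", vowel_pattern($_) for qw/mystifyingly syzygy facetiously/;
```

This should print "yiyiy", "yyy" and "aeiouy" for the three words, respectively.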

But of course the fun doesn't stop there. Want to know the least/most common vowel patterns?

my $min_length = 1;
# Sort patterns by ascending count, so the most common ones come last.
foreach (sort { $freq{$a}->{'count'} <=> $freq{$b}->{'count'} } grep { length($_) >= $min_length } keys %freq) {
    say "$_: ", $freq{$_}->{'count'};
}

It turns out that the most common vowel pattern is "ae" (3313 hits), followed by "ie" (2710), "oe" (2044) and "ue" (1714). Interestingly, "ai" is ever so slightly more common than "ee" (1608 vs. 1602); the most common three-vowel pattern is "aie" (1572), while the most common four-vowel pattern is "eaie" (477). For five, it's "eiaio" (80), and after that you get "eaiaio" (45), "eaiiaio" (12), "eeoeeaoa" (4), "aiiiuioai" (3), "ioiiauiaio" (2), and finally "aiieaieaiai" and "oueeouioaie" (1 each).

Do you want to guess what those last two words are? One is (comparatively) easy to guess, the other's more difficult.

Gave it some thought? Cool! The easy one is "antidisestablishmentarianism", of course, an old pal as far as quirky words are concerned. The other one turns out to be "counterrevolutionaries" — that's a new one!
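
Reading the per-length winners off the sorted output takes a bit of fiddling with $min_length. A rough sketch of a one-pass alternative could look like the following; most_common_by_length is a hypothetical helper name, and the little %freq here is toy data made up for the demo, not the real counts.

```perl
use strict;
use warnings;
use feature qw/say/;

# Hypothetical helper: for each pattern length, keep the pattern with the
# highest count. Expects the same structure as %freq in the post
# (pattern => { count => N, words => [...] }). Ties go to the pattern
# that sorts first alphabetically.
sub most_common_by_length {
    my ($freq) = @_;
    my %best;
    foreach my $pattern (sort keys %$freq) {
        my $len = length $pattern;
        $best{$len} = $pattern
            if !exists $best{$len}
            || $freq->{$pattern}{count} > $freq->{ $best{$len} }{count};
    }
    return \%best;
}

# Toy data, invented purely for illustration.
my %freq = (
    ae  => { count => 3, words => [qw/made late same/] },
    ie  => { count => 2, words => [qw/time line/] },
    aie => { count => 1, words => [qw/native/] },
);

my $best = most_common_by_length(\%freq);
say "$_: $best->{$_}" for sort { $a <=> $b } keys %$best;
```

On the toy data this picks "ae" for length two and "aie" for length three.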

There are many other unique vowel patterns, of course, not just "yiyiy" or "oueeouioaie". What's the shortest one? We'll find out:

# Longest patterns first, so the shortest unique ones end up at the bottom.
foreach (sort { length($b) <=> length($a) } grep { $freq{$_}->{'count'} == 1 } keys %freq) {
    say "$_: ", join ", ", @{ $freq{$_}->{'words'} };
}

Turns out that there are three patterns of three vowels that are only found in one word each: "uyu", "eyy" and "yyy". Additionally, there's a false hit ("yuo") where the "y" represents a consonant sound. Want to guess what these are? Here are the answers: "yuo" (the false hit) is "yukon"; "uyu" is "subphylum", "eyy" is "greyly", and "yyy" is "syzygy". Whew, that last one was tough!

Interestingly, the latter two are also the two shortest English words (in this wordlist!) having a unique vowel pattern.

And "yyy" is curious in another way as well, since it only contains one distinct vowel. Let's generate all patterns of that sort and see how many words matching each there are:

foreach my $num (1..5) {
    # Match patterns made up of exactly $num copies of a single vowel.
    foreach (grep { $_ =~ m/^(?:a{$num}|e{$num}|i{$num}|o{$num}|u{$num}|y{$num})$/ } keys %freq) {
        say "$_: ", $freq{$_}->{'count'};
        say "$_: ", join ", ", @{ $freq{$_}->{'words'} } if $freq{$_}->{'count'} < 15;
    }
}

This reveals that, among other things, there are 12 words on this list with a "yy" vowel pattern, and 11 with a "uuu" pattern; what's more, there are two with a "uuuu" pattern, one "aaaaa", nine "eeeee", and, amazingly, one "iiiii". Want to play the guessing game again?

Here are the solutions: "dryly", "flyby", "flybys", "gypsy", "pygmy", "shyly", "slyly", "spryly", "stymy", "sylphy", "thymy", "wryly"; "cumulus", "jujutsu", "jujutsus", "succubus", "susurrus", "tumultus", "tumulus", "untrustful", "untruthful", "usufruct", "usufructs"; "muumuu", "muumuus"; "abracadabra", "beekeeper", "beekeepers", "defenselessness", "effervescence", "enfeeblement", "enfeeblements", "freewheelers", "reemergence", "representee"; "libidinizing".

Isn't that fascinating?

Finally, it turns out that there are two English words (again, on this list!) that have all the vowels in order, each precisely once ("aeiouy"). One of these two words is guessable; the other — well, I'd not bet on it, but if you want to try anyway, be my guest. Or just check the solutions: "facetiously", and "abstemiously".

Alas, there is no word that has all the vowels backwards ("yuoiea"), though there is one word with a "uoiea" pattern: all the vowels backwards, each precisely once, except for "y". Want to guess again? It's "subcontinental". If you have any other ideas for interesting patterns to look for, let me know.
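
Neither of these lookups needs new machinery: both are plain hash lookups in %freq, with the backwards pattern built via reverse. Here's a minimal sketch, assuming %freq has been filled as in the first script; to keep the snippet runnable on its own, it's stubbed with just the words mentioned in this post, and words_for is a made-up helper name.

```perl
use strict;
use warnings;
use feature qw/say/;

# Stub standing in for the real %freq; only the words named in the post.
my %freq = (
    aeiouy => { count => 2, words => [qw/facetiously abstemiously/] },
    uoiea  => { count => 1, words => [qw/subcontinental/] },
);

# Hypothetical helper: words for a pattern, or an empty list if unseen.
# The exists check avoids autovivifying an entry for a missing pattern.
sub words_for {
    my ($freq, $pattern) = @_;
    return exists $freq->{$pattern} ? @{ $freq->{$pattern}{words} } : ();
}

my $forward  = 'aeiouy';
my $backward = reverse $forward;    # scalar context: "yuoiea"

foreach my $pattern ($forward, $backward, 'uoiea') {
    my @words = words_for(\%freq, $pattern);
    say "$pattern: ", @words ? join(", ", @words) : "(none)";
}
```

The exists check matters a little: dereferencing $freq{'yuoiea'}->{'words'} directly would quietly create an empty entry for that pattern, which could then show up in later loops over keys %freq.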

Thank you, and good night!

  1. Note that this is a file with DOS line endings, so you should probably run it through dos2unix(1) or so first. Note also that the word "gland" appears to have an extra trailing carriage return. And note finally that the list is, of course, incomplete: to give just two examples, the words "freewheeler" and "coriolis" are missing (even though the former's plural is on the list).
  2. Take the script's output with a grain of salt, of course: it treats "y" as a vowel unconditionally, which is not always the correct approach.
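
One rough way to soften that last caveat would be to treat a word-initial "y" that's directly followed by another vowel as a consonant, which is enough to catch the "yukon" false hit. This is only a heuristic sketch, not something the script above does, and it will still misjudge plenty of words (the consonantal "y" in "beyond", for instance); vowel_pattern_y is a made-up name.

```perl
use strict;
use warnings;
use feature qw/say/;

# Hypothetical variant of the pattern extraction: drop a leading "y" when it
# is directly followed by a, e, i, o or u, then strip non-vowels as before.
sub vowel_pattern_y {
    my ($word) = @_;
    (my $vowels = lc $word) =~ s/^y(?=[aeiou])//;
    $vowels =~ s/[^aeiouy]//g;
    return $vowels;
}

say "$_: ", vowel_pattern_y($_) for qw/yukon syzygy greyly/;
```

With this, "yukon" comes out as "uo", while "syzygy" and "greyly" keep their "yyy" and "eyy" patterns.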
Tags: english, interesting stuff, linguistics, perl, programming