Basic Level and Global Basic Level

Each noun token was coded in the form in which it was uttered (including plurals, wordplay, diminutives, and compound nouns). To group the related forms of a noun, each word was subsequently assigned a lemma, or basic level, at both the recording level and the corpus level.

Basic Level

Basic levels were initially assigned with the goal of creating customized eyetracking experiments using the most common version of the noun that each child heard every other month. For this reason, they roughly correspond to the most common version of the noun, as assigned at the recording level. They additionally follow some idiosyncratic rules, laid out below.

Basic levels are:

  • Phonologically similar to the uttered word

  • Singular, except for default plurals like "glasses" and food items, which default to the most frequent form

  • The most frequent version within the file, with respect to singular vs. plural, diminutives, and wordplay/family names for items (e.g. if a bottle is almost always referred to as a baba, the basic level is 'baba')

  • The "head" noun of the NP, when the uttered word was a compound (e.g. dump+truck becomes 'truck') or contained a modifier (silly+monkey has the basic level 'monkey')

  • Some objects don't have a basic level because they refer to a generic category (e.g. animal), or because they would be an unpleasant or infeasible item to include in an eyetracking experiment (e.g. poop). These have NA in the basic_level column.

  • If a character name includes the animal or object category that it belongs to (e.g. Daniel+tiger or Thomas+the+train), it becomes the generic animal or object (e.g. tiger or train) at the basic level.
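
As a rough illustration, some of the rules above can be sketched in Python. This is not the lab's actual procedure: the function names and the NO_BASIC_LEVEL set are hypothetical, the head noun is assumed to be the last '+'-joined element (which fits the examples above), and the related forms of a noun within one recording are assumed to have been grouped already by a human coder. Phonological similarity and default plurals are not modeled.

```python
from collections import Counter

# Hypothetical set of words that receive NA (generic or unsuitable items).
NO_BASIC_LEVEL = {"animal", "poop"}


def head_noun(token: str) -> str:
    """Return the last '+'-joined element, e.g. 'dump+truck' -> 'truck'.

    Assumption: the head noun is always the final element of the token.
    """
    return token.split("+")[-1]


def most_frequent_form(related_tokens: list[str]) -> str:
    """Pick the most common head-noun form among related tokens in one recording."""
    counts = Counter(head_noun(t) for t in related_tokens)
    return counts.most_common(1)[0][0]


def basic_level(token: str, related_tokens: list[str]) -> str:
    """Sketch of a basic-level assignment for one token.

    related_tokens: all forms of the same noun in this recording,
    grouped beforehand by a human coder (an assumption of this sketch).
    """
    if head_noun(token) in NO_BASIC_LEVEL:
        return "NA"
    return most_frequent_form(related_tokens)
```

For instance, basic_level("bottle", ["baba", "baba", "bottle"]) picks 'baba' because it is the most frequent form in that hypothetical recording, while head_noun("Thomas+the+train") yields 'train'.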

For more specific cases of basic levels, see the Common Words Index.

Global Basic Level

In order to reconcile lemmas across the entire corpus, we subsequently assigned a global basic level, which eliminates the family-specific variability in wordform (e.g. while an utterance may have a basic level of 'doggy', its global basic level is 'dog') and assigns a lemma to words that received an NA in the basic_level assignment.

Global basic levels were assigned by a script that references an in-lab dictionary and either outputs the standard form, or notes that the uttered object is ambiguous.

To disambiguate, a human coder listened to the token and assigned the corresponding global basic level for a given homophone. For example, a token of 'glasses' could refer to eyeglasses, in which case its global basic level is 'glasses,' or to multiple drinking glasses, in which case its global basic level is 'glass.'
