# Noun Annotations

Each annotation has five elements: the coded **noun**, a code for the **utterance type**, a code for **object presence**, a three-character code for **speaker**, and a **unique identifier**.

## Noun

Trained coders identified each concrete, imageable noun that was said clearly in proximity to the target child, in English. Nouns did not have to be child-directed in order to be coded.

### Formatting

Nouns were coded as they were said: plurals, diminutives, wordplay, and speech errors were coded as they occurred for all speakers except the target child (the participant child who was wearing the recording device). See [Target Child Productions](https://seedlings-nouns.bergelsonlab.com/noun-annotations/target-child-productions) for more information.

Words were spelled according to the [New Oxford American Dictionary](https://www.oxfordlearnersdictionaries.com/us/definition/american_english/).

For diminutives that are not in the dictionary, coders added a "y" for singular and "ies" for plural.

When more than one word was coded as the noun (as in the case of compound nouns, book titles, and character names), a plus sign was used to string the words together (e.g. "kitty+cat").

Punctuation was not included. This was mostly relevant when coding book titles (e.g. Where's Baby's Belly Button was coded as "Wheres+babys+belly+button").

Other than the first letter of book titles, country names, place names (e.g. "Niagara+falls"), holidays ("Christmas+tree"), and character names, no capital letters were used (e.g. "ipad" for iPad, "tv" for TV). Capital letters do not occur in any placement except the first letter of a coded noun (including book titles).
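The formatting conventions above can be illustrated with a minimal Python sketch. This is not part of the original coding workflow, which was done by trained coders. The function name `format_noun` and its `proper` flag are hypothetical, and the handling of hyphens and non-ASCII punctuation is an assumption.

```python
import string


def format_noun(phrase: str, proper: bool = False) -> str:
    """Format a spoken noun phrase using the conventions described above.

    `proper` is a hypothetical flag marking book titles, place names,
    holidays, and character names, which keep one initial capital letter.
    """
    # Drop ASCII punctuation (apostrophes, periods, question marks, ...).
    # Assumption: hyphens are removed like any other punctuation mark.
    cleaned = phrase.translate(str.maketrans("", "", string.punctuation))
    # Lowercase everything and string the words together with plus signs.
    coded = "+".join(cleaned.lower().split())
    # A capital letter may only appear as the first letter of the coded noun.
    if proper and coded:
        coded = coded[0].upper() + coded[1:]
    return coded


print(format_noun("kitty cat"))
print(format_noun("Where's Baby's Belly Button?", proper=True))
print(format_noun("iPad"))
```

Whether a noun counts as a title, place, holiday, or character name is a coder's judgment, so the sketch takes that decision as an input rather than trying to infer it.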

### Specific details about nouns

* Color words: typically not coded. Sometimes coded as informative modifiers (e.g. if a parent references a child's toy as "purple+monkey").
* Letters/Numbers: only coded if there was a depiction of the letter or number, as in a book or flashcard, e.g. "Here is a four!"
  * Letters are coded in lower case as the individual letter, not its spelling, e.g. "l" not "el"
* People words (e.g. Mom, Dad, Grandma): not coded.
  * Exceptions: "baby" was coded, but not when used as a nickname (in "You're a sweet baby," "baby" would be coded; in "Do you want a bottle, baby?" it would not be coded).
  * People words were coded when used as character names in a book. These are identifiable by the initial capital letter ("Grandma+jones").
  * Possessives: coded as main noun only ("kitty's bowl" is just "bowl").
  * Proper nouns: pet names and people names were not coded.
* Book titles and book characters were coded and formatted using one capital letter at the beginning of the noun, with plus signs between each word.
* Sign Language: nouns were not coded if they occurred **only** in sign language.
* In cases of ambiguous syntax, where a word could be a verb or an object word, the word was counted as a noun (e.g. "Do you wanna *drum*?").

### Common Words Index

The [Common Words Index](https://docs.google.com/spreadsheets/d/1lfvV4W8wzwpeN4xe5YomSBqe_LFJYr3YbnLWeC3c_dI/edit?pref=2\&pli=1#gid=0) contains many common nouns and determinations about how they should appear in the SEEDLingS dataset. The "Include (Y/N)" column provides a yes or no answer as to whether a word should be coded or not.

## Utterance Type

Each noun-containing utterance was coded as belonging to one of the following categories:

* d: declarative/statement
* i: imperative/command
* q: question
* r: reading
* s: singing
* n: non-utterance (a word in isolation)

### Specific details about utterance types

Words said in isolation, and noun phrases, e.g. 'ooh, ball!', 'nice kitty.', 'applesauce!', were coded as **n (non-utterance).**

Noun phrases longer than 3 words were not coded as non-utterances, even if the speaker did not say a full sentence (e.g. "prunes on your pants!" would be coded as a **declarative**).

Any time two object words were used in a phrase (e.g. "Bat and ball!"), coders marked the phrase as **d (declarative)**.

Asking a child to say a phrase (even a question phrase) was coded as an **imperative** (e.g. "Baby, say, 'Where's the doggy?'").

Utterances beginning with "Let's" (e.g. "Let's go to the park") were coded as **declarative**. Utterances beginning with "Let me" (e.g. "Let me see your hand") were coded as **imperative**.

Nursery rhymes and chanting were counted as singing if the speaker was using a sing-song tone that was distinct from their normal talking voice. Things like Patty Cake and This Little Piggy counted as singing on a case-by-case basis, i.e. in cases where the speaker's tone and use of rhythm were different from that speaker's usual child-directed speech.

Book titles were **not** coded as reading. Coders determined the utterance type according to the rest of the sentence. If the speaker said only the book title (even if they were reading the title page), it was coded as **"n."** Examples:

* "Dr. Seuss's ABCs" = n
* "Do you want to read Llama Llama Red Pajama?" = q
* "Oh, I love Dr. Seuss's ABC's!" = d

Declarative sentences with tag questions (e.g. "This is your ball, right?") were usually coded as **declarative** if they otherwise sounded like a declarative sentence.

### Ranking

The following system was used to rank the utterance types (which type takes precedence over the other types when more than one utterance type could be used).

* reading > singing (if both reading and singing, coded as "r")
* reading, singing > non-utterance, question, imperative, declarative (if an utterance was both reading and non-utterance, coded as "r")
* non-utterance > question, imperative, declarative (if prosody was used to make a single noun sound like a question, it was coded as "n," not "q")
* question > declarative, imperative (if an utterance was both a question and a command or declarative, e.g. "Get your blocks?" or "You like cookies?", it was coded as "q," not "d" or "i")
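The ranking above can be written as a small lookup, shown here as a hedged Python sketch. The name `resolve_utterance_type` is hypothetical. The ranking gives no order between declarative and imperative, so the sketch treats that pair as a tie and raises an error instead of guessing.

```python
from typing import Iterable

# Lower number = higher precedence, following the ranking above.
# "d" and "i" share a rank because no order between them is stated.
RANK = {"r": 0, "s": 1, "n": 2, "q": 3, "d": 4, "i": 4}


def resolve_utterance_type(candidates: Iterable[str]) -> str:
    """Return the code that takes precedence among the candidate codes."""
    codes = set(candidates)
    if not codes:
        raise ValueError("at least one candidate code is required")
    unknown = codes - set(RANK)
    if unknown:
        raise ValueError(f"unknown utterance type code(s): {sorted(unknown)}")
    best = min(RANK[code] for code in codes)
    winners = sorted(code for code in codes if RANK[code] == best)
    if len(winners) > 1:
        # Only reachable for {"d", "i"}: the ranking does not cover this case.
        raise ValueError("no ranking is given between 'd' and 'i'")
    return winners[0]


print(resolve_utterance_type({"r", "s"}))  # reading outranks singing
print(resolve_utterance_type({"n", "q"}))  # non-utterance outranks question
print(resolve_utterance_type({"q", "i"}))  # question outranks imperative
```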

## Object Presence

In video recordings, the coder used the visual scene available from small cameras that the baby wore on a hat as well as a full-room-view tripod camera to determine if the object was present or not.

In audio recordings, the coder used the audio context to make their best inference as to whether the object was present.

Object presence was coded as y (yes, object is present) or n (no, object is not present).

### Specific details about object presence

Objects were coded as present if they were being touched, held, or pointed to by the **speaker** or **baby**, or if the baby was likely visually focusing on the object (a determination that could be made easily in video files and not as easily in audio files).

Body parts were coded as present if the speaker touched the body part on the baby (e.g. "You have something on your head," while touching the baby's head: "head" is coded as **present**).

Objects that were audible but not visible (e.g. hearing a lawnmower but not seeing it from a window) were **not** coded as present.

If the speaker took out, pointed to, or touched the object immediately after naming it (within two seconds) then it was coded as **present.**

Object presence was determined by assessing whether the object was acting as an appropriate symbol, and not whether the actual object itself was present (e.g. if a child was using a container as a drum, the object "drum" was marked as **present**).

If the coder was genuinely unsure whether the object was present (a 50/50 judgment), they could use the code "u" for unknown object presence.

## Speaker

Speaker codes are three characters long and use capital letters, with a digit where needed to tell similar speakers apart (e.g. SI1 and SI2). They provide information about the speaker's age (adult or child), gender (male or female), and relationship to the child (aunt, cousin, etc.). Speaker codes were generated on a per-subject basis, so a speaker who recurs in several of a child's files should have the same speaker code across files (e.g. child X has two sisters, A and B; SI1 will always refer to sister A and SI2 will always refer to sister B).

A new speaker code was created for times when two or more speakers said a noun in unison.

For a speaker code key, [see here](https://seedlings-nouns.bergelsonlab.com/speaker-code-key).

## Unique Identifier

Each individual noun that was coded in the dataset has a unique 8-character identifier that begins with "0x" followed by a random set of six alphanumeric characters.

These unique identifiers are needed because the same noun sometimes appears more than once with the same timestamp (and identical codes for utterance type, object presence, and speaker). The unique identifiers (also referred to as "annotid," short for "annotation id") allow us to disambiguate the annotations and tie each one to the actual instance of the word in the recording.
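As an illustration, the identifier format described above can be checked with a short Python sketch. The function names and the example identifiers are made up for this sketch. Accepting both upper- and lower-case letters is an assumption, since the description only says "alphanumeric".

```python
import re
from collections import Counter
from typing import Iterable, List

# "0x" followed by exactly six alphanumeric characters (8 characters total).
# Assumption: both upper- and lower-case letters are accepted.
ANNOTID_PATTERN = re.compile(r"0x[0-9A-Za-z]{6}")


def is_valid_annotid(annotid: str) -> bool:
    """Check that a string has the annotid shape described above."""
    return ANNOTID_PATTERN.fullmatch(annotid) is not None


def duplicated_annotids(annotids: Iterable[str]) -> List[str]:
    """Return identifiers that occur more than once (ideally an empty list)."""
    counts = Counter(annotids)
    return sorted(a for a, n in counts.items() if n > 1)


# Hypothetical identifiers, for illustration only.
print(is_valid_annotid("0xa1b2c3"))
print(duplicated_annotids(["0xa1b2c3", "0xd4e5f6", "0xa1b2c3"]))
```

A check like `duplicated_annotids` is useful precisely because the other fields cannot tell repeated annotations apart: two rows may be identical in noun, timestamp, utterance type, object presence, and speaker, and differ only in annotid.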