Noun token timestamps
Each noun token in the output data has an associated onset and offset timestamp in ms. Roughly, these timestamps correspond to the start and end of the utterance containing that instance of the noun.
These timestamps generally allow researchers to locate the token and extract it from the audio file, but they SHOULD NOT be used to calculate the duration of words or of the utterances containing them: there is high variance in how close each timestamp falls to the true beginning and end of an utterance.
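For example, a token can be clipped out of a recording by slicing on its onset and offset. The sketch below is one minimal way to do this with pydub; the file path, column values, and output name are illustrative, not fields or files from the dataset itself.

```python
from pydub import AudioSegment  # pip install pydub; most formats also require ffmpeg

# Hypothetical values: in practice these come from one row of the output data.
onset_ms, offset_ms = 123400, 125900

recording = AudioSegment.from_file("recording.wav")  # illustrative path
token_clip = recording[onset_ms:offset_ms]           # pydub slices by milliseconds
token_clip.export("token_clip.wav", format="wav")
```

Because the timestamps tend to be broader than the utterance (see below), the resulting clip should be treated as a window around the token, not as a precise utterance boundary.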
Below we discuss how these timestamps were generated for each data type:
Video
Onset and offset times were marked manually in Datavyu. Coders listened to a video file and were instructed to create a new cell at the start of the utterance containing a codeable noun, then hit a "mark offset" button when they reached the end of the utterance.
Due to human error, variation in reaction time, and buffering while the video played, these timestamps may not align reliably with utterance boundaries. Most commonly they are too broad, encompassing more time than the utterance itself: coders prioritized capturing the token within the marked boundaries over matching the real-world duration of the utterance.
Coders were instructed to copy and paste the onset and offset timestamps when multiple nouns occurred in the same utterance.
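One consequence of this convention is that video tokens sharing an identical onset/offset pair were coded as belonging to the same utterance, so they can be grouped on those values. A minimal sketch with pandas, using hypothetical column names (`noun`, `onset`, `offset`) and toy values rather than the dataset's actual fields:

```python
import pandas as pd

# Toy stand-in for the video portion of the output data.
video_tokens = pd.DataFrame({
    "noun":   ["ball", "dog", "cup"],
    "onset":  [1000, 1000, 5200],
    "offset": [2500, 2500, 6900],
})

# Tokens with an identical (onset, offset) pair were coded within the same utterance.
for (onset, offset), group in video_tokens.groupby(["onset", "offset"]):
    print(onset, offset, list(group["noun"]))
```

Note that this exact-match grouping is only appropriate for the video data; as described below, it does not hold for the audio data.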
Audio
Audio timestamps are generated by the LENA algorithm. Onset and offset times correspond to the algorithm's estimate of an utterance, defined as continuous speech from one identified speaker category (e.g. one utterance is a stretch of speech by FAN, female-adult-nearby). These timestamps do not correspond to listener-identified syntactic utterances for several reasons:
Because these are automated analyses rather than human-generated annotations,
Because multiple adjacent speakers may be identified as a single speaker category, and
Because the analysis includes speaker categories such as "overlapping speech" and "electronic noise," which do not align with human utterances.
Each LENA-identified segment corresponds to one line in the CLAN .cha file that was generated for audio transcription. Coders were instructed to annotate a noun token on the line that was playing when the speaker started to say the noun, regardless of the speaker identified by LENA.
Therefore, nouns that were spoken within the same utterance may nonetheless have different onset and offset timestamps, and there may be more than one utterance encapsulated in a single set of timestamps.
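For the audio data, then, exact timestamp matching is not a reliable way to decide whether two tokens share an utterance. A looser (and still imperfect) heuristic is to check whether two tokens' timestamp windows overlap at all; the sketch below illustrates that idea and is not part of the coding scheme itself.

```python
def windows_overlap(onset_a, offset_a, onset_b, offset_b):
    """Return True if two timestamp windows (in ms) share any time.

    Overlap suggests the tokens fall within the same or adjacent LENA
    segments; it does not guarantee they belong to the same utterance.
    """
    return onset_a < offset_b and onset_b < offset_a

# Adjacent LENA segments that merely abut do not overlap.
print(windows_overlap(1000, 2500, 2500, 4000))  # False
print(windows_overlap(1000, 2500, 2000, 4000))  # True
```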