Research colloquium: Computational & experimental psycholinguistics

Week 11: Stimulus design

Anna Pryslopska

2024-06-20

Stimulus design

Items and conditions

Target sentences/phrases/words/items

→ what you test

Control sentences/phrases/words/items

→ what you compare against

Testing costs time and money.

How to get the strongest effect with the least effort?

→ Test the same contrast multiple times.

  • Option 1: test many participants
  • Option 2: test many items
  • Option 3: test both → repeated-measures experiment

People are lazy and good at pattern recognition.

They will notice a sentence they already read and not process it the same way.

→ Test different sentences.

Do I have to make new items for each participant?

I’m lazy too.

→ “Recycle” the sentences.

Show every sentence, but in only one version per person. Participants don’t talk to each other, only take the experiment once, or forget.

Juggling these constraints is impossible.

How to ensure that all participants read all sentences and see all conditions equally often with no repetitions?

→ Divide et impera.

How do I make sure participants don’t catch on to the research question?

Observing changes the outcome.

→ Hide your design.

Stimulus design best practices

Clear and simple

Linguistic sentences are famously convoluted.

Die Leonie begeisterte den Nikolas immer wieder, weil er, als ein muskulöser Hausmeister, so wunderschön singen nicht konnte.
Leonie always delighted Nikolas because, as a muscular janitor, he couldn’t sing so beautifully.

Use clear and simple language → no technical terms

Avoid complex or ambiguous structures → cause confusion

Comparable and consistent

If your sentences vary in length and complexity, this will add noise to your data.

Balance your sentences for:

  • length: more words/letters/syllables are more difficult
  • complexity: complex syntax is more difficult
  • frequency: rare words are more difficult
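
A rough first check for length balance is to compare mean word and letter counts per condition. The sketch below assumes the stimuli are available as an array of objects with the hypothetical fields `condition` and `sentence`. It does not cover word frequency or syntactic complexity, which need external resources.

```javascript
// Sketch: mean sentence length per condition.
// The field names `condition` and `sentence` are assumptions for illustration.
function lengthByCondition(stimuli) {
  const totals = {};
  for (const { condition, sentence } of stimuli) {
    const words = sentence.trim().split(/\s+/).length;
    // Count letters only (no spaces, digits, or punctuation).
    const letters = sentence.replace(/[^\p{L}]/gu, '').length;
    if (!totals[condition]) totals[condition] = { n: 0, words: 0, letters: 0 };
    totals[condition].n += 1;
    totals[condition].words += words;
    totals[condition].letters += letters;
  }
  const means = {};
  for (const [condition, t] of Object.entries(totals)) {
    means[condition] = {
      meanWords: t.words / t.n,
      meanLetters: t.letters / t.n,
    };
  }
  return means;
}

// Two conditions of two example items.
const lengthExample = [
  { condition: 'B', sentence: 'The cook baked Lucy a cake.' },
  { condition: 'D', sentence: 'The cook baked a cake for Lucy.' },
  { condition: 'B', sentence: 'The bartender poured the customer a drink.' },
  { condition: 'D', sentence: 'The bartender poured a drink for the customer.' },
];
console.log(lengthByCondition(lengthExample));
```

A systematic difference between conditions (here the extra word "for") belongs to the manipulation itself. Differences between items are the ones to keep small.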

Natural and plausible

The sentences should be understandable without additional context.

The only thing that outlandish sentences measure is your creativity.

Factual errors will irritate and distract participants.

Avoid bias and distressing content

Be mindful of potential cultural, gender, and socioeconomic biases in sentence materials.

Make the sentences neutral and non-controversial.

Make the sentences appropriate for your participants (PG).

Avoid sensitive or potentially distressing content.

Avoid humorous content.

Get started

Try DeepL (deepl.com) or AI (e.g. chatgpt.com) for translating stimuli.

Try Wikipedia or AI for inspiration

→ provide AI with background info and examples

Use the spellchecker and look for the the frequent errors.

Example prompt

I am making a psycholinguistic experiment on the noisy channel framework. Can you help me make 100 item sentences in 4 conditions in English? Here are some examples of what I need:
The cook baked a cake Lucy.
The cook baked Lucy a cake.
The cook baked Lucy for a cake.
The cook baked a cake for Lucy.
The bartender poured a drink the customer.
The bartender poured the customer a drink.
The bartender poured the customer for a drink.
The bartender poured a drink for the customer.

Fillers

People are lazy and good at pattern recognition.

They catch on to obvious, awkward, or convoluted sentences:

Wenn eine junge Frau Besitzerin eines Friseursalons ist, dann putzt sie den meist mit Hilfe einer Putzfirma.
When a young woman owns a hairdressing salon, she usually cleans it with the help of a cleaning company.

→ Divert their attention

→ Hide your design with fillers.

Fillers or distractor items

Task: mask the purpose of the study.

  • Blend seamlessly with the experimental items
  • Neutral, plausible, unbiased, non-distressing, uncontroversial
  • Similar to items in length, complexity, structure, word frequency, naturalness, plausibility …
  • Different from items in all ways that matter (depends on your design: different words, syntax, topic, pattern …)
  • In acceptability rating studies: set the rating boundaries

How and where to hide the items?

→ Disperse items among fillers

At least a 1:1 item-to-filler ratio; 1:3 is better.

Make the sentence order random-ish.

Lists

A list is a set of stimuli a participant is asked to process during an experiment.

Typically, the number of lists equals the number of conditions in your experiment.

  • All participants read the same number of sentences.
  • All participants read each one of the items.
  • All participants read the conditions equally often.
  • No participant reads the same sentence twice in the same condition.
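
These constraints can be checked mechanically once the lists exist. The following is a minimal sketch, assuming each list is an array of rows with the hypothetical fields `item` and `condition` and that fillers are left out. It returns a list of problems, which is empty when all constraints hold.

```javascript
// Sketch: check the list constraints. Each list is an array of
// { item, condition } rows (assumed field names); fillers are left out.
function checkLists(lists, items, conditions) {
  const problems = [];
  lists.forEach((rows, idx) => {
    const name = `List ${idx + 1}`;
    // All lists must have the same length.
    if (rows.length !== lists[0].length) {
      problems.push(`${name}: length differs from List 1`);
    }
    // Every item must occur exactly once per list.
    for (const item of items) {
      const n = rows.filter(r => r.item === item).length;
      if (n !== 1) problems.push(`${name}: item ${item} occurs ${n} times`);
    }
    // Every condition must occur equally often per list.
    const counts = conditions.map(
      c => rows.filter(r => r.condition === c).length);
    if (new Set(counts).size !== 1) {
      problems.push(`${name}: condition counts are ${counts.join('/')}`);
    }
  });
  return problems;
}

// Example: four items in a Target/Control design, split over two lists.
const twoConditionLists = [
  [
    { item: 1, condition: 'Target' }, { item: 2, condition: 'Control' },
    { item: 3, condition: 'Target' }, { item: 4, condition: 'Control' },
  ],
  [
    { item: 1, condition: 'Control' }, { item: 2, condition: 'Target' },
    { item: 3, condition: 'Control' }, { item: 4, condition: 'Target' },
  ],
];
console.log(checkLists(twoConditionLists, [1, 2, 3, 4], ['Target', 'Control']));
```

Equal condition counts per list are only possible when the number of items is a multiple of the number of conditions.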

How to ensure the right number of items and conditions across all lists?

Unmanageable!

→ Latin square design

Latin square

Latin square design

  • Control the distribution of items and conditions across lists.
  • Balance the presentation of stimuli in the most efficient way.

→ Perfect randomization is impossible. This is the next best thing.

Start out with item list

Item Condition List
1 Target
1 Control
2 Target
2 Control
3 Target
3 Control
4 Target
4 Control

Color code the conditions

Add the list column

Populate the Latin Square

Item Condition List
1 Target 1
1 Control 2
2 Target 2
2 Control 1
3 Target 1
3 Control 2
4 Target 2
4 Control 1

Latin square with 3 conditions

Item Condition List
1 A 1
1 B 2
1 C 3
2 A 2
2 B 3
2 C 1
3 A 3
3 B 1
3 C 2
4 A 1
4 B 2
4 C 3

The resulting Latin square (rows: items 1 to 3, columns: conditions A to C, cells: list numbers):
1 2 3
2 3 1
3 1 2

Latin square with 4 conditions

Item Condition List
1 A 1
1 B 2
1 C 3
1 D 4
2 A 2
2 B 3
2 C 4
2 D 1
3 A 3
3 B 4
3 C 1
3 D 2
4 A 4
4 B 1
4 C 2
4 D 3

The resulting Latin square (rows: items 1 to 4, columns: conditions A to D, cells: list numbers):
1 2 3 4
2 3 4 1
3 4 1 2
4 1 2 3
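
The rotation in the tables above follows a simple rule: list = ((item − 1) + condition index) mod number of conditions + 1, with the condition index starting at 0. A minimal sketch of that rule, where the function name `latinSquare` and the row fields are chosen here for illustration:

```javascript
// Sketch: assign list numbers by rotating conditions across items.
// list = ((item - 1) + conditionIndex) mod numberOfConditions + 1
function latinSquare(nItems, conditions) {
  const k = conditions.length;
  const rows = [];
  for (let item = 1; item <= nItems; item++) {
    conditions.forEach((condition, c) => {
      rows.push({ item, condition, list: ((item - 1 + c) % k) + 1 });
    });
  }
  return rows;
}

// Four items in three conditions, as in the three-condition table.
for (const row of latinSquare(4, ['A', 'B', 'C'])) {
  console.log(row.item, row.condition, row.list);
}
```

With 4 items and 3 conditions the lists cannot be fully balanced (List 1 gets condition A twice), so the number of items should be a multiple of the number of conditions.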

Presentation order

Participants behave differently at the start of the experiment than at the end. Over time they may have:

  • become tired, bored, or distracted
  • recognized the target items/manipulation
  • gotten better at the task
  • been primed by previous items

Randomization and pseudorandomization are used to balance the order of presentation and avoid bias.

Randomization

Completely random order, no constraints, all items and fillers are equally likely to appear at any point.

✔️ unbiased (most “natural”)

✔️ easy to implement (only random numbers)

❌️ no control over accidental patterns

❌️ no control over sequence (conditions could cluster)

❌️ no true randomness
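
Complete randomization can be sketched with a Fisher-Yates shuffle. `Math.random` is a pseudorandom number generator, which is one sense in which there is no true randomness. The optional `random` parameter is an addition here that makes the function testable with a fixed number source.

```javascript
// Sketch: Fisher-Yates shuffle. Returns a new array in random order
// and leaves the input array unchanged.
function shuffle(array, random = Math.random) {
  const result = array.slice();
  for (let i = result.length - 1; i > 0; i--) {
    // Pick a position from 0..i and swap it into place i.
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

console.log(shuffle(['item 1', 'item 2', 'filler 101', 'filler 102']));
```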

Pseudorandomization

Almost random order, but follows certain constraints to prevent unwanted patterns.

✔️ controlled randomness (no accidental patterns)

✔️ stimuli are evenly spread out (no clusters)

❌ complex to plan and implement

❌ not random (can still introduce bias)
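
One possible constraint is that two experimental items never appear back to back. The sketch below enforces only this one constraint: it shuffles items and fillers separately and then drops each item into its own randomly chosen gap between fillers. The function name `interleave` is made up for this example, and other constraints (such as no repeated conditions in a row) would need additional checks.

```javascript
// Fisher-Yates shuffle (returns a new array).
function shuffle(array, random = Math.random) {
  const result = array.slice();
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Sketch: put every item into its own gap between fillers,
// so that no two items are adjacent.
function interleave(items, fillers, random = Math.random) {
  if (items.length > fillers.length + 1) {
    throw new Error('Not enough fillers to separate all items.');
  }
  const shuffledItems = shuffle(items, random);
  const shuffledFillers = shuffle(fillers, random);
  // Gap g lies before filler g; the last gap lies after the last filler.
  const allGaps = [...Array(fillers.length + 1).keys()];
  const chosen = new Set(shuffle(allGaps, random).slice(0, items.length));
  const order = [];
  let next = 0;
  for (let g = 0; g <= shuffledFillers.length; g++) {
    if (chosen.has(g)) order.push(shuffledItems[next++]);
    if (g < shuffledFillers.length) order.push(shuffledFillers[g]);
  }
  return order;
}

console.log(interleave(['item 1', 'item 2'], ['f 101', 'f 102', 'f 103']));
```

With a 1:1 ratio this leaves very few valid orders, which is one more reason to prefer a higher filler ratio.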

Attention checks

Participants often try to get away with the least amount of effort → don’t pay attention.

Attention checks are secondary tasks (e.g. recall, embedded question, comprehension question, response time checks). They:

  • ensure that participants focus and complete the main task
  • assess participants’ performance and identify bad participants
  • identify difficult items and fillers
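
Assessing performance usually comes down to the proportion of correct answers per participant. A minimal sketch, assuming responses are stored as objects with the hypothetical fields `participant` and `correct`. The cutoff of 0.8 is only a placeholder, not a recommendation from this course.

```javascript
// Sketch: accuracy on attention checks per participant.
// Field names and the default cutoff are assumptions for illustration.
function accuracyByParticipant(responses) {
  const stats = new Map();
  for (const { participant, correct } of responses) {
    const s = stats.get(participant) || { n: 0, hits: 0 };
    s.n += 1;
    if (correct) s.hits += 1;
    stats.set(participant, s);
  }
  return [...stats].map(([participant, s]) => ({
    participant,
    accuracy: s.hits / s.n,
  }));
}

// Participants whose accuracy falls below the cutoff.
function flagLowAccuracy(responses, cutoff = 0.8) {
  return accuracyByParticipant(responses)
    .filter(p => p.accuracy < cutoff)
    .map(p => p.participant);
}

const checkResponses = [
  { participant: 'p1', correct: true }, { participant: 'p1', correct: true },
  { participant: 'p2', correct: false }, { participant: 'p2', correct: true },
];
console.log(flagLowAccuracy(checkResponses));
```

The same grouping by item instead of by participant would point to difficult items and fillers.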

Comprehension questions

Comprehension questions follow all or some items and ask about the item just shown, right after it has disappeared.

What was the previous section about?

A: Randomization

B: Latin Square design

Have I shown you a spreadsheet?

Yes

No

Ideal comprehension questions:

  • easy to answer if you completed the task
  • have only one answer
  • balanced answers (50/50 yes/no, 50/50 left and right)
  • short and clearly phrased
  • consistent (not mixing yes/no and A/B)
  • test the whole item

Feedback

Can help participants improve and focus.

Usually only in the practice part of the experiment.

→ A delay or negative feedback will annoy participants.

→ Sometimes having a slight delay or feedback appear ONLY when the participants get something wrong can be a useful motivator.

Workflow

  1. Make items
  2. Make fillers
  3. Make comprehension questions (optional)
  4. Ensure you have all items in all conditions
  5. Correct spelling and typos
  6. Latin Square design
  7. Generate lists
  8. (Pseudo-)randomize

Spreadsheet

Recommended for this course: Google Sheets (share on ILIAS) https://docs.google.com/spreadsheets/

Figure out what columns you need (at least: item, condition, sentence, question/rating, list)

Populate the columns

Item Condition Start Target Spillover Sentence List
1 A The cook baked a cake Lucy The cook baked a cake Lucy. 1
1 B The cook baked Lucy a cake The cook baked Lucy a cake. 2
1 C The cook baked Lucy for a cake The cook baked Lucy for a cake. 3
1 D The cook baked a cake for Lucy The cook baked a cake for Lucy. 4
Item Condition Start Target Spillover Sentence List
1 A The cook baked a cake Lucy The cook baked a cake Lucy. 1
1 B The cook baked Lucy a cake The cook baked Lucy a cake. 2
1 C The cook baked Lucy for a cake The cook baked Lucy for a cake. 3
1 D The cook bakeb a cake for Lucy The cook baked a cake for Lucy. 4

Splitting sentences may make it easier to find typos, but then you need to put them back together. Don’t forget punctuation!

=TEXTJOIN(delimiter, ignore_empty, text1, [text2, ...])
=TEXTJOIN(" ", TRUE, C2, D2, E2)

=CONCATENATE(string1, [string2, ...])
=CONCATENATE(C2:E2)

=CONCATENATE(TEXTJOIN(" ", TRUE, C2, D2, E2), ".")
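
The same join can be used as a consistency check: rebuild each sentence from its parts and compare it with the Sentence column. The sketch below mirrors the last formula in JavaScript. The way the example sentence is split into three parts is an assumption, and unlike TEXTJOIN the function also trims stray spaces.

```javascript
// Sketch: join the split columns with single spaces, skip empty cells,
// and add the final punctuation mark.
function joinRegions(regions, punctuation = '.') {
  const text = regions
    .map(r => String(r).trim())
    .filter(r => r !== '')
    .join(' ');
  return text + punctuation;
}

// Report rows whose rebuilt sentence differs from the Sentence column.
function findMismatches(rows) {
  return rows
    .filter(r => joinRegions(r.regions) !== r.sentence)
    .map(r => `${r.item}${r.condition}`);
}

const splitRows = [
  { item: 1, condition: 'B', regions: ['The cook', 'baked Lucy', 'a cake'],
    sentence: 'The cook baked Lucy a cake.' },
  { item: 1, condition: 'D', regions: ['The cook', 'bakeb a cake', 'for Lucy'],
    sentence: 'The cook baked a cake for Lucy.' },
];
console.log(findMismatches(splitRows));
```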

List generation

  1. Automatically color code your conditions.
  2. Latin Square
    → each list gets every item in exactly one condition, plus all fillers
  3. Make a new sheet (not a new document!) for each list
  4. Run the Apps Script code
  5. Check lists (number of items and conditions per list, same length, etc.)

Items sheet:
Item Condition Start Target Spillover Sentence List
1 A The cook baked a cake Lucy The cook baked a cake Lucy. 1
1 B The cook baked Lucy a cake The cook baked Lucy a cake. 2
1 C The cook baked Lucy for a cake The cook baked Lucy for a cake. 3
1 D The cook baked a cake for Lucy The cook baked a cake for Lucy. 4
2 A The father bought a bicycle his son The father bought a bicycle his son. 2
2 B The father bought his son a bicycle The father bought his son a bicycle. 3
2 C The father bought his son for a bicycle The father bought his son for a bicycle. 4
2 D The father bought a bicycle for his son The father bought a bicycle for his son. 1

Fillers sheet:
Item Condition Start Target Spillover Sentence List
101 filler The gardener watered the rose this morning The gardener watered the rose this morning.
102 filler The cat sat on the mat in a hat The cat sat on the mat in a hat.
103 filler Last night a storm flooded the street Last night a storm flooded the street.
104 filler No dog barks louder than ours No dog barks louder than ours.

List 1:
Item Condition Start Target Spillover Sentence List
1 A The cook baked a cake Lucy The cook baked a cake Lucy. 1
2 D The father bought a bicycle for his son The father bought a bicycle for his son. 1
101 filler The gardener watered the rose this morning The gardener watered the rose this morning.
102 filler The cat sat on the mat in a hat The cat sat on the mat in a hat.
103 filler Last night a storm flooded the street Last night a storm flooded the street.
104 filler No dog barks louder than ours No dog barks louder than ours.

List 2:
Item Condition Start Target Spillover Sentence List
1 B The cook baked Lucy a cake The cook baked Lucy a cake. 2
2 A The father bought a bicycle his son The father bought a bicycle his son. 2
101 filler The gardener watered the rose this morning The gardener watered the rose this morning.
102 filler The cat sat on the mat in a hat The cat sat on the mat in a hat.
103 filler Last night a storm flooded the street Last night a storm flooded the street.
104 filler No dog barks louder than ours No dog barks louder than ours.

List 3:
Item Condition Start Target Spillover Sentence List
1 C The cook baked Lucy for a cake The cook baked Lucy for a cake. 3
2 B The father bought his son a bicycle The father bought his son a bicycle. 3
101 filler The gardener watered the rose this morning The gardener watered the rose this morning.
102 filler The cat sat on the mat in a hat The cat sat on the mat in a hat.
103 filler Last night a storm flooded the street Last night a storm flooded the street.
104 filler No dog barks louder than ours No dog barks louder than ours.

List 4:
Item Condition Start Target Spillover Sentence List
1 D The cook baked a cake for Lucy The cook baked a cake for Lucy. 4
2 C The father bought his son for a bicycle The father bought his son for a bicycle. 4
101 filler The gardener watered the rose this morning The gardener watered the rose this morning.
102 filler The cat sat on the mat in a hat The cat sat on the mat in a hat.
103 filler Last night a storm flooded the street Last night a storm flooded the street.
104 filler No dog barks louder than ours No dog barks louder than ours.

Script for Google Sheets

/**
 * This script distributes materials from the "Items" and "Fillers" sheets over lists (default is List1 and List2).
 * - Clears existing content in 'List1' and 'List2' before copying new data.
 * - Copies the header row from the 'Items' sheet to 'List1' and 'List2'.
 * - Copies rows from the 'Items' sheet to 'List1' or 'List2' based on the value in the 'LIST' column.
 *   Rows with any other list value are skipped; add more sheets and branches for more lists.
 * - Appends all rows from the 'Fillers' sheet to 'List1' and 'List2'.
 *
 * Note: Ensure the 'Items' sheet contains a header row with a column labeled 'LIST' for categorization.
 * 'List1' and 'List2' sheets must exist in the document.
 * Both sheets are emptied and refilled each time this function is run, so any manual changes to the lists will be overwritten.
 */

function generateLists() {
  const ss = SpreadsheetApp.getActiveSpreadsheet();
  const it = ss.getSheetByName('Items');
  const l1 = ss.getSheetByName('List1');
  const l2 = ss.getSheetByName('List2');
  const fi = ss.getSheetByName('Fillers');
  
  // Clear existing content in 'List1' and 'List2'
  l1.clear();
  l2.clear();

  // Get headers from 'Items' sheet
  const headers = it.getRange(1, 1, 1, it.getLastColumn()).getValues()[0];
  
  // Copy headers
  l1.appendRow(headers);
  l2.appendRow(headers);
  
  // Get all data rows from 'Items' sheet
  const rangeItems = it.getDataRange();
  const valuesItems = rangeItems.getValues();
  
  // Find the list column; case-insensitive, so 'List' and 'LIST' both work
  const listCol = headers.findIndex(h => String(h).trim().toUpperCase() === 'LIST');
  if (listCol === -1) {
    throw new Error("No 'LIST' column found in the header row of 'Items'.");
  }

  // Process each data row (index 0 is the header row)
  for (let i = 1; i < valuesItems.length; i++) {
    const row = valuesItems[i];
    const val = row[listCol];

    // Copy list 1 rows to List1 and list 2 rows to List2
    if (val == 1) {
      l1.appendRow(row);
    } else if (val == 2) {
      l2.appendRow(row);
    }
  }
  
  // Get all data rows from 'Fillers' sheet
  const rangeFillers = fi.getDataRange();
  const valuesFillers = rangeFillers.getValues();
  
  // Copy all rows from 'Fillers' sheet to 'List1' and 'List2'
  for (let j = 1; j < valuesFillers.length; j++) { // Start from row 2 assuming row 1 is headers
    const row = valuesFillers[j];
    l1.appendRow(row);
    l2.appendRow(row);
  }
}
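
The script above handles two lists. For designs with more conditions, the same logic can be written for any number of lists. The following is a sketch, not part of the course script: the function `distribute` and its parameters are made up for this example, and it works on plain arrays (first row = headers) such as those returned by `getDataRange().getValues()`.

```javascript
// Sketch: split item rows by list number and append all fillers to each list.
// `itemValues` and `fillerValues` are 2D arrays whose first row holds headers.
function distribute(itemValues, fillerValues, nLists, listHeader = 'List') {
  const headers = itemValues[0];
  // Locate the list column, ignoring case and surrounding spaces.
  const listCol = headers.findIndex(
    h => String(h).trim().toLowerCase() === listHeader.toLowerCase());
  if (listCol === -1) {
    throw new Error(`No "${listHeader}" column in the header row.`);
  }
  // Every list starts with the header row.
  const lists = Array.from({ length: nLists }, () => [headers]);
  for (const row of itemValues.slice(1)) {
    const n = Number(row[listCol]);
    // Rows without a valid list number are skipped.
    if (Number.isInteger(n) && n >= 1 && n <= nLists) lists[n - 1].push(row);
  }
  // All fillers (without their header row) go into every list.
  const fillers = fillerValues.slice(1);
  return lists.map(rows => rows.concat(fillers));
}
```

In Apps Script, each resulting array `rows` could then be written to its sheet in one call with `sheet.getRange(1, 1, rows.length, rows[0].length).setValues(rows)`, assuming the Items and Fillers sheets have the same number of columns.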

List creation workflow with Google Sheets