How we measure accuracy
We use two measures of how closely Pai’s digital twins reflect real human responses.
Matching individual human responses
Pai digital twins match their corresponding humans’ answers to hold-out questions 85.8% of the time, measured relative to test-retest reliability (how consistently people give the same answer when asked the same question again). This is a measure of individualized accuracy: how often any one twin matches its corresponding human’s answers.
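As a rough sketch of how a figure like this could be computed: the code below assumes that "relative to test-retest reliability" means dividing a twin's raw match rate by the rate at which the human repeats their own answers on a retest. That normalization, the function names, and the sample values are all assumptions made for illustration, not Pai's actual pipeline or data.

```python
def match_rate(twin_answers, human_answers):
    """Fraction of hold-out questions where the twin's answer equals the human's."""
    if len(twin_answers) != len(human_answers) or not human_answers:
        raise ValueError("answer lists must be non-empty and the same length")
    matches = sum(t == h for t, h in zip(twin_answers, human_answers))
    return matches / len(human_answers)


def relative_accuracy(raw_match_rate, test_retest_reliability):
    """Raw match rate as a share of the human's own retest consistency.

    ASSUMPTION: simple division is one plausible normalization; the source
    does not state the exact formula.
    """
    if not 0 < test_retest_reliability <= 1:
        raise ValueError("test_retest_reliability must be in (0, 1]")
    return raw_match_rate / test_retest_reliability


# Illustrative values only (not real survey data).
human = ["A", "B", "A", "C", "B"]
twin = ["A", "B", "C", "C", "B"]
raw = match_rate(twin, human)          # 4 of 5 answers match
print(relative_accuracy(raw, 0.9))     # normalized by a hypothetical 90% retest rate
```

The point of normalizing this way is that humans do not perfectly agree with themselves, so a twin's raw match rate is judged against the consistency a human would achieve on a retest rather than against 100%.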
Predicting the right winner
For head-to-head concept tests, head-to-head message tests, and Likert-scale questions, Pai digital twins correctly pick the most-selected answer option 93.3% of the time. This is a measure of directional accuracy: how often the twins in aggregate choose the same winning concept, message, or Likert direction as the humans in aggregate.
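Directional accuracy can be sketched in a similar way. The example below is a minimal illustration, assuming each question comes with one list of human responses and one list of twin responses, and that the "winner" is simply the most-selected option. The function names, the handling of ties, and the sample data are hypothetical. Likert responses would first need to be mapped to a direction, which the sketch leaves out.

```python
from collections import Counter


def winner(responses):
    """Most-selected option in a group's responses, or None on a tie for first."""
    counts = Counter(responses).most_common()
    if not counts:
        raise ValueError("responses must not be empty")
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def directional_accuracy(tests):
    """Share of questions where the twins' aggregate winner matches the humans'.

    `tests` is a list of (human_responses, twin_responses) pairs.
    ASSUMPTION: questions without a clear human winner are skipped, and a
    tie among the twins counts as a miss.
    """
    scored = 0
    correct = 0
    for human_responses, twin_responses in tests:
        human_winner = winner(human_responses)
        if human_winner is None:
            continue
        scored += 1
        if winner(twin_responses) == human_winner:
            correct += 1
    if scored == 0:
        raise ValueError("no question had a clear human winner")
    return correct / scored


# Illustrative data only: three questions, twins agree on two winners.
tests = [
    (["A", "A", "B"], ["A", "B", "A"]),
    (["X", "Y", "Y"], ["X", "X", "Y"]),
    (["agree", "agree", "disagree"], ["agree", "agree", "agree"]),
]
print(directional_accuracy(tests))
```

Note that this measure ignores whether individual twins match individual humans; only the aggregate winner on each question matters.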
What are hold-out questions?
Hold-out questions are questions that we have asked a human but have not used to train that human’s twin. They let us gauge how well the twin predicts the human’s responses to questions outside its training data.
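A hold-out split of this kind could look roughly like the sketch below. It assumes questions are chosen at random and uses a hypothetical 20% hold-out fraction. The source does not say how hold-out questions are actually selected, so the function name and parameters are illustrative only.

```python
import random


def split_hold_out(question_ids, hold_out_fraction=0.2, seed=0):
    """Randomly partition question IDs into (training, hold_out) lists.

    ASSUMPTION: random selection and the 20% default are illustrative choices.
    """
    if not 0 < hold_out_fraction < 1:
        raise ValueError("hold_out_fraction must be between 0 and 1")
    ids = list(question_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_hold_out = max(1, round(len(ids) * hold_out_fraction))
    return ids[n_hold_out:], ids[:n_hold_out]


training, hold_out = split_hold_out(range(10))
# The twin would be built from `training` only, then scored on `hold_out`.
print(len(training), len(hold_out))
```

Keeping the two sets disjoint is what makes the accuracy figures a test of prediction rather than of recall.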