A controlled experiment that compares two versions of a design to determine which performs better against a defined metric. Users are randomly assigned to version A or version B, and the results are tested for statistical significance before any conclusion is drawn.
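
As a rough illustration of the two steps in this definition (random assignment, then a significance check), here is a minimal Python sketch. It assumes the metric is a conversion rate and uses a pooled two-proportion z-test, which is one common choice and not the only valid one. The function names, the counts, and the 0.05 threshold are hypothetical and chosen only for the example.

```python
import math
import random


def assign_variants(user_ids, seed=0):
    """Randomly assign each user to version 'A' or 'B' with equal probability."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    return {user_id: rng.choice("AB") for user_id in user_ids}


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, invented for illustration only.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
significant = p_value < 0.05  # 0.05 is a conventional threshold, assumed here
```

The normal approximation behind this test is reasonable only for fairly large samples. With small counts, an exact test may be a better fit.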