A research method for evaluating the findability of topics within a website's hierarchy, asking users to locate items using only the site's navigation labels — without any visual design. Results identify where users get lost or choose the wrong path.
Common contexts
- Validating a revised navigation taxonomy before building any visual design for a site redesign
- Comparing two competing IA structures by running tree tests with each on split user samples
- Diagnosing a high exit rate on a category page by testing whether users can find the items they expect to be there
Use when
Run tree testing after card sorting has given you a proposed taxonomy but before you invest in visual design — it isolates whether navigation labels and hierarchy are the problem, without the confound of visual design choices that can mask or compensate for IA issues.
Avoid when
Don't use tree testing as a substitute for full usability testing with design — tree tests remove all visual context, which means a navigation structure that tests well in isolation may still fail when users encounter it within a dense visual layout with competing calls to action.
The directness score — how often users took the correct path without backtracking — reveals more than the success rate alone; a low directness score on a 'successful' task means users got lucky, not that the IA is working.
Real-world examples
- Shopify used tree testing with 150 merchants to validate their new admin navigation IA before building it, discovering that 'Apps' was found 40% faster under 'Sales Channels' than under 'Settings' — a counterintuitive finding that only tree testing could isolate.
- The BBC ran tree tests across 12 navigation structures before launching their current site architecture, measuring directness of path (correct with no backtrack) rather than just success rate to find the most efficient IA.
- Nielsen Norman Group recommends a minimum 50-participant tree test for each major navigation decision, as their validation data shows error margins drop below 5% at this sample size for binary path tasks.