Effective pre-hire assessments affect organizational outcomes. Recent developments in machine learning give practitioners an opportunity to improve upon existing scoring methods. This study compares the effectiveness of an empirically keyed scoring model with that of a machine learning approach, a random forest model, in a biodata assessment. Data were collected across two organizations. Data from the first organization (N=1,410) were used to train the model using sample sizes of 100, 300, 500, and 1,000 cases, whereas data from the second organization (N=524) were used as an external benchmark only. When using a random forest model, predictive validity rose from 0.382 to 0.412 in the first organization, while a smaller increase was seen in the second organization. It was concluded that the predictive validity of biodata measures can be improved using a random forest modeling approach. Additional considerations and suggestions for future research are discussed.

Corresponding Author Information

Mathijs Affourtit



