Leadership assessment has been a particular point of difficulty for contemporary scholarship, with many practitioners rejecting academically driven leadership instruments and scales in favor of their own, less rigorous, scales. We believe that current conceptualizations and measurements of leadership are problematic, as indicated by contemporary challenges that can be widely understood as failures of leadership (e.g., the Australian Banking Royal Commission and Volkswagen's ‘Dieselgate’), and that the way effective leadership is measured needs to change. This paper presents a systematic review of 17 leadership scales developed in the new millennium. Our response has been to apply eighteen critical checks across four stages of scale development: theory generation, item development, content validity, and empirical evaluation. The majority of scales lack some degree of rigor. On the premise that understanding past practices and their limitations can drive the development of a suite of more effective organizational tools, we provide best-practice recommendations based on contemporary psychometric research.