Phase 2 and Phase 3 CNS studies fail at a rate of 25%.
 

Faced with the high cost of drug development and limited R&D budgets, the biopharmaceutical industry has been struggling to find ways to reduce the risk of failed trials and bring new medicines to market faster.

 

Among the key challenges for a successful trial is patient enrolment and, more specifically, accurate patient selection.  Selecting appropriate patients for a clinical trial in turn depends on the proper selection and practical use of screening and diagnostic assessments.

Many of the psychometric rating scales used to monitor treatment effects in clinical trials today were originally designed for other purposes, or for other types of patient populations.  Because knowledge of the available assessments was limited, and in some cases no better option existed at the time, the accessible rating scales were adapted to meet the objectives of the clinical trial.  They then remained the standard tools for measuring symptom severity even as other, better tools became available.  The methodologies used to adapt those scales are largely unknown and undocumented, and the use of these rating scales in clinical trials has resulted in significant variability in clinical diagnosis and in the monitoring of treatment response within raters, among raters, across sites and across studies.

Clinical diagnosis and treatment monitoring also vary among geographic regions.  In many countries, certain mental health disorders carry a cultural stigma that negatively affects patients’ access to treatment.

 

To overcome these challenges, the pharmaceutical industry has established a standard practice of training raters on the use of the rating scales included in clinical trials.  While this practice has improved the quality of data and increased the number of successful trials, challenges tied directly to trainers and raters continue to contribute to failed trials.

 

 
Trainer Competency:


Standard practice is not necessarily effective practice.
 

Since the 1970s, response to treatment has been measured across multiple symptom dimensions with psychometric rating scales designed to detect changes in symptom severity.  Some scales used in clinical trials were originally designed for different practical applications or different populations, or were custom revised without validation studies. As a result, researchers found them challenging to use, and data quality was compromised.

While ICH guidelines mandate that rater training must occur, the quality of rater training, and especially the quality and qualifications of trainers, have not been well defined.  Factors such as varied facilitation skill, little or no formal training in psychometrics, lack of familiarity with the origins of the instrument, inadequate development of meaningful gold standards, and lack of linguistic and cultural sophistication continue to contribute to inadequate rater training and poor rater comprehension. The rigorous standards of modern multi-national clinical trials demand a more comprehensive, quality-oriented solution.

 


Rater Competency:


Rating inconsistencies are rarely caught in a timely manner, and their effects have a dramatic impact on the outcome of clinical trials.
 

Rater drift is defined as the tendency for raters to unintentionally redefine assessment criteria and standards over time or across a series of ratings. It is a major cause of poor-quality data and a significant contributor to clinical trial failure.  Rater drift is an internal phenomenon that can be controlled by refining existing training and performance management programs.
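One simple way to make rater drift visible is to track how far a rater's scores depart from a reference score over successive calibration sessions. The sketch below is illustrative only: it assumes that each rater periodically scores a calibration interview for which an expert consensus ("gold standard") total score exists, and it fits a least-squares slope to the signed deviations. The function names and the example scores are hypothetical, and any threshold for triggering re-training would have to be set per study.

```python
from statistics import mean


def rater_deviations(rater_scores, gold_scores):
    """Signed difference between a rater's score and the reference score,
    one value per calibration session, in chronological order."""
    if len(rater_scores) != len(gold_scores):
        raise ValueError("score lists must have the same length")
    return [r - g for r, g in zip(rater_scores, gold_scores)]


def drift_slope(deviations):
    """Least-squares slope of deviation against session index (0, 1, 2, ...).

    A slope near zero suggests stable scoring; a clearly positive or
    negative slope suggests the rater's standards are shifting over time.
    """
    n = len(deviations)
    if n < 2:
        raise ValueError("need at least two sessions to estimate a slope")
    xs = list(range(n))
    x_bar = mean(xs)
    y_bar = mean(deviations)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, deviations))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical example: one rater, five calibration sessions.
rater = [22, 24, 25, 27, 29]
gold = [22, 23, 23, 24, 25]

devs = rater_deviations(rater, gold)
slope = drift_slope(devs)
print(devs, slope)
```

In this made-up example the rater scores one point further above the reference at every session, which is the pattern periodic re-training is meant to catch early.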

In the context of the increasingly high rate of failed CNS studies(1), an unfavorable signal-to-noise ratio is caused by both intrinsic and extrinsic factors related to the conduct of multi-center double-blind trials in which relatively “weak” parameters define outcome. Several research findings clearly indicate that rater bias, inter-rater reliability, and interview quality (for both evaluation and diagnosis) significantly impact signal detection and study outcome.(2)
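Inter-rater reliability can be quantified in several ways. As a minimal sketch, and not the method used in the studies cited here, the example below computes Cohen's kappa, which corrects the raw agreement between two raters for the agreement expected by chance. It assumes two raters have scored the same set of interviews on a single categorical item; the ratings shown are hypothetical. For ordinal or continuous total scores, a weighted kappa or an intraclass correlation coefficient is usually the more appropriate statistic.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same subjects.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    where p_expected is the chance agreement implied by each rater's
    marginal category frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical item-level ratings (0 = absent, 1 = mild, 2 = severe)
# from two raters on the same ten recorded interviews.
rater_a = [0, 1, 2, 2, 1, 0, 1, 2, 0, 1]
rater_b = [0, 1, 2, 1, 1, 0, 2, 2, 0, 1]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

Here the two raters agree on 8 of 10 interviews, but because some of that agreement would occur by chance, kappa is lower than the raw 80% figure.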

Even with excellent training and good site selection, the Sponsor may still have limited control over what occurs at the site. The current financial compensation system tends to incentivize sites to enroll as many patients as allowed in the shortest period of time. As a result, some patients are enrolled even though they do not meet the study criteria, have the wrong diagnosis, or have artificially inflated baseline rating scores.(3)  In addition, in a growing number of studies the majority of sites are located in countries or regions where language and culture can play an important role in understanding the clinical assessment, and critical elements can get “lost in translation”. These factors can significantly compromise the integrity of trial conduct and ultimately the outcome of the study.

The use of rating scales requires a unique balance of clinical judgement, objectivity, and strict adherence to study protocols to ensure that scales are administered, scored and interpreted correctly.  To ensure that the data collected are both reliable and valid, raters and pharmaceutical company sponsors must meet several requirements, including:

  • Raters must have appropriate levels of clinical experience with target patient populations
  • Qualifying, training and monitoring performance of raters must be carried out in a timely manner to ensure rating scales are used accurately and consistently from patient to patient, and from site to site
  • New raters who enter a trial during later stages should receive the same level of training as the original cohort of raters
  • Raters should be periodically re-trained during the course of a study to prevent rater drift
  • Raters should be trained in their own languages, using standards and approaches that are meaningful for the culture, demographics, and contextual realities of their patient populations
  • Rating scales and training materials should be translated, culturally adapted and validated


______________________________________

(1) Laughren, TP “The scientific and ethical basis for placebo-controlled trials in depression and schizophrenia: an FDA perspective.” European Psychiatry. 2001 Nov;16(7):418-23.

(2) Schoemaker, J et al “Expert Rater Assisted Score Evaluation (ERASE): A New Method to Enhance Signal Detection in Randomized, Placebo-Controlled Clinical Trials” (presented at NCDEU 2009; presentation in press).

 

(3) Mundt et al “Is it easier to find what you are looking for if you think you know what it looks like?” Journal of Clinical Psychopharmacology. 2007 Apr;27(2):121-5.