Vieta E, Bobes J, Ballesteros J, González-Pinto A, Luque A, Ibarra N, the Spanish Group for Psychometric Studies (GEEP). Validity and reliability of the Spanish versions of the Bech-Rafaelsen's mania and melancholia scales for bipolar disorders.
Objective: To assess classical psychometric properties of the Spanish versions of the Bech‐Rafaelsen's mania (MAS) and melancholia (MES) scales. Method: Observational, prospective, and multicentric study in bipolar out‐patients. Convergent validity was assessed against the Young Mania Rating Scale and the Montgomery‐Åsberg Depression Rating Scale. Discriminant validity, reliability, and sensitivity to change were also assessed. Results: One hundred and thirteen bipolar patients with a manic episode and 102 bipolar patients with a depressive episode were included. Both the MAS and the MES showed appropriate convergent validity (r > 0.90), discriminant validity (P < 0.0001), internal consistency (Cronbach's alpha >0.80), test–retest reliability [intraclass correlation coefficient (ICC) = 0.69 for the MAS and 0.94 for the MES], inter‐rater reliability (ICC > 0.80), and sensitivity to change at 4 weeks since inception (P < 0.0001; within‐group effect size ≥1.8). Conclusion: The Spanish versions of both scales present appropriate psychometric estimates in bipolar patients treated in ambulatory care.
Keywords: bipolar disorders; MAS; MES; validity; reliability
- The Spanish versions of the MAS and the MES show adequate psychometric properties regarding their reliability, validity, and sensitivity to clinical change in bipolar out‐patients.
- The classical psychometric properties of the MAS and the MES are similar to those of competing severity scales such as the YMRS and the MADRS in bipolar out‐patients.
- Contrary to other studies, this one was not embedded within an efficacy trial, and thus its reported estimates might be conservatively biased.
- This study was performed with bipolar out‐patients on treatment as usual, and thus its results do not necessarily map to those obtained in other clinical settings.
- The MAS and the MES were translated to Castilian Spanish. As both scales must be administered by trained raters, their wording should be appropriately adapted to other local variations of Spanish if needed.
Only a few canonical scales are customarily used to assess the severity of manic and depressive symptoms, and its change over time, in bipolar patients. Among them, the Young Mania Rating Scale (YMRS) ([
So far, the above scales have been recommended as main outcome measures in the analysis of efficacy from clinical trials with patients presenting bipolar disorders ([
i) To assess the convergent validity of the MAS and the MES scales when compared with the YMRS and the MADRS respectively; ii) to assess the discriminant validity of both scales by comparing the distribution of their scores with the Clinical Global Impression (CGI); iii) to assess their reliability (internal consistency; inter‐rater and test–retest reliability); and iv) to assess their sensitivity to the clinical change of bipolar out‐patients on treatment as usual over an adequate follow‐up time.
The study was designed as a psychometric study in its own right. It was not embedded into another observational or experimental design in which psychometrics would be included as an ancillary aim. It was an observational, short‐term prospective (follow‐up period of 1 or 4 weeks), and multicentric study (18 psychiatric centers across Spain were included), in out‐patients diagnosed with Bipolar I or II Disorder (DSM‐IV‐TR or ICD‐10 criteria) who were on treatment as usual.
All consecutive bipolar patients attending out‐patient psychiatric services of the clinical centers involved in the study were approached to ascertain their willingness to participate in the study. Those patients who agreed and fulfilled the inclusion criteria were recruited into the study. The research protocol was approved by the corresponding Ethics and Research Board. Inclusion criteria were i) to get a diagnosis of Bipolar I or II Disorder according to DSM‐IV‐TR or ICD‐10 criteria; ii) to present an adequate severity level at baseline in unstable patients to allow for a sensible analysis of the sensitivity to change of the MAS and MES scales (a minimum score of 18 and 22 points for the YMRS and MADRS respectively was recommended for inclusion but the final decision was left to the researchers on the basis of their clinical judgement); iii) an age ≥18 years; iv) to provide a signed informed consent to participate in the study; and v) not to present with any relevant cognitive problem which could interfere with the correct understanding of the clinical questions or with the completion of the clinical scales.
According to the methods reported in previous psychometric studies ([[
Figure 1. Flowchart of the validation study.
Minimum sample sizes were calculated separately for each psychometric characteristic to be evaluated: validity, reliability, and sensitivity to change. All calculations assumed a 5% significance level (two‐tailed) and 90% power. For test–retest and inter‐rater reliability, the minimum estimated sample size was 20 pairs, assuming a correlation between times or raters of 0.70. The minimum sample size to estimate the sensitivity to change, assuming a decrease of 10 points in the scales (SD = 11) and a moderate correlation (0.40) between baseline and final scores (4 weeks), was also 20 pairs. For convergent validity, the estimated sample size was 40 patients, assuming a correlation of 0.70 between validating and reference scales (correlation coefficient under the null hypothesis 0.30). Final minimum sample size estimates were increased assuming a 10% drop‐out rate throughout the study. Figure 1 presents the sample size attained at inception and the sample size analyzed at the corresponding follow‐up points.
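The source does not state which formula was used for these calculations. As a rough illustration only, the sketch below applies the textbook Fisher z approximation for the sample size of a correlation test to the stated assumptions. The function names, the null correlation of 0 for the reliability case, and the way drop-out is inflated are all assumptions made for this example.

```python
import math
from statistics import NormalDist


def n_for_correlation(r_alt, r_null=0.0, alpha=0.05, power=0.90):
    """Approximate n needed to detect r_alt against r_null (two-tailed).

    Assumption: Fisher z approximation,
    n = ((z_{1-alpha/2} + z_{power}) / (atanh(r_alt) - atanh(r_null)))^2 + 3.
    The paper does not say which method it used.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    effect = math.atanh(r_alt) - math.atanh(r_null)
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


def inflate_for_dropout(n, dropout=0.10):
    """Recruit enough patients that n remain after the expected drop-out."""
    return math.ceil(n / (1 - dropout))


# Convergent validity: r = 0.70 expected, 0.30 under the null hypothesis.
n_convergent = n_for_correlation(0.70, r_null=0.30)
# Reliability: r = 0.70 expected; a null of 0 is assumed here.
n_reliability = n_for_correlation(0.70, r_null=0.0)

print(n_convergent, inflate_for_dropout(n_convergent))
print(n_reliability, inflate_for_dropout(n_reliability))
```

Under these assumptions the approximation gives figures of the same order as the 40 patients and 20 pairs quoted in the text, though not identical to them; the authors may have used a different method or rounded upwards.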
The MAS and the MES were developed to assess the severity of manic and depressive symptoms ([
The YMRS and the MADRS were included as reference scales to assess convergent validity ([[
A modified CGI for use with bipolar patients, and also validated in Spanish ([
Figure 1 displays, along with the flowchart of the study and the sample sizes attained, the main analyses conducted to assess the classical psychometric properties of the MAS and the MES scales. Estimates for their reliability were obtained by analysing i) their internal homogeneity by the Cronbach's alpha statistic, ii) their inter‐rater reliability (two independent raters at three centers) by the intraclass correlation coefficient (ICC), and iii) their test–retest reliability, also by the ICC (1 week after recruitment into the study, and within the subgroup of patients considered to be clinically stable). It is important to note that the Cronbach's alpha estimate of internal reliability depends strongly on the number of items in the scale. In our case, comparisons across scales were not affected by this dependence, as all of them include a similar number of items. Estimates for their convergent validity at baseline were obtained by the Pearson's correlation between the MAS and the YMRS total scores, and between the MES and the MADRS total scores. Discriminant validity was also assessed at baseline by i) one‐way ANOVAs contrasting the mean differences of the MAS and the MES scales across the ordered categories of the CGI, and ii) evaluating from ordinal regression models the significance of the linear trend in their scores according to the CGI categories. Sensitivity to change was assessed at week 4 since inception, for the subgroup of patients considered to be clinically unstable at baseline, with an appropriate within‐group effect size ([
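For readers unfamiliar with the internal-consistency statistic, the following minimal sketch computes Cronbach's alpha from a patients-by-items score matrix using the standard definition. The function name and the toy ratings are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-patient item-score lists.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])                       # number of items in the scale
    items = list(zip(*scores))               # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical ratings: 3 patients x 2 items (illustration only).
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha_toy = cronbach_alpha(toy_scores)
print(round(alpha_toy, 3))  # 0.667
```

The factor k / (k − 1) and the sum over item variances are where the dependence on the number of items enters, which is why alpha values are most comparable between scales of similar length.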
Overall, 215 bipolar I and II patients from 18 clinical centers were recruited; 113 presenting with mania [mean age (SD) = 43.1 years (13.2); 60.7% women] and 102 presenting with depression [mean age = 47.3 years (12.7); 67.6% women]. Table 1 shows their main characteristics at baseline according to the current affective episode and clinical stability as predicted by the psychiatrists. As expected from the study inclusion criteria, the psychopathological severity at baseline was significantly different between stable and unstable patients (P < 0.0001 for both the YMRS and the MADRS), even though the score ranges overlapped to some extent (YMRS: 5–41 and 8–46 for stable and unstable patients respectively; MADRS: 3–46 and 19–53 for stable and unstable patients respectively). However, the shift in the distribution of the scores between both groups was clearly apparent, with 43 of 56 unstable patients (76.8%) scoring ≥18 in the YMRS [24 of 57 (42.1%) for stable patients], and 43 of 49 unstable patients (87.7%) scoring ≥22 in the MADRS [19 of 53 (35.8%) for stable patients].
Table 1. Baseline patients' characteristics

| Variables | Manic episode: stable | Manic episode: unstable | Depressive episode: stable | Depressive episode: unstable |
| --- | --- | --- | --- | --- |
| Mean age (SD) | 44.0 (13.6) | 42.3 (12.8) | 46.6 (12.9) | 48.0 (12.6) |
| Sex: Male | 23 (41.1) | 21 (37.5) | 19 (35.8) | 14 (28.6) |
| Sex: Female | 33 (58.9) | 35 (62.5) | 34 (64.2) | 35 (71.4) |
| Level of education: Primary school | 27 (48.2) | 27 (50.0) | 18 (34.6) | 27 (56.2) |
| Level of education: Secondary, High school | 17 (30.4) | 19 (35.2) | 22 (42.3) | 14 (29.2) |
| Level of education: College, University | 12 (21.4) | 8 (14.8) | 12 (23.1) | 7 (14.6) |
| Median time (months) since first diagnosis (IQ range) | 114 (51–240) | 81 (15–240) | 96 (40–180) | 127 (64–204) |
| Severity (CGI): Borderline ill | 8 (14.5) | NA | 10 (18.9) | NA |
| Severity (CGI): Mild | 17 (30.9) | 7 (12.7) | 15 (28.3) | 2 (4.2) |
| Severity (CGI): Moderate | 20 (36.4) | 17 (30.9) | 18 (34.0) | 17 (35.4) |
| Severity (CGI): Marked | 8 (14.5) | 20 (36.4) | 4 (7.6) | 15 (31.2) |
| Severity (CGI): Severe | 2 (3.6) | 10 (18.2) | 6 (11.3) | 12 (25.0) |
| Severity (CGI): Extremely ill | NA | 1 (1.8) | NA | 2 (4.2) |
| Mean YMRS (SD) | 16.3 (8.3) | 24.0 (8.0) | NA | NA |
| Mean MAS (SD) | 12.9 (6.4) | 18.5 (6.2) | NA | NA |
| Mean MADRS (SD) | NA | NA | 20.2 (10.2) | 31.5 (8.8) |
| Mean MES (SD) | NA | NA | 14.1 (7.6) | 23.0 (6.7) |

CGI, Clinical Global Impression; YMRS, Young Mania Rating Scale; MAS, Bech‐Rafaelsen Mania Scale; MADRS, Montgomery‐Åsberg Depression Rating Scale; MES, Bech‐Rafaelsen Melancholia Scale; NA, not applicable.
The mean time (SD) for the administration of the MAS was 10.9 min ([
The mean elapsed time for the test–retest study was 5.8 days (SD = 1.5) for the MAS, and 6 days (1.2) for the MES. Table 2 presents the reliability estimates obtained for both scales and their validation counterparts (YMRS and MADRS). All estimates for the validating scales, except the test–retest reliability of the MAS (ICC = 0.69), were above 0.80 and similar to those obtained for the reference scales.
Table 2. Reliability estimates

| Psychometric characteristics | YMRS | MAS | MADRS | MES |
| --- | --- | --- | --- | --- |
| Internal consistency; Cronbach's α | 0.84 | 0.88 | 0.90 | 0.91 |
| Test–retest reliability at 1 week | | | | |
| Mean score test (SD) | 16.3 (8.3) | 12.9 (6.4) | 20.2 (10.2) | 14.1 (7.6) |
| Mean score retest (SD) | 13.3 (7.5) | 10.5 (5.8) | 18.4 (10.6) | 12.9 (8.0) |
| Test–retest correlation | 0.88 | 0.76 | 0.95 | 0.95 |
| ICC (95% CI) | 0.85 (0.78–0.92) | 0.69 (0.56–0.83) | 0.93 (0.89–0.97) | 0.94 (0.90–0.97) |
| Inter‐rater reliability | | | | |
| Mean score rater 1 (SD) | NA | 13.4 (5.0) | NA | 15.5 (6.3) |
| Mean score rater 2 (SD) | NA | 12.9 (5.2) | NA | 15.3 (6.4) |
| ICC (95% CI) | NA | 0.89 (0.80–0.97) | NA | 0.98 (0.95–0.99) |

YMRS, Young Mania Rating Scale; MAS, Bech‐Rafaelsen Mania Scale; MADRS, Montgomery‐Åsberg Depression Rating Scale; MES, Bech‐Rafaelsen Melancholia Scale; ICC, intraclass correlation coefficient; NA, not applicable.
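The source does not specify which form of the intraclass correlation coefficient was computed. As an illustration of how an ICC differs from a plain correlation, the sketch below implements the one-way random-effects form, ICC(1,1), from the between-subject and within-subject mean squares. The choice of this form, the function name, and the toy ratings are assumptions, not the authors' method or data.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) for n subjects each rated k times.

    ICC = (MS_between - MS_within) / (MS_between + (k - 1) * MS_within)
    """
    n, k = len(ratings), len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Variation of subject means around the grand mean.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Variation of repeated ratings around each subject's own mean.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical test/retest scores for 3 patients (illustration only).
toy_ratings = [[1, 2], [3, 4], [5, 6]]
icc_toy = icc_oneway(toy_ratings)
print(round(icc_toy, 3))  # 0.882
```

In the toy data the two ratings are perfectly correlated (r = 1) yet the ICC is below 1, because the second rating is systematically one point higher. The same mechanism may explain why an ICC can be lower than the corresponding test–retest correlation when mean scores shift between test and retest.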
Figure 2 displays the linear correlations obtained among the MAS and the YMRS, and the MES and the MADRS. Both of them were well above 0.80.
Figure 2. Correlations among validating (MAS, MES) and reference (YMRS, MADRS) scales.
Both the MAS and the MES showed adequate discriminant validity when compared with the CGI severity scores at baseline (Table 3). The apparent linear trend observed in the table for the MAS and the MES scores was confirmed by further ordinal regression analyses with the CGI categories as dependent variables (P‐value for linear trend ≤0.001 in both cases). The results obtained for the validating scales ran in parallel with those obtained for the YMRS and the MADRS.
Table 3. Discriminant validity

| CGI severity scores at baseline | n (%) | YMRS, Mean (SD) | MAS, Mean (SD) | n (%) | MADRS, Mean (SD) | MES, Mean (SD) |
| --- | --- | --- | --- | --- | --- | --- |
| Minimum (2) | 8 (7.3) | 8.1 (2.3) | 6.5 (2.3) | 10 (9.9) | 10.9 (8.6) | 7.6 (5.4) |
| Mild (3) | 24 (21.8) | 11.6 (5.2) | 9.2 (3.9) | 17 (16.8) | 16.9 (6.7) | 11.8 (5.3) |
| Moderate (4) | 37 (33.6) | 20.5 (5.4) | 15.9 (4.2) | 35 (34.7) | 24.8 (6.6) | 17.5 (5.3) |
| Marked to extreme (5–7) | 41 (37.3) | 27.3 (7.5) | 21.0 (6.0) | 39 (38.6) | 34.0 (9.4) | 24.7 (7.3) |

CGI, Clinical Global Impression; YMRS, Young Mania Rating Scale; MAS, Bech‐Rafaelsen Mania Scale; MADRS, Montgomery‐Åsberg Depression Rating Scale; MES, Bech‐Rafaelsen Melancholia Scale.
The mean elapsed time to assess the sensitivity to change was 29.5 days (SD = 8) for the MAS, and 28.5 days (8.1) for the MES. Table 4 shows the estimates obtained for the sensitivity to change as evaluated after ≈ 4 weeks on treatment as usual. The within‐group effect sizes for the validating scales and their counterparts were large enough (>1.5) to support an appropriate sensitivity to clinical change, as were the results of the paired t‐tests between the baseline and final scores. There were no significant differences in the within‐group effect sizes when both validating scales were compared against their counterparts (MAS vs. YMRS difference: z = 1.1, P = 0.29; MES vs. MADRS difference: z = 0.48, P = 0.63). The discriminative power of the manic scales (MAS, YMRS), as evaluated by the ROC area, was somewhat lower than that of the depression scales (MES, MADRS). The ROC area for the MAS was not different from the ROC area for the YMRS [Χ
Table 4. Sensitivity to clinical change at 4 weeks since baseline

| Psychometric properties | YMRS | MAS | MADRS | MES |
| --- | --- | --- | --- | --- |
| Baseline mean score (SD) | 23.5 (7.6) | 18.6 (6.2) | 30.9 (8.5) | 23.0 (6.4) |
| Final mean score (SD) | 6.3 (5.8) | 5.6 (6.0) | 13.2 (8.9) | 10.1 (7.6) |
| Mean change final – baseline scores (SD) | −17.2 (8.6)*** | −13.0 (6.7)*** | −17.8 (11.6)*** | −13.0 (8.8)*** |
| Correlation baseline/final scores | 0.19 | 0.40 | 0.11 | 0.21 |
| Within‐group effect size (95% CI) | 2.5 (1.9–3.1) | 2.1 (1.6–2.6) | 2.0 (1.4–2.6) | 1.8 (1.3–2.4) |
| ROC area for change scores vs. final CGI (95% CI) | 0.65 (0.49–0.80) | 0.69 (0.53–0.84) | 0.89 (0.79–0.99) | 0.88 (0.78–0.99) |

YMRS, Young Mania Rating Scale; MAS, Bech‐Rafaelsen Mania Scale; MADRS, Montgomery‐Åsberg Depression Rating Scale; MES, Bech‐Rafaelsen Melancholia Scale; CGI, Clinical Global Impression; ROC, Receiver Operating Characteristic Curve.
***P < 0.0001.
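The exact formula behind the within-group effect sizes is not given in the text. As a plausibility check only, the sketch below divides the baseline-to-final difference in means by the pooled baseline/final standard deviation, using the summary values from Table 4. This definition and the function name are assumptions; other definitions (for example, standardising by the baseline SD or by the SD of the change scores) give different values.

```python
import math


def within_group_effect_size(mean_base, sd_base, mean_final, sd_final):
    """Standardised mean change using the pooled baseline/final SD.

    Assumption: d = (mean_base - mean_final) / sqrt((sd_base^2 + sd_final^2) / 2).
    The paper does not state which definition it used.
    """
    pooled_sd = math.sqrt((sd_base ** 2 + sd_final ** 2) / 2)
    return (mean_base - mean_final) / pooled_sd


# Summary statistics taken from Table 4: (baseline mean, SD, final mean, SD).
table4 = {
    "YMRS": (23.5, 7.6, 6.3, 5.8),
    "MAS": (18.6, 6.2, 5.6, 6.0),
    "MADRS": (30.9, 8.5, 13.2, 8.9),
    "MES": (23.0, 6.4, 10.1, 7.6),
}

effect_sizes = {
    scale: round(within_group_effect_size(*stats), 1)
    for scale, stats in table4.items()
}
print(effect_sizes)
```

With this definition the rounded values agree with the point estimates tabulated for the four scales, which suggests, but does not prove, that a pooled-SD standardisation was used.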
The use of valid and reliable instruments is essential to psychiatry given the subjective nature of many symptoms and the lack of external validators ([[
As our study was not embedded within an efficacy trial but was designed to reflect, as closely as possible, the usual clinical practice with bipolar patients (broad inclusion criteria and patients on treatment as usual), its results could actually be underestimates of those reported previously within the framework of efficacy trials (see, for instance, the score ranges for mania and depression at baseline, which cover a broader range of symptomatic severity – from mild to severe – than the criteria usually followed for the inclusion of bipolar patients in efficacy trials). However, this does not seem to be the case, at least regarding the comparisons with the results reported by others for internal homogeneity, inter‐rater reliability, discriminative validity, and sensitivity to change ([[
In summary, in this study, both the MAS and the MES have shown adequate psychometric results, comparable to those achieved by other canonical scales (YMRS, MADRS). Both scales could then be used in bipolar out‐patients to assess their symptomatic profile, their severity, and their change over time; and both scales may then be added to the few severity scales adequately translated and validated into Spanish to assess bipolar patients for clinical or research purposes. Probably, the last step – if any – needed to include both the MAS and the MES definitively among the canonical scales used for the assessment of bipolar disorders would be further research on their standardization against independent criteria for clinical response and remission, and on their comparative performance against patient research outcomes designed to tap those clinical constructs ([
Eduard Vieta, José Sánchez‐Moreno, Anabel Martínez‐Arán, María Reinares, José Manuel Goikolea and Antoni Benabarre (Hospital Clinic i Provincial, Barcelona); Julio Bobes, María Paz García‐Portilla and Pilar Saiz (CSM II, Oviedo); José Cañete and Silvia Cañizares (Hospital Mataró, Barcelona); José Vicente Baeza and Milagros Fuentes (Hospital General de Elche, Alicante); José Ramón Doménech and Antonio Córdoba (Hospital Dos de Maig, Barcelona); Antonio Bulbena, Purificación Salgado and Carles Masip (Hospital del Mar, Barcelona); Diego Palao, Blanca Navarro and Jesús Mendoza (Hospital General de Vic, Barcelona); Josep Gascón, María José Martín and María Luque (Hospital Mutua de Tarrasa, Barcelona); Antonio Higueras, Guillermo Pardo and José Eduardo Muñoz (Hospital Virgen de las Nieves, Granada); Ana González‐Pinto, Eider Tapia, Ainara Jiménez, María Gracia Domínguez, Saioa Azpiazu, Patricia Vega and Sara Barbeito (Hospital Santiago Apóstol, Vitoria); José Ramón Gutiérrez, Javier Busto and Fernando Galán (Hospital Infanta Cristina, Badajoz); Eduardo García‐Camba, Elena Ezquiaga and Alejandra Martín (Hospital de la Princesa, Madrid); Ignacio Tortajada, Julio Alonso, Jorge Pérez, Mario Páramo and Alicia Crespi (Hospital de Conxo, Santiago de Compostela); Antonio Lobo, Federico Dourdil and Raúl López (Hospital Clínico Universitario Lozano Blesa, Zaragoza); Jesús de la Gándara and Nuria Español (Hospital Divino Valles, Burgos); Pedro Pozo, María Angeles de Haro and Juan Francisco Tello (Hospital General Universitario José M Morales Meseguer, Murcia); Manuel Serrano, Juan Carlos Díez, Domingo de Miguel and María José Avila (Complejo Hospitalario Universitario Juan Canalejo, La Coruña); Miguel Roca and María Jesús Serrano (Hospital Joan March, Palma de Mallorca), Sara Ruíz (Quintiles).
We are indebted to Professor Per Bech for giving us free permission to validate the 1996 versions of the MAS and the MES scales in Spanish. Eduard Vieta, Ana González‐Pinto, Antonio Lobo, Anabel Martínez‐Arán, María Reinares, Jose Manuel Goikolea, Antoni Benabarre and Javier Ballesteros thank the support of the Spanish Ministry of Health, Instituto de Salud Carlos III, RETICS RD06/0011 (REM‐TAP Network). These authors plus Julio Bobes are currently included in the Spanish network for psychiatric research CIBER‐SAM (Spanish Ministry of Health, Instituto de Salud Carlos III).
This study was funded by AstraZeneca Farmacéutica Spain SA.
Dr Eduard Vieta has received grant support, acted as consultant, or given presentations for the following pharmaceutical companies: Almirall, AstraZeneca, Bial, Bristol‐Myers‐Squibb, Eli Lilly, Glaxo‐Smith‐Kline, Janssen‐Cilag, Merck Sharp & Dohme, Lundbeck, Novartis, Organon, Otsuka, Pfizer, Sanofi‐Aventis, Servier, and UCB. Dr Julio Bobes has received grant support, acted as consultant, or given presentations for the following pharmaceutical companies: AstraZeneca, Bristol‐Myers‐Squibb, Eli Lilly, Glaxo‐Smith‐Kline, Janssen‐Cilag, Otsuka, Pfizer, and Sanofi‐Aventis. He is also on the Editorial Board of Acta Psychiatrica Scandinavica. Dr Javier Ballesteros has received grant support from Glaxo‐Smith‐Kline, Eli Lilly, and AstraZeneca. Dr Ana González‐Pinto has received grant support, acted as consultant, or given presentations for the following pharmaceutical companies: Almirall, AstraZeneca, Bristol‐Myers‐Squibb, Eli Lilly, Glaxo‐Smith‐Kline, Janssen‐Cilag, Sanofi‐Aventis, Lundbeck, Novartis, and Pfizer. Mr Antonio Luque was a key figure both in the design and in the administrative follow‐up of this study as Head of Therapeutical Area of Neuroscience in the Medical Department of AstraZeneca Pharmaceuticals Spain until his departure on October 30, 2006. Mrs Nora Ibarra does not present any conflict of interest.
Annexe S1. Spanish version of the Bech‐Raphaelsen's Mania Scale (MAS)
Annexe S2. Spanish version of the Bech‐Raphaelsen's Melancholia Scale (MES)
Please note: Blackwell Publishing are not responsible for the content or functionality of any supplementary materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.
By E. Vieta; J. Bobes; J. Ballesteros; A. González‐Pinto; A. Luque and N. Ibarra