Development and Case-Control Validation of the Canadian Men’s Health Foundation’s Self Risk-Assessment Tool: “You Check”

Farshad Pourmalek, MD, PhD1,2, S. Larry Goldenberg, CM, OBC, MD1, Kendall Ho, MD3, Sean C. Skeldon, MD, MSc 4, David M. Patrick, MD, MHSc2

1 Department of Urologic Sciences, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
2 School of Population and Public Health, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
3 Department of Emergency Medicine, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
4 Department of Family & Community Medicine, Faculty of Medicine, University of Toronto, Ontario, Canada


Background and Objective : To facilitate the engagement of men in the evaluation of their own health status and risk of disease, we have developed and validated the Canadian Men’s Health Foundation’s self-risk assessment tool (“You Check”). In a single questionnaire, the “You Check” tool estimates the 10-year risk for myocardial infarction (MI), diabetes type 2 (DM), osteoporosis (OS), erectile dysfunction (ED), and low testosterone (LT). Additionally, the tool provides the user with his risk-factor profile for prostate cancer and his current risk of depression (using the Center for Epidemiologic Studies Depression scale).

Materials and Methods : Known risk factors for each disease were collated, the questionnaire designed, and risk scores for each disease were assigned by clinical experts. A risk formula was developed using the sum of risk scores divided by their own range. We validated the risk models with case-control data from a retrospective review of 400 outpatient records from 4 Vancouver family practice clinics. Maximal correct classification proportions were determined and used as thresholds for categorization of risk to low, medium, or high categories.

Results : For DM, sensitivity and specificity were 0.86 and 0.96 respectively and the Area Under Curve was 0.88 (95% Confidence Interval [CI] 0.81-0.94). For MI these values were 0.70 and 0.93, and 0.75 (0.65-0.85); for LT 0.70 and 0.90 and 0.75 (0.66–0.84); for OS 0.70 and 0.86 and 0.70 (0.61–0.80); and for ED 0.42 and 0.96 and 0.66 (0.58–0.75).

Conclusion : This is the first comprehensive men’s health self-risk assessment tool for 7 important diseases. Moderate internal validity was demonstrated for 5 diseases, meeting the public health objectives of “You Check”, which is now in the public domain and under appropriate monitoring and evaluation.

Keywords : men’s health, risk assessment, risk model, internal validation

In general, men have poorer health status than women,1 with higher all-cause and disease-specific mortality rates and shorter life expectancy and healthy life expectancy.2,3 In 2015, the life expectancy at birth for Canadians was 79.5 years for males and 83.9 years for females.2 Canadian men also have higher disability-adjusted life years (DALYs), an expression of total health loss due to disability and premature mortality. The fact that men have shorter lifespans is often left unquestioned in practice and presumed to be ‘natural’ or inevitable.1 However, the top risk factors that waste the healthy life years of Canadian men and women4 can be reduced by upstream healthy lifestyle improvements.

Gender is a key determinant of health, and gender inequity is a key contributor to health inequalities.5 While previous health-related gender discussion has been focused on women’s health issues, men’s health promotion would also benefit from a similar gender-based male-focused lens.6 The concept of “men’s health” encompasses all health issues that shorten life expectancy or reduce quality of life: lifestyle factors, health behaviours, and the way in which men perceive and react to their health risks. In Canada, the concept of men’s health is capturing the interest of various stakeholders, in much the same way as earlier initiatives that improved the health and lives of girls and women.5

Risk prediction models consider multiple risk factors to estimate the probability that a certain outcome is present or will occur in an individual.7 Although individualized health risk assessment tools already exist for specific diseases and conditions in clinical and public health use domains,8–10 there are none that assess multiple disease risks simultaneously with the same sets of questions, and provide health promotion messages that are comprehensible to the general public. The Canadian Men’s Health Foundation (CMHF) has developed and validated a self-risk assessment tool called “You Check”, designed to provide simultaneous risk assessment for multiple diseases using a holistic approach to health promotion and communication, with real-life health promotion follow-up strategies. This paper presents the development methodology and the results of internal validation of this risk assessment tool.


Development of the Risk Tool

To select the diseases for inclusion in our tool, we reviewed the results of the global burden of disease 2010 study for Canada11 and selected the most burdensome and largely preventable diseases as well as conventional men’s health conditions. We looked at the 10-year risk of developing myocardial infarction (MI), type 2 diabetes, prostate cancer, osteoporosis, erectile dysfunction (ED), low testosterone (LT), and the current risk of major depression. Clinical experts from Vancouver General Hospital and the University of British Columbia’s Faculty of Medicine selected the key predictors that could be readily measured in an online questionnaire. The selected risk factors included age, ethnicity, family history, diet, smoking history, body mass index (BMI), blood pressure, alcohol intake, plasma glucose, total cholesterol, level of physical activity, sleep duration, and presence of snoring. We found limited evidence for risk factors in predicting the onset of depression in our literature review. Our expert panel also determined that the underdiagnosis of depression in men and delay in treatment leading to the risk of suicide was of greater public health importance. As such, we identified the 10-item version of Center for Epidemiologic Studies’ Depression (CES-D) questionnaire12,13 as the shortest possible and validated questionnaire for current risk of depression for inclusion in our tool. Hence, depression was not included in our You Check risk model development and validation.

To develop risk scores, clinical experts assigned weighting to each risk factor and disease pair based on their experience and knowledge of available literature. Forty hypothetical patients were then created, with and without each of the target diseases, and a “risk score” and an “importance score” were then assigned for each risk factor in each of the target diseases in each of these cases. The weightings were “fine-tuned” to maximize the ability to predict for the known outcome. Risk scores used a 5-point Likert scale to indicate the likelihood of a given risk factor being related to a given disease. The importance score weighted each risk factor, ranging from −3 to 30, where 30 indicated the most significant predisposing risk and negative values indicated protective effects. For each risk factor-disease pairing, the product of the risk score and the importance score was the “risk value” which was then used to calculate a risk percent using the formula: risk percent = (calculated risk value - minimum risk value) / (maximum risk value - minimum risk value). Finally, the risk percentages for each disease were categorized as low, intermediate, or high risk, using 2 cut off points for each disease:

Low-risk, if 0 ≤ RP < LCOP; Medium-risk, if LCOP ≤ RP < UCOP; High risk, if UCOP ≤ RP ≤ 100 (where RP = Risk Percent, LCOP = Lower Cut Off Point, and UCOP = Upper Cut Off Point).
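The scoring and categorization rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's actual implementation: the function names, the 1 to 5 coding of the Likert scale, the example scores, and the cut off points of 40 and 70 are all assumptions made for the example. The formula is multiplied by 100 here because the categories are defined on a 0 to 100 scale.

```python
def value_range(importance_scores, likert_min=1, likert_max=5):
    """Smallest and largest possible summed risk value for one disease.

    Assumes (hypothetically) that risk scores are coded 1..5; a negative
    importance score (protective factor) reaches its minimum at the
    highest risk score, hence the min/max per factor.
    """
    lo = sum(min(likert_min * w, likert_max * w) for w in importance_scores)
    hi = sum(max(likert_min * w, likert_max * w) for w in importance_scores)
    return lo, hi


def risk_percent(risk_value, min_value, max_value):
    """Rescale a summed risk value to 0-100 using its own possible range."""
    if max_value <= min_value:
        raise ValueError("max_value must exceed min_value")
    return 100.0 * (risk_value - min_value) / (max_value - min_value)


def risk_category(rp, lcop, ucop):
    """Apply the two cut off points: low < LCOP <= medium < UCOP <= high."""
    if not 0 <= rp <= 100:
        raise ValueError("risk percent must lie between 0 and 100")
    if rp < lcop:
        return "low"
    if rp < ucop:
        return "medium"
    return "high"


# Hypothetical respondent: (risk score, importance score) per risk factor.
scored = [(4, 30), (2, 10), (1, -3)]
risk_value = sum(r * w for r, w in scored)        # sum of per-factor products
lo, hi = value_range([w for _, w in scored])
rp = risk_percent(risk_value, lo, hi)
category = risk_category(rp, lcop=40, ucop=70)    # illustrative cut off points
print(round(rp, 1), category)
```

Dividing by the range of attainable values keeps risk percentages comparable across diseases that use different numbers of risk factors or different importance scores.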

Internal Validation of the Risk Tool

For internal validation of the risk models, we used a case-control study design. Records from 4 family practice medical clinics in Vancouver (Spectrum Health Care and The Doctors Office) and Burnaby (Central Park Medical Clinic and Old Orchard Medical Clinic) were reviewed from October 2013 to January 2014. Cases were men with a known diagnosis (based on ICD-9 codes) of at least one of the target diseases between 2010 and 2013 with a minimum of 8–10 years of medical records before the year of diagnosis. Records had to contain documentation of at least 3 out of 5 biological risk factors (BMI, waist circumference, blood pressure, fasting blood glucose, and total serum cholesterol) and at least 2 out of 5 lifestyle risk factors (smoking, alcohol consumption, physical activity, eating habits, and sleeping at least 7 hours in a 24-hour period). Controls were age-matched male patients from the same family practice clinics who did not have the target disease or a diagnosis closely related to the target disease. Sample size was determined using the rule of a minimum of 10 observations per variable.14 The minimum diagnostic criteria for each of the 5 diseases are shown in Table 1.
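The data-completeness criterion for including a record can be expressed as a simple check. The sketch below is illustrative only; the field names and the dictionary layout of a chart record are hypothetical and do not reflect how the study data were actually stored.

```python
# Hypothetical field names for the 5 biological and 5 lifestyle risk factors.
BIOLOGICAL = ("bmi", "waist_circumference", "blood_pressure",
              "fasting_glucose", "total_cholesterol")
LIFESTYLE = ("smoking", "alcohol", "physical_activity",
             "eating_habits", "sleep_7h")


def has_adequate_data(record):
    """True if at least 3 of 5 biological and 2 of 5 lifestyle factors are documented."""
    bio = sum(record.get(k) is not None for k in BIOLOGICAL)
    life = sum(record.get(k) is not None for k in LIFESTYLE)
    return bio >= 3 and life >= 2


# Example chart with 3 biological and 2 lifestyle factors documented.
chart = {"bmi": 27.4, "blood_pressure": "132/84", "total_cholesterol": 5.1,
         "smoking": "never", "alcohol": "moderate"}
print(has_adequate_data(chart))
```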

Cases and controls were reviewed by a single, unblinded researcher through retrospective, anonymized chart review. The reviewer applied the “You Check” questionnaire to the latest encounter date in the medical record. The tool allows a non-response (“I prefer not to respond” or “I don't know”) only for ethnic background, waist circumference, blood pressure, blood glucose, blood cholesterol, family history of diseases, and, when needed, a query on firmness of erections. In cases where no data were available in the chart, risk scores and importance scores were therefore assigned based on average population values. For weight, fasting blood glucose, total blood cholesterol, and blood pressure, the average levels over the 10-year time interval before diagnosis of the target disease were used.

Data from 110 records without the 7 conditions were collected to create the control group. For this group, age at diagnosis was defined as age at last encounter. To adjust for the confounding effect of age on disease risk, cases and controls were matched on age. For each disease, one nearest age-match control was chosen without replacement. Stata software version 13.1 was used for statistical analysis (StataCorp, College Station, Texas). Propensity score matching command was used in Stata (psmatch2 module, version 4.0.10, dated 10 Feb 2014, by E. Leuven and B. Sianesi). Descriptive and bivariate inferential analyses were performed for each “risk factor-disease” pair. Pearson chi-square and point and interval estimate of odds ratio were used for binary and non-ordinal categorical risk factors. Chi-square for linear trend was used for ordinal categorical risk factors. Student’s t-test was used for continuous risk factors. A significance level of 0.05 was chosen for inferential analyses.
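One-to-one nearest-age matching without replacement can be illustrated with a short greedy routine. This is a simplified stand-in written for illustration, not the psmatch2 procedure used in the study; the function name, the processing order of cases, and the tie-breaking by lowest control index are assumptions.

```python
def match_on_age(case_ages, control_ages):
    """Greedy 1:1 nearest-age matching without replacement.

    Cases are processed in the order given; each takes the still-unused
    control whose age is closest (ties go to the lowest control index).
    Returns a list of (case_index, control_index) pairs.
    """
    available = set(range(len(control_ages)))
    pairs = []
    for i, age in enumerate(case_ages):
        if not available:
            break  # more cases than controls: remaining cases stay unmatched
        j = min(available, key=lambda k: (abs(control_ages[k] - age), k))
        available.remove(j)
        pairs.append((i, j))
    return pairs


# Hypothetical ages (years) at diagnosis for cases and at last encounter for controls.
pairs = match_on_age([50, 61, 61], [49, 60, 62, 70])
print(pairs)
```

Because a control is removed from the pool once used, the result of a greedy pass depends on the order in which cases are processed.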

Area Under Curve (AUC) of the Receiver Operating Characteristic (ROC) curve was used as the discrimination measure, and sensitivity and specificity were used as classification measures.15,16 To control for over-optimism in estimation of internal validity we used bootstrap resampling.17 For each disease, bootstrap resampling stratified by cases and controls was used with 1,000 draws. We set our goal for risk model accuracy as being at least a moderate level of statistical accuracy, such as AUC values ≥ 0.7.18 ROC analysis was used for determination of 2 optimal thresholds to be used for categorization of risk of disease in 10 years as low, medium, or high.
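As an illustration of the discrimination measure and of resampling stratified by case and control status, the sketch below computes the AUC as the proportion of case-control pairs in which the case has the higher risk percent, then draws bootstrap samples separately within each group. It produces a simple percentile interval and is not the Stata procedure used in the study; the function names, the seed, and the example data are assumptions.

```python
import random


def auc(case_scores, control_scores):
    """Share of (case, control) pairs where the case scores higher; ties count half."""
    wins = 0.0
    for c in case_scores:
        for d in control_scores:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def stratified_bootstrap_ci(case_scores, control_scores, draws=1000, seed=0):
    """Percentile 95% interval for the AUC, resampling cases and controls separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(draws):
        cs = rng.choices(case_scores, k=len(case_scores))
        ds = rng.choices(control_scores, k=len(control_scores))
        stats.append(auc(cs, ds))
    stats.sort()
    lower = stats[max(0, round(0.025 * draws))]
    upper = stats[min(draws - 1, round(0.975 * draws) - 1)]
    return lower, upper


# Hypothetical risk percentages for a handful of cases and controls.
cases = [62.0, 71.5, 55.0, 80.2, 66.3]
controls = [40.1, 58.0, 35.5, 47.9, 62.0]
point = auc(cases, controls)
lower, upper = stratified_bootstrap_ci(cases, controls)
print(point, lower, upper)
```

Resampling within each group keeps the number of cases and controls fixed in every draw, which mirrors the fixed group sizes of a case-control design.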

The first threshold was defined by breaking the spectrum of risk percentages into high- and low-risk categories, using the cutoff point which gave the highest correct classification. The high-risk category was then further split into 2, in order to create the medium and high-risk categories, using the same approach. For diseases with 2 tied (repetitive) values of maximum correct classification proportion (yielding different correspondent values of sensitivity and specificity for each tie), the cutoff point with higher specificity was chosen, and in the case of 3 ties, the cutoff point with the median value of specificity was used. Specificity was used to prevent false-positive alarms being given to users who are not at the highest or higher actual risk of disease.
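The threshold rule can be sketched as follows. The code is an illustrative reading of the rule, not the analysis script used in the study: it assumes a respondent is classified as positive when the risk percent is at or above the cutoff, tries every observed risk percent as a candidate cutoff, and, for tie counts other than 2 or 3 (which the rule does not cover), falls back to the highest specificity. The function name and example data are hypothetical.

```python
def best_cutoff(case_rps, control_rps):
    """Cutoff with the highest correct classification proportion.

    Returns (cutoff, sensitivity, specificity, correct_classification).
    """
    n = len(case_rps) + len(control_rps)
    candidates = []
    for cut in sorted(set(case_rps) | set(control_rps)):
        tp = sum(rp >= cut for rp in case_rps)      # cases flagged as high risk
        tn = sum(rp < cut for rp in control_rps)    # controls left as low risk
        candidates.append(((tp + tn) / n, tn / len(control_rps),
                           tp / len(case_rps), cut))
    top = max(c[0] for c in candidates)
    tied = sorted((c for c in candidates if c[0] == top), key=lambda c: c[1])
    if len(tied) == 3:
        chosen = tied[1]     # 3 ties: median specificity
    else:
        chosen = tied[-1]    # 2 ties: higher specificity (also the assumed fallback)
    correct, spec, sens, cut = chosen
    return cut, sens, spec, correct


# Hypothetical risk percentages; cutoffs 60 and 70 tie at 7/8 correct.
result = best_cutoff([60, 70, 80, 90], [10, 20, 30, 65])
print(result)
```

In this example the cutoff of 70 is selected over 60 because it has the higher specificity of the two tied candidates. The second threshold could be obtained by applying the same function to the respondents at or above the first cutoff.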

For diseases with AUC < 0.7, the original risk scores and importance scores were adjusted empirically to increase their discriminative accuracy in a subsequent ROC curve analysis. The empirical adjustment of risk scores and importance scores was guided by considering the tabulation and graphic visualization of joint distribution of the risk factors and the disease among cases and controls. If the AUC for an adjusted model did not reach the minimum target set for AUC, we used a risk-factor profile approach. That was the case for prostate cancer, for which we had intentionally not included Prostate-Specific Antigen (PSA) as a risk factor. We used a risk-factor profile approach based on clinical and patient guidelines.19–21 The optimal cutoff point for the 10-item depression scale of the Center for Epidemiologic Studies (CES-D) was identified as being equal to or greater than 10, categorizing its risk continuum of 0 to 30 into 2 categories of low or high risk of current depression.12 In developing the results and recommendations section of the online tool, we used the approach of small sequential changes for improvement of healthy lifestyle factors, based on the Transtheoretical Model (TTM) for behaviour change.22

Ethics Approval

This study involved no patient or participant recruitment. Anonymized data without any personal identifiers were collected in a retrospective chart review, which was approved by the University of British Columbia Research Ethics Board (H13-00817 and H14-01364).


Sun Life Financial Canada and Mohseni Foundation


The “You Check” tool assesses the risk of MI and diabetes and current risk of major depression as selected from the top 10 causes of DALYs in Canadian men in 2010.11 Osteoporosis was included because it is commonly underrecognized as a disease of aging men, as it is generally considered by both the public and health professionals as a women’s health condition.23 Prostate cancer, ED, and LT were selected as common and publicly recognized men’s health conditions. DALYs for the 7 diseases selected for inclusion in our risk tool are summarized in Table 2.

Model Validation. For each target disease, we identified 50 cases and 50 controls with adequate data, except for MI, which had 40 cases and 40 controls. The distributions of risk values and risk percentages by cases and controls for each disease are summarized in Tables 3 and 4, respectively. ROC curves for diabetes type 2 and for all diseases are shown in Figures 1 and 2, respectively. The first models using the identified cutoff points for prostate cancer and ED had AUC < 0.7. Model refinement and retest improved AUC for ED to 0.66 (95% Confidence Interval 0.58 to 0.74) but AUC for prostate cancer remained as low as 0.52 (95% CI 0.41 to 0.62). Therefore, the original risk model for prostate cancer was set aside and a risk-profile approach was chosen instead. The 3-category risk estimates and the validity measures for each disease are shown in Tables 5 and 6, respectively.

FIG 1 ROC curve for all risk percentages by cases (in red) and controls (in blue), diabetes type 2. Hollow circle identifies the first optimal cutoff point (COP) based on maximum correct classification proportion (max C-Class).

FIG 2 ROC curve for all risk percentage cutoff points (COPs) by disease.


“You Check” is unique in that it provides men a user-friendly questionnaire that addresses 7 target diseases at one time. Before releasing this self-risk assessment tool into the public domain, we sought to validate it internally using male patient data from primary care physician practices. We were able to demonstrate moderate internal validity of the “You Check” risk model (AUC ≥ 0.70) for type 2 diabetes, MI, LT, and osteoporosis; and modest internal validity for ED. For prostate cancer, we chose a “risk-factor profile” approach, designating respondents’ risk profiles as high, intermediate, or low, rather than categorizing them according to actual development of the disease.

In developing the “You Check” tool we did not use the conventional method of choosing coefficients from multivariable models for derivation of the risk scores in our risk models, but used clinician-assigned scores. Although this is not the optimal method for development of risk scores, important risk models have been developed using clinician-assigned scores, such as the Apgar score,24 Glasgow Coma Scale,25 Norton, Waterlow, and Braden scores for pressure ulcers,26 and American Society of Anesthesiologists (ASA) physical status score for prediction of perioperative and anesthetic-related mortality.27 Altman and Royston14 and Zhou et al.28 discuss the significance of expert-based risk prediction models – such as “You Check” – for meeting clinical and public health objectives, when high-level, statistically generated accurate risk models (i.e. AUC ≥ 0.90) are not feasible or available.

There are multiple validated prediction tools for a variety of different diseases. In general, the sensitivity and specificity of risk prediction models can be categorized as high (≥ 0.70), moderate (0.60–0.69), and low (< 0.60).29 In this study, the sensitivity and specificity of the “You Check” diabetes model are 0.86 and 0.84, respectively, for the lower cutoff and 0.54 and 0.96 for the higher cutoff. In comparison, the sensitivity and specificity of the Finnish diabetes risk assessment tool (FINDRISK) are 0.78 and 0.76 respectively, in their 1987 cohort for a point score ≥ 9, both of which are high.30 The Australian diabetes risk model (AUSDRISK) has a high sensitivity of 0.74 and a moderate specificity of 0.68, for a point score ≥ 12.31 For the Canadian diabetes risk assessment tool (CANRISK), these values are respectively 0.95 and 0.28 (for “Slightly elevated” risk category, i.e. score 21 in paper version of CANRISK), 0.70 and 0.67 (for “Balanced” risk category, i.e. score 32 in paper version of CANRISK), and 0.30 and 0.94 (for “Very high” risk category, i.e. score 43 in paper version of CANRISK).10

For cardiovascular risk, the Framingham global model has a sensitivity of 0.46 and specificity of 0.83.8 These values are 0.46 and 0.83 respectively for the Scottish cardiovascular risk model (ASSIGN).32 The Systematic Coronary Risk Evaluation model (SCORE) has combinations of moderate or low sensitivity (from 0.19 to 0.97) and specificity (from 0.15 to 0.95) based on different sets of validation cohorts and cut off levels.33 In our “You Check” model, the sensitivity and specificity of MI are 0.70 and 0.75 respectively for the lower cutoff and 0.40 and 0.93 for the higher cutoff.

Discriminative ability measures (AUC or c statistic) of risk prediction models can be categorized as excellent (0.90–1.00), good (0.80–0.89), modest or moderate (0.70–0.79), or low (0.50–0.69).34 The AUCs of the diabetes risk model of “You Check” (0.88) and the Finnish diabetes risk assessment tool (0.86)30 are in the “good” level of accuracy. The Australian and Canadian diabetes risk assessment tools have AUC levels of 0.78 and 0.75 respectively.10,31 The Framingham Offspring Study’s diabetes type 2 model has an AUC of 0.72 (for their simple clinical model).35 For cardiovascular risk, Harrell’s c statistic is 0.71 for the Reynolds global score for men.36 The AUC value for the Scottish cardiovascular risk model (ASSIGN) is 0.73 for men.32 The “You Check” risk model for MI has an AUC of 0.75.

This study has several limitations. First, random and systematic errors in self-reported health data, and misclassification in disease ascertainment could affect the model’s accuracy. Secondly, non-probability sampling tends to reduce the internal and external generalizability of the results, especially if the sampling list includes known and unknown specific patterns. As we used the reverse chronological list of the patient encounters that spanned several months, the likelihood of the presence of known and unknown specific patterns in patient encounters was minimized. Thirdly, when a reviewer is not blinded to the case or control status of the patients, differential measurement bias might occur unintentionally or intentionally. However, if the reviewer takes note of potential sources of bias and diligently uses the same degree of meticulousness in standard chart review procedures regardless of the case or control status, these sources of bias are minimized. Another limitation was missing data on ethnicity, physical activity, and healthy eating habits, which were generally lacking in the clinical charts.


Overall, the results of this study demonstrate that the risk models developed for the CMHF online risk assessment tool are internally valid. External validation of the tool is in progress. We are currently evaluating the actual effects on health behaviour and disease outcomes of men who use “You Check”.

Competing Interests

None declared.


The clinical experts for development of the “You Check” tool were Dr. Richard Bebb (Endocrinology), Dr. Stacey Elliott (Psychiatry), Dr. Elliott Goldner (Psychiatry), Dr. Saul Isserow (Cardiology), and Dr. Roger Sutton (Nephrology).


1. Goldenberg SL. Status of men's health in Canada. Can Urol Assoc J 2014 Jul;8(7–8 Suppl 5):S142–4.

2. GBD 2015 Mortality and Causes of Death Collaborators. Global, regional, and national life expectancy, all-cause mortality, and cause-specific mortality for 249 causes of death, 1980-2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet 2016 Oct 8;388(10053):1459–544.

3. GBD 2015 DALYs and HALE Collaborators. Global, regional, and national disability-adjusted life-years (DALYs) for 315 diseases and injuries and healthy life expectancy (HALE), 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet 2016 Oct 8;388(10053):1603–58.

4. GBD 2015 Risk Factors Collaborators. Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet 2016 Oct 8;388(10053):1659–724.

5. Goldenberg SL. Men's Health Initiative of British Columbia: connecting the dots. Urol Clin North Am 2012 Feb;39(1):37–51.

6. Fernández-Sáez J, Ruiz-Cantero MT, Guijarro-Garví M, et al. Looking twice at the gender equity index for public health impact. BMC Public Health 2013 Jul 16;13:659.

7. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J 2014 Aug 1;35(29):1925–31.

8. D'Agostino RB Sr, Vasan RS, Pencina MJ, et al. General cardiovascular risk profile for use in primary care: the Framingham Heart Study. Circulation 2008 Feb 12;117(6):743–53.

9. Kim DJ, Rockhill B, Colditz GA. Validation of the Harvard Cancer Risk Index: a prediction tool for individual cancer risk. J Clin Epidemiol 2004 Apr;57(4):332–40.

10. Robinson CA, Agarwal G, Nerenberg K. Validating the CANRISK prognostic model for assessing diabetes risk in Canada's multi-ethnic population. Chronic Dis Inj Can 2011 Dec;32(1):19–31.

11. Murray CJ, Vos T, Lozano R, et al. Disability-adjusted life years (DALYs) for 291 diseases and injuries in 21 regions, 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet 2012 Dec 15;380(9859):2197–223.

12. Zhang W, O'Brien N, Forrest JI, et al. Validating a shortened depression scale (10 item CES-D) among HIV-positive people in British Columbia, Canada. PLoS One 2012;7(7):e40793.

13. Morin AJ, Moullec G, Maïano C, et al. Psychometric properties of the Center for Epidemiologic Studies Depression Scale (CES-D) in French clinical and nonclinical adults. Rev Epidemiol Sante Publique 2011 Oct;59(5):327–40.

14. Altman DG, Royston P. What do we mean by validating a prognostic model? Stat Med 2000 Feb 29;19(4):453–73.

15. Steyerberg EW. Clinical Prediction Models. Springer; 2009.

16. Pepe MS. The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press; 2004.

17. Efron B, Tibshirani R. An Introduction to the Bootstrap. Springer Science; 1993.

18. Swets JA. Measuring the accuracy of diagnostic systems. Science 1988 Jun 3;240(4857):1285–93.

19. Goldenberg SL, Pickles T, Chi K. The Intelligent Patient Guide To Prostate Cancer. Vancouver: Intelligent Patient Guide Ltd; 2013.

20. Hoeh MP, Deane LA. PSA screening: A discussion based on the USPSTF recommendations and the AUA and EAU guidelines. J Mens Health 2014 Apr 8;11(1):10–17.

21. Whittemore AS, Kolonel LN, Wu AH, et al. Prostate cancer in relation to diet, physical activity, and body size in blacks, whites, and Asians in the United States and Canada. J Natl Cancer Inst 1995 May 3;87(9):652–61.

22. Noar SM, Chabot M, Zimmerman RS. Applying health behavior theory to multiple behavior change: considerations and approaches. Prev Med 2008 Mar;46(3):275–80.

23. Geusens P, Dinant G. Integrating a gender dimension into osteoporosis and fracture risk research. Gend Med. 2007;4 Suppl B:S147–61.

24. Apgar V. A proposal for a new method of evaluation of the newborn infant. Curr Res Anesth Anal. 1953;32:260–7.

25. Teasdale G, Jennett B. Assessment of coma and impaired consciousness. A practical scale. Lancet 1974;2:81–4.

26. Anthony D, Parboteeah S, Saleh M, et al. Norton, Waterlow and Braden scores: a review of the literature and a comparison between the scores and clinical judgement. J Clin Nursing 2008;17:646–53.

27. Bainbridge D, Martin J, Arango M, et al. Perioperative and anaesthetic-related mortality in developed and developing countries: a systematic review and meta-analysis. Lancet 2012;380:1075–81.

28. Zhou X, Obuchowski NA, McClish DK. Statistical Methods in Diagnostic Medicine. John Wiley & Sons, Inc; 2011.

29. Thoopputra T, Newby D, Schneider J, et al. Survey of diabetes risk assessment tools: concepts, structure and performance. Diabetes Metab Res Rev 2012 Sep;28(6):485–98.

30. Lindström J, Tuomilehto J. The diabetes risk score: a practical tool to predict type 2 diabetes risk. Diabetes Care. 2003 Mar;26(3):725–31.

31. Chen L, Magliano DJ, Balkau B, et al. AUSDRISK: an Australian Type 2 Diabetes Risk Assessment Tool based on demographic, lifestyle and simple anthropometric measures. Med J Aust 2010 Feb 15;192(4):197–202.

32. Woodward M, Brindle P, Tunstall-Pedoe H, et al. Adding social deprivation and family history to cardiovascular risk assessment: the ASSIGN score from the Scottish Heart Health Extended Cohort (SHHEC). Heart 2007 Feb;93(2):172–6. Epub 2006 Nov 7.

33. Conroy RM, Pyörälä K, Fitzgerald AP, et al. Estimation of ten-year risk of fatal cardiovascular disease in Europe: the SCORE project. Eur Heart J 2003 Jun;24(11):987–1003.

34. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 2010 Jan;21(1):128–38.

35. Wilson PW, Meigs JB, Sullivan L, et al. Prediction of incident diabetes mellitus in middle-aged adults: the Framingham Offspring Study. Arch Intern Med 2007 May 28;167(10):1068–74.

36. Ridker PM, Paynter NP, Rifai N, et al. C-reactive protein and parental history improve global cardiovascular risk prediction: the Reynolds Risk Score for men. Circulation 2008 Nov 25;118(22):2243–51.