Introduction

Computer simulation models are being used increasingly both to model the progression of Type 2 diabetes and to estimate lifetime outcomes associated with different disease management strategies. These models estimate the future occurrence of diabetes-related complications and quantify outcomes in terms of mean life expectancy or mean quality-adjusted life expectancy. They can also be used to estimate future healthcare costs of patients with Type 2 diabetes, but their main purpose is to estimate the cost-effectiveness of different disease management strategies, especially when evidence of the impact of interventions on surrogate endpoints is limited, or where evidence from clinical trials has to be extrapolated over patients’ lifetimes. Currently, there are at least five simulation models being used in these ways [1, 2, 3, 4, 5].

Existing simulation models have synthesised data from a variety of sources into a series of modules that are used to estimate the occurrence of different complications. For example, the first model of the progression of Type 2 diabetes had separate modules for cardiovascular disease, retinopathy, nephropathy and neuropathy [1], and used a probabilistic Monte-Carlo analysis to simulate event histories over the remaining lifetimes of newly diagnosed patients with Type 2 diabetes. While that model represented a landmark in the use of computer simulation to model the progression of the disease, it had several limitations. In particular, it placed considerable reliance on data from a Type 1 diabetes trial [6] and on cardiovascular risk estimates derived from the Framingham cohort study, despite there being only 337 people with Type 2 diabetes in the Framingham study and consequent doubts concerning its predictive accuracy for such patients [7]. The Framingham cardiovascular risk equations for the general population were modified by a dichotomous variable indicating the presence or absence of diabetes. That first model was also unable to simulate the timing or occurrence of multiple complications and hence could not fully capture the clustering of complications that tends to arise [2]. While subsequent simulation models have been able to address some of these limitations [2], there is a need for a comprehensive integrated system of equations from a large and well-validated diabetes-specific data source to forecast major diabetes-related complications.

Observational analyses using data from the UKPDS have examined the relationship between exposure to glycaemia [8] and levels of systolic BP [9] over time, and the development of macrovascular and microvascular complications. The results of these studies, expressed in terms of the change in relative risk of major complications associated with decrements in HbA1c or systolic BP, were consistent with those obtained with regimens of intensive control of blood glucose and BP in the clinical trial. The UKPDS Risk Engine, a risk calculator for estimating absolute risk of coronary heart disease [10] and stroke [11], has been developed based on factors such as age, sex, race, smoking status, and levels of HbA1c, systolic BP and lipid ratios in the first 2 years after diagnosis of Type 2 diabetes.

The model presented here, called the UKPDS Outcomes Model, develops further the modelling of lifetime outcomes, through a system of equations, to predict the occurrence and timing of seven diabetes-related complications and death, and thereby to calculate life expectancy and quality-adjusted life expectancy for patients with Type 2 diabetes. A key aspect of this model is that it is designed to capture the association between different types of complications at an individual patient level. Complications may be associated not only because they share common known risk factors, but also due to the event-related dependence that arises when one complication substantially increases the likelihood of another [2]. For example, the probability of a patient experiencing congestive heart failure (CHF) or myocardial infarction (MI) is positively associated with systolic BP, but the risk of an MI is higher for patients with a history of CHF due to event-related dependence. To account for these dependencies the model makes use of time-varying risk factors, which also facilitates its application to patient groups at different stages of the disease.

Subjects and methods

Study subjects

The UKPDS was a clinical trial evaluating regimens with more intensive control of blood glucose and BP than had been described previously [12]. Briefly, between 1977 and 1991, 5102 patients with newly diagnosed Type 2 diabetes were recruited to the study. The subjects were aged between 25 and 65 years, were subsequently shown to have a fasting plasma glucose above 6 mmol/l on two occasions, had no recent history of MI, ischaemic heart disease (IHD) or CHF, and had never had more than one major vascular event or a severe concurrent illness that would limit life expectancy. Clinical and biochemical measurements, including HbA1c, systolic BP, and lipid and lipoprotein fractions, were taken at entry to the study, at randomisation after 3 months of dietary therapy, and in each subsequent year. The systolic BP result was the average of three measurements taken at the same visit. Lipids were included in the model as the ratio of total cholesterol : HDL cholesterol. The methodology for the biochemical measurement of HbA1c in the UKPDS, and its standardisation with measures reported in the DCCT, has been reported elsewhere [13]. Included in this analysis were results from the 3642 patients (white, Asian–Indian and Afro-Caribbean) for whom annual data on potential risk factors were available. The characteristics of these patients and comparisons with the 3867 patients included in the UKPDS glucose control study have been reported elsewhere [8].

The study design and protocol amendments, which are in accordance with the guidelines of the Declaration of Helsinki as revised in 2000, were approved by the Central Oxford Research Ethics Committee and by the equivalent committees at each centre. Each patient also gave informed witnessed consent.

Model structure

The aim of the UKPDS Outcomes Model is to estimate the first occurrence of each of seven diabetes-related complications (fatal or non-fatal MI, other IHD, stroke, heart failure, amputation, renal failure and eye disease measured in terms of blindness in one eye) and death in order to estimate lifetime outcomes and quality-adjusted life expectancy. The model is based on an integrated system of parametric equations used to estimate absolute risk of any of the above complications based on the patient’s characteristics (e.g. age and sex), and time-varying risk factors such as HbA1c and the patient’s history of complications. Definitions of all risk factors initially included in the model are listed in Table 1. To improve model stability, values for time-varying risk factors (HbA1c, systolic BP, and total cholesterol : HDL cholesterol) were calculated as the means of the values for the previous 2 years (e.g. the value for these variables in the fourth year was the average of the values from the second and third years), while smoking status was based on 3-year periods from diagnosis of diabetes. To model the long-term dependence [14] between different complications, time-varying covariates indicating whether a patient had had a history of other complications since diagnosis of diabetes were also included in the model. These covariates were set to 0 until an event occurred, and then set to 1 from that point onwards.
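The construction of the time-varying covariates described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the example HbA1c values, and the choice to set the history flag to 1 in the year of the event itself are assumptions.

```python
from typing import List, Optional


def lagged_mean(values: List[float], year: int, window: int = 2) -> Optional[float]:
    """Mean of the `window` annual values preceding `year` (years are 1-indexed).

    Returns None when fewer than `window` earlier years exist.
    """
    start = year - window  # first year in the averaging window
    if start < 1:
        return None
    prior = values[start - 1 : year - 1]
    return sum(prior) / window


def history_flags(event_year: Optional[int], n_years: int) -> List[int]:
    """0 until the event occurs, then 1 from that year onwards (assumed convention)."""
    return [0 if event_year is None or year < event_year else 1
            for year in range(1, n_years + 1)]


hba1c = [7.0, 6.5, 6.9, 7.3]  # hypothetical annual HbA1c (%) for years 1 to 4
print(lagged_mean(hba1c, year=4))              # average of years 2 and 3
print(history_flags(event_year=3, n_years=5))  # flag switches on in year 3
```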

Table 1 Abbreviations, definitions and values of variables included in the model

The procedures used to fit equations to the observed events in the UKPDS population, as well as the estimated parameters, are reported in greater detail in the statistical appendix. In brief, each type of diabetes-related event was modelled using one or more equations that included time-varying risk factors. In the case of diabetes-related complications, a Weibull proportional hazards regression was used to model the occurrence of a composite outcome covering both fatal and non-fatal events. The coefficients for risk factors were then estimated using maximum likelihood methods that account for censoring (e.g. due to loss to follow-up or death). Risk factors with a p value of less than 0.05 were considered statistically significant. Separate equations were used to model diabetes- and non-diabetes-related mortality using a combination of Gompertz and logistic regression equations. Finally, the time paths of four risk factors were also estimated using these data. To aid exposition, we grouped the estimated equations into the following categories.
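To show how a fitted Weibull proportional hazards equation can yield an annual event probability, the sketch below uses the standard Weibull cumulative hazard H(t) = exp(λ + xβ)·t^ρ and the conditional probability 1 − exp(H(t) − H(t+1)). The parameterisation is a common textbook form and is assumed here; the coefficient values and covariate names are hypothetical and are not taken from Table 2.

```python
import math
from typing import Dict


def weibull_cum_hazard(t: float, intercept: float, shape: float,
                       betas: Dict[str, float], x: Dict[str, float]) -> float:
    """Cumulative hazard H(t) = exp(intercept + sum(beta * x)) * t**shape."""
    linear = intercept + sum(beta * x[name] for name, beta in betas.items())
    return math.exp(linear) * t ** shape


def annual_event_probability(t: float, intercept: float, shape: float,
                             betas: Dict[str, float], x: Dict[str, float]) -> float:
    """P(event in (t, t+1] | event-free at t), holding x fixed during the year."""
    h_start = weibull_cum_hazard(t, intercept, shape, betas, x)
    h_end = weibull_cum_hazard(t + 1, intercept, shape, betas, x)
    return 1.0 - math.exp(h_start - h_end)


# Hypothetical coefficients, for illustration only.
betas = {"sbp_per_10mmhg": 0.10, "chf_history": 1.5}
low = annual_event_probability(5, -6.0, 1.2, betas,
                               {"sbp_per_10mmhg": 13.0, "chf_history": 0})
high = annual_event_probability(5, -6.0, 1.2, betas,
                                {"sbp_per_10mmhg": 13.0, "chf_history": 1})
print(low, high)  # the history flag raises the annual probability
```

Because the covariates enter through the linear predictor, updating them each year is enough to make the annual probability respond to changing risk factors and to newly acquired event histories.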

Cardiovascular disease

The increased risk of cardiovascular disease among people with Type 2 diabetes is well established and is increasingly recognised as one of the major hazards of Type 2 diabetes. In developed countries, cardiovascular disease accounts for the majority of deaths amongst patients with Type 2 diabetes [15] and is a significant contributor to diabetes-related healthcare costs [16]. Within the cardiovascular disease category we have three Weibull equations to estimate the absolute risk of the following: (i) MI, defined as non-fatal myocardial infarction (International Classification of Diseases, Ninth Revision [ICD-9] code 410), or fatal vascular cardiac event (ICD-9 codes ≥410–≤414.9, or ≥428–≤428.9), or sudden death (ICD-9 codes ≥798–≤798.9); (ii) IHD, defined as an ICD-9 code of ≥411–≤414.9; and (iii) CHF, defined as an ICD-9 code of ≥428–≤428.9. We only recorded IHD and CHF events if they occurred prior to an MI event; a patient who had experienced an MI was not classified subsequently as developing IHD.

Cerebrovascular disease

People with Type 2 diabetes have a higher risk of stroke, and previous epidemiological studies have found that this risk increases with age, elevated BP, smoking and lipoprotein levels [17]. In our model a single Weibull equation was used to estimate the absolute risk of a first non-fatal stroke (ICD-9 codes ≥430–≤434.9, or 436) or fatal stroke (ICD-9 codes ≥430–≤438.9).

Amputation

Patients with diabetes have an amputation rate that is many times higher than that of patients without diabetes, and previous studies have found that the primary risk factors for amputation include age, (male) sex, smoking, the presence of peripheral vascular disease, and BP [18]. Our model contains a single Weibull equation to estimate the absolute risk of a first amputation of a digit or limb (ICD-9 codes ≥5.845–≤5.848, or 250.6) or a fatal peripheral vascular event (ICD-9 codes 997.2, 997.6, 250.6 or 440.2).

Blindness

The model contains a Weibull equation to estimate the risk of blindness in one eye, which is defined as a visual acuity Snellen 6/60 or ETDRS logMAR 1.0 or worse for any reason, persisting for 3 months (ICD-9 codes ≥369–≤369.9).

Nephropathy

Increasing duration of diabetes, hypertension and poor glycaemic control have been shown to elevate the risk of renal disease [19] and the subsequent risk of mortality [20]. Our model contains a single Weibull equation to predict renal failure, defined as creatinine levels of above 250 µmol/l, not ascribable to any acute inter-current illness (ICD-9 codes 250.3, and ≥585–≤586), and death due to renal failure (ICD-9 codes ≥580–≤593.9).

Risk factor progression

Previous simulation models of the progression of diabetes have made a variety of assumptions regarding how risk factors such as HbA1c and systolic BP change over time [2, 4]. While carrying forward current values provides a simple means of extrapolation [21, 22], it does not account for the likely change in some of these risk factors over time (e.g. the upward trend in HbA1c levels that many patients experience). In the UKPDS Outcomes Model, we treat the four risk factors (HbA1c, systolic BP, total cholesterol : HDL cholesterol, and smoking status) as panel data (longitudinal data), and fit a random effects model to allow estimation of a time path [23]. While the exact specification differs between risk factors, these equations generally include the value or status of the risk factor at the time a decision was made regarding randomisation in the UKPDS (which took place after a 3-month dietary run-in) [24], the number of years since diagnosis, and the value of the risk factor in the preceding period. For HbA1c an indicator variable for the second year was also included to account for the initial decline in HbA1c observed in the study. Equations for HbA1c, systolic BP, and total cholesterol : HDL cholesterol were based on annual measures of each risk factor, while smoking status (due to its less frequent ascertainment) was based on 3-year periods from diagnosis of diabetes.
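A risk factor time path of this kind can be sketched as a simple recursion in which each year's value depends on the baseline value, time since diagnosis and the previous year's value. The linear form, the use of the log of years since diagnosis, and all argument names below are assumptions made for illustration; the fitted specifications and coefficients are those reported in Table 4, and the random effects estimation itself is not reproduced here.

```python
import math
from typing import List


def project_risk_factor(baseline: float, first_year_value: float, n_years: int,
                        intercept: float, b_log_years: float, b_lag: float,
                        b_baseline: float, b_year2: float = 0.0) -> List[float]:
    """Return projected annual values for years 1..n_years since diagnosis.

    `b_year2` mimics an indicator for the second year, such as the one used
    for HbA1c to capture its initial decline.
    """
    path = [first_year_value]
    for year in range(2, n_years + 1):
        value = (intercept
                 + b_log_years * math.log(year)   # assumed functional form
                 + b_lag * path[-1]               # value in the preceding period
                 + b_baseline * baseline          # value at randomisation
                 + (b_year2 if year == 2 else 0.0))
        path.append(value)
    return path


# Hypothetical coefficients producing a gradual upward drift.
print(project_risk_factor(7.0, 7.0, 6, intercept=0.0, b_log_years=0.1,
                          b_lag=1.0, b_baseline=0.0))
```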

Mortality

People diagnosed with Type 2 diabetes in middle age typically have a life expectancy of between 5 and 10 years shorter than people without diabetes, with heart disease being the major underlying cause of death [25]. Our model contains three equations for estimating absolute risk of mortality. Two of these equations estimate the likelihood of death after the occurrence of several diabetes-related complications that have been shown to elevate the risk of mortality. The first equation, based on logistic regression, estimates the probability of death in the first year in which either an MI, CHF, stroke, amputation or renal failure first occurs (as defined in the Weibull models outlined above). The second equation estimates the risk of diabetes-related mortality of patients with a history of any of these events in all subsequent years. The third mortality equation estimates the risk of death from causes unrelated to diabetes (primarily due to cancers and accidents).
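The two functional forms used for mortality can be turned into annual probabilities as sketched below. The logistic function is standard; the Gompertz cumulative hazard H(t) = exp(lp)·(exp(φt) − 1)/φ is one common parameterisation and is assumed here, as is the use of age as the time variable. All numeric values are hypothetical.

```python
import math


def logistic_probability(linear_predictor: float) -> float:
    """Probability of death in the event year from a logistic equation."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def gompertz_annual_probability(t: float, linear_predictor: float, phi: float) -> float:
    """P(death in (t, t+1] | alive at t) under a Gompertz hazard (assumed form)."""
    def cum_hazard(s: float) -> float:
        return math.exp(linear_predictor) * (math.exp(phi * s) - 1.0) / phi
    return 1.0 - math.exp(cum_hazard(t) - cum_hazard(t + 1))


# Hypothetical parameters: the annual probability rises with age when phi > 0.
p60 = gompertz_annual_probability(60, linear_predictor=-10.0, phi=0.1)
p70 = gompertz_annual_probability(70, linear_predictor=-10.0, phi=0.1)
print(logistic_probability(0.0), p60, p70)
```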

Using the model to simulate outcomes

The main purpose of the model is to estimate the likely occurrence of the major diabetes-related complications over a lifetime for patients with specified prognostic risk factors, in order to calculate health outcomes such as life expectancy or quality-adjusted life expectancy.

While increases in mean life expectancy are widely recognised as a useful measure of benefit [26], it is also important to encapsulate the potential impact of complications on quality of life. This model uses the quality-adjusted life-years (QALY) method to adjust length of life for quality of life by assigning a value or health utility, on a scale on which 0 represents death and 1 represents full health, for each year of life.

The impact on utility of different diabetes-related complications has been reported in a number of studies, including one study in which the EQ-5D health status questionnaire [27] was used to survey 3192 patients still participating in the UKPDS in 1997 [28]. Based on data from that survey, the mean utility for patients free of microvascular and macrovascular complications was estimated at 0.78. Patients with a history of complications have a lower utility and the following decrements were estimated: −0.055 for an MI; −0.090 for other IHD; −0.164 for stroke; −0.108 for heart failure; −0.280 for amputation; and −0.074 for blindness in one eye. For simplicity and in the absence of sufficient data to estimate empirically, it is assumed that multiple complications have an additive effect on utility, e.g. the utility of a patient who had other IHD and then had an MI would first be decremented by 0.090 and then by a further 0.055. However, the UKPDS Outcomes Model is sufficiently flexible to allow other assumptions to be adopted regarding the impact that different complications (and combinations of complications) have on quality of life.

When estimating QALYs it is important to model the sequence of endpoints for each patient. For example, consider the simulated profiles for two patients who are free of complications: the first patient is predicted to have an amputation in the second year post follow-up, an IHD and an MI in the fourth year post follow-up, and is predicted to die in the fifth year; the second patient is predicted to have IHD in the third year post follow-up, an amputation in the fourth year, and a fatal MI in the fifth year. Using the utility weights reported above, while both patients experience the same set of diabetes-related complications over their remaining lifetimes, the QALY profile is different: the profile of the first patient is equivalent to 2.19 QALYs, and the second is equivalent to 2.66 QALYs. By simulating individual patient histories, the model is able to take these differences into account.
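The dependence of QALYs on the ordering of events can be checked with a short calculation using the utility values quoted above. The accounting convention is an assumption of this sketch: a decrement applies from the year in which the event occurs, decrements are additive, and no QALYs accrue in the year of death.

```python
from typing import Dict, List

BASE_UTILITY = 0.78
DECREMENT = {"MI": 0.055, "IHD": 0.090, "stroke": 0.164,
             "CHF": 0.108, "amputation": 0.280, "blindness": 0.074}


def qalys(events_by_year: Dict[int, List[str]], death_year: int) -> float:
    """Sum annual utilities for the years lived before `death_year`."""
    utility = BASE_UTILITY
    total = 0.0
    for year in range(1, death_year):
        for event in events_by_year.get(year, []):
            utility -= DECREMENT[event]  # additive decrements (assumed)
        total += utility
    return total


patient_1 = qalys({2: ["amputation"], 4: ["IHD", "MI"]}, death_year=5)
patient_2 = qalys({3: ["IHD"], 4: ["amputation"]}, death_year=5)  # fatal MI in year 5
print(round(patient_1, 3), round(patient_2, 3))
```

Under these assumptions the second profile sums to 2.66 QALYs, as quoted. The first profile sums to 2.135 when both the IHD and MI decrements are applied in the fourth year, and to the quoted 2.19 only if the MI decrement is not applied in that year, so the convention for events occurring in the same year should be checked against the published model.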

Simulations are based on a probabilistic discrete-time illness–death model [14] with annual cycles. In the model, patients start with a given health status (e.g. no complications) and can have one or more non-fatal complications and/or can die in any model cycle. Figure 1 provides an algorithm that illustrates the sequence of modelling events. To begin, the values for all the characteristics for each patient included in the simulation are set (the variables representing these characteristics are listed in Table 1). The first cycle of the model is then run, in which the probabilities of death and of each of the seven complications are calculated. Each probability is compared with a random number drawn from a uniform distribution ranging from 0 to 1 to determine whether an event occurs.

Fig. 1 Algorithm for model simulation

There is a need to account for event-related dependence between some complications, as noted above. The actual event-related dependencies were estimated empirically during the equation-fitting procedure, and included if they had a significance level (p value) of less than 0.05. We then checked all significant dependencies against the known epidemiology, as discussed in more detail below. In the model, if a non-fatal event is predicted to have occurred, the patient acquires a history of that event (the time-varying covariate for that endpoint is updated from 0 to 1), and thereafter carries that history in their set of characteristics. In addition, as noted above, the two diabetes-related mortality equations are applied to all patients who have a history of complications that were found to elevate the risk of death (MI, CHF, stroke, amputation or renal failure) from the point at which the complication is predicted to have occurred. These equations deal with the likelihood of immediate death following the first event, and the long-term elevation of risk of death consequent to acquiring a history of these complications. Finally a non-diabetes mortality equation is used to forecast the incidence of death from cancer, accidents or other causes not addressed elsewhere in the model.

If the model predicts that a death has occurred, the years lived and quality-adjusted years lived (life years decremented in each cycle by the reduced quality of life associated with particular complications) by that patient are calculated. If the patient is predicted to have survived that cycle, the risk factor equations are used to update their current risk factor values (e.g. systolic BP) and these are carried forward to the next cycle of the model along with the updated event history.

It is important to note that the order in which the event equations listed in Figure 1 (equations 1 to 10) are evaluated to determine the occurrence of an event is not predetermined. Further, some of these events are competing risks (e.g. if a patient dies within a cycle of the model, they can have no additional events). To take this into account, the equations are run in random order in each cycle.
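One annual cycle with randomly ordered equations can be sketched as follows. This is a simplified illustration rather than the published algorithm: a single hypothetical "death" entry stands in for the three mortality equations, the probability functions and their values are invented, and an event is taken to occur when the uniform draw falls below the calculated probability.

```python
import random
from typing import Callable, Dict, Set

ProbabilityFn = Callable[[Set[str]], float]


def run_cycle(history: Set[str], equations: Dict[str, ProbabilityFn],
              rng: random.Random) -> bool:
    """Run one annual cycle; return True if the patient dies in it."""
    order = list(equations)
    rng.shuffle(order)                 # equations are evaluated in random order
    for name in order:
        if name in history:
            continue                   # only the first occurrence is modelled
        if rng.random() < equations[name](history):
            if name == "death":
                return True            # competing risk: no further events
            history.add(name)          # history flag switches from 0 to 1
    return False


# Hypothetical annual probabilities; a CHF history raises the MI probability.
equations = {
    "MI": lambda h: 0.10 if "CHF" in h else 0.02,
    "CHF": lambda h: 0.01,
    "death": lambda h: 0.05 if h else 0.01,
}
rng = random.Random(0)
history: Set[str] = set()
cycles_run = 0
while cycles_run < 100:
    cycles_run += 1
    if run_cycle(history, equations, rng):
        break
print(cycles_run, sorted(history))
```

Because the history set is updated inside the loop, an event drawn early in a cycle can alter the probabilities of equations evaluated later in the same cycle, which is why the evaluation order matters.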

Internal validation of the model using the UKPDS population

We tested the consistency of the forecast cumulative incidence of different complications and death with the cumulative incidence calculated using non-parametric (life-table) methods. This test of internal validity [29] was based on the forecast number of events from the second to the twelfth year after diagnosis, using the information from the first year after diagnosis in the 3642 patients used to estimate the model [11].

Application of the model

We demonstrate two potential applications of the model. Firstly, we used the simulation model to extrapolate outcomes over the remaining lifetime of the 3867 patients randomised to conventional or intensive blood glucose control in the UKPDS. Estimated outcomes are stratified by sex and 5-year age categories for patients over 45 years of age at diagnosis of diabetes. Secondly, we calculated the incremental benefit of a regimen of intensive blood glucose control compared with conventional therapy for patients’ remaining lifetimes in terms of life expectancy and quality-adjusted life expectancy. Previously, in the absence of a full outcomes model, we reported the cost-effectiveness of this intervention only in terms of time free of diabetes-related complications [30] using all patients randomised in that comparison rather than the 3642 used here. Given the illustrative nature of these applications we did not discount outcomes.

Handling uncertainty

Appropriate handling of uncertainty in complex patient simulation models such as the UKPDS Outcomes Model is not straightforward. In applications involving extrapolation of outcomes of patients who were alive at the end of the study, a combination of bootstrap and multiple imputation methods was used. The confidence intervals around the QALY estimates were adjusted to take into account the variance within and across imputed data sets [31]. Our method removes Monte-Carlo error, so that confidence intervals from the outcomes model reflect parameter uncertainty in the model.
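A common way to combine within- and between-imputation variance is Rubin's combining rule, in which the total variance is W + (1 + 1/m)·B for m imputed data sets. Whether this exact rule was the one applied here is an assumption of the sketch below, and the input numbers are hypothetical.

```python
from statistics import mean, variance
from typing import List, Tuple


def pool_imputations(estimates: List[float],
                     within_variances: List[float]) -> Tuple[float, float]:
    """Pool m estimates; return (pooled estimate, total variance)."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(within_variances)   # W: average within-imputation variance
    between = variance(estimates)     # B: sample variance across imputations
    return pooled, within + (1.0 + 1.0 / m) * between


# Hypothetical QALY estimates and variances from three imputed data sets.
estimate, total_var = pool_imputations([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var)
```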

Results

The median follow-up time of patients in the UKPDS was 10.3 years with 39 460 person-years of follow-up data available for model fitting. The coefficients representing the log of the hazard ratios (or the log of the odds ratio in the case of the logistic equations representing event fatality [equation 8] and smoking status [equation 14]), and the standard errors for each coefficient are listed in Table 2 (event equations), Table 3 (mortality equations) and Table 4 (risk factor equations). The numbers of subjects in which each equation was estimated are also shown. In three event equations (IHD, CHF and stroke), a small number of patients with missing information were omitted from that particular estimation, but the eliminated cases represent less than 1% of the total.

Table 2 Sample size, functional form, parameters and beta coefficients (SEs) for seven equations to estimate the probability of diabetes-related complications
Table 3 Sample size, functional form, parameters and beta coefficients (SEs) for three equations to estimate the probability of mortality
Table 4 Sample size, functional form, parameters and beta coefficients (SEs) for four equations used to estimate annual risk factor values from diagnosis of diabetes

The relationship between the equations is illustrated in Figure 2 along with the estimated hazard or odds ratio for each risk factor. When interpreting these ratios it is important to note that some of the risk factors have been subject to a log transformation. The arrows in the figure linking different equations indicate the existence and the direction of event-related dependence, e.g. the arrow from CHF to MI indicates that an individual who has experienced CHF subsequently has a 4.74-fold increased risk of MI.
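The relationship between a beta coefficient and the ratio shown in the figure, and the effect of a log-transformed risk factor on its interpretation, can be made concrete as follows. The function names are illustrative, and the coefficient for the log-transformed case is hypothetical.

```python
import math


def hazard_ratio(beta: float, change: float = 1.0) -> float:
    """Ratio for a `change`-unit increase in an untransformed covariate."""
    return math.exp(beta * change)


def hazard_ratio_log_covariate(beta: float, fold: float) -> float:
    """Ratio for a `fold`-fold increase when the covariate enters as ln(x)."""
    return fold ** beta  # equals exp(beta * ln(fold))


# A 4.74-fold ratio corresponds to a beta coefficient of ln(4.74).
print(hazard_ratio(math.log(4.74)))
# For a log-transformed covariate the ratio refers to a proportional change.
print(hazard_ratio_log_covariate(beta=2.0, fold=2.0))
```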

Fig. 2 Summary of model equations showing event-related dependencies and hazard/odds ratio for each risk factor. The hazard/odds ratio shown for each risk factor is the exponential of the corresponding beta coefficient shown in Tables 2 and 3

Figure 3a–d shows, for HbA1c, systolic BP, total cholesterol : HDL cholesterol and smoking status respectively: (i) the actual mean value of the risk factor (for the 3642 patients used to estimate the model); and (ii) the predicted values, using the equations given in Table 4, for patients with differing values of the relevant risk factor at the time of the decision regarding randomisation in the UKPDS. For HbA1c (Figure 3a), patients with an initial value of between 6% and 8% subsequently follow a time path similar to the mean of the observed values for all patients, while the time path for patients with higher initial values differs from that mean, possibly reflecting more intensive treatment aimed at improving the metabolic control of these high-risk patients.

Fig. 3 Observed and simulated time paths for (a) mean HbA1c, (b) systolic BP and (c) total cholesterol : HDL cholesterol, from diagnosis of diabetes, as well as for (d) smoking status

Figure 4 shows the simulated and actual cumulative incidence for each of the diabetes-related events, and all-cause mortality, up to 12 years from diagnosis. The estimated number of patients experiencing each event is based initially on baseline levels of risk factors and subsequently on the time paths predicted by the time-varying risk factor equations. The predicted numbers in all cases are within the 95% confidence intervals surrounding the actual cumulative incidence curves for the study population.

Fig. 4 Observed and simulated cumulative incidence for each of seven diabetes-related complications, and all-cause mortality, for 12 years from diagnosis of diabetes. a. Equation 1, ischaemic heart disease. b. Equation 2, myocardial infarction. c. Equation 3, heart failure. d. Equation 4, stroke. e. Equation 5, amputation. f. Equation 6, blindness. g. Equation 7, renal failure. h. Equations 8, 9 and 10, mortality. Black solid lines, observed; red dotted lines, estimated; grey dotted lines, observed upper and lower 95% CIs

Two illustrative applications of the model

Estimates of mean life expectancy and mean quality-adjusted life expectancy by age group and sex are listed in Table 5. For example, the mean life expectancy for males between the age of 45 and 50 years at diagnosis of diabetes is 25.54 years (95% CI: 21.92–29.66). This compares with 30.05 years remaining life expectancy for males between the age of 45 and 50 years in the UK general population in 1999 to 2001 [32], suggesting a loss of approximately 5 years of life attributable to Type 2 diabetes, in line with previously reported estimates [25]. The corresponding mean quality-adjusted life expectancy is estimated to be 18.82 (95% CI: 16.48–21.15) QALYs. As in the general population, women have a higher life expectancy at all ages. It is important to note that life expectancies are for cohorts in the UKPDS, and given the criteria for inclusion in the study (e.g. no recent history of MI), these patients may have longer life expectancies than some diabetic cohorts in the general population. Further, reported confidence intervals are for the life expectancies of cohorts, and the estimated life expectancy for any individual patient would be subject to a much greater level of uncertainty.

Table 5 Estimated mean life expectancy and mean quality-adjusted life expectancy by age at diagnosis of diabetes in the UKPDS population

Table 6 reports the quality-adjusted life expectancy for UKPDS patients who were treated with a regimen of intensive or conventional blood glucose control under the trial protocol. The quality-adjusted life expectancy for the conventional group was 16.35 QALYs, and that for the intensive group was 16.62 QALYs, a difference of 0.27 (95% CI: −0.48 to 1.03) QALYs. Such estimates will permit the calculation of cost-effectiveness.

Table 6 Estimated life expectancy and QALYs for UKPDS patients on conventional and intensive blood glucose control

Discussion

Simulation models such as the UKPDS Outcomes Model make it possible to quantify the lifetime impact of chronic diseases and the long-term benefits associated with alternative management strategies. They are therefore an important adjunct to randomised controlled trials and other types of epidemiological study, and may also be of value in disease management and healthcare planning. They are essential for proper evaluation of interventions affecting any of the risk factors incorporated in the model, because they permit users to estimate the likely occurrence, over a lifetime, of the main diabetes-related complications. Only when this has been done can quality-adjusted life expectancy and lifetime treatment costs be fully estimated, and the cost-effectiveness of the intervention calculated.

As noted in the Introduction, a number of other diabetes simulation models have been developed, and the present model has benefited greatly from the insights these have provided.

The UKPDS Outcomes Model has some important advantages over some previous simulation models: (i) we used data from 3642 patients who participated in the UKPDS; (ii) there was a median follow-up period of 10.3 years; (iii) we analysed full clinical histories including biochemical measures and ascertainment of major diabetes-related complications; and (iv) there was a very low loss of data during follow-up. The quantity and quality of these data, and the fact that they have come from a single study, have obviated the need to synthesise evidence from many disparate sources. Indeed, we have also been able to use this patient population to estimate the impact of the same diabetes-related complications on subsequent healthcare costs and on quality of life. These results have been reported elsewhere [28] but in time will be integrated with the outcome model described here to form part of a more general UKPDS diabetes health economics model.

The volume of data available to us also removed many constraints on the modelling approach adopted and allowed us to construct a model based on a set of equations estimated using a consistent methodology. In particular, our use of time-varying covariates allowed us to model the linkages between different diabetes-related complications. These have increasingly been recognised as crucial to such simulation models: as a recent study has noted “...it can be catastrophically inaccurate to simulate each kind of event as if it were statistically independent” [2]. We have attempted to address this issue by incorporating linkages where there is a high degree of statistical significance and supporting clinical and epidemiological evidence. For example, our results for stroke are consistent with those from the Framingham study showing that histories of heart failure and of atrial fibrillation increase the risk of subsequent stroke [17]. The model may also be used to simulate outcomes in populations with differing baseline characteristics such as age, smoking status, systolic BP, blood lipids, HbA1c and history of diabetes-related complications.

A number of limitations of the UKPDS Outcomes Model should be acknowledged. Firstly, the model only predicts the first event in any single category of diabetes-related complications, and does not allow series of events such as sequential amputations to be modelled directly. However, this limitation should not be overstated, as: (i) such multiple events in the UKPDS data were relatively infrequent; (ii) subsequent fatal events in specific categories of diabetes-related complications are included in the diabetes-related mortality equation; and (iii) additional post-study monitoring data will in time allow this issue to be revisited.

Secondly, the model does not explicitly incorporate a number of morbidities related to diabetes. In particular, peripheral neuropathy and ulceration may give rise to significant reduction in quality of life, to hospitalisation and to high treatment costs [33]. These conditions are also likely to elevate the risk of subsequent events such as amputation. However, these were not included as major endpoints in the UKPDS and cannot easily be incorporated into the outcomes model. Hypoglycaemia and hyperglycaemia are also excluded. Episodes of these conditions are infrequent and rarely serious, but more serious morbidity could occur in small numbers of patients undertaking intensive glycaemic management.

Thirdly, some complications of diabetes are represented in the model using a single state. For example, the only state representing eye disease in the model is the endpoint of blindness in one eye. Again, this is consistent with the primary analyses of the UKPDS, but it almost certainly fails to fully describe the complex process of disease progression and associated treatments, as well as the potential impact on quality of life. There is scope to develop a more detailed model of the progression of eye disease in future work.

Fourthly, we have briefly described the methods used to estimate confidence intervals around our estimates, but there are many important methodological and computational issues in the generation of these confidence intervals and more generally in the ways uncertainty should be handled in a complex model; these need to be explored in greater detail in the future.

We expect the UKPDS Outcomes Model to develop in a number of ways as further follow-up data become available. The greater length of follow-up will allow the shape of the hazard in the Weibull and Gompertz equations to be estimated with greater precision. The risk factors and linkages contained in the model can be re-examined, especially the equations that are currently based on a relatively small number of events (e.g. only 24 patients had renal failure).

The UKPDS Outcomes Model has been described here in some detail to demonstrate the rationale for our approach, the problems encountered and the remaining limitations. The coefficients of all equations and a description of the simulation algorithm have been published to provide a high degree of transparency and to promote a deeper understanding of the model. In this study we have demonstrated that the UKPDS Outcomes Model is able to simulate event histories that closely match observed outcomes of patients in the UKPDS, and that it can also be used to extrapolate lifetime outcomes. The next challenge is to test external validity by applying the model to other populations of Type 2 diabetic patients. Until this has been undertaken, caution should be exercised when using the model to predict outcomes in these populations. We look forward to the assessment of the model’s external validity and to its future use in assessing outcomes of therapies for Type 2 diabetes.