Comparative outcomes of total hip and knee arthroplasty: a prospective cohort study
  1. David Hamilton1,
  2. G Robin Henderson2,
  3. Paul Gaston1,
  4. Deborah MacDonald1,
  5. Colin Howie1,
  6. A Hamish R W Simpson1
  1. 1Department of Orthopaedics and Trauma, University of Edinburgh and Royal Infirmary of Edinburgh, Little France, Edinburgh, UK
  2. 2Department of Medicine of the Elderly, Royal Infirmary of Edinburgh, Little France, Edinburgh, UK
  1. Correspondence to Dr David Hamilton, Department of Orthopaedics and Trauma Chancellor's Building University of Edinburgh 49 Little France Crescent Edinburgh EH16 4SB, UK; d.f.hamilton{at}


Purpose The comparative outcome of primary hip and knee arthroplasty is not well understood. This study aimed to investigate the outcome and satisfaction of these procedures and determine predictive models for 1 year patient outcome with a view to informing surgical management and patient expectations.

Study design Prospective cohort study of all primary hip and knee arthroplasty procedures performed at the Royal Infirmary of Edinburgh between January 2006 and November 2008. General health (SF-12) and joint specific function (Oxford Score) were assessed pre-operatively and at 6 and 12 months post-operatively. Patient satisfaction was assessed at 12 months.

Results 1410 total hip arthroplasty (THA) and 1244 total knee arthroplasty (TKA) procedures were assessed. Oxford Score improved by 4.9 points more in THA patients than in TKA patients. SF-12 physical scores were on average 2.7 points greater in the THA patients at one year. Satisfaction was also greater (91%) following THA compared with TKA (81%). Regression modelling was not able to predict individual patient outcome; however, mean pre-operative Oxford Scores were found to be strong predictors of mean post-operative Oxford Scores for each procedure. Age, gender and pre-operative general health scores did not influence these models.

Conclusions Both THA and TKA confer substantial improvement in patient outcome; however, greater joint specific, general health and satisfaction scores are reported following THA. This difference is physical in nature. Regression models are presented that can be applied to predict mean hip/knee arthroplasty outcome based on preoperative values.

  • Arthroplasty outcome hip knee


Total hip arthroplasty and total knee arthroplasty (THA and TKA) are the only successful interventions in treating the morbidity of patients with end-stage osteoarthritis1 and are thus very common surgical procedures, with 70 000 of each performed annually in the UK alone.2 ,3

The literature is ambiguous as to the relationship between the outcomes of THA and TKA, though patients and clinicians commonly perceive these to be equivalent. This similarity is reinforced in the general medical literature: Gidwani and Fairbank,4 writing specifically about knee arthroplasty, noted comparable outcome with hip arthroplasty as a BMJ article summary point. Comprehensive comparative analysis of patient outcome is lacking. A few previous studies with small numbers of patients and using differing generic measures of health outcome have compared the two procedures and reported conflicting results.5–10 Recently, Wylde et al 10 suggested superior outcome following THA in a ‘mid-term review’ but this conclusion was based on direct comparison of isolated post-operative joint specific Oxford Hip and Knee Scores. The originators of the Oxford Scores considered this analysis inappropriate and that the results of this paper may be misleading.11 Preoperative data were not available in that review, thus comparative change in scores could not be calculated.

Understanding the relative outcomes of joint arthroplasty is important, as individual patients often undergo both hip and knee procedures. Expectations of outcome are central to the patient's decision making process as to the merits of a surgical intervention. Access to accurate information as to the comparative outcome of hip and knee replacement is essential to avoid potentially unrealistic expectations of the outcomes of one procedure based on the outcome of the other.

Currently there are no good tools with which to predict postoperative outcome following arthroplasty, though recently some health authorities have tried to prioritise access to surgery by applying a cut-off value to the Oxford Scores patients present with. While surgical resources must be managed, this practice is controversial as there is no evidence that individual patient outcomes can be predicted by the patient's preoperative Oxford Score.

The primary aim of this study was, prospectively, to assess and compare the outcome of TKA and THA in the first year following surgery, using relevant standardised instruments. A secondary aim was to create models with which to estimate patient outcome at one year from preoperative data. This information will help inform patients and healthcare providers, who are involved with both referrals into orthopaedic services and with subsequent postoperative management, as to the expected outcomes of the surgery.


We prospectively followed all elective primary THAs and TKAs performed at the Royal Infirmary of Edinburgh between January 2006 and November 2008 inclusive. This consisted of 1410 THA and 1244 TKA procedures.

The Oxford Hip or Oxford Knee Score, the Medical Outcomes Study Short Form 12 (SF-12) and a separate validated satisfaction question12 were used. Patients were asked to complete these self-administered questionnaires at the time of preoperative assessment and then by postal survey at 6 and 12 months post surgery.

When assessing the outcomes it has been suggested that a combination of a joint specific and general health assessment tool offers the best combined analysis.13 ,14 The Oxford Hip and Knee Scores and the SF-12 questionnaires are highly validated and reliable tools that are accepted by patients and surgeons to gauge pain and functional outcome.15–17 The original scoring system for the Oxford Hip and Knee Score was employed. This results in a single outcome score between 12 and 60 (a high score indicates increased levels of disability because of pain and poor function, with a reduction in the Oxford Score indicating improvement). Evaluation of comparative change in the respective hip and knee scores was performed to compare between procedures. The SF-12 results in two scores, the physical and mental component summary (PCS and MCS) scores. Its scoring is based on norm-based methods using population mean scores. Both PCS and MCS have a population mean score of 50, with SD of 10. Higher scores denote better function on the SF-12 component scores.18 The satisfaction question consists of a four point Likert scale, with answers ranging from dissatisfied to very satisfied.
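The scoring conventions described above can be sketched in a few lines. This is an illustrative sketch only: it assumes the original Oxford scoring of 12 items each scored 1 to 5, and the function names are hypothetical. The `norm_based_score` helper shows the generic idea of rescaling to a population mean of 50 and SD of 10; it is not the published SF-12 scoring algorithm, which uses item weights not given here.

```python
def oxford_total(item_scores):
    """Total Oxford Score under the original scoring: 12 (best) to 60 (worst)."""
    if len(item_scores) != 12:
        raise ValueError("expected 12 item scores")
    if any(s not in (1, 2, 3, 4, 5) for s in item_scores):
        raise ValueError("each item must be scored 1 to 5")
    return sum(item_scores)


def improvement(preop_total, postop_total):
    """Change score; positive means improvement because lower totals are better."""
    return preop_total - postop_total


def norm_based_score(raw, population_mean, population_sd):
    """Generic norm-based rescaling to mean 50, SD 10 (assumption: simple linear
    standardisation, not the actual SF-12 item-weighted algorithm)."""
    return 50.0 + 10.0 * (raw - population_mean) / population_sd
```

Because the hip and knee questionnaires differ in some items, it is the change score from `improvement`, rather than the raw totals, that is compared between procedures.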

Data analysis

Mean values are reported for the patient outcome questionnaires with SDs as a measure of dispersion. Satisfaction scores are reported as percentages.

Change in outcome score

Initial analysis addressed the unconditional research question comparing the mean change-score in the THA population with the mean change-score in the TKA population.19 The analysis involved use of repeated measures ANOVA via the GLM facility in Minitab20 with factors operation (levels THA and TKA), patient (nested within operation) and occasion (with levels preoperation, 6 month and one year). An operation-occasion interaction term was included in the model and 95% Bonferroni simultaneous CIs obtained for all possible operation-occasion pairings.
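As a rough illustration of what the Bonferroni adjustment does to interval width, the sketch below computes the critical value for simultaneous 95% intervals. It rests on two assumptions not stated in the text: that "all possible pairings" means every pair among the 2 × 3 = 6 operation-occasion cell means, and that a normal approximation is adequate for samples of this size (statistical software would typically use a t distribution).

```python
from math import comb
from statistics import NormalDist


def bonferroni_critical_z(n_comparisons, family_confidence=0.95):
    """Two-sided critical value when the family error rate is split equally
    across n_comparisons intervals (normal approximation, an assumption)."""
    alpha = 1.0 - family_confidence
    per_comparison_tail = alpha / (2 * n_comparisons)
    return NormalDist().inv_cdf(1.0 - per_comparison_tail)


n_cells = 2 * 3                 # operations x occasions
n_pairs = comb(n_cells, 2)      # pairwise comparisons among the cell means
z_crit = bonferroni_critical_z(n_pairs)
```

Each simultaneous interval is then the estimated difference plus or minus `z_crit` standard errors, which is wider than an unadjusted interval using roughly 1.96 standard errors.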

Prediction of outcome

Correlation was assessed between the 1 year Oxford Scores and the patient's preoperative Oxford Scores, general physical health (PCS), mental health (MCS), age and gender. Regression methods were then used to explore the relationship between individual patient Oxford Score one year post-surgery and MCS, PCS and Oxford Scores prior to surgery together with procedure, age and gender. A valid regression model would enable prediction of outcome for an individual patient in terms of their preoperation profile. The prediction could be either in terms of a predicted mean score for patients with that profile, with an associated CI, or of a prediction interval, with confidence level such as 95%, for the score for an individual patient with that profile. A forward variable selection procedure incorporated the potential predictor variables (of one year Oxford Score) in the order of preoperative Oxford Score, procedure, MCS score, PCS score and age. The adjusted R-squared values as each variable was added were 15.1%, 20.1%, 23.5%, 24.2% and 24.3%. The discrete and bounded nature of the Oxford Score means that use of regression models, with individual patient Oxford Score one year post-surgery as the response, is problematic. However, the investigation did suggest that Oxford Score prior to surgery and procedure had greatest leverage in predicting outcome in terms of the Oxford Score, that PCS prior to surgery and age were relatively unimportant and that gender had no bearing. These considerations led to further use of regression to explore the relationship between mean patient Oxford Score one year post-surgery and the preoperative Oxford Score and procedure. Incorporation of both a squared term in Oxford Score prior to surgery and an interaction term, coupled with use of patient numbers as weights, yielded a model with an adjusted R-squared value of 91% and satisfactory residual plots. All data analysis and display was carried out with the Minitab Release software V.16.20
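The structure of the final model (a squared term in preoperative score, a procedure term, an interaction and patient numbers as weights) can be sketched as a weighted least squares fit. Everything specific below is an assumption made for illustration: the interaction is taken to be preoperative score by procedure, the preoperative score is centred at 36 purely for numerical stability, and the data and coefficients are synthetic, not the study's. The analysis reported here was run in Minitab, not with this code.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def design_row(preop, is_knee):
    """Intercept, centred preop score, its square, procedure, interaction."""
    centred = preop - 36.0  # centring is an assumption, for stability only
    return [1.0, centred, centred ** 2, float(is_knee), centred * is_knee]


def weighted_least_squares(rows, y, weights):
    """Minimise sum of w * (y - row . beta)^2 via the normal equations."""
    p = len(rows[0])
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(w * r[i] * yi for r, yi, w in zip(rows, y, weights))
            for i in range(p)]
    return solve_linear_system(xtwx, xtwy)


# Synthetic group means generated from made-up coefficients (NOT study data).
TRUE_BETA = [22.0, 0.30, 0.010, 3.0, 0.05]
preop_scores = [20, 25, 30, 35, 40, 45, 50, 55]
rows, y, weights = [], [], []
for is_knee in (0, 1):
    for preop in preop_scores:
        row = design_row(preop, is_knee)
        rows.append(row)
        y.append(sum(b * v for b, v in zip(TRUE_BETA, row)))
        weights.append(10 + preop)  # made-up numbers of patients per group

beta = weighted_least_squares(rows, y, weights)
predicted_hip_40 = sum(b * v for b, v in zip(beta, design_row(40, 0)))
```

Fitting to group means rather than individual scores is what allows a high R-squared here: averaging removes most of the between-patient scatter, so the fitted curve describes the typical outcome at a given preoperative score, not any one patient's outcome.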


1410 THA and 1244 TKA datasets were available for analysis preoperatively, 1389 THA and 1223 TKA at 6 months and 1381 THA and 1227 TKA at 12 month follow-up. This represents a loss to follow-up of 2.0% THA and 1.4% TKA patients in the year post surgery. Satisfaction data were recorded at the 12 month follow-up, at which point data from 1348 THA and 1185 TKA patients were available.

Differences in age profiles of the groups were assessed with independent sample t-tests. THA patients were 2 years (95% CI (−2.86 to −1.33), p<0.001) younger at 68.1 years at time of operation compared with TKA patients at 70.2 years. The male:female ratio in both groups was similar, and no significant difference was found between the mean ages of male and female patients within hip and knee groups (table 1). Means and SDs of the patient reported outcome measures are displayed in table 2.

Table 1

Age at the time of surgery, mean (±SD)

Table 2

Outcome scores for hip and knee groups, mean (SD)

Repeated measures/change score analysis

The change in Oxford Score and SF-12 Scores between assessment time points is detailed in table 3. This analysis provides evidence, via the presented CIs, that mean Oxford score continued to improve significantly between pre-op, 6 months and 12 months for both procedures. Mean PCS improved significantly over the year for THA but only over 6 months for TKA. MCS score improved significantly from preoperative score in the THA group but not the TKA group.

Table 3

Within operation comparisons

Between groups comparison of outcome is highlighted in table 4. No difference was apparent preoperatively in physical outcome scores (Oxford or SF-12 PCS). Better scores (by around five points on the Oxford Score and three points on the PCS) were evident in the hip arthroplasty group postoperatively. Preoperatively the differences in mean MCS were statistically significant, with TKA patients having the higher mean. This situation was reversed at 6 months, although there was no significant difference between groups after a year.

Table 4

Between operation comparisons (estimate of μKnee–μHip)

Predicting outcome

The discrete and bounded nature of the Oxford Score means that use of regression models to predict individual patient scores is problematic. However, mean patient Oxford Scores can be usefully predicted. Figure 1 shows the observed mean Oxford Scores one year post-surgery plotted against Oxford Score prior to surgery for both hip and knee patients together with curves displaying the fitted model. For example, the data set includes 59 hip patients with Oxford Score 40 prior to surgery. Their mean Oxford Score one year post-surgery was 21.7 and the model predicts a value of 20.1. Similarly for the 61 corresponding knee patients the mean was 23.1 with model prediction 24.5. The respective 95% CIs for these means are (19.5 to 20.6) and (24.0 to 25.0). Figure 2 shows observed individual Oxford Scores one year post-surgery plotted against Oxford Score prior to surgery for both hip and knee patients together with the curves displaying the fitted model referred to above.

Figure 1

Model for mean one year Oxford Scores with observed mean scores.

Figure 2

Model for mean one year Oxford Scores with observed individual scores. An example is highlighted by the triangular symbol which corresponds to an Oxford Score of 40 points prior to surgery and of 15 one year post surgery. The data set includes five hip patients and two knee patients for whom this was the case.

Satisfaction with outcome

Very high levels of patient satisfaction were recorded for both the procedures. However, the proportion of patients recording overall satisfaction with the outcome of the arthroplasty at one year was significantly greater for the THA group (91.1%) than the TKA group (81.4%) (p value <0.001). Thus 18.6% of TKA patients, around twice the proportion of THA patients (8.9%), did not consider themselves satisfied with the result of the procedure one year following surgery. Satisfaction was strongly related to both Oxford Hip and Knee Scores (figure 3), where a broadly linear relationship was evident between Oxford Score and satisfaction stratum, better scores relating to enhanced level of satisfaction at both 6 and 12 months.
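The text does not state which test produced the p value for the difference in satisfaction. As a hedged illustration, a pooled two-proportion z-test applied to the reported percentages and the numbers of patients with satisfaction data (1348 THA, 1185 TKA) gives a result consistent with p<0.001. The choice of test and the function name are assumptions.

```python
from math import erfc, sqrt


def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Reported satisfaction: 91.1% of 1348 THA patients, 81.4% of 1185 TKA patients.
z, p_value = two_proportion_z_test(0.911, 1348, 0.814, 1185)
```

With samples of this size the standard error of the difference is a little over one percentage point, so a gap of nearly ten percentage points lies far outside what chance alone would be expected to produce.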

Figure 3

Oxford Score by satisfaction stratum.


The primary aim of this study was to assess the comparative change in patient outcome following THA and TKA. Both procedures conferred substantial improvement in the patient's report of pain and ability to perform physical tasks, corroborated by the very high levels of patient satisfaction recorded for each; however, THA was found to confer greater benefits than TKA. Regression modelling was not able to predict individual outcome; however, preoperative Oxford Score was reflective of average postoperative Oxford Score.

The difference in outcome was physical in nature. At 12 months postoperatively, mental health scores (MCS) were equivalent between the THA and TKA groups, whereas physical scores (Oxford Score and PCS) were significantly better in the THA group. The detected significant differences in MCS between our operative groups preoperatively and at 6 months postoperatively were small in magnitude, within one SD of the population mean, and unlikely to be clinically relevant. Physical outcome scores for both procedures follow the same trend of substantial improvement in the first 6 months following surgery, followed by subtle further improvement in the second 6 months (table 3). It is in the initial six-month period that the majority of the difference in comparative improvement between the hip and knee procedures occurs. Both groups demonstrated similar improvement in average scores in the second six months. However, the further improvement between 6 and 12 months does suggest that patients functioning poorly at six months should still be considered for referral to targeted therapy at this point.

The (general health) SF-12 physical scores recorded for each group demonstrated dramatic improvement following surgery, and reached within one SD of the age adjusted population normative value of around 46 points.18 Dieppe21 notes that osteoarthritis should not be defined as a discrete disease entity but as joint failure, akin to cardiac or renal failure. As such the outcomes of total joint arthroplasty can be usefully compared with surgical treatments for these organs, and direct comparison drawn via the generic SF-12 scores. Interestingly, the PCS scores for THA and TKA reported here correspond to those reported following renal transplant22–24 (40–42 points) and coronary revascularisation25 (44–46 points). MCS scores reported are equivalent to population norms in all cases.

Strengths of this study are the large volume of data presented and the breadth of outcome measures employed. Previous authors have attempted to assess comparative post-operative function by way of generic health measures, though have generally reported on small cohorts. Ritter et al 5 assessed 85 THA and 93 TKA patients in terms of quality of life by the SF-36 questionnaire and reported no difference in results. Benroth et al 8 reported on 63 hips and 110 knees and were also unable to detect a difference using the SF-36. Norman-Taylor6 reported a small cohort of 41 THA and 31 TKA and suggested similar outcomes were achieved; however, in order to compare the different outcome scores, they carefully converted the Harris Hip scores and a modified British Orthopaedic Association Knee functional assessment chart into Rossiter distress and disability scores and then compared them on the Rossiter index matrix. From this they derived quality of life scores but found no significant difference between the operative groups. Conversely Bachmeier et al 7 reported significantly enhanced WOMAC and SF-36 scores in 86 THA patients compared with 108 TKA patients. In a review Ethgen et al 1 commented on the conflict in the literature concerning the results of THA and TKA. They considered that for quality of life outcome measures, patients did better following THA; however, meta-analysis was not performed. Bourne et al 9 assessed a large Canadian cohort using the generic lower limb WOMAC score and willingness to undergo the procedure again as outcome measures. They suggested superior outcome with THA, but their study suffered from a 30% loss to follow-up. Wylde et al 10 also reported superior outcome of THA compared with TKA using the Oxford Scores. However, their analysis was limited by not presenting any preoperative data. Further, Dawson et al 11 have criticised the methodology of Wylde's paper, highlighting the different population characteristics of hip and knee patients and that 3 of 12 questions are different in the respective Oxford Hip and Knee Score questionnaires. Dawson et al 11 commented that direct comparison of the respective mean Oxford Scores is not valid and thus that the results suggested by Wylde et al 10 are potentially misleading.

We concur that directly comparing the means of the Oxford Hip and Oxford Knee Scores is controversial but consider comparing the relative change in the scores to be of value in the context presented here, particularly as our operative groups had very similar distributions of baseline Oxford Score, age, gender and general health (SF-12). The difference of change in Oxford Score of approximately five points between THA and TKA is relatively large and this magnitude of change on either hip or knee score would be considered clinically significant. Of particular interest was that the overall patient satisfaction with the procedure also differed between hip and knee arthroplasty, with more patients in the THA group being satisfied (91.1%) compared with the TKA group (81.4%). Satisfaction with the outcome was found to be very strongly linked with patient report of improved pain and function via the Oxford Scores (figure 3). This is in contrast to the reported relationship between Oxford Shoulder Score and level of satisfaction following shoulder hemiarthroplasty, where more ambiguity was apparent between these measures.26

Our model identifies that the operation (THA or TKA) and baseline Oxford Score are strong predictors of mean 1 year Oxford Score. However, the discrete and bounded nature of the Oxford Scores makes it difficult to derive satisfactory regression equations for individual patient scores. The practice of prioritising access to surgery based on preoperative scores is thus drawn into question. While mean preoperative scores may predict mean postoperative scores, the individual patient outcome cannot be elucidated.

Study limitations

This study does not explore factors such as comorbidities and body mass index in the regression analysis which may be of interest for future analysis. Current work is seeking to create more effective models for the prediction of postoperative outcome, while statistical colleagues are addressing issues associated with discrete and bounded responses such as the Oxford Scores.27 Our model can be seen as a useful starting point with which to inform both patients and clinicians as to the likely average outcome of each procedure.

The association between the Oxford Scores and satisfaction is also far from clear. We demonstrate an association between better postoperative Oxford Score and patient satisfaction, which is intuitive as satisfaction with outcome would be expected to reflect improvement in pain and function. Interestingly however, recent work has demonstrated that patient satisfaction cannot be predicted from preoperative Oxford Score.28 Depression, pain in other major joints and poor preoperative mental health scores have previously been suggested as independent predictors of dissatisfaction following arthroplasty,29 ,30 though definitive analysis is lacking. A priority for future research must be the preoperative identification of those patients who are at risk of dissatisfaction with the outcome of the procedure.

In conclusion, results from this large prospective cohort demonstrate a very high level of satisfaction and considerable improvement in patient reported outcome measures after both THA and TKA. However, patients are more likely to report a greater improvement following THA compared with TKA. Most improvement in patient scores occurs in the initial six-month period postoperation; however, further improvements in physical function can be seen between 6 and 12 months post surgery. Mean one year Oxford Hip/Knee Scores can be well predicted by mean preoperative Oxford Scores. Predictors of individual outcome may be difficult to derive.

Main messages

  • The results of total hip and knee arthroplasty are not equivalent; greater improvement in outcome is reported following hip replacement.

  • This difference is physical in nature and is evident within 6 months postoperation.

  • Outcomes for individuals cannot be predicted, but mean preoperative Oxford Scores are a useful indicator of mean 1 year scores, and can be used to monitor group outcomes.

Current research questions

  • The prediction of individual patient outcome and also of those patients who are dissatisfied with the procedure should be investigated.

  • The link between patient reported outcome measures and satisfaction requires further study.


The authors extend thanks to Cat Graham and Sandra Bonellie for additional statistical advice.




  • Funding Stryker UK provide funding to the Department of Orthopaedics and Trauma, University of Edinburgh to support the patient database. At the time of this work DH received a PhD studentship from the Medical Research Council doctoral training scheme and Stryker UK. These bodies had no role in the design; in the collection, analysis and interpretation of the data; or in the writing of the article and the decision to submit for publication.

  • Competing interests None.

  • Ethics approval Scotland A Research Ethics Committee.

  • Provenance and peer review Not commissioned; externally peer reviewed.
