The Peterborough Prison pilot: through a glass darkly

This article was originally posted on Monday June 13. Social Finance complained that the financial details were incorrect and, in line with its policy, Straight Statistics removed it from the website pending clarification. We are grateful for the details SF has now provided, for its responses to criticisms of its model, and to the Ministry of Justice for answers to questions. These do not, however, eliminate the concerns raised so we are now reposting the article, incorporating these changes.

Information on the pilot study designed to cut reconvictions at Peterborough Prison has dribbled out in an unsatisfactory way. But with the publication of a report by RAND Europe at the end of May we now know a little more than we did – though by no means enough – about how this flagship project will be run.

The report, called Lessons Learned from the planning and early implementation of the Social Impact Bond at HMP Peterborough, is based on interviews with 22 insiders who have been involved in setting up the scheme. Social Finance raised £5 million from investors to fund the bond: they will get a return on their investment if reconviction rates at Peterborough are lower by a defined amount than those of a control group of prisoners released from other prisons.

The scheme aims to reduce reconvictions in the 18 months after release, for offences committed in the first year post-release, among 3,000 consecutive short-sentence (up to 1 year) prisoners liberated from HMP Peterborough over 5+ years from September 2010. Crucially, short-sentence HMP Peterborough prisoners liberated from court are ineligible.

The design of three cohorts, each of 1,000 ex-prisoners, allows for interim analyses and early pay-out to SF investors if at least a 10 per cent reduction in reconvictions is observed per cohort. Commendably, cohort size was determined for statistical assurance on the precision of the estimated reduction. The metric is the number of reconvictions in the 18 months post-release for offences committed in the first year after release, per cohort of 1,000 eligible ex-prisoners. MoJ reimburses SF's investors if One Service (as the service is called) succeeds in reducing reconvictions by 10 per cent (or more) per cohort of 1,000 eligible ex-prisoners compared to controls, or by 7.5 per cent (or more) across all 3,000 eligible ex-prisoners. In the event that One Service delivers more than anticipated, a cap has been set to limit the total pay-out to SF at £8 million (£5.25 million from Big Lottery Fund and £2.75 million from MoJ).
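The payment rule just described can be sketched in a few lines of Python. This is an illustration only: the contract itself is partly redacted, so the function name `payout_triggers`, the input format (reconviction counts for each pilot cohort and for a control group scaled to the same number of releases) and the example counts are all assumptions, not figures from the pilot.

```python
from fractions import Fraction

# Thresholds as stated in the text; exact fractions avoid float rounding at the boundary.
COHORT_THRESHOLD = Fraction(10, 100)    # 10 per cent per cohort of 1,000
OVERALL_THRESHOLD = Fraction(75, 1000)  # 7.5 per cent across all 3,000


def reduction(pilot, control):
    """Relative reduction in reconvictions versus controls (integer counts)."""
    return 1 - Fraction(pilot, control)


def payout_triggers(cohorts):
    """cohorts: list of (pilot_reconvictions, control_reconvictions), one pair
    per cohort, with control counts scaled to the same number of releases
    (an assumed input format)."""
    per_cohort = [reduction(p, c) >= COHORT_THRESHOLD for p, c in cohorts]
    total_pilot = sum(p for p, _ in cohorts)
    total_control = sum(c for _, c in cohorts)
    overall = reduction(total_pilot, total_control) >= OVERALL_THRESHOLD
    return {"per_cohort": per_cohort, "overall": overall}


# Hypothetical counts: reductions of 5, 8 and 10 per cent in the three cohorts.
example = payout_triggers([(950, 1000), (920, 1000), (900, 1000)])
print(example)
```

In this made-up example only the third cohort reaches the 10 per cent threshold, yet the pooled reduction (about 7.7 per cent) clears the 7.5 per cent threshold for all 3,000 releases.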

The cap was not explicit in RAND Europe’s report, and so I’m grateful to Social Finance for making it so. The amount that MoJ needed to budget for (capped at £2.75 million) would have been a key statistical issue because it derives from the prior plausibility (or probability) that MoJ attaches to each percentage point of reduction in the number of reconvictions. Bayesians beware!

Working out how to select controls: This work was crucial but SF has achieved an open goal for itself! Randomization of eligible offenders within HMP Peterborough was rejected, inter alia, as an interference with SF’s business model . . . which, of course, relies on maximizing the number of ex-prisoners released into the local area in which SF’s partners operate One Service. Randomization would have reduced their case-cohort by half (50:50 randomization) or a third (67:33 randomization) but bolstered the evidence-base.

SF denies that its business model influenced the decision. It says that there were operational and practical reasons for not running an RCT. Many of the charities involved are unwilling to turn away a client who requested services because he was randomly selected not to receive them, it says, adding that even if the organization involved agreed to the protocol it is unclear that key workers on the ground will reliably stick to it. SF was also concerned about unrest in the prison, arguing that it would be almost impossible for the prison staff to explain to prisoners that they can’t access services because they are part of a control group.

The alternative was to randomize by prisons, which was rejected as SF would have had to find partners in multiple areas, not just in one purposely-selected location – where most of HMP Peterborough’s releases live and where SF could enlist partners who claim a track-record, see RAND Europe’s report. Of course, the track-records in question, as RAND Europe points out, have not been subject to formal proof, that is, by experimentation.

SF says that Peterborough was chosen for practical reasons, not for its own advantage. The National Offender Management Service wanted to use that prison and it was in a region interested in testing the model. It is indeed, says SF, a local prison but there are plenty of other local distributing prisons. So it disputes the suggestion that its choice creates a risk of sample bias.

Lacking randomization, controls will be selected by an Independent Assessor, who was commissioned only after the HMP Peterborough Pilot had kicked off, in a manner that has not yet been set out explicitly . . . except that MoJ and SF have agreed that only characteristics that significantly impact on reconvictions will be used in the propensity score. Of course, statisticians know that statistical significance depends not only on prognostic strength but also on the size of the training dataset on which the propensity score is devised. The national database of short-sentence releases is not suitable because it contains releases from prisons that do not sufficiently resemble HMP Peterborough in size, function, private/public status, and localised release-area. A pilot study that could lead to £8 million of expenditure on the basis of non-random controls should be absolutely transparent, from the outset, about which prisons its controls will be selected from, and how.
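The point about significance and dataset size can be illustrated with a simple two-proportion z-test. The sketch below is not the Independent Assessor's method, which has not been disclosed; the reconviction rates (60 versus 55 per cent) and the group sizes are hypothetical numbers chosen only to show that the same prognostic difference can be "non-significant" in a small training dataset and "significant" in a large one.

```python
import math


def two_sided_p(rate_a, rate_b, n_per_group):
    """Two-sided p-value for a difference between two proportions,
    using the normal approximation with equal group sizes."""
    pooled = (rate_a + rate_b) / 2
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = (rate_a - rate_b) / se
    # 2 * (1 - Phi(|z|)) written via the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical characteristic: 60% reconvicted with it, 55% without it.
p_small = two_sided_p(0.60, 0.55, n_per_group=200)
p_large = two_sided_p(0.60, 0.55, n_per_group=2000)
print(f"n = 200 per group:  p = {p_small:.3f}")
print(f"n = 2000 per group: p = {p_large:.4f}")
```

A rule that admits only "significant" characteristics into the propensity score therefore depends on which database, and how much of it, is used to devise the score.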

Ten controls will be selected per eligible HMP Peterborough release. I imagine that candidates will first be matched, as a prerequisite, on some characteristics, eg same sex; eligible comparator prison; similar age (how closely matched?); short sentence of up to 1 year (how closely matched?); and a release date similar (how closely matched?) to that of the One Service case. They are then to be further selected by propensity-score matching (which takes past criminal record into account, but how?). There are, however, some key questions about the comparator prisons [which are they? See i) below] and more subtle questions about eligible controls. Liberations from court have to be ineligible, as they are ineligible at HMP Peterborough, but will the Independent Assessor know whether controls were liberated from court? MoJ has given email assurance that controls who were liberated from court shall be ineligible. Once a released prisoner has entered the One Service cohort, he remains in it for at least 18 months and is not re-recruited into the 1st One Service cohort if re-released from HMP Peterborough. The same condition has to apply for controls from prison X, see ii) below.

Questions to be answered now, because they were not answered by RAND Europe, are:

i) which similarly-sized private or state prisons (Xj, for j = 1, 2,  . . . ) whose releases are mainly to a local area (as for HMP Peterborough) will control releases be selected from?

ii) are controls from prison Xj eligible for selection if and only if they have not already been released from prison Xj since 10 September 2010 after a previous short-sentence?

Answers will not be disclosed until September 2011 – a year after the HMP Peterborough Pilot started.
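To make the open questions concrete, here is a minimal sketch of how ten controls per case might be drawn. It is not the Independent Assessor's procedure. It assumes nearest-neighbour matching on an already-computed propensity score with a caliper of 0.05, and the field names (`score`, `liberated_from_court`, `released_before`) are hypothetical.

```python
def eligible(candidate):
    """Apply the two exclusions raised in the text (field names are hypothetical)."""
    return not candidate["liberated_from_court"] and not candidate["released_before"]


def match_controls(case_score, candidates, k=10, caliper=0.05):
    """Return ids of up to k eligible controls whose propensity score lies
    within `caliper` of the case's score, nearest first."""
    scored = [
        (abs(c["score"] - case_score), c["id"])
        for c in candidates
        if eligible(c) and abs(c["score"] - case_score) <= caliper
    ]
    scored.sort()
    return [cid for _, cid in scored[:k]]


# Hypothetical candidate pool from a comparator prison.
pool = [
    {"id": "c1", "score": 0.31, "liberated_from_court": False, "released_before": False},
    {"id": "c2", "score": 0.28, "liberated_from_court": False, "released_before": False},
    {"id": "c3", "score": 0.33, "liberated_from_court": True, "released_before": False},
    {"id": "c4", "score": 0.40, "liberated_from_court": False, "released_before": False},
    {"id": "c5", "score": 0.32, "liberated_from_court": False, "released_before": True},
]
matched = match_controls(0.30, pool)
print(matched)
```

Every choice in this sketch (the caliper width, the eligibility flags, whether a control may serve more than one case) alters who ends up in the comparison group, which is why the rules need to be published in advance.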

Lack of transparency (financial as well as study-design): The authors from RAND Europe were allowed to read, but not disclose, some redacted parts of the MOJ/SF contract that are concealed from public view.

Lack of transparency about study-design (see above) and about how reimbursement of SF was calibrated is unacceptable – especially as some hidden re-calibration of the reconviction costs is ongoing (see RAND Europe report), which could influence how payback for reducing the number of reconvictions is worked out. The public’s liability has been capped at £8 million (£5.25 million from Big Lottery and £2.75 million from MoJ) but our liability is not clear if the minimum pay-out threshold of just a 7.5 per cent reduction for 3,000 HMP Peterborough releases is achieved when assessed against uncertainly-selected controls. However, if the reduction achieved is below the threshold, the public’s liability is zero.

RAND Europe does indicate that SF’s investors expect an internal rate of return of 7.5 per cent to 13 per cent per annum on their 5 to 6-year investment of £5 million. How much, on average, SF expects to pay its partners for delivering One Service for up to one year per eligible release from HMP Peterborough is unstated, but prior estimates must exist for SF to have established its financial model.
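As a rough cross-check on those rates of return, the sketch below annualises the return implied by the £8 million cap on a £5 million investment. It rests on two simplifying assumptions that are mine, not the contract's: the whole £5 million is invested at the start, and the capped £8 million is paid as a single sum at the end. Tranched investment and staged cohort payouts would change the figures.

```python
def annualised_return(invested, paid_back, years):
    """Compound annual return for one outlay and one repayment `years` later."""
    return (paid_back / invested) ** (1 / years) - 1


# £5 million in, capped £8 million out, over 5 or 6 years (assumed timing).
for years in (5, 6):
    rate = annualised_return(5.0, 8.0, years)
    print(f"{years} years: {100 * rate:.1f} per cent per annum")
```

Under these assumptions the capped payout corresponds to roughly 9.9 per cent a year over five years and 8.1 per cent over six, inside the 7.5 to 13 per cent range that RAND Europe reports.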

Summing up: the HMP Peterborough Pilot can outlay up to £5 million on One Service for 3,000 releases, or about £1,650 extra per release (to nearest £50); more is unlikely, although interest on tranches of the initially invested £5 million will accrue. Control prisons do not receive any extra resource per release and there is no rival provider to SF. Nor are controls randomly selected, either within HMP Peterborough or by random (versus purposeful) choice of SF-prison(s). The comparator is thus an Aunt Sally: there is no matched extra investment on control ex-prisoners and no rival provider, and propensity-score matching can’t even things up when the playing-field is already tilted by non-random selection of SF’s site of operations.

RAND Europe has recognised that the metric and control for the HMP Peterborough Pilot can’t generalise unless multiple providers buy in to the notion that their effectiveness at reducing reconvictions - when working either with prisons they select or, worse for them, with prisons selected for them by randomization - will be compared directly to all liberated prisoners, and that MoJ will alter the randomization ratio every three years to weed out under-performing providers. That’s how to generalise efficiently . . .

Only exceptionally should the UK be pay-master for evaluating a lone private-provider (SF) against a no-intervention control group that is still transparently ill-defined – even after RAND Europe’s first report. More will be revealed in September 2011 – which the control prisons are (even they don’t know this at present!) and how the propensity-score has been defined.

And so why has MoJ received plaudits for its Payment by Reconvictions pilot at HMP Peterborough? The cost-effectiveness calculations are hidden, and being revised. However, let’s take a crude approach. Suppose that, without intervention, each release averages C reconvictions for offences committed in the first year, so that 3,000 releases accumulate 3,000C reconvictions. The minimum reduction that triggers a payout for SF is 7.5 per cent of this, namely 225C reconvictions. And so, if MoJ’s minimum effectiveness-payout were £M million, then the cost that the UK is willing to outlay to avert C reconvictions is at most £M,000,000/225 = M x £4,440 (to nearest £10). If M were chosen to match the Big Lottery’s contribution of £5.25 million, then the implied payout for preventing C reconvictions would be £23,300 (to nearest £100). MoJ has capped the public’s willingness to pay out at £8 million so that, even if the achieved reduction were 15 per cent and therefore 450C reconvictions were prevented, the UK public would not expend more than 8 x £2,220 (to nearest £10) = £17,760 for C reconvictions. Let’s hope that SF and MoJ soon disclose the real M.

Of course, we don’t know the average number of first-year reconvictions, C, nor their severity distribution but we can clearly see that, if £M million were to equal the Big Lottery’s contribution of £5.25 million, then MoJ’s future willingness to pay for their avoidance (between £17,760 and £23,300) would bracket £20,000.
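The crude calculation above can be reproduced without intermediate rounding. In this sketch the function name `payout_per_C` is an assumption for illustration; the inputs (3,000 releases, a payout in £ million, and a fractional reduction) come from the text, and M = 5.25 remains the article's supposition, not a disclosed figure.

```python
RELEASES = 3000


def payout_per_C(payout_millions, reduction):
    """Pounds paid per C reconvictions averted, where C is the (unknown)
    average number of first-year reconvictions per release."""
    averted_in_units_of_C = reduction * RELEASES  # e.g. 0.075 * 3000 = 225
    return payout_millions * 1_000_000 / averted_in_units_of_C


at_minimum = payout_per_C(5.25, 0.075)  # supposed M = 5.25 at the 7.5% threshold
at_cap = payout_per_C(8.0, 0.15)        # £8 million cap at a 15% reduction
print(round(at_minimum), round(at_cap))
```

Unrounded, the two figures are about £23,333 and £17,778; the text's £17,760 arises from rounding £2,222 to £2,220 before multiplying by 8. Either way the range brackets £20,000.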

For proven-efficacious medicines licensed on the basis of properly designed RCTs, the National Health Service can only afford to pay around £20,000 per quality-adjusted life-year gained (the threshold used by the National Institute for Health and Clinical Excellence, NICE).

MoJ’s difficulty is that its efficacy trial is non-randomized (unlike the pharmaceutical example), but the stakes are just as high because the costs of a year’s convictions (police, courts, sentence and impact on victims) are high, as MoJ’s willingness to pay clearly signals.

In future, MoJ needs to devise experiments that play-off providers in a fair manner (based on randomization) and judge by results in an efficient manner (by sequentially playing the winners and deselecting the providers whom experimentation indicates are unlikely to be the winner).
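One way such a design could work is sketched below. It is only an illustration, not a proposal from MoJ or RAND Europe. It swaps in a specific technique, a Bayesian "probability of being best" rule with uniform Beta(1, 1) priors, as one possible reading of "playing the winners"; the provider names, the counts and the 5 per cent deselection cut-off are all hypothetical.

```python
import random


def allocation_ratio(results, draws=10_000, drop_below=0.05, seed=0):
    """results: {provider: (releases_not_reconvicted, releases_total)}.
    Returns the next period's randomization ratio: each provider's share is
    proportional to its estimated probability of being the best, and
    providers unlikely to be the winner are deselected."""
    rng = random.Random(seed)
    wins = {name: 0 for name in results}
    for _ in range(draws):
        # Draw one plausible success rate per provider from its Beta posterior.
        sampled = {
            name: rng.betavariate(1 + ok, 1 + total - ok)
            for name, (ok, total) in results.items()
        }
        wins[max(sampled, key=sampled.get)] += 1
    prob_best = {name: count / draws for name, count in wins.items()}
    leader = max(prob_best, key=prob_best.get)
    # Always keep the current leader; drop providers below the cut-off.
    kept = {n: p for n, p in prob_best.items() if p >= drop_below or n == leader}
    total_kept = sum(kept.values())
    return {name: p / total_kept for name, p in kept.items()}


# Hypothetical results after one period of equal randomization.
alloc = allocation_ratio({"A": (450, 1000), "B": (430, 1000), "C": (300, 1000)})
print(alloc)
```

With these made-up counts provider C is deselected and A receives a larger share of the next period's releases than B. A real design would also need a concurrent randomized control arm and pre-specified stopping rules.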