https://identityassurance.blog.gov.uk/2016/01/25/estimating-what-proportion-of-the-public-will-be-able-to-use-gov-uk-verify/

Estimating what proportion of the public will be able to use GOV.UK Verify

GOV.UK Verify is on track to go live in April and we’ve set some ambitious objectives for what we want to achieve between now and then. One thing we’re aiming to do is increase our demographic coverage to around 90% by April, continuing to expand it beyond 90% after GOV.UK Verify goes from beta to live. This will mean that the vast majority of people who are expected to use GOV.UK Verify will be able to do so.

We’ve done some analysis and built a web tool to help us understand how many people can already be verified online using GOV.UK Verify and how this will change over time, as our certified companies introduce new methods and data sources.

We’ve shared an overview of this work over on the GDS blog and this longer post explains what we found and what we did in more detail.

New certified companies and methods mean that GOV.UK Verify will meet its target demographic coverage by April and exceed it by summer 2016

At this stage most people (over 95%) are using GOV.UK Verify to access government services aimed at people in employment, such as filing a Self Assessment tax return or claiming a tax refund. People in employment are currently more likely than people who are not in employment to be able to verify their identity using GOV.UK Verify, because they are more likely to have the identity evidence and digital footprint that certified companies need.

When we look at the current services using GOV.UK Verify, we estimate at least 78% of people likely to use those services should be able to use GOV.UK Verify successfully based on the evidence and technology they are likely to have available. This will increase to around 91% by April.

Looking at the overall population of UK adults, we found that at least 73% of over 16s and 78% of employed people are likely to have the evidence and footprint needed to verify their identity using GOV.UK Verify. This will increase to at least 91% of employed people and at least 88% of over 16s by April 2016. By July, coverage will increase to 95% of employed people and 93% of over-16s.

The tool shows that by July coverage will increase to 95% of employed people

We modelled 22 of the 52 potential pieces of evidence that can be used to verify people. We selected these 22 as they are the items that people are more likely to have in larger numbers, so will have greater impact on overall coverage. However, because there are a further 30 items that could be sampled, we believe the estimates we have are lower bounds, or underestimates.

This is because people who could successfully verify, but only if they have one or more of the 30 items not modelled, would not be included in our estimate. For example, if you have an EU photocard (which was not included in our sample) you may still be able to verify.

25 to 64 year-olds are more likely to have the information to enable GOV.UK Verify to verify their identity

We also looked at how this proportion varied across different age groups. You can explore this yourself using the tool. This research has allowed us to prioritise how to improve and expand GOV.UK Verify ensuring that, in future, it can cover groups who have less evidence and/or technology.

We found older (75+) and younger people (16-24) are less likely than other groups to have the evidence and technology needed to verify with our certified companies at this stage. However, our coverage of both groups is set to improve by April (73% of 75+ and 82% of 16-24 year olds), and again by July (85% of 75+ and 90% of 16-24 year olds), based on the improvements the certified companies are planning to introduce. We expect further improvements beyond July as well, as new methods and data sources currently in the early stages of development come to fruition.

For younger people, this is because they are less likely to have documents currently used by certified companies as identity evidence (particularly driving licences) and less likely to be financially independent, which provides evidence of activity history (element E in verifying a person’s identity).

For older people, this is because they are less likely to have loans and mortgages showing activity history. They’re also less likely to have smartphones or tablets - two technologies that enable you to scan your identity documents and take a photograph of yourself, so the images can be compared. Doing this gives the certified company a higher level of assurance about each piece of identity evidence, so fewer pieces of evidence are needed to reach the required overall score. That means it’s possible to verify people who can use a smartphone or tablet with just 2 documents, whereas if the user manually enters all the details then usually 3 documents will be required. The certified companies are also developing other ways to achieve a high level of assurance about a piece of identity evidence that won’t require the use of a smartphone. We’ll share more about these methods once they’re ready to be released.

What you do for a living may impact whether GOV.UK Verify can verify you right now, but not in the future

People in professional and managerial occupations are more likely to be able to verify at this stage (87% compared to 74% of all over 18s), particularly compared to non-classified groups - unemployed people, students and people in non-classified occupations. This is because they are more likely than average to have items that certified companies use to verify people (identity documents, technology and financial products). Again, people in other occupations will see the greatest improvement in coverage over time, because of the new methods and data sources certified companies are planning to introduce.

GOV.UK Verify for more people

These two examples, in age and professions, both show that certified companies will improve their coverage the most in the groups that started with the lowest coverage. This is evidence that the market is developing to address the gaps in demographic coverage, with a focus on groups that were not covered by initial methods and data sources.

There’s not enough space here to explain all our results and the interesting variances in coverage across demographic groups, so please do explore the tool yourself. We’re also working with the Office for National Statistics (ONS) to release as much of the underlying data as we can under an Open Government Licence so you can analyse it, whilst protecting the anonymity of survey respondents and the commercial confidentiality of our certified companies’ plans.

How we did it - it’s a combinations problem

So, those were the results. Here’s how we got there.

Verifying an identity online requires a user to provide evidence to a certified company, sometimes using technology such as a smartphone to do so. The pieces of evidence provided must combine to meet government standards on identity assurance.

This combination set has grown over time as certified companies innovate and add more data sources and use technology to verify people. To date, there are 24,000 different combinations of 32 items (evidence and technology). This is only going to increase over time as certified companies continue to add more data sources and methods.

To estimate how many people can verify, we need to know how many people have these combinations that certified companies can use.

There is publicly available data on coverage of individual data sources. For example, we know there are 38 million driving licences issued in Great Britain, which is 78% of the mid year 2013 estimate of the population of Great Britain over the age of 18. We know from Her Majesty’s Passport Office that 80% of England and Wales residents have a UK passport. What we don’t know from this data is how many people have either a UK passport or a GB driving licence.

You might say we can estimate that, and it’s certainly possible, but only by making an assumption about independence. Complete independence means assuming that people who own a passport are just as likely to have a driving licence as people who do not own one (and vice versa). This would give us the result that 95.6% of people have either a GB driving licence or a UK passport.
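The 95.6% figure comes from multiplying together the chances of not holding each document. A minimal sketch of that arithmetic, using the two published coverage figures quoted above (the function name is ours, purely for illustration):

```python
def either_if_independent(p_a: float, p_b: float) -> float:
    """P(A or B) when holding A and holding B are assumed independent."""
    # The chance of holding neither item is the product of the two
    # "does not hold" chances; everything else holds at least one.
    return 1 - (1 - p_a) * (1 - p_b)


p_licence = 0.78   # GB driving licence coverage quoted above
p_passport = 0.80  # UK passport coverage quoted above

estimate = either_if_independent(p_licence, p_passport)
print(f"{estimate:.1%}")  # 95.6%
```

Without any assumption about overlap, the same two figures only tell us the answer lies somewhere between 80% (every licence holder also has a passport) and 100% (the two groups overlap as little as possible), which is why the independence assumption matters so much.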

However, the available data told us nothing about the correlation between different items - for example, if you have a driving licence, does that make it more or less likely you’ll also have a passport? There was also no available data on many of the items people can use to verify. So we commissioned the ONS through the Opinions and Lifestyle survey to ask people and generate a representative sample of the UK.

This gave us a list of the combinations of items people had that certified companies can use to verify their identity. We can immediately see how this improves our understanding compared with what we could glean from available sources about each item - for example the actual proportion of people in Great Britain who have either a UK passport or Great Britain driving licence is 89%, not 95.6%.
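The gap between 95.6% and 89% says something about how the two documents overlap. As a rough illustration only (the three percentages come from different sources and slightly different populations, so combining them like this is an assumption on our part), the inclusion-exclusion rule gives the implied share of people holding both items:

```python
p_licence = 0.78   # published licence coverage
p_passport = 0.80  # published passport coverage
p_either = 0.89    # survey estimate of "passport or licence"

# Inclusion-exclusion: P(A and B) = P(A) + P(B) - P(A or B)
p_both_implied = p_licence + p_passport - p_either

# Under complete independence the overlap would be P(A) * P(B)
p_both_independent = p_licence * p_passport

print(f"implied overlap:     {p_both_implied:.1%}")      # 69.0%
print(f"independent overlap: {p_both_independent:.1%}")  # 62.4%
```

On these figures the implied overlap is larger than independence would predict, which is what you would expect if people who hold one document are more likely to hold the other.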

Comparing combinations

Now we have a representative sample with combinations of things users have, we need to compare this to the combinations of things certified companies can use to verify users. To do this, we converted the methods or “user profiles” certified companies told us they were using - or planning to use - into combinations of evidence and technology. For example, a passport, smartphone, bank account and a loan is one combination a certified company can use to verify a user. The number of different combinations will grow over time as more methods and datasets get added.

To figure out how many people can be verified we then ran a matching programme to see how many people had at least one combination for that time period so that at least one certified company could verify them.
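The matching step can be sketched as a subset test: a person can be verified if they hold every item in at least one accepted combination. This is a minimal illustration rather than our actual programme; the item names, the second combination and the respondent rows are all made up, and a real run would also apply survey weights and use the combinations available in each time period.

```python
# Each accepted combination is the set of items one certified company method needs.
# The first is the example from this post; the second is invented for illustration.
ACCEPTED_COMBINATIONS = [
    frozenset({"passport", "smartphone", "bank_account", "loan"}),
    frozenset({"passport", "driving_licence", "credit_card"}),
]

# Made-up survey respondents: the set of items each person said they have.
respondents = [
    {"passport", "smartphone", "bank_account", "loan", "credit_card"},
    {"driving_licence", "bank_account"},
    {"passport", "driving_licence", "credit_card"},
    {"smartphone"},
]


def can_verify(items, combinations):
    """True if the person holds every item in at least one accepted combination."""
    return any(combo <= items for combo in combinations)


verified = [can_verify(person, ACCEPTED_COMBINATIONS) for person in respondents]
coverage = sum(verified) / len(respondents)

print(verified)           # [True, False, True, False]
print(f"{coverage:.0%}")  # 50%
```

Because the test is "at least one combination", adding a new combination can only keep coverage the same or increase it.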

The tech behind the web tool

The verification process was modelled using Python, making use of pandas, a data analysis library that works rather like a programmatic spreadsheet. Given key time points and survey data, the model produces verification rates, broken down by key demographic indicators, including age, sex and employment status.
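To give a feel for the breakdown step, here is a small sketch that turns per-respondent results into verification rates by demographic group and time point. It uses only the Python standard library so it runs without dependencies (in pandas the same aggregation would be a `groupby` followed by a mean), and the rows, field names and time-point labels are made up for illustration.

```python
from collections import defaultdict

# Made-up respondent rows: a demographic label plus whether the matching step
# found an accepted combination for that person at each time point.
rows = [
    {"age_group": "16-24", "verified": {"2016-01": False, "2016-04": True}},
    {"age_group": "16-24", "verified": {"2016-01": False, "2016-04": False}},
    {"age_group": "25-64", "verified": {"2016-01": True, "2016-04": True}},
    {"age_group": "25-64", "verified": {"2016-01": False, "2016-04": True}},
]


def verification_rates(rows, group_key):
    """Return {group: {time_point: share verified}} for one demographic variable."""
    # counts[group][time_point] holds [number verified, number of respondents]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for row in rows:
        for time_point, ok in row["verified"].items():
            tally = counts[row[group_key]][time_point]
            tally[0] += int(ok)
            tally[1] += 1
    return {
        group: {t: v / n for t, (v, n) in by_time.items()}
        for group, by_time in counts.items()
    }


rates = verification_rates(rows, "age_group")
print(rates)
```

The same function works for any other demographic column (for example sex or employment status) by changing `group_key`.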

The model outputs a JSON data file, which is then used to create a browser-based visualisation using D3. The interactive line chart and other elements of the tool are entirely bespoke, which gives full control over the presentation and makes it easy to add extra functionality. Being web-based, the tool can be easily shared.
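As a sketch of the hand-off between the model and the chart, the snippet below flattens a set of rates into a list of records and serialises it as JSON. The rates are the April and July age-group figures quoted earlier in this post; the record layout and field names (`group`, `date`, `rate`) are assumptions for illustration, not the format the tool actually uses.

```python
import json

# Age-group coverage figures quoted earlier in this post.
rates = {
    "16-24": {"2016-04": 0.82, "2016-07": 0.90},
    "75+": {"2016-04": 0.73, "2016-07": 0.85},
}


def to_chart_records(rates):
    """Flatten {group: {date: rate}} into one record per point on the chart."""
    return [
        {"group": group, "date": date, "rate": rate}
        for group, by_date in rates.items()
        for date, rate in by_date.items()
    ]


payload = json.dumps(to_chart_records(rates), indent=2)
print(payload)
```

A flat list of records like this is a convenient shape for a charting library to bind to, since each record maps to one point on a line.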

Next steps

This is our first iteration and is based on a snapshot of certified company methods as they looked in November 2015. Our next iteration will revisit these methods and add data from Northern Ireland, so we get a UK-wide sample.

With help from ONS we’ve done the first difficult bit: data gathering, cleaning and model building. Further analysis is our next priority: we’ll run logistic regressions and build correlation matrices to gain further insights across demographic groups.

We will also continue to improve the data, commissioning ONS to include more questions for the next rounds of the Opinions and Lifestyle survey sample to fill in any gaps.

In the meantime, explore the tool for yourself.

Update, 29 March 2016: We are now able to publish a CSV file (663 kb) containing the data used for the web tool for 7 of the 9 demographic variables provided by the ONS omnibus survey. This is combined with our model's estimate of the individual's probability of being verified by certified companies over time. This is the maximum number of variables we could make public, whilst preserving the anonymity of respondents. 

8 comments

  1. Caroline Miskin

    It would be interesting to know what proportion of the self-employed will be able to use Verify.

  2. Michael Clark

    Thanks for your comment, Caroline.

    We're working on getting that breakdown. We've got access to the relevant data and are currently working with ONS to analyse it with a view to including it in the next iteration of the tool. We'll keep you updated on our progress here on the blog.

  3. Jay

    On the Tool, what does 'total verified' represent?

  4. Michael Clark

    Thank you for the feedback - we realise this could be clearer, so we'll update the tool.

    'Total verified' means the percentage of people in Great Britain who could verify with at least one certified company.

  5. Ross Orange

    Of the 38 million driving licences, how many are still the old paper version without the photocard required to be used with Verify? How does this impact on the analysis undertaken?

  6. Michael Clark

    Thanks for your comment.

    A deep dive into the latest ONS sample estimates that 88.2% of those with a driving licence have a photocard and 11.8% have the paper version only. We plan to model this detail, together with the additional data we've collected with ONS for the next iteration, which we'll publish here on the blog.

  7. Malcolm Doody

    It would be interesting to understand the various step-changes in verification and what event(s) underpin the assumption. The obvious one is probably the January deadline for SA filing with HMRC, but what are some of the others? e.g. April 2015 (most/all groups), December 2015 (again, most groups) and the predicted upturn in June 2016 ...

  8. Michael Clark

    Thanks for your comment.

    The step changes are based on new methods and data sources added by the certified companies.

    A past example was when a certified company added the ability to let users scan identity documents and take a photograph of themselves, so the images can be compared. This gives the certified company a higher level of assurance about each piece of identity evidence, so fewer pieces of evidence are needed to reach the required overall score. That means it’s possible to verify people who can use a smartphone or tablet with just 2 documents, whereas usually 3 documents will be required. The result is an increase in demographic coverage.

    We'll write about the new methods here once they're available to users and will show the impact on demographic coverage as part of future updates to the tool.
