Point 8 of the Digital by Default Service Standard says:
Make all new source code open and reusable, and publish it under appropriate licences (or provide a convincing explanation as to why this cannot be done for specific subsets of the source code).
This post is about how we’re going to open up our code. It explains what we’ve opened so far, where we aim to get to, and the process we will follow to get there.
TL;DR
We’ve started open sourcing parts of GOV.UK Verify, and there’s lots more to come.
We’re keen to collaborate with others - other governments have expressed an interest already.
We’re blogging about the code as we open source it. We want it to be easy to follow so that people can make suggestions and report any bugs or vulnerabilities they spot.
The story so far
When we started building GOV.UK Verify, more than 4 years ago, coding in the open was not our default way of working. Working in private seemed like a good place to start, given the nature of what we were doing and the prevailing approach to such issues in government at the time. Since then, a lot has changed, as James described in his recent post about opening up the codebase for the Register to Vote Service.
Retrospectively opening up your codebase is much more difficult and complicated than working in the open from the start. We’ve been learning from other teams who have faced similar challenges, and we’ve been working with experts from CESG and from other parts of GDS to help us plan our approach. We’d welcome feedback from readers of this post, too, to help us continue to develop our approach.
How we identify and prioritise components to open
It will take us some time to open all our code - we will keep posting updates on our progress on this blog. We’re planning to prioritise our releases of code over the next few months based on the following considerations.
Interest in re-use
Both we and the wider community benefit from active participation, so anything that’s valuable and interesting to others is high priority. For example, public sector teams from other countries have expressed an interest in learning from, and possibly reusing, some of our components. There’s also the potential for private sector companies to build a hub and federation for non-government use. The hub frontend and SAML handling components may be useful to these parties.
“While we’re working on that anyway” opportunities
As we’re iterating our service, if we have a lot of work to do on a component for a product feature, it may make sense to open it as part of that work. For example, adding multi-language support to our service was a good opportunity for us to move the frontend into the open.
Transparency and public interest
Even if a piece of code has limited potential for re-use, we’d like to be open about what it’s doing, why and how. We expect more public interest in how we’re handling private data than in, say, a one-off script we’ve written to migrate some data from an old system to a new one.
Our open source kanban
Done
We’ve already published some code:
- https://github.com/alphagov/eager-dropwizard-guice - code to make it easier to use Guice with Dropwizard
- https://github.com/alphagov/interaction-diagrams - visualisation of SAML messaging
- https://github.com/alphagov/gradle-gatling-plugin - tooling for automated testing
Doing
- We’re rebuilding the GOV.UK Verify frontend - you can track our progress on GitHub
Todo
We aim to release the following code in the next few months:
- Piwik Puppet module - this may be of use to other users of the Piwik analytics tool
- Various libraries that we use to represent our SAML profile on top of OpenSAML-JAVA
- The Matching Service Adaptor - this is of interest to the government services that rely on GOV.UK Verify, and of public interest as a component that handles personal data
Out of scope (for now)
For now, we are not planning to prioritise opening up the following parts of our codebase. However, over time we think we will probably be able to open up some of the code in these categories, too.
Configuration that's specific to the environment in which we build and host our service, i.e. infrastructure code
We see little potential for re-use of this area of our codebase, and it currently contains sensitive information (for example, which users have access to particular systems). In the long term we may look at how we could make at least some of this code open.
Commercially confidential information
There may be circumstances where a certified company has invested in a new identity proofing method, wishes to bring it to the market ahead of its competitors, and is not yet ready to publicise its plans. This is commercially confidential information and, in those cases, we wouldn’t publish the code as it’s being prepared ahead of release but would wait until the new method is ready to be released. If we were unable to keep the method confidential until its release, companies would be discouraged from investing in innovative methods that improve the user experience and deliver better value for government.
Secrets
Cryptographic keys, which must be kept secret, are typically used in combination with publicly available cryptographic algorithms to ensure that data is kept secure. Some systems and APIs also use passwords which must likewise remain secret.
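To illustrate why code can be open while secrets stay private, here is a minimal sketch in Python. It is not taken from the GOV.UK Verify codebase: the environment variable name EXAMPLE_SIGNING_KEY and the function names are assumptions made for this example. The algorithm (HMAC-SHA256) is public and the code could be published as it stands, because the key is read from the running environment rather than written into the source.

```python
import hashlib
import hmac
import os


def load_signing_key(env_var="EXAMPLE_SIGNING_KEY"):
    """Read the secret key from the environment (variable name is hypothetical)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key.encode("utf-8")


def sign(message, key):
    """Sign a message (bytes) with HMAC-SHA256, a publicly documented algorithm."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message, signature, key):
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(message, key), signature)
```

Anyone can read this code, but only a holder of the key can produce a signature that verifies.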
The document checking service
The document checking service provides a method for certified companies to check whether UK passports and driving licences provided by users are valid. We’ve decided not to open the code for this service for now for the following reasons:
- It connects to systems in other departments via interfaces that we do not have permission to publish
- There appears to be little interest in re-use of this code at the moment
The steps to open up GOV.UK Verify’s code
For each component to be opened we start by considering any security risks in opening the code and work out a plan to deal with them - the security of our users is paramount so we won’t proceed unless we can do so safely.
We then tidy up the code so that:
- it can be built independently of other components that are not yet open
- it contains no secrets or sensitive information
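The check for secrets can be partly automated. The following is a rough sketch in Python of what such a check might look like; it is not the tooling we use, and the two patterns and the function names are hypothetical examples. A real check would need a much fuller rule set, would also need to look through the commit history, and would not replace a manual review.

```python
import re
from pathlib import Path

# Hypothetical example patterns; a real rule set would be far more thorough.
SECRET_PATTERNS = {
    "private key block": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA )?PRIVATE KEY-----"
    ),
    "hard-coded password": re.compile(
        r"(?i)\b(?:password|passwd|secret)\b\s*[:=]\s*['\"][^'\"]+['\"]"
    ),
}


def scan_text(text):
    """Return (line_number, label) pairs for each line matching a pattern."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


def scan_tree(root):
    """Scan every readable UTF-8 file under root; return {path: findings}."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        findings = scan_text(text)
        if findings:
            results[str(path)] = findings
    return results
```

Calling scan_tree on a checked-out component returns the files and line numbers that need a closer look before release. An empty result only means that none of the patterns matched, not that the code is free of sensitive information.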
We will decide on a case-by-case basis whether to retain the commit history, as GOV.UK did when opening their Puppet repository.
We think it’s really important to provide some context for each component that we open source - without this the opportunities for critical review and for re-use are very limited. We provide context via a combination of blog posts and code level documentation.
Some components may need a further independent code review before release.
Once a component is in the open, we will usually do our own builds directly from the public repository.
Over to you
We’re excited to be opening up our code, and we’re looking forward to learning and developing our approach as we go along. If you have any feedback on the approach we’re planning to take, or on the code we’ve opened so far, let us know in the comments below.