---
lead: true
title: Putting the Open Data Into Open Budgets
authors:
- Neil Ashton
---

We have looked in detail in this report at the criteria which make it difficult for organisations to use data that has been released by governments.

In January 2013, we hosted a community call to look at what the demands of the Open Data Community are with regard to Open Budgets. Although both feature the word “open”, there is still a disconnect between the use of “open” in many circles to signify availability and its use in technical spheres to signify the absence of legal, technical and social restrictions. The purpose of the call was to investigate whether it would be possible to specify the demands of the Open Data Community in relation to budget and spending data.

## What do we need and how do we need it?

### Structured data

So that analysis is not so labour-intensive! For definitions of structured data, please see the section below: *Structured data: What data formats to provide*

### Bulk access

* *It should also be possible to download all of the budget information in bulk*.
* Preventing bulk downloads by using systems such as CAPTCHA is not acceptable.
* Some interviewees requested that data be released via an API. This is indeed a useful move, particularly when data is updated regularly, but it should not be the only way to acquire the data: many non-technical users simply require a bulk download of the data.

### Updates and amendments

If there is a requirement to update or change the budget documents, e.g. as new drafts are produced, it is important to show the versions and keep track of the changes. Some suggestions:

* Displaying the date on which the data was updated, or using version numbers, would be acceptable.
* Crucially, there should be access to all drafts (i.e. they should not be removed from their place of publication and should remain available) even when new versions are published.
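To illustrate why keeping every draft available matters, here is a minimal sketch of how a user might track changes between two published versions of a budget. It assumes the drafts are released as CSV; the column names (`line_item`, `amount`) and all figures are hypothetical and not taken from any real budget.

```python
import csv
import io

# Hypothetical budget drafts; column names and figures are illustrative only.
DRAFT_V1 = """line_item,amount
Education,1000
Health,800
Roads,300
"""

DRAFT_V2 = """line_item,amount
Education,1000
Health,850
Water,120
"""


def load_draft(text):
    """Return a {line_item: amount} mapping from one CSV draft."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["line_item"]: int(row["amount"]) for row in reader}


def diff_drafts(old, new):
    """List (line_item, old_amount, new_amount) for every item that changed.

    An amount of None means the item is absent from that draft.
    """
    changes = []
    for item in sorted(set(old) | set(new)):
        before, after = old.get(item), new.get(item)
        if before != after:
            changes.append((item, before, after))
    return changes


changes = diff_drafts(load_draft(DRAFT_V1), load_draft(DRAFT_V2))
for item, before, after in changes:
    print(item, before, "->", after)
```

A comparison like this is only possible if the earlier draft is still online when the later one appears.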
### Timely data (that stays around)

Data is required:

* Within a period of time that would allow change to take place
* Early in the budget formulation process, so that it is possible to participate in discussion about where the funds should actually go
* After budget formulation, so that you can monitor whether things have actually happened
* As planned versus execution data while such comparisons still matter: for example, so that one might complain that a project didn’t actually happen while the person who would have been responsible for it is still in that job, and the people who would have benefited from it are still going to be angry
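The planned-versus-execution comparison described above could look something like the following sketch, assuming both figures are published together in one structured file. The project names, column names, amounts and the 75% threshold are all hypothetical choices made for illustration.

```python
import csv
import io

# Hypothetical data: projects, columns and amounts are made up for illustration.
BUDGET_CSV = """project,planned,executed
School repairs,500,480
Rural clinic,300,0
Bridge upgrade,900,450
"""


def execution_rates(text):
    """Return {project: executed / planned} for each row with a non-zero plan."""
    rates = {}
    for row in csv.DictReader(io.StringIO(text)):
        planned = float(row["planned"])
        if planned > 0:
            rates[row["project"]] = float(row["executed"]) / planned
    return rates


def under_executed(rates, threshold=0.75):
    """Names of projects whose execution rate is below the chosen threshold."""
    return sorted(name for name, rate in rates.items() if rate < threshold)


rates = execution_rates(BUDGET_CSV)
flagged = under_executed(rates)
print(flagged)
```

The sooner execution data is released, the more likely it is that a list like `flagged` reaches people who can still act on it.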
The UK government has now issued very good, clear, plain-language guides for service managers on which data formats are appropriate for publishing data. The US government has also decreed that all data shall be published in machine-readable formats. An extract from the gov.uk service manual is copied below for the convenience of the reader: [...]
“For data, use CSV or a similar ‘structured data’ format (see also JSON and XML). Do not publish structured data in unstructured formats such as PDF. If you are regularly publishing data (financial reports, statistical data, etc.) then your users may well wish to process this data programmatically, and it becomes especially important that your data is ‘machine-readable’. PDFs, Word documents and the like are not suitable formats for data publication. In addition, you should consider making your data available through an API if this will simplify your users’ interactions with your publications. [...] If you are publishing a written report that contains statistical tables, provide the tables alongside or in addition to your report in suitable data formats.”

Read the full version of the guidelines here.

“Don’t assume your users can read proprietary formats

“Wherever possible, publish in accessible, patent-free, open formats, for which software is widely available on a variety of platforms. If publishing in proprietary formats, you should always make a non-proprietary alternative available. [...] For tabular data, provide CSV or TSV rather than Excel spreadsheets (.xls/.xlsx).”
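As a rough illustration of the guidance above, the sketch below shows how a publisher might release the same table in two open, machine-readable formats (CSV and JSON) using only the Python standard library. The table contents and field names are hypothetical.

```python
import csv
import io
import json

# Hypothetical table from a written report; names and figures are illustrative.
ROWS = [
    {"department": "Education", "year": 2013, "amount": 1000},
    {"department": "Health", "year": 2013, "amount": 800},
]


def to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialise the same rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(ROWS)
json_text = to_json(ROWS)
print(csv_text)
```

Either output can be read back by freely available software on any platform, which is not the case for a table locked inside a PDF.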