---
title: "Common arguments against not publishing data in machine-readable format"
---
# Common arguments against not publishing data in machine-readable format

## “PDFs are on my computer - therefore they are machine-readable”

FALSE. The fact that they are on your computer means they are electronic copies, not that they are machine-readable. A PDF is essentially a set of instructions telling a printer how to print a page. It may look nice and appealing to the human eye, but to a computer it is often little more than a picture.

PDFs range from bad to worse from the perspective of someone trying to do data work:
- [Better PDFs are machine-generated](https://www.gov.uk/service-manual/design-and-content/resources/creating-accessible-PDFs.html), typically an Excel or Word document converted into a PDF. You can often copy and paste information from them, but formatting problems or other issues may remain.
- Worse PDFs are typically scanned documents. Often, to add to the misery, they are copies of faxes that are smudged, speckled, tea-, water- or mould-stained, or crooked (sometimes all of the above).
- Image files are not machine-readable for the same reasons.
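
As a rough illustration of what machine-readable means in practice, the sketch below parses a small CSV extract using Python's standard `csv` module. The column names and figures are invented for illustration and do not come from any real dataset, and the function name `read_spending` is likewise hypothetical.

```python
import csv
import io

# Hypothetical spending extract: columns and values are made up for illustration.
CSV_TEXT = """date,supplier,amount
2013-01-15,Acme Ltd,1200.50
2013-01-20,Widget Co,349.99
"""


def read_spending(text):
    """Parse CSV text into a list of dicts, converting amounts to numbers."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["amount"] = float(row["amount"])
        rows.append(row)
    return rows


rows = read_spending(CSV_TEXT)
total = sum(r["amount"] for r in rows)
print(f"{len(rows)} rows, total {total:.2f}")
```

Nothing comparable works on a scanned PDF or an image without first going through an error-prone extraction step such as OCR.
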

## “If we publish in machine-readable, open formats - someone will alter the data and use it to discredit us.”

Again, FALSE. If someone wants to use the data badly enough, they will use it even if they have to extract it manually. Manual extraction can introduce mistakes, which could also lead to embarrassment. Publishing the data in machine-readable format simply allows the user to start working with the data straight away.

Our advice would be the following:
- Publish in both machine-readable and non-machine-readable formats. We insist on the former for analysis, but the latter can also be useful, e.g. for cross-referencing numbers and as an easily readable form of the data.
- Encourage users of the data to show their working.

A reputable data project will:
- Link back to the original source data.
- Link to any modified data, with an explanation of how it was changed and with the calculations and underlying working clearly visible. Such a clear audit trail lets others replicate the work and check that everything was done without errors. In journalism this is sometimes known as the “nerd box”.
- Invite data publishers to comment on calculations made from the data, in order to clear up misunderstandings.

This allows anyone to check the accuracy of the working and verify the results.

## “We cannot release spending data as it contains personal information”
FALSE. Public authorities holding spending data that includes personal information should not shirk their responsibility to publish it. Instead, authorities should examine the data properly and redact personal data accordingly. We see a real risk of local and national governments holding back spending data with this excuse, and have therefore co-written a guide for public authorities on how to deal with personal information in spending data [(see the Annex)](http://openspending.org/resources/osi/privacyguide.html).

The current state of access to data from the EU farm subsidy programme is a clear example: privacy (in this case, that of farmers) was used as the argument to decide a case, and the outcome significantly reduced access to the data.

## “We cannot release spending data due to third-party confidentiality”
FALSE. Public authorities should publish information about transactions with contractors and commercial vendors. It is not uncommon, however, for public officials or commercial contractors to attempt to block releases on the grounds of the third party's commercial confidentiality. This argument is most commonly made when requests are for the actual contracts (though even fine-print contracts can often be released in full without redacting any information).

## “We cannot release granular data. You can get aggregated expenditures”
FALSE. Access to line-by-line transactional spending data is essential to ensure accountability. Investigating suppliers and procurement practices requires detailed, transaction-level spending data.

A few countries currently release such data, with the UK, US, Brazil and Slovenia among the leaders in this field. Even in these countries there is still work to do. In particular, we have noticed that several countries have introduced fairly high disclosure thresholds when deciding to disclose transactional data. Such practices remain a serious concern and should be challenged, because a large share of public spending can fall below these thresholds.

Disclosure thresholds vary widely between countries:
- United States (federal level): USD 25,000
- United Kingdom, Government: GBP 25,000
- United Kingdom, Councils: GBP 500 (for spending data), GBP 50,000 (for contracts)
- Slovenia: No minimum disclosure threshold
- Greece: No minimum disclosure threshold

Without knowing more about how these levels were set in each country, it is hard to judge why they were positioned where they are or whether they are reasonable (the Spending Data Handbook has more on this topic).
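
To see how a threshold can hide a large share of spending, the sketch below computes the fraction of total spending made up of transactions below a given threshold. The transaction amounts are made up, the function name `share_below_threshold` is hypothetical, and the rule that only payments at or above the threshold are disclosed is an assumption; real disclosure rules vary by country.

```python
def share_below_threshold(amounts, threshold):
    """Return the fraction of total spending in transactions below the threshold."""
    total = sum(amounts)
    if total == 0:
        return 0.0
    undisclosed = sum(a for a in amounts if a < threshold)
    return undisclosed / total


# Made-up transactions: many small payments and a few large ones.
amounts = [400] * 50 + [10_000] * 20 + [30_000] * 5 + [100_000] * 2

# Assumption: a payment is disclosed only if it is at or above the threshold.
for threshold in (500, 25_000):
    share = share_below_threshold(amounts, threshold)
    print(f"threshold {threshold}: {share:.1%} of spending undisclosed")
```

Running the same calculation on real transaction data, where it is available, is one way to judge whether a given threshold is reasonable.
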

## “You need to specify what payments you are asking for”
In some countries, public authorities require requesters to identify specifically the data they would like to see released (e.g. all payments made to company X).

Limiting access to transactional spending data by requiring detailed, itemised FOI requests is wrong in principle and does not promote openness. Some agencies will not be able to publish spending data from sub-departments or local agencies.

Our advice to public agencies would be:
- to publish all spending data that can be released.
- to publish the data at the most granular level possible, across all programmes and departments.