[examples/openspending] - openspending v0.2 (#907)
* [examples/openspending] - openspending v0.2
* [examples/openspending][m] - fix build
* [examples/openspending][xs] - fix build
* [examples/openspending][xs] - add prebuild step
* [examples/openspending][m] - fix requested by demenech
* [examples/openspending][sm] - remove links + fix bug
@@ -0,0 +1,17 @@
|
||||
---
|
||||
lead: true
|
||||
title: Appendix
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
The appendix contains material which will later be integrated into the Spending Data Handbook, but which was important as direct reference material for this report.
|
||||
|
||||
* [Putting the Open Data into Open Budgets](./open-budgets-open-data/)
|
||||
* [Tool Ecosystem: What tools do people use to work with financial data?](./tool-ecosystem/)
|
||||
* [Common arguments against publishing data](./machinereadfaq/)
|
||||
* [How to publish spending data without disclosing personal information?](./privacyguide/)
|
||||
* [Other handy datasets](./other-handy-datasets/)
|
||||
|
||||
**Next**: [Putting the Open Data Into Open Budgets](./open-budgets-open-data)
|
||||
|
||||
**Up**: [Mapping the Open Spending Data Community](../)
|
||||
@@ -0,0 +1,72 @@
|
||||
---
|
||||
lead: true
|
||||
title: Common arguments against publishing data
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
Almost everyone in the community can tell stories of struggling with government officials to obtain transactional spending data in machine-readable formats. Often publishers simply do not know that civil society wants data in a particular format, but there are also deliberate obstructions. In this FAQ we list the most typical excuses for refusing to release data in computer-friendly formats.
|
||||
|
||||
## ... in machine-readable format
|
||||
|
||||
### “PDFs are on my computer - therefore they are machine-readable”
|
||||
|
||||
FALSE: The fact they are on your computer means they are electronic copies, but not that they are machine-readable. PDFs are essentially a set of instructions telling a printer how to print a page. They look nice and appealing to the human eye, but to a computer they are little more than a picture.
|
||||
|
||||
PDFs go from bad to worse from the perspective of someone trying to do data work:
|
||||
|
||||
* [Better PDFs are machine-generated](https://www.gov.uk/service-manual/design-and-content/resources/creating-accessible-PDFs.html), typically something like an Excel spreadsheet or a structured Word document converted into a PDF [(see example)](https://docs.google.com/a/okfn.org/file/d/1En9UbXiVwinRiMPf6gwL7LY-1rClPdEoM_aj75FWNgm5qLbIa42fg6y81YFv/edit). Often, you can copy and paste information from them, but there may be some formatting issues.
|
||||
* Worse PDFs are typically scanned documents. Often, to add to the misery, they will be copies of faxes that are smudged, speckled, tea-, water- or mould-stained, or crooked (sometimes all of the above).
|
||||
* Image files are not machine-readable for the same reasons.
|
||||
|
||||
### “If we publish in machine-readable, open formats - someone will alter the data and use it to discredit us.”
|
||||
|
||||
Again, FALSE. If someone wants to use data badly enough, they will use it even if they have to get it out of documents manually. If they have to get it out manually - mistakes could be introduced. Publishing the data in machine-readable format simply allows the user to start working with the data straight away.
|
||||
|
||||
Our advice would be the following:
|
||||
|
||||
<ul>
|
||||
<li>Publish both machine-readable and non-machine-readable formats. We insist on the former for analysis, but the latter can also be useful, e.g. to cross-reference numbers and to provide an easily readable form for reading and sharing reports. </li>
|
||||
<li>Encourage users of the data to show their working. A good data project will usually:
|
||||
<li>
|
||||
<ul>
|
||||
<li>Link back to the original source data </li>
|
||||
<li>Link to any modified data with an explanation of how it was changed, with the calculations and any underlying working clearly visible. When you provide such a clear audit trail, others will be able to replicate your work and verify that everything was done without errors. In journalism this is sometimes known as the “nerd box”. </li>
|
||||
<li>Offer the data source the chance to comment on calculations from the data in order to clear up any misunderstandings.</li>
|
||||
<li>This allows anyone to check the accuracy of the working and verify the results.</li>
|
||||
</ul>
|
||||
</ul>
|
||||
## ... in sufficient levels of detail
|
||||
|
||||
### “We cannot release spending data as it contains personal information”
|
||||
|
||||
FALSE: public authorities holding spending data that includes personal information should not shirk their responsibility to publish the data. Instead, authorities should conduct a proper examination and redact personal data accordingly (workflows can be developed so that this effort is minimal). We see real risks of local and national governments holding back spending data with this excuse and have therefore co-written a guide for public authorities on how to deal with personal information in spending data (see the <a href="../privacyguide/">privacy guide</a>).
|
||||
|
||||
The current state of access to data from the EU farm subsidy programme is a clear example: privacy (in this case, that of farmers) was used as an argument in a case at the European Court of Justice, and the decision significantly [reduced access to data on farm subsidy payments](http://farmsubsidy.org/news/features/2012-data-harvest/).
|
||||
|
||||
### “We cannot release spending data involving third parties due to confidentiality concerns”
|
||||
|
||||
Public authorities should publish information about transactions between themselves, contractors and commercial vendors. It is not uncommon, however, for public officials or commercial contractors to attempt to block releases on the grounds of the commercial confidentiality of the supplier (the third party).
|
||||
|
||||
This argument is most commonly raised when actual contracts are requested, but even contracts are often [released in full](http://www.asktheeu.org/en/request/292/response/805/attach/2/Signed%20Framework%20Agreement%20with%20Eurocontrol.PDF.pdf) without redactions.
|
||||
|
||||
### “We cannot release granular data. You can get aggregated expenditures”
|
||||
|
||||
NOT USEFUL: access to line-by-line transactional spending data is essential to ensure accountability. Detailed transaction-level spending data is required in order to investigate suppliers and procurement practices.
|
||||
|
||||
Currently only a few countries release such data, with the UK, US, Brazil and Slovenia among the leaders in this field. Even in these countries, there is still work to do.
|
||||
|
||||
We have also noticed that several countries have introduced fairly high disclosure thresholds for transactional data. Such practices remain a serious concern and should be challenged, as a large share of public spending can fall below such disclosure thresholds.
|
||||
|
||||
Disclosure thresholds vary widely between countries:
|
||||
|
||||
* United States (federal level): USD 25,000
|
||||
* United Kingdom, National: GBP 25,000
|
||||
* United Kingdom, Councils: GBP 500 (for spending data), GBP 50,000 (for contracts)
|
||||
* Slovenia: No minimum disclosure threshold
|
||||
* Greece: No minimum disclosure threshold
|
||||
|
||||
Without knowing more about how these levels were set in each country, it is hard to judge whether they are reasonable.
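
To get a feel for how much a given threshold might hide, one can compute the share of total spending made up of payments below it. The following is a minimal Python sketch; the function name `share_below_threshold` and the payment amounts are invented for illustration and do not come from any real dataset.

```python
def share_below_threshold(amounts, threshold):
    """Return the fraction of total spending made up of payments below `threshold`."""
    total = sum(amounts)
    if total == 0:
        return 0.0
    hidden = sum(a for a in amounts if a < threshold)
    return hidden / total


# Invented amounts, for illustration only.
payments = [120_000, 30_000, 24_000, 18_000, 9_500, 4_000, 450]
share = share_below_threshold(payments, 25_000)
print(f"{share:.1%} of spending falls below the threshold")
```

Running the same calculation on real transaction data at several candidate thresholds would show how quickly the undisclosed share grows as the threshold rises.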
|
||||
|
||||
**Next**: [How to publish spending data without disclosing personal information](./privacyguide/)
|
||||
|
||||
**Up**: [Appendix](../)
|
||||
@@ -0,0 +1,108 @@
|
||||
---
|
||||
lead: true
|
||||
title: Putting the Open Data Into Open Budgets
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
In this report we have looked in detail at the factors that make it difficult for organisations to use data released by governments. In January 2013, we hosted a community call to look at what the Open Data Community demands of Open Budgets. Although both feature the word “Open”, there is still a disconnect between the use of “open” in many circles to signify availability and its use in technical spheres to signify the absence of legal, technical and social restrictions.
|
||||
|
||||
The purpose of the call was to investigate whether it would be possible to specify the demands of the Open Data Community with relation to budget and spending data.
|
||||
|
||||
## What do we need and how do we need it?
|
||||
|
||||
### Structured data
|
||||
|
||||
So it’s not so labour-intensive to do analysis!
|
||||
|
||||
For definitions of structured data, please see the section below: *Structured data: What data formats to provide*
|
||||
|
||||
### Bulk access
|
||||
|
||||
* *It should also be possible to download all of the budget information in bulk*.
|
||||
* Preventing bulk downloads by using systems such as CAPTCHA is not acceptable.
|
||||
* Some interviewees requested data to be released via an API. This is indeed a useful move, particularly when data is updated regularly, but it should not be the only way to acquire the data: many non-technical users simply require a bulk download.
|
||||
|
||||
### Updates and amendments
|
||||
|
||||
If there is a requirement to update or change the budget documents e.g. as new drafts are produced - it's important to show the versions and keep track of the changes. Some suggestions:
|
||||
|
||||
* Displaying what date the data was "updated on", or using version numbers would be acceptable.
|
||||
* Crucially, all drafts should remain accessible (i.e. they should not be removed from their place of publication) even when new versions are published.
|
||||
|
||||
### Timely data (that stays around)
|
||||
|
||||
Data is required:
|
||||
|
||||
* Within a period of time that would allow change to take place
|
||||
* Early in the budget formulation process, so that it is possible to participate in discussion about where the funds should actually go
|
||||
* After budget formulation so that you could monitor whether things had actually happened
|
||||
* Planned versus execution data while such comparisons still matter - for example, so that one might complain that a project didn’t actually happen, and the guy who would have been responsible for that is still in that job, and the people who would have benefited from it are still going to be angry
|
||||
|
||||
<div class="well">
|
||||
<h3>How long should data be available online? </h3>
|
||||
<ul>
|
||||
<li>The costs of storing information online nowadays are so minimal, that this question is essentially redundant (i.e. the answer is "forever"). </li>
|
||||
<li>If a government feels it is absolutely necessary to remove data after a certain period of time *(this should be a minimum of several years after original publication, longer if the period to which the budget relates is greater than a year)*, they should **specify at time of publication, clearly the time and date on which the information will be removed**. This will allow civil society organisations sufficient time to make a backup copy for themselves.</li>
|
||||
</ul>
|
||||
</div>
|
||||
### Classifications
|
||||
|
||||
Different users are interested in different aspects of budgets. Not all classifications will be available, and the availability and structure of classifications, as well as the requirements of individuals and organisations, will vary from country to country.
|
||||
|
||||
* All available classifications should be published.
|
||||
* Functional classifications are often the most comprehensible to citizens. They explain the particular themes or sectors on which money is spent. There are also international standards for comparing functional spending (e.g. COFOG).
|
||||
* Programmatic classifications are used particularly in developing countries to relate spending to multi-year development plans.
|
||||
* Administrative classifications show which department or agency received the money – and are therefore important for the accountability of funds down the chain.
|
||||
|
||||
##### Breakdown
|
||||
* Information can then be aggregated up to create more meaningful and digestible information, but the reverse (from aggregate to disaggregate information) is not possible.
|
||||
* Again, the availability of detailed information, as well as the requirements of individuals and organisations, will vary from country to country.
|
||||
* Therefore, budgets should be as detailed and disaggregated as possible.
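
The asymmetry between aggregating and disaggregating can be illustrated with a short Python sketch. The rows, field names and the `aggregate` helper below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def aggregate(rows, key):
    """Sum the 'amount' of each row, grouped by the classification in `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


# Invented, disaggregated budget lines carrying two classifications each.
rows = [
    {"function": "Health", "department": "Hospitals", "amount": 120.0},
    {"function": "Health", "department": "Clinics", "amount": 80.0},
    {"function": "Education", "department": "Schools", "amount": 150.0},
]

# Detailed rows can be rolled up along any classification they carry.
print(aggregate(rows, "function"))
print(aggregate(rows, "department"))
```

If only the totals per function had been published, there would be no way to recover the department-level lines from them, which is why the detailed rows are the more valuable release.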
|
||||
|
||||
### Spending standard
|
||||
|
||||
In the [Technology for Transparent and Accountable Public Finance Report](http://community.openspending.org/research/gift/), we identified the need for a global standard for opening up transaction-level spending data. A couple of further comments on this topic:
|
||||
|
||||
* This is probably going to be more useful at the international level – e.g. to pull all the data together and look at super-aggregate information.
|
||||
* It could also be useful at country level though, for inter-country comparisons.
|
||||
|
||||
The number one low-hanging fruit for vastly improving the usability of available budget and spending information (plus procurement and the other types listed above) is to make data **machine-readable**.
|
||||
|
||||
<div class="well">
|
||||
<h2>What does Machine-readable mean?: Implementation guidelines from the UK government.</h2>
|
||||
<quote>
|
||||
The UK government has now issued very good, clear <a href="https://www.gov.uk/service-manual/design-and-content/choosing-appropriate-formats.html">plain-language guides</a> for service managers on which data formats are appropriate for publishing data. The US government has also decreed that all data shall be published in machine-readable formats. An extract from the UK service manual from gov.uk is copied below for the convenience of the reader:
|
||||
|
||||
<ul>
|
||||
<li><quote><strong>“For data, use CSV or a similar ‘structured data’ format (see also JSON and XML). Do not publish structured data in unstructured formats such as PDF</strong></quote>.</li>
|
||||
<li><quote><strong>If you are regularly publishing data (financial reports, statistical data, etc.) then your users may well wish to process this data programmatically, and it becomes especially important that your data is ‘machine-readable’. PDFs, Word documents and the like are not suitable formats for data publication. In addition, you should consider making your data available through an API if this will simplify your users’ interactions with your publications. [...]</strong></quote> </li>
|
||||
<li><quote><strong>If you are publishing a written report that contains statistical tables, provide the tables alongside or in addition to your report in suitable data formats.</strong></quote></li>
|
||||
</ul>
|
||||
</quote>
|
||||
|
||||
[...]
|
||||
|
||||
<quote>
|
||||
|
||||
<h2>Don’t assume your users can read proprietary formats</h2>
|
||||
Wherever possible, publish in accessible, patent-free, <a href="https://en.wikipedia.org/wiki/Open_format">open formats</a>, for which software is widely available on a variety of platforms. If publishing in proprietary formats, you should always make a non-proprietary alternative available.
|
||||
[...] For tabular data, provide <strong> <a href="http://en.wikipedia.org/wiki/Comma-separated_values">CSV</a> or <a href="http://en.wikipedia.org/wiki/Tab-separated_values">TSV</a> </strong> rather than Excel spreadsheets (.xls/.xlsx).
|
||||
|
||||
</quote>
|
||||
Read the full version of the guidelines <a href="https://www.gov.uk/service-manual/design-and-content/choosing-appropriate-formats.html">here</a>.
|
||||
|
||||
</div>
|
||||
## Why is this so important?
|
||||
|
||||
Civil Society Organisations currently waste a huge amount of time and resources in converting data from non-machine-readable formats into ones that they can use for analysis, visualisations or other projects. Any data project has a data pipeline:
|
||||
|
||||

|
||||
|
||||
Typically, in the projects we have analysed in this report, finding data and getting data (including extracting data out of formats such as PDFs into usable formats) are the most time-intensive parts of all of the projects. Extracting data out of non-machine-readable formats:
|
||||
|
||||
* **is a waste of time and resources for all involved**. In an ideal world, re-users of the data should be able to dedicate the majority of their time to analysis, presentation and action around the data. It can take weeks or months for organisations to extract all the information from one document or file, enough to make a visualisation or simple analysis.
|
||||
* **introduces transcription errors during conversion**. Even current software that extracts information from PDFs automatically can introduce errors.
|
||||
|
||||
**Next**: [Tool Ecosystem](./tool-ecosystem/)
|
||||
|
||||
**Up**: [Appendix](../)
|
||||
@@ -0,0 +1,46 @@
|
||||
---
|
||||
lead: true
|
||||
title: Other handy datasets
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
In our conclusions section, we highlighted the main types of data which are in demand (budgets, transaction-level spending, procurement...). We have kept the demands in the conclusion short for clarity's sake; however, there are many other datasets which organisations need to be able to combine with the key data:
|
||||
|
||||
<ul>
|
||||
<li><strong>Information on targets or outputs</strong>: these should be clearly mappable to the project or programme area to which they relate in order to be able to answer questions such as “What is the delivery rate?” or “Did that injection of funds/stimulus package result in better performance?”. These are not always produced by governments, but are frequently in demand.</li>
|
||||
<li><strong>Geographic information</strong> should be available at reasonably granular levels. Governments often transfer grants and other payments to geographically identified areas, e.g. as part of redistribution schemes. Providing access to such regionalised accounts can be crucial in enabling CSOs to assess the equality and distribution of budgetary priorities. Note that regional data often provides too little granularity to expose local inequality.</li>
|
||||
</ul>
|
||||
Two cases illustrate how helpful geographic information can be for different missions:
|
||||
|
||||
> “What we would like to be able to do is pull out ward-level data [...] or very very micro-level data, neighbourhood level, most of the data which is released [in the UK] is Local Authority Level, and that’s just too big for us.” - <strong> Jez Hall of the Participatory Budgeting Unit (UK) </strong>
|
||||
|
||||
Additionally, in the [case study](../../case-studies-budgets/cbga/) from the <strong>Centre for Budget and Governance Accountability (India)</strong>, many of the questions the group strove to answer could have been answered simply by ensuring that the data contained information on which state received the funds (this is pretty high-level information, but it was still unavailable).
|
||||
|
||||
<ul>
|
||||
<li><strong>Information on demographics:</strong> Most policy researchers want to be able to answer questions more specific than per-capita estimates. This makes data such as "Household Surveys" particularly important. They might ask questions such as:
|
||||
<ul>
|
||||
<li>“How many users of a particular service are there?”</li>
|
||||
<li>“How many households benefit from a particular policy?” </li>
|
||||
<li>“Of those households, how many are living below the poverty line?”</li>
|
||||
<li>“What is the income bracket of those people?” </li>
|
||||
<li>“How many young people/women/people with a disability/ people of a specific ethnicity/[...] benefit from a particular policy/programme?”</li>
|
||||
<li>“How much does a particular school place cost in different boroughs or regions?”</li>
|
||||
</ul>
|
||||
<li><strong>Information on the actual goods purchased as part of government funded work.</strong> The majority of questions related to state purchasing require details on the quantity, price and frequency of purchases (e.g. journalists will often want to know "how many computers were purchased?", or even "how many computers were purchased for X amount of money?", rather than "how much was spent on computers?"). By way of illustration of the types of analysis groups need to be able to do, see a recent open data success story: <a href="http://www.bj-hc.co.uk/bjhc-news/news-detail.html?news=2327&lang=en&feed=130">Open Data probe shows NHS statin bill twice what it should be</a>. As another example, the US Medicare programme published a database on prices charged by various hospitals for thousands of the most regular treatments.</li>
|
||||
<li><strong>Economic & Macroeconomic projections</strong> are becoming increasingly important as national fiscal policies are measured against models from international organisations, e.g. the EU deficit thresholds. For both EU member states and EU neighbouring countries, economic reviews by the EU Commission can have a substantial local impact on policy. The public accessibility of macroeconomic governance models has so far not had a prominent place in the debate around these often contested models. However, it is clear that the public should be able to scrutinise the conditions and calculations behind such models.</li>
|
||||
<li><strong>Structured information on the planned pattern of cuts</strong> that could be tied to e.g. particular programmes / geographical area</li>
|
||||
<li><strong>Data showing how much governments / political candidates spend on media advertising</strong> (both through taxpayer funds and from campaign contributions).</li>
|
||||
</ul>
|
||||
This list is clearly not comprehensive; we list here only frequently recurring requests from users.
|
||||
|
||||
## Country-specific requests
|
||||
|
||||
Some countries have formulated their own detailed requests for information and reviews of currently available information, either as part of a public consultation, research or spontaneously:
|
||||
|
||||
* [Romania](https://docs.google.com/spreadsheet/ccc?key=0Anbfx9yMO3c8dGptNHF5aGhjdXdRN2U5aVlEMUJiMmc#gid=0) (In Romanian)
|
||||
* [India](http://www.accountabilityindia.in/accountabilityblog/2241-dating-data-what-are-characteristics-dream-government-data)
|
||||
* [Hungary](http://kmonitor.hu/files/page/OGP_ajanlasok_KM_TASZ.pdf) (In Hungarian)
|
||||
|
||||
See also the [user testimonials](http://community.openspending.org/research/gift/testimonials/) from the earlier report: “Technology for Transparent and Accountable Public Finance.”
|
||||
|
||||
**Up**: [Appendix](../)
|
||||
@@ -0,0 +1,379 @@
|
||||
---
|
||||
lead: true
|
||||
title: How to publish spending data without disclosing personal information
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<p class="c3">
|
||||
<div class="well">by OpenSpending team and Ian Makgill, Ticon</div>
|
||||
|
||||
<p class="c13 c3"><span class="c7"></span>
|
||||
|
||||
<p class="c3"><span>This guide is intended to help governments publish spending data without compromising personal information. It has been drafted with UK local councils and other public authorities in mind: those who wish to publish transactional spending data in accordance with UK regulations, but who are concerned that their data may include personal information (such as personal names or addresses). While we recognise that governmental accounting systems and privacy regulations differ vastly across countries, we think this guide provides key practical advice which should, to some extent, be replicable. </span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h2><span class="c2">Background</span></h2>
|
||||
|
||||
<p class="c3"><span>In January 2013 a freelance data specialist from the community used OpenSpending to identify a number of privacy breaches in an individual dataset published by a local council. This was due to inconsistent redaction of sensitive data by the local authority. Whilst the majority of these payments were to organisations (hence probably not highly sensitive), there were also a few unredacted payments to individuals. The person who uploaded the data immediately notified their local council, who in turn referred this to their audit committee. As a precaution, the dataset in question, the UK Local Council £500 spending data, was taken off the site immediately.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>Data privacy should never be a valid justification for shutting off access to public spending information, as there should be simple processes in place to prevent private data from being published. As more spending data becomes public, government agencies will have to implement release procedures that prevent privacy breaches. For the financial transparency community the data privacy issue represents a challenge, as governments might be tempted to use it as a reason for limiting public disclosures.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h2><span>So why are we writing this guide?</span></h2>
|
||||
|
||||
<ol class="c6" start="1">
|
||||
<li class="c4 c3"><span>because we care about </span><span class="c0"><a class="c1" href="http://blog.okfn.org/2013/02/22/open-data-my-data/">privacy of individual citizens</a></span></li>
|
||||
<li class="c4 c3"><span>because we care about Open Data, which we think is a vital part of making Government transparent and accountable</span></li>
|
||||
<li class="c4 c3"><span>because the presence of personal data within transactional spending data has been identified as a barrier to making such data available to the public. (In April 2013 Copenhagen City Council rejected a Freedom of Information request for 1 million transactions worth EUR 2.5bn on the grounds that the data contained personal data, which could not be removed without extensive use of personnel resources.)</span></li>
|
||||
</ol>
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h2><span class="c2">What are the rules on data privacy and obligations for publishing transactional level spending data?</span></h2>
|
||||
|
||||
<p class="c3"><span>Incidents of privacy breaches highlight the importance of proper procedures to ensure that data from public sector bodies is properly redacted before being published. The UK government produces a </span><span class="c0"><a class="c1" href="http://data.gov.uk/blog/local-spending-data-guidance">guideline document for data publishers</a></span><span>, which ensures that issues like this are prevented and hence very rare.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>The Local Government Association (UK) has published this </span><span class="c0"><a class="c1" href="http://localnewcontracts.readandcomment.com/appendix-c-inclusions-and-exemptions-for-publishing-data-2/">guide</a></span><span>.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h2><span class="c2">Understanding the problem</span></h2>
|
||||
|
||||
<p class="c3"><span>In a broad sense, the law is quite simple: you can’t publish anything that might identify an individual. Complying with the law is less straightforward. It would be nice if we could just search our output files for all the occurrences of "Mr.", "Mrs." or "Miss" and redact accordingly, but personal data is often quite difficult to locate, and successfully suppressing the data requires diligent checking and good organisational practices.</span>
|
||||
|
||||
<p class="c3"><span> </span>
|
||||
|
||||
<p class="c3"><span>To complicate matters, many companies and organisations use personal names as their identifiers. Many companies in the Companies House register include “Mr.” in their name, and there are still more companies with titles that could be confused with personal names.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h2><span class="c2">Where personal information is usually found in spending data</span></h2>
|
||||
|
||||
<p class="c3"><span>The primary source of personal data is the</span><span> “name” field</span><span> of the transaction. Ensuring that this data has been cleansed is likely to resolve most of your potential breaches. However, transactions can at times include privacy-sensitive information in the “description” field of the invoice, which could include a name, case file or social security number. For this reason all columns in the dataset should be analysed.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>Columns to pay special attention to:</span>
|
||||
|
||||
<ol class="c6" start="1">
|
||||
<li class="c4 c3"><span>Name</span></li>
|
||||
<li class="c4 c3"><span>Address</span></li>
|
||||
<li class="c4 c3"><span>Narrative / description</span></li>
|
||||
<li class="c4 c3"><span>Department</span></li>
|
||||
<li class="c4 c3"><span>Category</span></li>
|
||||
</ol>
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span class="c2">Identifying names</span>
|
||||
|
||||
<p class="c3"><span>There are a number of typical indicators that the payment is made to an individual:</span>
|
||||
|
||||
<ol class="c6" start="1">
|
||||
<li class="c4 c3"><span>Use of "Mr", "Mrs", "Miss" etc at the start of the supplier name</span></li>
|
||||
<li class="c4 c3"><span>Use of an initial followed by a name e.g. "D. Harrison"</span></li>
|
||||
<li class="c4 c3"><span>Payment is not associated with an invoice </span></li>
|
||||
<li class="c4 c3"><span>The payment instruction details specify a refund or specifics such as "Direct Payment"</span></li>
|
||||
</ol>
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>It is possible to use a procedure called 'pattern matching' that can highlight any items in a database that match a certain pattern of characters. Using these routines will make it possible to highlight entries that may include personal name data.</span>
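
A minimal sketch of such pattern matching in Python, using the standard `re` module. The patterns cover only two of the indicators listed above (a title at the start of the supplier name, or an initial followed by a surname). The title "Ms" and the helper name `looks_personal` are assumptions added for illustration. Matches are candidates for human review rather than confirmed personal data: a company whose name starts with a title will be highlighted too.

```python
import re

# A title at the very start of the name, optionally followed by a full stop.
TITLE = re.compile(r"^(mr|mrs|miss|ms)\b\.?", re.IGNORECASE)
# A single capital initial followed by one capitalised word, e.g. "D. Harrison".
INITIAL = re.compile(r"^[A-Z]\.?\s+[A-Z][a-z]+$")


def looks_personal(supplier_name):
    """Return True if the supplier name matches a pattern typical of a person."""
    name = supplier_name.strip()
    return bool(TITLE.match(name) or INITIAL.match(name))


# Invented supplier names, for illustration only.
for name in ["Mr J Smith", "D. Harrison", "Acme Supplies Ltd", "Miss Selfridge Ltd"]:
    print(name, "->", "review" if looks_personal(name) else "ok")
```

The last name in the list shows why highlighted entries still need a manual check before anything is redacted.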
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>Globally, countries vary in the extent to which they allow the names of companies and sole proprietorships to be disclosed to the public.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h2 class="c3"><a name="h.sxy4odoqd9n2"></a><span>How to address the issue of personal data?</span></h2>
|
||||
<p class="c3"><span>As mentioned before, the most important field in spending data is the supplier name, as this will most likely contain the most valuable personal data, but publishers need to be aware of the potential for identities to be triangulated from additional data, such as narratives and transaction descriptions. It is therefore necessary to review all fields in a data set.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h3><span class="c2">Step 1: Flagging at source</span></h3>
|
||||
|
||||
<p class="c3"><span>Evaluating the entries in the supplier name field to assess whether the data includes an individual's name is an inefficient and largely ineffective method for flagging personal data breaches. Instead, the most effective method of suppressing publication of this data is to ensure that personal data is flagged as such when payments are made. Every Department in the Council will have a legitimate reason for issuing payments to an individual, so it is advisable to establish an organisation-wide protocol for flagging payments to individuals. Most Councils (UK) have a standard procedure for co-ordinating payments that includes raising a purchase order (PO). Users who generate POs for payments to individuals should simply append a predetermined code to the recipient's name, which can then be picked up by the IT department so that the data can be suppressed before publication.</span>
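
The suppression step could be sketched as follows in Python. The flag `[IND]`, the field name `supplier_name`, the replacement text and the helper `suppress_flagged` are all hypothetical; a real council would substitute its own agreed code and column names.

```python
FLAG = "[IND]"  # hypothetical code appended to the recipient's name at PO stage
REDACTED = "REDACTED PERSONAL DATA"  # hypothetical replacement text


def suppress_flagged(rows, name_field="supplier_name"):
    """Return copies of the rows with flagged recipient names replaced."""
    cleaned = []
    for row in rows:
        row = dict(row)  # copy, so the finance system export is left untouched
        if row[name_field].rstrip().endswith(FLAG):
            row[name_field] = REDACTED
        cleaned.append(row)
    return cleaned


# Invented payment records, for illustration only.
payments = [
    {"supplier_name": "Jane Doe [IND]", "amount": 600},
    {"supplier_name": "Acme Supplies Ltd", "amount": 900},
]
for row in suppress_flagged(payments):
    print(row)
```

Because the flag is set by the person raising the PO, this step does not have to guess whether a name belongs to an individual; it only has to honour the flag consistently.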
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h3>Step 2: Monitoring the Supplier Database</h3>
|
||||
|
||||
<p class="c3"><span>The data used in payment reports will originate from the organisation's finance system, which includes a record of suppliers to whom payments are made. To prevent fraud, there is normally a strict procedure for adding suppliers to this database. This procedure should include a requirement that any personal data is flagged for later suppression, effectively creating a second filter to prevent personal data breaches. It is important to note that this procedure should not form the primary prevention mechanism, as the name in the supplier field is often simply not enough to identify whether a payment is for an individual or not. However, this step should be used to flag any payments that appear to be to an individual.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h3>Step 3: Pre-publication reviews</h3>
|
||||
<p class="c3"><span>All data that is to be published should go through a two-stage pre-publication review. The first stage is an automated review of the data, in which a script selects entries that look like they may include personal data.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>The scripts should be capable of screening for the following:</span>
|
||||
|
||||
<ol class="c6" start="1">
|
||||
<li class="c4 c3"><span>Pre-determined flags that show that a payment is to an individual.</span></li>
|
||||
<li class="c4 c3"><span>Common patterns that appear in names (e.g. titles such as "Miss").</span></li>
|
||||
<li class="c4 c3"><span>Names of payees that are known to the Council (e.g. they have been identified as recipients of personal payments before).</span></li>
|
||||
<li class="c4 c3"><span>Any specific funding codes that are likely to indicate that a payment is going to an individual (e.g. Social Care Direct Payments).</span></li>
|
||||
</ol>
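<p class="c3"><span>The four checks listed above could be combined in a small script along the following lines. This is a minimal sketch in Python, not a reference implementation: the flag code, the list of titles, the known-payee list, the funding code and the field names <code>supplier</code> and <code>funding_code</code> are all hypothetical and would need to be replaced with the values used in the Council's own finance system.</span></p>

```python
import re

# All values below are hypothetical; each Council would define its own.
INDIVIDUAL_FLAG = "[IND]"              # code appended to the name when the PO is raised
TITLE_PATTERN = re.compile(r"\b(Mr|Mrs|Miss|Ms|Dr)\b", re.IGNORECASE)
KNOWN_INDIVIDUALS = {"jane example"}   # payees identified in earlier reviews
PERSONAL_FUNDING_CODES = {"SC-DP"}     # e.g. a code for Social Care Direct Payments


def screen(row):
    """Return the reasons why a payment row may be a payment to an individual."""
    reasons = []
    supplier = row.get("supplier", "")
    if INDIVIDUAL_FLAG in supplier:
        reasons.append("flag")
    if TITLE_PATTERN.search(supplier):
        reasons.append("pattern")
    # Compare the name without the flag, ignoring case, against known payees.
    cleaned = supplier.replace(INDIVIDUAL_FLAG, "").strip().lower()
    if cleaned in KNOWN_INDIVIDUALS:
        reasons.append("known")
    if row.get("funding_code") in PERSONAL_FUNDING_CODES:
        reasons.append("funding_code")
    return reasons


def needs_manual_review(reasons):
    """Rows selected only by a pattern or funding code still need a human check."""
    return bool(reasons) and not ({"flag", "known"} & set(reasons))
```

<p class="c3"><span>Returning the reasons, rather than a simple yes or no, lets reviewers see why each entry was selected.</span></p>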
|
||||
<p class="c3 c13"><span></span>
|
||||
|
||||
<p class="c3"><span>Once data has been selected, it should be reviewed manually to confirm whether the data provides sufficient information to identify an individual. There is no need to manually review data that has already been flagged as a payment to an individual by the Department making the payment, or that has previously been identified as such through earlier work to prevent breaches. Data that has been selected because it triggered a pattern match or a funding code should be checked manually.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>Once a data line has been identified as a payment to an individual, the key pieces of text should be stored as a record that allows the Council to suppress that data in the future (see step 3 above) and that can be used in the automated flagging procedures for ensuing months.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>A further manual review needs to be undertaken to ensure that personal payments are not missed. Typically, payments to individuals will involve small sums (relative to the amounts paid to companies) in a small number of transactions. Therefore, order the data by transaction value, lowest first, and then look through the payment lines to identify any payments to individuals. Reviewers should be aware of the potential for foreign names to appear in the text, and steps should be taken to ensure that a foreign language review is undertaken where necessary. Although this work sounds onerous, it is in fact a very small task; a regular monthly review should occupy just minutes of staff time, not hours.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h3>Step 4: Removing data</h3>
|
||||
|
||||
<p class="c3"><span>You should never delete whole rows of data; instead, you should overwrite any data that might constitute a data breach. In particular, there should be no reason for removing either date or value fields from a transaction, as these cannot be used to identify an individual. Additional data such as department and narrative information should only be overwritten if it contains data that could identify an individual. Councils should also take steps to explain why the data has been censored. For example, it would be suitable to replace a person's name with the following: "Redacted to comply with the Data Protection Act". Providing this additional information gives the data user a good understanding of the nature of the underlying data and shows data consumers that the Council is undertaking its role as a data producer responsibly.</span>
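<p class="c3"><span>As an illustration of overwriting rather than deleting, the following Python sketch keeps every row and replaces only the identifying field. The field name <code>supplier</code> and the <code>is_personal</code> check are assumptions made for the example; a real data set may need other fields (such as the narrative) redacted as well.</span></p>

```python
REDACTION_NOTICE = "Redacted to comply with the Data Protection Act"


def redact(row, fields=("supplier",)):
    """Return a copy of the row with the named fields overwritten, never removed."""
    redacted = dict(row)
    for field in fields:
        if field in redacted:
            redacted[field] = REDACTION_NOTICE
    return redacted


def prepare_for_publication(rows, is_personal):
    """Keep every row; overwrite identifying fields only where is_personal(row) is true."""
    return [redact(row) if is_personal(row) else dict(row) for row in rows]
```

<p class="c3"><span>Because dates and values are left untouched, totals calculated from the published file still match the Council's own figures.</span></p>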
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h2 class="c3"><a name="h.wi2gl3oiyddl"></a><span>What not to do: The use of overly restrictive disclosure procedures</span></h2>
|
||||
<p class="c3"><span>Data privacy issues should never act as justification for avoiding public disclosure. An example of this might be the suppression of spend data for a Barrister on the grounds that the data could be used to identify the individual. Whilst it is right to suppress personal data, the Barrister will be a member of a Chamber of Barristers, and any transaction could be allocated to the Chambers rather than to the individual Barrister. The same applies to payments to Doctors: payments should be allocated to the Doctor's practice, not to the individual Doctor concerned.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>Ticon has noticed a worrying trend of Councils redacting data on the basis that the payment was made to a 'sensitive supplier', or that the transaction was 'commercially sensitive'. The LGA cites the issues of arbitration, commercial confidence and transactions relating to the underwriting of debt as suitable reasons to redact spend information (see below).</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<a href="#" name="db587461f48ca9d947b08d54b3901ee8dd196ccf"></a><a href="#" name="0"></a>
|
||||
<table border="1" cellpadding="0" cellspacing="0" class="c58">
|
||||
<tbody>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span class="c2">No</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span class="c2">Examples of transactions that may be excluded from publication</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c3"><span class="c2">Reason</span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span class="c2">Redacted or Excluded</span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span>1</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span>Salary payments to staff (including bonuses), except when published under the senior salary scheme. These will be published separately.</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c3"><span>Personal information protected by the Data Protection Act</span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span>Excluded</span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span>2</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span>Pension contributions (excluding service charge) and National Insurance Contributions</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c3"><span>Personal information protected by the Data Protection Act</span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span>Excluded</span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span>3</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span>Severance payments</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c3"><span>Personal information protected by the Data Protection Act</span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span>Excluded</span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span>4</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span>Payments to individuals from legal process - compensation payments, legal settlements, fraud payments</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c3"><span>Personal information protected by the Data Protection Act</span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span>Redacted </span>
|
||||
|
||||
<p class="c3"><span>(in exceptional cases exclude the data)</span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span>5</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span>Competition prizes – where a normal part of operations</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c3"><span>Personal information protected by the Data Protection Act</span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span>Redacted</span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span>6</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span>Settlements made with companies as an arbitration which is conditional on confidentiality</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c3"><span>Commercial-in-confidence – exempt under FOI</span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span>Redacted</span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span>7</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span>Potential betrayal of a commercial confidence, or prejudice to a legitimate commercial interest</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c3"><span>Very rare and will need to be justified</span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span>Redacted</span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span>8</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span>Transactions relating to the financing or underwriting of debt e.g. purchase of credit default swaps</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c3"><span>Outside the definition of expenditure for this purpose</span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span>Excluded</span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="c25">
|
||||
<td class="c38">
|
||||
<p class="c3"><span>9</span>
|
||||
|
||||
</td>
|
||||
<td class="c29">
|
||||
<p class="c3"><span>Provisions or promises to pay not yet realised</span>
|
||||
|
||||
</td>
|
||||
<td class="c17">
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
</td>
|
||||
<td class="c35">
|
||||
<p class="c3"><span>Excluded </span>
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>If Councils are to be genuinely open about their activities, there should be no suppression of payments to a commercial entity listed in Companies House (or any other Companies register). Individual companies can choose to protect their tender submissions and other commercial data that they send to the Council from release under the Freedom of Information Act. However, the simple fact of a payment should never be seen as commercially sensitive. Whilst the suppression of data on payments to companies as part of settlements is a common legal practice, Councils should do their utmost to resist activity which is so protectionist and undemocratic. Perhaps financing payments should be recorded separately, but it is hard to justify their exclusion from open publication.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h3>Privacy for farmers receiving farm subsidies in the EU</h3>
|
||||
|
||||
<p class="c3"><span>In 2010 the European Court of Justice ruled that mandatory disclosure of the names of </span><span class="c0"><a class="c1" href="http://www.ft.com/intl/cms/s/0/16973ef0-ec2d-11df-9e11-00144feab49a.html#axzz2KIkhha4O">German recipients of EU farm subsidies</a></span><span> amounted to a breach of their privacy. The ECJ decision has not been overturned since and has led to a substantial decrease in access to data on overall EU spending, due to the large share of the total budget that the Common Agricultural Policy (CAP) occupies. Much like the Barristers, these farms are commercial entities and should be named on each transaction.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<p class="c3"><span>Whilst the need to protect individuals in receipt of non-commercial payments from Government (e.g. Housing Benefit or Pensions payments) should be recognised, Open Spending believes that all commercial payments should be published openly. If individuals who receive monies from Government in exchange for services, or as part of a grant for a commercial enterprise, need to remain anonymous, they can always choose to reject that payment.</span>
|
||||
|
||||
<p class="c13 c3"><span></span>
|
||||
|
||||
<h3>Summary</h3>
|
||||
|
||||
<p class="c3"><span>The world is just getting used to the existence of open spending data, but as the data attracts increased usage, governments will come under greater pressure to create dependable, consistent and accurate datasets. Now is the time to ensure that your data is correctly presented, is free of data that may breach regulations, and can be used by organisations like openspending.org.</span>
|
||||
|
||||
**Next**: [Other handy datasets](./other-handy-datasets)
|
||||
|
||||
**Up**: [Appendix](../)
|
||||
@@ -0,0 +1,268 @@
|
||||
---
|
||||
lead: true
|
||||
title: Tool Ecosystem
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
## Spending Data: The Tool Ecosystem
|
||||
|
||||
There is a set of staple tools that can be used to tackle many of the issues highlighted by the organisations in this report. For each one, we’ve outlined what it’s useful for and what the barrier to entry is.
|
||||
|
||||
We continue to hunt for more and better tools to do the job and hope that some of the problems, such as governments publishing their data in PDFs or HTML, will soon be irrelevant, so that we can all focus on more important things.
|
||||
|
||||
*If you would like to suggest a tool to be added to this ecosystem - please email info [at] openspending.org*
|
||||
|
||||
### Key
|
||||
|
||||
For each tool, we’ve outlined its use and what the barrier to entry is. Here's a guide to the rough categorisation we used:
|
||||
|
||||
<strong>Basic = An off-the-shelf tool that can be learned and used independently within 1 day. No installation on servers etc. required.</strong>
|
||||
|
||||
<strong>Intermediate = Between 1 day - 1 week to master basic functionality. May require tweaking of code but not new creation thereof. </strong>
|
||||
|
||||
<strong>Advanced = Requires code. </strong>
|
||||
|
||||
## Stage 1: Extracting and getting data
|
||||
|
||||
<table border="1">
|
||||
<tr>
|
||||
<td><strong>Issue</strong></td>
|
||||
<td><strong>Tools</strong></td>
|
||||
<td><strong>Level</strong></td>
|
||||
<td><strong>Notes</strong></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Data not available</td>
|
||||
<td>Freedom of Information Portals (e.g. <a href="https://www.whatdotheyknow.com/">What Do They Know</a>, <a href="https://fragdenstaat.de/">Frag den Staat</a>).</td>
|
||||
|
||||
<td>Basic - though some education may be required to inform people that they have the right to ask, how to phrase an FOI request, whether it is possible to submit these requests electronically etc.</td>
|
||||
<td>While Freedom of Information portals are a good way of getting data, results often end up scattered. It would be useful to have results structured into data directories, so that successful responses could be searched together with proactively released data and there was one common source for data.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Data available online but not downloadable. (e.g. in HTML tables on webpages).</td>
|
||||
<td> <em>For simple sites</em> (information on an individual webpage) Google Spreadsheets and ImportHTML Function, or the <a href ="https://chrome.google.com/webstore/detail/scraper/mbigbapnjcgaffohmbkdlecaccepngjd?hl=en">Google scraper extension</a> (basic).
|
||||
<em>For more complex webpages</em> (information spread across numerous pages) - a scraper will be required. Scrapers are ways to extract structured information from websites using code. There is a useful online tool that makes this easier - <a href="https://scraperwiki.com/">ScraperWiki</a> (advanced).
|
||||
|
||||
</td>
|
||||
<td>For the basic level, anyone who can use a spreadsheet and functions can use it. It is not, however, a well-known command and awareness must be spread about how it can be used. (People are often daunted because they presume scraping involves code.) Scraping using code is advanced, and requires knowledge of at least one programming language.</td>
|
||||
<td>
|
||||
The need to be able to scrape was mentioned in <em>every</em> country we interviewed in the Athens to Berlin Series.
|
||||
|
||||
For more information, or to learn to start scraping, see the <a href="http://schoolofdata.org/handbook/courses/scraping/">School of Data course on Scraping</a>.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Data available only in PDFs (or worse, images)</td>
|
||||
<td>A variety of tools are available to extract this information. The most promising non-code options are <a href="http://finereader.abbyy.com/">ABBYY Finereader</a> (not free) and <a href="http://tabula.nerdpower.org/">Tabula</a> (new software, still a bit buggy, and users need to be able to host it themselves).</td>
|
||||
<td>Most require knowledge of coding, though some progress is being made on non-technical tools. For more information, and to see some of the advanced methods, see the <a href="http://schoolofdata.org/handbook/courses/extracting-data-from-pdf/">School of Data course</a>.</td>
|
||||
<td>
|
||||
Note: these tools are still imperfect and it is still vastly preferable to advocate for data in the correct formats, rather than teach people how to extract.
|
||||
|
||||
Recently published <a href="https://www.gov.uk/service-manual/design-and-content/choosing-appropriate-formats.html">guidelines</a> coming directly from government in the UK and US can now be cited as examples to get data in the required formats.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Leaked data</td>
|
||||
<td>Several projects made use of secure dropboxes and services for whistleblowers. </td>
|
||||
<td>Advanced - security of utmost concern.</td>
|
||||
<td>For example: <a href="http://atlatszo.hu/magyarleaks/">MagyarLeaks</a></td>
|
||||
</tr>
|
||||
</table>
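
As a rough illustration of what "a scraper" means for the simplest case in the table above (data sitting in an HTML table on a single page), here is a minimal sketch in Python using only the standard library. The class and function names are our own, and the sketch assumes well-formed markup in which every row and cell has a closing tag; real pages are often messier and may call for a dedicated scraping library.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every cell in every table row of an HTML page."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows
```

The resulting rows can be written out with Python's `csv` module, which gives a file that opens directly in a spreadsheet.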
|
||||
## Stage 2: Cleaning, Working with and Analyzing Data
|
||||
|
||||
<table border="1">
|
||||
<tr>
|
||||
<td><strong>Issue</strong></td>
|
||||
<td><strong>Tools</strong></td>
|
||||
<td><strong>Level</strong></td>
|
||||
<td><strong>Notes</strong></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Messy data, typos, blanks (various)</td>
|
||||
<td>Spreadsheets, <a href="http://openrefine.org/">OpenRefine</a>, powerful text editors (e.g. <a href="http://www.barebones.com/products/textwrangler/">TextWrangler</a>) plus knowledge of regular expressions.</td>
|
||||
<td>Basic -> Advanced</td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Need to reconcile entities against one another to answer questions such as, "what is company X?", "Is company X Ltd. the same as company X?" (ditto for other types of entities e.g. departments, people).</td>
|
||||
<td><a href="http://nomenklatura.okfnlabs.org/">Nomenklatura</a>, <a href="http://opencorporates.com/">OpenCorporates</a>, <a href="http://publicbodies.org/">publicbodies.org</a></td>
|
||||
<td>Advanced (all)</td>
|
||||
<td>
|
||||
Reconciling entities is complicated, both because of the tools needed and because of the often inaccurate state of the data.
|
||||
|
||||
Working with data without common identifiers and data of poor quality makes entity reconciliation highly complicated and can cause big gaps in analysis.
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Need to be able to conceptualize networks and relationships between entities (See dedicated section on Network Mapping below).</td>
|
||||
<td>Gephi</td>
|
||||
<td>Intermediate - advanced.</td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> Need to be able to work with very many lines of data (too many to fit in Excel).</td>
|
||||
<td> OpenSpending.org, other database software (PostgreSQL, MySQL), command line tools</td>
|
||||
<td> OpenSpending.org - easy for basic upload, search and interrogation. In OpenSpending and other databases, some advanced queries may require knowledge of coding. </td>
|
||||
<td> Note: As few countries currently release transaction-level data, this is not a frequent problem, but it is already an issue in places such as Brazil, the US and the UK. As we push for greater disclosure, this capability will be needed ever more.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Performing repetitive tasks or modelling </td>
|
||||
<td>Macros - Excel</td>
|
||||
<td>Basic - Intermediate.</td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Entity Extraction (e.g. from large bodies of documents) </td>
|
||||
<td> <a href="http://www.opencalais.com/">Open Calais</a>, <a href="http://developer.yahoo.com/contentanalysis/">Yahoo/YQL Content Analysis API</a>, <a href="http://openup.tso.co.uk/des">TSO data enrichment service</a></td>
|
||||
<td> Intermediate</td>
|
||||
<td> This is far from a perfect method and it would be vastly easier to answer questions relating to entities if they were codified by a unique identifier. </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Analysis needs to be performed on datasets that are published in different languages (e.g. in India) </td>
|
||||
<td>To some extent: Google Translate for web based data.</td>
|
||||
<td>Basic</td>
|
||||
<td>Still searching for a solution to automatically translate offline spreadsheets. </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> Figures change in data after publication </td>
|
||||
<td>
|
||||
For non-machine readable data - tricky.
|
||||
|
||||
For simple, machine readable file formats, such as CSV - version control is a possibility.
|
||||
|
||||
For web-based data - some scrapers can be configured to trigger an alert (e.g. email someone) whenever a field changes.</td>
|
||||
|
||||
<td> Intermediate to advanced </td>
|
||||
<td> Future projects that are likely to tackle this problem: <a href="http://dansinker.com/post/49856260511/opennews-code-sprints-do-some-spring-cleaning-on-data">DeDupe</a>.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> Finding statistical patterns in spending data (such analysis depends on high data quality) </td>
|
||||
<td> R (free), SPSS (proprietary) and other statistical software for clustering and anomaly detection (also see note). </td>
|
||||
<td> Advanced </td>
|
||||
<td> Examples: Data from Supervizor has been used to track how spending on contractors changes when the government changes.
|
||||
(<a href="https://www.kpk-rs.si/en/project-transparency/supervizor-73">Supervizor</a>.)
|
||||
<em>A note on statistical analysis software can be found below.</em></td>
|
||||
</tr>
|
||||
</table>
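
The table above suggests version control for spotting figures that change after publication in simple formats such as CSV. Where a full version control system is not to hand, the same idea can be sketched in a few lines of Python: compare two downloads of the same file and list every field that differs. The column name `transaction_id` is an assumption; any column that uniquely identifies a row would do, and rows added or removed between versions are deliberately ignored here.

```python
import csv
import io


def load(csv_text, key):
    """Index the rows of a CSV file by a column that uniquely identifies each row."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def changed_rows(old_text, new_text, key="transaction_id"):
    """Return (row id, field, old value, new value) for every field that differs."""
    old, new = load(old_text, key), load(new_text, key)
    changes = []
    # Only rows present in both versions are compared.
    for row_id in sorted(old.keys() & new.keys()):
        for field, old_value in old[row_id].items():
            new_value = new[row_id].get(field)
            if new_value != old_value:
                changes.append((row_id, field, old_value, new_value))
    return changes
```
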
|
||||
<strong>Note on SPSS and R:</strong> Our impression is that most interviewees had been trained to use SPSS. R is nevertheless important to mention, as it offers free access to a broad selection of the same models, though through a programming interface.
|
||||
|
||||
A few examples of analysis on spending data, which can be done with statistical software such as SPSS or R:
|
||||
|
||||
|
||||
<strong>a)</strong> <a href="http://en.wikipedia.org/wiki/Hidden_Markov_model">Hidden Markov</a>: Hidden Markov models are widely used for finding patterns in sequences, for instance in speech recognition and bioinformatics, and have turned out to be useful for predicting fraudulent and corrupt behaviour. Using them requires high-quality data; they were, for instance, used to analyse spending data from 50 million transactions in the Slovenian platform Supervizor.
|
||||
|
||||
|
||||
<strong>b)</strong> <a href="http://en.wikipedia.org/wiki/Benford%27s_law">Benford's law</a>: Benford's law describes how often each leading digit is expected to occur in many naturally occurring sets of figures, so the distribution in your data can be compared against how it should look. Deviations from the expected distribution can help detect fraudulent reporting (e.g. if companies tend to report earnings of less than $500 million in order to fit a particular regulation, Benford’s law could be a tool to detect that). Check this example using Benford’s law to test the release of all <a href="http://friism.com/tax-records-for-danish-companies">Danish corporate tax filings</a> and check this <a href="http://friism.com/tax-records-for-danish-companies">R blog post on the topic</a>.
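
To make the Benford's law check concrete, here is a minimal sketch in Python rather than SPSS or R. Under Benford's law the expected share of values with leading digit d is log10(1 + 1/d); the sketch computes that expectation and the observed shares in a list of amounts. The function names are our own, and a real analysis would add a statistical test on top of the comparison rather than eyeballing the differences.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose first significant digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(value):
    """First significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit first, e.g. '4.2...e+02'.
    return int(f"{abs(value):.15e}"[0])


def leading_digit_shares(values):
    """Observed share of each leading digit 1-9 among the non-zero values."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("no non-zero values to analyse")
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}
```

Comparing `leading_digit_shares(amounts)[d]` with `benford_expected(d)` for each digit shows where a dataset departs from the expected pattern; large gaps are a prompt for closer inspection, not proof of fraud.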
|
||||
|
||||
Finally, a few notes on the differences between SPSS and R: though SPSS is fairly easy to get started with, it can be difficult to collaborate around, as it uses its own SPSS data format. Some models may also be unavailable in the basic SPSS package. R is the free alternative; it uses a programming interface, all extensions are accessible, and community support and code samples are widely available. One possible compromise, bridging the convenience of SPSS and the wide usability of R, is the proprietary software <a href="http://www.revolutionanalytics.com/">Revolution R</a>.
|
||||
|
||||
## Stage 3: Presenting Data
|
||||
|
||||
<table border="1">
|
||||
<tr>
|
||||
<td><strong>Issue</strong></td>
|
||||
<td><strong>Tools</strong></td>
|
||||
<td><strong>Level</strong></td>
|
||||
<td><strong>Notes</strong></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> Basic visualisation, time series, bar charts </td>
|
||||
<td> <a href="http://datawrapper.de/">DataWrapper</a>, <a href="http://www.tableausoftware.com/public/">Tableau Public</a>, <a href="http://www-958.ibm.com/software/analytics/manyeyes/">Many Eyes</a>, Google Tools </td>
|
||||
<td> Basic</td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> More advanced visualisation</td>
|
||||
<td> <a href="http://d3js.org/">D3.js</a> </td>
|
||||
<td> Advanced </td>
|
||||
<td> Used in e.g. <a href="http://openbudgetoakland.org/2012-2013-sankey.html">OpenBudgetOakland</a> </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> Mapping </td>
|
||||
<td> <a href="http://www.mapbox.com/tilemill/">TileMill</a>, <a href="http://www.google.com/drive/apps.html#fusiontables">Fusion Tables</a>, <a href="http://kartograph.org/">Kartograph</a>, <a href="http://www.qgis.org/">QGIS</a> </td>
|
||||
<td> Basic -> Advanced </td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> Creating a citizen’s budget </td>
|
||||
<td> OpenSpending.org, off-the-shelf tools listed above. Disqus commenting module added to OpenSpending for commenting and feedback.</td>
|
||||
<td> OpenSpending.org - making a custom visualisation - basic. Making a custom site enabling discussion - advanced. </td>
|
||||
<td> Used in e.g. <a href="http://openbudgetoakland.org/2012-2013-sankey.html">OpenBudgetOakland</a> </td>
|
||||
</tr>
|
||||
</table>
|
||||
## Publishing Data
|
||||
|
||||
<table border="1">
|
||||
<tr>
|
||||
<td><strong>Issue</strong></td>
|
||||
<td><strong>Tools</strong></td>
|
||||
<td><strong>Level</strong></td>
|
||||
<td><strong>Notes</strong></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> Need a place online to store and manage raw data, especially from Freedom of Information requests. </td>
|
||||
<td> DataNest, CKAN, Socrata - various Data Portal Software options. </td>
|
||||
<td> Basic to use. </td>
|
||||
<td> Can require a programmer to get running and set up a new instance. </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Individual storage of and online collaboration around datasets</td>
|
||||
<td> Google Spreadsheets, Google Fusion Tables, Github </td>
|
||||
<td> Google Spreadsheets and Google Fusion Tables - basic. GitHub - intermediate. </td>
|
||||
<td> </td>
|
||||
</tr>
|
||||
</table>
|
||||
### Notes
|
||||
|
||||
See also the resources section in the <a href="http://openspending.org/resources/handbook/ch014_resources.html">Spending Data Handbook</a>
|
||||
|
||||
Note: Many of these tools will have difficulty working on Internet Explorer (especially older versions), but we have yet to find more powerful tools which also work there.
|
||||
|
||||
## A note on Network Analysis
|
||||
|
||||
As you will see from the case studies in the videos, Network Analysis is an area that more and more people are looking into with regard to public procurement and contracts.
|
||||
|
||||
Network visualisations are commonly used as a solution to this problem. However, we offer a note of caution: use them sparingly. Because of the amount of data on which they are often used, they can be overwhelming, and the average non-expert can find them hard to interpret.
|
||||
|
||||
Often the types of information that it is possible to extract from a network visualisation, e.g. “who is best connected?” or “are there links between person X and person Y?”, could more easily be found with a searchable database of connections.
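
As a small illustration of that point, both questions can be answered from a plain list of connections without drawing anything. The sketch below is in Python; the function names and the idea of storing each connection as a pair of names are assumptions made for the example, and it only looks at direct links.

```python
from collections import defaultdict


def build_index(connections):
    """Map each entity to the set of entities it is directly connected to."""
    index = defaultdict(set)
    for a, b in connections:
        index[a].add(b)
        index[b].add(a)
    return index


def best_connected(index, top=3):
    """Entities ordered by number of direct connections, most connected first."""
    # Ties are broken alphabetically so the result is stable.
    return sorted(index, key=lambda entity: (-len(index[entity]), entity))[:top]


def are_linked(index, x, y):
    """True if x and y are directly connected."""
    return y in index.get(x, set())
```
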
|
||||
|
||||
It may also be wise to separate tools suitable for investigating the data from tools used to present the data. In the latter case, clarity and not overloading the visualisation will most likely yield a clearer result, so this is one area in which custom infographics may win out in terms of delivering value.
|
||||
|
||||
### Existing solutions for network mapping:
|
||||
|
||||
For producing network visualisations there are currently open source solutions:
|
||||
|
||||
* [Gephi](https://gephi.org/) (Again note that Gephi has non-visualisation functions to explore the data, which at times may be more useful in exploring the interconnections than the visualisations themselves).
|
||||
* [Mapa 76](http://mapa76.com/) - This is also interesting due to the function which is being developed to extract individual entities.
|
||||
* [RelFinder](http://www.visualdataweb.org/relfinder.php) - Based on DBpedia, this tool structures and maps out relations between entities based on which articles they feature in on Wikipedia.
|
||||
* Google Fusion Tables has a network function
|
||||
* NodeXL is a powerful network toolkit for Excel.
|
||||
* [Cytoscape](http://www.cytoscape.org/)
|
||||
|
||||
## Some favourite examples of (non) Network ways of presenting hierarchies, relationships and complex systems:
|
||||
|
||||
* [Connected China (Reuters)](http://connectedchina.reuters.com/) - enables the user to easily see family connections, political coalitions, leaders and connections. Additionally - it gives a detailed organisational diagram of the Communist Party of China, as well as timelines of people’s rise to power.
|
||||
* [Little Sis](http://littlesis.org/). This is an American database of political connections, including party donations, career histories and family members. Read their About Page for more details of the questions they seek to answer.
|
||||
|
||||
### Further reading:
|
||||
|
||||
<ul>
|
||||
<li><a href="http://www.ucl.ac.uk/secret/events/event-tabbed-box/seminars-accordian/social-network">UCL Materials</a></li>
|
||||
<li> <a href="http://www.cgi.com/sites/cgi.com/files/white-papers/Implementing-social-network-analysis-for-fraud-prevention.pdf">CGI Materials</a></li>
|
||||
</ul>
|
||||
<ul>
|
||||
<li>A pipeline for local councils to address privacy concerns about publishing transaction-level data. In the UK, despite clear guidelines about what should be removed from data before publication, a few councils have published sensitive data over the past year. Some companies are looking at maintaining suppression lists for this data; ideally, however, this should be done within government, prior to publication, so workflows need to be developed for it. </li>
|
||||
<li>Tools to help spot absence of publication as it happens. Currently, civil-society led initiatives such as the Open Budget Survey can only monitor publication of key budget documents retrospectively, and using large amounts of people power. There are a couple of possibilities which spring to mind:</li>
|
||||
<ul>
|
||||
<li>In the UK - the OpenSpending team have been working with the team of data.gov.uk to <a href="http://community.openspending.org/2012/09/uk-departmental-government-spending-improving-the-quality-of-reporting/">produce automated reports</a> to help those enforcing transparency obligations to see which departments are not compliant with said regulations. The reports check both timeliness as well as structure and format of the data. This proved very successful at prompting data release initially - departments were given advance warning that the tool would be featured and any departments without up to date data would be flagged up in red; by launch date, nearly all departments had updated data. This is possible where:</li>
|
||||
<ul>
|
||||
<li>The data are published via a central platform (e.g. data.gov.uk)</li>
|
||||
<li>The data are machine readable, so a computer program can quickly ascertain whether the required fields are present.</li>
|
||||
<li>There is a standard layout for the data, so a computer can quickly verify whether column headings are correct and all present.</li>
|
||||
</ul>
|
||||
<li>Introducing a calendar of expected dates of publication of a particular dataset so that organisations could know when a document is expected to be published and enforce that it is. This could be done either on a country by country basis, or simply by aligning with internationally recognised, best practice guidelines.</li>
|
||||
</ul>
|
||||
<li>Tools which help to remove duplication of effort. For example, if one organisation has already cleaned up or extracted data from a PDF, encouraging them to share that data so another organisation does not have to waste time doing the same. </li>
|
||||
</ul>
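The automated checks described in the list above (are the required column headings present, and is the latest release recent enough?) can be sketched in a few lines. This is a minimal illustration only, not the code behind the data.gov.uk reports: the column names in `REQUIRED_COLUMNS` and the 60-day threshold are hypothetical and would need to be replaced by whatever the relevant publication guidance actually specifies.

```python
import csv
import io
from datetime import date

# Hypothetical required headings; real guidance will define its own list.
REQUIRED_COLUMNS = ["Department", "Date", "Supplier", "Amount"]


def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required headings that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {heading.strip().lower() for heading in header}
    return [column for column in required if column.lower() not in present]


def is_overdue(last_published, today, max_age_days=60):
    """True if the latest release is older than the (assumed) allowed age."""
    return (today - last_published).days > max_age_days


def compliance_report(csv_text, last_published, today):
    """Combine the structure check and the timeliness check into one record."""
    missing = missing_columns(csv_text)
    overdue = is_overdue(last_published, today)
    return {
        "missing_columns": missing,
        "overdue": overdue,
        "compliant": not missing and not overdue,
    }
```

A check like this only works under the three conditions listed above: without a central platform there is nowhere to fetch the files from, and without a machine-readable, standard layout there is no header row to compare against.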
|
||||
**Next**: [Common arguments against publishing data](./machinereadfaq/)
|
||||
|
||||
**Up**: [Appendix](../)
|
||||
@@ -0,0 +1,69 @@
|
||||
---
|
||||
lead: true
|
||||
title: Centre for Public Interest Advocacy
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<div class="well">After meeting the team from the Public Interest Advocacy Centre (CPI) at
|
||||
the POINT conference in Sarajevo in 2012, a proposal was launched to
|
||||
start the Budzeti.ba project (<a href="http://budzeti.ba/">beta version</a>). In November 2012, the project had a kick-off workshop to get
|
||||
the ball rolling and cover some of the key data-wrangling issues faced
|
||||
by CSOs wanting to create a budget monitoring site. </div>
|
||||

|
||||
|
||||
This workshop happened shortly after the [Spending Data Handbook](http://community.openspending.org/research/handbook/) sprint and was a great opportunity to do a test run on the material.
|
||||
|
||||
## What we covered
|
||||
|
||||
Some of the topics we covered were:
|
||||
|
||||
* An introduction to [DataWrapper](http://datawrapper.de/) for making simple charts and web visualisations
|
||||
* [Kartograph](http://kartograph.org/) for making elegant maps
|
||||
* Scraping using [ScraperWiki](http://scraperwiki.com/)
|
||||
* Using Optical Character Recognition to get data out of PDFs
|
||||
* Cleaning data using [Google Refine](http://code.google.com/p/google-refine/)
|
||||
|
||||
Also present at this workshop were the team from [Expert Grup in Moldova](../expert-grup/), prior to the launch of their Budget Stories project.
|
||||
|
||||
## Status
|
||||
|
||||
The Budzeti.ba project is still in progress and is due for a full launch
|
||||
in Autumn 2013. The aim of the project is to provide a one-stop shop for
|
||||
budget information in a format which is accessible for citizens.
|
||||
|
||||
## Comprehensibility of budget data
|
||||
|
||||
The training in Bosnia was a trigger to build on the material in the Spending Data Handbook by developing some guidelines on how to make data published by governments more accessible for citizens. In practice, this is often a process of simplifying and aggregating large datasets so as not to overwhelm the viewer. One of the key methods used to produce [Where Does My Money Go?](http://wheredoesmymoneygo.org) was to think carefully about how we aggregated the data and to opt for functional classifications.
|
||||
|
||||
### Demanding data in functional classifications: why and what's difficult?
|
||||
|
||||
1. Countries such as Bosnia [do not publish Citizen’s
|
||||
Budgets](http://survey.internationalbudget.org/#profile/BA) in the
|
||||
first place. This means that a functional classification has to come entirely
|
||||
from civil society, leading to worries from the CSO that the
|
||||
interpretation of the data may be contested.
|
||||
2. Some countries do not group their data by functional
|
||||
classifications. This is important, as the average citizen is more
|
||||
likely to want to know what money is spent on (i.e. what services
|
||||
they got in exchange for their tax money) than, for example, which
|
||||
government department is spending the money, which is all it is
|
||||
possible to infer from many budgets.
|
||||
3. For the purposes of visualisations such as OpenSpending's,
|
||||
organisations such as CPI must classify the complex information
|
||||
contained in budgets for themselves in a form which is accessible
|
||||
and yet not overwhelming for citizens. There are practical
|
||||
implications to this. Having more than 10 top-level
|
||||
items in a budget, for example, results in a very cramped visualisation, and there are only so many categories a person can take in at
|
||||
any one time. For visualisation purposes, the CPI team
|
||||
classified the data in a schema similar to the internationally
|
||||
recognised [Classifications of Functions of Government](http://unstats.un.org/unsd/cr/registry/regcst.asp?Cl=4), as was used
|
||||
in the UK project [Where Does My Money Go?](http://wheredoesmymoneygo.org/).
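The aggregation described in point 3 can be sketched roughly as follows. This is an illustrative sketch, not the CPI team's actual method: the budget line names and the mapping in `FUNCTION_OF` are hypothetical, and the function labels are borrowed from the top-level divisions of the Classification of the Functions of Government. Budget lines are first summed by function, and the smallest functions are then folded into a single "Other" item so that the visualisation never shows more than a chosen number of top-level items.

```python
from collections import defaultdict

# Hypothetical mapping from budget lines to functional categories.
FUNCTION_OF = {
    "Ministry of Health": "Health",
    "Hospital fund": "Health",
    "Ministry of Education": "Education",
    "Police": "Public order and safety",
}


def aggregate_by_function(lines, function_of, fallback="Unclassified"):
    """Sum (name, amount) budget lines by the function each name maps to."""
    totals = defaultdict(int)
    for name, amount in lines:
        totals[function_of.get(name, fallback)] += amount
    return dict(totals)


def cap_top_level(totals, max_items=10, other_label="Other"):
    """Keep the largest categories and fold the remainder into one item."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= max_items:
        return dict(ranked)
    kept = ranked[: max_items - 1]
    remainder = sum(amount for _, amount in ranked[max_items - 1 :])
    return dict(kept + [(other_label, remainder)])
```

Folding small items into "Other" keeps the chart readable, but it also hides detail, so the underlying mapping should be published alongside the visualisation if the interpretation is likely to be contested.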
|
||||
|
||||
More information about the classification methodology they used can be
|
||||
found in the blog post: [Bosnian Budgets - grouping data by categories people
|
||||
care
|
||||
about](http://community.openspending.org/2012/11/bosnian-budgets-grouping-data-by-categories-people-care-about/).
|
||||
|
||||
**Next**: [Expert Grup](../expert-grup/)
|
||||
|
||||
**Up**: [Case Studies: Budgets](../)
|
||||
@@ -0,0 +1,29 @@
|
||||
---
|
||||
lead: true
|
||||
title: Centre for Budget and Governance Accountability
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<div class="well">Before a tech project even gets off the ground, it needs a good source of data. On our trip to India, the Centre for Budget and Governance Accountability clearly outlined the problems they have with the data released by governments.</div>
|
||||

|
||||
|
||||
Centre for Budget and Governance Accountability (CBGA) is a policy research and advocacy organisation based in New Delhi, India. Its work promotes transparent, participatory, and accountable governance and a people-centred perspective in government budgets.
|
||||
|
||||
The main questions CBGA asks are about the priorities underlying budgets, the quality of government interventions in the social sector, the responsiveness of policies and budgets to disadvantaged sections of population (e.g. religious minorities, scheduled castes, women), the progressivity of the taxation policies, and some of the structural issues in India’s fiscal federalism.
|
||||
|
||||
They talked about some of the problems they had had in using the data on expenditure made available by the Union Government and State Governments in India. Here are a few of their most striking points.
|
||||
|
||||
* The data is very aggregated. There's no geographical breakdown of the data; for example, in the Union Budget, there is no disaggregation for statewise allocations or expenditures, and likewise in a State Budget, it's not possible to see how much is spent in a particular district.
|
||||
* There are particular budgetary strategies and categories for public expenditure on disadvantaged groups (viz. the "scheduled caste subplan" and "tribal subplan"), but the reporting using these categories is not very reliable. It has been found that a lot of general expenditure is reported under these heads. Suppose, for example, the government has spent some money on construction of a hospital; even though it's not meant specifically for the scheduled castes, the government may apportion around 20% of the cost for the hospital under the scheduled caste subplan. In the absence of programmes or schemes designed specifically for scheduled castes or scheduled tribes, some of the state governments have been relying on such a superficial process of reporting expenditures under "scheduled caste sub plan" and "tribal sub plan".
|
||||
* It is very difficult to get the complete picture of public spending in any particular sector. In many of the Union Government schemes (e.g. the National Rural Employment Guarantee Scheme, National Rural Health Mission, etc.), the Union Budget funds do not flow through the State Budgets or the State Government Treasury System; these funds are sent directly to autonomous bank accounts of implementing agencies. As a result, the State Budget documents of any State in India do not show the complete picture of government spending in that State.
|
||||
* In some of the Union Government schemes where the Union Budget funds are routed outside the State Budgets, fund advances to implementing agencies get reported as expenditures, and there is no way we can ascertain whether the money has actually been spent or is lying parked in the bank account at some level.
|
||||
|
||||
As regards State Budget data, CBGA gets PDFs from the websites of the State Governments in about one third of States – with the rest, they have to procure hard copies from offices of the Finance Departments of the States.
|
||||
|
||||
A matrix of available budget data sources for India and comments about their usability has been compiled by the CBGA team and can be found [online](http://www.cbgaindia.org/sources_for_budget_data.htm).
|
||||
|
||||
<em>See the full list of organisations we visited on the India trip on the <a href="http://in.okfn.org/2012/09/18/okfn-india-trip-the-roundup/">OKFN-India blog</a></em>.
|
||||
|
||||
**Next**: [Centre for Public Interest Advocacy](../bosnia/)
|
||||
|
||||
**Up**: [Case Studies: Budgets](../)
|
||||
@@ -0,0 +1,90 @@
|
||||
---
|
||||
lead: true
|
||||
title: Expert Grup
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<div class="well">Project: <a href="http://www.budgetstories.md/">Budgetstories, Moldova.</a></div>
|
||||
In mid 2012, Expert Grup began to work to make sense of huge budget
|
||||
datasets. In November 2012, while finalizing the project concept,
|
||||
members of the organising team took part in the [Balkan Budget
|
||||
Workshop](http://openspending.org/blog/2012/11/26/Sarajevo-Workshop-Writeup.html) organised by the OpenSpending team.
|
||||
|
||||
> “It was really helpful to learn about data formats. It is the equivalent
|
||||
> of a data literacy 101 class.” - <strong> Victoria Vlad </strong>
|
||||
|
||||

|
||||
|
||||
## About the team
|
||||
|
||||
The core BudgetStories.md team at Expert Grup
|
||||
consists of three core staff members: the project director, one project
|
||||
communication and analytical expert, and one additional analytic expert
|
||||
as support. The implementation was made with an external web design
|
||||
company who executed the website based on input from the team. The web
|
||||
company had a total of five people working on BudgetStories.md: two
|
||||
designers, two developers, and one project manager.
|
||||
|
||||
Expert Grup has introduced several projects using open budget and
|
||||
spending data. The main challenges they've encountered have related to
|
||||
accessing data, cleaning the data (using Excel), and managing the data
|
||||
visualisation work, which was done by an external web agency.
|
||||
|
||||
Most of the needed data was available to the public on
|
||||
[data.gov.md](http://data.gov.md/) and in the World Bank's [BOOST data
|
||||
tool](http://www.mf.gov.md/ro/BOOST/), and government agencies were open
|
||||
to fulfilling direct requests for data in a timely manner. The
|
||||
data was analysed with Excel. During the research, Expert Grup used the
|
||||
<a href="http://community.openspending.org/resources/handbook/">Spending Data Handbook</a>, the <a href="http://opendatahandbook.org/">Open Data Handbook</a> and the <a href="http://datajournalismhandbook.org/">Data Journalism
|
||||
Handbook</a> as resources.
|
||||
|
||||
## Technical challenges
|
||||
|
||||
### Presenting the data
|
||||
> “It is still a challenge for us to make visualizations meaningful. We plan to improve with every infographic we
|
||||
> publish.”
|
||||
|
||||
Expert Grup is planning to work more on developing their
|
||||
presentations of their data. Most of their visualisations have been
|
||||
developed by a web agency based on the directions from Expert Grup. One
|
||||
exception is the visualisation of the [Moldova
|
||||
budget](http://www.budgetstories.md/bugetul-2013/), which was created with
|
||||
OpenSpending treemaps by Expert Grup.
|
||||
|
||||
In the future Expert Grup will be looking to expand their work.
|
||||
|
||||
> “There is still a lot of work to be done. We have until now published three data
|
||||
> visualizations.”
|
||||
|
||||
So far their portfolio includes excellent examples like these:
|
||||
|
||||
* The [Budget
|
||||
calendar](http://www.budgetstories.md/anul-bugetar-2013/), which enables
|
||||
citizens to track and learn about the budget approval process for
|
||||
2012 - 2014 as an interactive module, including the documents which
|
||||
need to be published at each stage of the budget process.
|
||||
* An OpenSpending budget treemap, added in order to provide a meaningful
|
||||
visualization of the [annual
|
||||
proposal](http://www.budgetstories.md/afla-cat-ne-a-costat-parlamentul-in-2012/)
|
||||
* A visualisation of the [agricultural subsidy
|
||||
program](http://www.budgetstories.md/subventiile-pentru-agricultura-in-2012-pentru-ce-cui-si-unde-au-fost-alocate/).
|
||||
|
||||
## Community challenges
|
||||
|
||||
<strong>Engaging CSOs in spending data:</strong> According to Expert Grup, civil society
|
||||
(NGOs, journalists, and universities) has until now shown little
|
||||
knowledge of or interest in the existence of open data. The goal has
|
||||
therefore been to engage with and educate journalists and policymakers.
|
||||
The main site was launched in February and gave journalists access to cleaned data sets and visualisations for the first time,
|
||||
as well as
|
||||
encouraging their republishing and reuse in the public. The aim would be
|
||||
to gain increased public attention to inefficiencies identified in
|
||||
government spending and establish connections to more stakeholders. The
|
||||
results of the outreach are still being assessed.
|
||||
|
||||
Source: [Guest blog post on
|
||||
OpenSpending](http://community.openspending.org/2013/02/budgetstories-md-using-open-budget-data-to-create-meaningful-stories/).
|
||||
|
||||
**Next**: [Case Studies: Spending](../../case-studies-spending/)
|
||||
|
||||
**Up**: [Case Studies: Budgets](../)
|
||||
@@ -0,0 +1,25 @@
|
||||
---
|
||||
lead: true
|
||||
title: 'Case Studies: Budgets'
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||

|
||||
|
||||
The word "budget" can refer to different things in different circles, but for the purposes of this report we adopt a technical definition: a budget is a planning document that provides the details of a spending policy, typically breaking the policy down by aggregated categories rather than individual planned transactions.
|
||||
|
||||
Across communities and across the globe, there is a push for making budgets accessible in a timely and granular form. Groups have sprung up all over creating meaningful ways to explore and explain to citizens what's really in their budgets. Some groups pioneer detailed budget calendars enabling citizens to follow the budgeting process, while other groups challenge governments by tracking budget codes and how funds actually shift from year to year.
|
||||
|
||||
This section explores some of the diversity of groups using budget data as well as their common concerns and demands. It asks what these CSOs hope to achieve by drawing on budget data and what technological and political obstacles they encounter in doing so.
|
||||
|
||||
Several different kinds of work are covered in this section, including completed budget data projects ([Lost Money](./lost-money/), [OpenBudgetOakland](./openbudgetoakland/)), ongoing work by policy researchers and advocates ([Centre for Budget and Governance Accountability, India](./cbga/)), and the results of workshops building up to future projects ([Budzeti.ba](./bosnia/), [Expert Grup](./expert-grup/)).
|
||||
|
||||
* [Lost Money](./lost-money/)
|
||||
* [OpenBudgetOakland](./openbudgetoakland/)
|
||||
* [Centre for Budget and Governance Accountability, India](./cbga/)
|
||||
* [Budzeti.ba](./bosnia/)
|
||||
* [Expert Grup](./expert-grup/)
|
||||
|
||||
**Next**: [Bani pierduti? (Lost Money)](./lost-money/)
|
||||
|
||||
**Up**: [Mapping the Open Spending Data Community](../)
|
||||
@@ -0,0 +1,109 @@
|
||||
---
|
||||
lead: true
|
||||
title: Bani pierduti? (Lost Money)
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
|
||||
<strong>This is a profile of a very interesting new project coming out of Romania, aiming to make government finances understandable for the average citizen. It is written based on contributions from <a href="https://twitter.com/MadamadePica" target="_blank">Elena Calistru</a>, who kicked off the project.</strong>
|
||||
|
||||
## Vital Statistics
|
||||
|
||||
<div class="well">
|
||||
<ul>
|
||||
<li><strong>Name of Project:</strong> Bani pierduti? (Or, in English, "Lost money?")</li>
|
||||
<li><strong>Link to project: <a href="http://www.banipierduti.ro" target="_blank">banipierduti.ro</a></strong></li>
|
||||
<li><strong>Approximate number of users engaged through the project:</strong> over 30,000</li>
|
||||
</ul>
|
||||
</div>
|
||||
<img alt="" src="http://farm9.staticflickr.com/8289/7607848026_21cbeef8ed_b.jpg" title="Lost Money" class="alignnone" width="640" height="480" />
|
||||
|
||||
## What is the background of the project?
|
||||
|
||||
The project is one of the five winners of the <a href="http://restartromania.netsquared.org" target="_blank">Restart Romania 2011</a> competition, initiated by <a href="http://www.techsoup.ro/" target="_blank">Techsoup Romania</a> with the support of the US Embassy to Bucharest.
|
||||
|
||||
Starting at the beginning of August 2011, 104 projects were registered for the Social Justice Challenge Restart Romania and went under the scrutiny of the community. In the end, a jury formed by representatives of the diplomatic community, business sector, and IT industry selected 10 finalist projects. Between 28 and 30 October, the Restart Romania Hackathon, with the help of programmers and communication specialists, turned the ten ideas into more concrete platforms, which were presented at the Restart Romania Gala. At the Gala, Bani Pierduti was voted one of the five winners of 5000 USD in funding.
|
||||
|
||||
## What are the aims?
|
||||
|
||||
The project, formerly known as *"Where’s my LEI, man?"*, entered the competition aiming to centralize publicly available financial information regarding the projects financed through public money (budgets, annual reports, etc.). The main objective was to make authorities accountable for the manner in which public funds are spent.
|
||||
|
||||
After winning the Restart Romania Gala, the project went through a rethink aimed at identifying both the best technologies for a more complex platform than initially planned and the datasets needed to best represent how public funds are spent in Romania. While at the very beginning the project only aimed at using state budget data, it now operates with data comprising the budgets dedicated to social assistance and public health, the budgets at local level for the Romanian counties, projects financed through EU funds, comparisons with the percentage allocated to various sectors in other EU countries, and more.
|
||||
|
||||
The project is now a permanent programme of a newly-established NGO, <a href="http://www.funkycitizens.org/" target="_blank">Funky Citizens</a> (website under construction at time of publication), which aims to engage civil society (taxpayers) in the decision-making processes related to public funds through the use of technology. Its major objectives are:
|
||||
|
||||
* Increasing the number of people who are aware of this issue and improving the quality of public understanding
|
||||
* Offering information and tools for influencing the decision-making process
|
||||
|
||||
To achieve its objectives, the project relies on three pillars:
|
||||
|
||||
* Data & process presentation
|
||||
* Public participation
|
||||
* Understanding the bigger picture
|
||||
|
||||
## How does the platform tackle the issues you outlined?
|
||||
|
||||
The three pillars of the platform respond to the following problems:
|
||||
|
||||
### Problem #1: Fiscal policies represent a mystery for the majority of citizens
|
||||
|
||||
<strong>Consequences:</strong> Lack of information and understanding of the process; scarce public oversight of public funds administration; public spending is associated with corruption and distrust
|
||||
|
||||
<strong>How we respond:</strong> Educate citizens on the topic
|
||||
|
||||
### Problem #2: Little or no participation of the community in fiscal policy
|
||||
|
||||
<strong>Consequences:</strong> Limited use of existing tools for participation in the decision-making process; needs of the community not reflected in the resource allocation; no feedback to the policy makers on their decisions
|
||||
|
||||
<strong>How we respond:</strong> Facilitate direct participation
|
||||
|
||||
### Problem #3: Lack of vision from governments on investment / development priorities
|
||||
|
||||
<strong>Consequences:</strong> Short-term planning leading to limited predictability and accountability; bad administration, mismanagement, or corruption in public spending; incoherence between the fiscal policy and other public policies
|
||||
|
||||
<strong>How we respond:</strong> Analyse and understand data
|
||||
|
||||
<img src="http://farm9.staticflickr.com/8432/7787652558_79191020ee_o.png" title="Lost Money 2" class="alignnone" />
|
||||
|
||||
## What is the role of technology in the approach to solving that problem?
|
||||
|
||||
The role of technology is an important one, since the web-based platform is the main feature of the project. So far, transparency in fiscal policy can be achieved only through complicated documents published on the websites of the authorities or through FOIA requests. Also, there were previously no e-participatory budgeting experiences, the only means of organizing public debates on budgetary issues being offline events.
|
||||
|
||||
## What are the successes of this project?
|
||||
|
||||
The project is still very young and in its early stages. However, the evaluation of its outcomes already shows several approaches which proved successful:
|
||||
|
||||
* A consultation process with relevant governmental stakeholders prior to the launch of the project proved to be a good approach in ensuring a supportive, or at least non-contentious, interaction with the authorities, given the sensitivity of the subject.
|
||||
* The gradual implementation and launch of the features of the platform seems like a successful strategy to educate citizens on a difficult subject while creating interest in and awareness of the topic.
|
||||
* The engagement of different categories of supporters of the project (from young dynamic professionals to the diplomatic community) ensured greater visibility for the initiative and is expected to further enlarge the community of advocates for more transparency in fiscal matters.
|
||||
|
||||
## Are there areas where the project failed? What are the challenges?
|
||||
|
||||
The main challenges to the project are mostly related to two major issues encountered by such initiatives:
|
||||
|
||||
* The absence of an open data approach in the release of official information related to public spending makes the implementation of the project slower as well as resource-consuming.
|
||||
* A general perception that public money is lost to corruption makes people less inclined to look closely at the entire policy cycle, which makes efforts to educate or engage them harder.
|
||||
|
||||
<img alt="" src="http://farm9.staticflickr.com/8289/7787652740_dae031a763_o.png" title="Lost Money 3" class="alignnone" width="640" height="480" />
|
||||
|
||||
## Have you had particular problems with the data?
|
||||
|
||||
Romania has just joined the Open Government Partnership, and the implementation of the open data format for governmental data sets is expected to take at least a few years. The various data formats present on the websites of authorities, or even their absence in several cases, made data collection a rather difficult process.
|
||||
|
||||
## Are you actively seeking the involvement of the user groups?
|
||||
|
||||
The project also foresees that an entire pillar of the platform (“public participation”) will actively seek the involvement of the user groups. The implementation of this service started with two features (large investment projects timelines and legislative early warnings) which seek an interaction with the public, and future plans propose to increase the amount of citizen participation. For example, there are plans to do this by:
|
||||
|
||||
* encouraging direct feedback into laws already in draft stages, allowing users to cut, add to, and restructure proposed bills on the basis of desired budgetary outcomes
|
||||
* building a simulator for the central budget, allowing people to visualise and explore the effect of different revenue and expenditure policies (e.g. raising taxes)
|
||||
* promoting public participation in the annual budget cycle through a calendar of debates on budgets as well as pilot offline events with webcasts
|
||||
|
||||
The most consistent involvement features are expected to be implemented by the end of 2012 – early 2013, as a second stage in the development of the project.
|
||||
|
||||
## What are the plans for the future?
|
||||
|
||||
The project was planned as a continuously growing platform, and its scaling or additional features were taken into consideration from the very beginning. A mobile feature is expected to be implemented into the web platform in 2013, a plan which also involves the use of social audits for public contracts.
|
||||
|
||||
**Next**: [OpenBudgetOakland](../openbudgetoakland/)
|
||||
|
||||
**Up**: [Case Studies: Budgets](../)
|
||||
@@ -0,0 +1,74 @@
|
||||
---
|
||||
lead: true
|
||||
title: OpenBudgetOakland
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<div class="well">Project: <a href="http://openbudgetoakland.org/">OpenBudgetOakland</a>, launched
|
||||
April 2013.</div>
|
||||
Open Budget Oakland is a group of civic hackers working on budget
|
||||
issues. At the beginning of 2013, they initiated the idea
|
||||
of launching a budget visualization site, with the goal to increase
|
||||
participation from citizens and stimulate citizen debate around the
|
||||
city budget. The platform should enable citizens to explore budget
|
||||
nuances, to discuss the implications of policy decisions, and to ask informed questions, and it should
|
||||
facilitate information flow from experts in the field to interested citizens.
|
||||
|
||||
In February 2013, Open Budget Oakland connected with OpenSpending in
|
||||
order to plan the launch of [OpenBudgetOakland.org](http://openbudgetoakland.org/).
|
||||
From February and up until April, the small team worked to complete the
|
||||
app and launch it in time for the Mayor’s annual release of the budget
|
||||
proposal for 2013-2015. The team succeeded in launching the visualization
|
||||
of the proposed budget on the day of its release. Shortly after the
|
||||
launch, the project gained recognition from the City of Oakland: the team was
|
||||
invited to [present the project at city
|
||||
hall](https://twitter.com/openbudgetOAK/status/329667951265472512/photo/1).
|
||||
|
||||
<iframe width='100%' height='400' src='http://openspending.org/oakland-adopted-budget-fy-2011-13-expenditures/embed?widget=treemap&state=%7B%22drilldowns%22%3A%5B%22department%22%2C%22unit%22%2C%22child-fund%22%5D%2C%22year%22%3A2012%2C%22cuts%22%3A%7B%7D%7D&width=700&height=400' frameborder='0'></iframe>
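The embed above carries its view settings in the `state` query parameter as URL-encoded JSON. As a rough sketch (using only the URL shown in the embed and Python's standard library, and making no assumption about which other values the OpenSpending service would accept), the settings can be decoded like this:

```python
import json
from urllib.parse import parse_qs, urlsplit

# The embed URL exactly as it appears in the iframe above.
EMBED_URL = (
    "http://openspending.org/oakland-adopted-budget-fy-2011-13-expenditures/embed"
    "?widget=treemap"
    "&state=%7B%22drilldowns%22%3A%5B%22department%22%2C%22unit%22%2C%22child-fund%22%5D"
    "%2C%22year%22%3A2012%2C%22cuts%22%3A%7B%7D%7D"
    "&width=700&height=400"
)


def read_embed_state(url):
    """Decode the JSON view settings stored in the 'state' query parameter."""
    query = parse_qs(urlsplit(url).query)  # parse_qs also undoes the %-encoding
    return json.loads(query["state"][0])


state = read_embed_state(EMBED_URL)
print(state["drilldowns"])  # the levels the treemap drills through, in order
```

Decoded, the state shows that the treemap drills down from department to unit to child fund for the year 2012, with no filters ("cuts") applied.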
|
||||
|
||||
## Technological setup and challenges
|
||||
|
||||
The first budget of Oakland was accessed in July 2012 as a 350-page
|
||||
PDF file, which was copy-pasted into a coherent spreadsheet and an
|
||||
almost-interactive pie chart at a
|
||||
[one-day](http://codeforoakland.org/meet-our-2012-winning-apps) hackathon.
|
||||
The process led directly to this conclusion:
|
||||
>“accessing Oakland city budget data isn’t easy, and even once you have data, it isn’t
|
||||
> immediately clear how it can be shared in a way that helps people.”
|
||||
|
||||
The small volunteer team at Open Budget Oakland built the site within a
|
||||
few months of 2013 based on OpenSpending technology, adding two new
|
||||
distinct features. From March 2013, OpenOakland engaged in several email
|
||||
conversations on the OpenSpending mailing lists and received
|
||||
contributions via GitHub from OpenSpending developers:
|
||||
|
||||
1. a [Disqus](http://disqus.com/) commenting module, which enables
|
||||
users to discuss each spending item and thus improve participation
|
||||
2. a browsing feature for the OpenSpending histogram view, enabling
|
||||
easier navigation in OpenSpending
|
||||
3. [D3](http://d3js.org) for visualising income and expenditure in one
|
||||
chart, based on a [budget
|
||||
visualisation](http://www.peterkrantz.com/2012/data-visualization-tools/) developed
|
||||
by Peter Krantz, another OpenSpending community member
|
||||
|
||||
Open Budget Oakland has also considered the possibility of pursuing
|
||||
transactional spending data, though this is not available at the moment
|
||||
from the Oakland City Council.
|
||||
|
||||
## Community
|
||||
|
||||
The community is considered key for generating a budget discussion going
|
||||
forward past the initial media coverage.
|
||||
|
||||
Open Budget Oakland is a part of Open Oakland, which has a volunteer
|
||||
team of 21 members working across the entire open government agenda. The
|
||||
group organises weekly meetups for volunteers and is supported by Code
|
||||
for America.
|
||||
|
||||
Coverage: [TechPresident](http://techpresident.com/news/23749/oakland-gets-new-data-visualization-site-its-budget) and
|
||||
[Oakland Local](http://oaklandlocal.com/article/open-oakland-opening-oakland%E2%80%99s-budget-community-voices) (community
|
||||
blog).
|
||||
|
||||
**Next**: [Centre for Budget and Governance Accountability](../cbga/)
|
||||
|
||||
**Up**: [Case Studies: Budgets](../)
|
||||
@@ -0,0 +1,18 @@
|
||||
---
|
||||
lead: true
|
||||
title: 'Case Studies: From Local to Global'
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||

|
||||
|
||||
Throughout the research phase for this report, we have been continually surprised by the creativity with which citizens and communities have approached financial transparency. We set out to discover how civil society organisations were using government spending data, but we also discovered interesting cases of citizen and community uptake of spending data—evidence that the goal of driving citizen engagement with public spending by opening data is succeeding.
|
||||
|
||||
In this section, we highlight two cases of community-driven projects around spending data: the [OpenSpending](./openspending/) project, a community-run global database of spending data that has been enthusiastically used as a repository for *local* spending data, and an app to open up the [University of Granada's budget](./opening-university/) built by a department within the university. These two projects provide an important reminder that grassroots local demand constitutes a significant source of pressure for increased spending transparency.
|
||||
|
||||
* [OpenSpending](./openspending/)
|
||||
* [Budget transparency for an open university](./opening-university/)
|
||||
|
||||
**Next**: [OpenSpending](./openspending/)
|
||||
|
||||
**Up**: [Mapping the Open Spending Data Community](../)
|
||||
@@ -0,0 +1,41 @@
|
||||
---
|
||||
lead: true
|
||||
title: Budget transparency for an open university
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
The demand for financial data doesn't just come from CSOs, and it doesn't just target national or even regional governments. The University of Granada's budget data app provides an example of a demand for budget data originating and being met within the hyper-local space of a single university. Empowering local actors to demand and make use of data is the next frontier in spending transparency.
|
||||
|
||||
## University of Granada budget app
|
||||
|
||||
The idea to build an app to make the University of Granada's budget data more accessible emerged during International Open Data Day, a gathering of citizens in cities around the world to write applications using open public data to show support for and encourage the adoption of open data policies by the world's governments. The Open Knowledge Foundation's Spanish chapter issued a call for participants, and the Free Software Office at the University of Granada accepted the challenge.
|
||||
|
||||
The project began by scraping the budget documents published by the University of Granada in PDF format and converting their data to a machine-readable format. The goal of this step was to make it easier for citizens, journalists and even employees at the university to work with the data, using tools ranging from spreadsheet programs to visualization suites. The project then went on to build an app to allow users to download the income and expenditure budgets in CSV format and to provide a set of comprehensive visualizations.
|
||||
|
||||
The set of tools used for the project included:
|
||||
|
||||
* [Cometdocs](http://www.cometdocs.com) (online PDF-to-Excel converter)
|
||||
* [OpenRefine](http://openrefine.org) (data cleaning)
|
||||
* [DataHub](http://datahub.io) (data hosting)
|
||||
* [OpenSpending](http://openspending.org) API & [D3.js](http://d3js.org) (visualization)
|
||||
|
||||
<iframe width='100%' height='400' src='http://openspending.org/upo-income-budget/embed?widget=treemap&state=%7B%22drilldown%22%3A%22articulo%22%2C%22year%22%3A%222012%22%2C%22cuts%22%3A%7B%7D%2C%22drilldowns%22%3A%5B%22articulo%22%5D%7D&width=700&height=400' frameborder='0'></iframe>
|
||||
|
||||
## The importance of university budget transparency
|
||||
|
||||
Spanish Public Universities are almost solely funded by the various Public Administration Offices. In the University of Granada's revenues, for example, the amount of income coming from public payments (including college tuition) only covers 11% of the total. As a result of the Spanish economic crisis, some college tuition rates rose, having a deep impact on the
|
||||
pockets of those on the verge of being unable to pay for their studies.
|
||||
|
||||
By releasing the University's budget data, the project:
|
||||
|
||||
* Highlighted the reality of the resources available at the university
|
||||
* Helped identify potential best practices in savings that could be used by other universities
|
||||
* Helped citizens to make smart proposals on why and where the public should invest in higher education
|
||||
|
||||
Spanish public universities are equipped with a system called SIIU (Integrated University Information System), and they are required to report budget data using this system. Thus, in reality, most of the technical challenges around producing budgets in electronic and harmonized formats have already been addressed. The question is therefore why the Ministry of Education does not make this information available to the public.
|
||||
|
||||
*Summary based on blog post by J. Félix Ontañón at OpenKratio.*
|
||||
|
||||
**Next**: [Conclusions](../../conclusions/)
|
||||
|
||||
**Up**: [Case Studies: From Local to Global](../)
|
||||
@@ -0,0 +1,44 @@
|
||||
---
|
||||
lead: true
|
||||
title: OpenSpending
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
The availability of elegant and intuitive visualisations can drive government data release at a local level. OffenerHaushalt in Germany, for example, was launched with a small note on the page saying, “If you are a local government and you are interested in producing your own visualisation site such as this, please write to us.” To date, the OffenerHaushalt team have received 70-90 requests for similar sites in Germany at different levels of government, often from government officials themselves.
|
||||
|
||||
Being able to cater to these different levels of government and different bodies within government was one of the prime reasons for building OpenSpending: a solution was needed that could produce something like <a href="http://wheredoesmymoneygo.org">Where Does My Money Go?</a> in less than half a day, without which scaling down to the local level simply would not be feasible.
|
||||
|
||||
## Local uses of OpenSpending
|
||||
|
||||
Since its launch, OpenSpending has had a continuous flow of requests from stakeholders at local levels. By May 2013, more than 80 cities had been added to OpenSpending by citizens, local political parties, and local administration officials. A substantial number of connections have also been made to local budgeting initiatives using technology to enhance participation.
|
||||
|
||||
It is our impression that local spending has a strong potential for a few reasons:
|
||||
|
||||
* Local spending has a clear and direct impact on the average citizen's daily life, as it is at the local level that many services are delivered. This fact seems to invite a wide group of citizens and communities to engage with the process.
|
||||
* Active citizens may know where to access data in their local community more easily than national data. They might even know who to speak to in the local council if the data is not available.
|
||||
|
||||
## Japan
|
||||
|
||||
The OpenSpending community in Japan has a largely city-based focus. Yokohama initiated a <a href="http://spending.jp/">local spending site</a> in 2012 using the Daily Bread and budget visualisations. Since then, eleven additional cities have had their budgets visualised. At Open Data Day February 2013, the group expanded the initiative with another site for the <a href="http://chiba.spending.jp/">city of Chiba</a>, which received a visit as well as positive feedback from the mayor of Chiba.
|
||||
|
||||
The community is characterised by strong representation of both governance experts from academia and programmers with the technical skills to implement complex budget sites.
|
||||
|
||||
Since February 2013, the community has grown across Japan and spurred the development of budget visualisations for twelve additional cities. The community is now looking to explore more detailed budget data as well as transactional spending data. A plan for including transactional data from two prefectural governments is underway.
|
||||
|
||||
## Other uses of OpenSpending
|
||||
|
||||
We have seen a number of CSOs and citizens make use of OpenSpending to serve specific visualisation needs outside the realm of pure budget and spending transparency.
|
||||
|
||||
* <a href="http://www.fundacjafenomen.pl/">Fundacja Normalne Miasto Fenomen</a>, Poland, used OpenSpending to visualise data on transportation spending for the <a href="http://www.google.com/url?q=http%3A%2F%2Fopenspending.org%2Flodz_2013_transport_budget&sa=D&sntz=1&usg=AFQjCNGQheo8Wg1kQ7ztn27o2k7TqcsV8Q">city of Łódź</a> in order to advance their environmental agenda.
|
||||
|
||||
* The Social liberal party (D66) of Rotterdam, The Netherlands, used budget data from the city of Rotterdam to help inform elected officials and decide on local priorities within the party. The purpose was, in the first place, to help the party decide on budget priorities and secondly to advocate to the city itself to adopt the practice of visualising the budget for its citizens in a meaningful way. A similar example was seen with the budget of Uruguay for 2012 from the Uruguayan Office of Planning and Budget.
|
||||
|
||||
* The search interface of OpenSpending was used by Privacy International <a href="http://community.openspending.org/2012/02/how-spending-stories-fact-checks-big-brother-the-wiretappers-ball/">to research which companies selling surveillance equipment had contracts with governments around the world</a>.
|
||||
|
||||
* OpenSpending was used to visualise the <a href="http://community.openspending.org/2013/04/visualising-urban-development-data-at-un-habitat/">UN-Habitat data on Urban Development</a>.
|
||||
|
||||
* In collaboration with Publish What You Fund, OpenSpending was used to provide the <a href="http://publishwhatyoufund.org/uganda/#/~/aid-and-domestic-spending-in-uganda-br----usd-">first consolidated view of the budget of Uganda</a>, including income from aid flows, which form a substantial part of the revenue flows for Uganda. Even the government of Uganda had previously not had access to this information.
|
||||
|
||||
**Next**: [Budget transparency for an open university](../opening-university/)
|
||||
|
||||
**Up**: [Case Studies: From Local to Global](../)
|
||||
@@ -0,0 +1,79 @@
|
||||
---
|
||||
lead: true
|
||||
title: Hutspace
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<div class="well">Project: scraping, analysing and publishing procurement data. In progress, with version 1.0 scheduled for publication June, 2013.</div>
|
||||
<em>This chapter is based on an interview with Emmanuel Okyere, Hutspace (Ghana).</em>
|
||||
|
||||
Emmanuel Okyere is running a project to scrape, publish, and
|
||||
analyse the Ghana procurement register, while working on his own
|
||||
IT startup.
|
||||
|
||||
Emmanuel has built a database of contract awards for Ghana using Python
|
||||
scrapers and parsers, [Celery](http://www.celeryproject.org), and PostgreSQL. The
|
||||
preliminary result shows 4000 contract awards and 2000 current
|
||||
procurement opportunities. Future plans include building a searchable
|
||||
database and a flat CSV file download option in order to enable
|
||||
journalists and CSOs to work with the data.
|
||||
|
||||
## Technical challenges
|
||||
|
||||
### Cleaning data
|
||||
|
||||
> “Cleaning the data has been a substantial issue for the
|
||||
> project. There’s a lot of validation, which we need to do before we can
|
||||
> publish it simply because the data appear to be inaccurate. As an
|
||||
> example, we might have unrealistically small amounts appearing in a
|
||||
> contract award, or a date might not make sense. This has been the most
|
||||
> substantial bottleneck for the project.”
|
||||
|
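The kind of validation described in the quote can be illustrated with a minimal Python sketch. The field names (`amount`, `awarded_on`) and both thresholds are assumptions made for illustration; they are not taken from the Hutspace codebase.

```python
from datetime import date

# Hypothetical thresholds, chosen only for illustration.
MIN_PLAUSIBLE_AMOUNT = 100.0
EARLIEST_PLAUSIBLE_DATE = date(2000, 1, 1)


def validation_problems(award, today):
    """Return a list of reasons why a contract award record looks suspicious."""
    problems = []

    # Flag missing or unrealistically small contract amounts.
    amount = award.get("amount")
    if amount is None or amount < MIN_PLAUSIBLE_AMOUNT:
        problems.append("implausible amount")

    # Flag missing dates and dates outside the plausible window.
    awarded_on = award.get("awarded_on")
    if awarded_on is None or not (EARLIEST_PLAUSIBLE_DATE <= awarded_on <= today):
        problems.append("implausible date")

    return problems
```

Records with a non-empty result would be held back for manual review rather than published.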
||||
### Reconciling company entities
|
||||
|
||||
> “Many companies appear with a variety of
|
||||
> entities, and so finding a good way to reconcile companies which are
|
||||
> actually the same has been difficult.”
|
||||
|
||||
Emmanuel is planning to utilize a helpful codebase from the open
|
||||
parliament field, originally developed by MySociety for name
|
||||
reconciliation of parliament members, for reconciling the company
|
||||
entities.
|
||||
|
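A first pass at this kind of reconciliation is often a simple name normalisation, sketched below in Python. This is not the MySociety codebase mentioned above; the suffix list and function names are hypothetical, and real data would probably need fuzzy matching as well.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"ltd", "limited", "co", "company", "inc"}


def normalise_name(name):
    """Lower-case the name, strip punctuation and drop legal-form suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(word for word in words if word not in LEGAL_SUFFIXES)


def group_variants(names):
    """Group raw spellings that share the same normalised key."""
    groups = defaultdict(list)
    for name in names:
        groups[normalise_name(name)].append(name)
    return dict(groups)
```

Spellings that end up under the same key are candidates for being the same company and can then be confirmed by hand.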
||||
### Identifying the correct amounts
|
||||
|
||||
A surprising problem in the procurement
|
||||
data has turned out to be the varying currency denominations appearing,
|
||||
such as GBP and USD. Finding appropriate historical exchange rates and
|
||||
calculating these has been cumbersome, but it is important in
|
||||
order to make the data as accessible as possible.
|
||||
|
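The conversion step could look roughly like the following Python sketch. It assumes amounts are converted into Ghanaian cedi (GHS) using a lookup table of historical rates; the rates shown are made-up placeholders, not real exchange rates, and the function name is hypothetical.

```python
from datetime import date

# Placeholder rates (GHS per one unit of foreign currency); not real data.
RATES_TO_GHS = {
    ("USD", date(2012, 3, 1)): 1.70,
    ("GBP", date(2012, 3, 1)): 2.70,
}


def to_ghs(amount, currency, on_date, rates=RATES_TO_GHS):
    """Convert an amount into GHS using the rate recorded for that date."""
    if currency == "GHS":
        return amount
    try:
        rate = rates[(currency, on_date)]
    except KeyError:
        # Fail loudly instead of guessing a rate for an unknown date.
        raise ValueError(f"no rate for {currency} on {on_date}")
    return round(amount * rate, 2)
```

Raising an error when a rate is missing keeps silently wrong amounts out of the published data.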
||||
## Community challenges
|
||||
|
||||
Emmanuel points out that both the lack of knowledge about the
|
||||
availability of procurement data as well as the lack of skills to
|
||||
analyse it among journalists and CSOs are the main barriers for
|
||||
achieving more usage of the data.
|
||||
|
||||
> “For much of the work to be done on the data, having skills to use Excel
|
||||
> would actually be sufficient for journalists in order to get to work
|
||||
> with the data. However, skills to use Excel for analysis are lacking
|
||||
> among almost all journalists today. When it comes to more challenging
|
||||
> tasks which require coding skills for analysing the data, I know
|
||||
> actually only one journalist. She will be involved in this project.
|
||||
|
||||
> “Trainings could help equip more journalists to work with the
|
||||
> procurement data we are planning to release. We really need more people
|
||||
> to look and use the data, but that require that they have the skills. I
|
||||
> think that is what trainings like data bootcamps are for.
|
||||
|
||||
> “As publishers of the database, we would like to build visualisations to
|
||||
> spot trends in the data. For instance, we have noticed that when new
|
||||
> governments get into power, we see this reflected in the procurement
|
||||
> data as new contractors appear while others vanish. This is analytical
|
||||
> work we can do which I think journalists will not be able to do on
|
||||
> their own.”
|
||||
|
||||
**Next**: [Texty](../texty/)
|
||||
|
||||
**Up**: [Case Studies: Procurements](../)
|
||||
@@ -0,0 +1,23 @@
|
||||
---
|
||||
lead: true
|
||||
title: 'Case Studies: Procurements'
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<a href="http://www.flickr.com/photos/seemoredomore/4710878501/" title="Construction in Ghana by Twin Work & Volunteer"><img src="http://farm5.staticflickr.com/4072/4710878501_eb22b37418_z.jpg" width="640" height="480" alt="ConstructionsGhana (7)"></a>
|
||||
|
||||
An important category of government spending data is data on public procurements. Procurement data concerns works, services, and goods commissioned by public authorities. The tight regulations that generally apply to public procurements create an excellent opportunity for data publication and reuse.
|
||||
|
||||
In this section, we look at three CSO projects that have made use of procurement data, exploring what value these CSOs have found in the data, what challenges they've faced, and what tools they have used to address those challenges.
|
||||
|
||||
We have found that procurement data serves an important purpose for promoting financial transparency in many countries. In particular, it is often able to fill in the blanks when [transactional spending data](../case-studies-spending/) is not available. At the national level, most EU countries do not publish transactional spending data, with the exceptions of the United Kingdom and Slovenia; nor in general do public agencies outside national government such as regional or municipal government, despite a sizeable share of government spending taking place at these levels.
|
||||
|
||||
Global initiatives such as OpenContracting of the World Bank Institute and more recently the procurement initiative of the Sunlight Foundation confirm that momentum is growing to promote transparency in procurement. The case studies in this section show that accessing and analysing procurement data can provide substantial improvements to the state of financial transparency, but also that data formats, data quality, and disclosure policies remain barriers to utilizing the full potential of procurement data. These issues deserve attention as procurement transparency gains momentum.
|
||||
|
||||
* [Hutspace, Ghana](./hutspace/)
|
||||
* [Texty, Ukraine](./texty/)
|
||||
* [OpenTED, procurements from EU](./opented/)
|
||||
|
||||
**Next**: [Hutspace](./hutspace/)
|
||||
|
||||
**Up**: [Mapping the Open Spending Data Community](../)
|
||||
@@ -0,0 +1,27 @@
|
||||
---
|
||||
lead: true
|
||||
title: OpenTED, Opening Tender Electronic Daily
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
|
||||
This post reviews how OpenTED and OpenSpending have worked to make procurement data from the EU site, Tender Electronic Daily (TED), available as a CSV download. More than 100,000 public sector contracts are published annually in the European procurement register, originating from bodies ranging from tiny municipalities to large government agencies.
|
||||
|
||||
## Why open up EU procurement data?
|
||||
|
||||
TED contains procurement data on contracts awarded by any public agency within the EU valued above the minimum threshold of EUR 200,000. In most EU countries, granular data from contract awards therefore comprises a significant share of procured and projected spending.
|
||||
|
||||
It is an often overlooked fact that EU procurement rules apply to all majority publicly owned companies. For this reason, the public can, for instance, access data on more than 500 contracts awarded by the Swedish state-owned Vattenfall in all EU countries of operation, such as a contract awarded by its Berlin-based company, because it is majority owned by the Swedish state.
|
||||
|
||||
## Project and issues
|
||||
|
||||
Data from TED is not available as a bulk download, and so in 2011, a small data journalism project, OpenTED, began exploring the options for scraping the data in order to make it openly available. In November 2012 and May 2013, this was explored further through community hack days in London and Brussels organised by OKF, where data was retrieved, parsed, and cleaned. The full TED data is now available as CSV files EU-wide, on a country-by-country basis, and by annual breakdown.
|
||||
|
||||
## Challenges
|
||||
|
||||
Several challenges remain, which are primarily tied to the data quality. Additional data cleaning is still needed before it is even possible to assess to what extent the TED data actually contains sufficiently useful information.
|
||||
A review of data quality is needed. Preliminary findings have shown that significant data fields such as contract amounts and contractor name suffer from low reporting due to what could be an absence of mandatory reporting requirements. The community involved in advancing procurement transparency, such as Transparency International and Sunlight Foundation, should examine how disclosure practices can be improved. TED is an example of a dataset which can only be improved if the transparency communities across countries join forces to argue for such improvements.
|
||||
|
||||
**Next**: [Case Studies: From Local to Global](../../case-studies-other/)
|
||||
|
||||
**Up**: [Case Studies: Procurements](../)
|
||||
@@ -0,0 +1,135 @@
|
||||
---
|
||||
lead: true
|
||||
title: Texty
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<div class="well">Project: <a href="http://z.texty.org.ua/">z.texty.org.ua</a>, a procurement database based on data from the Ukrainian government.</div>
|
||||
Texty.ua was established in 2010 as an NGO by Anatoliy Bondarenko and
|
||||
Roman Kulchynsky (Editor in chief). They both have a background inside
|
||||
Ukrainian media outlets, Roman as Editor in Chief at the Ukrainian
|
||||
weekly, Tyzhden. Anatoliy has served as an editor and programmer with
|
||||
a scientific educational background.
|
||||
|
||||
Texty decided to pursue procurements, as this proved to be the
|
||||
best possible way to cover public spending due to the fact that
|
||||
transactional spending is not available. The result was <a href="http://z.texty.org.ua/">z.texty.org.ua</a>, a
|
||||
searchable database for public procurements completed in the spring
|
||||
of 2012. The database is updated weekly and contains procurement data
|
||||
from 2008 onwards.
|
||||
|
||||
State and local budgets also remain priorities for Texty, though they
|
||||
do not currently have the resources to conduct analysis more frequently than
|
||||
once a year. The state budget process in Ukraine is complex
|
||||
and difficult to follow, so the site is currently monitoring changes
|
||||
to the budget, and Texty would like to play a role in this.
|
||||
|
||||
## Tools
|
||||
|
||||
Texty work on budget and procurement data with a variety of tools.
|
||||
|
||||
* OpenRefine: working with raw data
|
||||
|
||||
* R: analysis of data
|
||||
|
||||
* D3.js: online data visualization
|
||||
|
||||
## Model
|
||||
|
||||
Texty sustains its activities by providing data analysis and
|
||||
visualisations for both CSOs and media outlets.
|
||||
They delivered [data
|
||||
analysis for Forbes Ukraine](http://forbes.ua/ratings/people) concerning
|
||||
concentration in procurement contracts within the business elite.
|
||||
|
||||
## Challenges
|
||||
|
||||
Texty points to the lack of resources in the data journalism
|
||||
field as the biggest challenge. While both data and tools are available,
|
||||
the lack of resources for completing the required data analysis
|
||||
currently hinders more elaborate projects on spending transparency.
|
||||
While CSOs and media outlets regularly source data investigations with
|
||||
Texty, the demand is currently not enough for taking advantage of the
|
||||
data actually available. Texty is supplementing their investigations
|
||||
by offering data-journalism trainings.
|
||||
|
||||
### Open database for public procurements in Ukraine
|
||||
In 2011, when Texty began working on public procurements in Ukraine,
|
||||
getting the data was a top priority because of the huge volumes
|
||||
available and rumors about massive corruption in the field. In
|
||||
2012, spending on procurements was approaching 40% of the GDP of Ukraine, which
|
||||
could be one of the highest in the world.
|
||||
|
||||
### Problems with the governmental site
|
||||
|
||||
[http://tender.me.gov.ua](http://tender.me.gov.ua), the source of procurement data, presents several issues. It requires an account and login, and it only gives access to the
|
||||
data via an HTML table with max 100 results from one of the issues of the
|
||||
official bulletin about public procurements. No tables are sortable, and
|
||||
no records have been linked to one another. Finally and most
|
||||
importantly, the data is dirty; you can, for example, easily find several different
|
||||
versions of the same supplier (company) name.
|
||||
|
||||
## Getting data from the government site
|
||||
|
||||
The Texty team wrote a Ruby script to mimic user login, check for
|
||||
updates, and to scrape data from HTML webpages, all of which had a
|
||||
different structure. After cleaning, they imported the data into a relational
|
||||
database as normalised data, for example creating links between records
|
||||
for each participant. The database is updated approximately twice per
|
||||
week.
|
||||
|
||||
The tool stack:
|
||||
|
||||
* [nginx](http://wiki.nginx.org/Main)
|
||||
* [sinatra](http://www.sinatrarb.com/)
|
||||
* [mysql](http://www.mysql.com/)
|
||||
* [Tangle.js](http://worrydream.com/Tangle/) (for a novel approach to the user interface)
|
||||
|
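The normalisation step described above, where each participant is stored once and tenders link to it, can be sketched as follows. Texty's own implementation used Ruby and MySQL; this illustration uses Python's built-in `sqlite3` module instead, and the table names, column names, and name-cleaning rule are all assumptions.

```python
import sqlite3


def load_tenders(rows):
    """Load scraped (supplier, amount) rows into a normalised in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE participant (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE tender (
            id INTEGER PRIMARY KEY,
            participant_id INTEGER REFERENCES participant(id),
            amount REAL
        );
        """
    )
    for supplier, amount in rows:
        # Crude cleaning: collapse whitespace and ignore letter case.
        name = " ".join(supplier.split()).upper()
        db.execute("INSERT OR IGNORE INTO participant (name) VALUES (?)", (name,))
        (pid,) = db.execute(
            "SELECT id FROM participant WHERE name = ?", (name,)
        ).fetchone()
        db.execute(
            "INSERT INTO tender (participant_id, amount) VALUES (?, ?)", (pid, amount)
        )
    return db


def totals_by_participant(db):
    """Return (name, total volume) pairs, one per participant."""
    query = """
        SELECT p.name, SUM(t.amount)
        FROM tender t JOIN participant p ON p.id = t.participant_id
        GROUP BY p.name ORDER BY p.name
    """
    return db.execute(query).fetchall()
```

Once records are linked this way, totals per participant become a single aggregate query.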
||||
## Features
|
||||
|
||||
From the main page, it is possible to explore data about tenders in realtime and to change the textual query and immediately get information on the total volume for a particular industry, participant, and/or period of time.
|
||||
|
||||
Additionally, clicking on total volume yields all tenders therein. For each company participating in a tender, the database contains information on all other deals which the company has won. Recently, an "advanced search" page has been added, with the possibility to export results in a simple and portable CSV format.
|
||||
|
||||
## Impact and coverage of the project
|
||||
|
||||
One year into the project's existence, the site reached about 1,500 daily
|
||||
users, despite having almost zero advertising. It has gained
|
||||
attention and been used by investigative journalists as well. Some
|
||||
stories were published in the biggest independent
|
||||
internet outlet, Ukrainian Pravda, which has approximately 200,000 readers per day.
|
||||
|
||||
In Autumn 2012, a joint project with Forbes.ua called "Champions of
|
||||
tenders" was launched. The Texty team shared the open part of their data, information about
|
||||
deals from their database (including the names of firms and volumes of money),
|
||||
through a simple web API. Next, the team from Forbes.ua used the data in
|
||||
their database to link firms to names of owners; Forbes.ua maintains a
|
||||
proprietary database of these. The Texty team also made an [interactive
|
||||
visualization of this data](http://forbes.ua/ratings/people) for Forbes.ua.
|
||||
|
||||
<a href="http://www.flickr.com/photos/94746900@N06/8895650387/" title="thumbnail by anderspedersenOKF, on Flickr"><img src="https://farm9.staticflickr.com/8123/8895650387_c1f6582979_o.jpg" width="600" height="373" alt="thumbnail"></a>
|
||||
|
||||
## Impact of open tender data
|
||||
|
||||
Since 2008, when information about tenders became openly available for
|
||||
the first time, there has been a shift in public opinion about
|
||||
tenders and public spending on procurement. Today there seems to be a
|
||||
real awareness about corruption in procurements, though still not a
|
||||
clear idea about the actual scale of the problem. For example, there is
|
||||
even a TV-programme on the channel TVi, opposing the government, called
|
||||
"Tenders News".
|
||||
|
||||
Ukraine has a couple of projects about tenders, though Texty appears to
|
||||
be the most sizeable and complete database. There have, however, been continuing lobbying attempts to close down access to
|
||||
as much information about tenders as possible, and many of these have
|
||||
unfortunately been successful. The most recent example was a law accepted by a
|
||||
majority of the Ukrainian parliament in Autumn 2012, which meant that 35% of
|
||||
all volumes of tenders would be hidden from the public.
|
||||
|
||||
The ongoing hope for transparency in public procurement is based on a
|
||||
proposed agreement about association between Ukraine and the EU, which
|
||||
includes requirements about transparency in tenders.
|
||||
|
||||
**Next**: [OpenTED, Opening Tender Electronic Daily](../opented/)
|
||||
|
||||
**Up**: [Case Studies: Procurements](../)
|
||||
@@ -0,0 +1,29 @@
|
||||
---
|
||||
lead: true
|
||||
title: Caring for My Neighbourhood
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
*This post was written by Gisele Craveiro, of the University of São Paulo, member of [OKFN Brazil](http://br.okfn.org/) and one of the coordinators of GPoPAI (Research Group of Public Politics in Access to Information).*
|
||||
|
||||
The public budget should express the population's needs and priorities and its implementation should be as transparent as possible. In Brazil, the municipal budget implementation details must be published on the web daily, but even in the case where this law is acted upon, the reality is that very few people understand them.
|
||||
|
||||
The ["Caring for my neighbourhood"](http://www.gpopai.usp.br/cuidando) project aims to help society understand budget issues through better spending oversight.
|
||||
|
||||
<img alt="" src="http://farm8.staticflickr.com/7274/7604750834_a7ec37ee8a_z.jpg" title="Caring for My Neighbourhood" class="alignnone" width="640" height="480" />
|
||||
|
||||
To achieve this objective, all expenditure related to public facilities in São Paulo is geolocated and shown on a website. This will support training activities in the community. We aim to promote citizen engagement by showing users which projects can be found in their area.
|
||||
|
||||
By providing an easy visualisation of many individual expenses placed on a map, the tool may lead people to make a link between governmental action and something tangible in their everyday lives. The tool shows on the map the expense description, the amount of resources allocated to it, and the amount spent so far. The data thus becomes more understandable, and residents can keep track of what is happening in their neighbourhood.
|
||||
|
||||
We hope that comparison with other areas of the city can give the community and citizens better-informed arguments during budget formulation and other decision-making processes. We hope that it can contribute to better income distribution and a more efficient fight against corruption.
|
||||
|
||||
Besides the tool, we will develop content about public budget concepts in order to support activities in the community. We will also organize mapping fests so participants can get to know the neighbourhood and the public facilities that are receiving investment. We intend that the collected information (maps, photos, videos, texts), produced during these activities or later, can constitute a crowdsourcing platform for future monitoring and also feed open platforms like OpenStreetMap.
|
||||
|
||||
Researchers from the University of São Paulo (also OKFN members) and Our São Paulo Network (a network of over 600 civil society organizations operating in the municipality of São Paulo) are organising this initiative, but we'd like to invite anyone interested to contribute by sending suggestions, coding, or just sharing this project with anyone it may concern. For more information, contact Gisele Craveiro (giselesc at usp dot br).
|
||||
|
||||
The beta version of the tool can be found at <http://www.gpopai.usp.br/cuidando> (only in Portuguese). The code is available at <https://github.com/fefedimoraes/orcamento>.
|
||||
|
||||
**Next**: [Open Knowledge Foundation Greece](../okfn-greece/)
|
||||
|
||||
**Up**: [Case Studies: Spending](../)
|
||||
@@ -0,0 +1,115 @@
|
||||
---
|
||||
lead: true
|
||||
title: EU Spending Data
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<div class="well">This segment is based on a community call organised on 18 February
|
||||
2013 with
|
||||
additional input from <a href="https://twitter.com/ronpatz">Ronny Patz</a>,
|
||||
Transparency International, Brussels Office.</div>
|
||||
European spending programmes have been undergoing increased scrutiny
|
||||
among journalists and CSOs in recent years. In this section, we will examine the
|
||||
access to data from the EU structural funds as well as the EU Commission
|
||||
spending through the Financial Transparency System (FTS) and discuss
|
||||
what actors are involved in data-driven analysis and campaigning
|
||||
around this data.
|
||||
|
||||
## Structural funds
|
||||
|
||||
Following the Common Agricultural Policy, which has been covered by
|
||||
[Farmsubsidy.org](../farmsubsidy/), data from the structural funds have
|
||||
been considered the most important spending data by both journalists and CSOs. In 2010, the Financial Times and the Bureau of
|
||||
Investigative Journalism published a project including an extensive
|
||||
database [mapping the structural
|
||||
funds](http://datajournalismhandbook.org/1.0/en/case_studies_1.html) across
|
||||
the 2007-2013 EU budget ([more info](http://blog.okfn.org/2011/03/08/a-kafkaesque-data-trail-the-hunt-for-europes-hidden-billions/)).
|
||||
The project was rightly heralded as
|
||||
groundbreaking for its level of detail and dedication and its cross-border
|
||||
setup. Three years later, however, it is also clear that such centrally
|
||||
initiated projects have limitations, and therefore challenges
|
||||
remain when thinking about Europe-wide spending transparency.
|
||||
|
||||
### Data issues
|
||||
|
||||
The FT-TBIJ structural funds investigation exposed a series of barriers
|
||||
which limit the use of structural funds for CSOs and journalists:
|
||||
|
||||
* Poor data quality
|
||||
|
||||
* Lack of access to data in machine-readable formats; in practice, data
|
||||
is often published as PDF, since no format for spending data has
|
||||
been specified in EU regulation
|
||||
|
||||
* A dispersed model of distribution across regions in Europe from
|
||||
dozens of local sites without a centralised European clearinghouse
|
||||
|
||||
### Community challenges
|
||||
|
||||
We’ve identified a few important points that should be noted from the
|
||||
project:
|
||||
|
||||
* **Media outlets are unlikely to build and sustain long-term data
|
||||
projects**. Though the project provided a unique insight into
|
||||
structural funds, it was not the intention of the publishers to
|
||||
develop a long-term model for tracing and publishing structural
|
||||
fund payments. Though non-profit media institutions do offer a few
|
||||
[important exceptions](http://projects.propublica.org/docdollars/),
|
||||
we also know that maintaining databases is costly and is widely
|
||||
considered to provide too little value for ongoing beat journalism inside
|
||||
newsrooms.
|
||||
* **CSOs have not addressed systemic needs for data**. Despite receiving
|
||||
wide recognition from CSOs regarding the importance of the FT-TBIJ
|
||||
structural funds investigation, the European CSO community has
|
||||
remained unable to address the need for continuous data flows and
|
||||
analytical capacity. Several CSOs do, however, provide extensive
|
||||
coverage of European data across topics like FOI ([AccessInfo](http://www.access-info.org)),
|
||||
lobbyism ([Corporate Europe](http://corporateeurope.org), [Alter-EU](http://www.alter-eu.org)) and the green economy
|
||||
([Bankwatch](http://bankwatch.org), [Friends of the Earth](http://foei.org)).
|
||||
* **Improvements in access to data still seem to be largely supply-driven**. There are national governmental initiatives addressing the
|
||||
lack of access to spending data such as [Open
|
||||
Coesione](http://www.opencoesione.gov.it/) from the Italian
|
||||
government (launched summer 2012), which publishes data on Italian
|
||||
structural funds from 450,000 development projects worth € 33.4
|
||||
billion. Project Lead [Luigi Reggi](http://luigireggi.eu) is
|
||||
regularly engaging with data journalists and the wider public
|
||||
through open data events and social media.
|
||||
|
||||
[Luigi Reggi](http://luigireggi.eu) has also mapped the accessibility of
|
||||
EU structural funds data across the EU.
|
||||
|
||||

|
||||
|
||||
<small>Rating of the accessibility of data
|
||||
from the structural funds. Red: PDF only. Dark green: machine
|
||||
readable. Source: [luigireggi.eu](http://luigireggi.eu)</small>
|
||||
|
||||
## Future perspectives
|
||||
|
||||
The data from the EU structural funds is an example where CSOs and media
|
||||
outlets are still falling short of the potential of covering already-available spending data in individual countries as well as across
|
||||
borders.
|
||||
|
||||
At this moment, there seems to be no clear political momentum within the
|
||||
EU for requiring data from the structural funds to be published at a
|
||||
central site (e.g. on the European Data Portal).
|
||||
|
||||
New rules will, however, mandate publishing of structural fund data
|
||||
according to certain dimensions or fields, which provides a small step in
|
||||
the right direction ([page
|
||||
157](http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=COM:2012:0496:FIN:EN:PDF)).
|
||||
|
||||
## Financial Transparency System (FTS)
|
||||
|
||||
Spending reported under the FTS consists of both [EU Commission spending
|
||||
and grant funding](http://community.openspending.org/research/eu/) provided to
|
||||
programmes such as research, education, and foreign aid. This is likely the best documented part of the EU budget, though it is not transactional spending data, as it only provides project funding data
|
||||
and not actual transactions from either EU agencies or project recipients.
|
||||
|
||||
An increase in the minimum threshold is under consideration,
|
||||
however, which would cause a decrease in access to a substantial amount of
|
||||
payments.
|
||||
|
||||
**Next**: [FarmSubsidy.org](../farmsubsidy/)
|
||||
|
||||
**Up**: [Case Studies: Spending](../)
|
||||
@@ -0,0 +1,32 @@
|
||||
---
|
||||
lead: true
|
||||
title: Farm subsidies in Mexico
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
|
||||
Project: <a href="http://subsidiosalcampo.org.mx/">Farm subsidy database for Mexico</a> by <a href="http://www.fundar.org.mx/">Fundar</a>
|
||||
|
||||
## About the project:
|
||||
|
||||
Fundar joined with two other organisations, including the University of California, Santa Cruz, to build the first farm subsidy database in
|
||||
Mexico. Fundar received some technical assistance from the Environmental
|
||||
Working Group, which operates the US farm subsidy database.
|
||||
|
||||
Currently Fundar is planning to relaunch the database with a new interface and new
|
||||
features. One part-time consultant and two developers work on the database,
|
||||
which is strongly prioritised by the organisation.
|
||||
|
||||
## Experiences
|
||||
|
||||
The farm subsidy database has seen substantial use from journalists
|
||||
using the various search functionalities. During the last federal
|
||||
election the database became an important source for both local and
|
||||
national media outlets for monitoring politicians.
|
||||
|
||||
Collaboration between communities working on farm subsidies in Mexico, USA and Europe could be explored in order to share experiences on how to create useful tools for displaying and analysing farm subsidy data.
|
||||
|
||||
**Next**: [India Spend](../india-spend/)
|
||||
|
||||
**Up**: [Case Studies: Spending](../)
|
||||
@@ -0,0 +1,50 @@
|
||||
---
|
||||
lead: true
|
||||
title: FarmSubsidy.org
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<div class="well">Project: <a href="http://farmsubsidy.org/">FarmSubsidy.org</a></div>
|
||||
Every year, the EU pays almost €60 billion to farmers and the farming industry under the Common Agricultural Policy, making it the EU's biggest single spending programme. In 2005, journalists from the UK, Sweden, the Netherlands, and Denmark teamed up to get hold of the data country by country. Finland, Poland, Portugal, regions of Spain, Slovenia, and other countries soon followed. In some countries, such as Germany in 2007, the group had to go to court to get the data, which resulted in coverage from Stern and Stern online and fuelled the discussion of transparency.
|
||||
|
||||
The meaningful demand from Farm Subsidy for each farm subsidy payment to be made public became a powerful vehicle for measuring transparency in practice.
|
||||
Farm Subsidy publishes a transparency index annually, benchmarking all member states on the quality of their data releases. The data from Farm Subsidy as well as the analysis and outreach of the
|
||||
core team have resulted in substantial improvements in EU spending journalism
|
||||
since 2006. A selection of the [news stories generated from farm subsidy
|
||||
data](http://farmsubsidy.openspending.org/news/) is available.
|
||||
|
||||
## Challenges
|
||||
|
||||
- The uptake from journalists using the farm subsidy data as a continuous source for reporting was limited, most likely due to the size of the dataset as well as the poor quality of the data often submitted by governments.
|
||||
- Limited access to data since 2010.
|
||||
|
||||
## Ruling to shut down access
|
||||
|
||||
In 2010, the European Court of Justice (ECJ) decided that individual farmers should have the right to privacy when receiving funds
|
||||
from the CAP. The ECJ decision has *de facto* enabled governments to release data of [highly
|
||||
varying quality, granularity and
|
||||
consistency](http://farmsubsidy.org/news/features/2012-data-harvest/).
|
||||
|
||||

|
||||
|
||||
<small>Source: Farm Subsidy. Along with the annual retrieval of farm subsidy
|
||||
payments, Farm Subsidy produced a comprehensive index on farm subsidy
|
||||
spending transparency.</small>
|
||||
|
||||
In early 2013, Farm Subsidy approached OpenSpending suggesting the two
|
||||
projects collaborate around the hosting of the site as well as the
|
||||
annual data collection. In May, OpenSpending officially began hosting the
|
||||
site at
|
||||
[farmsubsidy.openspending.org](http://farmsubsidy.openspending.org).
|
||||
|
||||
At the annual [DataHarvest
|
||||
2013](http://www.journalismfund.eu/dataharvest13), the opportunities around farm subsidy investigations were covered at several sessions:
|
||||
|
||||
* How the experiences gained in retrieving farm subsidy data can be
|
||||
used for accessing other spending data
|
||||
* How journalism on farm subsidy spending can expand its focus as well
|
||||
as find more local use cases
|
||||
|
||||
**Next**: [Farm subsidies in Mexico](../farm-subsidies-mexico/)
|
||||
|
||||
**Up**: [Case Studies: Spending](../)
|
||||
@@ -0,0 +1,25 @@
|
||||
---
|
||||
lead: true
|
||||
title: 'Case Studies: Spending'
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||

|
||||
|
||||
In contrast to budget data, spending data reports on specific expenditures of funds, covering individual transactions that have actually taken place rather than categories of planned spending.
|
||||
|
||||
Our research has uncovered an interestingly contradictory state of affairs. In most countries, fine-grained transactional spending data is not easily available, and over the past few years, organisations and journalists in these countries have used FOI requests and otherwise created pressure for the release of such data. In a few countries, however, such spending data has been published—and it hasn't attracted much public attention!
|
||||
|
||||
In the following case studies, we examine how spending data can be used to strengthen community participation in public spending as well as to increase accountability on some of the biggest spending programmes in the EU. The cases also deal with the challenges of enhancing uptake of spending data among community members and journalists in the face of the data's often intimidating complexity and scale.
|
||||
|
||||
* [Caring for your neighbourhood](./caring-for-my-neighbourhood/)
|
||||
* [OKFN Greece](./okfn-greece/)
|
||||
* [EU Spending Data](./eu-spending-data/)
|
||||
* [Farmsubsidy.org](./farmsubsidy/)
|
||||
* [Farm subsidies in Mexico](./farm-subsidies-mexico/)
|
||||
* [India Spend](./india-spend/)
|
||||
* [Supervizor](./supervizor/)
|
||||
|
||||
**Next**: [Caring for My Neighbourhood](./caring-for-my-neighbourhood/)
|
||||
|
||||
**Up**: [Mapping the Open Spending Data Community](../)
|
||||
@@ -0,0 +1,47 @@
|
||||
---
|
||||
lead: true
|
||||
title: India Spend
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
[India Spend](http://www.indiaspend.com/) is one of India's first data journalism initiatives. Starting out with a tight remit to investigate spending practices in India from a journalistic standpoint, they have since branched out into other topics, such as the urbanisation of India, many of which have financial themes. Their reports are very well regarded, and other business newspapers pay a monthly fee for syndication of their reports.
|
||||
|
||||

|
||||
|
||||
India Spend mentioned a variety of issues in getting, working with, and presenting financial data in India. Here are a few of the most striking.
|
||||
|
||||
## Problem number 1
|
||||
|
||||
> "We have to start sourcing physical copies of the data, and the problem often is that paper copies are in local languages, which we don't speak."
|
||||
|
||||
### Any solutions?
|
||||
|
||||
To our knowledge, there is currently no simple method of automatically machine-translating datasets. The most effective machine translation option we know of is Google Translate, which has a fee-based API, and even this does not cover all of India’s languages.
|
||||
|
||||
## Problem number 2
|
||||
|
||||
> "The average government website in India isn't even PDFs, it's images."
|
||||
|
||||
### Any solutions?
|
||||
|
||||
See the [Tools Ecosystem Section](../../appendix/tool-ecosystem/) for a few tools to help extract information from image-based documents.
|
||||
|
||||
## Problem number 3
|
||||
|
||||
> "The level of literacy for visualisations in India is not high".
|
||||
|
||||
People struggle to interpret anything besides the simplest charts, so the India Spend team try to keep it simple. They have been experimenting with simple visualisation tools such as Tableau and GeoCommons, but there have been some complications. When trying to map locations in India, for example, they often found that the given longitude and latitude of a particular place were recorded incorrectly. This was much less of a problem when they mapped locations in other countries; it mainly affected data for India.
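Incorrectly recorded coordinates of this kind can often be flagged automatically before mapping. The following is a minimal sketch, not India Spend's own method: the bounding box is a rough approximation chosen for illustration, the sample records are hypothetical, and `check_point` is a name invented here.

```python
# Approximate bounding box for India (an assumption for this sketch;
# use an authoritative boundary file for real work).
INDIA_LAT = (6.0, 38.0)
INDIA_LON = (68.0, 98.0)


def in_india(lat, lon):
    """Return True if the point falls inside the rough bounding box."""
    return (INDIA_LAT[0] <= lat <= INDIA_LAT[1]
            and INDIA_LON[0] <= lon <= INDIA_LON[1])


def check_point(lat, lon):
    """Classify a coordinate pair as 'ok', 'swapped', or 'suspect'."""
    if in_india(lat, lon):
        return "ok"
    # A common recording error is latitude and longitude in the wrong order.
    if in_india(lon, lat):
        return "swapped"
    return "suspect"


# Hypothetical records: (place, latitude, longitude)
records = [
    ("Mumbai", 19.08, 72.88),
    ("Delhi", 77.21, 28.61),     # latitude and longitude swapped
    ("Chennai", 13.08, -80.27),  # wrong sign on the longitude
]

for place, lat, lon in records:
    print(place, check_point(lat, lon))
```

A check like this only catches gross errors; points that are wrong but still inside the box need comparison against a gazetteer or manual review.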
|
||||
|
||||
## A few conclusions
|
||||
|
||||
Common trends in the types of data required:
|
||||
|
||||
* To advance their work, India Spend really need performance and program data, but this simply is not available in India.
|
||||
* Output-level data is needed to be able to compare what was promised against what actually happened.
|
||||
* Information on the original instructions given to people compiling the data within governments is needed in order to understand what assumptions were made and what is and is not included in a given category.
|
||||
|
||||
<em>See the full list of organisations we visited on the India trip on the <a href="http://in.okfn.org/2012/09/18/okfn-india-trip-the-roundup/">OKFN-India blog</a></em>.
|
||||
|
||||
**Next**: [Supervizor, Slovenia](../supervizor/)
|
||||
|
||||
**Up**: [Case Studies: Spending](../)
|
||||
@@ -0,0 +1,95 @@
|
||||
---
|
||||
lead: true
|
||||
title: Open Knowledge Foundation Greece
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
<div class="well">Project: Tracing financial reporting from the Cl@rity program.</div>
|
||||
*This project review is based on the detailed review of Thodoris
|
||||
Papadopoulos, Open Knowledge Foundation Greece, Local Group.*
|
||||
|
||||
Open Knowledge Foundation Greece is monitoring the financial reporting systems in
|
||||
Greece during the financial crisis, resulting in detailed reviews of
|
||||
the rapid legal and technical changes in the Greek financial
|
||||
reporting system. This has ultimately led to several proposals for technical changes to the reporting standards.
|
||||
|
||||

|
||||
|
||||
## About the project
|
||||
|
||||
In Greece, the most important piece of legislation in recent years
|
||||
regarding the transparency of government action is the [Cl@rity](http://diavgeia.gov.gr/) (Diavgeia)
|
||||
program, which introduced the obligation to publish all the decisions of
|
||||
government and all administrative entities on the Internet.
|
||||
Cl@rity aims to generate maximum
|
||||
accessibility of government policy and administrative actions. Since the
|
||||
programme was launched in October 2010, almost 6 million administrative
|
||||
decisions have been uploaded to the Cl@rity website, with a daily
|
||||
average of 14,000 decisions.
|
||||
|
||||
With the Cl@rity program, the enforceability of any administrative act
|
||||
presupposes a previous announcement on the Internet. Furthermore,
|
||||
Cl@rity provides an open data API in XML and JSON formats through which
|
||||
everyone can have structured access to all decisions, along with their
|
||||
metadata, ensuring openness and further dissemination of public
|
||||
information.
|
||||
|
||||
The Cl@rity initiative has already had a quiet but significant
|
||||
effect on the way authorities handle their executive power. It leaves
|
||||
considerably less room for corruption and exposes it much more easily
|
||||
when it takes place, since any citizen or interested party can openly
|
||||
access any questionable acts. This is a scheme of “collective scrutiny”
|
||||
that can be effective, since it allows citizens directly involved or
|
||||
concerned with an issue to scrutinize it in depth rather than leaving
|
||||
it to the traditional media, whose choice of issues often is restricted
|
||||
and oriented towards “safe” topics.
|
||||
|
||||
## Challenges
|
||||
|
||||
Although Cl@rity was not designed with financial monitoring in mind, it
|
||||
includes various decision types that include financial metadata
|
||||
(such as expenditure, budget, and contract data). From the outset, however,
|
||||
the Cl@rity programme has suffered from issues of poor data quality, including:
|
||||
|
||||
* Failure to provide a hierarchy of entities
|
||||
* Lack of validation rules for metadata fields and non-mandatory
|
||||
requirements
|
||||
|
||||
In reality, these issues have prevented citizens and journalists from
|
||||
utilizing the full potential of the Cl@rity program as a platform for
|
||||
public financial data.
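To illustrate what validation rules for metadata fields could look like, here is a minimal sketch. It does not reflect the actual Cl@rity schema or API: the field names (`organisation_id`, `decision_type`, `amount`, `issue_date`) and the function `validate_decision` are assumptions made up for this example.

```python
from datetime import date

# Hypothetical mandatory fields; the real Cl@rity metadata may differ.
REQUIRED_FIELDS = ("organisation_id", "decision_type", "amount", "issue_date")


def validate_decision(record):
    """Return a list of problems found in one decision's metadata."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    amount = record.get("amount")
    if amount not in (None, ""):
        try:
            if float(amount) < 0:
                problems.append("negative amount")
        except (TypeError, ValueError):
            problems.append("amount is not a number")

    issued = record.get("issue_date")
    if issued not in (None, ""):
        try:
            date.fromisoformat(issued)
        except (TypeError, ValueError):
            problems.append("issue_date is not an ISO date")
    return problems


good = {"organisation_id": "ORG-1", "decision_type": "expenditure",
        "amount": "1500.00", "issue_date": "2012-11-05"}
bad = {"organisation_id": "", "decision_type": "expenditure",
       "amount": "1.500,00", "issue_date": "05/11/2012"}

print(validate_decision(good))
print(validate_decision(bad))
```

Running such checks at upload time, rather than leaving every field optional and free-form, is one way a publisher could prevent the quality problems listed above from reaching the published data.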
|
||||
|
||||
## Project results
|
||||
|
||||
Open Knowledge Foundation Greece highlights two major achievements from their data quality review and proposals:
|
||||
|
||||
1. For the Cl@rity programme, substantial improvements had been implemented by the end of 2012. These corrections included changes
|
||||
of significant importance to journalists and CSOs, such as improving
|
||||
the quality of transactional spending data.
|
||||
2. For a supplementary information system to support the Cl@rity
|
||||
program, several suggestions brought forward by Open Knowledge Foundation Greece have been
|
||||
incorporated into the data architecture. The new system will for this
|
||||
reason be more responsive and accurate and will provide a more detailed
|
||||
data model with many more metadata fields. The new system is
|
||||
expected to be delivered by the end of 2013. The owner of the system, the [Greek Ministry of Administrative Reform and e-Governance](http://www.ydmed.gov.gr/), has undertaken to provide
|
||||
the source code of the system through the [European Open Source
|
||||
Observatory and
|
||||
Repository](http://joinup.ec.europa.eu/community/osor/description) under
|
||||
an [EUPL](http://joinup.ec.europa.eu/software/page/eupl) licence.
|
||||
|
||||
### Links about reuse of this data
|
||||
|
||||
* A [short video](https://vimeo.com/46543472) about researchers at the National Technical University
|
||||
of Athens and Students of WebScience and active members of OKFN
|
||||
Greece from the University of Thessaloniki
|
||||
* PublicSpending.gr (currently unavailable due to maintenance), also
|
||||
documented in [this academic
|
||||
article](http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2193600)
|
||||
* [http://greekspending.com/](http://greekspending.com/)
|
||||
|
||||
**Next**: [EU Spending Data](../eu-spending-data/)
|
||||
|
||||
**Up**: [Case Studies: Spending](../)
|
||||
@@ -0,0 +1,31 @@
|
||||
---
|
||||
lead: true
|
||||
title: Supervizor, Slovenia
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
In Slovenia, the anti-corruption agency has published the financial transparency site [Supervizor](http://supervizor.kpk-rs.si/) since 2011.
|
||||
|
||||

|
||||
|
||||
The Supervizor application:
|
||||
|
||||
- contains over 50 million transactions from both government and local agencies to government contractors from 2003 to 2013
|
||||
- matches such transactions to company records, including director lists and corporate leadership
|
||||
- uses the actual bank transactions from public bank accounts at the Slovenian National Bank as source data, making the records highly reliable
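The matching of transactions to company records described above can be pictured as a join on a shared company identifier. The sketch below is an illustration only, not Supervizor's implementation: the identifiers, names, amounts, and the function `totals_by_director` are all invented for the example.

```python
from collections import defaultdict

# Hypothetical sample data; identifiers and names are invented.
companies = {
    "C001": {"name": "Alpha d.o.o.", "directors": ["A. Novak"]},
    "C002": {"name": "Beta d.d.", "directors": ["B. Kranjc", "C. Zupan"]},
}
transactions = [
    {"payer": "Ministry X", "company_id": "C001", "amount": 1200.0},
    {"payer": "Municipality Y", "company_id": "C002", "amount": 800.0},
    {"payer": "Ministry X", "company_id": "C001", "amount": 300.0},
    {"payer": "Ministry X", "company_id": "C999", "amount": 50.0},  # no match
]


def totals_by_director(transactions, companies):
    """Sum payments per director; also return transactions with no company match."""
    totals = defaultdict(float)
    unmatched = []
    for tx in transactions:
        company = companies.get(tx["company_id"])
        if company is None:
            unmatched.append(tx)
            continue
        # Each director of the receiving company is credited with the payment.
        for director in company["directors"]:
            totals[director] += tx["amount"]
    return dict(totals), unmatched


totals, unmatched = totals_by_director(transactions, companies)
print(totals)
print(len(unmatched), "unmatched transaction(s)")
```

The join only works when both datasets use the same identifier for each company, which is why consistent identifiers matter so much for this kind of analysis.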
|
||||
|
||||
According to the developer behind the platform, the granularity of the data has enabled statisticians and analysts to make statistical models available, including a [Hidden Markov model](http://en.wikipedia.org/wiki/Hidden_Markov_model) of the data.
|
||||
|
||||
Over the years, the platform has helped identify shifts in contractor purchases around changes in the political leadership.
|
||||
|
||||
## Lack of impact?
|
||||
|
||||
Despite its comprehensive content and site visualisations, however, the site has not received any significant response from journalists or CSOs. While it is not reasonable to speculate based on the available interview, we find it worth considering why one of the most transparent spending platforms should suffer from limited use.
|
||||
|
||||
Links:
|
||||
|
||||
* [Feature in Techcrunch](http://techcrunch.com/2011/08/23/slovenia-launches-supervizor-an-official-public-web-app-for-monitoring-public-spending/) from 2011
|
||||
|
||||
**Next**: [Case Studies: Procurements](../../case-studies-procurements/)
|
||||
|
||||
**Up**: [Case Studies: Spending](../)
|
||||
@@ -0,0 +1,42 @@
|
||||
---
|
||||
lead: true
|
||||
title: Conclusions
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
The voices and stories in this report highlight the diversity of the community working at the intersection of technology and government finance. We conclude this report by summarizing the demands CSOs are making for spending data and our recommendations for how they can pursue those demands and use the data to engage citizens with public finance.
|
||||
|
||||
## Data: demands and obstacles
|
||||
|
||||
It is impossible to deny that civil society is interested not only in open financial data but also in other forms of data that are necessary to make sense of public finance.
|
||||
|
||||
We encountered strong demand for the traditional categories of financial data, namely budgets, transactional spending data, and procurement and contract data. We also encountered interest in data on the *entities* referenced in spending data. Such information is necessary to contextualize and understand financial data, particularly procurement data:
|
||||
|
||||
* **Company information**. Data on companies is necessary in order to be able to answer questions related to policy and outsourcing decisions, the integrity and effectiveness of the contracting process, and the track records of companies contracting with governments, to name just a few applications. The <a href="http://opencorporates.com/downloads/ogp_company_data_report.pdf">OpenCorporates report</a> on open corporate data in Open Government Partnership Countries provides more information on company data.
|
||||
* Information on **people in positions of power** and their business and family associations. Such data is needed to answer key questions on responsibility for political decisions, the relationships between winners of tenders and government actors, and so on.
|
||||
|
||||
There have been a few attempts to standardise and collect this biographical information, including [Popit from MySociety](http://popit.mysociety.org/) and [Popolo from OpenNorth and the Participatory Politics Foundation](http://blog.opennorth.ca/2013/02/21/update-on-opengovernment/). A coordinated international effort is required in order to maintain this enormous and rapidly evolving dataset, however.
|
||||
|
||||
CSO demand for open spending data (and contextual data) goes well beyond simply asking for information. Our CSO interviewees have been in consensus about the *technical* impediments presented by existing disclosure practices and about the need for *machine-readable, bulk-downloadable data*. Lack of such data is an obstacle to transparency and accountability. Further details are provided in the [appendix on machine-readable data](../appendix/).
|
||||
|
||||
Besides bad data publishing practices, the CSOs we interviewed reported similar problems with obstruction from governments. Privacy concerns in particular were reported as frequently cited by governments as an excuse for not releasing granular data; similarly, many governments argue that contracts are commercially confidential and for this reason only agree to disclose redacted contracts (if any at all). Our interviewees believe these concerns are often raised in bad faith as an excuse. To help counter qualms about privacy, we have prepared a [privacy guide](../appendix/privacyguide/) showing how technology and good data management practices make it possible to develop a workflow which allows sensitive information to be redacted prior to publication.
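As a rough illustration of one step in such a workflow, the sketch below replaces the names of individual recipients with stable pseudonyms before publication while leaving company names and amounts intact. This is an assumption-laden example rather than the method in the privacy guide: the field names (`recipient`, `recipient_type`, `amount`), the function `redact`, and the salted-hash approach are all choices made here for illustration.

```python
import hashlib


def redact(record, salt):
    """Return a copy safe to publish: individuals' names become pseudonyms."""
    out = dict(record)  # never modify the internal record in place
    if record["recipient_type"] == "individual":
        digest = hashlib.sha256(
            (salt + record["recipient"]).encode("utf-8")
        ).hexdigest()
        # The same person always gets the same pseudonym, so payments
        # can still be grouped without revealing the name.
        out["recipient"] = "person-" + digest[:10]
    return out


# Hypothetical payment records.
payments = [
    {"recipient": "Jane Doe", "recipient_type": "individual", "amount": 100},
    {"recipient": "Acme Ltd", "recipient_type": "company", "amount": 5000},
]

SECRET_SALT = "keep-this-value-private"  # placeholder; must not be published
published = [redact(p, SECRET_SALT) for p in payments]
for row in published:
    print(row)
```

Pseudonymisation of this kind is weaker than full anonymisation, since amounts and dates can still identify people in small datasets, so a real workflow would need review by someone familiar with the applicable privacy rules.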
|
||||
|
||||
## Recommendations: consolidation, training, strategy
|
||||
|
||||
From our interviews with CSOs, we have identified three areas of opportunity for enhancing CSO use of spending data. The variety of projects and diversity of skills in the CSO community present an opportunity for *knowledge consolidation* and knowledge sharing. We see a need for *training* to address both technical issues and lack of awareness of the potential of open spending data. This knowledge-sharing and training must take place in the context of a *diverse and inclusive* coalition, particularly a multilingual one. Finally, CSOs would benefit from articulating a shared *strategy* for pressuring governments to publish data.
|
||||
|
||||
We have observed a great diversity of CSO experience and expertise using spending data. We believe it is time for CSOs working with public finance data to share this diverse knowledge with one another. While most of the projects considered in this study developed as tailor-made solutions to context-bound problems, adapting the available data to the focus of a particular project, each of these projects could be used as a *template* for future projects. We hope the case studies featured in this report facilitate such reuse of good ideas.
|
||||
|
||||
Alongside this diversity of expertise, we have found recurring problems with key skills in the data-wrangling toolkit, namely web scraping, PDF liberation, and FOI requests. Focused trainings around these skills would prove valuable in promoting uptake of public spending data. The appendix on the [ecosystem of tools](../appendix/tool-ecosystem) provides further details on recurring issues and how they can be tackled, providing concrete anchors for technical trainings.
|
||||
|
||||
Technical skills are not the only opportunity for training: there is also need for training in the concept of data-driven investigation itself. In some countries where high-quality spending data has been made available, we have found no journalists or CSOs making use of it. Despite Slovenia arguably having one of the leading models for publishing government transactions since 2011, this data has seen very little uptake, and despite the existence of the Cl@rity programme in Greece, we struggled to find interested CSOs or journalists. This issue requires further attention, but it suggests that more needs to be done to bring the potential of open spending data to the attention of civil society.
|
||||
|
||||
Raising awareness and sharing knowledge will not happen unless a concerted effort is made to overcome the *anglocentric* tendency of the existing training materials and discussion around open data. This report is no exception to this tendency: besides being written in English, it focused on those CSOs who were able to conduct interviews in English. Efforts must be made to conduct trainings in the native languages of those countries where open data advocacy is taking place. Translations must also be made of key resources such as training materials and fact sheets on data standards. This will be especially necessary, and especially challenging, in massively multilingual countries like India.
|
||||
|
||||
Our final recommendation is that CSOs arm themselves with a coherent strategy for pressuring governments to publish open data based around the promotion of best practices. Countries like Slovakia and the UK present clear examples of successful open data policies that could be used by CSOs in consultations with governments in plans for Open Government Partnership meetings. Best-practice examples like these are valuable rhetorical tools in the fight for disclosure.
|
||||
|
||||
CSOs developing such strategies would benefit from being aware of the wide variety of factors that lead to the release of better spending data. Recent improvements to spending transparency in Greece and Slovenia, for example, appear to be driven by factors not related to public pressure for transparency. Greece's financial crisis has boosted centralised fiscal management, with the striking result that Greece has rapidly gone from one of the least to one of the most financially transparent European states. In Slovenia, the [Supervizor](https://www.kpk-rs.si/en/project-transparency/supervizor-73) project has been driven by a combination of a strong independent anti-corruption commission and access to pro-bono development resources to prototype and develop the models. Attention to the concrete historical conditions responsible for success documented in this report can help CSOs develop their approach to engagement with governments.
|
||||
|
||||
**Next**: [Appendix](../appendix/)
|
||||
|
||||
**Up**: [Mapping the Open Spending Data Community](../)
|
||||
@@ -0,0 +1,67 @@
|
||||
---
|
||||
lead: true
|
||||
title: Summary of recommendations
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
In the rest of this report, we present the status quo in CSO usage of government spending data in the form of case studies on successful data-driven projects.
|
||||
|
||||
In this chapter, we highlight the common difficulties faced by these CSOs and the opportunities they see for improvements to the useability of fiscal data, as well as our recommendations for how those difficulties can be overcome and those improvements achieved. A fuller exposition of our recommendations can be found in the report's [conclusion](../conclusions/).
|
||||
|
||||
## More and better data needed
|
||||
|
||||
There is strong demand not only for more data about public revenues and expenditures but also for data that is of higher quality and is more useable. While many CSOs consulted for this study engaged in sophisticated analysis and uncovered subtle connections, in terms of person-hours, the bulk of their work consisted of merely collecting data and refining it.
|
||||
|
||||
Our first and most important recommendation is therefore that CSOs push their governments to proactively release data on the full range of public finances in a machine-readable and accessible form. A later chapter of the report provides our [guidelines for financial data](../appendix/open-budgets-open-data/).
|
||||
|
||||
### Types of data to demand
|
||||
|
||||
Many countries still do not make data available on important areas of public spending. There is a demand for data in each of the following categories:
|
||||
|
||||
* Budgets
|
||||
* Spending (crucially at transaction level)
|
||||
* Procurement and contracts
|
||||
* Revenues
|
||||
* [Contextual information](../appendix/other-handy-datasets/) (e.g. demographics, geodata, targets & outputs)
|
||||
|
||||
### Relevance
|
||||
|
||||
As one interviewee remarked, it does not take much to render data unusable. It is important that data publishers take steps to ensure that published spending data actually contributes to citizen engagement with public finances. These steps include:
|
||||
|
||||
* releasing data proactively
|
||||
* releasing data regularly and in a timely fashion
|
||||
* making data available at international (e.g. EU farm subsidies), national, and local levels
|
||||
* ensuring consistency of data (e.g. consistent identifiers for companies)
|
||||
* publishing reference data, code sheets, and metadata
|
||||
* publishing data with an [open license](http://opendefinition.org/licenses/) to promote reuse
|
||||
|
||||
### Machine-readability
|
||||
|
||||
Publishing data in unstructured and non-machine-readable formats wastes time and prevents many projects from getting off the ground. Data should be published in a form that is transparent to computational processing. CSOs should push governments to:
|
||||
|
||||
* publish data in a [machine-readable format](../appendix/machinereadfaq/) (no PDFs, Word documents, or HTML tables)
|
||||
* provide a bulk download option: no CAPTCHA codes, download limits, etc.
|
||||
|
||||
## Opportunities for knowledge-sharing and engagement
|
||||
|
||||
More needs to be done to promote publishing standards and best practices between countries. CSOs consulted for this study readily identified countries with exemplary publishing practices (e.g. Slovakia's procurement data and the UK's transaction-level spending data). While we acknowledge that there is still work to be done in both cases, country practices like these should be recognised as being at the forefront of open data policy and used as examples to help civil society initiatives demand more from their own countries.
|
||||
|
||||
Successful cooperation between CSOs and governments has shown that such engagement can lead to greater financial transparency and better data. The [Supervizor](https://www.kpk-rs.si/en/project-transparency/supervizor-73) project in Slovenia, for example, was driven by a combination of a strong independent anti-corruption commission and access to pro-bono development resources to prototype and develop the models. CSOs will play a crucial role in the future of fiscal transparency, not least through their engagement with governments.
|
||||
|
||||
Finally, there is a major opportunity for transparency advocates and open data groups to work together. These two communities bring different focuses and areas of expertise, the combination of which may be very powerful. Transparency organisations bring knowledge of government policy and contextual experience that technical experts often lack; conversely, open data hackers understand data processing and the programmer community better than most policy experts. This opportunity has been partly obscured by superficial disagreements over terminology that could be clarified by greater explicitness around key terms like "open".
|
||||
|
||||
## Training needs
|
||||
|
||||
Civil society groups around the world could benefit from training and support in key areas. While the pool of skills is hugely diverse across the CSO community, key parts of the data pipeline have consistently proven to be problems that steal time from activism and other parts of an organisation's work.
|
||||
|
||||
We see strong potential in offering focused training around these key needs:
|
||||
|
||||
* Web scraping
|
||||
* Liberating and cleaning PDFs
|
||||
* FOI request skills
|
||||
|
||||
This last need, FOI request skills, should be emphasized: "technical" training is not the only training necessary. Despite recent increases in proactive data publication, FOI requests remain of prime importance for getting hold of information on public money, and CSOs interested in public spending require training in their effective use.
|
||||
|
||||
**Next**: [Case Studies: Budgets](../case-studies-budgets/)
|
||||
|
||||
**Up**: [Mapping the Open Spending Data Community](../)
|
||||
@@ -0,0 +1,46 @@
|
||||
---
|
||||
lead: true
|
||||
title: Mapping the Open Spending Data Community
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
|
||||
How civil society organisations use data on public finances
|
||||
|
||||

|
||||
|
||||
* [Introduction](./introduction/)
|
||||
* [Athens to Berlin: Video Series](./introduction/videos/)
|
||||
* [Map of Contacts](./introduction/map/)
|
||||
* [Timeline](./introduction/timeline/)
|
||||
* [Executive summary](./executive-summary/)
|
||||
* [Case Studies: Budgets](./case-studies-budgets/)
|
||||
* [Bani pierduti? (Lost Money)](./case-studies-budgets/lost-money/)
|
||||
* [OpenBudgetOakland](./case-studies-budgets/openbudgetoakland/)
|
||||
* [Centre for Budget and Governance Accountability](./case-studies-budgets/cbga/)
|
||||
* [Centre for Public Interest Advocacy](./case-studies-budgets/bosnia/)
|
||||
* [Expert Grup](./case-studies-budgets/expert-grup/)
|
||||
* [Case Studies: Spending](./case-studies-spending/)
|
||||
* [Caring for my Neighbourhood](./case-studies-spending/caring-for-my-neighbourhood/)
|
||||
* [Open Knowledge Foundation Greece](./case-studies-spending/okfn-greece/)
|
||||
* [EU Spending Data](./case-studies-spending/eu-spending-data/)
|
||||
* [FarmSubsidy.org](./case-studies-spending/farmsubsidy/)
|
||||
* [Farm subsidies in Mexico](./case-studies-spending/farm-subsidies-mexico/)
|
||||
* [India Spend](./case-studies-spending/india-spend/)
|
||||
* [Supervizor, Slovenia](./case-studies-spending/supervizor/)
|
||||
* [Case Studies: Procurements](./case-studies-procurements/)
|
||||
* [Hutspace](./case-studies-procurements/hutspace/)
|
||||
* [Texty](./case-studies-procurements/texty/)
|
||||
* [OpenTED, Opening Tender Electronic Daily](./case-studies-procurements/opented/)
|
||||
* [Case Studies: From Local to Global](./case-studies-other/)
|
||||
* [OpenSpending](./case-studies-other/openspending/)
|
||||
* [Open University](./case-studies-other/opening-university/)
|
||||
* [Conclusions](./conclusions/)
|
||||
* [Appendix](./appendix/)
|
||||
* [Putting the Open Data into Open Budgets](./appendix/open-budgets-open-data)
|
||||
* [Tool Ecosystem](./appendix/tool-ecosystem/)
|
||||
* [Common arguments against publishing data](./appendix/machinereadfaq/)
|
||||
* [How to publish spending data without disclosing personal information](./appendix/privacyguide/)
|
||||
* [Other handy datasets](./appendix/other-handy-datasets)
|
||||
|
||||
**Next**: [Introduction](./introduction/)
|
||||
@@ -0,0 +1,37 @@
|
||||
---
|
||||
lead: true
|
||||
title: Introduction
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
The unprecedented quantities of information on public finances now becoming available present an opportunity to create stronger and more participatory democracies, to campaign more effectively for social justice, and to hold power to account.
|
||||
|
||||
In this report, we look at how citizens, journalists, and civil society organisations around the world are using data on government finances to further their civic missions. From visualisations of the Romanian budget to analyses of procurements in Ghana, we look at how civil society groups are using data on public finances, what tools they are using, and what their needs are in this area.
|
||||
|
||||
When this work started in early 2012, our aim was threefold:
|
||||
|
||||
* *To identify civil society organisations (CSOs) around the world who are interested in working with government financial data*, building on the existing network of contacts from the [OpenSpending.org](http://openspending.org) project
|
||||
|
||||
* *To connect these CSOs with each other*, with open data communities, and with other key stakeholders to exchange knowledge, experiences, and best practices in relation to spending data
|
||||
|
||||
* *To discover how CSOs currently work with spending data, how they would like to use it, and what they would like to achieve*, including:
|
||||
|
||||
* What existing tools are being used
|
||||
* What current technical needs are unmet
|
||||
* What would be required to meet these needs and how feasible it is to tackle them
|
||||
|
||||
This report is the output of that research. It brings together key case studies from organisations who have done pioneering work in using technology to work with public finance data.
|
||||
|
||||
We have kept this report short and readable in order to make it accessible to the broadest possible audience. We believe that there are some very quick ways to make the work that CSOs do a lot easier, more thorough, and more sustainable. In this report we therefore:
|
||||
|
||||
* Examine some key case studies of how organisations are using technology to do groundbreaking research, citizen engagement, and tracking of accountability
|
||||
* Outline how data could be improved in order to make it more usable
|
||||
* Discuss the training needs for CSOs to help them better use the data available and to demand better data
|
||||
|
||||
At the end of the report, you will find an [appendix](../appendix) which outlines key tools of use to those investigating or working with government financial information. We hope that this is a useful resource for those building training curricula around financial data, and we would welcome any feedback if there is anything we have missed.
|
||||
|
||||
The team behind the report can be reached at any time on *info [at] openspending.org*. We would like to thank the Open Society Foundations for their generous support for this project's research.
|
||||
|
||||
**Next**: [Athens to Berlin: Video Series](../introduction/videos/)
|
||||
|
||||
**Up**: [Mapping the Open Spending Data Community](../)
|
||||
@@ -0,0 +1,13 @@
|
||||
---
|
||||
lead: true
|
||||
title: Map of Contacts
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
As an additional appendix to this report, we have published links to all of the organisations that we encountered or spoke to during our research and created a [map of their projects](http://apps.openspending.org/oscontactsmap/). The majority have some kind of online presence and use technology to get, clean, analyse, or present their data.
|
||||
|
||||
<iframe frameBorder="0" src="http://openspending.github.io/oscontactsmap/" width="600px" height="320px"></iframe>
|
||||
|
||||
**Next**: [Timeline](../timeline/)
|
||||
|
||||
**Up**: [Introduction](../)
|
||||
@@ -0,0 +1,13 @@
|
||||
---
|
||||
lead: true
|
||||
title: Timeline
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
A visual timeline of key activities from the project can be seen here:
|
||||
|
||||
<iframe src="http://timeliner.okfnlabs.org/view/?url=https://docs.google.com/a/okfn.org/spreadsheet/ccc%3Fkey%3D0AqwLVP6U7FhDdEZlb29nSHZkeU1ha3JJSEFMLTZVR1E%23gid%3D0&embed=1" frameborder="0" style="border: none;" width="100%" height="780;"></iframe>
|
||||
|
||||
**Next**: [Summary of recommendations](../../executive-summary/)
|
||||
|
||||
**Up**: [Introduction](../)
|
||||
@@ -0,0 +1,31 @@
|
||||
---
|
||||
lead: true
|
||||
title: 'Athens to Berlin: Video Series'
|
||||
authors:
|
||||
- Neil Ashton
|
||||
---
|
||||
As part of the research for this project, we travelled from [Athens to Berlin](http://community.openspending.org/?p=250) to conduct a series of interviews with Civil Society Organisations across Europe. Our goal was to find organisations working with government financial information and to discover what they were doing with it and how it could be made easier.
|
||||
|
||||
We found organisations working with varieties of data including budgets, procurements, and even news articles on corruption. Diverse as they were, these organisations encountered many of the same problems in working with their data, such as bad organisational identifiers, data in non-machine-readable formats, and obstruction from data providers. We discussed the impact of their work on public participation in government spending and asked them what changes they'd like to see in the world of open spending data.
|
||||
|
||||
The videos below present highlights from this series of interviews. They introduce the CSOs we talked to, explore the challenges they face working with data, shine a spotlight on their work's impact, and conclude by highlighting opportunities for future work.
|
||||
|
||||
## Introduction to the organisations
|
||||
|
||||
<iframe src="http://player.vimeo.com/video/66233020" width="400" height="300" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe>
|
||||
|
||||
## Issues working with data & tools used
|
||||
|
||||
<iframe src="http://player.vimeo.com/video/66240855" width="400" height="300" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe>
|
||||
|
||||
## Impact of projects
|
||||
|
||||
<iframe src="http://player.vimeo.com/video/66281152" width="400" height="300" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe>
|
||||
|
||||
## Wishlist: making fiscal transparency easier
|
||||
|
||||
<iframe src="http://player.vimeo.com/video/66184506" width="400" height="300" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe>
|
||||
|
||||
**Next**: [Map of Contacts](../map/)
|
||||
|
||||
**Up**: [Introduction](../)
|
||||