---
title: Data-driven advocacy and research
layout: handbook
---
<div class="row">
<div class="span4">
<ul class="nav nav-list span3">
<li class="nav-header">Introduction</li>
<li><a href="ch001_introduction.html">Introduction</a></li>
<li><a href="ch002_working-with-others.html">Working with others</a></li>
<li><a href="ch003_gov-gov-collaboration.html">Helping the government help itself</a></li>
<li class="nav-header">Data Literacy</li>
<li><a href="ch005_introduction-to-data-literacy.html">Data-driven advocacy and research</a></li>
<li><a href="ch006_types-of-data.html">Types of data</a></li>
<li><a href="ch007_getting-cleaning.html">Getting &amp; cleaning data</a></li>
<li><a href="ch008_anaysis.html">Analysis</a></li>
<li><a href="ch009_ngo-ngo-collaboration.html">Using technology in your work</a></li>
<li class="nav-header">Presentation and engagement</li>
<li><a href="ch011_defining-the-scopetopic.html">Presentation and engagement</a></li>
<li><a href="ch012_selecting-methods-and-tools.html">Selecting methods and tools</a></li>
<li class="nav-header">Appendices</li>
<li><a href="ch014_resources.html">Resources</a></li>
<li><a href="ch015_glossary.html">Glossary</a></li>
<li class="nav-header">Further information</li>
<li><a href="http://okfn.booktype.pro/spending-data-handbook/">Contribute to the book</a></li>
<li><a href="spending-data-handbook.pdf"><strong>Download a PDF version</strong></a></li>
<li><a href="Spending_Data_Handbook.epub">ePub Version (iPad)</a></li>
<li><a href="Spending_Data_Handbook.mobi">MOBI Version (Amazon Kindle)</a></li>
</ul>
</div>
<div class="span8">
<div><h2>Data-driven advocacy and research</h2>
<p>Many governments around the world now proactively publish documents about what they plan to spend (budgets) and what they actually spend (spending data). Increasingly, this material is available on the internet, so that anybody can access it at any time. Still, too much of the information is released in the form of '<em>documents</em>' rather than '<em>data</em>'. Ideally we need both, so that information can be analyzed, re-used and understood. This chapter is a quick overview of some of the raw inputs required for data-driven advocacy and how it works in practice.</p>
<h3>What do we mean by machine-readable data?</h3>
<p>When we speak about data, what we usually refer to is the notion of machine-readable (<a title="http://en.wikipedia.org/wiki/Machine-readable_data" href="http://en.wikipedia.org/wiki/Machine-readable_data">http://en.wikipedia.org/wiki/Machine-readable_data</a>) data. Many of the formats most commonly used for policy papers and long-form reports by policy-making institutions (PDF files, Word documents, web pages or closed interactive infographics) do not structure information in a way that lends itself to automated analysis and extraction.</p>
<p>Such documents are formatted for humans (or printers) to interpret, and it can be hard (and in many cases nearly impossible) for a machine to re-construct the elements in the presentation.</p>
<p>Other formats, such as Excel and CSV files, contain a higher level of structured information. For example, in an Excel file you can mark a number of cells and easily calculate their sum. Even more exotic and useful file formats, such as <em>XML documents</em>, <em>JSON APIs</em> or <em>Shapefiles</em>, may not have easy-to-use viewer applications. You can think of them as the glue that connects different systems on the web, so that different databases can work together in a seamless fashion.</p>
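<p>To make the point about structure concrete, here is a minimal sketch in Python of what "calculating a sum" looks like when the data arrives as CSV. The records, the column names and the <code>total_amount</code> helper are all invented for illustration; a real dataset would have its own layout.</p>

```python
import csv
import io

# Hypothetical spending records, invented for this sketch.
# A real file would be read with open("some-file.csv", newline="").
SPENDING_CSV = """department,description,amount
Health,Hospital equipment,1200.50
Education,School books,300.25
Health,Vaccines,499.25
"""


def total_amount(csv_text):
    """Add up the 'amount' column of a CSV document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row["amount"]) for row in reader)


print(total_amount(SPENDING_CSV))  # prints 2000.0
```

<p>The same figures locked inside a PDF table would first have to be retyped or extracted by hand before any such calculation could run.</p>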
<h4>Why do CSOs need it?</h4>
<p>What asking for machine-readable bulk data means for civil society organisations (CSOs) is simple: you won't have to spend a lot of time manually extracting data from reports into spreadsheets before you can filter, sort and analyse it. Manual extraction is both time-consuming and error-prone.</p>
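<p>As a rough illustration of the filtering and sorting mentioned above, the following Python sketch picks out one department's payments and lists the largest first. The suppliers, departments, amounts and function names are hypothetical and only stand in for whatever a real release contains.</p>

```python
import csv
import io

# Hypothetical transactions, invented for illustration only.
PAYMENTS_CSV = """supplier,department,amount
Acme Ltd,Health,5000
Bolt Plc,Transport,12000
Core Inc,Health,7500
Dyne Co,Education,3000
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, converting amounts to integers."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["amount"] = int(row["amount"])
    return rows


def largest_payments(rows, department, limit=10):
    """Return one department's payments, largest first."""
    selected = [row for row in rows if row["department"] == department]
    selected.sort(key=lambda row: row["amount"], reverse=True)
    return selected[:limit]


for row in largest_payments(load_rows(PAYMENTS_CSV), "Health"):
    print(row["supplier"], row["amount"])
```

<p>With machine-readable data this takes seconds and can be repeated for every new release; with a scanned report each step would have to be redone by hand.</p>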
<h3>What to ask for when asking for data: a checklist</h3>
<p>In the next section, 'Getting Data', we will deal with asking governments for data (or getting it via other means). To set the scene for this and to work out whether your government actually publishes usable data already, have a quick look at the following questions:</p>
<ul><li><strong>Is the government's data published in a machine-readable format?</strong> E.g. CSV, XML, JSON. There is nothing wrong with publishing a PDF to support a data release (in fact it is often useful to have a nicely laid-out document against which to cross-reference and sanity-check data), but it shouldn't be the only thing published. If you are asking for a policy document, ask for the underlying data in a spreadsheet as well so you can check the numbers.</li>
<li><strong>Does the government publish a '<em>data dictionary</em>' to explain the terms used in the dataset?</strong> This should include definitions of column headers, explanations of terms and ranges used within the main body of the data, and explanations of any changes in terminology which have been introduced since the last time the dataset was released.</li>
<li><strong>How is the data that is being published <em>actually</em> used internally by governments?</strong> Do some sanity checks on the minimum and maximum values of different columns to make sure they fall into the documented ranges and don't seem out of place. Do you see negative values when you don't think you should? Negative values usually mean money owed.</li>
<li><strong>Is the structure of the data the same across years? If not, is there a description of how it changes?</strong> It never hurts to contact the publisher and ask questions about the change and why it occurred. The publisher may have their name and contact details on the report or webpage. If there is no named contact, call the department's enquiries number or send a message to their email address asking to meet or discuss the data.</li>
<li><strong>How aggregated is the data?</strong> What is the number of real-world financial transactions that are expressed by a single line of the dataset you have? For budgets this will mostly be hard to tell - but with transactional expenditure you want to make sure that the data is fairly disaggregated. Ideally, each entry represents a transaction - but even if this isn't true you'll still want to ensure the number is not in the tens or hundreds of thousands (e.g. government programmes as a whole).</li>
<li><strong>Ask for reference data.</strong> If your budget or spending data is augmented with reference data, make sure you have access to it. This might include functional or category codes on budget line items, location codes for describing recipient location, or codes that indicate the status of the record.</li>
<li><strong>Ask also for the guidelines people were given when creating the dataset.</strong> This will make it easier to understand what is included within the data, e.g. whether the numbers are in thousands or millions.</li>
<li><strong>Final tip: if the data you want is not given then narrow your scope.</strong> Your chances of success will be higher if you narrow the scope of the data you're requesting from the government and you are specific. Government is the de facto keeper of all kinds of data, so parameters that narrow your request are always helpful.</li>
</ul><h3>An introduction to data-driven advocacy</h3>
<p class="p1">Is going out and provoking a riot the best way to get a Government to take on board your message? There are alternatives: hit them with the data hammer instead!</p>
<p class="p1">Making evidence-based policy proposals consists of three major phases: formulating your assumption, analysis (which often leads to re-formulating your assumption), and presenting your data in an engaging way in a policy proposal.</p>
<h4 class="p1"><strong>Analysing assumptions</strong></h4>
<p class="p1">Asking the right question is key to getting the most out of your data. We all make assumptions, and our organisation may have a particular standpoint on a given issue. Our first task is always to formulate our assumptions and then interrogate them ferociously. Although we try to be rational in this process, our judgement is often influenced by our subjective goals, values, and beliefs.&#160;Sometimes, you'll need to revisit your assumptions several times over to ensure they are valid and you can back them up with data. Once you know your policy problem is definitely a problem, you can work to package it in a way that's appropriate for your target audience.&#160;</p>
<h4 class="p1">What is public interest?&#160;</h4>
<p class="p1">Often our job is to act in the public interest by analysing conflicting assumptions and working out which one is more valid. For example, in Greece, Spain, and many other European countries people protest almost every day as the Government cuts spending to bring down its budget deficit. If the Government wanted to keep its current level of spending but raised taxes to increase its revenue, different citizens' groups would still protest, depending on which taxes were to be increased. In any case, there will always be more than one interpretation of any Government policy, and interested parties both for and against it.</p>
<h4 class="p1"><strong>Policy analysis</strong></h4>
<p class="p1">Once we have a well-defined policy problem, the specific goals or results that different stakeholders are trying to achieve, and the instruments they are using to achieve them, we can systematically search for the specific data needed to create our own policy proposals. This data can be obtained from the Government, from other sources (e.g. academic journals or private companies), or generated by ourselves. Once the data is gathered, we use a specific methodology to analyze it and, based on this analysis, confirm or reject our assumptions. If an assumption is rejected, we have to formulate a new assumption based on our findings and start the process again. If our assumption is confirmed, we use our results to make a policy proposal to the Government.</p>
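<p class="p1">The loop described above (state an assumption, test it against the data, then keep or reformulate it) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the yearly totals are invented, and the rule "spending fell by at least 5%" is an arbitrary threshold, not a recommended methodology.</p>

```python
# Hypothetical yearly totals in millions; these are not real figures.
SPENDING = {
    "health": {2011: 420, 2012: 390},
    "education": {2011: 310, 2012: 325},
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


def assumption_holds(spending, sector, start, end, threshold=-5.0):
    """Test the assumption 'spending on this sector fell by at least 5%'.

    The threshold is an arbitrary choice for this sketch.
    Returns (holds, change_in_percent).
    """
    change = percent_change(spending[sector][start], spending[sector][end])
    return change <= threshold, change


for sector in SPENDING:
    holds, change = assumption_holds(SPENDING, sector, 2011, 2012)
    verdict = "keep assumption" if holds else "reformulate assumption"
    print(f"{sector}: {change:+.1f}% ({verdict})")
```

<p class="p1">In real work the test would rarely be this simple: figures may need adjusting for inflation or population, and a single pair of years says little about a trend. The point is only that the assumption is written down explicitly and checked against the numbers.</p>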
<h4 class="p1"><strong>Policy proposals</strong></h4>
<p class="p1">For CSOs it is important to recognize who the decision maker is and, hence, whom you should be targeting with your policy proposal. Policy proposals should be methodologically well structured, evidence-based, open for debate, and scientifically evaluated. Governments will seldom adopt our policy proposals as their own policy, but they may change their course of action or gain new insights, views, and understanding of the subject. We may also use policy briefs to approach Government officials, or press releases to get the attention of the public.</p>
<p class="p1"><img src="static/cso-3.jpg" alt=""></p>
<h3 class="p1"><strong>Case study&#160;</strong></h3>
<p class="p1"><em>Fish subsidies</em></p>
<p class="p1">The influence CSOs have on government policy comes from a wide and varied set of activities. These can range from producing a widely shared dataset or infographic which subtly influences the mood of policy makers, to more targeted CSO advocacy and lobbying on issues in which they are experts.</p>
<p class="p1">The Fish Subsidies group (<a title="http://fishsubsidy.org" href="http://fishsubsidy.org">http://fishsubsidy.org</a>) is a nice example of a CSO engaged in targeted activities. They have collected a comprehensive set of data on fishing subsidies paid under the European Union&#8217;s common fisheries policy, broken this down into payments for every EU member state, and complemented it with information on fishing activities. They have produced a report (<a title="http://is.gd/XYPgq5" href="http://is.gd/XYPgq5">http://is.gd/XYPgq5</a>) assessing the environmental and social impacts of the Financial Instrument for Fisheries Guidance between 2000 and 2006. This extensive document fed directly into the EU political decision making process.</p></div>
</div>
</div>