In my previous blog post I described the Open Government Movement and how the Linked Data principles make publicly available data released by the UK and US governments open for citizen utility and economic opportunities.
A recent blog post made me aware that I, and many others, tend to use the term open data to mean publicly available data:
"Simply put, all open data is publicly available. But not all publicly available data is open. Open data does not mean that a government or other entity releases all of its data to the public. ... Rather, open data means that whatever data is released is done so in a specific way to allow the public to access it without having to pay fees or be unfairly restricted in its use."
Source: What “open data” means – and what it doesn't, by Melanie Chernoff, published on the Open Knowledge Foundation Blog
In this blog post I adopt this distinction as I turn to publicly available data released by enterprises, and start to look into how Linked Data principles can be applied to data that enterprises make publicly available as part of their efforts for Corporate Transparency and Social Responsibility.
What will the movement in Governments for
Linked Open Data mean for Enterprises?
How can Corporate Transparency be supported
by applying Linked Data principles?
Let me first introduce the Linked Data principles and the 5-star deployment scheme for Linked Open Data. With this in mind I will highlight examples of data made publicly available by two enterprises, Volvo Group and AstraZeneca, and then outline steps for Linking Open Enterprise Data -- from a 1-star to a 5-star rating.
Linked Data principles
The four principles, or rules, of Linked Data have been outlined by Tim Berners-Lee, often referred to as the "inventor of the web", in his Design Issues: Linked Data note:
- Use URIs (global identifiers) to identify things.
- Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
- Provide useful information about the thing when its URI is dereferenced, using standard formats such as RDF/XML.
- Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
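To make the four principles a little more concrete, here is a minimal sketch in Python. It is only an illustration, not part of the original note: the example.org URIs, the tiny in-memory list of triples, and the helper names are all assumptions made up for this post, and no real lookup over HTTP takes place.

```python
from urllib.parse import urlparse

# Hypothetical data: every URI below is a made-up example.org identifier.
# Each entry is a (subject, predicate, object) triple.
TRIPLES = [
    ("http://example.org/id/company/acme",
     "http://example.org/def/name", "Acme Corp"),
    ("http://example.org/id/company/acme",
     "http://example.org/def/publishes", "http://example.org/id/report/2009"),
    ("http://example.org/id/report/2009",
     "http://example.org/def/title", "Sustainability Report 2009"),
]


def is_http_uri(value):
    """Principles 1 and 2: things are named with HTTP(S) URIs."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def describe(uri):
    """Principle 3: looking up a URI yields useful information about the thing.

    Here the "lookup" is just a filter over the in-memory triples.
    """
    return [(p, o) for s, p, o in TRIPLES if s == uri]


def links_from(uri):
    """Principle 4: objects that are themselves URIs are links to follow."""
    return [o for _, o in describe(uri) if is_http_uri(o)]


if __name__ == "__main__":
    start = "http://example.org/id/company/acme"
    for target in links_from(start):
        print(target, describe(target))
```

Starting from the company URI, a consumer can discover the report URI and look that up in turn, which is the "follow your nose" behaviour the fourth principle is after.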
Source: Linked Data page on Wikipedia
5-star scheme for Linked Open Data
What is required for each of the 1-5 star ratings? What are the costs and benefits? I will elaborate on this for publicly available data released by enterprises, based on "Linked Open Data star scheme by example" by Michel Hausenblas. To spread this nice idea you can buy your own 5-star mug and T-shirt.
Publicly available enterprise data
So, what does this mean for enterprises? Below are two examples of data made publicly available by two enterprises: Volvo Group and AstraZeneca. These are two large international enterprises in different industries, subject to regulations covering different aspects and regions, such as the Corporate Integrity Agreement (CIA) for health services in the US.
Volvo Group, Corporate Social Responsibility, publishes a yearly Sustainability Report with a Scorecard that includes key sustainability performance indicators such as Energy consumption (example from the scorecard 2009). The data is formatted as an HTML table and the whole report as a PDF.
AstraZeneca, Corporate Transparency, publishes for example data on Physician Engagement, a summary of payments made to U.S. physicians who have spoken on behalf of AstraZeneca and/or its products. The data is published as a PDF containing a table of 2000+ rows (Speaker compensation report, January - June 2010).
Linking Open Enterprise Data
These examples of data are made publicly available in a way that makes it possible for consumers to look at the data, print it, store it locally, and enter it manually into another system. If this were done under an open license (such as PDDL, ODC-By or CC0), the data would earn a nice 1-star rating.
For a 2-star rating, data should be made available as structured data (e.g., Excel instead of PDF) so that it can also be reused. Consumers can now directly process it with proprietary software to aggregate it, perform calculations, visualize it, etc. For a 3-star rating, data should be in a non-proprietary, open format (e.g., CSV instead of Excel). Consumers can now manipulate the data in any way they like, without being confined by the capabilities of any particular software.
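As a rough illustration of the 3-star step, the sketch below writes a small scorecard-like table to CSV and reads it back using only Python's standard library. The rows are placeholders invented for this example (the figures are not taken from any published report), and the column names and helper functions are my own assumptions.

```python
import csv
import io

# Placeholder rows: the figures are invented for illustration only.
ROWS = [
    {"year": 2008, "indicator": "Energy consumption", "unit": "GWh", "value": 100},
    {"year": 2009, "indicator": "Energy consumption", "unit": "GWh", "value": 90},
]

FIELDS = ["year", "indicator", "unit", "value"]


def to_csv(records):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def total_value(csv_text):
    """Any consumer with a CSV parser can now aggregate the data."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(int(row["value"]) for row in reader)


if __name__ == "__main__":
    text = to_csv(ROWS)
    print(text)
    print("Total:", total_value(text))
```

The point is simply that once the table is in an open format, no particular spreadsheet product is needed to reuse it.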
Source: Linked Open Data star scheme by example and Star badges
Available vocabularies
An example of an available vocabulary that could help the AstraZeneca physician engagement data reach a 4-star rating is the Payments Ontology used for publishing UK government spending data as linked data (see COINS as Linked Data). The ontology (see Guide to the Payments Ontology) has been developed as a general-purpose vocabulary for representing organizational spending information and is not specific to government or local government applications.
Of relevance for helping the Volvo Group data reach a 4-star rating is the work on Linked Environment Data that Environment Agencies from Europe and the US are setting up in the eGovernment Interest Group. The Statistical Core Vocabulary (SCOVO) for representing statistical data on the Web has been used by the German Federal Environment Agency (UBA) to publish linked environment data.
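To give a feel for what a payment record described with URIs could look like, here is a minimal sketch that emits N-Triples-style lines from plain Python. Please read it with care: the example.org namespaces, the property names (payer, payee, amount) and the identifiers are hypothetical stand-ins that I made up, not terms from the Payments Ontology or SCOVO. A real deployment would use the terms defined by the chosen vocabulary.

```python
BASE = "http://example.org/id/"    # hypothetical identifier namespace
VOCAB = "http://example.org/def/"  # hypothetical vocabulary, a stand-in for a real one


def uri(ref):
    """Wrap a URI in angle brackets, as N-Triples expects."""
    return f"<{ref}>"


def literal(value):
    """Render a value as a plain string literal, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def payment_to_ntriples(payment_id, payer, payee, amount):
    """Describe one payment as three triples about a payment URI."""
    subject = uri(f"{BASE}payment/{payment_id}")
    return [
        f"{subject} {uri(VOCAB + 'payer')} {uri(BASE + 'organisation/' + payer)} .",
        f"{subject} {uri(VOCAB + 'payee')} {uri(BASE + 'person/' + payee)} .",
        f"{subject} {uri(VOCAB + 'amount')} {literal(amount)} .",
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from any published report.
    for line in payment_to_ntriples("p1", "acme", "dr-x", 100):
        print(line)
```

Because the payer and payee are URIs rather than text in a table cell, other datasets can point at the same identifiers, which is what opens the door to linking.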
Thoughts for future posts
In future blog posts I will continue the exploration of the opportunities, and challenges, of Linking Open Enterprise Data. I am also interested in experiences of applying Linked Data principles for data sources available within enterprise networks to make it easier for employees and partners to consume it, and to combine it with other linked data sources -- internal, shared, licensed and publicly available sources.
While writing this post I was thinking of provenance, i.e. an open history of the data, in relation to the 5-star deployment scheme -- maybe a 6-star rating for embedding provenance data using emerging provenance vocabularies? I wonder what Tim Berners-Lee thinks about that :-)
Kudos to Michel Hausenblas (@mhausenblas) for the great 5-star scheme examples with costs and benefits, and the nice star badges; to Bill Roberts (@billroberts) for excellent input for the payment data example; and to Melanie Chernoff (@melaniechernoff) for the interesting blog post on publicly available and open data.