Schrödinger’s data
I realise that opening an article with quantum physics may not be the most exciting way to inspire an audience to read on, but it makes a point. Erwin Schrödinger described a thought experiment in which a cat is placed in a steel box along with a Geiger counter, a vial of poison, a hammer, and a radioactive substance. When the radioactive substance decays, the Geiger counter detects it and triggers the hammer to release the poison, which kills the cat. Radioactive decay is a random process, so without directly observing the counter in the box, or the impact on said cat, we cannot know what has happened. The cat is therefore described as being both dead and alive at the same time, because an assumption has to be made as to its state.
We see the same in the collection and sharing of data. Pots of data are created when individuals and companies carry out activity, and increasingly the technology installed on farms and their machinery gathers that data, often as part of routine processes and in an automated, or at least semi-automated, manner. A feed system, for example, may need some programming to suggest feed types or ingredients, but will then weigh, record and chart ration distribution automatically. Importantly, once in that database the data rarely, if ever, leaves.
What is interesting about this data, if it ever makes it out of the database, is the way it is traded. Whilst the data is available to the farmer to use in various forms, it sits within a digital box into which it is entered and does not interact with anything unless told to do so. This invariably means the user, i.e. the farmer, has to merge data together and review the output to gain any benefit. Frankly, who has time for that? The parties that supplied the box in which the data sits, i.e. the company that provided the technology or database, often see value in that data and resist its free movement to other providers who could use it to inform the farmer, even though some insights only emerge when data is combined. For example, a farmer may have a feed system in place that records feed intake and weight and calculates the feed conversion ratio (FCR) for his animals; this could identify a subset of animals that are underperforming. That has value to the farmer. An electronic medicine book can record incidents of illness and treatment, telling a farmer which animals underperform in terms of health. At slaughter, the performance of those animals is recorded, and this too has value to the farmer.
What is often missed is the real value of these data sets combined. That is not because people are unaware of the value, but because each data holder assumes they should receive a return for the value their data may create before they pass it on. The technology company producing the feeder, for example, will look into its box of data and identify that it has value to provide insight; however, until that data is viewed by others in the context of additional material, it does not deliver that value. Thus we end up with the same paradox that confronted Schrödinger: the data is both valuable and not, until someone else is allowed to look inside the box. There is a danger that if we try to monetise data before it is shared, its intrinsic value is limited. Instead, charging for services based on its combination should be explored.
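To make that combined value concrete, here is a minimal sketch of the kind of insight that only emerges once the boxes are opened together. All of the record structures, identifiers, figures and thresholds below are invented for illustration; they do not reflect any real feed system, medicine book or abattoir feed.

```python
# Hypothetical illustration: three separate "boxes" of farm data that only
# become insightful when merged. All identifiers, field names and figures
# are invented for this sketch.

# Feed system: feed intake (kg) and weight gain (kg) per animal over a period.
feed_records = {
    "A1": {"intake_kg": 420.0, "gain_kg": 60.0},
    "A2": {"intake_kg": 455.0, "gain_kg": 50.0},
    "A3": {"intake_kg": 430.0, "gain_kg": 62.0},
}

# Electronic medicine book: number of treatment incidents per animal.
health_records = {"A1": 0, "A2": 3, "A3": 1}

# Slaughter data: carcase grade per animal.
slaughter_records = {"A1": "U", "A2": "O", "A3": "R"}

for animal, feed in feed_records.items():
    # Feed conversion ratio: kg of feed consumed per kg of weight gained
    # (lower is better).
    fcr = feed["intake_kg"] / feed["gain_kg"]
    treatments = health_records.get(animal, 0)
    grade = slaughter_records.get(animal, "ungraded")
    # A combined flag that no single box could raise on its own:
    # poor feed conversion together with repeated illness.
    if fcr > 8.0 and treatments >= 2:
        print(f"{animal}: FCR {fcr:.1f}, {treatments} treatments, "
              f"carcase grade {grade} -> worth investigating")
```

No single data holder can produce that last line on its own: the feeder knows nothing of illness, and the medicine book knows nothing of conversion efficiency.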
There is also a fundamental issue in that the farmer pays for the data more than once:
1. At the point of purchasing the equipment that creates the data
2. In the labour that generates the data
3. In the infrastructure and processes that generate the data
What the end user, the only one who would see maximum value in looking inside every data box, does not want is to pay a fourth time.
So what to do? The solution is rarely, if ever, simple. We could take a socialist approach to data and aim for a utopia in which all data is freely accessed with the farmer's permission and shared through a single common format, i.e. we strip all value from the data itself. This charges the service providers with adding value in service and interpretation, and encourages innovation.
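As a purely illustrative aside, a "single common format" might look something like the sketch below. The schema, its field names and the consent flag are assumptions made for the example; no existing standard is implied.

```python
from dataclasses import dataclass
from datetime import date

# A purely hypothetical common record format. Any real shared standard
# (and its field names) would differ; this only illustrates the idea of
# data leaving each vendor's box in one neutral shape.
@dataclass
class FarmDataRecord:
    animal_id: str     # shared identifier across all systems
    source: str        # which "box" produced the record, e.g. "feeder"
    metric: str        # what was measured, e.g. "intake_kg"
    value: float       # the measurement itself
    recorded_on: date  # when it was captured
    consent: bool      # the farmer's permission to share, carried with the data

# Every vendor's export reduces to the same shape, so combining is trivial
# and sharing is gated on the farmer's permission rather than on each box.
records = [
    FarmDataRecord("A2", "feeder", "intake_kg", 455.0, date(2024, 6, 1), True),
    FarmDataRecord("A2", "medicine_book", "treatments", 3.0, date(2024, 6, 1), True),
]
shareable = [r for r in records if r.consent]
print(f"{len(shareable)} records cleared for sharing")
```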
More likely, we need to be realistic and compromise. If there are multiple holders of data that can create value through a single service, then the value should be split according to each input's contribution, and only once it becomes a new service for the end user; to charge for moving data to someone else who will provide a new service is almost charging for the same service twice. It means that we need to share the data and take the risk that maybe what we hold in our boxes isn't as valuable as we think, or, harder still, accept that maximising the value of that data takes a lot of hard work before we can reap the rewards.