It is the provenance of what we express that makes us human: the case of AI-generated photographs.
As a photographer, I have always been convinced that the value of the images I create, which I draw from reality with minimal, light-touch "post-processing" interventions, is primarily in their story. This is not to say that technical competence, appropriate use of the language of visual perception, and some sense of aesthetics are not required. However, I believe that visual storytelling is fundamentally different from the art of writing stories, which similarly requires mastery of language and narrative skills, but is more akin to painting than photography. The difference is in the drawing from reality at the exact time and place when and where the story happens (and no, I don't do studio work).
There are actually two parts to the story. One is the intended message of the visual content, which can be purely aesthetic or perhaps social, ethical, emotional, provocative, reflective. As with written text, this must be supported by the competent use of a language. The grammar of visual perception has been studied in depth over many decades (cf. G. Kanizsa, Grammatica del vedere: Saggi su percezione e gestalt, 1997, in Italian only); but, uniquely to photography, conveying the message also requires some mastery of image capture and processing technology.
The second part is the story of how the image was taken, and that is the story I am most interested in here. If you go back far enough in time, it is often a fascinating story of apparently disconnected sequences of planned and unplanned events that eventually converge and culminate in a unique point in time and space, a present moment where the visual story composes itself in front of your eyes, and you realise, often for just a split second, that you have the opportunity to "put your head, eye, and heart on the same axis", in the famous words of Henri Cartier-Bresson (The Mind's Eye: Writings on Photography and Photographers, Henri Cartier-Bresson, 1997).
So, provenance. This is the name we give to the second part of the story, and the art business has long accepted that it is an integral part of the value of artwork: without provenance, a masterpiece is commercially worthless. For this purpose, the provenance narrative generally begins with some certificate that provably links the artwork to its creator, possibly followed, especially for older artifacts, by a chain of "custodianships" that account for the journey of the artifact through time and interventions, sometimes over centuries. Thus, what gives value to a historical artifact is not only the combination of its physical embodiment with its cultural and artistic significance, but also its deep provenance.
At a time when complex mathematical functions can mimic humans' ability to create fantasy images on command, my contention is that the concept of provenance has never been more important, as telling deep provenance stories about how we create our own images (or stories) is one way to remind ourselves of the essence of our human nature, and of what separates us from any artificial version of us.
AI-generated images may well find a place in art galleries by virtue of their aesthetic value, but their provenance is limited to the textual prompts (as in 2023 this is the preferred way to articulate visual intent) that triggered the generation process. Of course, the generation itself relies on an immense catalogue of pre-existing images, all mashed together and projected onto a mysterious and alien new representation (which we call embedding) from where impenetrable and highly non-linear mathematical functions draw their "knowledge". But calling the universe of images available for training "provenance" is not very helpful in practice.
However, I find it ironic, and a bit sad, that humans should now replace the skills and talent to tell a story or express their emotions visually with a string of text that offers a very poor approximation of what they would like to see represented. I feel that the provenance of my own images, questionable as they may be in terms of aesthetic value and visual and technical correctness, is a little more interesting. Here is one single example.
The provenance of one of the two images here below is the following prompt:
"a high quality black and white photo image of a tall bare tree in winter on a foggy day, in a wood on a steep mountain slope, next to the ruins of a stone shed"
The result is admittedly remarkable.
The provenance of the other image goes something like this.
"As in any solo mountain hike on a trail and in a place I have never been before, this was always going to be a bit unnerving. The approach by car was itself disconcerting, with a long, single-track road skirting tall mountains at the bottom of a deep narrow valley, all the way to a remote village where not even the only local bar was open. By the time I parked the car, I was feeling a bit uneasy about the whole plan.
I have never been interested in shooting memory postcards of my trips, but I would never go on a hike without my camera gear; it's on par with bringing water, a powerbank, and making sure the track is correctly loaded on my GPS watch -- the latter being especially important on an off-season hike when the trail is almost certainly going to be empty. What I did not account for was the weather, a combination of low clouds and drizzle, making for very low visibility and a rather cold and unpleasant uphill walking experience. But, on a short break abroad you can't postpone unique opportunities, and I wouldn't do that anyways on account of poor weather, so off I go into the mountains, following well-marked trail 24.
One hour and a half of steady climb later, in a wet mix of sweat and rain, I seem to have reached the ridge -- exciting, if it wasn't for the thick white wall in front of me where a spectacular scenery of mountains should have been. From this point on, the GPS indicates a way that I can vaguely make out as a steep scramble, again with no visibility to see what's beyond. Are we on the edge of a precipice? The guide says "expert hikers" from here on, but nothing more specific. There are signs pointing to another trail, dutifully numbered (technically these are the Alps, although we are in Tuscany, and the quality and reliability of signage across the entire chain is exceptional). My phone tells me that trail goes away from my destination, so I venture up the scramble. Wet rocks, no harness and no idea of what lies below and how far down I would be falling, and no sign of reverting to a domesticated trail.
I guess some adrenaline must be generated at some point on any good day out, but this was enough for me. Executive decision to climb down and return to a distant fork in the path, much further down and which I had passed on my way up, and to take the alternative route up, which was also on my recorded GPS track. So after another hour downhill, here we go again, hiking up the same mountain from a different angle.
This feels different. Still very steep, still very solitary, the occasional birds calling each other, the sound of your boots on the wet leaves that litter the trail, but also a kind of misty silence that invites reflection. It's strange how one's mind slowly goes into a special state, where the anxiety of being alone in the middle of an unknown forest is somehow balanced by a sense of serenity, a strangely liberating feeling, mostly free from conscious thoughts. At the same time, the scenery is dominated by countless bare trees, wet and just emerging from the late snow, some barely visible only a few meters away, very still in the windless air, an eerie presence for anyone who may have watched too many horror movies.
It is in this state of mind that the ruins of an old shed, presumably used by local loggers and then abandoned, suddenly emerge from the mist. It's made of local stones, grey on a dirty white background, a very low contrast scene but the perfect spot to pause and put the backpack down for a minute (after all, the mere sight of housing, albeit not accessible, is what gives us humans comfort)."
And this is where the story in the photograph diverges from its provenance. A good photographer knows that in the magical alignment between eye, heart, and mind, something inevitably gets "lost in translation". It would be delusional, and arrogant, to believe otherwise. The image cannot contain the fatigue, the sense of anxiety, the eerie yet comforting silence, the sum of all emotions and unplanned events that led a solitary hiker to be in this time and place, once in a lifetime, aiming a camera and carefully taking a shot, like putting a message in a bottle for random humans to one day pick up. Yet, that image would not exist without this entire story. I know that, because I lived it.
And yes, some generative AI can probably generate fake provenance similar to this. But it would be meaning-less.
PS. I feel that Cartier-Bresson's views on photography provide a much needed perspective today on what it means to be human. So here is another quote: “I believe that, through the act of living, the discovery of oneself is made concurrently with the discovery of the world around us, which can mold us, but which can also be affected by us. A balance must be established between these two worlds—the one inside us and the one outside us.” And: "To take photographs is to hold one's breath when all faculties converge in the face of fleeing reality. It is at that moment that mastering an image becomes a great physical and intellectual joy".
Software Engineer || Researcher
Hello Paolo Missier, love the post! :) I came across your profile while looking into the experts on provenance and the W3C PROV model. I am working on something in the field as part of my PhD - is there any chance we could have a discussion about this, please? Many thanks for considering my request. :)
Hi Paolo Missier, very nice account! I agree on the importance of provenance. Actually the relevance of provenance, in my opinion, relates weakly with its level of detail. Sometimes a pithy description accompanying an old B&W picture, where your imagination is left to fill the gaps, is priceless. A bit like poetry versus prose. The important thing is its authenticity, and by this I mean its capacity of linking the artifact to a story: to events, people, circumstances. Sometimes it may be foggy or there may be various versions of it that you come to know over time (e.g., was Le Baiser de l'Hotel de Ville staged?) but it always tells a story. Something lacking in AI creations. One of their limits is indeed their lack of provenance details - how was that artificial picture created, what "inspired" it? (at least in broad terms... if this could ever be summarized in a meaningful way for the viewer; what paths did the algorithm take to produce it, what artworks influenced it?). Or, more prosaically, in conversational AI, I'd like to know the sources that were "consulted" by the AI system to generate the output that I'm reading, hallucinations included. In a way relating me, albeit indirectly, towards some human piece of work.
Professor (Computer Science) & Dean (Academics) at Mohammad Ali Jinnah University (MAJU) HEC Approved PhD Supervisor
Thanks for the article. Couldn't the OPM work be extended to knowledge graphs? Something like PKG. The specifications must be standardized, which in my opinion is what OPM was proposed for, but the effort was not continued.
Head of Client Strategy & Managing Director at Sputnik Digital
Is provenance about scarcity, and the perception that it's therefore worth more? And if so, the NFT dudes have a point about owning the "certificate" even if everyone else has copied and pasted their monkeys? Perhaps the future will no longer care about provenance? (Not meaning to be confrontational, just taking up your invitation to discuss 😃) Hope you're well Mr P. We're overdue a catchup!