The difference here is that a man who has spent decades showing people the natural world is having his voice used in disingenuous ways, and it should give us all pause. I use generative artificial intelligence, as do many others, but there is no way I would even consider misrepresenting what I write or work on in the voice of someone else.
Who would do that? Why? It dilutes it. Sure, it can be funny to have a narration by someone like Sir David Attenborough, or Morgan Freeman, or… all manner of people… but to trot out their voices to misrepresent truth is a very grey area in an era of half-truths and outright lies being distributed on the Babel of the Internet.
Somewhere – I believe it was in Lessig’s ‘Free Culture’ – I had read that the UK allowed artists to control how their works were used. A quick search turned this up:
The Copyright, Designs and Patents Act 1988, is the current UK copyright law. It gives the creators of literary, dramatic, musical and artistic works the right to control the ways in which their material may be used. The rights cover: Broadcast and public performance, copying, adapting, issuing, renting and lending copies to the public. In many cases, the creator will also have the right to be identified as the author and to object to distortions of his work.
It would seem that something similar would have to be done with the voices and even appearance of people around the world – yet in an age moving toward artificial intelligence, where content has been scraped without permission, the only people who can actually stop doing this are the ones who are scraping the content.
The world of trusted humans is being diluted by untrustworthy humans.
The recent news of Stack Overflow selling its content to OpenAI was something I expected. It was a matter of time. Users of Stack Overflow were surprised, which I am surprised by, and upset, which I'm not surprised by.
That seems to me a reasonable response. Who wouldn't be? Yet when we contribute to websites for free on the Internet and it's not our website, it's always a terrible bargain. You give of yourself for whatever reason – fame, prestige, or just sincerely enjoying helping – and it gets traded into cash by someone else.
But companies don’t want you to get wise. They want you to give them your content for free so that they can tie a bow around it and sell it. You might get a nice “Thank you!” email, or little awards of no value.
No Good Code Goes Unpunished.
The fallout has been disappointing. People have tried logging in and sabotaging their top answers. I spoke to one guy on Mastodon a few days ago and he got banned. It seems pretty obvious to me that they had already backed up the database before the deal, and that they would be keeping an eye out for vandalism. Software developers should know that. There was also some confusion about the Creative Commons licensing the site uses versus the rights granted to the owners of the website – which are not mutually exclusive.
Is it slimy? You bet. It’s not new, and the companies training generative AI have been pretty slimy. The problem isn’t generative AI, it’s the way the companies decide to do business by eroding trust with the very market for their product while poisoning wells that they can no longer drink from. If you’re contributing answers for free that will be used to train AI to give the same answers for a subscription, you’re a silly person1.
These days, generative AI companies need to put filters on the front of their learning models to keep small children from getting sucked in.
Remember Huffington Post?
Huffington Post had this neat little algorithm for swapping around headlines until it found one that people liked, it gamed SEO, and it built itself into a powerhouse that almost no one remembers now. It was social, it was quirky, and it was fun. Volunteers put up lots of great content.
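HuffPost never published how that headline-swapping worked, but the description reads like a simple multi-armed bandit over candidate headlines: show each one to some readers, track clicks, and converge on the winner. A minimal epsilon-greedy sketch of that idea – all names and numbers here are hypothetical, not HuffPost's actual system:

```python
import random

def pick_headline(stats, epsilon=0.1):
    """Epsilon-greedy choice: usually show the headline with the best
    observed click-through rate, occasionally explore a random one."""
    if random.random() < epsilon:
        return random.choice(list(stats))
    # Best observed CTR; headlines with no views default to 0.0
    return max(stats, key=lambda h: stats[h]["clicks"] / max(stats[h]["views"], 1))

def record(stats, headline, clicked):
    """Update view and click counts for the headline that was shown."""
    stats[headline]["views"] += 1
    if clicked:
        stats[headline]["clicks"] += 1

# Simulated readers: headline B is "better" (15% vs 5% true click rate)
stats = {"A": {"views": 0, "clicks": 0}, "B": {"views": 0, "clicks": 0}}
random.seed(42)
for _ in range(1000):
    h = pick_headline(stats)
    record(stats, h, clicked=(random.random() < (0.05 if h == "A" else 0.15)))
```

After enough traffic, the loop ends up showing the better-clicking headline most of the time – which is all "swapping around headlines til it found one people liked" really requires.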
I knew a professional journalist who was building up her portfolio and added some real value – I met her at a conference in Chicago probably a few months before the sale, and I asked her why she was contributing to HuffPost for free. She said it was a good outlet to get some things out – and she was right. When it sold, she was angry. She felt betrayed, and rightfully so I think.
It seems people weren’t paying attention to that. I did2.
You live, you learn, and you don't do it again. With firsthand and secondhand experience behind me, if I'm writing without getting paid, it's on my own website. Don't trust anyone who says, "Contribute and good things will happen!" Yeah, they might, but it's unlikely they'll happen for you.
If your content is good enough for a popular site, it’s good enough to get paid to be there. You in the LinkedIn section – pay attention.
The question we should be asking is whether it's worth putting anything on the Internet at this point, just to have it folded into a statistical algorithm that chews up our work and spits out something like it. Sure, there are copyright lawsuits happening. But the argument that these are transformative works doesn't hold up well in a sane mind, given the exponentially larger amount of content being used to create a generative AI at this point.
So what happens when fewer people contribute their own work? One thing is certain: the social aspect of the Internet will not thrive as well.
Social.
The Stack Overflow website was mainly an annoyance for me over the years, but I understand that many people had a thriving society of a sort there. It was largely a meritocracy, like open source, at least at its core. You'll note that I'm writing of it in the past tense – I don't think anyone with any bit of self-worth will contribute there anymore.
The annoyance aspect for me came from (1) not finding solutions to the quirky problems that people hired me to solve3, and (2) finding code fragments I tracked down to Stack Overflow poorly (if at all) adapted to the employer's or client's needs. I had also learned not to give away valuable things for free, so I didn't get involved. Most, if not all, of the work I did required my silence on how things worked, and on a site like Stack Overflow, your keyboard might just get you in trouble. Yet the problem wasn't the site itself, but those who borrowed code like it was a cup of sugar instead of a recipe.
Beyond us software engineers, developers, or whatever we call ourselves these days, there are a lot of websites with social interaction that are likely getting their content shoved into an AI learning model at some point. LinkedIn, owned by Microsoft and annoyingly in the top search results, is ripe for being used that way.
LinkedIn doesn't pay for content, yet if you manage to get popular, you can make money off of sponsored posts. "Hey, say something nice about our company, here's $x." That's not really social, but it's how 'influencers' make money these days: sponsored posts. When you get paid to write posts in that way, you might be selling your soul unless you keep a good moral compass, but when bills need to get paid, that moral compass sometimes goes out the window. I won't say everyone is like that, but it is a danger, and it's why I don't care much about 'influencers'.
In my mind, anyone who is an influencer is trying to sell me something, or has an ego so large that Zaphod Beeblebrox would be insanely jealous.
Regardless, to get popular, you have to contribute content. Who owns LinkedIn? Microsoft. Who is Microsoft partnered with? OpenAI. The dots are there. Maybe they’re not connected. Maybe they are.
Other websites are out there that are building on user content. The odds are good that they have more money for lawyers than you do, that their content licensing and user agreement work for them and not you, and if someone wants to buy that content for any reason… you’ll find out what users on Stack Overflow found out.
All relationships are built on trust. All networks are built on trust. The Internet is built on trust.
I volunteered some stuff to WorldChanging.com way back when with the understanding it would be Creative Commons licensed. I went back and forth with Alex and Jamais, as did a few other contributors, and because of that and some nastiness related to the Alert Retrieval Cache, I walked away from the site – only to find out, from an editor who contacted me about their book, that they wanted to use some of my work. Nope. I don't trust futurists, and maybe you shouldn't either.↩︎
I always seemed to be the software engineer who could make sense out of gobbledygook code, rein it in, take it to water, and convince it to drink.↩︎
Recently, I've been active in a group on Facebook that is supposed to be a polite space to debate things. News articles fly around, and the news articles we see these days from different sources carry their own biases because, rather than just presenting facts, they present narratives – and narratives require framing.
I wondered how many of these articles were generated by what we're calling artificial intelligence these days. In researching, I can tell you I'm still wondering – but I have found some things that are of interest.
The New York Times Lawsuit.
It’s only fair to get this out of the way since it’s short and sweet.
Of course, in the news now is the lawsuit that the New York Times has brought against Microsoft and OpenAI, where speculation runs rampant either way. To their credit, through that link, the New York Times presented things in as unbiased a way as possible. Everyone's talking about that, but speculation on it only really impacts investors and share prices. It doesn't help people like me as much, who write their own content as individuals.
Either way, that lawsuit is likely not going to help my content stay out of a learning model because I just don’t have the lawyers. Speculating on it doesn’t get me anywhere.
Well, we don’t really know how many news agencies are using artificial intelligence or how. One would think disclosure would be the issue then.
The arguments against disclosure are pretty much summed up below (an extract from a larger, well-balanced article).
Against disclosure
One concern is that it could stifle innovation. If news organisations are required to disclose every time they use AI, they may be less likely to experiment with the technology.
Another is that disclosure could be confusing for consumers. Not everyone understands how AI works. Some people may be suspicious of AI-generated content. Requiring disclosure could make it more difficult for consumers to get the information they need.
In my opinion, placing innovation over trust – which is the actual argument being made by some – is abhorrent. To innovate, one needs that trust, and if you want that trust, it seems to me it has to be earned. Given the present state of news outlets, in their many shades of truth and bias, this might seem completely alien to some.
Arist von Harpe is cited in the article for saying, “We do not highlight AI-aided articles. We’re only using [AI] as a tool. As with any tool, it’s always the person using it who is responsible for what comes out.” This seems a reasonable position, and puts the accountability on the humans related to it. I have yet to see artificial intelligences be thrown under the bus for an incorrect article, so we have that landmark to look for.
The rest of that article is pretty interesting and mentions fact checking, which is peculiar given the prevalence of hallucinations and even strategic deception, as well as image generation, etc.
It is still disturbing that we don't have much insight into the learning models being used, which is a consistent problem. The New York Times lawsuit seems somewhat helpful there.
I honestly tried to find out what I could here and in doing so came up with my own conclusion that wasn’t what I would have expected it to be.
In the end, it is as Arist von Harpe says. We have to judge based on the stories we get, because every newsroom will do things differently. It would have helped if we had had less room to speculate on biases before the creation of these artificial intelligence tools, and whoever screws up should lose some trust. In this day and age, though, feeding cognitive biases seems to trump trust.
That’s probably the discussion we should have had some time ago.
These paywalls are super-annoying for us mere mortals who do not have the deep pockets of corporate America. How many subscriptions is a well-informed person supposed to have? It's gotten ridiculous. We've known that business models for news have been in such trouble that a 'news story' has a more literal definition these days, but… surely we can do better than this? ↩︎