When You Can’t Trust Voices.

November 18, 2024November 18, 2024 ~ Taran ~ Leave a comment

Generative AI is allowing people to do all sorts of things, including imitating voices we have come to respect and trust over the years. In the most recent case of Sir David Attenborough, he greatly objects to it and finds it ‘profoundly disturbing’.

His voice is being used in all manner of ways.

It wasn’t long ago that Scarlet Johannson suffered such an insult that was quickly ‘disappeared’.

The difference here is that a man who has spent decades showing people the natural world has his voice being used in disingenuous ways, and it should give us all pause. I use generative artificial intelligence, as do many others, but there would be no way that I would even consider misrepresenting what I write or work on in the voice of someone else.

Who would do that? Why? It dilutes it. Sure, it can be funny to have a narration by someone like Sir David Attenborough, or Morgan Friedman, or… all manner of people… but to trot out their voices to misrepresent truth is a very grey area in an era of half-truths and outright lies being distributed on the Babel of the Internet.

Somewhere – I believe it was in Lessig’s ‘Free Culture’ – I had read that the UK allowed artists to control how their works were used. A quick search turned this up:

The Copyright, Designs and Patents Act 1988, is the current UK copyright law. It gives the creators of literary, dramatic, musical and artistic works the right to control the ways in which their material may be used. The rights cover: Broadcast and public performance, copying, adapting, issuing, renting and lending copies to the public. In many cases, the creator will also have the right to be identified as the author and to object to distortions of his work.
The UK Copyright Service

It would seem that something similar would have to be done with the voices and even appearance of people around the world – yet in an age moving toward artificial intelligence, where content has been scraped without permission, the only people who can actually stop doing this are the ones who are scraping the content.

The world of trusted humans is being diluted by untrustworthy humans.

When The Internet Eats Itself

May 10, 2024 ~ Taran ~ 10 Comments

The recent news of Stack Overflow selling it’s content to OpenAI was something I expected. It was a matter of time. Users of Stack Overflow were surprised, which I am surprised by, and upset, which I’m not surprised by.

That seems to me a reasonable response. Who wouldn’t? Yet when we contribute to websites for free on the Internet and it’s not our website, it’s always a terrible bargain. You give of yourself for whatever reason – fame, prestige, or just sincerely enjoying helping, and it gets traded into cash by someone else.

But companies don’t want you to get wise. They want you to give them your content for free so that they can tie a bow around it and sell it. You might get a nice “Thank you!” email, or little awards of no value.

No Good Code Goes Unpunished.

The fallout has been disappointing. People have tried logging in and sabotaging their top answers. I spoke to one guy on Mastodon a few days ago and he got banned. It seems pretty obvious to me that they had already backed up the database where all the stuff was, and that they would be keeping an eye on stuff. Software developers should know that. There was also some confusion about the Creative Commons licensing the site uses versus the rights given to the owners of the website, which are mutually exclusive.

Is it slimy? You bet. It’s not new, and the companies training generative AI have been pretty slimy. The problem isn’t generative AI, it’s the way the companies decide to do business by eroding trust with the very market for their product while poisoning wells that they can no longer drink from. If you’re contributing answers for free that will be used to train AI to give the same answers for a subscription, you’re a silly person¹.

These days, generative AI companies need to put filters on the front of their learning models to keep small children from getting sucked in.

Remember Huffington Post?

Huffington Post had this neat little algorithm for swapping around headlines til it found one that people liked, it gamed SEO, and it built itself into a powerhouse that almost no one remembers now. It was social, it was quirky, and it was fun. Volunteers put up lots of great content.

When Huffingpost sold for $315 million, the volunteers who provided the content for free and built the site up before it sold sued – and got nothing. Why? Because they had volunteered their work.

I knew a professional journalist who was building up her portfolio and added some real value – I met her at a conference in Chicago probably a few months before the sale, and I asked her why she was contributing to HuffPost for free. She said it was a good outlet to get some things out – and she was right. When it sold, she was angry. She felt betrayed, and rightfully so I think.

It seems people weren’t paying attention to that. I did².

You live, you learn, and you don’t do it again. With firsthand and second hand experience, if I write on a website and I don’t get paid, it’s my website. Don’t trust anyone who says, “Contribute and good things will happen!”. Yeah, they might, but it’s unlikely it will happen for you.

If your content is good enough for a popular site, it’s good enough to get paid to be there. You in the LinkedIn section – pay attention.

Back To AI’s Intake Manifold.

I’ve written about companies with generative AI models scraping around looking for content, with contributed works to websites being a part of the training models. It’s their oil, it’s what keeps them burning through cash as they try to… replace the people whose content they use. In return, the Internet gets slop generated all over, and you’ll know the slop when you read it – it lacks soul and human connection, though it fakes it from time to time like the pornographic videos that make the inexperienced think that’s what sex is really like. Nope.

The question we should be asking is whether it’s worth putting anything on the Internet at this point, just to have it folded into a statistical algorithm that chews up our work and spits out something like it. Sure, there are copyright lawsuits happening. The argument of transformative works doesn’t really work that well in a sane mind when it comes to the exponentially higher amount of content used to create a generative AI at this point.

So what happens when less people contribute their own work? One thing is certain: the social aspect of the Internet will not thrive as well.

Social.

The Stack Overflow website was mainly an annoyance for me over the years, but I understand that many people had a thriving society of a sort there. It was largely a meritocracy, as open source, at least at it’s core. You’ll note that I’m writing of it in the past tense – I don’t think anyone with any bit of self-worth will contribute there anymore.

The annoyance aspect for me came from (1) Not finding solutions to the quirky problems that people hired me to solve³, and (2) Finding code fragments I tracked down to Stack Overflow poorly (if at all) adapted to the employer or client needs. I also had learned not to give away valuable things for free, so I didn’t get involved. Most, if not all, of the work I did required my silence on how things worked, and if you get on a site like StackOverflow – your keyboard might just get you in trouble. Yet the problem wasn’t the site itself, but those who borrowed code like it was a cup of sugar instead of a recipe.

Beyond we software engineers, developers, whatever they call themselves these days, there are a lot of websites with social interaction that are likely getting their content shoved into an AI learning model at some point. LinkedIn, owned by Microsoft, annoyingly in the top search results, is ripe for being used that way.

LinkedIn doesn’t pay for content, yet if you manage to get popular, you can make money off of sponsored posts. “Hey, say something nice about our company, here’s $x”. That’s not really social, but it’s how ‘influencers’ make money these days: sponsored posts. When you get paid to write posts in that way, you might be selling your soul unless you keep a good moral compass, but when bills need to get paid, that moral compass sometimes goes out the window. I won’t say everyone is like that, I will say it’s a danger and why I don’t care much about ‘influencers’.

In my mind, anyone who is an influencer is trying to sell me something, or has an ego so large that Zaphod Beeblebrox would be insanely jealous.

Regardless, to get popular, you have to contribute content. Who owns LinkedIn? Microsoft. Who is Microsoft partnered with? OpenAI. The dots are there. Maybe they’re not connected. Maybe they are.

Other websites are out there that are building on user content. The odds are good that they have more money for lawyers than you do, that their content licensing and user agreement work for them and not you, and if someone wants to buy that content for any reason… you’ll find out what users on Stack Overflow found out.

All relationships are built on trust. All networks are built on trust. The Internet is built on trust.

The Internet is eating itself.

_{I am being kind.} ↩︎
_{I volunteered some stuff to WorldChanging.com way back when with the understanding it would be Creative Commons licensed. I went back and forth with Alex and Jamais, as did a few other contributors, and because of that and some nastiness related to the Alert Retrieval Cache, I walked away from the site to find out from an editor that contacted me about their book that they wanted to use some of my work. Nope. I don’t trust futurists, and maybe you shouldn’t either.} ↩︎
_{I always seemed to be the software engineer that could make sense out of gobblygook code, rein it in, take it to water and convince it to drink.} ↩︎

How Much AI In Journalism?

January 8, 2024 ~ Taran ~ 2 Comments

Recently, I’ve been active in a group on Facebook that is supposed to be a polite space to debate things. News articles fly around, and the news articles we see these days from different sources carry their own biases because rather than just presenting facts, they present narratives and narratives require framing.

I wondered how much of these articles were generated by what we’re calling artificial intelligence these days. In researching, I can tell you I’m still wondering – but I have found some things that are of interest.

The New York Times Lawsuit.

It’s only fair to get this out of the way since it’s short and sweet.

Of course, in the news now is the lawsuit that the New York Times has brought against Microsoft and OpenAI, where speculation runs rampant either way. To their credit, through that link, the New York Times presented things in as unbiased way as possible. Everyone’s talking about that, but speculation on that only really impacts investors and share prices. It doesn’t help people like me as much, who write their own content as individuals.

In an odd twist though, and not too long after the announcement, OpenAI is offering to pay for licensing news articles (paywalled), which you can read more about here if you’re not paying TheInformation¹.

Either way, that lawsuit is likely not going to help my content stay out of a learning model because I just don’t have the lawyers. Speculating on it doesn’t get me anywhere.

How Much AI is being used?

Statista only has one statistic they cite in the amount of artificial intelligence used in media and entertainment: 78 percent of U.S. adults think news articles created by AI is not a good thing.

The articles there go on and tell us about the present challenges, etc, but one word should stand out from that: foggy.

So how would it be used, if it is used? With nearly 50 ‘news’ websites as of May last year, almost a year ago, and with one news site even going so far as having an AI news anchor as of late last year, we should have questions.

Well, we don’t really know how many news agencies are using artificial intelligence or how. One would think disclosure would be the issue then.

The arguments against disclosure are pretty much summed up below (an extract from a larger well balanced article).

Against disclosure

One concern is that it could stifle innovation. If news organisations are required to disclose every time they use AI, they may be less likely to experiment with the technology.

Another is that disclosure could be confusing for consumers. Not everyone understands how AI works. Some people may be suspicious of AI-generated content. Requiring disclosure could make it more difficult for consumers to get the information they need.
Should the media tell you when they use AI to report the news? What consumers should know, Jo Adetunji, Editor, TheConversationUK, TheConversation.com, November 14, 2023.

On the surface, the arguments make sense.

In my opinion, placing innovation over trust, which is the actual argument being made by some with that argument, is abhorrent. To innovate, one needs that trust and if you want that trust, it seems to me that the trust has to be earned. This, given the present state of news outlets in their many shades of truth and bias might seem completely alien to some.

I do encourage people to read that entire article because the framing of it here doesn’t do the article justice and I’ve simply expressed an opinion on one side of the arguments presented.

How Is AI presently used?

Again, we really don’t know because of the disclosure issue, but in October of last year Twipe published 10 ways that journalists are using artificial intelligence. It points from the onset to Klara Indernach, a tag used by Express.de to note when an article is created with artificial intelligence tools.

Arist von Harpe is cited in the article for saying, “We do not highlight AI-aided articles. We’re only using [AI] as a tool. As with any tool, it’s always the person using it who is responsible for what comes out.” This seems a reasonable position, and puts the accountability on the humans related to it. I have yet to see artificial intelligences be thrown under the bus for an incorrect article, so we have that landmark to look for.

The rest of that article is pretty interesting and mentions fact checking, which is peculiar given the prevalence of hallucinations and even strategic deception, as well as image generation, etc.

We’ll never really know.

In the end, I imagine the use of any artificial intelligence in newsrooms is evolving even as I write this and will be evolving well beyond when you read this. In a few years, it may not be as much of a big deal, but now we’re finding failures in artificial intelligences all the way to a court, in a matter that is simply fraught with political consequences. They were quick to throw Google Bard under the bus on that one.

It is still disturbing we don’t have much insight into the learning models being used, which is a consistent problem. The lawsuit of the New York Times seems to be somewhat helpful there.

I honestly tried to find out what I could here and in doing so came up with my own conclusion that wasn’t what I would have expected it to be.

In the end, it is as Arist von Harpe is cited. We have to judge based on the stories we get because every newsroom will do things differently. It would have helped if we had less room to speculate on biases before the creation of these artificial intelligence tools, and whoever screws up should lose some trust. In this day and age, though, feeding cognitive biases seems to trump trust.

That’s probably the discussion we should have had some time ago.

These paywalls are super-annoying for we mere mortals who do not have the deep pockets of corporate America. How many subscriptions is a well informed person supposed to have? It’s gotten ridiculous. We’ve known that business models for news have been in such trouble that a ‘news story’ has a more literal definition these days, but… surely we can do better than this? ↩︎

KnowProSE.com

Where one line can make a difference.

trust