When You Can’t Trust Voices.

Generative AI is allowing people to do all sorts of things, including imitating voices we have come to respect and trust over the years. In the most recent case, that of Sir David Attenborough, he objects strongly and finds it ‘profoundly disturbing’.

His voice is being used in all manner of ways.

It wasn’t long ago that Scarlett Johansson suffered a similar insult, one that was quickly ‘disappeared’.

The difference here is that a man who has spent decades showing people the natural world is having his voice used in disingenuous ways, and it should give us all pause. I use generative artificial intelligence, as do many others, but I would never consider misrepresenting what I write or work on in someone else’s voice.

Who would do that? Why? It dilutes the trust those voices have earned. Sure, it can be funny to have a narration by someone like Sir David Attenborough, or Morgan Freeman, or… all manner of people… but trotting out their voices to misrepresent truth is a very grey area in an era of half-truths and outright lies being distributed on the Babel of the Internet.

Somewhere – I believe it was in Lessig’s ‘Free Culture’ – I had read that the UK allowed artists to control how their works were used. A quick search turned this up:

The Copyright, Designs and Patents Act 1988, is the current UK copyright law. It gives the creators of literary, dramatic, musical and artistic works the right to control the ways in which their material may be used. The rights cover: Broadcast and public performance, copying, adapting, issuing, renting and lending copies to the public. In many cases, the creator will also have the right to be identified as the author and to object to distortions of his work.

The UK Copyright Service

It would seem that something similar would have to be done with the voices and even appearances of people around the world – yet in an age moving toward artificial intelligence, where content has been scraped without permission, the only people who can actually stop the scraping are the ones doing it.

The world of trusted humans is being diluted by untrustworthy humans.

The Dark Side of AI.

It didn’t take as long as we expected. Last week, a former school athletic director was arrested for using AI to frame a principal.

With this being a campaign year, I thought that most of the AI hijinks would revolve around elections around the world – and those are happening – but I didn’t think we’d see such early adoption of AI for this sort of thing. And an athletic director, no less – not a title typically known for mastery of technology.

AI has a dark side, which a few of us have been writing about. The Servitor does a good job of documenting what they coined ‘Dark ChatGPT’, well worth a look. Any technology can be twisted to our own devices.

It’s not the technology.

It’s us.

Again.

Maybe the CEO of Google was right about a need for more lawyers.

DHS Artificial Intelligence Safety And Security Board Has Some Odd Appointments.

Now that we’ve seen that generative artificial intelligence can be trained ethically, without breaking copyright laws, the list of people appointed to the DHS Artificial Intelligence Safety and Security Board seems less than ideal.

The Board is supposed to ‘advance AI’s responsible development and deployment’ (emphasis mine, on ‘responsible’), yet some on that Board took shortcuts.

Shortcuts in relation to any national security issue seem like a bad thing.

Here’s the list.

There are some dubious companies involved. The argument can be made – and it probably will be – that these companies are part of national infrastructure, but is it national infrastructure that controls the United States, or is it the other way around?

I don’t know whether these picks are good or bad. I will say that there are some that, at least in the eyes of others, have been irresponsible. That would fall under Demonstrated Unreliability.

Copyright, AI, And Doing It Ethically.

It’s no secret that the generative, sequacious artificial intelligences out there have copyright issues. I’ve written about it myself quite a bit.

It’s almost become cliché to mention copyright and AI in the same sentence, with Sam Altman having said that there would be no way to do generative AI without all that copyrighted material – toward the end of this post, you’ll see that someone proved that wrong.

“Copyright Wars pt. 2: AI vs the Public”, by Toni Aittoniemi in January of 2023, is a really good read on the problem of large AI companies sucking in content without permission. If an individual did it, those same companies would call it ‘piracy’; when they do it, it’s… not? That’s crazy.

The timing of me finding Toni on Mastodon was perfect. Yesterday, I found a story on Wired that demonstrates some of what Toni wrote last year, where he posed a potential way to handle the legal dilemmas surrounding creators’ rights – we call it ‘copyright’ because someone was pretty unimaginative and pushed two words together for only one meaning.

In 2023, OpenAI told the UK parliament that it was “impossible” to train leading AI models without using copyrighted materials. It’s a popular stance in the AI world, where OpenAI and other leading players have used materials slurped up online to train the models powering chatbots and image generators, triggering a wave of lawsuits alleging copyright infringement.

Two announcements Wednesday offer evidence that large language models can in fact be trained without the permissionless use of copyrighted materials.

A group of researchers backed by the French government have released what is thought to be the largest AI training dataset composed entirely of text that is in the public domain. And the nonprofit Fairly Trained announced that it has awarded its first certification for a large language model built without copyright infringement, showing that technology like that behind ChatGPT can be built in a different way to the AI industry’s contentious norm.

“There’s no fundamental reason why someone couldn’t train an LLM fairly,” says Ed Newton-Rex, CEO of Fairly Trained. He founded the nonprofit in January 2024 after quitting his executive role at image-generation startup Stability AI because he disagreed with its policy of scraping content without permission….

“Here’s Proof You Can Train an AI Model Without Slurping Copyrighted Content”, Kate Knibbs, Wired.com, March 20th, 2024

It struck me yesterday that a lot of us writing and communicating about the copyright issue didn’t address how it could be handled. It’s not that we didn’t know it could be handled; we just haven’t addressed it as much as we should. I went to sleep considering that, and in the morning found that Toni had done much of the legwork.

What Toni wrote extends the idea:

…Any training database used to create any commercial AI model should be legally bound to contain an identity that can be linked to a real-world person if so required. This should extend to databases already used to train existing AI’s that do not yet have capabilities to report their sources. This works in two ways to better allow us to integrate AI in our legal frameworks: Firstly, we allow the judicial system to work it’s way with handling the human side of the equation instead of concentrating on mere technological tidbits. Secondly, a requirement of openness will guarantee researches to identify and question the providers of these technologies on issues of equality, fairness or bias in the training data. Creation of new judicial experts at this field will certainly be required from the public sector…

“Copyright Wars pt. 2: AI vs the Public”, Toni Aittoniemi, Gimulnautti, January 13th, 2023.

This is sort of like – and it’s my interpretation – a citation system tokenized into the model itself. It would expand on what, as an example, Perplexity AI does, by allowing style and ideas to have provenance.
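To make that interpretation concrete, here is a minimal sketch of what a provenance record bound to training data might look like – a thought experiment only, with every name in it hypothetical rather than drawn from any real system:

```python
# Hypothetical sketch: a provenance record for training data, along the
# lines Aittoniemi proposes - an identity that can be linked to a
# real-world person, plus the terms under which the work was obtained.
from dataclasses import dataclass

@dataclass(frozen=True)
class ProvenanceRecord:
    work_id: str         # stable identifier for the source work
    creator: str         # real-world identity behind the work
    license: str         # terms under which the work may be trained on
    retrieved_from: str  # where the work was obtained

def cite(records: list[ProvenanceRecord]) -> str:
    """Render a human-readable citation trail for a generated passage,
    roughly the way Perplexity AI surfaces web sources for its answers."""
    return "; ".join(f"{r.creator} ({r.work_id}, {r.license})" for r in records)

# Example: a generated paragraph carrying its sources with it.
sources = [ProvenanceRecord("essay-001", "Jane Doe", "CC-BY-4.0", "janedoe.example")]
print(cite(sources))  # -> Jane Doe (essay-001, CC-BY-4.0)
```

The hard engineering question such a sketch glosses over is where these records would live: attached to the dataset, carried through training, or reconstructed at inference time – each of which is a very different problem.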

This is some great food for thought for the weekend.

Our Technology And Ethics.

The headlines this past week have had Google’s relationship with Israel under scrutiny, as Google fired employees who were against what Israel has been doing and protested accordingly. I’ve looked at some of the news stories, some sympathizing with the former employees, some implicitly supporting Israel and the order that people expect within companies.

I won’t comment on that because that’s political and this isn’t about politics, or who is right or wrong.

Of Swords And Blacksmiths.

Throughout my career as a software engineer, I’ve had to deal with ethical issues, and I’ve navigated them as best I could – challenging as some of them were, a few of them personally so.

Ever since we figured out how to bonk each other over the heads with stones (stone technology), it seems we’ve found increasing occasion to do so. It could be that the first use of such weapons was for hunting or defense of the tribe from predators – likely both – but eventually we learned to turn them on ourselves.

I’m sure at some point there was a blacksmith who refused to make swords because of where the points and edges were aimed. Other blacksmiths just made them. There always seems to be someone else to kill, or to defend against. We could get into the Great Gun Debate, but we fall into the same problem with that. There’s always some human creeping around who wants to kill someone else for glorified reasons, and because of that we sleep with things under our pillows that could very well be used to kill us just as easily. It’s not a debate. It’s a criticism of humanity and an unfortunately honest one at that.

“We all lived for money, and that is what we died for.”

William T. Vollmann, No Immediate Danger: Volume One of Carbon Ideologies

Sometimes my ethics require me to move on, which I did without protest a few times over the decades: there’s always someone else who needs a job more than they care about an ethical issue, if they even see the ethical issue. In the end we try, hopefully, to do more good than bad, but both of those are subjective.

Too often we use a technology as a scapegoat, an externalized criticism of ourselves that allows us to keep doing what we do. Technology can be used for good or bad; how we use it says something about ourselves. When we criticize the use of technology, we implicitly criticize ourselves, but we don’t take the criticism, because we have neatly placed the blame on a vague, externalized concept – a deflection at a species level, often because we buy into the idea that the enemy is less than human. Yet we are all human, despite the ideologies, cultures, languages, and color codings that we don’t all neatly fit into.

We Are All Blacksmiths.

These days, with generative AI allowing us to paint the fence of the future once we give the corporations in control of it a few baubles, everything we do on the Internet is potentially a weapon to be used against someone else. While the firing of the Google employees who protested is news, those who still work there aren’t – which is not to say that they aren’t faced with their own ethical dilemmas. We who work in technology hope that our work is used for good.

I worked at one place that started off with robocalling software used to annoy people during elections and later turned itself into an emergency communications service. Things can change, businesses can change, and controlling even a part of the infrastructure of a nation’s military can have unexpected consequences for everyone involved. What happens if Google suddenly doesn’t like something and turns it off?

The future is decidedly fickle. Our personal ethics should inform our collective ethics, but they often don’t. They can.

We build tools. Sadly, sometimes they aren’t used the way we would like, and we should try to influence things if we can – but ultimately, we are subject to a fickle future and good intentions that can be misdirected.

AI, Ethics, Us.

Most of us live in a lot of different worlds, and we see things differently because of it. Some of us live in more than one world at a time. That’s why it’s sometimes hard for me to weigh the promise of artificial intelligence against what we’re getting and the direction it’s going.

There’s space in this world, in research, for what we have now, which allows previously isolated knowledge to be regurgitated in a feat of math that makes the digital calculator look mundane. It’s statistics; it gives us what we want when we hit the ‘Enter’ key, and that’s not too bad.

Except it can replace an actual mind. Previously, if you read something, you didn’t have to guess whether a machine threw the words together. You didn’t wonder if the teacher gave you a test generated by a large language model, and the teacher didn’t wonder if you generated your answers the same way.

Now, we wonder. We wonder whether an image is real. We wonder whether a video is real. We wonder enough that the most popular female name for 2023 should be Alice.

So let me tell you where I think we should be heading with AI at this time.

What Could Be.

Everyone who is paying attention to what’s happening can see that the world is fairly volatile right now, after the global pandemic and a lot of economic issues that banks created combined with algorithmic trading… so this is the perfect time to drop some large language models into the world to make things better.

Nope.

No, it isn’t working that way. If we were focused on making the world better rather than worrying about a good prompt for that term paper or blog post, it maybe could work that way. We could use things like ChatGPT as consultants, but across mankind we lack the integrity to use them only as consultants.

“If anyone takes an AI system and starts designing speeches or messages, they generate the narrative that people want to hear. And the worst thing is that you don’t know that you are putting the noose around your neck all by yourself.” The academic added that the way to counter this situation is education.

The only way to avoid manipulation is through knowledge. Without this, without information, without education, any human group is vulnerable, he concluded.1

“IA: implicaciones éticas más allá de una herramienta tecnológica”, Miguel Ángel Pérez Álvarez, Wired.com (Spanish), 29 Nov 2023.

There’s the problem. Education needs to adapt to artificial intelligence as well, because this argument, which at heart I believe to be true, does not survive its own recursion: people don’t know when it’s ethically right to use AI, or even that there should be ethics involved.

As it happens, I’m pretty sure Miguel Ángel Pérez Álvarez already understands this and simply had his thoughts truncated, as happens in articles. He’s also got me wondering how different languages are handled by these Large Language Models and how different their training models are.

It’s like finding someone using an image you created and telling them, “Hey, you’re using my stuff!”, only for them to say, “But it was on the Internet”. Never mind the people who believe that the Earth is flat, or who think that vaccinations give you better mobile connections.

AI doesn’t bother me. It’s people, it’s habits, and in a few decades they’ll put a band-aid on it and call it progress. The trouble is that we have a stack of band-aids on top of each other at this point, and we really need to look at this beyond the pulpits of some billionaires who enjoy more free speech than anyone else.

  1. actual quote: “Si cualquier persona toma un sistema de IA y se pone a diseñar discursos o mensajes, te generan la narrativa que la gente quiere escuchar. Y lo peor es que tú no sabes que te estás poniendo la soga al cuello solito”. El académico añadió que la manera de contrarrestar esta situación es la educación.

    “La única manera de evitar la manipulación es a través del conocimiento. Sin este, sin información, sin educación, cualquier grupo humano es vulnerable”, finalizó. ↩︎

Exploring Beyond Code 2.0: Into A World of AI.

It’s become a saying on the Internet without many people understanding it: “Code is Law”. This is a reference to one of the works of Lawrence Lessig, revised since its original publication.

Code Version 2.0 dealt with many of the nuances of Law and Code in an era where we are connected by code. The fact that you’re reading this implicitly means that the Code allowed it.

Here’s an example that weaves its way throughout our society.

One of the more disturbing things to consider is that when Alexis de Tocqueville wrote Democracy in America 1, he recognized the jury as a powerful mechanism for democracy itself.

“…If it is your intention to correct the abuses of unlicensed printing and to restore the use of orderly language, you may in the first instance try the offender by a jury; but if the jury acquits him, the opinion which was that of a single individual becomes the opinion of the country at large…”

Alexis de Tocqueville, Volume 1 of Democracy in America, Chapter XI: Liberty of the Press In the United States (direct link to the chapter within Project Gutenberg’s free copy of the book)

In this, he makes the point that public opinion on an issue is summarized by the jury, for better and worse. Implicit in that is the discussion within the Jury itself, as well as the public opinion at the time of the trial. This is indeed a powerful thing, because it allows the people to decide instead of those in authority. Indeed, the jury gives authority to the people.

‘The People’, of course, means the citizens of a nation, and within that there is discourse between members of society regarding whether something is or is not right, or ethical, within the context of that society. In essence, it allows ethics to breathe, and in so doing, it allows Law to be guided by the ethics of a society.

It’s likely no mistake that some of the greatest concerns in society stem from divisions in what people consider to be ethical. Abortion is one of those key issues, where the ethics of the rights of a woman are put into conflict with the rights of an unborn child. On either side of the debate, people have an ethical stance based on their beliefs without compromise. Which is more important? It’s an extreme example, and one that is still playing out in less than complimentary ways for society.

Clearly no large language model will solve it, since large language models are trained with implicitly biased training data and algorithms – which is why they shouldn’t be involved – and the same would likely go for the general artificial intelligences of the future. Machine learning, or deep learning, learns from us, and every learning model is developed by its own secret jury, whose stewed biases may not reflect the whole of society.

In fact, they would reflect a subset of society as disconnected from society as the companies that make them, since a company hires people based on its own values to move toward its version of success. Companies are about making money. Creating value is a very subjective thing for human society, but money is its currency.

With artificial intelligence involved in so many things, and becoming more involved all the time, people should at the least be concerned:

  • AI-powered driving systems are trained to identify people, yet have been shown to detect people with darker skin tones less reliably.
  • AI-powered facial recognition systems are trained on datasets of facial images. The code that governs these systems determines which features of a face are used to identify individuals, and how those features are compared to the data in the dataset. As a result, the code can have a significant impact on the accuracy and fairness of these systems, which have been shown to have an ethnic bias (see the sketch after this list).
  • AI-powered search engines are designed to rank websites and other online content according to their relevance to a user’s query. The code that governs these systems determines how relevance is calculated, and which factors are considered. As a result, the code can have a significant impact on the information that users see, and therefore what they discuss, and how they are influenced.
  • AI-powered social media platforms are designed to connect users with each other and to share content. The code that governs these platforms determines how users are recommended to each other, and how content is filtered and ranked. As a result, the code can have a significant impact on the experiences of users on these platforms – aggregating into echo chambers.
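As a toy illustration of the facial recognition point above – synthetic numbers and a deliberately crude matcher, not any real system – a recognizer trained on data that under-represents one group tends to be measurably less accurate for that group:

```python
# Toy sketch, not a real recognizer: when one group is under-represented
# in the training data, a nearest-neighbor "matcher" makes more mistakes
# on that group. All numbers here are synthetic and illustrative.
import numpy as np

rng = np.random.default_rng(42)

def make_group(center, n):
    """Generate n two-dimensional 'face embeddings' clustered around a center."""
    return rng.normal(loc=center, scale=1.0, size=(n, 2))

# Training data: group A is heavily represented, group B barely at all.
train_a = make_group(center=(0.0, 0.0), n=500)
train_b = make_group(center=(3.0, 3.0), n=10)

def predicted_group(x):
    """Crude recognition step: whichever group holds the nearest training
    example wins."""
    dist_a = np.min(np.linalg.norm(train_a - x, axis=1))
    dist_b = np.min(np.linalg.norm(train_b - x, axis=1))
    return "A" if dist_a < dist_b else "B"

# Fresh test samples from each group.
test_a = make_group(center=(0.0, 0.0), n=200)
test_b = make_group(center=(3.0, 3.0), n=200)

acc_a = np.mean([predicted_group(x) == "A" for x in test_a])
acc_b = np.mean([predicted_group(x) == "B" for x in test_b])
print(f"accuracy on well-represented group A: {acc_a:.1%}")   # typically near 100%
print(f"accuracy on under-represented group B: {acc_b:.1%}")  # typically lower
```

The model class hardly matters here; the disparity comes from the composition of the training data, which is exactly why the code and the datasets governing these systems deserve scrutiny.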

We were behind before artificial intelligence reared its head recently with the availability of large language models, separating ourselves in ways that polarized us and made compromise impossible.

Maybe it’s time for Code Version 3.0. Maybe it’s time we really got to talking about how our technology will impact society beyond a few smart people.

1 This was covered in Volume 1 of ‘Democracy in America‘, available for free here on Project Gutenberg.

Of Digital Shadows And Digital Ghosts

Ice, Shadow and Stone

In writing about shadows and ghosts, it’s hard not to draw the line to how we process data – the phrase ‘big data’ gets tossed around a lot in this way.

Data Science allows us to create constructs of data – interpreted and derived, insinuated and insulated – when in fact we know about as much about that data as we do about the people in our own lives: typically not enough to understand them as people, something I alluded to here.

Data only tells us what has happened, not what will happen, and it’s completely bound by the data we have available and the frames we put around it. We can create shadows from that data, but the real value of data is in the ghosts – the collected data in contexts beyond our frames and availability.

This is the implicit flaw in machine learning and even some types of AI. It’s where ethics intersects technology: when technologies have the capacity to affect human lives for better and worse, it becomes a problem of whether they are fair.

And we really aren’t very good at ‘fair’.

Google, War and Ethics.

When 3,000 employees send an email to Google’s CEO about not wanting to be involved with warfare projects, it shows that there are at least 3,000 people who are willing to stand up for their own personal ethics.

It also means that in this day and age, Google employees are paid enough to have the privilege to do so.

Many people are forced to compromise their own ethics to pay the bills. At least some working for one company aren’t in that boat, and they are willing to speak up to their CEO.

Let’s see what happens.

Technology And Arts

Sisyphean Technology

People in technology of my era and later are strange creatures who delve into the depths of the cold and relentless logic of the systems they create and maintain. We see the same in other fields: in Law, in Medicine, in Accounting, and so many others.

Today, as Lessig wrote, ‘Code is Law‘, and Law wrestles with technology even as technology works to circumvent existing Law. Law, as a freshman student will tell you, is not Ethics – it is an attempt at the codification of Ethics in a society. That distinction is important yet routinely forgotten by many – and that’s where some empowered by technology have an ax to grind. Others are just in it for the money, or for some political agenda.

One of the problems we face, as a global society of screen-watchers, is that we have separate silos of technology and arts – where technology is often used as a platform for the liberal arts.