Blocking AI Bots: The Opt Out Issue.

Those of us who create anything – at least without the crutches of a large language model like ChatGPT – are a bit concerned about our work being used to train large language models. We get no attribution and no pay, and the companies that run the models can simply grab our work, train their models, and turn around and charge customers for access to responses that our work helped create.

No single one of us is likely that important. But combined, it’s a bit of a rip-off. One friend suggested being able to block the bots, which is a nearly futile task because it depends on the bots obeying what is in the robots.txt file. There’s no real reason that they have to.

“How to Block AI Chatbots From Scraping Your Website’s Content” is a worthwhile guide to attempting to block the bots. It also makes the point that maybe it doesn’t matter.
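For what it’s worth, the mechanics of that opt-out are simple enough. A robots.txt along these lines covers the better-known AI crawlers – the user-agent strings below (GPTBot, CCBot, Google-Extended, anthropic-ai) are the names those companies published as of this writing, and honoring the file is entirely voluntary on the crawler’s part:

```
# robots.txt – ask known AI training crawlers to stay out.
# Nothing enforces this; well-behaved bots honor it, others won't.

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /
```

Note that `Google-Extended` only opts a site out of AI training; regular Google search indexing is controlled separately by Googlebot.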

I think that it does, at least in principle, because I’m of the firm opinion that websites should not have to opt out of being used by these AI bots – rather, websites should opt in as they wish. Nobody’s asked for anything, have they? Why should these companies use your work, or my work, without recompense and then turn around and charge for access to these things?

Somehow, we got stuck with ‘opting out’ when what the companies running the AI bots should have done is allow people to opt in with a revenue model.

TANSTAAFL. Except if you’re a large tech company, apparently.

On the flip side, Zoom says that they’re not using data from users to train their models. Taken at face value, that’s great, but the real problem is that we wouldn’t know if they did.

Bubbles Distilled By Time.

We all perceive the world through our own little bubbles. As far as our senses go, we have only touch, taste, hearing, smell and sight to go by. The rest comes from what we glean through those, be it from other people, technology, language, culture, etc.

If the bubble is too small, we feel it is a prison and do our best to expand it. Once it’s comfortable, we don’t push it outward as much.

These little bubbles contain ideas that have been passed down through the generations, how others have helped us translate our world and all that is in it, and so on. We’re part of a greater distillation process: because of our own limitations, we can’t possibly carry everything from previous generations.

If we consider all the stuff that creates our bubble as little bubbles themselves that we pass on to the next generation, it’s a distillation of our knowledge and ideas over time. Some fall away, like the idea of the Earth being the center of the Universe. Some stay with us despite not being used as much as we might like – such as the whole concept of ‘be nice to each other’.

If we view traffic as something moving through time, bubbles are racing toward the future all at once, sometimes aggregating, sometimes not. The traffic of ideas and knowledge is distilled as we move forward, one generation at a time. Generally speaking, until broadcast media this was a very local process. Then came the red dots, wielded by those who wish us to do things – from purchasing products to voting for politicians with their financial interests at heart.

Broadcast media made it global, at first by giving people information and then by broadcasting opinions to stay sustainable through advertising. Social media has become the same thing. How will artificial intelligences differ? Will ChatGPT suddenly spew out, “Eat at Joe’s!”? I doubt that.

However, those with fiscal interests can decide what the deep learning systems behind artificial intelligences are exposed to. Machine learning is largely about clever algorithms and pruning the data those algorithms are trained on, and the people doing that are certainly not the most unbiased of humanity. I wouldn’t say they are the most biased either – we’re all biased by our bubbles.

It’s Pandora’s Box. How do we decide what should go in and what should stay out? Well, we can’t, really. Nobody is actually telling us what’s in them now. Our education systems, too, show us that this is not necessarily something we’re good at.

Education and AI

Having coffee with a friend yesterday, I listened to her talk about meetings with teachers and encouraging her daughter to read at least 30 minutes a day. This is likely mundane conversation for many people, but the minutiae of it are interesting to me.

It is, after all, a system.

Everyone I have met or interacted with who was an educator (or librarian – they don’t get enough love) has been passionate about learning. The educational administrators I know have been passionate about test results. Test results and learning are mutually exclusive.

The tests are there to prove to the masses that the educational system is working. They rarely demonstrate what one knows about anything – only what was covered in tests that students are trained for. Learning, to me, is about not just having knowledge but applying it in practical ways. Deep understanding is not found through testing; it’s found through interaction – at least in my experience.

Sure, you can memorize equations and plug numbers into them, but unless you understand the equations themselves, you’re just doing something a computer can do.

So let’s see what computers can do. I asked three different large language models, “Has the quality of education declined?” Here are their three responses:

The question of whether the quality of education has declined is complex and can vary across different contexts and regions. It’s challenging to provide a definitive answer as education systems differ significantly worldwide, and numerous factors can influence quality.

While it’s true that some individuals and experts express concerns about declining educational quality, it is essential to consider various factors that impact education. These factors include changes in curriculum, teaching methods, funding, technological advancements, and societal shifts.

To assess the quality of education, it is necessary to examine specific indicators such as student performance on standardized tests, graduation rates, college enrollment rates, and feedback from educators, parents, and students. These indicators can provide insights into the overall effectiveness of education systems.

However, it is worth noting that generalizing the quality of education across all regions or institutions is not accurate, as there can be significant variations within and between countries. It is essential to consider local and contextual factors when evaluating educational quality.

ChatGPT.

As an AI language model, I cannot have an opinion. However, reports suggest that the quality of education varies by location and institution, with some improving and others declining. Factors like funding, teaching methodology, and teacher quality can also impact the quality of education. Additionally, the increasing demands of technology and the workforce may require updates to curriculum and teaching methods to effectively prepare students for the future.

DeepAI Chat.

There is no definitive answer to this question, as the quality of education can vary greatly depending on the country, school, and individual student. However, there are some general trends that suggest that the quality of education may have declined in recent years.

One trend is that students are spending less time on academic subjects and more time on extracurricular activities. This is likely due to a number of factors, including the increasing pressure to succeed in college and the growing popularity of sports and other activities.

Another trend is that teachers are facing increasing workloads and demands. This can make it difficult for them to provide students with the individual attention they need.

Finally, the curriculum in many schools is becoming more standardized, which can lead to a lack of creativity and innovation.

All of these factors can contribute to a decline in the quality of education. However, it is important to note that there are also many schools and teachers who are working hard to provide students with a high-quality education.

Here are some of the reasons why the quality of education may have declined:

  • The increasing focus on standardized testing has led to a narrowing of the curriculum and a focus on test-taking skills over critical thinking and problem-solving skills.
  • The shortage of qualified teachers has made it difficult to attract and retain the best teachers, and those who are teaching are often overworked and underpaid.
  • The increasing cost of education has made it more difficult for families to afford to send their children to college, which has led to a decline in the number of students who are graduating from college.
  • The rise of technology has made it easier for students to access information outside of the classroom, but it has also made it more difficult for them to focus and learn.

Despite these challenges, there are also many reasons to be optimistic about the future of education. There is a growing movement to reform the education system and to focus on teaching students the skills they need to succeed in the 21st century. There are also many dedicated teachers who are working hard to make a difference in the lives of their students. With hard work and dedication, we can reverse the decline in the quality of education and ensure that all students have the opportunity to succeed.

Bard.

I was a bit surprised at the differences in the responses and completely unsurprised by the commonality. I wouldn’t expect them to know, since nobody knows; all these large language models (LLMs) do is draw on the information a particular model was given when prompted with a question. The reality is that we don’t know, and therefore an LLM wouldn’t know.

If anyone thinks that’s not an issue, I think they may have had an education system land on their head in a most grotesque fashion.

We’re being marketed “artificial intelligence”, machine learning, deep learning, neural networks, large language models and just about every cool phrase that causes nerdgasms.

When we consider the status of education systems for humans – human learning – we don’t even seem to have an idea of where we are. Further, students who once copied and pasted from the Internet for assignments now have large language models doing their homework – if they can access them.

Something doesn’t seem quite right about that. Certainly, the technological progress is great, but I’m not sure we’re making smarter humans. I’m also not sure we’re making wiser humans.

What’s there to do? There seem to be at least some people thinking about the topic of education and AI, but as a society, are we too busy paying the bills and chasing red dots to have time for the future?

Silent Bias

Once upon a time, as a Navy Corpsman at the former Naval Hospital in Orlando, we lost a patient for a period – we simply couldn’t find her. There was a search of the entire hospital. We eventually did find her, but not by brute force; it was by recognizing what she had come in for and guessing that she was on LSD. She was in a ladies’ room, staring into the mirror, studying herself through a sensory filter that she found mesmerizing. What she saw is something only she knows, but it’s safe to say it was a version of herself, distorted in a way only she would be able to explain.

I bring this up because as a species, many of us connected to our artificial nervous system are fiddling with ChatGPT, and what we are seeing are versions of our society in a mirror.

As readers, what we get out of it has a lot to do with what we bring to it. As we query it, we also get out of it what we ask of it, filtered through how it was trained and its algorithms – the reflexes we give it. Is it sentient? Of course not; these are just large language models, not artificial general intelligences.

With social media companies, we have seen the effect of the social media echo chambers as groups become more and more isolated despite being more and more connected, aggregating into audiences that are easier to sell advertising to. This is not to demonize them; many bloggers were doing it before them, and before bloggers there was the media, and before then as well. It might be amusing if we found out that cave paintings were actually advertising for someone’s spears or some hunting consulting service – or it might be depressing.

All of this cycled through my mind yesterday as I began considering the role of language itself, with its inherent bias, based on an article that stretched the idea to large language models and artificial intelligence. The actual study was just about English and showed a bias toward addition, but with ChatGPT and other large language models being the current advertising tropism, it’s easy to understand the intention of linking the two in an article.

Regardless of intention, there is a point as we stare into the societal mirror of large language models. Training data will vary, languages and cultures vary, and it’s not hard to imagine that every language, and every dialect, has some form of bias. It might be a good guess that where you see a lot of bureaucracy, there is linguistic bias, and that gets into a chicken-and-egg conversation: did the bias exist before the language, or did the language create the bias? Either way, language can reinforce it.

Then I came across a humorous meme about a fake hero dog. It ends up being a legitimate thing that happened: a dog was rewarded with a steak for saving the life of a child from drowning and quickly came to the conclusion that pulling children out of water got it steak.

Apparently not enough children were falling into water for it to earn steaks, so it helped things along. This happened in 1908, while Dr. Pavlov was still alive. His famous work on conditioned reflexes in dogs was published in 1897, about 11 years prior, but given how slowly news traveled then, it wasn’t the common knowledge that we with Internet access would expect. It’s possible the New York Times article mentioned him, but I didn’t feel like unlocking their paywall.

If we take this back to society, we have seen the tyranny of fake news propagation. That’s nothing new either. What is interesting is the paywall aspect, where credible news is hidden behind paywalls, leaving the majority of the planet to read what is available for free. This is a product of publishing’s adaptation to the Internet age, which I lived through and, to an extent, gained some insight into when I worked for Linux Journal’s parent company, SSC. The path from print to Internet remains a very questionable area because of how advertising differs between the two media.

Are large language models being trained on paywalled information as well? Do they have access to academic papers that are paywalled? What do they have access to?

What parts of ourselves are we seeing through these mirrors? We also have to ask whether the large language models have access to things that most humans don’t, and based on who is involved, it’s not hard to conclude that some of the data being fed to them by these companies isn’t available for the average person to consume. Whether that is true or not is up for debate.

All of this is important to consider as we deal with these large language models, yet the average person plays with them as a novelty, unaware of the biases. How much should we trust what comes out of them?

As far as disruptive technologies go, this is probably the largest we have seen since the Internet itself. As long as it gives people what they want and supports their cognitive biases, it’s less likely to be questioned. Completely false articles still propagate on the Internet, there are groups of people who seriously believe that the Earth is flat, and we have people asking ChatGPT things that they believe are important. I even saw someone in a Facebook reel quoting a GPT-4 answer.

We should at the least be concerned, but overall we aren’t. We’re too busy dealing with other things, chasing red dots.