SLMs vs LLMs: A Great Question.

@knowprose.com why run a LLM on your computer when a SLM does almost the same but needs less performance?

Stefan Stranger

This was a great question, and I want to flesh it out some more because Small Language Models (SLMs) are less processor and memory intensive than Large Language Models (LLMs).

Small isn’t always bad, large isn’t always good. My choice to go with a LLM instead of a SLM is pretty much what I told Stefan, but I wanted to respond more thoroughly.

There is a good write-up on the differences between LLMs and SLMs here that is recent at the time of this writing. It’s worth taking a look at if you’ve not heard of SLMs before.

Discovering My Use Cases.

I have a few use cases that I know of, one of which is writing – not having the LLM do the writing (what a terrible idea), but to help me figure things out as I’m going along. I’m in what some would call Discovery, or in the elder schools of software engineering, requirements gathering. The truth is that, like most humans, I’m stumbling along to find what I actually need help with.

For example, yesterday I was deciding on the name of a character. Names are important and can be symbolic, and I tend to dive down rabbit-holes of etymology. Trying to discern the meanings of names is an arduous process, made no better by the many websites that have baby names that aren’t always accurate or culturally inclusive. My own name has different meanings in India and in old Celtic, as an example, but if you search for the name you find more of a Western bias.

Before you know it, I have not picked a name but instead am perusing something on Wikipedia. It’s a time sink.

So I kicked it around with Teslai (I let the LLM I’m fiddling with pick its name for giggles). It was imperfect, and in some ways it was downright terrible, but it kept me on task and I came up with a name in less than 15 minutes, a task that could easily have eaten up a day of my time as I indulged my thirst for knowledge.

How often do I need to do that? Not very often, but so far, a LLM seems to be better at these tasks.

I’ve also been tossing it things I wrote for it to critique. It called me on not using an active voice on some things, and that’s a fair assessment – but it’s also gotten things wrong when reading some of what I wrote. As an example, when it initially read “Red Dots Of Life”, it got it completely wrong – it thought it was about how red dots were metaphors for what was important, when in fact, the red dots were about other people driving you to distraction to get what they thought was important.

Could a SLM do these things? Because they are relatively new and not trained on as many things, it’s unlikely but possible. The point is not the examples, but the way I’m exploring my own needs. In that regard – and this could be unfair to SLMs – I opted to go with more mature LLMs, at least right now, until I figure out what I need from a language model.

Maybe I will use SLMs in the future. Maybe I should be using one now. I don’t know. I’m fumbling through this because I have eclectic interests that cause eclectic needs. I don’t know what I will throw at it next, but being allegedly better trained has me rolling with LLMs for now.

So far, it seems to be working out.

In an odd way, I’m learning more about myself through the use of the language model as well. It’s not telling me anything special, but it provokes introspection. That has value. People spend so much time being told what they need by marketers that they don’t necessarily know what they could use the technology for – which is why Fabric motivated me to get into all of this.

Now, the funny thing is that the basis of LLMs and their owners’ need to add more information into them is not something I agree with. I do believe that better algorithms are needed so that they can learn with less information. I’ve been correcting a LLM as a human who has not been trained on as much information as it has, so there is a solid premise for tweaking algorithms rather than shoving more information in.

In that regard, we should be looking at SLMs more, and demanding more of them – but what do we actually need from them? The marketers will tell you what they want to sell you, and you can sing their song, or you can go explore on your own – as I am doing.

Can you do it with a SLM? Probably. I simply made the choice to use a LLM, and I believe it suits me – but that’s just an opinion, and I could be wrong and acknowledge it. Sometimes you just pick a direction and go and hope you’re going in the right general direction.

What’s right for you? I can’t tell you; that would be presumptuous. You need to explore your own needs and make as informed a decision as I have.

Installing Your Own LLM On Your Computer With Ollama.

As I wrote in the last post, there are some good reasons to install your own LLM on your computer. It’s all really simple using Ollama, which allows you to run various models of LLM on your computer.

A GPU is nice, but not required.

Apple and Linux users can simply go right over to Ollama and follow the instructions.

For Apple it’s a download, for Linux it’s simply copying and pasting a command line. Apple users who need help should skip to the section about loading models.

For Windows users, there’s a Windows version that’s a preview at the time of this writing. You can try that out if you want, or… you can just add Linux to your machine. It’s not going to break anything and it’s pretty quick.

“OMG Linux is Hard” – no, it isn’t.

For Windows 10 (version 2004 or higher), open a Windows Command Prompt or PowerShell with administrator rights – you do this by right-clicking the icon and selecting ‘Run as administrator’. Once it’s open, type:

wsl --install

Hit enter, obviously, and Windows will set up a distro of Linux for you on your machine that you can access in the future by just typing ‘WSL’ in the command prompt/PowerShell.

You will be prompted to enter a user name, as well as a password (twice to verify).

Remember the password; you’ll need it. It’s called a ‘sudo’ password, or just the password, but knowing ‘sudo’ will allow you to impress baristas everywhere.

Once it’s done, you can run it simply by entering “WSL” in a Command Prompt or PowerShell.
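If you want to double-check what was set up, this command will list your Linux distributions and whether they’re on WSL 1 or WSL 2:

wsl --list --verbose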

Congratulations! You’re a Linux user. You may now purchase stuffed penguins to decorate your office.

Installing Ollama on Linux or WSL.

At the time of this writing, you’re one command away from running Ollama: the install command shown on the Ollama download page for Linux.

Hit the clipboard icon next to it, paste it onto your command line, enter your password, and it will do its job. It may take a while, but it’s more communicative than a spinning circle: you can see how much it’s done.
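For reference, at the time of this writing the command looked something like the one below – but copy the current one from the Ollama site rather than from here, since it may change:

curl -fsSL https://ollama.com/install.sh | sh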

Windows users: if your GPU is not recognized, you may have to search for the right drivers to get it to work. Do a search for your GPU and ‘WSL’, and you should find out how to work around it.

Running Ollama.

To start off, assuming you haven’t closed that window1, you can simply type:

ollama run <insert model name here>

Where you can pick a model name from the library. Llama3 is at the top of the list, so as an example:

ollama run llama3

You’re in. You can save versions of your model, amongst other things, which is great if you’re doing your own fine-tuning.

If you get stuck, simply type ‘/?’ and follow the instructions.

Go forth and experiment with the models on your machine.

Just remember – it’s a model, not a person, and it will make mistakes. Correcting them is good, but it doesn’t help unless you save your changes. It’s a good idea to save your versions under your own names.
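As a sketch of what that can look like at the time of this writing – the exact commands may differ by version, so check ‘/?’ – you can save the current session as your own named model from inside the prompt:

/save my-llama3

and bring it back later from the shell with:

ollama run my-llama3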

I’m presently experimenting with different models and deciding which I’ll connect to the Fabric system eventually, so that post will take longer.

  1. If you did close the window on Windows, just open a new one with administrator privileges and type WSL – you’ll be in Linux again, and can continue. ↩︎

Why I Installed AIs (LLMs) On My Local Systems.

The last few days I’ve been doing some actual experimentation, initially begun because of Daniel Miessler’s Fabric, an open source framework for using artificial intelligence to augment us lowly humans instead of the self-lauding tech bros whose business model boils down to “move fast and break things”.

It’s hard to trust people with that sort of business model when you understand your life is potentially one of those things, and you like that particular thing.

I have generative AIs on all of my machines at home now, which was not as difficult as people might think. I’m writing this part up because, to impress upon someone how easy it was, I walked them through doing it in minutes over the phone on a Windows machine. I’ll write that up as my next post, since it apparently seems difficult to people.

For myself, the vision Daniel Miessler brought with his implementation, Fabric, is inspiring in its own way, though I’m not convinced that AI can make anyone a better human. I think the idea of augmenting is good, and with all the infoglut I contend with, leaning on a LLM makes sense in a world where everyone else is being sold on the idea of using one, and on how to use it.

People who wax poetic about how an AI has changed their lives in good ways are simply waxy poets, as far as I can tell.

For me, with writing and other things I do, there can be value here and there – but I want control. I also don’t want to have to risk my own ideas and thoughts by uploading even a hint of them to someone else’s system. As a software engineer, I have seen loads of data given to companies by users, and I know what can be done with it, and I have seen how flexible ethics can be when it comes to share prices.

Why Installing Your Own LLM is a Good Idea. (Pros)

There are various reasons why, if you’re going to use a LLM, it’s a good idea to have it locally.

(1) Data Privacy and Security: If you’re an individual or a business, you should look after your data and security because nobody else really does, and some profit from your data and lack of security.

(2) Control and Customization: You can fine-tune your LLM on your own data (without compromising your privacy and security). As an example, I can feed a LLM various things I’ve written and have it summarize where ideas I’ve written about connect – and even tell me if I have something published where my opinion has changed – without worrying about handing all of that information to someone else. I can tailor it myself – and that isn’t as hard as you think.

(3) Independence from subscription fees; lowered costs: The large companies will sell you as much as you can buy, and before you know it you’re stuck with subscriptions you don’t use. Also, since the technology market is full of companies that get bought out and license agreements changed, you avoid vendor lock-in.

(4) Operating offline; possible improved performance: With the LLM I’m working on, being unable to access the internet during an outage does not stop me from using it. What’s more, my prompts aren’t queued, or prioritized behind someone that pays more.

(5) Quick changes are quick changes: You can iterate faster, try something with your model, and if it doesn’t work, you can find out immediately. This is convenience, and cost-cutting.

(6) Integrate with other tools and systems: You can integrate your LLM with other stuff – as I intend to with Fabric. There’s a small sketch of what that can look like after this list.

(7) You’re not tied to one model. You can use different models with the same installation – and yes, there are lots of models.
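As a small sketch of item (6): a default Ollama install listens on a local port (11434 at the time of this writing), so any script or tool on your machine can ask a model questions over HTTP. The model name and prompt below are just placeholders, so adjust them to whatever you actually run:

curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Where do my notes on character names and symbolism connect?", "stream": false}'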

The Cons of Using a LLM Locally.

(1) You don’t get to hear someone that sounds like Scarlett Johansson tell you about the picture you uploaded1.

(2) You’re responsible for the processing, memory and storage requirements of your LLM. This is surprisingly not as bad as you would think, but remember – backup, backup, backup. There’s a small sketch of that after this list, too.

(3) If you plan to deploy a LLM as a business model, it can get very complicated very quickly. In fact, I don’t know all the details, but that’s nowhere in my long term plans.
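A small sketch of the backup point from item (2), assuming a stock install: ‘ollama list’ shows which models you have, and the model files live in a .ollama data directory (in your home folder on macOS; a typical Linux service install keeps them under its own account – check where yours are). Backing up can then be as simple as archiving that directory:

ollama list
tar czf ollama-models-backup.tar.gz ~/.ollama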

Deciding.

In my next post, I’ll write up how to easily install a LLM. I have one on my M1 Mac Mini, my Linux desktop and my Windows laptop. It’s amazingly easy, but going in it can seem very complicated.

What I would suggest about deciding is simply trying it and seeing how it works for you, or simply knowing that it’s possible and that it will only get easier.

Oh, that quote by Diogenes at the top? No one seems to have a source. Nice thought, though a possible human hallucination.

  1. OK, that was a cheap shot, but I had to get it out of my system. ↩︎

Wikipedia, and Its Trouble with LLMs.

Wikipedia, a wonderful resource despite all the drama that comes with the accumulation of content, is having some trouble dealing with the large language model (LLM) AIs out there. There are two core problems – the input, and the output.

“…The current draft policy notes that anyone unfamiliar with the risks of large language models should avoid using them to create Wikipedia content, because it can open the Wikimedia Foundation up to libel suits and copyright violations—both of which the nonprofit gets protections from but the Wikipedia volunteers do not. These large language models also contain implicit biases, which often result in content skewed against marginalized and underrepresented groups of people

The community is also divided on whether large language models should be allowed to train on Wikipedia content. While open access is a cornerstone of Wikipedia’s design principles, some worry the unrestricted scraping of internet data allows AI companies like OpenAI to exploit the open web to create closed commercial datasets for their models. This is especially a problem if the Wikipedia content itself is AI-generated, creating a feedback loop of potentially biased information, if left unchecked…” 

“AI Is Tearing Wikipedia Apart”, Claire Woodcock, Vice.com, May 2nd, 2023.

The Input into Wikipedia.

Inheriting the legal troubles of companies that built AI models by taking shortcuts seems like a pretty stupid thing to do, but there are companies and individuals doing it. Fortunately, the Wikimedia Foundation is a bit more responsible, and is more sensitive to biases.

Using a LLM to generate content for Wikipedia is simply a bad idea. There are some tools out there (I wrote about Perplexity.ai recently) that do the legwork for citations, but with Wikipedia, not all citations are necessarily on the Internet. Some are in books, those dusty tomes where we have passed down knowledge over the centuries, and so it takes humans not just to find those citations, but to assess them and ensure that citations of other perspectives are involved1.

As they mention in the article, first drafts are not a bad idea, but they’re also not a great idea. If you’re not invested enough in a topic to do the actual reading, should you really be editing a community encyclopedia? I don’t think so. Research is an important part of any accumulation of knowledge, and LLMs aren’t even good shortcuts, probably because the companies behind them took shortcuts.

The Output of Wikipedia.

I’m a little shocked that Wikipedia might not have been scraped by the companies that own LLMs, considering just how much they scraped and from whom. Wikipedia, to me, would have been one of the first things to scrape to build a learning model, as would Project Gutenberg. Now that they’ve had the leash yanked, maybe they’re asking for permission, but it seems peculiar that they would not have scraped that content in the first place.

Yet, unlike companies that simply cash in on the work of volunteers, like Huffington Post, StackOverflow, and so on, Wikimedia has a higher calling – and cashing in on volunteer work would likely mean fewer volunteers. Any volunteer does so for their own reasons, but in an organization they collectively work toward something. The Creative Commons license Wikipedia uses requires attribution, and LLMs don’t attribute anything. I can’t even get ChatGPT to tell me how many books it’s ‘read’.

What makes this simple is that if all the volunteer work from Wikipedia is shoved into the intake manifold of a LLM, and that LLM is subscription-based, so that volunteers would have to pay to use it, it’s a non-starter.

We All Like The Idea of an AI.

Generally speaking, the idea of an AI being useful for so many things is seductive, from Star Trek to Star Wars. I wouldn’t mind an Astromech droid, but where science fiction meets reality, we are stuck with the informational economy and infrastructure we have inherited over the centuries. Certainly, it needs to be adapted, but there are practical things that need to be considered outside of the bubbles that a few billionaires seem to live in.

Taking the works of volunteers and works from the public domain2 to turn around and sell them sounds Disney in nature, yet Mickey Mouse’s fingerprints on the Copyright Act have helped push back legally on the claims of copyright. Somewhere, there is a very confused mouse.

  1. Honestly, I’d love a job like that, buried in books. ↩︎
  2. Disney started off by taking public domain works and copyrighting their renditions of them, which was fine, but then they made sure no one else could do it – thus the ‘fingerprints’. ↩︎

AI, Confirmation Bias and Our Own Insanity.

In unsurprising news, if you feed artificial intelligences the output of artificial intelligences, they become a bit insane. I’d covered that before in Synthetic Recursion, which seemed pretty intuitive even before I wrote that, but scientists at Rice and Stanford wrote a paper: “Self-Consuming Generative Models Go MAD”.

So, we can say that’s been verified.

What’s even worse is that, apparently, Taylor Swift, Selena Gomez and Kim Kardashian have been ‘saying’ things that they did not say – organized disinformation that has appeared all over, and in vacuuming up copyrighted content, OpenAI’s ChatGPT might get infected by it. It’s not just artificial intelligences; output from people willfully misleading others can easily make it in.

Fortunately, I verified with ChatGPT 4 and it got it right by… using Bing. I don’t use Bing. Why does ChatGPT 4? For the same reason you can’t have a Coke with your Kentucky Fried Chicken.

While this time it has been caught – it started in November 2023 – it demonstrates how inaccuracies can crop up, how biases can be pushed, and how many problems we still have with misinformation without involving artificial intelligence. Every time we get anything on social media these days we have to fact check, and then we immediately get blowback about fact checking being flawed.

Why? It fits their confirmation biases. Given the way large language models are trained, we can say that we’re getting a lot right, and yet we’re also collectively under the delusion that what humanity has collected is right. What is true is that what we believe we know just hasn’t been proven wrong yet, with the threshold for that varying from person to person.

With science, there’s a verification process, but science has been under fire increasingly because of who pays for the papers to be written and published. Academia has to be funded, and we don’t fund it as much as we should, so others sometimes do, to their own ends. That’s why it’s important to read the papers, but not everyone has the time to do that. There is good science happening, and I’d like to think more of it is good than bad.

With AI tools, I imagine more papers will be written more quickly, which creates a larger problem. Maybe even an exponentially larger problem.

We accept a lot, and since we don’t know what’s in learning models, we don’t know what has been verified until we find things that aren’t. This means we need to be skeptical, just like when we use Wikipedia. There are some people who don’t like doing that footwork because what they see fits their confirmation biases.

Should we be surprised that our tools would have them too based on what we feed them?

It’s almost as if we need to make sure we’re feeding these learning models with things of value. That should come at a cost, because when we write, when we express ourselves in any way, it’s based largely on experience, sometimes hard won.

Meanwhile, artificial intelligence tools are being created to write summaries of books that authors took years to write. Amazon is being flooded with them, apparently, and if I see another advertisement about microlearning on Facebook that seems to use these sorts of précis notes, I might throw up through my monitor onto someone else’s keyboard.

A Tale of Two AIs.

2023 has been the year when artificial intelligences went from science fiction to technological possibility. It’s become so ubiquitous that on Christmas Eve, chatting with acquaintances and friends, people from all walks of life were talking about it.

I found it disappointing, honestly, because it was pretty clear I was talking about one sort of artificial intelligence where others were talking about another sort of artificial intelligence.

One, a lawyer, mentioned that she’d had lunch with an artificial intelligence expert. On listening and with a few questions, she was talking about what sounded to be a power user of ChatGPT. When I started talking about some of the things I write about here related to artificial intelligence, she said that they had not discussed all of that. Apparently I went a bit too far because she then asked, “But do you use the latest version of ChatGPT that you have to pay for like this expert does?”

Well, yes, I do. I don’t use it to write articles, and if I do use ChatGPT to write something, I quote it. I have my own illusions; I don’t need to take credit for any hallucinations ChatGPT has. I also don’t want to incorporate strategic deception into my writing. To me, it’s a novelty and something I often find flaws with. I’m not going to beat up on ChatGPT – it has its uses, and the fact that I can use DALL-E to generate some images, like above, is helpful.

What disturbed me is that she thought that was what an artificial intelligence expert does. That seems a pretty low bar; I wouldn’t claim to be an artificial intelligence expert because I spend $20/month. I’m exploring it like many others and stepping back to look at problematic consequences, of which there are many. If we don’t acknowledge and deal with those, the rest doesn’t seem to matter as much.

That’s the trouble. Artificial intelligence, when discussed or written about, falls into two main categories that co-exist.

Marketed AI.

The most prominent one is the marketing hype right now, where we get ‘experts’ who, for whatever reason, are claiming a title for being power users of what are, so far, stabs at artificial intelligence. This is what I believe Cory Doctorow wrote about with respect to the ‘AI bubble’. It’s more about perception than reality, in my mind, and in some ways it can be good because it gets people to spend money so that hopefully those that collect it can do something more about the second category.

Yet it wasn’t long ago that people were selling snake oil. In the last decades, I’ve seen ‘website experts’ become ‘social media experts’, and now suddenly we have ‘artificial intelligence experts’.

Actual Artificial Intelligence.

The second category is actually artificial intelligence itself, which I believe we may be getting closer to. It’s where expert systems, which have been around since the 1970s, have made some quantum leaps. When I look at ChatGPT, as an example, I see an inference engine (the code) and the knowledge base which is processed from a learning model. That’s oversimplified, I know, and one can get into semantic arguments, but conceptually it’s pretty close to reality.

If you take a large language model like ChatGPT and feed it only medical information, it can diagnose based on symptoms a patient has. Feed it only information on a programming language like COBOL, it can probably write COBOL code pretty well. ChatGPT has a learning model that we don’t really know, and it is apparently pretty diverse, which allows us to do a lot of pretty interesting things besides generating silly images on blog posts. I’ve seen some code in JavaScript done this way, and I just generated some C++ code as a quick test with ChatGPT 4 that, yes, works and it does something better than most programmers do: it documents how it works.

I’d written about software engineers needing to evolve too with respect to artificial intelligence.

It has potential to revolutionize everything, all walks of life, and it’s going to be really messy because it will change jobs and even replace them. It will be something that will have psychological and sociological consequences, impacting governments and the ways we do… everything.

The Mix of Marketed vs. Actual.

The argument could be made that without marketing, businesses would not make enough money for the continued expense of pushing the boundaries of artificial intelligence. Personally, I think this is true. The trouble is that marketing takes over what people believe artificial intelligence is. This goes with what Doctorow wrote about the bubble as well as what Joe McKendrick wrote about artificial intelligence fading into the background. When the phrase is over-used and misused in businesses, which seems to already be happening, the novelty wears off and the bubble pops in business.

That’s kind of what happened with social media and ‘social media experts’.

The marketing aspect also causes people to worry about their own jobs, which maybe they don’t want, but they want the income because there are bills to pay in modern society. The fear of some is tangible, and with good reason. All the large language models use a very broad brush in answering those fears, as do the CEOs of the companies: we’ll just retrain everyone. There are people getting closer to retirement, and what companies have been doing to save money and improve their stock performance is finding reasons to ‘let people go’, so that comfort is spoken from on high with the same sensitivity as “Let them eat cake”. It’s dismissive and ignores the reality people live in.

Finding the right balance is hard when there’s no control of the environment. People are talking about what bubbles leave behind, but they don’t talk as much about who they leave behind. Harvard Business Review predicted that the companies that get rid of jobs with artificial intelligence will eventually get left behind, but eventually can be a long time and can have some unpredictable economic consequences.

‘Eventually’ can be a long time.

The balance must be struck by the technology leaders in artificial intelligence, and that seems to be about as unlikely as it was with the dot-com boom. Maybe ChatGPT 4 can help them out if they haven’t been feeding it enough of their own claims.

And no, you aren’t an ‘artificial intelligence expert’ just because you’re a paying user of some artificial intelligence platform, just like buying a subscription to a medical journal doesn’t make you a medical professional.

AI, Ethics, Us.

Most of us live in a lot of different worlds, and we see things differently because of it. Some of us live in more than one world at a time. That’s why sometimes it’s hard for me to consider the promise of artificial intelligence, what we’re getting, and the direction it’s going.

There’s space in this world in research for what we have now, which allows previously isolated knowledge to be regurgitated in a feat of math that makes the digital calculator look mundane. It’s statistics, it gives us what we want when we hit the ‘Enter’ button, and that’s not too bad.

Except it can replace an actual mind. Previously, if you read something, you didn’t guess whether a machine threw the words together or not. You didn’t wonder if the teacher gave you a test generated by a large language model, and the teacher didn’t wonder whether you generated your answers the same way.

Now, we wonder. We wonder if we see an image. We wonder if we watch a video. We wonder enough so that the most popular female name for 2023 should be Alice.

So let me tell you where I think we should be heading with AI at this time.

What Could Be.

Everyone who is paying attention to what’s happening can see that the world is fairly volatile right now after the global pandemic, after a lot of economic issues that banks created combined with algorithmic trading… so this is the perfect time to drop some large language models in the world to make things better.

Nope.

No, it isn’t working that way. If we were focused on making the world better rather than worrying about using a good prompt for that term paper or blog post, it maybe could work that way. We could use things like ChatGPT to be consultants, but across mankind we lack the integrity to only use them as consultants.

“If anyone takes an AI system and starts designing speeches or messages, they generate the narrative that people want to hear. And the worst thing is that you don’t know that you are putting the noose around your own neck, all by yourself.” The academic added that the way to counter this situation is education.

The only way to avoid manipulation is through knowledge. Without this, without information, without education, any human group is vulnerable, he concluded.1

“IA: implicaciones éticas más allá de una herramienta tecnológica”, Miguel Ángel Pérez Álvarez, Wired.com (Spanish), 29 Nov 2023.

There’s the problem. Education needs to adapt to artificial intelligence as well, because this argument, which at heart I believe to be true, does not escape its own recursion: people don’t know when it’s ethically right to use it, or even that there should be ethics involved.

As it happens, I’m pretty sure Miguel Ángel Pérez Álvarez already understands this and simply had his thoughts truncated, as happens in articles. He’s also got me wondering how different languages are handled by these Large Language Models and how different their training models are.

It’s like finding someone using an image you created and telling them, “Hey, you’re using my stuff!” and they say, “But it was on the Internet”. Never mind the people who believe that the Earth is flat, or who think that vaccinations give you better mobile connections.

AI doesn’t bother me. It’s people, it’s habits, and in a few decades they’ll put a bandaid on it and call it progress. The trouble is we have a stack of bandaids on top of each other at this point and we really need to look at this beyond the pulpits of some billionaires who enjoy more free speech than anyone else.

  1. actual quote: “Si cualquier persona toma un sistema de IA y se pone a diseñar discursos o mensajes, te generan la narrativa que la gente quiere escuchar. Y lo peor es que tú no sabes que te estás poniendo la soga al cuello solito”. El académico añadió que la manera de contrarrestar esta situación es la educación.

    “La única manera de evitar la manipulación es a través del conocimiento. Sin este, sin información, sin educación, cualquier grupo humano es vulnerable”, finalizó. ↩︎

Strategic Deception, AI, and Investors.

‘Strategic deception’ in large language models is indeed a thing. It should be unsurprising. After all, people do it all the time when trying to give the answer that is wanted by the person asking the question.

Large Language Models are designed to… give the answer wanted by the person asking the question.

That there had to be a report on this is a little disturbing. It’s the nature of the Large Language Model algorithms.

Strategic deception is at the very least one form of AI hallucination, which potentially reinforces biases that we might want to think twice about. Like Arthur Juliani, I believe the term ‘hallucinate’ is misleading, and I believe we’re seeing a shift away from that. Good.

It’s also something I simply summarize as ‘bullshitting’. It is, after all, just statistics, but it’s statistics toward an end, which makes the statistics pliable enough for strategic deception.

It’s sort of like AI investors claiming ‘Fair Use’ when not paying for copyrighted materials in the large language models. If they truly believe that, it’s a strategic deception on themselves. If they wanted to find a way, they could, and they still may.

NYT Says No To Bots.

The content for training large language models and other AIs has been something I have written about before, including being able to opt out of being crawled by AI bots. The New York Times has updated its Terms and Conditions to disallow that – which I’ll get back to in a moment.

It’s an imperfect solution for so many reasons, and as I wrote before when writing about opting out of AI bots, it seems backwards.

In my opinion, they should allow people to opt in rather than this nonsense of having to go through motions to protect one’s content from being used as a part of a training model.
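For what it’s worth, the “going through motions” largely comes down to a robots.txt entry at the time of this writing. As an example, OpenAI documented a GPTBot crawler that a site can disallow like this – a sketch, and it only helps against crawlers that choose to honor it:

User-agent: GPTBot
Disallow: /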

Back to the New York Times.

…The New York Times updated its terms of services Aug. 3 to forbid the scraping of its content to train a machine learning or AI system.

The content includes but is not limited to text, photographs, images, illustrations, designs, audio clips, video clips, “look and feel” and metadata, including the party credited as the provider of such content.

The updated TOS also prohibits website crawlers, which let pages get indexed for search results, from using content to train LLMs or AI systems…

“The New York Times Updates Terms of Service to Prevent AI Scraping Its Content”, Trishla Ostwal, Adweek.com, August 10th, 2023.

This article was then referenced by The Verge, which added a little more value.

…The move could be in response to a recent update to Google’s privacy policy that discloses the search giant may collect public data from the web to train its various AI services, such as Bard or Cloud AI. Many large language models powering popular AI services like OpenAI’s ChatGPT are trained on vast datasets that could contain copyrighted or otherwise protected materials scraped from the web without the original creator’s permission…

“The New York Times prohibits using its content to train AI models”, Jess Weatherbed, TheVerge.com, August 14th, 2023.

That’s pretty interesting considering that Google and the New York Times updated their agreement on News and Innovation on February 6th, 2023.

This all falls into a greater context where many media organizations called for rules protecting copyright in data used to train generative AI models in a letter you can see here.

Where does that leave us little folk? Strategically, bloggers have been a thorn in the side of the media for a few decades, driving down costs for sometimes pretty good content. Blogging is the grey area of the media, and no one really seems to want to tackle that.

I should ask WordPress.com what their stance is. People on Medium and Substack should also ask for a stance on that.

Speaking for myself – if you want to use my content for your training model so that you can charge money for a service, hit me in the wallet – or hit the road.

Coding and AI: Programmer Evolution.

Non-technical people may not realize that every bit of software they use, each ‘app’, has its own life cycle. It’s called the software development life cycle, and we call it that because the focus is not on the software but on the business model that supports it.

There are some that argue that Agile programming has made the software development life cycle antiquated or even obsolete, but I believe they have that wrong, because Agile programming is just a different approach to the software development life cycle.

Software is developed based on need, or expected need. It is planned based on requirements given by someone – generally the person who is funding the project – and developed and eventually maintained. Most software developers spend their time maintaining existing projects rather than creating that ‘killer app’ that just about every developer wishes to do.

Eventually, the software becomes obsolete for some reason, and generally speaking it means it’s making less money than it costs to support it. Free Software and Open Source software defy obsolescence because it’s generally about interest.

Artificial intelligence isn’t going to end life cycles, or programming. In fact, it’s just changing the face of programming because when you write a prompt, you’re really… programming. Programming has evolved from the lower level languages where we flipped ones and zeroes to increasingly higher level languages so that more people could program. Software development has constantly evolved and artificial intelligence is not that different.

What is kind of interesting is potentially being able to throw away some of these high level programming languages and replace them with lower level programming languages (that tend to be more efficient for a computing device to run) and just have a large language model write the code.

Regardless, people who write code will need to evolve. When I started out decades ago, a single person could write a game or application, then the needs became more complex and software development became increasingly social and multidisciplinary – and even specialized. Coders simply have to adapt again to writing better prompts – which also means better communication skills with those who want the code in the first place, as flawed as their requirements generally are.

Even as people write about the artificial intelligence life cycles, the core concepts aren’t different. In fact, for someone who has experience with software processes (not just one life cycle), it looks pretty much the same.