Those of us who create anything – at least without the crutch of a large language model like ChatGPT – are a bit concerned about our work being used to train large language models. We get no attribution and no pay, and the companies that run the models can basically just grab our work, train their models, and then turn around and charge customers for access to responses that our work helped create.
No single one of us is likely that important. But combined, it’s a bit of a rip-off. One friend suggested blocking the bots, which is close to impossible in practice because it depends on the bots obeying what is in the robots.txt file. There’s no real reason that they have to.
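To make the robots.txt point concrete, here is a minimal sketch using Python’s standard `urllib.robotparser`. The rules shown, the `GPTBot` and `CCBot` user-agent names, the example.com URL and the `is_allowed` helper are all my own illustrative assumptions, not something taken from any particular site or guide.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking two AI-related crawlers to stay away
# while leaving the site open to everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some-post"  # placeholder URL
    for agent in ("GPTBot", "CCBot", "Mozilla/5.0"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

Note who runs this check: the crawler, not the website. A well-behaved bot asks the question and respects the answer. A bot that never asks, or that simply announces itself under a different name, is not stopped by anything in the file.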
“How to Block AI Chatbots From Scraping Your Website’s Content” is a worthwhile guide to attempting to block the bots. It also makes the point that maybe it doesn’t matter.
I think that it does, at least in principle, because I’m of the firm opinion that websites should not have to opt out of being used by these AI bots – but rather, that websites should opt in as they wish. Nobody’s asked for anything, have they? Why should these companies use your work, or my work, without recompense and then turn around and charge for access to what they built from it?
Somehow, we got stuck with ‘opting out’ when what the companies running the AI bots should have done is let people opt in, with a revenue model attached.
TANSTAAFL. Except if you’re a large tech company, apparently.
On the flip side, Zoom says that they’re not using data from users to train their models. Taken at face value, that’s great, but the real problem is that we wouldn’t know if they did.