KnowProse.com off WordPress.com, Now on Hostinger.

It’s been a while since I wrote something on the site – that was largely to do with not wanting my content scraped, and being on WordPress.com did not fill me with trust or confidence in what the company was doing.

Never mind the whole WordPress vs. WPEngine debacle, which I have not read much into because my life has sufficient drama and I do not wish to overflow with it. I did do some initial reading and quickly realized the whole thing seemed engineered.

Instead, I switched to Hostinger (referral link). It was fairly easy since I opted to continue using WordPress for the site after shopping around a bit, though I am working on a semi-personal project with Drupal 11 – and Hostinger’s love for that on the command line is as dated as its command-line PHP version. This relates to running Composer – the command-line PHP is 8.2.19, and running Composer 2 there for the project presently demands 8.3+, as Drupal 11 does… bleeding edge requires blood or it’s not bleeding edge, right?

The domain transfer took about the full 7 days, and I could speculate on why that is, but that would have no value.

The site is more plain, at least for now, and eventually there will likely be some advertising on it – but not in the way advertising has manifested itself on sites I visit. No, the site will not spam you to give you updates. No, the site will not have pop-ups that just annoy you. No, the site will not… well, you get the point.

I did consider Bluehost. Over a decade ago, I had a really bad experience with Bluehost, and this site still feels the pain of it – their automatic backups, at least then, did not really work on a daily basis. The site went down when I was at a CARDICIS conference – I forget which one – and by the time I had unfettered access to the site after returning home, a lot of the site was gone. Bluehost may have improved since then – I certainly hope they have – and even though it was likely an outlier event for me, I opted not to go with them.

This does not mean my experience should color yours, mind you. It would appear that they’re still in business, so they’re doing something right. At the time, I had a tendency to be bleeding edge with the sites that I write on, and that too may have bitten me in the posterior. We are, though, creatures that remember pain even beyond rationality.

So yes, KnowProse.com is back from hiatus.

What to do about scraping for LLM learning is the only real thing left.

WordPress.com, Tumblr to Sell Information For AI Training: What You Can Do.

I accidentally posted this on RealityFragments.com, but I think it’s important enough to leave it there. The audiences vary, but both have other bloggers on them.

While I was figuring out how to be human in 2024, I missed that Tumblr and WordPress posts will reportedly be used for OpenAI and Midjourney training.

This could be a big deal for people who take the trouble to write their own content rather than filling the web with Generative AI text to just spam out posts.

If you’re involved with WordPress.org, it doesn’t apply to you.

WordPress.com has an option to use Tumblr as well, so when you post to WordPress.com it automagically posts to Tumblr. Therefore you might have to visit both of the posts below and adjust your settings if you don’t want your content to be used in training models.

This doesn’t mean that they haven’t already sent information to Midjourney and OpenAI. We don’t really know, but from the moment you change your settings…

  • WordPress.com: How to opt out of the AI training is available here.

    It boils down to this part in your blog settings on WordPress.com:


  • With Tumblr.com, you should check out this post. Tumblr is trickier, and the link text is pretty small around the images – what you need to remember is that after you select your blog on the left sidebar, you need to use the ‘Blog Settings’ link on the right sidebar.

Hot Take.

When I was looking into all of this, it turned out that Automattic, the owner of WordPress.com and Tumblr.com, is doing the sale.

If you look at your settings, if you haven’t changed them yet, you’ll see that the default was set to allowing the use of content for training models. The average person who uses these sites to post their content is likely unaware, and in my opinion, if Automattic wanted to do this the right way, the default would be to have users opted out.

It’s unclear whether they already sent posts. I’m sure that there’s an army of lawyers who will point out that they did post about it in places and that the onus was on users to stay informed. It’s rare for me to use the word ‘shitty’ on KnowProse.com, but I think it’s probably the best way to describe how this happened.

It was shitty of them to set it up like this. See? It works.

Now some people may not care. They may not be paying users, or they just don’t care, and that’s fine. Personal data? Well, let’s hope that got scrubbed.

Some of us do. I don’t know how many, so I can’t say a lot or a few. Yet if Automattic, the parent company of both Tumblr and WordPress.com, is going to post that they care about user choices, it hardly seems appropriate that the default was not to have users opted out.

As a paying user of WordPress.com, I think it’s shitty to assume I would allow what I write, using my own brain, to be used for a training model that the company gets paid for. I don’t see any of that money. To add injury to that insult of my intelligence, Midjourney and ChatGPT also sell subscriptions to the trained AI, one of which (ChatGPT) I also pay for.

To make matters worse, we sort of have to take what’s in the training models on the word of those who use them. They don’t tell us what’s in them or where the content came from.

This is my opinion. It may not suit your needs, and if it doesn’t, have a pleasant day. But if you agree with this, go ahead, make sure your blog is not allowing third-party data sharing.

Personally, I’m unsurprised at how poorly this has been handled. Just follow some of the links early on in the post and revel in dismay.