From Inputs to The Big Picture: An AI Roundup

This started off as a baseline post about generative artificial intelligence and its aspects, and it grew fairly long because information kept coming out even as I was writing it. It’s my intention to do a ’roundup’ like this highlighting different focuses as needed. Every bit of it is connected, but in social media postings things tend to be written about in silos. I’m attempting to integrate them, since the larger implications are hidden in these details, and I will try to stay on top of it as things progress.

It’s long enough that it could have been several posts, but I wanted it all together at least once.

No AI was used in the writing, though some images have been generated by AI.

The two versions of artificial intelligence on the table right now – the marketed and the reality – have various problems that make it seem like we’re wrestling a mating orgy of cephalopods.

The marketing aspect is a constant distraction, feeding us what helps stock prices and goodwill toward those implementing the generative AIs, while the reality of these generative AIs is not being addressed in a cohesive way.

To simplify things, this post breaks it down into the Input, the Output, and the impacts on the ecosystem the generative AIs work in.

The Input.

There’s a lot that goes into these systems other than money and water. There’s the information used for the learning models, the hardware needed, and the algorithms used.

The Training Data.

The focus so far has been on what goes into their training data, and that has been an issue including lawsuits, and less obviously, trust of the involved companies.

…The race to lead A.I. has become a desperate hunt for the digital data needed to advance the technology. To obtain that data, tech companies including OpenAI, Google and Meta have cut corners, ignored corporate policies and debated bending the law, according to an examination by The New York Times…

“How Tech Giants Cut Corners to Harvest Data for A.I.“, Cade Metz, Cecilia Kang, Sheera Frenkel, Stuart A. Thompson and Nico Grant, New York Times, April 6, 2024 1

Of note, too, is that Google has been indexing AI generated books, which is what is called ‘synthetic data’. Training on such data has been warned against, but it’s something companies are planning for or even doing already, consciously or unconsciously.

While some of these actions are of questionable legality, to many they are not of questionable ethics at all, hence the revolt mentioned last year against AI companies using content without permission. That revolt is of questionable effect, because no one seems to have insight into what the training data consists of, and no one seems to be auditing it.

There’s a need for that audit, if only to allow for trust.

…Industry and audit leaders must break from the pack and embrace the emerging skills needed for AI oversight. Those that fail to address AI’s cascading advancements, flaws, and complexities of design will likely find their organizations facing legal, regulatory, and investor scrutiny for a failure to anticipate and address advanced data-driven controls and guidelines.

“Auditing AI: The emerging battlefield of transparency and assessment“, Mark Dangelo, Thomson Reuters, 25 Oct 2023.

While everyone is hunting down data, no one seems to be seriously working on oversight and audits, at least not in a public way, though the United States is pushing for global regulations on artificial intelligence at the UN. There has been no apparent update on the status of that effort, even as artificial intelligence is being used to select targets in at least two wars right now (Ukraine and Gaza).

There’s an imbalance here that needs to be addressed. It would be sensible to have external auditing of the learning data models and their sources, as well as the algorithms involved – and, to get a little ahead, of the output too. Of course, these sorts of things should be done with trading on stock markets as well, though that doesn’t seem to have made much headway in all the time it has been happening either.

Some websites are trying to block AI crawlers, which is an ongoing process. Blocking them requires knowing who they are, and it doesn’t guarantee that bad actors won’t stop by anyway.
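For those curious what that blocking looks like in practice: it usually means a robots.txt file listing the user-agent strings the crawler operators publish – OpenAI documents GPTBot, Common Crawl documents CCBot, and Google documents the Google-Extended token. A minimal sketch, which only deters crawlers that voluntarily honor robots.txt:

```
# robots.txt – ask known AI training crawlers to stay out.
# Compliance is voluntary; unknown or bad actors simply ignore this file.
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```

The catch is exactly the one above: the list only covers crawlers you know about, and it has to be maintained by hand as new ones appear.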

There is a new bill being pressed in the United States, the Generative AI Copyright Disclosure Act, that is worth keeping an eye on:

“…The California Democratic congressman Adam Schiff introduced the bill, the Generative AI Copyright Disclosure Act, which would require that AI companies submit any copyrighted works in their training datasets to the Register of Copyrights before releasing new generative AI systems, which create text, images, music or video in response to users’ prompts. The bill would need companies to file such documents at least 30 days before publicly debuting their AI tools, or face a financial penalty. Such datasets encompass billions of lines of text and images or millions of hours of music and movies…”

“New bill would force AI companies to reveal use of copyrighted art“, Nick Robins-Early, TheGuardian.com, April 9th, 2024.

Given how much information is used by these companies already from Web 2.0 forward, through social media websites such as Facebook and Instagram (Meta), Twitter, and even search engines and advertising tracking, it’s pretty obvious that this would be in the training data as well.

The Algorithms.

The algorithms behind generative AI are pretty much trade secrets at this point, but one has to wonder why so much data is needed to feed the training models when better algorithms could require less. Consider that a well-read person can answer some questions, even as a layperson, with a much smaller carbon footprint. We have no insight into the algorithms either, which makes it seem as though these companies are simply throwing more hardware and data at the problem rather than being more efficient with the data and hardware they have already taken.

There’s not much news about that, and it’s unlikely that we’ll see any. It does seem like fuzzy logic is playing a role, but it’s difficult to say to what extent, and given the nature of fuzzy logic, it’s hard to say whether its implementation is as good as it should be.

The Hardware.

Generative AI has brought about an AI chip race between Microsoft, Meta, Google, and Nvidia, which leaves smaller companies that can’t afford to compete in that arena at a disadvantage so great that catching up could be seen as impossible, at least at present.

The future holds quantum computing, which could make all of the present efforts obsolete, but no one seems interested in waiting around for that to happen. Instead, it’s full speed ahead with NVIDIA presently dominating the market for hardware for these AI companies.

The Output.

One of the larger topics that seems to have faded is what some called ‘hallucinations’ by generative AI. Strategic deception was also very prominent for a short period.

There is criticism that the algorithms are making the spread of false information faster, and the US Department of Justice is stepping up efforts to go after the misuse of generative AI. This is dangerous ground, since algorithms are being sent out to hunt the products of other algorithms, and the crossfire between them doesn’t care much about civilians.2

As students use generative AI, education itself has been disrupted. The disruption is being portrayed as an overall good, which may simply be an acceptance that it’s not going away. It’s interesting to consider that the AI companies have taken more content than students could possibly access or afford within the educational system, which is something worth exploring.

Given that ChatGPT is presently 82% more persuasive than humans, likely because it has been trained on persuasive works (Input; Training Data), and since most content on the internet is marketing either products, services or ideas, that was predictable. While it’s hard to say how much content being put into training data feeds on our confirmation biases, it’s fair to say that at least some of it is. Then there are the other biases that the training data inherits through omission or selective writing of history.

There are a lot of problems, clearly, and much of it can be traced back to the training data, which even on a good day is as imperfect as we are. It can magnify and distort our flaws, and it can even be consciously influenced by good or bad actors.

And that’s what leads us to the Big Picture.

The Big Picture

…For the past year, a political fight has been raging around the world, mostly in the shadows, over how — and whether — to control AI. This new digital Great Game is a long way from over. Whoever wins will cement their dominance over Western rules for an era-defining technology. Once these rules are set, they will be almost impossible to rewrite…

“Inside the shadowy global battle to tame the world’s most dangerous technology“, Mark Scott, Gian Volpicelli, Mohar Chatterjee, Vincent Manancourt, Clothilde Goujard and Brendan Bordelon, Politico.com, March 26th, 2024

What most people don’t realize is that the ‘game’ includes social media and the information it provides for training models, such as what is happening with TikTok in the United States now. There is a deeper battle, and just perusing content on social networks gives data to those building training models. Even WordPress.com, where this site is presently hosted, is selling data, though there is a way to unvolunteer oneself.

Even the Fediverse is open to data being pulled for training models.

All of this, combined with a persuasiveness of generative AI that has given psychology pause, has democracies concerned about its influence. In a recent example, Grok, Twitter X’s AI for paid subscribers, fell victim to what was clearly satire and caused a panic – which should also have us wondering about how we view intelligence.

…The headline available to Grok subscribers on Monday read, “Sun’s Odd Behavior: Experts Baffled.” And it went on to explain that the sun had been, “behaving unusually, sparking widespread concern and confusion among the general public.”…

“Elon Musk’s Grok Creates Bizarre Fake News About the Solar Eclipse Thanks to Jokes on X“, Matt Novak, Gizmodo, 8 April 2024

Of course, some levity is involved in that one, whereas Grok posting that Iran had struck Tel Aviv (Israel) with missiles seems dangerous, particularly when posted to the front page of Twitter X. It shows the dangers of fake news with AI, deepens concerns related to social media and AI, and should make us ask why billionaires involved in artificial intelligence wield the influence that they do. How much of that influence is generated? We have an idea of how much it is lobbied for.

Meanwhile, Facebook has been spamming users and has been restricting accounts without demonstrating a cause. If there were a video tape in a Blockbuster on this, it would be titled, “Algorithms Gone Wild!”.

Journalism is also impacted by AI, though real journalists tend to be rigorous in their sources. Real newsrooms have rules, and while we don’t have that much insight into how AI is being used in newsrooms, it stands to reason that if a newsroom is to be a trusted source, they will go out of their way to make sure that they are: They have a vested interest in getting things right. This has not stopped some websites parading as trusted sources from disseminating untrustworthy information. Even in Web 2.0, when the world had an opportunity to discuss such things at the World Summit on the Information Society, the country with the largest web presence did not participate much, if at all, at a government level.

Then we have the thing that concerns the most people: their lives. Jon Stewart even did a Daily Show on it, which is worth watching, because people are worried, with good reason, about generative AI taking their jobs. Even as the Davids of AI3 square off for market share, layoffs have been happening in tech as companies reposition for AI.

Meanwhile, AI is also apparently being used as a cover for some outsourcing:

Your automated cashier isn’t an AI, just someone in India. Amazon made headlines this week for rolling back its “Just Walk Out” checkout system, where customers could simply grab their in-store purchases and leave while a “generative AI” tallied up their receipt. As reported by The Information, however, the system wasn’t as automated as it seemed. Amazon merely relied on Indian workers reviewing store surveillance camera footage to produce an itemized list of purchases. Instead of saving money on cashiers or training better systems, costs escalated and the promise of a fully technical solution was even further away…

“Don’t Be Fooled: Much “AI” is Just Outsourcing, Redux“, Janet Vertesi, TechPolicy.com, Apr 4, 2024

Maybe AI is creating jobs in India by proxy. It’s easy to blame problems on AI, too, which is a larger problem because the world often looks for something to blame and having an automated scapegoat certainly muddies the waters.

And the waters of The Big Picture of AI are muddied indeed – perhaps partly by design. After all, those involved are making money, and they now have even better tools to influence markets, populations, and you.

In a world that seems to be running a deficit when it comes to trust, the tools we’re creating seem to be increasing rather than decreasing that deficit at an exponential pace.

  1. The full article at the New York Times is worth expending one of your free articles, if you’re not a subscriber. It gets into a lot of specifics, and is really a treasure chest of a snapshot of what companies such as Google, Meta and OpenAI have been up to and have released as plans so far. ↩︎
  2. That’s not just a metaphor, as the Israeli use of Lavender (AI) has been outed recently. ↩︎
  3. Not the Goliaths. David was the one with newer technology: The sling. ↩︎

So. Many. Layoffs.

I’ve been looking at getting back into the ring of software engineering, but it doesn’t seem like a great time to do it.

When Google was laying off workers, I shook my head a bit. It turns out that Google spent 800 million dollars on layoffs just this month. Just this month!

By comparison, Google spent $2.1 billion on layoff expenses for more than 12,000 employees over the course of 2023. Other Google employees only knew people had been dismissed when emails to them bounced back in February of last year.

With so many layoffs, hopefully they’re getting better at it. Well, maybe not. Google employees have been told more layoffs are coming this year.

I imagine that there are some pretty high quality resumes floating around. As far as the tech field goes, Google is probably considered top tier, and landing a position against someone with Google on their resume is going to be tough.

There’s a problem with that, though. More than 25,000 tech workers from 100 companies got the axe in the first few weeks of 2024. Meta, Amazon, Microsoft, Google, TikTok and Salesforce are included in that… and Microsoft’s numbers may account for the Blizzard/Activision layoffs that happened this past week, sadly.

Blizzard was one of those dream jobs I had as a significantly younger developer way back when. They were often late on delivery for a new game, but it was pretty much worth it. I still play Starcraft II.

It’s become an employer’s job market – maybe it was before, but it’s definitely more so now, in an era when artificial intelligence may be becoming more attractive to companies for software development, among other things. For all we know, they may have consulted artificial intelligence for some of the layoffs. It wouldn’t be the first time that happened, though that was in Russia.

I can’t imagine that Google, Microsoft, Meta and Amazon aren’t using big data and AI for this, at least behind the scenes, but it’s probably not being explained because of the blowback that might cause. ‘Fired by AI’ is not something that people would like to see.

When tech companies axe employees, Wall Street rewards them, so stock prices go up – and there are more unemployed technology folk in a period when AI tools are making so many types of productivity easier. Maybe too much easier.

This reminds me so much of the 1990s. The good news is that tech survived the 1990s despite the post-merger layoffs.

Of course, the correction on the NPR article (at the bottom) is something I wish I had caught earlier. “Nearly 25,000 tech workers were laid off in the first weeks of 2024. Why is that?” would definitely be an article worth reading.

Running The Biostar Racing P1

I had a problem. In my apartment here in South Oropouche, I needed a sort of media PC in the living room.

Sometimes I want to kick back and write on my old Chromebook while watching Netflix or a streaming news/space service on YouTube. Sometimes I want to write from my dining table – really, a patio table I have indoors because I like it. Sometimes, I want to listen to music while I’m working out in the living room. Sometimes, I want to have multimedia ability in the living room when I have visitors who aren’t in the bedroom (can I write that publicly?).

I’m in Trinidad and Tobago, so options are limited as far as what I can find locally. When I visited Pricesmart, I saw a Lenovo ‘Yoga Pad’ I almost got until I tried the keyboard (ugh!) and thought through what I actually wanted. They had a Haier Mini-PC that looked promising, but there were no boxes and a web search on my phone only showed a link to The Wizz whose site was down for maintenance.

I visited an Apple reseller and stared at the old and somewhat disappointing specs of the Apple Mini, which has become the one thing that Apple doesn’t seem to want to advance. And for the cost? Oh, Apple, your systems are so pretty, and OS X is nice, but my word, your prices suck. Apple lovers, sorry, I see how you like spending your money but I can buy a lot of beer with the difference in price.

So I ended up at The Wizz in San Fernando, mainly to chase down the Haier and see what it looked like outside of a Pricesmart display that managed to tell everyone nothing. A gentleman helped me out, and dutifully trotted out the competition. That competition included the Biostar Racing P1, which I ended up with, as well as its little sibling, an Android version.

I’ll commend The Wizz here – over the years, on the rare occasions when I visited them, they have always been good – even over a decade ago when they were in some ways competition (I had a brief flirtation with wholesaling with one of their competitors). This was, hands down, my best experience with them. I picked up a keyboard, mouse and modest monitor for the system.

I got home. That’s when the troubles began.

Setting Up The Biostar Racing P1.

The box says that it’s Windows 10 compatible – and I mistakenly thought it actually came with Windows 10 on it. No such luck. Instead, it came with a CD for a system that – oh, this has got to take the cake – doesn’t have an optical drive. In fact, it’s so small, an optical drive couldn’t fit in it. So why on Earth would Biostar do this?

The documentation that comes with the system, a folded sheet of color printing, looks informative at a glance until you try to use it – it assumes you know more than you do – and it’s actually not much better than their FAQ on installing Windows 10 on the Biostar Racing P1, dancing between informative and ‘WTF?’. It was then I realized that my other systems also lacked optical drives – who uses those anymore? So here I had a CD with no way to use it, and cagey documentation on how to use the CD.

So I went with Linux. Lubuntu, Kubuntu – I went through quite a few distros in the course of an hour, using Rufus as noted in the FAQ, and every time there was no love for the AP6255 WiFi on the system. Oh, and the sound didn’t work. Three hours in, I found myself scanning through kernel logs and considering hacking through all of it when I realized:

(1) I’m tired.
(2) I did not buy the machine to be a project, I bought it to be an appliance.
(3) I wasn’t committed to any course of action, I was committed to getting the results I wanted.

As it happened, a helpful cousin lent me a portable optical drive – so I (mistakenly) thought I’d install Windows from it. No joy – there is no Windows on that CD, I found, only drivers (and none for Linux). At that point I realized I actually had to install Windows myself – and I was tired – so I went to Microsoft and downloaded the ISO for Rufus to install via USB, a download that took all night. I attempted to purchase a Windows 10 license, having figured out that it was necessary, but Microsoft gave me no joy. Amazon.com did. I punched in the product key during the installation on the Biostar Racing P1, and after an hour or so I used the borrowed optical drive to install the drivers.

It works, but honestly, this was annoying. Sure, I could have hacked through it; sure, I could have done other things; but the documentation sucks and is a little misleading, in my opinion. So, what do you need to know?

It’s a pain to get running, largely because no OS is pre-installed (to keep the price down, probably) and because the driver media is in a form that doesn’t come with the machine. You quite literally need another working machine to set the Biostar Racing P1 up, and if you don’t have an optical drive, you’ll have to navigate to Biostar to download the drivers, put them on a USB key, and hope you manage that without problems.

Now, using it once it’s all set up with Windows 10? Not bad. In fact, I wrote this entire entry using the system. Do I like it? Now that the annoyance of setting it up has passed, yes.

Would I suggest buying it for the casual user? Not unless you have a portable optical drive and access to a Windows 10 ISO as well as a license. The lack of those two things is infuriating. It could easily be resolved by Biostar if they chose to install an OS at the factory – and honestly, they could toss a Linux distro on it themselves and save everyone some heartache. And money.