Just this past week, as I committed to the Pine64 cluster I’ll be building, I was looking at new ways to crunch some data. As such, I was looking into clustering and databases, something I haven’t done before.
Meanwhile, a Microsoft SQL Server DBA told me that some company had made a claim that they could import millions of records in minutes, or something along those lines. He was of the opinion that it was impossible. And here I am thinking about clusters and data, so I said, “ya know, if they cluster the databases they could get throughput that might allow for that…”
I’m not a DBA. I don’t play one on the internet. What I know of databases is limited to what I have used them for in the last two decades, and I have no real experience with clustered databases – but it stood to reason to me that a well-designed cluster would have more throughput. That’s just a theory because of my limited experience, so… because I’m finally able to focus on writing and researching rather than updating my old site… I did some research this morning.
It turns out my MySQL bias shows a bit here, not because I think it’s the greatest thing in the world (far from it), but because most of my experience revolves around it – and that biased my research. When I tried looking into Microsoft SQL Server clustering I found nothing I was looking for but plenty I was not (the Microsoft method of emulating Linux and making it worse). I gripe, but hey, I have a bias and I’ve let you know.
Here’s what I found.
‘Where would I use a MySQL cluster?’: Interesting read and links.
‘MySQL Cluster Features & Benefits’: OK, I have to admit a few geek-gasms in here as I was exploring in the context of my own cluster, but a young software engineer who I respect has been waxing poetic about MongoDB, too, so I’m not done yet. I have until April 2016 anyway.
‘MySQL Cluster – When to use it and when not to’: A wake-up call; it’s only a presentation teaser and yet it manages to say a lot more with a lot less.
That’s just some stuff on MySQL.
So, in the end, I stand by the theory that a clustered database can get much better throughput than the traditional systems, and I don’t know why that seems like a stretch for some.
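For what it’s worth, here is the back-of-the-envelope arithmetic behind that theory, as a minimal Python sketch. Everything in it is an assumption for illustration: the function names are made up, the record counts and per-node rates are hypothetical rather than measured, and it pretends throughput scales perfectly linearly with the number of nodes, which a real cluster will not quite do.

```python
import math


def required_rate(records: int, minutes: float) -> float:
    """Rows per second needed to import `records` rows in `minutes` minutes."""
    return records / (minutes * 60)


def nodes_needed(records: int, minutes: float, per_node_rate: float) -> int:
    """Nodes needed to hit the target, ASSUMING ideal linear scaling.

    `per_node_rate` is a hypothetical sustained insert rate (rows/second)
    for a single node; real clusters lose some of this to coordination.
    """
    return math.ceil(required_rate(records, minutes) / per_node_rate)


if __name__ == "__main__":
    # Hypothetical target: 6 million records in 10 minutes.
    print(required_rate(6_000_000, 10))
    # Hypothetical single node sustaining 2,500 rows/second.
    print(nodes_needed(6_000_000, 10, 2_500))
```

The point is only that “millions of records in minutes” becomes a rows-per-second target, and splitting that target across nodes is what makes it look less impossible.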
But I also know that I’ve only scratched the surface and that I have to dig deeper on that. My preliminary reads on MongoDB and Production sharded clusters make that very interesting, too.