I wrote ‘Risk and the Photo Cloud‘ as a first stab at identifying the problem I am having with mitigating risk with my Flickr stream as well as new needs I have from my collection of images.
Bear in mind, this is a ‘one off’ problem for me, and I approached it as such.
I was a bit more focused on what I wanted to do with them – I have people willing to buy prints – and I may not have made the best choice by paying $25 for Bulkr Pro. The trick was to get the tags, etc, and in doing the initial research on the Flickr forums, I saw nothing better. In writing this today, I found FlickrDownloader which I probably would have chosen given that it’s open source, available at no cost, etc. I may give it a spin anyway. Try it out and let me know how it differs.
When using Bulkr Pro, you can opt to have the full information being saved to text files. This is generally not a bad idea, but it’s certainly not a great idea when you have over 19,000 files on Flickr as I do (only 18,000 or so are public). So, the general idea is to get them into a database.
I opted not to directly import them into the database after some thought because I need to do some manual editing of which files I want to keep, etc – and in doing that, a spreadsheet would be easier for that part of the process. So that’s what I did.
The text files generated by Bulkr are mildly annoying in this regard because their format isn’t meant for what I wanted to do. However, the line numbers for the specific information is constant, as is the labeling, so I was able to whip together a Python 3.x script that reads all the text files and writes them all to a CSV. I tossed it up on my GitHub; you can check out https://github.com/knowprose/BulkrTxtFilesToCSV
So now, I have the information in a CSV and, as time permits, I’m working on choosing which images I am getting rid of (with Flickr, I was lazy about space). Let’s just be clear and call it ‘dirty data’; when I uploaded I did it as a hobby and I now have a professional use.
From there, it’s a simple matter of uploading the CSV to MariaDB, which is extremely fast.
Having spoken to a few professional photographers I know, there are professional tools out there that do what I want with image management, etc, but I’m not too pleased with them. I may end up spinning my own image and file management system in Python, or not – I’m undecided at this point because I’m moving from a ‘one off’ to a more consistent system that will suit my personal needs.
Why Python? Honestly, I like coding in Python, but the job market has always been more interested in my C, C++, C#, VB, VB.Net, PHP/MySQL, etc. This is a low priority project for me, and it’s something I want to be fun while getting used to Python 3 a bit more.
At this point, the system I’m thinking of will create resized images for the web, as well as allow me to edit based on tags. Pillow will allow for much of that, and more.