Pulling My Photos From Flickr: Lessons Learned, More.

nsb SunriseI wrote ‘Risk and the Photo Cloud‘ as a first stab at identifying the problem I am having with mitigating risk with my Flickr stream as well as new needs I have from my collection of images.

Bear in mind, this is a ‘one off’ problem for me, and I approached it as such.

I was a bit more focused on what I wanted to do with them – I have people willing to buy prints – and I may not have made the best choice by paying $25 for Bulkr Pro. The trick was to get the tags, etc, and in doing the initial research on the Flickr forums, I saw nothing better. In writing this today, I found FlickrDownloader which I probably would have chosen given that it’s open source, available at no cost, etc. I may give it a spin anyway. Try it out and let me know how it differs. 

When using Bulkr Pro,  you can opt to have the full information being saved to text files.  This is generally not a bad idea, but it’s certainly not a great idea when you have over 19,000 files on Flickr as I do (only 18,000 or so are public). So, the general idea is to get them into a database.

I opted not to directly import them into the database after some thought because I need to do some manual editing of which files I want to keep, etc – and in doing that, a spreadsheet would be easier for that part of the process. So that’s what I did.

The text files generated by Bulkr are mildly annoying in this regard because their format isn’t meant for what I wanted to do. However, the line numbers for the specific information is constant, as is the labeling, so I was able to whip together a Python 3.x script that reads all the text files and writes them all to a CSV. I tossed it up on my GitHub; you can check out https://github.com/knowprose/BulkrTxtFilesToCSV

So now, I have the information in a CSV and, as time permits, I’m working on choosing which images I am getting rid of (with Flickr, I was lazy about space). Let’s just be clear and call it ‘dirty data’; when I uploaded I did it as a hobby and I now have a professional use.

From there, it’s a simple matter of uploading the CSV to MariaDB, which is extremely fast.

Having spoken to a few professional photographers I know, there are professional tools out there that do what I want with image management, etc, but I’m not too pleased with them. I may end up spinning my own image and file management system in Python, or not – I’m undecided at this point because I’m moving from a ‘one off’ to a more consistent system that will suit my personal needs.

Why Python? Honestly, I like coding in Python, but the job market has always been more interested in my C, C++, C#, VB, VB.Net, PHP/MySQL, etc. This is a low priority project for me, and it’s something I want to be fun while getting used to Python 3 a bit more. 

At this point, the system I’m thinking of will create resized images for the web, as well as allow me to edit based on tags. Pillow will allow for much of that, and more.

The Why and Why Not Of Documentation

FAIL stampMy core pet peeve of the last 10 years is the lack of documentation I’ve encountered with Software Projects. Sometimes it’s as the consultant that’s called in after someone already charged a lot of money for a substandard job. Sometimes it’s that legacy project that sticks around through generations of developers who leave a company. Sometimes it’s the “we don’t have time” factor.

Documentation is a value-added part of any software project for a variety of reasons:

  • Reference: The ability of software engineers involved with a project to look back and see why things were done the way they were, and to allow for knowing the limitations and potential of a project.
  • On-boarding: Bringing new software engineers up to speed quickly without having to bounce around the company finding answers.
  • New Projects: Proper documentation of old projects, including estimates and other data, is of great use for estimating similar projects. (Project Management and Business Analysis)
  • Complicated code reviews are simplified.
  • If someone gets hit by a bus, someone else can take over. Quickly.
  • Legal: If a project is supposed to meet certain requirements, those requirements should be traced to the actual implementation.
  • Teams: Proper documentation in an Agile or DevOps process allows team members to see what others are doing in similar areas of code, and allows easy identification of problem areas quickly.

These are just some of the reasons documentation is important. More often than not, I’ve ended up writing documentation at different companies because it simply did not exist before.

Why didn’t it exist? Here are the standard excuses:

  • No Time: We didn’t have any time to write the documentation because we’re busy working on the next thing!
  • “Me no write so good”: Fair enough, but there’s only one way to get better at writing.
  • “It’s not my job, man”: Erm – no. Documentation is a part of software engineering. A big part.
  • We don’t know how to organize it: Fair enough. Spend some time and figure it out and implement it.

At the base of all the excuses are 2 things that are really simple:

  1. Software Developers generally don’t like writing documentation.
  2. Management fails to have developers write documentation.

These 2 things are understandable, wrong, and amazingly easy to deal with if a company wants to.