Moving away from spreadsheets

Its a shame that I’ve not been blogging for so long. But some of my business trips outside the country and my other committments have kept me busy. Anyways, I will try to get back to a habit of regular posts based on my experiences.

As I have mentioned earlier in my post, the biggest challenge I am facing with a spreadsheet based BI system is the consistency of data and offcourse the time lag between recieving a request and delivering the output. There are tonns of other pitfals of otherwise easy to manage system (which I am trying to get rid of, hopefully soon). So my first obvious step was to move away from spreadsheets and create a central data repository – and for a small or a mid sized organization, there is nothing better than mysql. It has the features of a robust commercial database and yet it comes for free.

Step 1 – The first step towards this initiative was to understand the business needs (from a BI perspective) and design someting on a piece of paper that would be later converted into a formal database structure. I am not a database guru, but there are tonns of resources available online to get started. I am not going to talk much about how I designed and implemented my database (this is something you have to learn online and implement). But after this excercise, I had list of tables in mysql ready to be fed with data.

Step 2 – The next step was obviously to get some test (or real) data into the database to see everything is working as per the earlier design. If in case the data is on the worksheets, some sort of ETL (Extract, transform and Load) software has to be used to transfer data from spreadsheets to mysql. There are lot of software available on the internet, but I guess Navicat worked best for me. It is fairly easy to map your spreadsheet columns to the fields in your tables in mysql, rest everything is taken care by Navicat. Do remember to check the data in mysql after every transfer process to ensure that everything was transferred correctly or not.

With test data in the database (I would recommend atleast 3 months of data in all tables), you are ready to jump on to the next big level – deployment of BI Solution (Pentaho in my case).

Though the deployment process was a little complex earlier, but the inclusion of a bitrock windows installer has made things a lot easier now. Just download the executable and start installing. I will talk a little about the installation piece in my next blog.

Until then, have fun deploying :D

Advertisement