Saturday, May 28, 2011

First week of GSoC

Although a full week has not passed since the official kick-off day for coding (May 23rd), it will be much easier for me to write my weekly blog posts on weekends. So here I am, writing my first weekly report.

In my opinion the first week was very productive, and I was able to deliver the first two features on schedule. Further, I managed to start on the next feature, so most probably I will be able to deliver it before its allotted date.

As the first feature, I improved GIS data browsing by adding two new options. The user can now view GIS data in the standard Well-Known Text (WKT) and Well-Known Binary (WKB) formats. This is in addition to viewing GIS data with an indication of its type and size (e.g. [GEOMETRY - 2KB]), which was the only method previously available in PMA.

These formats also adhere to the other options set when viewing data. For example, when the 'Partial data' option is set, the WKT format adheres to it by truncating the well-known text to the specified limit. Similarly, WKB adheres to the 'Show binary contents' and 'Show binary contents as HEX' directives.
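As a rough sketch of the difference between the two formats (the table and column names here are hypothetical, and the functions are the ones MySQL provided at the time), the same geometry can be retrieved in either representation:

```sql
-- Hypothetical table with a single GEOMETRY column
CREATE TABLE places (g GEOMETRY NOT NULL) ENGINE=MyISAM;
INSERT INTO places VALUES (GeomFromText('POINT(1 2)'));

-- WKT: a human-readable textual representation
SELECT AsText(g) FROM places;        -- POINT(1 2)

-- WKB: a compact binary representation, shown here as hex
SELECT HEX(AsBinary(g)) FROM places; -- 0101000000...
```

The '[GEOMETRY - 2KB]' display mentioned above shows neither of these; it only reports the type and size of the stored value.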

My second task was to support spatial indexes. Spatial indexes can only be created on spatial data columns and, contrary to other indexes, can only cover a single column. Further, spatial indexes can only be created with the MyISAM storage engine, and specifying a prefix length is not allowed when creating the index. With this feature, you can now create a spatial index on a geometry column easily, by clicking the corresponding button for that column on the table structure page.
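The constraints described above can be illustrated with plain SQL (the table and index names are just examples):

```sql
-- Spatial indexes require MyISAM and a NOT NULL spatial column
CREATE TABLE places (
  g GEOMETRY NOT NULL,
  SPATIAL INDEX sp_idx (g)  -- exactly one column, no prefix length allowed
) ENGINE=MyISAM;
```

Attempting the same on an InnoDB table, on a nullable column, or with a prefix length such as `sp_idx (g(10))` would be rejected by the server.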

I also improved the index edit page to suit spatial indexes. To prevent users from wrongly specifying parameters and ending up with errors, some input boxes are hidden dynamically when 'Spatial' is selected as the index type. Both of the above features have been pushed to my repository, and you can play around with them on my demo server.

Towards the end of the week I started working on my next feature, which involves a considerable amount of coding compared to the above two. This is not the post to write about it, though, so I will reserve those details for my next weekly post. I am looking forward to delivering the next feature, 'Visualizing GIS data', as soon as possible, since it will be nice to see some graphics :)

Friday, May 13, 2011

Rigor Automation

When it comes to software engineering, rigor refers to keeping the code healthy. Code health can be measured in terms of a number of parameters, with freedom from bugs and vulnerabilities, adherence to coding standards, and unit test coverage being a few of them. 'Rigor automation', then, refers to automating the process of identifying deviations from this healthy status.

Programmers' lives have been made easy (or, one may say, hard) by the many tools that have been built to automate rigor. Standalone tools, as well as plugins for various IDEs, especially Eclipse, exist for this cause.

To measure adherence to a coding style, the Eclipse plugin Checkstyle is the most popular choice. PMD can also be used for this purpose, and in addition it can detect potential bugs and vulnerabilities in the code. FindBugs is another great tool that, as its name suggests, finds bugs. Copy/Paste Detector (CPD) does what it's supposed to do, and finding copy-paste instances is quite important, as most of the time they are indications of bad OOP design. So you had better reconsider your class hierarchy if you have to copy and paste your code in a number of places.

Cyclic dependencies between packages are also an indication of bad overall design. JDepend is there to save you, automatically analyzing the dependencies of each and every package to detect cycles. It is also important to know the statistics of your code: the average, minimum and maximum lengths of your files and methods, the total LOC of your project, and the comment ratio provide invaluable insights into your code. Tools such as Metrics and SourceMonitor can be used to gather these metrics. Unit tests are a must for healthy code, and unit test coverage is the measure that provides assurance of robustness. Coverlipse and EMMA are two great tools that provide unit test coverage statistics.

While most of the tools listed above are for Java, a number of tools exist for other languages as well. What made me write this post is, in fact, my discovery of another great tool. While looking for something that would help me adhere to the PEAR coding standards used by phpMyAdmin, I found CodeSniffer. Since this post is getting long, I will write a separate post about my experience with CodeSniffer.