Latest From LaserDeathStehr Labs - ScreenerBot

The way issue/bug tracking works at our company is that an issue is raised, and then a team of 'screeners' (the design and verification managers) looks at it, assigns it to a developer and gives it a priority.  A co-worker, who shall remain unnamed, and I joke around about a manager's 'routing table' and how certain keywords automatically trigger an assignment to a particular developer. This got me thinking: what if a program could predict who an issue will be assigned to?

Design

I thought about this for a bit and decided to run with it, and ScreenerBot was born.  ScreenerBot is a script that retrieves the unassigned issues from our bug database and then makes a guess as to which developer each one should be assigned to.  The idea behind ScreenerBot is to use Bayesian classification to classify incoming issues.  In a nutshell, Bayesian classification is a statistical approach to classifying a set of keywords: the classifier learns how often each keyword shows up under each category and uses those frequencies to pick the most probable category for new text.  The key is to 'train' your classifier with known data.  My engine relies on the Reverend Python library, which does all the Bayesian calculations and classification.
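
To make the idea concrete, here is a minimal sketch of a naive Bayes classifier in plain Python. It is not ScreenerBot's code and it does not use Reverend; the class name TinyBayes, the developer names and the keywords are all made up for illustration, and add-one smoothing is an assumption about how unseen words might be handled.

```python
import math
from collections import Counter, defaultdict


class TinyBayes:
    """Multinomial naive Bayes over word tokens, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # developer -> word -> count
        self.doc_counts = Counter()              # developer -> issues seen
        self.vocab = set()                       # every word seen in training

    def train(self, developer, tokens):
        """Record one issue (as a list of keywords) assigned to a developer."""
        self.doc_counts[developer] += 1
        self.word_counts[developer].update(tokens)
        self.vocab.update(tokens)

    def guess(self, tokens):
        """Return (developer, log-score) pairs, best first."""
        total_docs = sum(self.doc_counts.values())
        scores = []
        for dev, seen in self.doc_counts.items():
            counts = self.word_counts[dev]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior: how often this developer gets issues at all.
            score = math.log(seen / total_docs)
            # Log likelihood: how typical each keyword is for this developer.
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            scores.append((dev, score))
        return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical training data: two developers, one issue each.
bot = TinyBayes()
bot.train("alice", ["fifo", "overflow", "dma"])
bot.train("bob", ["testbench", "timeout", "regression"])
best_guess = bot.guess(["dma", "overflow"])[0][0]
```

Working in log space avoids multiplying many small probabilities together, and the smoothing keeps a single unseen keyword from ruling a developer out entirely.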

Once I had the 'brains' in place, the next step was to train it.  Recently, our company switched to using Jira as our issue management platform.  I find that Jira is a pretty powerful and easy-to-use platform, and more importantly, for curious hackers like me, it has a fully documented SOAP API. :) Another thing that makes Jira pretty powerful for a project like this is its JQL querying.  It allowed me to easily retrieve both the issues used for training and the unassigned issues. Using this API, I wrote some Python code that pulled in all the issues that were currently assigned to developers, processed them and then trained my classifier.  The processing I did was simple: I just removed the stop words and the special characters.
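
The two steps above (select issues with JQL, then clean up their text) might look roughly like the sketch below. The function names, the project key and the very short stop-word list are assumptions for illustration, and the SOAP call that would actually run the query is left out.

```python
import re

# A deliberately tiny stop-word list for illustration; a real one would be longer.
STOP_WORDS = {"a", "an", "and", "in", "is", "it", "of", "on", "the", "to", "when"}


def build_jql(project, assigned):
    """Build a JQL query for assigned (training) or unassigned (to classify) issues."""
    state = "is not EMPTY" if assigned else "is EMPTY"
    return 'project = "{}" AND assignee {}'.format(project, state)


def preprocess(text):
    """Lowercase, replace special characters with spaces, drop stop words."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return [word for word in cleaned.split() if word not in STOP_WORDS]


# Hypothetical issue summary.
tokens = preprocess("The FIFO overflows when DMA is enabled!")
# tokens == ["fifo", "overflows", "dma", "enabled"]
```

Replacing special characters with spaces rather than deleting them keeps things like "read/write" from collapsing into a single made-up word.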

Finally, once the system was trained with some initial data, I built a small web app on top of everything.  I had used the Sinatra Ruby library in the past, and for this project I went looking for something similar for Python.  A simple search led me to Juno.  Juno is very cool, but my only complaint is that it has a dependency on SQLAlchemy.  I have a feeling that a lot of users of Juno are probably looking to get a quick app up, and most likely won't be using a DB...

The web app is simple: first, all unassigned issues are retrieved from Jira via SOAP and passed through the classifier, and then the top 5 results are displayed on the web page. The results in this case are the developers who most likely should be assigned the issue.  The user is given the option to help train the classifier by suggesting a better result (i.e. a different developer) that the issue should be assigned to.
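
Picking the top 5 results can be as small as the sketch below. It assumes the classifier hands back a mapping of developer to score; that shape, the function name and the sample names and numbers are all made up for illustration.

```python
import heapq


def top_suggestions(scores, limit=5):
    """Return the `limit` highest-scoring developers, best first."""
    return heapq.nlargest(limit, scores, key=scores.get)


# Hypothetical classifier output for one unassigned issue.
scores = {
    "alice": 0.41,
    "bob": 0.22,
    "carol": 0.17,
    "dave": 0.09,
    "erin": 0.07,
    "frank": 0.04,
}
suggestions = top_suggestions(scores)
# suggestions == ["alice", "bob", "carol", "dave", "erin"]
```

A correction from the user would then just be one more training example: the issue's keywords paired with the developer the user picked.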

Results

The results after the initial training were surprising....awesome :) Seriously though, the classifier has been pretty good at making developer suggestions.  I would say it has a success rate of just over 60%, and it's probably even a bit higher.  I was surprised when I saw the initial suggestions and thought "hmmm that is probably right...."  Hopefully with further training it will get even better!  The one thing that ScreenerBot can't do is assess priority.  To me, this still needs to be a human decision: it is subjective and in some cases relies on things like change risk and business priorities.

For this project, I unfortunately have no source code or sample project to post, as it's mostly custom to our Jira repository and runs on our internal network.  But if there is interest, I could see about making it more generic and sharing the code....

1 comment on Latest From LaserDeathStehr Labs - ScreenerBot

  1. Anonymous
    Sun, 07/04/2010 - 22:03

    As long as it can detect ÑÞξ'§
    /sent by John Connor