Archive for month: January, 2013

Running Highcharts within SSRS (or any JS Graph Library)

22 Jan
January 22, 2013

In a previous post I described how to convert an SSRS graph into a Highcharts graph by consuming the XML output of the report from the SSRS Web Service and converting that to an input for a Highcharts graph.

That article proved very popular (in fact it was the most popular post for a while), so I decided to take the concept a step further. In this article I will show you how, using JavaScript injection into SSRS reports, you can display a Highcharts graph from within SSRS itself (just like any other SSRS report) when the report is rendered to HTML.

Read more →

10 Tips to Improve your Text Classification Algorithm Accuracy and Performance

21 Jan
January 21, 2013

In this article I discuss some methods you could adopt to improve the accuracy of your text classifier. I’ve taken a generalized approach, so the recommendations here should apply to most text classification problems you are dealing with, be it Sentiment Analysis, Topic Classification or any other text-based classifier. This is by no means a comprehensive list, but it should provide a nice introduction to the subject of text classification algorithm optimisation.

Read more →

Text Classification Threshold Performance Graph

20 Jan
January 20, 2013

One way to increase the accuracy of a classification algorithm is to allow it to return an “Unknown” value, particularly when the probability of the item we are trying to classify is too low for it to clearly belong to one class and the algorithm is essentially guessing an answer, leading to incorrect classifications.

In this post I will explore a method for researching and implementing the “Unknown” result in your classifier, based on the probability distribution returned by a classification. The idea is to give you the tools to find the thresholds that give you the best accuracy while maintaining an acceptable level of overall coverage of the data.
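As a rough illustration of the idea, here is a minimal sketch in Python. It assumes the classifier hands back a dictionary of label probabilities per item; the function names, the `UNKNOWN` label and the sample data are all made up for the example and are not taken from the full post.

```python
from typing import Dict, List, Tuple

UNKNOWN = "Unknown"  # hypothetical label used for rejected predictions


def classify_with_threshold(probs: Dict[str, float], threshold: float) -> str:
    """Return the most probable label, or UNKNOWN if its probability is below the threshold."""
    label, p = max(probs.items(), key=lambda kv: kv[1])
    return label if p >= threshold else UNKNOWN


def accuracy_and_coverage(
    distributions: List[Dict[str, float]],
    gold_labels: List[str],
    threshold: float,
) -> Tuple[float, float]:
    """Accuracy is measured only on items that received a real label;
    coverage is the fraction of items that received one."""
    predictions = [classify_with_threshold(d, threshold) for d in distributions]
    answered = [(p, g) for p, g in zip(predictions, gold_labels) if p != UNKNOWN]
    coverage = len(answered) / len(predictions) if predictions else 0.0
    accuracy = sum(p == g for p, g in answered) / len(answered) if answered else 0.0
    return accuracy, coverage


# Made-up probability distributions and their true labels.
distributions = [
    {"pos": 0.9, "neg": 0.1},
    {"pos": 0.55, "neg": 0.45},
    {"pos": 0.2, "neg": 0.8},
    {"pos": 0.6, "neg": 0.4},
]
gold_labels = ["pos", "neg", "neg", "pos"]

# Sweep a couple of thresholds to see the accuracy/coverage trade-off.
for threshold in (0.5, 0.7):
    acc, cov = accuracy_and_coverage(distributions, gold_labels, threshold)
    print(f"threshold={threshold:.2f} accuracy={acc:.2f} coverage={cov:.2f}")
```

Raising the threshold in this toy data trades coverage for accuracy, which is the balance the threshold has to strike.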

Read more →

SQL Saturday in Edinburgh in June 2013

20 Jan
January 20, 2013

SQL Saturday is finally coming to Scotland with a session scheduled for Edinburgh in June 2013.

Pretty damn exciting if you’re into the whole SQL Server scene, and considering I live in Edinburgh you bet I’ll be there (which might sway your decision not to go). Last year I missed SQL Saturday in Dublin because I couldn’t sort out travel arrangements in time, which was such a disappointment considering everyone else on my team ended up going.

Read more →

Testing & Diagnosing a Text Classification Algorithm

19 Jan
January 19, 2013

Getting something going with a text (or any) classification algorithm is easy enough. All you need is an algorithm, such as Maximum Entropy or Naive Bayes (implementations of each are available in many different flavors across various programming languages; I use NLTK on Python for text classification), and a bunch of already classified corpus data to train your algorithm on. That is it: you’ve got yourself a basic classifier.

But the story rarely ends here. To get any decent production-level performance or accuracy out of your classification algorithm, you’ll need to iteratively test it for the optimum configuration, understand how different classes interact with each other, and diagnose any abnormality or irregularity your algorithm is experiencing.

In this post I hope to cover some basic mathematical tools for diagnosing and testing a classification algorithm. I will take a real-life algorithm that I have worked on as an example, and explore the various techniques we used to better understand how well it is performing and, when it is not performing, what the underlying characteristics of the failure are.
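One common diagnostic of this kind is a confusion matrix with per-class precision and recall. The sketch below is only an illustration under that assumption: the excerpt does not say which tools the full post uses, and the helper names and sample labels are invented.

```python
from collections import Counter


def confusion_matrix(gold, predicted):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(gold, predicted))


def precision_recall(matrix, label):
    """Precision and recall for one class, read off the confusion matrix."""
    true_positives = matrix[(label, label)]
    predicted_as_label = sum(c for (g, p), c in matrix.items() if p == label)
    actually_label = sum(c for (g, p), c in matrix.items() if g == label)
    precision = true_positives / predicted_as_label if predicted_as_label else 0.0
    recall = true_positives / actually_label if actually_label else 0.0
    return precision, recall


# Made-up gold labels and classifier output for six documents.
gold = ["pos", "pos", "neg", "neg", "neutral", "neutral"]
predicted = ["pos", "neg", "neg", "neg", "pos", "neutral"]

matrix = confusion_matrix(gold, predicted)
for label in sorted(set(gold)):
    precision, recall = precision_recall(matrix, label)
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

Looking at the off-diagonal cells, such as `("neutral", "pos")`, shows which classes are being confused with each other rather than just how often the classifier is wrong.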

Read more →

Running Windows 8 on Raspberry Pi

18 Jan
January 18, 2013

… Essentially you can’t run Win8 on the Raspberry Pi.

In this article I’ll try and explore the reasons why the Raspberry Pi is unable to support Windows 8, as well as present some alternatives that might achieve a sub-set of the Windows OS functionality.

Read more →

Two Dimensional Python Matrix Data-Structure with String Indices (Indexes)

17 Jan
January 17, 2013

Today I was trying to create a two-dimensional data structure that can be queried using string indices rather than integer ones. This is in Python, which I am a total newbie in (but am trying to write a research project with).

The idea was to find something native to Python rather than implement my own structure. Such a data structure is fundamental in programming theory, so there is a very good chance that an out-of-the-box implementation already exists in most languages. These are generally a dimensional extension of Arrays (Array of Arrays), Lists (List of Lists) or Dictionaries (Dictionary of Dictionaries), but with string rather than integer indexes.
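For instance, the "Dictionary of Dictionaries" option can be sketched with nothing but the standard library. This is one possible approach rather than necessarily the one the full post settles on, and the row and column names and the `get_cell` helper are made up for illustration.

```python
from collections import defaultdict

# Outer dict maps a row name to an inner dict of column name -> value.
# defaultdict(dict) creates the inner dict the first time a row is written to.
matrix = defaultdict(dict)
matrix["row_a"]["col_x"] = 1.5
matrix["row_a"]["col_y"] = 2.0
matrix["row_b"]["col_x"] = 3.0


def get_cell(m, row, col, default=0.0):
    """Read a cell without creating missing rows as a side effect."""
    return m.get(row, {}).get(col, default)


print(matrix["row_a"]["col_y"])
print(get_cell(matrix, "row_c", "col_x"))
```

Reading through `get_cell` rather than `matrix[row][col]` avoids `defaultdict` silently adding empty rows when a missing row name is looked up.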

Read more →