Microsoft Big Data: Hadoop through HDInsight Server

October 25, 2012

A year after announcing their partnership and starting a joint project, Microsoft (SQL Server) and Hortonworks (Hadoop) have finally announced the results of the integration: Microsoft HDInsight Server and the HDInsight Azure Service.

So what is HDInsight? Essentially, it is Microsoft’s Hadoop-based distribution, built on top of the Hortonworks Data Platform. If you download Microsoft HDInsight Server for a local installation of the Hadoop distribution, you end up with a local cluster with your own Hadoop Hive, able to run Hadoop jobs and to benefit from the already released Hadoop integration points with SharePoint and Excel. This is just so powerful!

This release is part of Microsoft’s “End-To-End” approach to handling Big Data, in which they are trying to close all the gaps and offer a fully holistic and comprehensive approach to data warehousing, analysis, prediction, data enrichment (or augmentation) and general Business Intelligence solutions.

You can sign up for Microsoft’s HDInsight Service on Azure; this is a 10-minute process, and by the end you can have up to 16 clusters ready to start running Hadoop map/reduce jobs. Or, if you want to keep things local, you can install Microsoft’s HDInsight Server on Windows Server. The instructions for doing so are simple:

  • Download and install Microsoft Web Platform Installer 4.0; the distribution is currently being released through this medium.
  • Once it is installed, open Microsoft Web Platform Installer and search for “Hadoop”. You should get a result titled “Microsoft HDInsight for Windows Server Community Technology Preview”. Select this installer and click “Install”. I had to try the installer a couple of times before it worked, probably due to the high demand for it.
  • Once the installer completes and configures the Hadoop distribution and the user interface for managing the cluster on IIS, you should be able to browse to your local Hadoop cluster at http://localhost:8085/. If everything went OK, you should see the screen below, which will be familiar to anyone who managed to get on the “Hadoop on Azure” community preview that was available at the start of 2012. If the link to the Hadoop Dashboard does not work, open IIS and check which port it has been assigned. You should see two new sites under Sites in IIS: HadoopDashboard and HadoopWebAPI.
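If you would rather check the dashboard from a script than from a browser, something like the following could do it. This is only a minimal sketch: it assumes Python is available on the machine, the helper names `dashboard_url` and `is_reachable` are made up for illustration, and 8085 is simply the port quoted above, so pass a different one if IIS assigned something else.

```python
import http.client
import urllib.error
import urllib.request


def dashboard_url(port: int = 8085, host: str = "localhost") -> str:
    """Build the dashboard address; 8085 is the port quoted in the post."""
    return f"http://{host}:{port}/"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if any HTTP response comes back from the URL."""
    # An empty ProxyHandler bypasses system proxies, which makes sense
    # for a site served from the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False


if __name__ == "__main__":
    url = dashboard_url()
    state = "responding" if is_reachable(url) else "not responding"
    print(f"{url} is {state}")
```

A "not responding" result only tells you nothing answered on that port; the port IIS actually assigned to HadoopDashboard still has to be looked up in IIS itself.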

 

And in the classic Microsoft installation manner, that is literally all it takes to get Hadoop on Windows going. It doesn’t sound like much, but if you have ever tried to get Hadoop to run on a Windows machine before, you will understand the struggle, “fiddliness” and instability that you had to deal with in the past to get Hadoop to run either natively or through Cygwin (neither way is recommended for production use).

To navigate and use Microsoft’s HDInsight, there is a helpful, albeit basic, set of tutorials on the Azure website that can help you get a grip on the basics, along with the classic “Hello World” tutorial to get you started with running Pig and Hive jobs on your cluster. There is also a 30-page Word document, a jump start guide for HDInsight, which gives a bit more detail and an overview.
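For readers new to the map/reduce model that Hadoop jobs are built on, the idea can be sketched in a few lines of plain Python. This is purely an in-memory illustration of the map, shuffle and reduce steps using a word count; it does not use any Hadoop or HDInsight API, it is not taken from the tutorials mentioned above, and all function names are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group the emitted values by key, as happens between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three steps in sequence on an in-memory input."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

On a real cluster the map and reduce steps run in parallel across nodes and the grouping is handled by the framework; here everything runs in one process just to show the shape of the computation.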

It’s important to note that this release of Microsoft’s HDInsight is currently in Community Preview, so using it in production is not recommended.


4 replies
  1. jackie says:

    Hi, there is no hdserver available!

  2. jackie says:

    I mean the Microsoft HDInsight Server you mentioned does not appear in the Microsoft Web Platform Installer search results, so I think it does not exist anymore. Did Microsoft cancel the Microsoft HDInsight Server and just leave the Azure platform? If you can find a Microsoft HDInsight Server download link, please let me know, because I don’t want to use Azure as it’s not free. Thanks!

    • Links Naji says:

      Hi Jackie,

      Sorry for the late response, been away for a while.

      I can see what you are saying now… I think Microsoft did something sneaky and took away their HDInsight server edition installation (and only kept their HDInsight Azure version), without really telling the developer community. Thanks a lot for flagging this up; I can see other developers highlighting the missing HDInsight installation as well.

      There was an announcement about the release of version 1.0.0.0 of HDInsight a couple of months ago, with a link to the installer. You could try that route, but I have a feeling this won’t work either.

      Alternatively, you might want to consider installing HDP for Windows instead. I reckon it’s a lot more stable than HDInsight anyway, and well worth a try.

      Hope this helps, would love to hear how you got on!

      Cheers,

