Hadoop is a pretty neat set of tools for processing loads of data in a distributed, parallel and easy to scale-out manner, and so rightfully the Hadoop toolset owns a pretty high position in the data analysis and BI game, and a must consider when embarking on any new big data project. But that being said, the Hadoop eco-system, however advance in many areas, is still away from being a complete end to end BI solution, particularly when it comes to offering support for emerging data analysis and business intelligence concepts, such as exploratory data analysis and real-time data querying, or even fully-integrated data visualization and report authoring tools.
Archive for category: SSAS
This is a pretty simple post to show how to alter (add, remove or edit) a calculated field in an SSAS cube without redeploying the whole project, a useful technique if you do not have the SSAS cube project handy or wish to quickly implement changes on a live cube.
Until now, I have found working with Kerberos when setting up a SQL Server stack to be a complete nightmarish experience, mainly due to two reasons:
- Working with Kerberos usually requires access rights to Active Directory for the account setting up this authentication protocol on the stack, in order to be able to effectively diagnose the setup and also configure the Service Principal Names (SPN) for the various SQL Server and SharePoint service accounts, and setup delegation. This means SQL Server architects and Network Administrators need to collaborate in order to correctly configure the stack, which is often an unpleasant and long winded experience of trial and error.
- The lack of a centralized diagnostic and configuration tools for Kerberos setup on SQL Server makes this tasks very tedious, particularly if you follow the limited number of online resources out there to setup Kerberos, and find that they do not apply exactly to your situation, or do not work exactly as intended after following the lengthy steps, and you are left with a very limited option in terms of diagnosing exactly what went wrong.
Recently I have been involved in researching and building a low-latency high-data-volume OLAP environment for a social entity and interaction analysis platform, the perfect mixture of concepts such as Big Data collection and processing, large-scale Network Analysis, Natural Language Processing (NLP) and a highly scaled-out OLAP environment for end users to explore and discover data (essentially a Self-Service and Exploratory BI layer).
It is by all means not an easy mission to orchestrate all the technologies that back those concepts, particularly if you are interested in using the optimum solution for the problem at hand, for example Big Data might be better handled by a Hadoop layer, but Hadoop or Hive (at least on their own) are not geared up to respond to OLAP queries, which are real-time by nature, and even if they were, your end-user needs familiar tools and interfaces to analyse and study this data, which is where SQL Server Analysis Service and the whole Microsoft BI stack might come in and offer great integration with already existing business applications (such as Office or SharePoint).
This post discusses a few architectural approaches to exposing a Hadoop layer through a SQL Server Analysis Service (SSAS) interface, with references to data-latency, redundancy and over-all performance.
A very common requirement when administering an SSAS instance or cube is killing a particular connection, SPID or Session, this could be due to a lengthy operation exceeding the expected time to completing, or merely cancelling a transaction that was issued by mistake, hopefully that wont be a schema change on a live environment tho!.
This post goes through the XMLA required for killing an SSAS command, as well as the Analysis Service DMVs that can be utilized to identify the required IDs.
For some strange reason I can’t seem to find a lot of literature online about the basic behavior of the LastNonEmpty SSAS function, which is currently only available in an Enterprise Edition of Analysis Service (for some really strange reason, since you can emulate the behavior in other SSAS versions relatively painlessly), so here is a nice wee post that goes through the syntax of the LastNonEmpty function, how to override and customize it’s default behavior, what performance improvement techniques you can apply, and some of the quirks you might experience when playing around with the LastNonEmpty Analysis Service function.
Ok, this might be abit of a general question, as am sure anyone who found themselves in this blog knows a thing or two about BI, but in this post I will try to give a more holistic overview of what is a full Business Intelligence offering, and what dimensions constitutes a full analytical offering. Additionally, having a BI infrastructure is all well and good, but at the end of the day, the over-all goal of any BI platform is to identify and act upon the data as quickly as possible, when the data is most useful for strategic, tactical or operational business decisions, a concept that we will try and explore in this article.
Without a doubt one of the most annoying and least useful errors I have ever encountered during my experience with SSIS.
This error has crept up on an SSIS packages that processes a dimension. The SSIS package is very simple and all it has is a OLE DB Source that dumps data from a query into a Dimension Processing task. The package executes fine until it reaches a particular row in the OLE DB source batch, in which it will fail with the following error message:
Dimension Processing Error: XML parsing failed at line 4, column 3391: Illegal xml character.