Getting BI Skills... To Land The Job... To Build On Those Skills... To Advance On The Job... To...
During a presentation I gave on SQL Analysis Services 2005 in January 2007 to the San Diego SQL Users Group ( www.sdsqlug.org ), one attendee asked me for my opinion on how to go about acquiring BI skills. I'm no guru, but as one who is a ways along the skills continuum, here is my take on just one aspect of this question, which is...
First Off, Exactly What Are The Core BI Technical Skills?
(1) Relational Database and SQL Language Fundamentals:
Although BI is certainly a departure from traditional / transactional RDBMS principles (eg. third normal form), it is still significantly based on, or at least built from, relational DB management systems' concepts. Additionally, most BI applications are built around large data sets, for which solid SQL language skills are essential. Add these two realities together and most will agree that BI is a place for interested database novices to progress towards, but not to start with. On the other hand, business people may continue growing their BI Analysis skills without necessarily spending much time in this topical area.
(2) Dimensional Modeling:
Designing the data warehouse (DW) or data mart (DM) is, I think, the theoretical foundation of BI. Once the business process of interest has been defined, we define the facts (aka measures), which are the (usually numeric) values to be analyzed according to dimensional criteria (to be defined shortly). Examples of facts in a Sales data mart may include quantity of units sold, unit sales price, and unit cost. We then define dimensions and their attributes. To extend the example, a Sales data mart may include a product dimension with attributes of Product ID, Product Name, as well as hierarchic values such as Product Sub-Category and Product Category. A third dimension may be the Date dimension, which may include analytically intuitive attributes such as Date, Day of Week, Holiday, Month, Quarter and Year. Importantly, although an overriding goal of dimensional modeling is to establish a highly-intuitive, refreshingly simple design, the potential potholes and bends in the road are many, and thus Dimensional Modeling requires study. Fortunately, it's fun. Dimensional modeling calls for, and rewards, experience with not only database design but the business subject area itself.
(3) Extraction, Transformation and Loading (ETL):
Once the DW's dimensional design is complete, the hard work of gathering data from multiple sources, cleansing it and transforming it's data types, granularities and/or aggregations, loading it into the DW, and the scheduling much of that to iterate, begins. Many consider ETL to be the 800 Pound gorilla in the DW project planning room. It's specialized work performed by experts, it's difficulty may be under-appreciated by key business-side stakeholders who wonder "How hard could it be...?", and the likelihood of it's on-time, on-budget successful delivery is obviously vulnerable to even subtle changes in project requirements. In my opinion, ETL may be one role in the team cannot afford for the individual to just learn it as she goes.
Although they are often considered as an advanced database skill set, people with the desire and the dedication to master this specialty, will enjoy a strong demand for their work.
(4) Analytics / OLAP Cube Development:
From my viewpoint, this is the most fun part. The only part more fun is showing the analyst / BI user how to go about "finding the money" and then being around for the high fives when she finds really big bucks.
In building a high-functioning OLAP cube on top of the DW as it's data source, we are creating a truly multi-dimensional on-screen data browsing structure with fast (ideally "speed of thought" query performance, rich semantics, calculated fields, key performance indicators (KPI's), pre-calculated aggregates, a realistic variety of hierarchic default drill-paths and a strong foundation for the inevitable collection of hard-copy reports. MS Excel is, of course, a popular OLAP front-end browser. With a sufficient budget, the OLAP application will also benefit greatly from a rich-featured, browser-based, zero-client-footprint, multi-user front-end OLAP web server (that was a mouthful), such as is provided by Applix, Business Objects, Cognos, Hyperion, MicroStrategy, Panorama, ProClarity, Strategy Companion, SAS and others. The user-browsing capabilities of some of these platforms is extensive, and so the work of configuration, customization, user-training and administration with these platforms is no trivial matter.
For analytical business-specialist types like myself, the analytics / OLAP development phase is uniquely exciting insofar as it delves deep into the actual analyses that will be performed and enables them for users.
(5) Data Mining:
I will only deal with this advanced area briefly. Data mining starts, in a sense, where OLAP stops. Whereas OLAP enables analyses according to known factors (dimensions) in order to determine "what did happen and why?", data mining allows the system to detect patterns from "test data", and then apply those patterns to new data sets and thereby furnish potentially insightful answers to questions no user thought to ask. In that context, data mining is often used for predictive analyses of "...in light of all of our data... what will happen?"
Data Mining practitioners, whose expertise is usually built atop a foundation of statistical analysis skills, may often be relatively uninvolved in database engineering per se, but rather are advanced power-user specialists who can generate unprecedented insight from a database. As such, data mining is among the hottest of hot database skills.
(6) Report Development:
Although the aforementioned OLAP on-screen analytics may tend to reduce the sheer number of semi-redundant reports that IT is asked to generate, is does not signal the end of the need for reports per se. Reports retain the role as the hard evidence of information queried and, hopefully, valuable knowledge gained. In my opinion, a reasonably small collection of carefully conceived, uniquely insightful OLAP-based reports, many of which should include an extensive set of user input parameters, are crucial BI assets. Ideally, OLAP-browsing analysts should be able to drill directly and seamlessly from on-screen browsing into a variety of these on-screen reports with their input parameters automatically pre-selected according to the users currently OLAP browsing slice-and-dice positioning. As the analyst successively "finds the money", she should be able not only to bookmark those OLAP browsing locations, but also document each of them with appropriately filtered hard-copy reports. Lastly, these report hard-copies must include automatic documentation (ie. in the report footer) of all input parameters settings, so that the report can be duplicated later.
Personally, I view report development skills as an excellent compliment to RDBMS fundamentals and SQL language skills for novices interested in BI.
(7) Project Lifecycle Process Methodologies:
These competencies, which span skills that are technical, analytical, organizational and interpersonal, may be the single most crucial factor in BI success in light of the sheer complexity of BI development. For an article on the mainstream "Data Warehouse Lifecycle, click on... http://www.businessintelligence.com/ex/asp/code.147/xe/article.htm
For an intriguing book on the "Business Intelligence Lifecycle", click on...
http://www.amazon.com/gp/product/0201784203/ref=wl_it_dp/104-5823156-6544767?ie=UTF8&coliid=I3JFAKDAO7BDG0&colid=266K0ULFTUTZY. Of, if that link breaks, visit Amazon.com and search for the book entitled "Business Intelligence Roadmap" (ISBN-10 0201784203), by Moss and Atre. What's intriguing? They tout Extreme Programming (XP) as an effective BI development process. Although I have not been exposed to XP usage in BI, I must concede that anything that can reliably accelerate progress amid a high degree of complexity needs careful consideration.
In my view, expertise with software lifecycle methods simply comes with experience. Having said that, the topic makes for enlightening, and occasionally enjoyable, reading.
Let's close this blog entry with some open questions...
Question #1: For BI newcomers from a variety of backgrounds, which of the above skills are reasonable to seek and in perhaps what sequence from a career-development standpoint?
Question #2: Under what circumstances can any, or many, of the above roles be wisely placed on a single individual, and what risks do these choices create?
Question #3: To what extent is Agile or Extreme Programming (XP) an appropriate methodology for BI?
What are your thoughts?
...





Question #1: I have approached (thus far) growing my BI skills by taking on, as you state, the 800lb gorilla first. In my 8yrs experience in RDBMS development and reporting I have had to do many ETL type functions but have, in the past, always hand coded everything in T-SQL rather then using DTS. I had more control and I didn't have to learn some tool that I was weary of. In 2005 that all changed when I really started to dive into SSIS for my last contract. SSIS can be overwhelming and I have not mastered it by any means but after several months of working with it almost night and day I feel confident at this point that I can accomplish most ETL tasks with out any hicups. Next on my path I have been working on gaining a solid understanding of Dimensional Modeling and Cube Development at the same time. Lastly will be Data Mining as from my experience not many organizations are at a point where they can even consider it. So in order to be able to really gain value from data mining I would need to be good at all the other aspects first. This is why I've chosen data mining last, although I have studied it briefly.
Question #2: I believe BI Architects must poses all of these qualities in order to their job successfully and that anyone wishing to become well versed in BI will inevitably learn all aspects of the solution. However, when someone is first trying to get into BI Development it may make sense to have them focused on only one aspect of the overall solution.
Question #3: Don't know what this is...I really hate the word Extreme though :-)
Cheers,
Ben
Reply to this