Is your organization ready to innovate?
There are still some misconceptions about what exactly data science is – and how it differs from the broadly used term “Business Intelligence” (BI). A business can gain an edge over its competitors by understanding that difference, and by leveraging it.
To begin with, the end goal of business intelligence tends towards reporting, dashboards and alerts. Common deliverables are reports – sales trends by season, inventory backlogs or shortages, or even customer segmentation. The value is in the visualization. Marketing and management people appreciate having a large, complex body of data distilled into a simple, easily digestible visual format. And let’s face it, who doesn’t love a good pie chart?
Data used by BI tools is typically stored in some kind of structured, relational database. How quickly that data becomes available is paramount to its usefulness. Online analytical processing, or OLAP, allows a business to analyze its data from multi-dimensional perspectives. A chain of restaurants may, for example, use its transactional data to decide on the specials for each day of the week, and the times of day at which an offer is most likely to draw in additional customers. The shrewd BI tool may also pick up that certain regions, even within the same city, have different demographic make-ups (age, marital status, number of children) and therefore must cater to different tastes. But an inherent problem with BI tools is that you need to know what you’re looking for. You will not find new insight in your data beyond the variables that already exist.
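To make the idea of multi-dimensional analysis concrete, here is a minimal sketch of an OLAP-style roll-up in plain Python. The transactions, the dimension names and the roll_up helper are all hypothetical, invented for illustration; a real BI platform would run the equivalent query against its own data store.

```python
from collections import defaultdict

# Hypothetical restaurant transactions: (region, day, hour, revenue).
TRANSACTIONS = [
    ("downtown", "Mon", 12, 18.50),
    ("downtown", "Mon", 18, 42.00),
    ("downtown", "Tue", 12, 21.00),
    ("suburbs", "Mon", 12, 9.75),
    ("suburbs", "Mon", 18, 55.25),
    ("suburbs", "Tue", 18, 61.00),
]

# Names of the dimension columns, in the same order as in each row.
DIMENSIONS = ("region", "day", "hour")


def roll_up(rows, by):
    """Sum revenue grouped by the chosen dimensions (an OLAP-style roll-up)."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]  # revenue is the last column
    return dict(totals)


# The same data viewed along two different combinations of dimensions.
by_day = roll_up(TRANSACTIONS, by=("day",))
by_region_hour = roll_up(TRANSACTIONS, by=("region", "hour"))

print(by_day)
print(by_region_hour)
```

Note that every grouping is built from columns that already exist in the data, which is the limitation described above: the roll-up can only answer questions you already know how to ask.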
While data science is no less interested in practical, real-world business problems, its approach and the challenges it attempts to overcome are markedly different. First off, the data that data science works with is less relational and tends to be more unstructured, which is the kind way of saying “inconvenient”. The data will often be incomplete and sparse, and there is a greater need for external sources that may not be immediately usable without some degree of cleansing and conditioning.
Where data science really differs is in the approach and tools that are used. In contrast to the OLAP and ETL (Extract, Transform, Load) approach of BI stands the strategy of machine learning. Here, algorithms attempt to generalize from the observed data in order to predict the unseen. A few of the techniques used are artificial neural networks, decision tree learning and Bayesian networks. Optical character recognition is one application, as is natural language processing. There are many nuances to language and characters that can’t be completely documented with business rules, which is what makes these problems challenging. Can an enterprising restaurateur use a BI tool to parse the menus available online for all of his competitors, then join the terms and price points to come up with an optimized menu? Not likely.
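As a rough illustration of what “generalize from the observed data in order to predict the unseen” means, the sketch below fits a decision stump, which is the simplest possible decision tree (a single yes/no split). The observations, the fit_stump and predict helpers, and the assumption that later hours mean more sales are all hypothetical; a real project would normally reach for an established machine learning library rather than hand-rolled code.

```python
# Hypothetical observations: (hour of day, did the customer buy the special?).
OBSERVED = [
    (11, False), (12, False), (13, False),
    (17, True), (18, True), (19, True), (20, True),
]


def fit_stump(samples):
    """Learn one threshold: predict True when hour >= threshold.

    Assumes the positive class lies at later hours; a full decision tree
    would also try the opposite direction and further splits.
    """
    hours = sorted({hour for hour, _ in samples})
    # Candidate thresholds sit midway between neighbouring observed hours.
    candidates = [(a + b) / 2 for a, b in zip(hours, hours[1:])]

    def errors(threshold):
        # Count the observed samples this threshold would misclassify.
        return sum((hour >= threshold) != label for hour, label in samples)

    return min(candidates, key=errors)


def predict(threshold, hour):
    """Apply the learned rule to an hour, including ones never observed."""
    return hour >= threshold


threshold = fit_stump(OBSERVED)

# 14:00 and 16:00 never appear in OBSERVED, yet the rule still covers them.
print(threshold, predict(threshold, 14), predict(threshold, 16))
```

Nobody wrote the cut-off down as a business rule; it was derived from the examples, and it applies to hours that were never observed.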
On the subject of tools, those used within the business intelligence community tend to fall under the “platform” paradigm, and usually come with added cost and per-seat licenses. These tools are absolutely state-of-the-art in their ability to visualize data and rapidly generate colourful dashboards and ad-hoc reports. Data science tools (such as source code libraries) tend to be open-source, but may require a significant level of expertise to use. These tools may not offer much support for visualizing the results, a factor to be aware of when considering the target consumer of the data.
At Apption we don’t necessarily advocate data science over pure business intelligence. We take into consideration the problem being solved and the nature of the available data. Either approach can be used to simplify complex operational needs – because all data is an asset.