Talking to the Data

Data in its raw form has no meaning. Whether structured or unstructured, data must be processed to make it useful and meaningful. That said, all data ‘talks’ to its user, but one needs to ‘listen’ to it. It is up to the skills and capability of the data user to analyze, interpret, and take the synthesized information to the next level of usage, which is Artificial Intelligence. To do this, we need to ‘talk’ to the data and understand what it is telling us.

Talk to the data; it’s waiting for its audience

Data analytics is a structured process that transforms raw data into meaningful insights, but it all starts with mastering the fundamentals. In the eagerness to dive into advanced analytics, many overlook the importance of building a strong foundation. Let us understand how to start this process of data analytics, of ‘listening’ and ‘speaking’ to data.

First, you should be clear about what you want and why you want it. The background and objectives should be clear and documented before any analytics is performed. It is easier to write them on paper than to start hitting the ALT key in Excel. Once you have your background and objectives in place, identify where the data is and what format it is available in, whether structured or unstructured.

A challenge faced by users is getting correct and complete data on a continuous basis. Such challenges can and should be resolved with the support and commitment of Management. It is often seen that once the user gets the data, there is a hurry to start analytics. Instead, the user should first focus on data cleaning, a process which involves removing unwanted characters, streamlining data, and making it reliable for analytical results. This involves the use of various commands, functions, and algorithms so that the data can answer our ‘why’. Cleansing and standardizing data in this way is foundational. Only after the data is properly cleansed can the process of analytics begin, followed by a gradual progression to steps such as automated reporting for daily tasks.
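As an illustration of the cleaning step described above, here is a minimal sketch in Python. The function names `clean_record` and `clean_dataset`, the sample company names, and the rule of keeping only letters, digits, and spaces are all assumptions made for this example; real cleaning rules depend on the data and the objectives documented earlier.

```python
import re

def clean_record(raw: str) -> str:
    """Strip unwanted characters and standardize spacing and case."""
    # Assumed rule: keep letters, digits and spaces; replace anything else with a space.
    text = re.sub(r"[^A-Za-z0-9 ]+", " ", raw)
    # Collapse runs of whitespace and trim both ends.
    text = re.sub(r"\s+", " ", text).strip()
    return text.upper()

def clean_dataset(rows):
    """Clean every row, then drop empty rows and duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for row in rows:
        value = clean_record(row)
        if value and value not in seen:
            seen.add(value)
            cleaned.append(value)
    return cleaned

# Hypothetical raw input with stray punctuation, inconsistent case and a blank row.
raw_names = ["  acme ltd. ", "ACME  LTD", "Beta-Corp#", "", "beta corp"]
print(clean_dataset(raw_names))  # ['ACME LTD', 'BETA CORP']
```

Five inconsistent entries reduce to two standardized ones, which is the kind of reliability the later analytics depends on.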

This data cleaning also forms the basis for advanced analytics, including a) Machine Learning, b) Predictive Analytics, and c) Artificial Intelligence Analytics. Let’s look at these in some detail.

a) Machine Learning provides computers with the ability to learn without being explicitly programmed, meaning they can adapt and change when exposed to new data. It uses patterns found in historical data to interpret new data and adjusts programme actions accordingly. Machine learning is effectively a method of data analysis that works by automating the process of building data models. It examines small or large amounts of data, possibly from many different sources, with statistical algorithms such as clustering/profiling, regression and classification. The objective is to discover patterns and then make predictions based on those often complex patterns to answer business questions, detect and analyze trends, and solve problems.

Clustering/Profiling is the task of separating a set of unlabelled objects into groups such that those in one group are more similar to each other than they are to objects in other groups.

An example of a clustering problem is identifying groups of people with similar buying patterns. The input is a dataset in which none of the samples is assigned to a specific group. The clustering method first identifies a set of groups and then associates each sample with a specific group.
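The two steps just described (identify groups, then associate each sample with one) can be sketched with a very small k-means routine. This is only one of many clustering methods, and everything here is an assumption for illustration: the function name `kmeans_1d`, the one-dimensional spend figures, the choice of two groups, and the evenly spread starting centres.

```python
def kmeans_1d(values, k, iterations=20):
    """Tiny k-means for one-dimensional data with deterministic starting centres."""
    ordered = sorted(values)
    # Assumption: starting centres are spread evenly across the sorted values.
    centres = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins the group with the nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its current members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels, centres

# Hypothetical monthly spend per customer; no customer has a group assigned up front.
monthly_spend = [12, 15, 14, 95, 102, 99, 13, 101]
labels, centres = kmeans_1d(monthly_spend, k=2)
print(labels)   # [0, 0, 0, 1, 1, 1, 0, 1]
print(centres)  # [13.5, 99.25]
```

The low spenders and high spenders end up in separate groups without any labels being supplied, which is what distinguishes clustering from classification.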

Regression is the task of predicting a numeric response from numeric or categorical variables. For example, based on the number of past purchases, what’s the probability of purchasing a specific product? A linear regression algorithm can be an effective tool for Predictive Analytics.
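To make the idea of regression concrete, the sketch below fits a straight line by ordinary least squares with a single input variable. The function name `fit_line` and the figures are made up for illustration, and the example predicts a numeric spend rather than a probability, which is a simplification of the purchase example above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: number of past purchases versus next-month spend.
past_purchases = [1, 2, 3, 4, 5]
next_month_spend = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(past_purchases, next_month_spend)
# Predict the spend of a customer with six past purchases.
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

Because the sample data lies exactly on a line, the fitted slope and intercept recover it; real data would leave residual error that should be checked before trusting a prediction.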

Classification is the task of deciding which category a new object belongs to, based on a model constructed from relationships between collections of existing objects that are already labelled.
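A simple way to see classification at work is a k-nearest-neighbours vote: a new object takes the label held by most of its closest already-labelled neighbours. This is a sketch of one possible technique, not the only one; the function name `classify`, the customer features (purchases per month, average spend) and the two labels are hypothetical.

```python
from collections import Counter

def classify(new_point, labelled_points, k=3):
    """k-nearest-neighbours: majority label among the k closest labelled points."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Sort the labelled examples by distance to the new point and keep the closest k.
    nearest = sorted(
        labelled_points,
        key=lambda item: squared_distance(item[0], new_point),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labelled customers: (purchases per month, average spend) and a label.
customers = [
    ((1, 20), "occasional"),
    ((2, 35), "occasional"),
    ((3, 30), "occasional"),
    ((10, 400), "frequent"),
    ((12, 380), "frequent"),
    ((11, 450), "frequent"),
]

print(classify((9, 390), customers))  # frequent
print(classify((2, 25), customers))   # occasional
```

In practice the features would usually be scaled first, since an unscaled feature with large values (here, spend) dominates the distance.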

b) Predictive analytics is the use of statistics and modelling techniques to determine future performance based on current and historical data. The larger the historical data set and the more variables available to the user, the better the predictions.

There are three pillars to predictive analytics: the needs of the entity that is using the models, the data and the technology used to study it, and the actions and insights that come as a result. There are three types of predictive analytics techniques: predictive models, descriptive models, and decision models.

c) Artificial Intelligence (AI) is the ability of a computer, or a robot controlled by a computer, to do tasks that are usually done by humans because they require human intelligence and discernment. There is an age-old statement about computers: “Garbage In, Garbage Out”. To achieve effective and efficient artificial intelligence results, it is important to program the tools/software with the maximum permutations and combinations of events that they are supposed to handle and manage. It is not a one-time exercise, as AI tools must be updated with changing environments and requirements.

This sector of analytics is driven by the users’ imagination and their capability to comprehend both present and future scenarios and possible solutions. Computers can only process what has been coded in their software. There is always an inherent risk of wrong outcomes if scenarios are incorrectly analysed and programmed. This risk escalates when users rely heavily on AI’s managing capabilities while adequate controls and monitoring of the correctness of the results are lacking.

In summary, analytics is driven by your imagination, and it is important to keep updating the analytics that have been implemented successfully. There is something known as the Analytics Life Cycle (ALC), which means that if the user finds certain analytical results are under control or within permissible risk limits, it’s time to move on and explore other areas with analytics. At the same time, the user should revisit previously analysed reports from time to time to make sure all is well.

Lastly, to have an effective and, more importantly, efficient system of analytics, the user must think outside the box and should not limit their imagination to the solutions currently available. Solutions are created based on need, and this process will remain the center point of future development too. Hence, it is important to keep the thinking process active, hungry for more, and progressive in order to reach greater heights and improvements.
