Big data to smart data: How to structure your analysis
Date: 09/03/2017

By Pankit Dhawan

Big data is often seen as an enabler of sustained competitive advantage. The challenge most organisations face is building a lasting framework that unlocks the power of big data and can continually drive strategic initiatives with confidence. Here are some steps for going about this. Treat them as a general set of guidelines to adapt with your own knowledge and industry experience.

Find out what Big Data can answer for you

  • Identify challenges: A good starting point for understanding how a big data strategy can help is to evaluate and extract the key challenges your organisation is facing. These challenges give direction to your evaluation of data assets, pointing you to the ones that are relevant and could add value towards a solution to the defined problem.
  • Identify key data assets: Once you know what you are trying to solve, the natural next step is to identify the data assets, internal or external to the organisation, that you judge to have the potential to solve the problem at hand.

The key in any kind of data analysis is to identify potential relationships between data sets, and to look for the cause and effect behind a correlation between two seemingly unrelated assets. An example would be an insight like “Every one-degree rise in temperature is associated with a drop in public transport usage”. Once you have an insight like this you can drill into the variables and look for the cause; possibly the expansion of train tracks on hot days has something to do with it.

To break it down, this insight would have been derived from weather data and public transport data. The two are seemingly unrelated, but they might share an underlying correlation that gives you the edge you are trying to extract.
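As a minimal sketch of how such a relationship could be checked, the snippet below computes a Pearson correlation coefficient between two series. The function name `pearson` and both data series are made up for illustration; they are not real weather or transport figures. A strongly negative value would only flag the pairing as worth investigating, it would not establish the cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only (assumption, not real data).
daily_max_temp_c = [18, 21, 24, 27, 30, 33, 36]
train_trips_thousands = [410, 405, 398, 380, 365, 340, 322]

r = pearson(daily_max_temp_c, train_trips_thousands)
print(f"correlation between temperature and trips: {r:.2f}")
```

A value close to -1 means usage tends to fall as temperature rises, which is the cue to drill into the variables and look for an explanation.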

Data Lifecycle | DIKW

DIKW stands for DATA → INFORMATION → KNOWLEDGE → WISDOM, a basic but structured way of observing the transition from raw data to applied wisdom. Every step in DIKW can get quite detailed, but for a high-level view we can stick to this:


Data: This is the stage where the data assets are assembled. You choose the potential data sets and make a high-level selection of what might be relevant to the challenge you are trying to solve. You should also think slightly outside the box and bring in data sets that may correlate even though they have no obvious relation to the problem. Data at this stage is always considered raw.

Information: Information is the first step in making sense of the raw data. It consists of the preliminary observations we can make from a high-level analysis of the data assets, normally carried out only for the assets chosen for the analysis. What we ultimately try to achieve at this stage is “meaning”, essentially a layer of meta-information beyond the basic raw data. For example, the raw data is “it's daytime” and the information is “the sun rises at 5 am every morning, making it day”.

Knowledge: Information qualifies as knowledge when it becomes contextual. This is where we move into the more actionable realm with the data we started with, or at a bare minimum start thinking about the action. For example: “If I wake up at 5 am, I will be able to use most of the daylight and get more usable time out of my typical day”.

Wisdom: This is the stage of derived action, where the action is in context with the original problem and has a good chance of leading to a favourable outcome. The knowledge is that “Trying to utilise the day by waking up early could make life more efficient”, so the deduced action, or applied wisdom, could be getting into an early sleep routine to increase the likelihood of capitalising on the additional daylight gained by waking early.

Another interesting observation is that raw data and the information relayed from it are normally historical facts. As we progress towards knowledge and wisdom, however, the output becomes more relevant to the present and the immediate future.

Data Integrity is Key

Common sense would have us ask why this step is not a prerequisite before diving into analysis of any sort. The answer is simple: in the more exploratory phase, where we are experimenting with different data sets and identifying potential insights, validating data accuracy takes a lot of time, because everything discussed here is in the context of big data.

The key is to take an agile approach in which data validation is revisited as needed, as we decide which findings are worth pursuing and diving deeper into. However, this should in no way be taken as undermining the criticality of this phase: you could have the smartest algorithms and the best statisticians coming up with breakthroughs, all of which are rendered unusable if the quality of the underlying data is compromised.
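One way to picture validation "as needed" is a quick profile of only the columns behind a finding you have decided to pursue. The sketch below is an illustration under assumptions: the helper `profile_column`, the column names, the validity rules and the sample rows are all hypothetical, and the rows stand in for a much larger data set.

```python
def profile_column(rows, column, is_valid):
    """Count missing and invalid values for one column of a list of dict rows."""
    missing = 0
    invalid = 0
    for row in rows:
        value = row.get(column)
        if value is None:
            missing += 1
        elif not is_valid(value):
            invalid += 1
    return {"rows": len(rows), "missing": missing, "invalid": invalid}


# Hypothetical sample rows (assumption, not real data).
sample_rows = [
    {"date": "2017-01-10", "max_temp_c": 31.5, "train_trips": 352000},
    {"date": "2017-01-11", "max_temp_c": None, "train_trips": 348000},
    {"date": "2017-01-12", "max_temp_c": 412.0, "train_trips": 339000},
    {"date": "2017-01-13", "max_temp_c": 29.0, "train_trips": -5},
]

# Validity rules are illustrative: a plausible temperature range and
# a non-negative trip count.
temp_report = profile_column(sample_rows, "max_temp_c", lambda t: -60 <= t <= 60)
trips_report = profile_column(sample_rows, "train_trips", lambda n: n >= 0)
print(temp_report)
print(trips_report)
```

If the share of missing or invalid values in these columns is high, the finding built on them should be treated as unproven until the underlying data is fixed.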

Make it Comprehensible

Until now, the focus has been on coming up with insights and making sense of the data to shape the derived strategic initiatives. One of the more important aspects of your strategy should be communicating the discovered insights and making them easy for the broader organisation to consume. This gives you the leverage to align the efforts of different areas of your organisation around the insights that have been discovered.

This is why “analysis needs the analytics” to serve as the basis for competitive advantage. A typical scenario is a digital marketing initiative that tries to build a single view of the customer. That view should be easily accessible and flexible enough for analysing the behavioural and attitudinal aspects of the customer profile, so that it can fuel the creativity of the marketing and content teams with real-time, usable data and analytics capabilities.

Derive Actions with Awareness

At this stage, we expect to have proven confidence in the knowledge and wisdom generation capabilities of the data analysis structure that is in place. However, a vital element, the secret sauce of the entire strategy, is adding the flavour of your industry-specific expertise to the knowledge learnt from analysing the developed data assets.

As mentioned in the opening statement, big data is the enabler. It is very powerful and can create tremendous value when applied in close conjunction with an understanding of the specific industry. This ensures that the initiative never loses the context in which the power of big data is unleashed.

Data is just a heap of information that can be leveraged to extract a lot of value, but it is purposeless without the right leadership and industry experience directing the efforts, just like “a loaded M16 without a trained Marine to pull the trigger” (quoted from The Wolf of Wall Street).

Test and Scale

The final step in the data cycle is to test the hypothesis that came out of your analysis by acting on it in a real environment. There are different testing methodologies, such as designing comparative A/B tests and drawing conclusions from the results. But no matter how you test, the first few tests should be run on relatively small data sets or in more containable environments, so that risk mitigation is built into your strategy and you keep the flexibility to revisit any of the previous stages as necessary. If the results continue to be as favourable as expected, the scale can be increased in subsequent phases.

If you achieve a favourable outcome in the preliminary tests in the real environment, it will build validity and confidence in the hypothesis you originally derived from your analysis cycle.
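For a small comparative A/B test, one common way to judge whether the outcome is favourable rather than noise is a two-proportion z-test. The sketch below is one possible approach, not the only valid one; the function name and the conversion counts are hypothetical, and the normal approximation it relies on assumes reasonably large groups.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        raise ValueError("test is undefined when nobody or everybody converts")
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical small-scale test (assumption, not real results):
# A is the existing approach, B is the action derived from the analysis.
z, p_value = two_proportion_z_test(conv_a=120, n_a=2000, conv_b=156, n_b=2000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value on the contained test supports scaling up in the next phase; a large one is a prompt to revisit the earlier stages before committing more of the organisation to the action.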


DIKW applied in the right agile way can help you build a lot of expertise from correlations or internal data mining. This process can make an invaluable contribution to a deeper understanding of your business and to formulating your strategic initiatives.

My key learning from following the DIKW methodology is not to waste time at the beginning trying to get the data well organised and validated. Instead, look for preliminary patterns and identify the correlations that support your organisation's strategic goals, and the data sources that add the most value. Once you have identified those data sources, invest the time in integrating and validating them.
