The D in GIDAR Analytics: Data

Image of A GIDAR data schema

After defining clear Goals and collecting as much Information as possible (clarifying the key questions), you will have a good idea of the data you need.
Here you want to be methodical. This is the technical part: it involves getting access to data sets and databases, extracting data, building data models and cleansing data. Even if you or your team are not in charge of this part, you need to prepare a full brief for whoever is.

What data do you need?

In the previous step, Information, you should have collected the key questions to answer. They are the starting point for data collection.

Write down the questions if you haven’t already, and figure out which data you need to answer them.

Don’t be shy. Treat this initial list as a wish list and include everything you think could help answer your key questions.

How to collect and store the data 

In a spreadsheet, list all the questions, data sources and fields you need to answer the key questions you gathered.
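If you prefer to start the spreadsheet from a script, the brief can be sketched as a small CSV file. This is only an illustration: the questions, data sources, field lists and the helper name brief_to_csv are hypothetical, and the column layout is one possible choice, not a prescribed GIDAR template.

```python
import csv
import io

# Hypothetical brief rows: questions, sources and fields are illustrative only.
brief = [
    {
        "question": "Which model sells best?",
        "data_source": "Sales database",
        "fields": "Model; Price; Order ID",
    },
    {
        "question": "Who buys Jumpy Shoes?",
        "data_source": "CRM",
        "fields": "Customer ID; Age; Gender; Post Code",
    },
]


def brief_to_csv(rows):
    """Return the brief as CSV text that any spreadsheet tool can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["question", "data_source", "fields"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(brief_to_csv(brief))
```

One row per question keeps the link between each question and the data behind it visible, which helps when you hand the brief to another team.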

The next step is the data gap analysis. You can use any validated gap analysis technique for this. If you want to keep it simple, start with three checks: availability, accessibility and DQI (Data Quality Index).

  • Availability refers to the mere existence of the data: do we have competitors’ data? The result is usually binary, either yes or no.
  • Accessibility comes after confirming availability and answers the question: can we access the data? Here you might get yes, no or yes, but. For example: yes, but we cannot use specific fields due to privacy.
  • DQI, or Data Quality Index, is a measure of the quality of the data we will use and answers the question: can we rely on this data? DQI can become complicated, but I recommend that you at least make sure the data is accurate, complete and unique. If you want to know what these mean, have a look at the Introduction to Business Analytics course. And if you don’t have much time, look at the data quality charts presented here.
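The three checks above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not an official DQI formula: the gap log layout, the sample rows (including the duplicated order) and the functions completeness and uniqueness are all hypothetical. Accuracy is left out because it needs a trusted reference to compare against.

```python
# Hypothetical gap log: one entry per data source on the wish list.
gap_log = [
    {"source": "Sales database", "available": True, "accessible": "yes"},
    {"source": "CRM", "available": True, "accessible": "yes, but no private fields"},
    {"source": "Competitor sales", "available": False, "accessible": "no"},
]

# Hypothetical sample rows; the third row is a made-up duplicate with a missing price.
rows = [
    {"order_id": "1214914", "model": "Shoe Gx", "price": 76.5},
    {"order_id": "85895798", "model": "Shoe Gy", "price": 49},
    {"order_id": "85895798", "model": "Shoe Gy", "price": None},
]


def completeness(records, fields):
    """Share of field values that are filled in (not None and not empty)."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(
        1 for record in records for field in fields
        if record.get(field) not in (None, "")
    )
    return filled / total


def uniqueness(records, key):
    """Share of records that carry a distinct value for the key field."""
    if not records:
        return 0.0
    return len({record.get(key) for record in records}) / len(records)


usable_sources = [entry["source"] for entry in gap_log if entry["available"]]
print(usable_sources)
print(round(completeness(rows, ["order_id", "model", "price"]), 2))
print(round(uniqueness(rows, "order_id"), 2))
```

Scores close to 1 suggest the data is complete and free of duplicates. Where you set the threshold for "good enough" is a judgment call for each analysis.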

Examples of Data points

Examples of data for “Jumpy Shoes”:

  • Total sales of “Jumpy Shoes” by hour of the day, day, week, month and quarter of the year.
  • Total sales of “Jumpy Shoes” by model.
  • Share of total “Jumpy Shoes” sales by model.
  • Customers who bought “Jumpy Shoes”.
  • Demographics.
  • Psychographics.
  • Price history of “Jumpy Shoes”.

In a more data-friendly format:

Model   | Date       | Time   | Price | Size | Channel  | Order ID | Customer ID
Shoe Gx | 20/03/2020 | 1:23pm | 76.5  | 38   | Web      | 1214914  | Customer1
Shoe Gy | 24/03/2020 | 9:34m  | 49    | 42   | In Store | 85895798 | Customer2
Example of a table with sales data

Customer ID | Age | Gender | Post Code | Number of Purchases | Last Purchase | Customer Lifecycle Value
Customer1   | 24  | Female | 4057      | 3                   | 19/01/2020    | 146.74
Customer2   | 44  | Male   | 4914      | 1                   | 20/05/2019    | 56
Example of a table with Customer data
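Once the data is in this shape, the data points listed earlier become simple aggregations. As a rough sketch, here is how total sales and share of sales by model could be computed from the two example sales rows. The field names and the functions total_sales_by_model and share_by_model are hypothetical, and the sketch assumes each row is one sale at the listed price.

```python
from collections import defaultdict

# The two example sales rows from the table above, reduced to the fields used here.
sales = [
    {"model": "Shoe Gx", "price": 76.5, "channel": "Web", "customer_id": "Customer1"},
    {"model": "Shoe Gy", "price": 49.0, "channel": "In Store", "customer_id": "Customer2"},
]


def total_sales_by_model(rows):
    """Sum the sale price per model."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["model"]] += row["price"]
    return dict(totals)


def share_by_model(rows):
    """Each model's share of total sales, as a fraction between 0 and 1."""
    totals = total_sales_by_model(rows)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {model: total / grand_total for model, total in totals.items()}


print(total_sales_by_model(sales))
print(share_by_model(sales))
```

The same pattern works for the other breakdowns: group by hour, day or channel instead of model.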

Final Remark

By the end of this step, you should have collected and prepared the data you will use for the analysis.