Who is in control of your data?

This article describes data processing tasks in eCommerce projects and sets out criteria for assessing quality when working on such projects.

In recent years, the volume of digitized customer behavior data has increased rapidly. Previously, retailers collected only order and customer data in CRM systems and loyalty programs, but thanks to the Internet we now have information from the earlier stages of the sales funnel: how buyers choose products and how they find them.

The ability to analyze this data allows you to better understand your customer, find bottlenecks in the sales funnel and improve the quality of strategic business decisions in the company as a whole.

The abundance of data frequently creates the illusion that the business owns the data. Let’s take a look at what it means to truly own your data.

Do you collect the right data?

To start with, information about user activity must be recorded and stored in a database available to you, free from «noise» that can reduce the value of the data. You are most likely confident that this is working properly in your business, but here are the first two questions that you need to be able to answer:

  1. Do you record each block and product impression for each customer?
  2. Do you filter your employees’ visits out of user visits?

If your customers give you signals, but you do not save them — you are not in control of your data.
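
As a rough illustration of the second question, the sketch below drops events that originate from company networks before they reach your analysis. The network ranges, the event fields and the function names are all assumptions made for this example, not part of any particular analytics product.

```python
import ipaddress

# Hypothetical office/VPN ranges (documentation addresses); replace with real ones.
EMPLOYEE_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]

def is_employee_visit(event):
    """Return True when the event's IP falls inside a known employee network."""
    ip = ipaddress.ip_address(event["ip"])
    return any(ip in network for network in EMPLOYEE_NETWORKS)

def customer_events(events):
    """Keep only the events that do not come from employee networks."""
    return [event for event in events if not is_employee_visit(event)]

# Illustrative impression events; the field names are assumptions.
events = [
    {"user_id": "u1", "type": "product_impression", "product_id": "p42", "ip": "192.0.2.10"},
    {"user_id": "staff7", "type": "product_impression", "product_id": "p42", "ip": "203.0.113.5"},
    {"user_id": "u2", "type": "block_impression", "block_id": "recommended", "ip": "192.0.2.77"},
]

clean = customer_events(events)
```

IP filtering is only one possible signal; employees working from home or on mobile networks would need another marker, such as a logged-in staff account.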

Is the collected data merged?

An abundance of services and software records user actions, scattering the data across different systems:

  1. Online campaign data is stored in various advertising services.
  2. Users’ onsite or mobile app activity is collected in an online analytics system.
  3. Real sales data, including margins, order completion rates and adjustments, is processed in an ERP or CMS.

To assess the impact of your efforts on the results, for example the cost of acquiring customers against the income from those customers, you need to be able to merge and correlate the data from all these systems.

If you cannot collect all your data into one system, you are not in control of your data.
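
To make the idea of merging concrete, here is a minimal sketch that joins three small, made-up exports: advertising costs, analytics sessions and ERP orders. The field names, the campaign key used for joining and the `campaign_report` function are assumptions for illustration; real systems rarely share identifiers this neatly.

```python
from collections import defaultdict

# Illustrative exports from three separate systems (all values are made up).
ad_costs = [
    {"campaign": "spring_sale", "cost": 500.0},
    {"campaign": "brand", "cost": 200.0},
]
sessions = [
    {"session_id": "s1", "campaign": "spring_sale", "order_id": "o1"},
    {"session_id": "s2", "campaign": "spring_sale", "order_id": None},
    {"session_id": "s3", "campaign": "brand", "order_id": "o2"},
    {"session_id": "s4", "campaign": "spring_sale", "order_id": "o3"},
]
erp_orders = {
    "o1": {"margin": 120.0, "completed": True},
    "o2": {"margin": 90.0, "completed": False},  # order was not completed
    "o3": {"margin": 300.0, "completed": True},
}

def campaign_report(ad_costs, sessions, erp_orders):
    """Combine cost, session and order data into one record per campaign."""
    report = defaultdict(lambda: {"cost": 0.0, "orders": 0, "margin": 0.0})
    for row in ad_costs:
        report[row["campaign"]]["cost"] += row["cost"]
    for session in sessions:
        order = erp_orders.get(session["order_id"])
        # Count only orders that the ERP marks as completed.
        if order and order["completed"]:
            report[session["campaign"]]["orders"] += 1
            report[session["campaign"]]["margin"] += order["margin"]
    for stats in report.values():
        stats["cost_per_order"] = (
            stats["cost"] / stats["orders"] if stats["orders"] else None
        )
    return dict(report)

report = campaign_report(ad_costs, sessions, erp_orders)
```

In this toy data the "brand" campaign has spend but no completed orders, which is exactly the kind of finding that stays hidden while the three sources live apart.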

Can you process the data you collect?

Now that you have collected terabytes of your beloved data, there is just one thing left to do: process it. Most likely it is too much for Excel or a standard database to handle, so you will need specialized software that can crunch vast amounts of data.

Fortunately, the market has several customized solutions to choose from, but do they let you keep control over your data? The two questions you need to ask are:

  1. Can you process the data in just a few seconds, regardless of size?
  2. Can you run ad-hoc queries against the data, not just pre-defined reports?

If you need to wait several hours for your reports during your peak sales season, and the structure of the reports is pre-defined, your data is useless. You don’t control your data; your data controls you.
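
The difference between a pre-defined report and an ad-hoc query can be shown with a small sketch. It uses an in-memory SQLite database purely as a local stand-in for a real data warehouse; the `orders` table, its columns and the `run_adhoc` helper are assumptions for this example.

```python
import sqlite3

# In-memory SQLite stands in for a real warehouse; the schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id TEXT, device TEXT, city TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("o1", "mobile", "Berlin", 40.0),
        ("o2", "desktop", "Berlin", 120.0),
        ("o3", "mobile", "Paris", 75.0),
        ("o4", "mobile", "Berlin", 60.0),
    ],
)

def run_adhoc(conn, sql, params=()):
    """Run any SQL the analyst writes, rather than a fixed report template."""
    return conn.execute(sql, params).fetchall()

# A question nobody planned a report for: orders and revenue by device in one city.
rows = run_adhoc(
    conn,
    "SELECT device, COUNT(*), SUM(revenue) FROM orders "
    "WHERE city = ? GROUP BY device ORDER BY device",
    ("Berlin",),
)
```

The point is not the engine but the freedom: the analyst decides the grouping and the filter at the moment the question comes up.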

Are you sure your data is safe?

Processing large amounts of data requires hardware as well as software. Hard drives fail and information can be lost. If you are not confident about the safety of your data — you are not in control of your data. Okay, you own it, but temporarily. It’s only a matter of time until you lose it.


What does it mean for a business to really be in control of their data?
It means that they collect all the necessary information, are able to merge and correlate it quickly, are able to process it regardless of size and are sure that they won’t lose it.

How can you accomplish all this?

There are several different ways to take control of your data. I want to give you four main reasons why we use Google BigQuery to do this.

1. Applications like it

Most requests for data processing are made by programs, not people. There is a great set of SDKs for various programming languages, support for ODBC drivers and the ability to work with Google BigQuery from the command line.
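
As a hedged sketch of what a programmatic request might look like, the example below uses the `google-cloud-bigquery` Python client with a named query parameter. The table id `my_project.shop.orders`, its `created_at` column and both function names are hypothetical; running the query also assumes the library is installed and Google Cloud credentials are configured.

```python
def build_daily_orders_query(table):
    """Return SQL with a named @since parameter; `table` is a hypothetical table id."""
    return (
        "SELECT DATE(created_at) AS order_day, COUNT(*) AS orders "
        f"FROM `{table}` "
        "WHERE created_at >= @since "
        "GROUP BY order_day ORDER BY order_day"
    )

def fetch_daily_orders(table, since):
    """Run the query in BigQuery; `since` is a timezone-aware datetime."""
    # Imported here so the query builder can be used without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # relies on application default credentials
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("since", "TIMESTAMP", since),
        ]
    )
    rows = client.query(build_daily_orders_query(table), job_config=job_config).result()
    return [(row["order_day"], row["orders"]) for row in rows]

# Building the SQL needs no network access; the table id is an assumption.
sql = build_daily_orders_query("my_project.shop.orders")
```

Passing the date as a query parameter instead of formatting it into the SQL string keeps the query safe to reuse from application code.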

2. IT departments love it

This shouldn’t be surprising, because there is no need to maintain servers, provide parallelized computing, create indexes or deal with backups.

3. Management loves it

Google BigQuery can process very large amounts of data in a matter of seconds, and your IT department will never ask for more servers. You also don’t need to buy expensive licenses or enter into long-term contracts.

4. The future loves it

In today’s digital world you never know exactly what questions you will need to answer in the future or what services you will need. But with Google BigQuery you can be confident that your data will stay under your control and you will always be able to process it or transfer it into an external service.