Hate cookie banners? Want more privacy? Me too!

Metrics without cookies are possible (and easy), and trends and targeting can be done with privacy in mind. So let's do it.

Remember the good old days?

Back when the internet was young, you had hit counters, animated flames, under construction notices and so many GIFs and tables. Good times.

You can revisit those good times via http://www.oocities.org or even download the archive via BitTorrent.

But now we have GDPR pop-ups, CCPA, data leaks, scary big brother and all sorts of nasty hackers etc. Oh, how we pine for the simpler times.

Or not. It turns out being able to measure your audience, A/B test, segment and otherwise understand how your app/website/tool is being used is extremely useful.

But can you get all the wonderful measurement without impacting privacy and setting yourself up as a juicy hacking target? And most importantly, can you rid the internet of another cookie banner?

YES. YES YOU CAN!

Google Analytics without cookies?

Google Analytics is great: it's free, it works, and it's extremely powerful, which is why so many people (me included) use it. But it sets a cookie from one of the largest advertisers in the world (if not THE largest), Google. Does it have to?

I don't want to track people around the Internet, I just want to know if people read my blog. Ideally, what pages, how long they read for etc. But can you do this without cookies?

Surprisingly, yes, you can make Google Analytics work without the cookies, and the implementation is actually really easy, as outlined in this excellent blog post: https://helgeklein.com/blog/2020/06/google-analytics-cookieless-tracking-without-gdpr-consent/#other-platforms

<script>
const cyrb53 = function(str, seed = 0) {
   let h1 = 0xdeadbeef ^ seed,
      h2 = 0x41c6ce57 ^ seed;
   for (let i = 0, ch; i < str.length; i++) {
      ch = str.charCodeAt(i);
      h1 = Math.imul(h1 ^ ch, 2654435761);
      h2 = Math.imul(h2 ^ ch, 1597334677);
   }
   h1 = Math.imul(h1 ^ h1 >>> 16, 2246822507) ^ Math.imul(h2 ^ h2 >>> 13, 3266489909);
   h2 = Math.imul(h2 ^ h2 >>> 16, 2246822507) ^ Math.imul(h1 ^ h1 >>> 13, 3266489909);
   return 4294967296 * (2097151 & h2) + (h1 >>> 0);
};

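// The server fills in the visitor's IP address here (PHP interpolation).
let clientIP = "{$_SERVER['REMOTE_ADDR']}";
// Changes every 4 days, so the resulting client ID is only valid for that long.
let validityInterval = Math.round(new Date() / 1000 / 3600 / 24 / 4);
// Combine IP, host, browser and language into one string, then hash it into a client ID.
let clientIDSource = clientIP + ";" + window.location.host + ";" + navigator.userAgent + ";" + navigator.language + ";" + validityInterval;
let clientIDHashed = cyrb53(clientIDSource).toString(16);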

(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');

ga('create', 'YOUR-GA-TRACKING-CODE', {
   'storage': 'none',
   'clientId': clientIDHashed
});
ga('set', 'anonymizeIp', true);
ga('send', 'pageview');
</script>

I've implemented it on this site already and so far, it seems to work exactly as you would hope/expect.
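
To see why the hashed client ID is not a permanent identifier, here is a minimal sketch that rebuilds the same ID from explicit inputs. The helper names (`validityInterval`, `buildClientId`) and the sample visitor values are made up for illustration; only `cyrb53` and the 4-day window come from the snippet above.

```javascript
// Same hash function as in the snippet above, repeated so this sketch runs on its own.
const cyrb53 = function (str, seed = 0) {
   let h1 = 0xdeadbeef ^ seed,
      h2 = 0x41c6ce57 ^ seed;
   for (let i = 0, ch; i < str.length; i++) {
      ch = str.charCodeAt(i);
      h1 = Math.imul(h1 ^ ch, 2654435761);
      h2 = Math.imul(h2 ^ ch, 1597334677);
   }
   h1 = Math.imul(h1 ^ h1 >>> 16, 2246822507) ^ Math.imul(h2 ^ h2 >>> 13, 3266489909);
   h2 = Math.imul(h2 ^ h2 >>> 16, 2246822507) ^ Math.imul(h1 ^ h1 >>> 13, 3266489909);
   return 4294967296 * (2097151 & h2) + (h1 >>> 0);
};

const WINDOW_DAYS = 4; // matches the "/ 4" in the snippet above

// Hypothetical helper: the number of the validity window that `date` falls into.
function validityInterval(date) {
   return Math.round(date.getTime() / 1000 / 3600 / 24 / WINDOW_DAYS);
}

// Hypothetical helper: builds the hashed client ID from explicit inputs
// instead of reading them from the browser.
function buildClientId({ ip, host, userAgent, language }, date) {
   const source = [ip, host, userAgent, language, validityInterval(date)].join(";");
   return cyrb53(source).toString(16);
}

// Made-up visitor, using a documentation-only IP address.
const visitor = {
   ip: "203.0.113.7",
   host: "example.com",
   userAgent: "ExampleBrowser/1.0",
   language: "en-GB",
};

const dayOne = buildClientId(visitor, new Date("2024-01-01T12:00:00Z"));
const dayTwo = buildClientId(visitor, new Date("2024-01-02T12:00:00Z"));
const weekLater = buildClientId(visitor, new Date("2024-01-08T12:00:00Z"));

console.log(dayOne === dayTwo);    // same window, same ID
console.log(dayOne === weekLater); // different window, different ID
```

With these dates the first two visits fall in the same window and share an ID, while the visit a week later gets a new one. So a returning reader is only recognised for a few days, and nothing is stored in the browser.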

What about not using Google at all?

Good question. At the end of https://helgeklein.com/blog/2020/06/google-analytics-cookieless-tracking-without-gdpr-consent the author covers a few solutions: some SaaS, some self-hosted, and some really clever, like https://adwise.ch/blog/minimal-cookieless-web-analytics/ which uses serverless Azure functions.

As a personal blog, I didn't want to spend money on analytics - but I did still want them. I also didn't want to spend much time/effort setting it up, so while interesting (and likely free), the Azure serverless functions just seemed like more effort than I wanted to invest right now.

What about your CDN?

I use Cloudflare (I detail how I set things up here: https://www.acmconsulting.eu/post/hello-world/), mostly to provide an SSL certificate (https://shyr.io/blog/free-ssl-for-github-pages-with-custom-domains), but it turns out they also offer free, privacy-first web analytics (https://www.cloudflare.com/en-gb/web-analytics/).

CDN-based analytics is nice, as it doesn't need to drop a cookie. But the free version only reports at the site level; to get more you have to pay. Hence why I am still using Google Analytics.

A battle is won, what about the war?

No cookie banner on this website, excellent! One down, countless to go.

The bigger picture is considering why GDPR and rules like it exist in the first place:

This (...GDPR) should make it easier for EU citizens to understand how their data is being used, and also raise any complaints, even if they are not in the country where its located
https://www.privacytrust.com/gdpr/whats-the-real-purpose-of-the-gdpr.html

This sounds reasonable to me.

Many people (myself included) are concerned about what data is collected about them and how that data is used/stored. While GDPR and laws like it may have helped, I don't think they have won the war.

Sometimes, I accept I am providing data and attention as payment for a product, most often when using advertising funded platforms - and I'm OK with that trade, assuming it's done with respect and not misused. But that's not always the case.

Good adverts, like good content recommendations, are actually better for me and enhance my experience while also offering the best results for everyone in the consensus triangle. Bad adverts are a waste of time, money and energy for everyone.

Targeting, tracking and measurement are all tools. If used well, and for good, they can result in better outcomes for everyone - but it has to be done right.

A promising next step

If we do want to track, target and measure, we need to do so in a way that ensures the rights, security and privacy of the people providing the data. Thankfully, some very clever people are working on a system to do exactly that - it's called differential privacy:

Differential privacy is a system for publicly sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals in the dataset.
Differential privacy - Wikipedia

This is really clever, and it could (perhaps should) be the gold standard for metrics and measurement. It may end up doing what GDPR could not: providing a way for businesses to collect data and understand their customers as trends while preserving the privacy of the individual.
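
To make the idea concrete, here is a minimal sketch of one common differential privacy technique, the Laplace mechanism, applied to page-view counts. It is an illustration rather than a production-grade implementation: the helper names, the page paths and the `epsilon` value are made up, and it assumes each visitor adds at most 1 to any single count.

```javascript
// Draw one sample of Laplace noise with the given scale.
// `rng` must return a number in [0, 1); Math.random is the default.
function laplaceNoise(scale, rng = Math.random) {
   const u = rng() - 0.5; // uniform in [-0.5, 0.5)
   // Guard against log(0) when u is exactly -0.5.
   const tail = Math.max(1 - 2 * Math.abs(u), Number.MIN_VALUE);
   return -scale * Math.sign(u) * Math.log(tail);
}

// Hypothetical helper: report a count with noise added.
// Assumes one visitor changes the count by at most 1, so the noise scale is 1 / epsilon.
function privateCount(trueCount, epsilon, rng = Math.random) {
   return trueCount + laplaceNoise(1 / epsilon, rng);
}

// Made-up page-view counts for illustration.
const pageViews = { "/post/a/": 120, "/post/b/": 45 };
const epsilon = 0.5; // smaller epsilon means more noise and stronger privacy

for (const [page, count] of Object.entries(pageViews)) {
   console.log(page, Math.round(privateCount(count, epsilon)));
}
```

The published numbers stay close enough to the real ones to show which pages are popular, but the random noise means nobody can tell from the output whether one particular person visited.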

A good explainer which goes into more detail is available here:

Explainer: what is differential privacy and how can it protect your data?
How should privacy be protected in a world where data is gathered and shared with increasing speed and ingenuity? Differential privacy, a new model of cyber security, provides a potential solution.

For me, the most exciting part of this is that differential privacy looks like it should work with AI/Machine Learning.

Ai Differential Privacy And Federated Learning
(Source: https://ai.googleblog.com/2017/04/federated-learning-collaborative.html)

Microsoft with LinkedIn is leading the way on implementing differential privacy and I follow their progress with interest.

LinkedIn’s Audience Engagements API: A Privacy Preserving Data Analytics System at Scale
We present a privacy system that leverages differential privacy to protect LinkedIn members’ data while also providing audience engagement insights to enable marketing analytics related applications. We detail the differentially private algorithms and other privacy safeguards used to provide results…

A better future is possible, and if you are reading this, it's likely you are in a position to promote solutions like this, pushing for their implementation and delivery.

Be the change you want to see in your industry.