dataLayer and recursive merge

dataLayer is a simple JavaScript array. Push an object, read it in a Google Tag Manager (GTM) variable, done. Nothing complicated about that.

Right.

If you’re designing dataLayer structures and haven’t heard of recursive merge yet, consider whether you want to keep reading. You’ll sleep worse.

dataLayer vs. GTM data model — not the same thing

Before we jump into examples, let’s clarify one fundamental thing. In dataLayer discussions, two terms are commonly confused:

  • dataLayer — a JavaScript array (Array) you push objects into. It lives in the browser; you can inspect it in the console.
  • GTM data model (internal state) — an object GTM maintains internally. When you create a Data Layer Variable in GTM, you’re reading from this internal model — not directly from the dataLayer array.

Here’s where it gets fun. When you call dataLayer.push(), GTM takes your object and merges it into its internal data model. And that merge isn’t a simple overwrite. It’s a recursive merge.
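To see the difference for yourself, you can compare the array with what the container's internal model returns. The sketch below assumes that the loaded container exposes google_tag_manager[containerId].dataLayer.get(key) on the page. That interface is not officially documented, so treat it as a debugging aid rather than a stable API. The helper name readModelValue and the container ID are placeholders.

```javascript
// Debugging sketch: read a key from GTM's internal data model.
// Assumption: the container exposes google_tag_manager[containerId].dataLayer.get(key).
// This interface is not officially documented and may change.
function readModelValue(win, containerId, key) {
  const container = win.google_tag_manager && win.google_tag_manager[containerId];
  if (!container || !container.dataLayer || typeof container.dataLayer.get !== 'function') {
    return undefined; // GTM not loaded, or the interface is not available
  }
  return container.dataLayer.get(key);
}

// In the browser console (the container ID is a placeholder):
//   window.dataLayer                             // every pushed object, in order
//   readModelValue(window, 'GTM-XXXXXXX', 'x')   // the merged value that variables see
```

The array keeps each pushed object as it was pushed, while the second call returns the current merged state for a single key.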

What is recursive merge

Recursive merge (deep merging) means GTM walks through the object structure level by level, merging values. Primitive values (string, number, boolean) get overwritten. But nested objects aren’t replaced as a whole — GTM descends into them and merges individual keys.
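The description above can be turned into a few lines of JavaScript. This is a simplified sketch of the idea, not GTM's actual source code. The names isPlainObject, recursiveMerge and model are chosen for illustration only.

```javascript
// Simplified sketch of a recursive merge; not GTM's actual source code.
function isPlainObject(value) {
  return Object.prototype.toString.call(value) === '[object Object]';
}

function recursiveMerge(from, to) {
  for (const key of Object.keys(from)) {
    const value = from[key];
    if (isPlainObject(value)) {
      // Nested object: descend instead of replacing the whole value.
      if (!isPlainObject(to[key])) to[key] = {};
      recursiveMerge(value, to[key]);
    } else if (Array.isArray(value)) {
      // Arrays are walked index by index, like objects with numeric keys.
      if (!Array.isArray(to[key])) to[key] = [];
      recursiveMerge(value, to[key]);
    } else {
      // Primitives (and null) overwrite the previous value.
      to[key] = value;
    }
  }
  return to;
}

// Hypothetical stand-in for GTM's internal data model.
const model = {};
recursiveMerge({ page: { type: 'product' } }, model);
recursiveMerge({ page: { lang: 'en' } }, model);
console.log(model); // { page: { type: 'product', lang: 'en' } }
```

The second call did not replace page. It only added the lang key next to the existing type key.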

Sounds harmless. Let’s see what it means in practice.

dataLayer etudes

I’ve prepared several scenarios. For each one, try to answer first — then read the solution.

Etude 1: Simple overwrite

What do I get in the data model for variable x?

dataLayer.push({'x': 1});
dataLayer.push({'x': 2});

x = 2

No surprises. A primitive value simply gets overwritten. This works exactly as you’d expect.

Etude 2: Nested object

What do I get in the data model for variable x?

dataLayer.push({'x': {'a': 1} });
dataLayer.push({'x': {'b': 2} });

x = {'a': 1, 'b': 2}

Here’s where it starts. The second push didn’t overwrite the entire object x. GTM descended into it and added key b next to the existing key a. The result is a merged object. If you expected {'b': 2}, you’re not alone. But recursive merge works differently: it walks the structure and merges instead of overwriting.

Etude 3: Array

What do I get in the data model for variable x?

dataLayer.push({'x': [1, 2, 3] });
dataLayer.push({'x': [4, 5] });

x = [4, 5, 3]

The fun begins.

In recursive merge, arrays behave like objects with numeric keys. Index 0 gets overwritten to 4, index 1 to 5, and index 2 stays 3 because the second push had no third element. The result is neither [4, 5] (overwrite) nor [1, 2, 3, 4, 5] (concat). It’s an index-by-index hybrid that few developers expect.
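The "objects with numeric keys" view can be reproduced in plain JavaScript. Object.assign copies an array's own enumerable properties, which are its indices, onto the target. This is only an analogy for the merge result, not the code GTM runs. The variable names are illustrative.

```javascript
// Analogy only: Object.assign copies indices 0 and 1 and leaves index 2 untouched.
const previous = [1, 2, 3];
const incoming = [4, 5];
const merged = Object.assign(previous.slice(), incoming);
console.log(merged); // [ 4, 5, 3 ]
```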

Watch out.

Etude 4: Product arrays

What do I get in the data model for variable x?

dataLayer.push({'x': [{'id': 1, 'name': 'Product 1'}] });
dataLayer.push({'x': [{'id': 2}] });

x = [{'id': 2, 'name': 'Product 1'}]

This is the most practical etude. Recursive merge descended into the array (index 0), found an object inside — and merged it again. Result: the product with id: 2 inherited name: 'Product 1' from the previous push. A frankenstein — an object describing a product that doesn’t exist. In e-commerce tracking, this is a nightmare. If you push events without clearing the data model, you might see a product with another product’s price or with the name of a previous cart item in your Google Analytics 4 (GA4) reports. And since it doesn’t throw an error, you might not notice for weeks — until the data stops making sense.

Etude 5: Object + _clear

What do I get in the data model for variable x?

dataLayer.push({'x': {'a': 1} });
dataLayer.push({'x': {'b': 2}, _clear: true });

x = {'b': 2}

The _clear: true flag tells GTM: write the keys in this push as a whole, don’t do recursive merge. So x doesn’t get added to the existing {'a': 1} — it gets completely overwritten to {'b': 2}. Key a is gone.

Etude 6: _clear and a different key

What do I get in the data model for variable x?

dataLayer.push({'x': {'a': 1} });
dataLayer.push({'y': {'b': 2}, _clear: true });

x = {'a': 1}

_clear: true only resets the root keys present in the given push. The second push only writes y — it doesn’t touch x. The value of x remains {'a': 1}. This is important to understand: _clear is not a global reset of the entire data model. It only resets what you’re pushing.

Etude 7: Array + _clear

What do I get in the data model for variable x?

dataLayer.push({'x': [1, 2, 3] });
dataLayer.push({'x': [4, 5], _clear: true });

x = [4, 5]

With _clear: true, the array behaves predictably — key x gets overwritten as a whole. No hybrid index merging.

What this means in practice

Data accumulates in the data model

If I push e-commerce data for product A and then push data for product B, both remain in the data model. Nested objects get merged, not overwritten.

With _clear: true you can replace individual root keys of the data model as a whole, without touching the rest.

Scientific note

It’s like the Ship of Theseus — if each push replaces a part of the object, is it still the same object?

Mmm…

Timing of data reads matters

On single page applications, this is a critical problem. The user navigates from a product page to a category, but the product data is still in the data model. If a tag reads a variable whose value should no longer be there, you’re sending nonsense to GA4.

On traditional websites, this is less of an issue (the page reloads, dataLayer gets recreated), but it can still hurt. Typical example: a user adds a product to the cart, then removes it. If you push the removal as a new object without clearing, your GA4 revenue data won’t add up — because the data model retains remnants of previous pushes.
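One way to handle the single page application case is to reset page-scoped keys on every route change. The sketch below is a minimal example under stated assumptions: the helper name resetPageScopedKeys and the key names product and category are hypothetical, and the dataLayer array is created locally so the snippet runs on its own. Pushing null works as a reset because null is a plain value that overwrites the previous one instead of being merged into it.

```javascript
// Hypothetical helper: reset page-scoped keys when the route changes.
// The key names below are examples, not a GTM convention.
const PAGE_SCOPED_KEYS = ['product', 'category'];

function resetPageScopedKeys(dataLayer, keys = PAGE_SCOPED_KEYS) {
  const reset = {};
  for (const key of keys) reset[key] = null; // null overwrites the previous value
  dataLayer.push(reset);
}

// Usage on a virtual page view.
// In the browser you would use window.dataLayer instead of a local array.
const dataLayer = [];
dataLayer.push({ event: 'page_view', product: { id: 1, name: 'Product 1' } });
resetPageScopedKeys(dataLayer);
dataLayer.push({ event: 'page_view', category: { name: 'Shoes' } });
```

After the reset, a variable reading product returns null on the category page instead of the previous product.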

Solutions exist, but none is universal

There’s no silver bullet. Each approach has trade-offs, and you need to think it through so the whole solution makes sense in the context of your website:

  1. Clearing on the developer side — Developers explicitly reset relevant keys before a new push. Reliable, but requires discipline and documentation. Any oversight = silent data error.
  2. Selective reading in GTM — You don’t read a variable “generally,” but bind it to a specific event. The trigger fires only on the right push, so you read data at the moment it’s current. Works well, but configuration can get complex.
  3. _clear: true — Resets the root keys in the given push — overwrites them as a whole instead of recursive merge. Safe against accumulation within a single key. But note: it doesn’t reset other root keys. If you need to clear other keys too, you must explicitly set them to null or include them in the push.
  4. dataLayer picker – a GTM template that reads data only from the dataLayer push that triggered the event. Simo Ahava has written about the dataLayer picker in more detail.
    Note: thanks to Marek Lecián for adding this option
  5. Combination — In practice, you’ll usually end up combining approaches. On my projects, I typically recommend _clear: true for e-commerce events (add_to_cart, purchase) and selective reading for the rest.
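The recommendation in point 5 could be wrapped in a small helper so that nobody forgets the flag. This is a sketch, not a prescribed implementation: the function name pushEcommerceEvent and the item values are hypothetical, and the dataLayer array is created locally so the snippet runs on its own.

```javascript
// Hypothetical wrapper: every e-commerce push replaces its root keys as a whole.
function pushEcommerceEvent(dataLayer, eventName, ecommerce) {
  dataLayer.push({
    event: eventName,
    ecommerce: ecommerce,
    _clear: true, // overwrite the root keys in this push instead of merging them
  });
}

// In the browser you would pass window.dataLayer instead of a local array.
const dataLayer = [];
pushEcommerceEvent(dataLayer, 'add_to_cart', {
  items: [{ item_id: 'SKU_1', item_name: 'Product 1' }],
});
pushEcommerceEvent(dataLayer, 'add_to_cart', {
  items: [{ item_id: 'SKU_2' }],
});
```

With the flag on every e-commerce push, the second item should not inherit item_name from the first one, which is the situation from etude 4.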

Watch out for arrays

Etude 3 shows why arrays in dataLayer are treacherous. If you push arrays (like a list of products in a cart), recursive merge can break them in ways that are hard to debug. I recommend:

  • Always push arrays with _clear: true, or
  • reset the key to null in a separate push before pushing the new array, so there is nothing left to merge into, or
  • use a dedicated tag in GTM for each event type.

How this relates to data quality

Recursive merge is one of those mechanisms you need to understand very well to measure correct data. Without this knowledge, you’ll see strange values in GA4 reports: revenue doesn’t add up, products get duplicated, events carry data from previous interactions. This is exactly the type of bug I presented at MeasureCamp: silent, no errors, but with real impact on data.

If you’re working on data quality and measurement monitoring, this is exactly the kind of problem you should have on your radar. It’s not a code error that crashes with a stack trace. It’s a data error that quietly distorts your decision-making.

And if your company plans to use AI on analytics data, you need a data infrastructure that’s ready for it. The recursive merge problem in dataLayer is exactly the type of “detail” that makes the difference between data you can train a model on and data that teaches it nonsense.

Summary

  • The dataLayer array and the GTM data model are two different things.
  • GTM performs recursive merge — nested objects get merged, not overwritten.
  • Arrays get merged by index, leading to unexpected results.
  • _clear: true resets root keys in the given push — it prevents recursive merge but doesn’t touch other keys.
  • On SPAs, this is critical. On traditional websites, it hurts with events and e-commerce.
  • Design your dataLayer architecture upfront — fixing it retroactively is expensive.

If you’re not sure whether recursive merge is affecting your data, get in touch — we’ll review your dataLayer architecture.