Seller-Defined Audiences Analysis Series – Part I: SDA Introduction & Initial Insights

Have you heard of Seller-Defined Audiences (SDA) yet? SDA was released at the beginning of 2022 by Project Rearc – an initiative that came to life as the IAB Tech Lab’s answer to the impending demise of 3rd-party cookies. Although a year hasn’t passed since its release, SDA has received significant attention in the press and industry conferences and is considered one of the go-to solutions for targeting when the 3rd-party cookie pond dries up.

Key takeaways from this article:
  • Cookie sync web traffic brought around 135B SDA-enriched bid requests from 20 SSPs within 2 weeks, most of which came from the US and Germany.
  • Around 33% of the received contextual SDA signals and 75% of the received user SDA signals were non-compliant (missing key information).
  • The most popular segment, across both types of SDA signals, was “Food & Drink”.
  • IAB Tech Lab’s taxonomies are the most popular across Seller-Defined Audiences signals, while Topics API’s taxonomy left only a marginal footprint.
  • The Data Transparency Standard (DTS) repository has minimal information.
  • IAB Tech Lab’s annual certification leaves the buy-side skeptical about continuous quality assurance.
  • User SDA signals were observed being passed in parallel with other user IDs, which contradicts the approach advised in the specification.
  • In its current shape, SDA enhances fingerprinting techniques.

What exactly are Seller-Defined Audiences?

It is a specification that describes how publishers (or their vendors) can provide meaningful information about their inventory and audiences to downstream buyers within OpenRTB in a standardized manner. SDA signals – the specification’s main product – are supposed to be attached to bid requests coming from interested publishers. There are 2 types of SDA signals, each conveying a different type of data:

  • Contextual SDA signals – placed in site.content.data of the bid request, they assign topics that are supposed to reflect the website’s content.

Figure 1
  • User SDA signals – placed in user.data of the bid request, they assign interests to users according to their historical behavior on the publisher’s domains.

Figure 2

Please note that a single bid request can include multiple SDA signals of different and/or the same type.
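To make the two locations concrete, here is a minimal Python sketch that pulls both kinds of signals out of a bid request represented as a dictionary. The helper name extract_sda_signals, the example domains, and all IDs are made up for illustration. The field layout (name, ext.segtax, segment[].id) reflects our reading of how the specification maps onto OpenRTB and should be checked against the current version of the spec.

```python
from typing import Any


def extract_sda_signals(bid_request: dict[str, Any]) -> dict[str, list[dict]]:
    """Return the Data objects found in the two SDA locations of a bid request."""
    site = bid_request.get("site") or {}
    content = site.get("content") or {}
    user = bid_request.get("user") or {}
    return {
        "contextual": list(content.get("data") or []),  # site.content.data
        "user": list(user.get("data") or []),           # user.data
    }


# Made-up bid request fragment: providers, segtax values and segment IDs are placeholders.
example_request = {
    "site": {"content": {"data": [
        {"name": "publisher.example", "ext": {"segtax": 6}, "segment": [{"id": "210"}]},
        {"name": "vendor.example", "ext": {"segtax": 6}, "segment": [{"id": "215"}]},
    ]}},
    "user": {"data": [
        {"name": "publisher.example", "ext": {"segtax": 4}, "segment": [{"id": "687"}]},
    ]},
}

signals = extract_sda_signals(example_request)
print(len(signals["contextual"]), len(signals["user"]))  # 2 1
```

The example request carries two contextual signals and one user signal, which is the "multiple signals of different and/or the same type" case described above.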

The IAB Tech Lab’s specification suggests that a compliant SDA signal includes 3 vital elements: 

  • Provider’s name – names the entity which produced a signal for the publisher (this could be the publisher itself or its technological partner)

  • Segment – specifies the topic of the content’s focus or user’s interest

  • Taxonomy – refers to the taxonomy that decodes the listed segment(s). Taxonomies are often divided into tiers, which represent different levels of data granularity (e.g. a segment from Tier 1 is “Food & Drink”, from Tier 2 is “Pies”, and from Tier 3 is “Cherry Pies”)
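The three elements above translate into a simple completeness check. The sketch below is not the method used for the analysis in this article. It assumes that the provider sits in the Data object's name field, the taxonomy in ext.segtax, and the segments in segment[].id, which is our understanding of the specification. The function names and sample values are hypothetical.

```python
def missing_elements(signal: dict) -> list[str]:
    """List which of the three required elements a single SDA signal lacks."""
    missing = []
    if not signal.get("name"):                      # provider's name
        missing.append("provider")
    ext = signal.get("ext") or {}
    if ext.get("segtax") is None:                   # taxonomy identifier
        missing.append("taxonomy")
    segments = signal.get("segment") or []
    if not any(seg.get("id") for seg in segments):  # at least one segment ID
        missing.append("segment")
    return missing


def is_compliant(signal: dict) -> bool:
    """A signal is treated as compliant when nothing is missing."""
    return not missing_elements(signal)


# Placeholder signals: one complete, one without taxonomy and segments.
complete = {"name": "publisher.example", "ext": {"segtax": 6}, "segment": [{"id": "210"}]}
partial = {"name": "publisher.example", "segment": []}

print(missing_elements(complete))  # []
print(missing_elements(partial))   # ['taxonomy', 'segment']
```

Returning the list of missing elements, rather than only a boolean, makes it possible to report which element is responsible for a failed check.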

Benefits of using Seller-Defined Audiences

So what is it about Seller-Defined Audiences that makes publishers so excited about the concept? The answer is: with Seller-Defined Audiences, publishers enjoy having an influence over their inventory and audience labeling. It allows publishers to:

  • independently decide what their articles are about and mark the relevant topics using the site.content.data field,

  • assign interests to users according to their behavior on the publishers’ premises using their preferred methodology (e.g. after visiting articles with the “cooking” contextual signal twice within 3 days, the user gets labeled as a “cooking” enthusiast) and mark the users’ interests in the user.data field,

  • use standardized taxonomies to achieve a better scale (e.g. IAB Tech Lab’s Content Category Taxonomy for contextual SDA signals and IAB Tech Lab’s Audience Taxonomy for user SDA signals) or their own taxonomies, which could be decoded only by the desired business partners.
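As an illustration of the second point, the “twice within 3 days” rule from the example can be sketched in a few lines of Python. This is only one possible methodology. The function name qualifies_for_interest and its parameters are hypothetical, and a publisher is free to pick any other thresholds or logic.

```python
from datetime import datetime, timedelta


def qualifies_for_interest(visit_times, min_visits=2, window=timedelta(days=3)):
    """Return True if at least `min_visits` visits fall within any `window`-long span."""
    times = sorted(visit_times)
    # Compare each visit with the one (min_visits - 1) positions later.
    for i in range(len(times) - min_visits + 1):
        if times[i + min_visits - 1] - times[i] <= window:
            return True
    return False


# Visits to articles carrying the "cooking" contextual signal (made-up timestamps).
close_visits = [datetime(2022, 12, 1, 10, 0), datetime(2022, 12, 3, 9, 0)]
spread_visits = [datetime(2022, 12, 1, 10, 0), datetime(2022, 12, 6, 10, 0)]

print(qualifies_for_interest(close_visits))   # True
print(qualifies_for_interest(spread_visits))  # False
```

A user who qualifies would then have the corresponding audience segment added to user.data on subsequent bid requests.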

A handful of insights

When it comes to finding a good privacy-oriented alternative for 3rd-party cookies, RTB House is actively exploring all options and Seller-Defined Audiences is considered a promising candidate. To investigate Seller-Defined Audiences, the first natural step is to start receiving the SDA signals from SSPs, collect a pile, and see what they are and where they come from. The scope for this experiment is mostly cookie sync-matched web traffic, so the actual presence of SDA signals might be wider. We currently receive SDA signals in bid requests from around 20 SSPs.

  • Scale & geographies

    As of the first two weeks of December 2022, we received around 135B bid requests globally which were enriched with at least one SDA signal. The top 3 countries were:

    • United States – 58.3B 

    • Germany – 30.1B

    • United Kingdom – 10.3B

  • Differentiation

    • Among the 135B bid requests with any SDA signal, 73B had at least 1 contextual SDA signal, 82B had at least 1 user SDA signal, and 20B bid requests had both types of signals attached.
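The three figures above are consistent with the 135B total, which can be verified with simple inclusion-exclusion arithmetic. The variable names below are ours and the values are the rounded figures quoted above, in billions of bid requests.

```python
# Rounded figures from the text, in billions of bid requests.
with_contextual = 73
with_user = 82
with_both = 20

# Requests with at least one signal of either type (inclusion-exclusion).
with_any = with_contextual + with_user - with_both
contextual_only = with_contextual - with_both
user_only = with_user - with_both

print(with_any)         # 135
print(contextual_only)  # 53
print(user_only)        # 62
```

In other words, roughly 53B requests carried only contextual signals, 62B carried only user signals, and 20B carried both.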

Figure 3 (SSPs of the same assigned number might differ for the different types of signals analyzed, i.e. “SSP 1” from user SDA is not necessarily the same entity as “SSP 1” from contextual SDA)

Glancing at the distribution, we can observe a distinctive leading provider in terms of contextual SDA signal creation, with around 60% of the total contribution. What’s particularly noteworthy is the large portion of user SDA signals with no provider specified, which makes them non-compliant (a deep-dive is provided in the next section).

  • Compliance – SSP split

    As listed above, in order for us to deem any SDA signal compliant, we need it to include 3 essential pieces of information: provider, taxonomy, and segment. As it turns out, the compliance of a signal varies greatly depending on its type (contextual or user), SSP, provider, and many other factors. Here’s a snapshot of the information on this subject (the compliance rate is the percentage of SDA signals with all the required information among all the SDA signals received of the given type):

  • Contextual SDA

    Across all SSPs, the compliance rate is around 67%. However, if we take a closer look at the top SDA senders across SSPs, we can see that the values range from 0% to 100%:

Figure 4 (SSPs nomenclature aligned with Figure 3)

All non-compliant contextual SDA signals have their provider listed but fail to specify the taxonomy (99%) and/or segment (25%).

  • User SDA

    Here, the compliance rate radically deteriorates to as little as 25% and appears to be more binary for the most active SSPs (they send either compliant or non-compliant signals):

Figure 5 (SSPs nomenclature aligned with Figure 3)

48% of the non-compliant user SDA signals lack provider information, but the main sources of error were missing segments (89%) and taxonomies (91%).
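For readers who want to reproduce this kind of split on their own traffic, a per-SSP compliance rate can be computed along the following lines. This is a minimal sketch rather than the pipeline behind the figures above. It assumes each observed signal has already been reduced to a pair of an SSP label and a compliance flag, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def compliance_rate_by_group(records):
    """Map each group (e.g. an SSP) to its compliance rate in percent.

    `records` is an iterable of (group, is_compliant) pairs, one per SDA signal.
    """
    totals = defaultdict(int)
    compliant = defaultdict(int)
    for group, ok in records:
        totals[group] += 1
        compliant[group] += bool(ok)
    return {group: 100.0 * compliant[group] / totals[group] for group in totals}


# Made-up observations: (SSP label, whether the signal had all three elements).
observations = [
    ("SSP 1", True), ("SSP 1", True),
    ("SSP 2", False),
    ("SSP 3", True), ("SSP 3", False),
]

rates = compliance_rate_by_group(observations)
print(rates)  # {'SSP 1': 100.0, 'SSP 2': 0.0, 'SSP 3': 50.0}
```

The same function works for a provider split if the provider name is used as the group key instead of the SSP.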

  • Compliance – provider split

    We observed that compliance was mainly impacted by missing segments and taxonomies but at the same time each signal had a provider for contextual SDAs (not the case for user SDAs). Let’s take a deeper look into how the split presents itself for different contextual SDA providers (top providers listed).

  • Contextual SDA

Figure 6 (SSPs nomenclature aligned with Figure 3)

As one can see, Provider 2, with its 0% compliance rate, is the main reason behind the lower overall contextual SDA compliance rate (the “Other” category has very little impact due to its size – 0.6% against Provider 2’s 18.7%). If not for Provider 2, the average would increase by around 15 percentage points to reach 82% in total.
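The 82% figure can be reproduced from the numbers already given. The short calculation below assumes that Provider 2’s 18.7% is its share of all contextual SDA signals and uses the rounded 67% overall rate, so the result is approximate.

```python
overall_rate = 0.67      # overall contextual compliance rate (rounded)
provider2_share = 0.187  # assumed: Provider 2's share of contextual SDA signals
provider2_rate = 0.0     # Provider 2's compliance rate

# Remove Provider 2's signals from both the numerator and the denominator.
rate_without_provider2 = (overall_rate - provider2_share * provider2_rate) / (1 - provider2_share)
uplift_points = (rate_without_provider2 - overall_rate) * 100

print(f"{rate_without_provider2:.1%}")             # 82.4%
print(f"{uplift_points:.1f} percentage points")    # 15.4 percentage points
```

Because Provider 2 contributes no compliant signals, excluding it leaves the compliant volume unchanged and only shrinks the total, which is why the rate rises.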

  • Most popular segments

Figure 7

When it comes to the most popular segments, it only makes sense to analyze compliant signals. On top of that, we decided to look at the most generic Tier 1 from IAB Tech Lab’s Content Taxonomies and the second most generic Tier 2 from IAB Tech Lab’s Audience Taxonomy. “Food & Drink” turned out to be by far the most popular across all types of the currently observable SDA signal segments (this was an unexpected result given the fact that food & drink-related websites don’t dominate the world wide web, or at least not to that extent). Please note that one SDA signal can carry a number of Seller-Defined Audiences segments (from the same or different tiers) in parallel.

  • Most popular leveraged taxonomies

Figure 8

The dominant taxonomies were IAB Tech Lab’s Content Taxonomy 2.0 and 2.2. The newest IAB Tech Lab’s Content Category Taxonomy 3.0 was only leveraged in 0.2% of the compliant contextual SDA signals.

Figure 9

The unquestionable leader across the compliant user SDA signals was IAB Tech Lab’s Audience Taxonomy 1.1. There was, however, one provider that created signals using the user SDA specification but leveraging Topics API’s taxonomy.

Are Seller-Defined Audiences the silver bullet for audience targeting?

No. The excitement across the sell-side is not really shared by the buy-side. DSPs are wary of how much they’ll be able to trust publishers and their providers to deliver bid request information that accurately reflects the content/audience it refers to. In other words, they are afraid that some inventory might be mislabeled in order to be considered more valuable than it really is.

Data Transparency Standard (DTS) is an element of the Seller-Defined Audiences which is aimed at putting the worries of the DSPs to bed. It’s supposed to provide standardized information, placed in metadata, including 20 fields (filled-in by publishers or their vendors) describing the SDA signal’s quality, with pieces like: data provenance, age, modeling, segmentation criteria, and comparability. These fields are subject to change in accordance with what the buyers might deem necessary and helpful in the future. The purpose of DTS is clear – being transparent about quality. This information can be accessed by any registered user but to extract data in an automated manner (via API), one has to pay an annual fee. As of now, the DTS repository (IAB Tech Lab’s Tools Portal) has marginal traction and is expected to gradually gain it over the next few years.

However valuable the information within might be, the table itself doesn’t prove anything, and that’s where the IAB Tech Lab’s certification is supposed to assist. Any publisher might apply (and pay) for an annual certification, in which case the IAB Tech Lab takes samples of the SDA signals from the publisher in question and analyzes their alignment with reality. This idea was received very skeptically by the buy-side, which suspects that publishers will be nearing perfection when the time comes to get the certification or its extension, but as the quarter comes to an end and financial goals have to be met, there’s no guarantee that the reliability of the signals will be their number one priority. This issue calls for specialized external entities to devote much of their time and resources to continuously verify the SDA signals’ reliability.

Another issue is how many factors and actors there are to consider while leveraging this new concept and analyzing its performance. If, for example, the SDA signal’s quality is doubtful, who’s to blame? Is it the publisher? Or perhaps the publisher’s intentions were intact but the provider didn’t do a good job crafting the signal? Or maybe both entities did well but the SSP didn’t pass the signal compliantly and its value disappeared? 

On top of that, there are plenty of other pieces of information to take into account, such as language, geography, browser, and many others. The buy-side needs to consider all of this when deciding on how high the bid should be, and it might be difficult to verify whether Seller-Defined Audiences brings real added value or only adds more complexity to the already over-complex ad tech landscape.

Lastly, and perhaps most importantly, SDA itself doesn’t provide any mechanisms which would enable cross-publisher frequency capping and prevent fingerprinting. While expecting the first feature might be wishful thinking, the latter is an issue the ad tech industry has been struggling with. Although the specification clearly states that “unnecessary commingling of data points (user agent information, first-party identifiers, probabilistic maps of various IDs, and encrypted user-provided IDs) should always be avoided”, there is nothing preventing that from happening. 

In fact, we have observed user SDA signals being passed along with other signals identifying the user (such as 3rd-party cookies and the others listed above). Across all user SDA signals RTB House observed in Q4 2022, only 15% were not accompanied by any other user identification (such as cookies, external IDs, or mobile identifiers). This leaves around 85% of user SDA signals at risk of being leveraged to enrich cross-site graphs with behavioral data, as well as providing an additional signal that increases the fingerprinting surface.
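A check of this kind can be sketched as follows. The article does not list the exact fields that were inspected, so the set below (user.id, user.buyeruid, extended IDs under user.eids or user.ext.eids, and device.ifa) is our assumption based on common OpenRTB usage, and the function names are hypothetical.

```python
def other_user_identifiers(bid_request: dict) -> list[str]:
    """Return the names of user-identifying fields present in the bid request."""
    user = bid_request.get("user") or {}
    device = bid_request.get("device") or {}
    found = []
    if user.get("id"):
        found.append("user.id")
    if user.get("buyeruid"):
        found.append("user.buyeruid")
    # Extended IDs may appear directly on the user object or under user.ext.
    if user.get("eids") or (user.get("ext") or {}).get("eids"):
        found.append("eids")
    if device.get("ifa"):
        found.append("device.ifa")
    return found


def is_unaccompanied_user_sda(bid_request: dict) -> bool:
    """True when user SDA data is present and no other user identifier is."""
    user = bid_request.get("user") or {}
    return bool(user.get("data")) and not other_user_identifiers(bid_request)


# Made-up requests: the first carries only user SDA, the second adds a buyer UID.
sda_signal = {"name": "publisher.example", "ext": {"segtax": 4}, "segment": [{"id": "687"}]}
sda_only = {"user": {"data": [sda_signal]}}
sda_with_id = {"user": {"data": [sda_signal], "buyeruid": "abc123"}}

print(is_unaccompanied_user_sda(sda_only))     # True
print(is_unaccompanied_user_sda(sda_with_id))  # False
```

Counting the share of requests for which is_unaccompanied_user_sda returns True would give a figure comparable to the 15% quoted above, subject to which identifier fields are included in the check.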

When a user SDA signal is used in parallel with some other common identifier, valuable publishers’ information can be used on some external websites without their control. For this reason, publishers might refrain from sending any IDs along with their SDA signals for their most valuable inventory, making sure their users’ data is not monetized on somebody else’s properties. What’s more, SDA signals can even be used to further enhance fingerprinting (by adding yet another potential fingerprinting surface), which increases the probability of user reidentification.

What’s the verdict?

We don’t have one yet but please bear in mind that this is just the first part of our series of articles regarding Seller-Defined Audiences. In the next part, we’re going to introduce the results of our quality tests performed on SDA signals. Stay tuned!

If you have any questions, comments or issues, or you’re interested in meeting with us, please get in touch.
