Methods — EU Ad Transparency Report

This page describes our methods to track political advertising and produce an ad transparency report for the 2019 European Parliament Election.
We attempted to download a copy of the political ads daily using the Facebook Ad Library API and the Google Ad Library, starting on March 29 and May 11 respectively, when the two companies released their political ad archives.

Facebook Ad Library API

Facebook provides an Application Programming Interface ("API") that lets authorized users search for ads in its archive. However, due to the inconsistent state of the Facebook Ad Library API, our methods to scan and discover ads must be adapted on a daily and sometimes hourly basis. We regret that we do not have reliable or predictable instructions on how to retrieve political ads from Facebook. Below, we describe our default crawler settings and various workarounds. For more details, see our data collection log.
Identity Confirmation
To gain access to the API, you need to first confirm your identity with Facebook. The process took approximately 11 days for us, from submitting the request online to receiving the confirmation code in the mail.
API Parameters
According to the Ad Library API documentation, the following parameters are available in a search query: ad_active_status, ad_reached_countries, ad_type, search_page_ids, and search_terms.
By default, the API returns only 25 ads per page. You may request more ads per page by adding an undocumented parameter, limit, to either your initial search query or requests for subsequent pages.
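As a minimal sketch of the second option, the helper below sets limit on a next-page URL before it is requested. The function name with_limit is ours and purely illustrative; it only rewrites the query string and makes no assumption about how the API responds.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_limit(url: str, limit: int) -> str:
    """Return `url` with its `limit` query parameter set to `limit`."""
    parts = urlsplit(url)
    # Drop any existing limit and keep every other parameter in its original order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "limit"
    ]
    query.append(("limit", str(limit)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Note that the query string is re-encoded in the process, so commas in values such as fields come back as %2C.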
For each search, you also may request various additional data fields. Available data fields are documented on the Archived Ad page.
Default Crawler Settings
Our default daily crawl consists of the following search, where [TOKEN] is the application token that you'll receive after identity confirmation.
https://graph.facebook.com/v3.2/ads_archive?access_token=[TOKEN]&ad_type=POLITICAL_AND_ISSUE_ADS&ad_active_status=ALL&fields=ad_creation_time%2Cad_creative_body%2Cad_creative_link_caption%2Cad_creative_link_description%2Cad_creative_link_title%2Cad_delivery_start_time%2Cad_delivery_stop_time%2Cad_snapshot_url%2Ccurrency%2Cfunding_entity%2Cimpressions%2Cpage_id%2Cpage_name%2Cspend%2Cregion_distribution%2Cdemographic_distribution&limit=500&ad_reached_countries=AT%2CBE%2CBG%2CHR%2CCY%2CCZ%2CDK%2CEE%2CFI%2CFR%2CDE%2CGR%2CHU%2CIE%2CIT%2CLV%2CLT%2CLU%2CMT%2CNL%2CPL%2CPT%2CRO%2CSK%2CSI%2CES%2CSE%2CGB&search_terms=.
For readability, below is the unescaped URL:
https://graph.facebook.com/v3.2/ads_archive?access_token=[TOKEN]&ad_type=POLITICAL_AND_ISSUE_ADS&ad_active_status=ALL&fields=ad_creation_time,ad_creative_body,ad_creative_link_caption,ad_creative_link_description,ad_creative_link_title,ad_delivery_start_time,ad_delivery_stop_time,ad_snapshot_url,currency,funding_entity,impressions,page_id,page_name,spend,region_distribution,demographic_distribution&limit=500&ad_reached_countries=AT,BE,BG,HR,CY,CZ,DK,EE,FI,FR,DE,GR,HU,IE,IT,LV,LT,LU,MT,NL,PL,PT,RO,SK,SI,ES,SE,GB&search_terms=.
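For readers who prefer to assemble the query programmatically, here is a minimal Python sketch that builds the same URL from its parts. The names ENDPOINT, FIELDS, EU_COUNTRIES, and build_search_url are ours, chosen for illustration; the parameter values are taken from the URL above.

```python
from typing import Sequence
from urllib.parse import urlencode

ENDPOINT = "https://graph.facebook.com/v3.2/ads_archive"

FIELDS = (
    "ad_creation_time", "ad_creative_body", "ad_creative_link_caption",
    "ad_creative_link_description", "ad_creative_link_title",
    "ad_delivery_start_time", "ad_delivery_stop_time", "ad_snapshot_url",
    "currency", "funding_entity", "impressions", "page_id", "page_name",
    "spend", "region_distribution", "demographic_distribution",
)

EU_COUNTRIES = (
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR",
    "DE", "GR", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL",
    "PL", "PT", "RO", "SK", "SI", "ES", "SE", "GB",
)


def build_search_url(
    token: str,
    countries: Sequence[str] = EU_COUNTRIES,
    limit: int = 500,
    search_terms: str = ".",
) -> str:
    """Build the default crawl URL; urlencode escapes commas as %2C."""
    params = {
        "access_token": token,
        "ad_type": "POLITICAL_AND_ISSUE_ADS",
        "ad_active_status": "ALL",
        "fields": ",".join(FIELDS),
        "limit": limit,
        "ad_reached_countries": ",".join(countries),
        "search_terms": search_terms,
    }
    return ENDPOINT + "?" + urlencode(params)
```

Calling build_search_url with your application token yields the escaped form of the URL; passing a single country code in countries yields a narrower search.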
Parameter — ad_active_status
We always set ad_active_status=ALL to request all ads in the archive.
Parameter — ad_reached_countries
By default, we search for ads in all 28 member states by setting ad_reached_countries to AT,BE,BG,HR,CY,CZ,DK,EE,FI,FR,DE,GR,HU,IE,IT,LV,LT,LU,MT,NL,PL,PT,RO,SK,SI,ES,SE,GB.
Facebook uses ISO instead of EU-recommended country codes
Facebook uses ISO 3166 country codes instead of the EU-recommended country codes. The United Kingdom is coded as GB instead of UK. Greece is coded as GR instead of EL.
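If your own data uses the EU-recommended codes, a small lookup is enough to translate them before building a query. The sketch below covers only the two differences noted above; the names EU_TO_ISO and to_iso are ours.

```python
# Codes where EU usage differs from ISO 3166 (the two cases noted above).
EU_TO_ISO = {"UK": "GB", "EL": "GR"}


def to_iso(code: str) -> str:
    """Map an EU-recommended country code to the ISO code the API expects."""
    code = code.upper()
    return EU_TO_ISO.get(code, code)
```

All other member-state codes pass through unchanged.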
Workaround for pagination errors
We frequently encounter pagination issues (e.g., infinite loop bug, invalid next page bug, random termination bug) and are unable to retrieve all pages associated with a search. When such errors occur, we split the crawl into 28 smaller searches, by requesting the ads for each member state separately (e.g., ad_reached_countries=AT, ad_reached_countries=BE, ad_reached_countries=BG).
However, please be aware that the API may return different results when you search for ads in all EU member states collectively in a single query than when you search each member state individually in separate queries.
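The split into per-country searches, together with a basic guard against the infinite loop bug, can be sketched as follows. This is an illustration under stated assumptions rather than our production crawler: fetch is a hypothetical callable that requests a URL and returns the parsed JSON, and we assume the response carries ads under "data" and the next-page URL under "paging" and "next". The names per_country_params and crawl_pages are ours.

```python
from typing import Callable, Dict, Iterator, List

EU_COUNTRIES = (
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR",
    "DE", "GR", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL",
    "PL", "PT", "RO", "SK", "SI", "ES", "SE", "GB",
)


def per_country_params(base_params: Dict[str, str]) -> List[Dict[str, str]]:
    """Return one parameter set per member state (28 smaller searches)."""
    return [{**base_params, "ad_reached_countries": code} for code in EU_COUNTRIES]


def crawl_pages(
    first_url: str,
    fetch: Callable[[str], dict],  # hypothetical: URL in, parsed JSON out
    max_pages: int = 1000,
) -> Iterator[dict]:
    """Yield ads page by page; stop on a missing or repeated next-page URL."""
    seen = set()
    url = first_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page = fetch(url)
        # Assumed response shape: {"data": [...], "paging": {"next": "..."}}
        yield from page.get("data", [])
        url = page.get("paging", {}).get("next")
```

The seen set catches a next-page URL that points back to an earlier page. It cannot detect the random termination bug, where the API simply stops returning a next page.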
Parameter — ad_type
We set this parameter to the default and only supported value, POLITICAL_AND_ISSUE_ADS.
Not recognized in Graph API Explorer
This parameter is not recognized by the Graph API Explorer, a tool used by Facebook for recording API sessions and reporting bugs. Remember to remove this parameter when reporting bugs, otherwise Facebook will not be able to reproduce your API sessions.
Parameter — search_terms
Empirically, we find that we retrieve the most ads by using the period (.) as the search term. We reached this conclusion after experimenting with multilingual dictionary-based approaches, stopwords, and other types of punctuation.
No guarantee of completeness
Even though the period (.) returns more ads than other search terms, the tactic does not guarantee that you will uncover all ads in the archive. In fact, the API provides no guarantee of completeness.
Incorrect results
Be aware that, even when you specify a search term, the API may not return all ads matching the search term.
Unreproducible results
More generally, the results provided by the API are unreproducible. You may receive significantly different results when you repeat an identical search on the same day, or even when you conduct two identical searches within seconds of each other.
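One way to quantify how much two runs of an identical search differ is to compare the sets of ads they return. The sketch below computes the Jaccard overlap between two runs. It assumes, as a labeled assumption and not something the API guarantees, that ad_snapshot_url can serve as a stable key for an ad; pass a different key function if that does not hold for your data.

```python
from typing import Callable, Iterable


def overlap(
    run_a: Iterable[dict],
    run_b: Iterable[dict],
    # Assumption: ad_snapshot_url identifies an ad across runs.
    key: Callable[[dict], str] = lambda ad: ad["ad_snapshot_url"],
) -> float:
    """Return the Jaccard overlap (0.0 to 1.0) between two sets of ads."""
    ids_a = {key(ad) for ad in run_a}
    ids_b = {key(ad) for ad in run_b}
    union = ids_a | ids_b
    # Two empty runs are treated as identical.
    return len(ids_a & ids_b) / len(union) if union else 1.0
```

A value of 1.0 means both runs returned the same ads; lower values mean the runs diverged.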
Parameter — search_page_ids
We do not have the necessary data to use this field.
No available values
Even though the Ad Library API provides the capability to search for ads using page_id, the API does not provide a list of available page_ids.
Technically, a list of page_ids could be scraped from the Facebook Ad Library Report. However, such actions are prohibited by Facebook's terms of service. We did not take such actions, and therefore do not have the data needed to crawl political ads in the European Union using page_ids.
In any case, the Facebook Ad Library Report for the European Union was not released until after May 18, when the API had already become non-functioning.
Parameter — limit
We request 500 ads per page at the start of each day. However, depending on the errors and error types, we may increase, decrease, or sample the potential values of limit.
Reasons for decreasing the value
Although Facebook states publicly (e.g., in the Ad Library FAQ) that users may request up to 5,000 ads per page, we find empirically that the API fails with increasing frequency as we request more ads per page. While the API may succeed on occasion, it would frequently return zero ads and ask us to retry our requests. On May 7, when we last measured the API failure rates, we received on average 223, 125, and 101 ads per page when we requested 1,000, 2,000, and 4,000 ads per page, respectively.
When you encounter a significant number of general API failures, you may wish to reduce limit; otherwise, requesting more ads per page may cause the API to return fewer ads.
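To make the May 7 figures concrete, the short sketch below expresses the average number of ads received as a share of the number requested. The measurements are the ones reported above; the function name yield_share is ours.

```python
# (ads requested per page, average ads received per page), measured on May 7.
MEASUREMENTS = [(1000, 223), (2000, 125), (4000, 101)]


def yield_share(requested: int, received: float) -> float:
    """Return the fraction of requested ads that were actually received."""
    return received / requested


for requested, received in MEASUREMENTS:
    share = yield_share(requested, received)
    print(f"limit={requested}: {received} ads per page on average, share={share:.4f}")
```

The shares come out at roughly 22%, 6%, and 2.5%. Both the share and the absolute number of ads per page fall as limit grows.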
Reasons for increasing the value
Due to the numerous pagination bugs in the Ad Library API (e.g., infinite loop bug, invalid next page bug, random termination bug), we often cannot retrieve all pages associated with a search. In that case, increasing the value of limit may reduce the chance of encountering a pagination failure and improve the likelihood of completing a search.
On days when you encounter a significant number of pagination errors, you may wish to increase the value of limit. Retrieving each additional page is like playing a round of Russian roulette, so it is strategic to minimize the number of potential failure points.
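The reasoning can be illustrated with a deliberately simple model. Assume, purely for illustration, that each page request fails independently with a fixed probability; the chance of completing a search is then the per-page success rate raised to the number of pages. The figures in the usage note are hypothetical, not measurements, and the model ignores the fact that larger pages fail more often.

```python
import math


def pages_needed(total_ads: int, limit: int) -> int:
    """Number of pages required to retrieve total_ads at limit ads per page."""
    return math.ceil(total_ads / limit)


def completion_probability(total_ads: int, limit: int, page_failure_rate: float) -> float:
    """Chance that every page succeeds, assuming independent failures."""
    return (1.0 - page_failure_rate) ** pages_needed(total_ads, limit)
```

With a hypothetical 100,000 ads and a 1% pagination failure rate per page, limit=250 needs 400 pages and completes about 2% of the time, while limit=1000 needs 100 pages and completes about 37% of the time.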
Reasons for sampling potential values
The Ad Library API can fail in other unexpected ways. In early May, the API would fail when returning exactly 100 ads per page. Starting in late May, the API would fail when returning exactly one page of ads.
You may wish to randomly perturb the value of limit to guard against such failures.
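A minimal sketch of such a perturbation is shown below. The function name perturbed_limit and the default jitter of 50 are illustrative choices of ours, not values we tuned.

```python
import random
from typing import Optional


def perturbed_limit(
    base: int = 500,
    jitter: int = 50,
    rng: Optional[random.Random] = None,
) -> int:
    """Return base plus a uniform random offset in [-jitter, jitter], at least 1."""
    rng = rng or random.Random()
    return max(1, base + rng.randint(-jitter, jitter))
```

Passing a seeded random.Random instance makes a crawl's sequence of limit values repeatable, which can help when recording what was requested on a given day.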

Google Ad Library

Google releases their political ads archive as a public dataset. Below, we describe our SQL statements for retrieving advertiser and ad data.
Advertiser Statistics
We download a list of advertisers by executing the following SQL query on Google BigQuery.
SELECT * FROM `bigquery-public-data.google_political_ads.advertiser_stats` WHERE elections = "EU-Parliament";
Ad Statistics
We download a list of ads by executing the following SQL query on Google BigQuery.
SELECT * FROM `bigquery-public-data.google_political_ads.creative_stats` WHERE regions != "US" AND regions != "IN";
Future updates
During the 2019 European Parliament election, the Google Ad Library contained ads for only three election cycles: the US, India, and the European Union. The above query excludes the US and India to isolate the EU ads, so it will need to be updated as more election cycles are added.
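One way to keep the ad query easy to update is to generate it from a list of excluded regions. The sketch below is an illustration only: the function name creative_stats_query is ours, and any region codes beyond US and IN would have to be checked against the dataset before use.

```python
from typing import Sequence

TABLE = "`bigquery-public-data.google_political_ads.creative_stats`"


def creative_stats_query(excluded_regions: Sequence[str]) -> str:
    """Build the creative_stats query, excluding the given region codes."""
    for region in excluded_regions:
        # Region codes are interpolated into SQL, so accept plain letters only.
        if not region.isalpha():
            raise ValueError(f"unexpected region code: {region!r}")
    if not excluded_regions:
        return f"SELECT * FROM {TABLE};"
    conditions = " AND ".join(f'regions != "{region}"' for region in excluded_regions)
    return f"SELECT * FROM {TABLE} WHERE {conditions};"
```

Calling creative_stats_query(["US", "IN"]) produces a query equivalent to the one above; adding a code to the list excludes a further election cycle.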