Data Collection Log — EU Ad Transparency Report

This page documents our efforts to track political advertising and produce an ad transparency report for the 2019 European Parliament Election.
We attempted to download a copy of the political ads on a daily basis using the Facebook Ad Library API and the Google Ad Library, starting on March 29 and May 11 respectively, when the two companies released their political ads archive. We provide this data collection log, so that external researchers, journalists, analysts, and readers may examine our methods and assess the data presented in our reports.
Facebook Ad Library API
Facebook provides an Application Programming Interface ("API") to authorized users who may search for ads in their archive. However, due to the inconsistent state of the Facebook Ad Library API, we had to adapt our methods to scan and discover ads on a daily and sometimes hourly basis, to deal with design limitations, data issues, and numerous software bugs in the API.
Despite our best efforts to help Facebook debug their system, the majority of the issues were not resolved. The API delivered incomplete data on most days from its release through May 16, when Facebook fixed a critical bug. The API was broken again from May 18 through May 26, the last day of the elections.
We regret we do not have reliable or predictable instructions on how to retrieve political ads from Facebook. Visit the methods page for our default crawler settings and a list of suggested workarounds for known bugs, or to see our log.
In general, we encountered three categories of issues with the Facebook Ad Library API. First, software programming errors that cripple a user's ability to complete even a single search including the following bugs:
Second, technical or data issues that affect a user's ability to reliably retrieve data from multiple searches:
Third, design limitations that would have prevented users from retrieving a sufficient quantity of data, even if the API had been functional:
Google Ad Library
Google releases their political ads archive as a public dataset. Our methods page describes how we download political ads from Google Cloud Services; scroll down to see our data collection log. We did not experience any technical issues with the dataset or the download process.

Facebook Ad Library API

March 29, 2019
We began accessing the Facebook Ad Library API.
March 29, 2019
We began searching for political ads, and found the following design limitations.
March 29, 2019 — ISSUE: No Guarantee of Completeness
Facebook designed their API so that a user must provide a keyword or an advertiser name in order to retrieve ads from the Ad Library.
As an example, to investigate how paid political ads discuss the health services in Britain, a user might provide the following keywords to the API: national, health, service, and funding. However, the API returns only ads that match the exact keyword. Potentially relevant ads that use the word services (i.e., the plural form of a keyword) or the abbreviation NHS (instead of the full spelling) will not show up in the result.
Without knowing ahead of time the complete set of words used by all advertisers in all countries and in all languages, a user cannot retrieve the full set of political ads from the Facebook Ad Library API.
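The effect of exact keyword matching can be illustrated with a minimal Python sketch. The matching rule used here (case-insensitive, whole-word comparison) and the sample ad texts are assumptions made for illustration only; how Facebook actually implements its search is not known to us.

```python
import re

def matches_exact_keyword(ad_text: str, keyword: str) -> bool:
    """Return True if the keyword appears as a whole word in the ad text.

    Assumption: case-insensitive whole-word matching, chosen only to
    illustrate the behaviour described in this entry.
    """
    words = re.findall(r"\w+", ad_text.lower())
    return keyword.lower() in words

# Hypothetical ad texts, invented for this example.
ads = [
    "Protect the national health service",
    "Our health services need more funding",
    "Save the NHS",
]

keywords = ["national", "health", "service", "funding"]

for ad in ads:
    hits = [k for k in keywords if matches_exact_keyword(ad, k)]
    print(f"{ad!r}: matched {hits}")
```

In this sketch the plural "services" does not match the keyword service, and the ad that only says NHS matches none of the four keywords, so a search built from those keywords would never surface it.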
UPDATE: March 29 - April 3, 2019 — No data
We were unable to find any political ads in the European Union except for the existing ads in the United Kingdom, after four days of searching.
UPDATE: April 4 - 7, 2019 — Only one ad
We were unable to find any political ads except for one ad in Germany, after more than a week of searching.
UPDATE: April 14, 2019 — The API returns incorrect results (keywords)
We found errors in the API, where the Facebook Ad Library API does not return all ads matching a keyword, even when a user provides the keyword.
March 29, 2019 — ISSUE: Significant Limit on Search Terms
We found that the number of allowable search terms is not sufficient to cover all relevant political issues and all candidates in the election.
Facebook limits each user to making 200 API requests per hour (or 4,800 requests per day). Divided among the 28 member states of the European Union, a user is allocated on average 171 API requests per country per day.
To discover ads about a political issue, a user must specify all keywords relevant to the issue. Every spelling variation counts as a separate keyword (e.g., plurals, abbreviations, colloquial forms). Furthermore, some countries have multiple official or unofficial languages, and require the keywords to be specified in multiple languages.
As another example, 308 candidates ran for the European Parliament in the Netherlands. Even if we omit political issues and track only ads that mention a candidate by name, we still do not have enough bandwidth to list all candidate names. In larger member states such as Germany, where 1,380 candidates ran for the elections, the number of names far exceeds the search term limit.
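The arithmetic behind this limit can be reproduced with a short Python sketch. It assumes, for illustration, that each candidate name costs exactly one API request per day (a single page of results and no retries), which understates the real cost.

```python
REQUESTS_PER_HOUR = 200
HOURS_PER_DAY = 24
MEMBER_STATES = 28

requests_per_day = REQUESTS_PER_HOUR * HOURS_PER_DAY        # 4,800
requests_per_country = requests_per_day // MEMBER_STATES    # 171, rounded down

# Candidate counts quoted in this entry.
candidates = {"Netherlands": 308, "Germany": 1380}

for country, count in candidates.items():
    # Assumption: one request per candidate name, one page per search.
    shortfall = count - requests_per_country
    print(f"{country}: {count} candidate names, "
          f"{requests_per_country} requests per day, short by {shortfall}")
```

Under that assumption, neither country's candidate list fits into its average daily share of requests, before any keyword for a political issue is added.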
UPDATE: April 10, 2019
We abandoned our approach to retrieve political ads using relevant keywords. We began scanning the Ad Library using stopwords such as the, of, and, a in English, je, de, ne, pas in French, das, nicht, die, es in German, and so forth.
UPDATE: April 14, 2019
We abandoned our approach to retrieve political ads using stopwords. We began scanning the Ad Library using punctuation marks such as the period (.) and the exclamation mark (!).
March 29, 2019 — ISSUE: Significant Limit on Bandwidth
The API returns only 25 ads per page by default, making it difficult to retrieve a reasonable quantity of data necessary for daily reporting.
Each request for an additional page counts against a user's API rate limit of 200 requests per hour. In other words, after making a search and receiving the first page of ads, a user must choose between requesting a second page of results or searching for a new keyword — further diluting the above limited pool of search terms.
UPDATE: April 14, 2019 — The API exhibits exceedingly high error rates
We were informed by another researcher of an undocumented parameter, limit, that could increase the number of ads per page. However, the Ad Library API fails with increasing frequency as we request more ads per page.
March 29, 2019 — No data
We were unable to find any political ads in the European Union, except for the existing ads in the United Kingdom that Facebook had already released. Due to the design of the API, we are unable to determine whether the Ad Library was empty, or whether the ads were written using words that differ from our search keywords.
RELATED: The API provides no guarantee of completeness
March 30, 2019 — No data
We were unable to find any political ads, except for the existing ads in the United Kingdom.
April 1, 2019 — No data
We were unable to find any political ads, except for the existing ads in the United Kingdom.
April 2, 2019 — No data
We were unable to find any political ads, except for the existing ads in the United Kingdom.
April 3, 2019 — No data
We were unable to find any political ads, except for the existing ads in the United Kingdom.
April 4, 2019 — Only one ad
We uncovered the first political ad in Germany that matched the keyword berlin.
RELATED: The API provides no guarantee of completeness
April 5, 2019 — Only one ad
We were unable to find any political ads except for the one ad in Germany.
April 6, 2019 — Only one ad
We were unable to find any political ads except for the one ad in Germany.
April 7, 2019 — Only one ad
We were unable to find any political ads except for the one ad in Germany.
April 8, 2019
We began scanning the Ad Library API using 125 keywords in five languages (English, German, French, Italian, and Spanish) and uncovered two more political ads in France.
April 10, 2019
We found that Facebook does not filter out stopwords in the API search algorithm. Instead of searching for relevant keywords, we began scanning for political ads using stopwords in seven languages. Examples of stopwords include the, of, and, a in English, je, de, ne, pas in French, das, nicht, die, es in German, and so forth.
RELATED: The API places a significant limit on search terms
April 11, 2019 — Irregularities in search results
We observed irregularities in the search results provided by Facebook, and began conducting experiments to test the correctness of the Ad Library API.
April 12 - 13, 2019 — Blocked
We were blocked from accessing the Ad Library API for approximately 36 hours after exceeding the rate limit.
April 14, 2019 — ISSUE: The Ad Library API returns incorrect results (keywords)
We found errors in the API, where the Facebook Ad Library API does not return all ads matching a keyword, even when a user provides the keyword.
Details
We compiled a list of the top 1,000 words in the British National Corpus. We used the words as search terms to crawl the Ad Library. We downloaded a total of 725,656 ads containing the 1,000 keywords. After deduplication, we identified 37,452 distinct ads. On average, each ad was retrieved 19 times using 19 different search terms.
We then reconstructed the list of ads that should have been returned for each keyword search, and compared them to the list of ads that were actually returned by the API. We found that in 33.04% of the cases, even if a political ad contains a keyword, the ad is not returned by the API when we search for the keyword.
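The comparison can be sketched in Python as follows. This is an illustration, not our original analysis code: the whole-word, case-insensitive matching rule, the counting of misses per (keyword, ad) pair, and the toy data are all assumptions.

```python
import re
from typing import Dict, Set

def expected_matches(ads: Dict[str, str], keyword: str) -> Set[str]:
    """IDs of all downloaded ads whose text contains the keyword.

    Assumption: whole-word, case-insensitive matching.
    """
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return {ad_id for ad_id, text in ads.items() if pattern.search(text)}

def miss_rate(ads: Dict[str, str],
              returned_by_keyword: Dict[str, Set[str]]) -> float:
    """Share of (keyword, ad) pairs that should have been returned but were not."""
    expected_total = 0
    missed_total = 0
    for keyword, returned in returned_by_keyword.items():
        expected = expected_matches(ads, keyword)
        expected_total += len(expected)
        missed_total += len(expected - returned)
    return missed_total / expected_total if expected_total else 0.0

# Toy data: the deduplicated ads and what each keyword search returned.
ads = {
    "a1": "Vote for the future",
    "a2": "The time is now",
    "a3": "A time for change",
}
returned_by_keyword = {
    "the": {"a1"},           # "a2" also contains "the" but was not returned
    "time": {"a2", "a3"},
    "for": {"a1", "a3"},
}

print(f"Miss rate: {miss_rate(ads, returned_by_keyword):.1%}")
```

The key point is that the union of all downloaded ads serves as the reference set: any ad that contains a keyword but only arrived through a different search counts as a miss for that keyword.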
UPDATE: May 26, 2019 — Unresolved
We reported the bug to Facebook, but the issue remains unaddressed through May 26, the last day of the elections.
RELATED: The API provides no guarantee of completeness
April 14, 2019
We found that Facebook does not filter out punctuation in their API search algorithm, and began scanning for political ads using the period (.) as the keyword.
We conducted experiments to determine the best-performing search term for retrieving ads from the Ad Library API. We uncovered the largest number of ads by using the period, compared with 16 other punctuation marks and the best-performing English stopwords.
RELATED: The API places a significant limit on search terms
April 14, 2019 — ISSUE: The Ad Library API exhibits exceedingly high error rates
We were informed by another researcher of an undocumented parameter, limit, that could increase the number of ads per page.
However, when we conducted experiments to determine the valid parameter range and optimal setting for limit, we found that the Ad Library API fails with increasing frequency as we request more ads per page. We empirically determined that the parameter is usable for values up to approximately 500 ads per page.
Details
When we increased the value of limit, the API would often fail, returning zero ads and asking us to retry our request.
The API fails with 100% certainty when limit is above 5,000. Between 500 and 5,000 ads per page, the API fails randomly with increasing probability. While we may occasionally receive a long page of ads, we often need to repeatedly retry our requests, wasting time and precious API rate limit.
Mathematically, it is more efficient to request fewer ads per page.
UPDATE: May 6, 2019
Facebook updated the "Frequently Asked Questions" section of the Ad Library API website, and stated that users may request up to 5,000 ads per page.
UPDATE: May 7, 2019 — The API continues to fail frequently
We re-tested the API, and found that the Ad Library API failed 97.3% of the time when we requested 4,000 ads per page. Despite setting limit to 4,000, we retrieved on average only 101 ads per page, after accounting for the number of times we received zero ads and must retry our requests.
UPDATE: May 20, 2019
Responding to our bug report about the high failure rates, Facebook suggested that we keep limit to 200 or lower when accessing the Ad Library API.
RELATED: Significant Limit on Bandwidth
April 15, 2019 — First crawl of the Facebook Ad Library
Based on our experiments to date, we made our first attempt to crawl political ads in the European Union using the period (.) as the search term and requesting 500 ads per page. We uncovered 38,302 ads in the United Kingdom and 4,803 ads in the other 27 member states.
While we believe the downloaded data is accurate, we are unable to determine whether the API returned all ads matching our search term, or to estimate the number of ads we might have missed.
April 16, 2019 — ISSUE: The Ad Library API is trapped in an infinite loop
We manually aborted our crawl today because the Ad Library API appeared to be trapped in an infinite loop, repeatedly returning the same set of ads over and over again.
Details
When a user conducts a search, if the results contain too many ads, the Facebook Ad Library API splits the results into pages.
A user must then make a separate API request to retrieve each page in sequence. The token for retrieving page 2 of the search results is contained in page 1. The token for retrieving page 3 is contained in page 2, and so forth.
However, the pagination algorithm in the Ad Library API is broken. As an example, on this day, after retrieving page 37 of our search for ads containing the period (.), the API pointed us back to page 36 instead of page 38. The API continued to point us between pages 36 and 37, and never provided us with the token for page 38. As a result, we could never request page 38 or any of the subsequent pages.
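A crawler can guard against this kind of failure instead of relying on a manual abort. The following Python sketch follows next-page tokens and stops as soon as a token repeats. The fetch_page callable, the token values, and the assumption that a loop shows up as a repeated token are all hypothetical; this is not the Ad Library API's actual response format or our crawler's code.

```python
from typing import Callable, List, Optional, Set, Tuple

# (ads on this page, token for the next page or None on the last page)
Page = Tuple[List[dict], Optional[str]]

class PaginationLoopError(RuntimeError):
    """Raised when the API points back to a page that was already fetched."""

def crawl_pages(fetch_page: Callable[[Optional[str]], Page],
                max_pages: int = 10_000) -> List[dict]:
    """Follow next-page tokens until the last page, detecting cycles."""
    seen_tokens: Set[str] = set()
    ads: List[dict] = []
    token: Optional[str] = None          # None requests the first page
    for _ in range(max_pages):
        page_ads, next_token = fetch_page(token)
        ads.extend(page_ads)
        if next_token is None:           # no further pages
            return ads
        if next_token in seen_tokens:    # the API sent us backwards
            raise PaginationLoopError(f"token repeated: {next_token!r}")
        seen_tokens.add(next_token)
        token = next_token
    raise PaginationLoopError(f"gave up after {max_pages} pages")

# Hypothetical API that bounces between two pages, as described above.
looping_api = {
    None: ([{"id": 35}], "page36"),
    "page36": ([{"id": 36}], "page37"),
    "page37": ([{"id": 37}], "page36"),  # should have been "page38"
}

try:
    crawl_pages(lambda token: looping_api[token])
except PaginationLoopError as error:
    print("Aborted crawl:", error)
```

If tokens turn out to differ on every response even when the content repeats, the same idea can be applied to ad IDs instead: abort once a page contributes no ad that has not been seen before.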
UPDATE: April 25, 2019
Facebook reproduced the error based on our report, and acknowledged the bug.
UPDATE: May 3, 2019 — Crawl failed due to the infinite loop bug
We made 644 API requests and downloaded a total of 161,786 ads. However, after returning 3,104 distinct ads in the first five minutes, the API got trapped in an infinite loop. Over the next four hours, the API repeatedly returned the same ads, sending us 158,682 ads that were duplicates of what we had already received. At that point, we decided that the API would likely not provide any additional data, other than the same 3,104 ads. We manually aborted the crawl.
UPDATE: May 3, 2019
We isolated and traced the infinite loop bug to the German ads archive, and reported our findings to Facebook.
UPDATE: May 5, 2019 — Crawl failed due to the infinite loop bug
Our daily crawl failed because the API began to send us duplicate ads after 21 minutes. After observing three hours of repeated data, we manually aborted the crawl.
UPDATE: May 5, 2019
We isolated and traced the new source of errors to the Austrian ads archive, and reported our findings to Facebook.
UPDATE: May 5, 2019 — Blocked from reporting additional bugs
We were informed that we've been reporting too many bugs. Our report was blocked and removed by Facebook.
UPDATE: May 7, 2019 — Multiple crawls failed due to the infinite loop bug
We attempted our daily crawl seven times today. We manually aborted four of the crawls when the API began sending repeated data. After approximately 23 hours, we downloaded a combined total of 1,219,683 ads. After deduplication, we identified 167,518 distinct ads and discarded 1,052,165 duplicates.
UPDATE: May 8 - 9, 2019 — Multiple crawls failed due to multiple types of errors
We attempted our daily crawl twelve times. The process lasted two days into May 9. All twelve crawls were identical and should have produced identical data. Instead, however, the crawls all yielded incomplete and different results. We discovered three new types of errors in the API, in addition to the infinite loop bug, that caused nine of the twelve crawls to fail today.
UPDATE: May 9, 2019 — Blocked from reporting additional bugs
We attempted to report the three new bugs. However, our reports were blocked and removed by Facebook.
UPDATE: May 10 - 11, 2019 — Multiple crawls failed due to multiple types of errors
To keep this log concise, we omit the exact breakdown of errors and error types for the rest of this section. All dates below include crawls that encountered the infinite loop bug.
We attempted our daily crawl 13 times today. The process lasted two days into May 11. We experienced multiple types of API failures.
UPDATE: May 12, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 17 daily crawls, and experienced multiple types of API failures.
UPDATE: May 13, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 11 daily crawls, and experienced multiple types of API failures.
UPDATE: May 14, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 11 daily crawls, and experienced multiple types of API failures.
UPDATE: May 15, 2019 — Multiple crawls failed due to multiple types of errors
We attempted ten daily crawls, and experienced multiple types of API failures.
UPDATE: May 16, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 15 daily crawls, and experienced multiple types of API failures.
UPDATE: May 16, 2019
Facebook resolved the infinite loop bug.
RELATED: The API returns unreproducible results (identical searches)
April 17 - 18, 2019 — Blocked
Our crawler was blocked for approximately 36 hours after exceeding the rate limit.
April 18 - 19, 2019 — Blocked
Our crawler was again blocked for approximately 36 hours after exceeding the rate limit.
April 20 - 21, 2019
We rewrote our crawler software to use a closed-loop scheduler that makes 50 requests per 15 minutes, instead of an open-loop scheduler based on a sleep timer, to more precisely control the speed of our requests.
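One way to build such a scheduler is a sliding-window limiter that decides when to send the next request from the timestamps of requests already sent, rather than from a fixed sleep interval. The Python sketch below is an illustration under that assumption and is not our crawler's actual code; the class name and its parameters are hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque

class SlidingWindowLimiter:
    """Allow at most max_requests in any window of window_seconds."""

    def __init__(self,
                 max_requests: int = 50,
                 window_seconds: float = 15 * 60,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()   # timestamps of recent requests

    def wait_for_slot(self) -> None:
        """Block until one more request fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget requests that have left the window.
            while self._sent and now - self._sent[0] >= self._window:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return
            # Window is full: wait until the oldest request expires.
            self._sleep(self._window - (now - self._sent[0]))

limiter = SlidingWindowLimiter()   # 50 requests per 15 minutes
limiter.wait_for_slot()            # call before every API request
```

Because the wait is computed from what was actually sent, slow responses and retries do not cause the request rate to drift above the limit, which is the main weakness of a fixed sleep timer. Fifty requests per 15 minutes corresponds to the documented 200 requests per hour.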
April 21, 2019 — Second crawl of the Facebook Ad Library
Our crawler completed without error and without being blocked by Facebook. We uncovered 74,659 political ads in the European Union. While we believe the downloaded data is accurate, we are unable to estimate the number of ads we might have missed.
April 22, 2019 — Third crawl of the Facebook Ad Library
Our crawler completed without error and without being blocked by Facebook. We uncovered 77,379 political ads in the European Union. While we believe the downloaded data is accurate, we are unable to estimate the number of ads we might have missed.
April 22 - 25, 2019
After three crawls without errors and without running afoul of the rate limit, we began to write additional software and set up the infrastructure needed for conducting daily crawls.
April 25, 2019 — Daily crawl completed without error
We believe the downloaded data is accurate, but are unable to determine its completeness.
April 26, 2019 — Daily crawl completed without error
We believe the downloaded data is accurate, but are unable to determine its completeness.
April 27, 2019 — Daily crawl completed without error
We believe the downloaded data is accurate, but are unable to determine its completeness.
April 28, 2019 — Daily crawl completed without error
We believe the downloaded data is accurate, but are unable to determine its completeness.
April 29, 2019 — Daily crawl completed without error
We believe the downloaded data is accurate, but are unable to determine its completeness.
April 30, 2019
We began crawling the Ad Library multiple times per day, monitoring day-over-day changes, and comparing the multiple crawls to assess the completeness of our dataset.
May 1, 2019 — Multiple crawls completed without error
We believe the downloaded data is accurate. We began assessing its completeness.
May 2, 2019 — Multiple crawls completed without error
We believe the downloaded data is accurate. We continued to assess its completeness.
May 3, 2019 — Crawl failed due to the infinite loop bug
We manually aborted one of our daily crawls because the Ad Library API was trapped in an infinite loop, returning the same set of ads over and over again.
Details
We became alarmed when one of our daily crawls did not complete after four hours, even though it had not encountered any error.
Inspecting its data, we found that the crawl had made 644 API requests and downloaded a total of 161,786 ads. However, after returning 3,104 distinct ads in the first five minutes, the API appeared to be trapped in an infinite loop. Over the next four hours, the API repeatedly returned the same ads, and sent us 158,682 ads that were duplicates of what we had already received. At that point, we decided that the API would likely not provide any additional data. We manually aborted the crawl.
We isolated and traced the infinite loop bug to the German ads archive, and reported our findings to Facebook.
RELATED: The API is trapped in an infinite loop
May 4, 2019 — ISSUE: The Ad Library API returns inconsistent results (member states)
We discovered that the Facebook Ad Library API returns fewer ads when we search for ads in all E.U. member states collectively in a single query, than when we search each member state individually in a separate query.
Details
Up to this point, we had conducted our daily crawls by searching for ads in the European Union as a whole. As a workaround for the infinite loop bug, we began searching for ads in each E.U. member state separately. The process requires 28 times more API requests, and significantly slows down our crawl. However, this workaround prevents errors in a single country's ads archive from stopping an entire crawl.
Today, when we searched for ads in all 28 member states collectively, the API returned 6,449 distinct ads at the end of the search, which completed normally without any error. However, when we searched for ads in each member state individually, using 28 separate queries, we downloaded a combined total of 170,717 ads and identified 130,936 distinct ads.
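The per-country workaround can be sketched in Python as follows. The search callable, the "id" field used for deduplication, and the country codes are hypothetical placeholders; the sketch only shows how isolating each member state keeps one failing archive from aborting the whole crawl.

```python
from typing import Callable, Dict, Iterable, Tuple

def crawl_by_country(countries: Iterable[str],
                     search: Callable[[str], Iterable[dict]]
                     ) -> Tuple[Dict[str, dict], Dict[str, str]]:
    """Search each country separately and merge the results by ad ID.

    Returns the distinct ads and a record of which countries failed.
    """
    distinct: Dict[str, dict] = {}
    failures: Dict[str, str] = {}
    for country in countries:
        try:
            for ad in search(country):
                # Keep the first copy of each ad; later copies are duplicates.
                distinct.setdefault(ad["id"], ad)
        except Exception as error:
            # Record the failure and continue with the next country.
            failures[country] = str(error)
    return distinct, failures

# Hypothetical search function in which one country's archive is broken.
def fake_search(country: str) -> Iterable[dict]:
    if country == "DE":
        raise RuntimeError("pagination loop detected")
    return [{"id": f"{country}-1"}, {"id": "shared-ad"}]

ads, failed = crawl_by_country(["FR", "DE", "AT"], fake_search)
print(f"{len(ads)} distinct ads; failed countries: {sorted(failed)}")
```

Comparing the number of distinct ads from this merged result against a single collective query is the kind of check that exposed the discrepancy reported in this entry.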
UPDATE: May 6, 2019 — Inconsistent search results
We again observed a discrepancy between searching for ads in all E.U. member states collectively and searching each member state individually.
UPDATE: May 9, 2019
Facebook acknowledged the error.
UPDATE: May 26, 2019 — Unresolved
The bug remains unresolved through the last day of the elections.
May 5, 2019 — Crawl failed due to the infinite loop bug
We manually aborted our only crawl today after the Ad Library API got trapped in an infinite loop.
Details
We were busy analyzing errors from the previous two days, and so had the time to launch only a single crawl today. Our sole crawl failed because the API began to send us duplicate ads after 21 minutes. After observing three hours of repeated data, we manually aborted the crawl.
We isolated and traced the bug to the Austrian ads archive, and reported our findings to Facebook.
RELATED: The API is trapped in an infinite loop
May 5, 2019 — Blocked from reporting additional bugs about the Ad Library API
After submitting all required materials, including saved Graph API Explorer sessions and step-by-step instructions to reproduce errors in the Austrian ads archive, we were informed that we've been reporting too many errors. Our bug report was blocked and removed by Facebook.
May 6, 2019 — Crawl failed due to inconsistent search results
We again observed that the Facebook Ad Library API returns fewer ads when we search for ads in all E.U. member states collectively in a single query, than when we search each member state individually in a separate query.
RELATED: The Ad Library API returns inconsistent results (member states)
May 6, 2019
Facebook updated the "Frequently Asked Questions" section of the Ad Library API website, and stated that users may request up to 5,000 ads per page.
May 7, 2019 — The Ad Library API continues to exhibit exceedingly high error rates
We had earlier conducted experiments to assess the usability of the undocumented limit parameter on April 14. We found that the API fails frequently when we request more than 500 ads per page. Due to Facebook's announcement yesterday, we decided to re-test the API. We found again that the Ad Library API continues to fail frequently.
Details
Based on 3,336 API requests made over 19 hours, we found that when limit was set to 1,000, the API failed 70.1% of the time. At 2,000, the API failed 92.8% of the time. At 4,000, the API failed 97.3% of the time.
In other words, after accounting for the number of times we received zero ads and had to retry our requests, we received on average only 223, 125, and 101 ads per page, even though we requested 1,000, 2,000, and 4,000 ads per page respectively. Due to the high failure rates, requesting more than 500 ads per page actually reduces the number of ads returned by the API.
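A rough sanity check of these figures can be written in Python. The sketch computes an upper bound on the average yield per request, assuming (for illustration only) that every successful response contains a full page of ads. The observed averages quoted above are lower than this bound, which suggests that some successful pages were not full, though this log does not record why.

```python
# Failure rates and observed averages quoted in this entry.
failure_rates = {1000: 0.701, 2000: 0.928, 4000: 0.973}
observed_average = {1000: 223, 2000: 125, 4000: 101}

for limit, failure_rate in failure_rates.items():
    # Assumption: a successful request returns exactly `limit` ads.
    upper_bound = limit * (1 - failure_rate)
    print(f"limit={limit}: at most {upper_bound:.0f} ads per request "
          f"on average, observed {observed_average[limit]}")
```

Even under this optimistic assumption, each of the three settings yields fewer ads per request than the roughly 500 ads per page that the API could deliver reliably.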
RELATED: The API exhibits exceedingly high error rates
May 7, 2019 — Multiple crawls failed due to the infinite loop bug
We attempted our daily crawl seven times today. We manually aborted four of the crawls when the API began sending repeated data. After approximately 23 hours, we downloaded a combined total of 1,219,683 ads. After deduplication, we identified 167,518 distinct ads and discarded 1,052,165 duplicates.
RELATED: The API is trapped in an infinite loop
May 8 - 9, 2019 — ISSUE: The Ad Library API returns unreproducible results (identical searches)
We attempted our daily crawl twelve times. The process lasted two days into May 9. All twelve crawls were identical and should have produced identical data. Instead, however, the crawls returned different results. We discovered three new types of errors, in addition to the infinite loop bug, that caused nine of the crawls to fail.
Details
One of the crawls failed because the Facebook Ad Library API cannot return exactly 100 ads per page. The API fails on exactly 100 ads per page, but works if we change the request to 99 or 101 ads per page.
Three of the crawls failed because the API returned an invalid next page token. When we used the token provided by the API to retrieve the next page, the token was rejected by the API itself. Therefore, we were unable to retrieve the remaining pages in our search results.
Four of our crawls terminated randomly. Even though the API did not provide an error code, the total ad count differed from the other crawls by a factor of up to 129. We considered the data unreliable and discarded these crawls.
We manually aborted one of the crawls when the API entered an infinite loop after 56 minutes, and was trapped for three hours.
Even though three crawls eventually completed, it took us 12 attempts. In total, we downloaded 736,832 ads, and identified 180,215 distinct ads.
UPDATE: May 9, 2019 — Blocked from reporting additional bugs
We attempted to report the three new bugs. However, we were informed that we've been reporting too many errors. Our first bug report was accepted. Our second bug report was blocked and removed by Facebook. We asked another researcher, who was experiencing the same errors, to help file the third bug report.
UPDATE: May 10, 2019 — Multiple crawls failed due to multiple types of errors
To keep this log concise, we omit the exact breakdown of errors and error types for the rest of this section. All dates below include multiple identical crawls that produced different results.
We attempted our daily crawl 13 times today. The crawls experienced multiple types of API failures and returned different results.
UPDATE: May 12, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 17 daily crawls, which experienced multiple types of API failures and returned different results.
UPDATE: May 13, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 11 daily crawls, which experienced multiple types of API failures and returned different results.
UPDATE: May 14, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 11 daily crawls, which experienced multiple types of API failures and returned different results.
UPDATE: May 15, 2019 — Multiple crawls failed due to multiple types of errors
We attempted ten daily crawls, which experienced multiple types of API failures and returned different results.
UPDATE: May 16, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 15 daily crawls, which experienced multiple types of API failures and returned different results.
UPDATE: May 18 - 19, 2019 — All crawls failed due to the invalid next page bug
The invalid next page bug became widespread today. We attempted 178 daily crawls, all of which failed due to the invalid next page bug, and thus yielded incomplete and different results. We were unable to refresh our dataset.
UPDATE: May 20, 2019 — Crawl failed due to the exactly one page bug
We attempted one crawl today and discovered a new type of error where the API cannot return exactly one page of ads.
UPDATE: May 21, 2019 — All crawls failed due to two types of errors
We attempted 20 daily crawls, all of which failed due to the invalid next page bug and/or the exactly one page bug, and thus yielded incomplete and different results.
UPDATE: May 22, 2019 — All crawls failed due to two types of errors
We attempted 10 daily crawls, all of which failed, and thus yielded incomplete and different results.
UPDATE: May 23, 2019 — All crawls failed due to two types of errors
We attempted 20 daily crawls, all of which failed, and thus yielded incomplete and different results.
UPDATE: May 24, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed, and thus yielded incomplete and different results.
UPDATE: May 25, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed, and thus yielded incomplete and different results.
UPDATE: May 26, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed, and thus yielded incomplete and different results.
UPDATE: May 26, 2019 — Unresolved
The issue remains unresolved through the last day of the elections.
The API is trapped in an infinite loop
The API fails when returning exactly 100 ads per page
The API returns invalid next pages
The API randomly terminates a search
The API fails when returning exactly one page of ads
May 8, 2019 — ISSUE: The Ad Library API fails when returning exactly 100 ads per page
We discovered that the API would return a random error code, with an unspecified suberror code and no data payload if we request exactly 100 ads per page.
Details
The API responds to requests for limit of 99 (to retrieve 99 ads per page) or limit of 101 (to retrieve 101 ads per page), but whenever a user requests exactly 100 ads per page, the API would fail. The API would return no data payload. The error code is typically 1 but, on some occasions, the error code is 100. The response would also contain an unspecified suberror code.
UPDATE: May 9, 2019 — Potentially still blocked from reporting additional bugs
Since we were potentially still blocked from reporting bugs by Facebook, we asked another researcher experiencing the same issues to help file the bug report.
UPDATE: May 13, 2019
Facebook reproduced the error, and acknowledged the bug.
UPDATE: May 14, 2019
Facebook resolved the 100 ads per page bug.
RELATED: The API returns unreproducible results (identical searches)
May 8, 2019 — ISSUE: The Ad Library API returns invalid next pages
Multiple crawls failed today because the Ad Library API provided us with invalid next page tokens, and so we were unable to retrieve the remaining pages in our search results.
Details
When a user conducts a search, if the results contain too many ads, the Facebook Ad Library API splits the results into pages.
A user must then make a separate API request to retrieve each page in sequence. The token for retrieving page 2 of the search results is contained in page 1. The token for retrieving page 3 is contained in page 2, and so forth.
Today, we encountered another bug in the pagination algorithm. After we began searching for ads containing the period (.) and after retrieving 58 pages of results, the API provided us with a token for retrieving the 59th page. However, when we used the token given by the API to retrieve the page, the token was rejected by the API itself. We were unable to retrieve the remaining pages in our search results.
Two additional daily crawls also failed (for a total of three), when the API provided us with invalid next page tokens.
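The token-chasing loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: fetch_page is a hypothetical callable standing in for the actual HTTP request, and the keys "data", "next", and "error" are assumed names for the ads, the next-page token, and the error payload.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Page = Dict[str, Any]


class CrawlFailed(Exception):
    """Raised when a page request fails, e.g. because a token was rejected."""

    def __init__(self, pages_retrieved: int, partial_ads: List[Any]) -> None:
        super().__init__(f"request failed after {pages_retrieved} pages")
        self.pages_retrieved = pages_retrieved
        self.partial_ads = partial_ads


def crawl(fetch_page: Callable[[Optional[str]], Page]) -> Tuple[List[Any], int]:
    """Follow next-page tokens until the API stops providing one.

    fetch_page(None) requests the first page; fetch_page(token) requests
    the page that the previous response pointed to.
    Returns (all_ads, number_of_pages).
    """
    ads: List[Any] = []
    token: Optional[str] = None
    pages = 0
    while True:
        page = fetch_page(token)
        if "error" in page:
            # The crawl cannot continue: each token is only available
            # from the page that precedes it.
            raise CrawlFailed(pages, ads)
        ads.extend(page.get("data", []))
        pages += 1
        token = page.get("next")
        if token is None:
            return ads, pages
```

Because every token comes from the preceding page, a single rejected token ends the crawl. In this sketch the exception keeps the partial results so that a failed crawl can at least be recorded.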
UPDATE: May 10, 2019
Facebook reproduced the error based on our report, and acknowledged the bug.
UPDATE: May 17, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 24 daily crawls. Five ended in failure when the API returned invalid tokens for retrieving the next page.
UPDATE: May 18 - 19, 2019 — All crawls failed due to the invalid next page bug
The invalid next page bug became widespread today. We made a total of 178 attempts to restart our daily crawl. However, every single one of the 178 crawls encountered the invalid next page bug and failed to complete. We were unable to refresh our dataset.
UPDATE: May 21, 2019 — All crawls failed due to two types of errors
We attempted 20 daily crawls, all of which failed to complete due to the invalid next page bug and/or the exactly one page bug. All crawls from this date onward will fail due to these two bugs. We have been unable to refresh our dataset since May 17 and will not be able to refresh our dataset through the end of the elections.
UPDATE: May 22, 2019 — All crawls failed due to two types of errors
We attempted 10 daily crawls, all of which failed to complete.
UPDATE: May 23, 2019 — All crawls failed due to two types of errors
We attempted 20 daily crawls, all of which failed to complete.
UPDATE: May 24, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed to complete.
UPDATE: May 25, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed to complete.
UPDATE: May 26, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed to complete.
UPDATE: May 26, 2019 — Unresolved
We've been unable to refresh our dataset of political ads in the European Union since May 18. Despite our reminder of the dates of the European Parliament election, the issue remains unresolved as of May 26, the last day of the elections.
UPDATE: June 6, 2019 — Facebook will not fix the bug
Responding to our bug report, Facebook stated that they will not fix the bug.
RELATED: The API returns unreproducible results (identical searches)
May 8, 2019 — ISSUE: The Ad Library API randomly terminates a search
We discarded the results from four of our daily crawls. Even though each search completed "normally" (i.e., the API indicated there was no next page), the total ad counts from these crawls differed from the others by a factor of up to 129. We considered the data unreliable and discarded these crawls.
Details
Per the Ad Library API documentation, a search is considered completed when the API provides no token for retrieving the next page.
However, we suspect there may be another bug in the pagination algorithm where the API provides no token for retrieving the next page, even though there are still ads remaining in the search results.
We flag this bug when the results from a search deviate significantly from those of other identical searches. As an example, today we observed four crawls where the total number of ads differed from the others by a factor of up to 129, so it is highly likely that the search and/or pagination was terminated early.
We are unable to reliably reproduce the bug. Its onset appears to be random. We also suspect the issue occurs more frequently, but may be harder to detect if the bug occurs near the end of a search and the deviation is less pronounced.
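The flagging rule described above can be sketched as a comparison against the median of identical searches. This is an illustration only: the function name, the max_ratio threshold of 2.0, and the assumption that a prematurely terminated crawl returns fewer ads than its peers are ours for this example and are not taken from the crawler itself.

```python
from statistics import median
from typing import Dict, List


def flag_truncated_crawls(totals: Dict[str, int], max_ratio: float = 2.0) -> List[str]:
    """Return the ids of crawls whose ad count is far below the median.

    totals maps a crawl id to the total number of ads it retrieved.
    A crawl is flagged when the median exceeds its count by more than
    max_ratio (a hypothetical threshold).
    """
    if len(totals) < 2:
        return []  # nothing to compare against
    typical = median(totals.values())
    return [
        crawl_id
        for crawl_id, count in totals.items()
        if count * max_ratio < typical
    ]
```

A rule of this kind only catches large deviations. A crawl that terminates near the end of a search falls within the threshold and passes unnoticed, which matches the detection difficulty noted above. It also assumes that most of the compared crawls completed correctly.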
RELATED: The API returns unreproducible results (identical searches)
May 10 - 11, 2019 — Multiple crawls failed due to multiple types of errors
We attempted our daily crawl 13 times today. The process ran over two days, continuing into May 11. We experienced multiple types of API failures.
May 12, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 17 daily crawls, and experienced multiple types of API failures.
May 13, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 11 daily crawls, and experienced multiple types of API failures.
May 14, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 11 daily crawls, and experienced multiple types of API failures.
May 15, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 10 daily crawls, and experienced multiple types of API failures.
May 16, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 15 daily crawls, and experienced multiple types of API failures.
May 16, 2019
Facebook fixed the infinite loop bug
May 17, 2019 — Multiple crawls failed due to multiple types of errors
We attempted 24 daily crawls. Five failed because the API returned invalid tokens for retrieving the next page. We manually aborted four crawls, after the API began to exhibit exceedingly high error rates.
May 18 - 19, 2019 — All crawls failed due to the invalid next page bug
The invalid next page bug became widespread today. We attempted 178 daily crawls, all of which failed to complete. We were unable to refresh our dataset, and will not be able to refresh our dataset through the end of the elections.
May 20, 2019 — ISSUE: The Ad Library API fails when returning exactly one page of ads
We discovered a new type of error that arises when the search results fit onto exactly one page. This bug disproportionately affects smaller E.U. member states such as Bulgaria, Croatia, Cyprus, Estonia, Latvia, Lithuania, Malta, Portugal, and Slovenia.
Details
When searching for political ads, if the total number of ads is less than limit, then all results could in theory fit onto a single page. When this happens, the API will fail and instead return 0 ads. However, if we keep decreasing the value of limit, so that the results span two or more pages, then the API will return two or more pages of ads.
As an example, when we requested 4,000 ads per page, every single request to retrieve ads from Portugal failed. When we reduced limit to 1,000, we received four pages of results consisting of 1,000, 1,000, 1,000, and 909 ads per page respectively.
While this issue may appear similar to the exceedingly high error rates issue, we believe it is caused by a separate bug. Requests for larger member states would randomly succeed, but requests for smaller member states failed for seven straight days, from this date through May 26.
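The Portugal example above can be worked through in a short sketch. The arithmetic in page_sizes follows directly from the figures in this entry. The retry helper is a hypothetical illustration of decreasing limit until the results span two or more pages: fetch_all stands in for a complete paginated search at a given limit, and the sequence of limits is an assumption.

```python
from typing import Any, Callable, List, Sequence


def page_sizes(total: int, limit: int) -> List[int]:
    """Split total ads into pages of at most limit ads each."""
    full, rest = divmod(total, limit)
    return [limit] * full + ([rest] if rest else [])


def fetch_with_shrinking_limit(
    fetch_all: Callable[[int], List[Any]],
    limits: Sequence[int] = (4000, 2000, 1000, 500, 250),
) -> List[Any]:
    """Retry a search with smaller page sizes until it returns ads.

    fetch_all(limit) is a hypothetical callable that runs one complete
    search with the given page size and returns every ad it retrieved.
    """
    for limit in limits:
        ads = fetch_all(limit)
        if ads:
            return ads
    return []  # every limit returned 0 ads


# The 3,909 ads from the Portugal example fit on one page at limit 4,000
# and span four pages at limit 1,000.
print(page_sizes(3909, 4000))
print(page_sizes(3909, 1000))
```

One caveat of this approach is that an empty result is ambiguous: a search that genuinely matches no ads looks the same as one that hit the bug at every limit.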
All dates below include crawls that encountered the exactly one page bug.
UPDATE: May 21, 2019 — All crawls failed due to two types of errors
We attempted 20 daily crawls, all of which failed to complete.
UPDATE: May 22, 2019 — All crawls failed due to two types of errors
We attempted 10 daily crawls, all of which failed to complete.
UPDATE: May 23, 2019 — All crawls failed due to two types of errors
We attempted 20 daily crawls, all of which failed to complete.
UPDATE: May 24, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed to complete.
UPDATE: May 25, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed to complete.
UPDATE: May 26, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed to complete.
UPDATE: May 26, 2019 — Unresolved
We reported the bug to Facebook, but the issue remains unaddressed through May 26, the last day of the elections.
RELATED: The API returns unreproducible results (identical searches)
May 21, 2019 — All crawls failed due to two types of errors
We attempted 20 daily crawls, all of which failed to complete.
May 22, 2019 — All crawls failed due to two types of errors
We attempted 10 daily crawls, all of which failed to complete.
May 23, 2019 — All crawls failed due to two types of errors
We attempted 20 daily crawls, all of which failed to complete.
May 24, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed to complete.
May 25, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed to complete.
May 26, 2019 — All crawls failed due to two types of errors
We attempted 11 daily crawls, all of which failed to complete.

Google Ad Library

May 11, 2019
We began accessing the Google Ad Library.
May 11, 2019
We completed our script for downloading political ads from the Google Ad Library.
May 11, 2019
We successfully downloaded a full copy of the European Union political ads dataset.
May 11, 2019 — Inconsistent capitalization
We found inconsistent capitalization in the header row of the table creative_stats. We reported the issue to Google and suggested that the 25th column be renamed from spend_Range_Min_dkk to spend_range_min_dkk.
May 14, 2019
Google resolved the bug and regenerated their political ads dataset using the new column name.
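For copies of the dataset downloaded before the column was renamed, one way to guard against inconsistent capitalization is to normalize the header row when reading the file. The sketch below is an illustration and not part of our download script: it assumes the table is distributed as a CSV file, the function name is hypothetical, and the ad_id column in the sample is made up for the example.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def read_rows_lowercase_headers(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield each CSV row as a dict keyed by lowercased column names."""
    reader = csv.reader(handle)
    header_row = next(reader, None)
    if header_row is None:
        return  # empty file: no header, no rows
    header = [name.strip().lower() for name in header_row]
    for row in reader:
        yield dict(zip(header, row))


# Usage with an in-memory sample; ad_id is a made-up column.
sample = io.StringIO("ad_id,spend_Range_Min_dkk\nx1,100\n")
for row in read_rows_lowercase_headers(sample):
    print(row["spend_range_min_dkk"])
```

With this approach, spend_Range_Min_dkk and spend_range_min_dkk resolve to the same key, so files generated before and after the fix can be read by the same code.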
May 12, 2019
We downloaded a full copy of the European Union political ads.
May 13, 2019
We downloaded a full copy of the European Union political ads.
May 15, 2019
We downloaded a full copy of the European Union political ads.
May 16, 2019
We downloaded a full copy of the European Union political ads.
May 17, 2019
We downloaded a full copy of the European Union political ads.
May 18, 2019
We downloaded a full copy of the European Union political ads.
May 19, 2019
We downloaded a full copy of the European Union political ads.
May 20, 2019
We downloaded a full copy of the European Union political ads.
May 21, 2019
We downloaded a full copy of the European Union political ads.
May 22, 2019
We downloaded a full copy of the European Union political ads.
May 23, 2019
We downloaded a full copy of the European Union political ads.
May 24, 2019
We downloaded a full copy of the European Union political ads.
May 25, 2019
We downloaded a full copy of the European Union political ads.
May 26, 2019
We downloaded a full copy of the European Union political ads.