A common question asked by potential clients is how many tweets they can expect to get from the search API. Although I have been telling them “1,500 tweets up to 7 days old” for years, I decided to confirm that. To my surprise, the limit is no longer provided in the official docs. I tried asking for the current limits on the developer mailing list and got no answer, so I’m going to try an experiment. My hope is that the developer community can come up with our own answers for this important limit.
I wrote a simple test script that counts the tweets returned and reports the date of the oldest one. Running it myself gave me two different answers. When I used a low-volume query for my town, “lexington mass”, I got 24 tweets going back 7 days, which is what I expected. But when I used the high-volume query “obama”, I got 17,773 tweets before the script hit the rate limit for requests. Clearly something has changed in a big, yet undocumented, way.
Here is the script I used, called search_limits.php. It uses the tmhOAuth.php OAuth library, as I do in all my code. If you don’t already have a copy of this library, you can download it along with the search_limits.php script. You will need to fill in a set of OAuth tokens to make the API request. You also need to fill in your own query. Try different queries and see what you get.
search_limits.php
<?php
// Connect through OAuth
require('tmhOAuth.php');
$connection = new tmhOAuth(array(
  'consumer_key' => '*************',
  'consumer_secret' => '*************',
  'user_token' => '*************',
  'user_secret' => '*************'
));

// Loop through search results and accumulate count
$query = '*************';
$max_id = 0;
$oldest_tweet = '';
$tweets_found = 0;
while (true) {

  // First API call
  if ($max_id == 0) {
    $connection->request('GET', $connection->url('1.1/search/tweets'),
      array('q' => $query, 'result_type' => 'recent', 'count' => 100));

  // Repeated API call
  } else {
    // Collect older tweets
    --$max_id;
    $connection->request('GET', $connection->url('1.1/search/tweets'),
      array('q' => $query, 'result_type' => 'recent', 'count' => 100,
        'max_id' => $max_id));
  }

  // Exit on error
  if ($connection->response['code'] != 200) {
    print "Exited with error: " . $connection->response['code'] . "\n";
    break;
  }

  // Process each tweet returned
  $results = json_decode($connection->response['response']);
  $tweets = $results->statuses;

  // Exit when no more tweets are returned
  if (sizeof($tweets)==0) {
    break;
  }

  foreach($tweets as $tweet) {
    ++$tweets_found;
    $max_id = $tweet->id;
    $oldest_tweet = $tweet->created_at;
  }
}
print "Tweets found: $tweets_found Oldest tweet: $oldest_tweet";
?>
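The script prints the oldest tweet's raw created_at string, so you have to work out for yourself how many days back that is. If you would rather report a number, a small helper along these lines could be added. This is only a sketch: tweet_age_in_days is a hypothetical name of my own choosing, not part of tmhOAuth or the script above, and it assumes created_at arrives in the API's usual form, such as "Wed Aug 27 13:08:45 +0000 2008".

```php
<?php
// Hypothetical helper (not part of tmhOAuth or search_limits.php):
// returns how many whole days old a tweet is, given its created_at string.
// Assumes the format "Wed Aug 27 13:08:45 +0000 2008".
function tweet_age_in_days($created_at, $now = null) {
    $created = DateTime::createFromFormat('D M d H:i:s O Y', $created_at);
    if ($created === false) {
        return null; // Unrecognized date format
    }
    if ($now === null) {
        $now = new DateTime('now', new DateTimeZone('UTC'));
    }
    // %a is the total number of whole days between the two dates
    return (int) $created->diff($now)->format('%a');
}

// Example with a fixed "now" so the result is reproducible
$now = new DateTime('2013-06-10 12:00:00', new DateTimeZone('UTC'));
print tweet_age_in_days('Mon Jun 03 09:30:00 +0000 2013', $now) . "\n"; // prints 7
```

In the script itself, you could call tweet_age_in_days($oldest_tweet) after the loop and include the result in the final print line.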
It would be great if a group of developers ran this script with their own queries and reported the results. I’d like to know how many tweets you got back from the API and how far back in time they went. What do you say? Can we start working together to answer questions rather than waiting for answers on the developer mailing list? Tweet your answers to me @140dev. Thanks for your help.
Help me solve the mystery of search API limits.