I got a call the other day from a developer who was receiving various 500-series errors while trying to gather large amounts of Twitter user data. The API has a number of errors in the 500 range, all of which generally mean that the Twitter servers are overloaded. The API is built on the principle of staying alive while handling as many requests as possible: if the load gets too high or a request takes too long to process, the request is dumped and one of the 500 errors is returned.
The specific requirement for this developer was getting information on all the followers of his app’s users. He was doing this in a brute-force fashion every 24 hours: first looking up all the follower ids, 5,000 at a time, with the /followers/ids call, then getting the profile data for those followers, 100 at a time, with /users/lookup. This is a very intensive use of the API, and it is exactly what Twitter doesn’t want you to do. Look at the hint they are offering by returning 5,000 follower ids in a single call but doling out profile data on only 100 users at a time. They are telling us not to request too much user data.
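To make the discussion concrete, here is a minimal sketch of that brute-force flow. The api_get helper is a hypothetical stand-in for an authenticated client (OAuth signing is omitted), and the v1.1-style URLs are my assumption; the 5,000-id pages, the 100-user lookup limit, and the cursor convention (start at -1, stop at 0) come from the API itself.

```python
import requests

BASE = "https://api.twitter.com/1.1"  # assumption: v1.1-style REST paths

def api_get(path, params):
    # Hypothetical helper: a real client must sign the request with OAuth.
    resp = requests.get(BASE + path + ".json", params=params, timeout=10)
    resp.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return resp.json()

def fetch_all_follower_profiles(screen_name):
    """The brute-force flow: every follower id, then every profile."""
    ids, cursor = [], -1
    while cursor != 0:  # cursor paging: start at -1, 0 means no more pages
        page = api_get("/followers/ids",
                       {"screen_name": screen_name, "cursor": cursor})
        ids.extend(page["ids"])        # up to 5,000 ids per call
        cursor = page["next_cursor"]
    profiles = []
    for i in range(0, len(ids), 100):  # /users/lookup caps out at 100 users
        batch = ids[i:i + 100]
        profiles.extend(api_get("/users/lookup",
                                {"user_id": ",".join(map(str, batch))}))
    return profiles
```

Run every 24 hours for every user of the app, this hammers the API with one /users/lookup call per 100 followers, which is what was triggering the 500s.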
Whenever possible you should be caching the data you get from the API. User profiles are a perfect example. Instead of requesting data on every user every 24 hours, it is much better to store user profiles in a database and request this data less often. Cutting back to once every 7 days reduces the number of profile lookups by 86% (one refresh per week instead of seven). I recommended that he adopt this type of caching and then check the user ids he receives from /followers/ids against the user database table. If a user is new or hasn’t been updated recently, then request the profile with /users/lookup.
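That check might look something like the sketch below, assuming a hypothetical SQLite table user_profiles with user_id and fetched_at columns; the schema and table name are mine, not anything the API dictates.

```python
import time

SEVEN_DAYS = 7 * 24 * 60 * 60

def ids_needing_refresh(db, follower_ids):
    """Filter the ids from /followers/ids down to the new or stale ones."""
    cutoff = time.time() - SEVEN_DAYS
    # Everything fetched within the last 7 days is considered fresh.
    fresh = {row[0] for row in db.execute(
        "SELECT user_id FROM user_profiles WHERE fetched_at >= ?", (cutoff,))}
    return [uid for uid in follower_ids if uid not in fresh]
```

Only the ids this function returns ever go to /users/lookup; everyone else is served from the database.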
It also helps to be opportunistic about caching. Many of the API calls return a user’s profile in the payload. If you get this data anywhere in your code, take advantage of this opportunity to cache it.
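For example, each status in a timeline response embeds its author’s full profile, so you can cache it as a side effect of work you were already doing. This sketch reuses the api_get helper and the user_profiles table from the earlier examples; /statuses/home_timeline is just one of many calls whose payload carries a user object.

```python
import json
import time

def cache_user(db, user):
    """Upsert a profile row whenever any payload happens to include one."""
    db.execute(
        "INSERT OR REPLACE INTO user_profiles (user_id, json, fetched_at)"
        " VALUES (?, ?, ?)",
        (user["id"], json.dumps(user), time.time()))
    db.commit()

# Opportunistic caching: harvest the embedded profiles from a timeline fetch.
for tweet in api_get("/statuses/home_timeline", {}):
    cache_user(db, tweet["user"])
```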
The other solution to 500 errors is to request less data each time. As I said, a 500 error is often a timeout. While the /users/lookup call allows you to request 100 users at a time, try backing off to just 50. It will take more API calls, but you’ll have a better chance of getting results without an error. This type of logic should be built into your code: if a request triggers a 500 error, scale back the quantity requested and repeat the call.
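Here is one way to sketch that retry logic, again using the hypothetical api_get helper from the first example. The halving schedule (100, then 50, then 25, and so on) is one reasonable choice, not the only one.

```python
import requests

def lookup_with_backoff(ids, batch_size=100):
    """Hydrate profiles, shrinking the batch size whenever a 500 comes back."""
    profiles, i = [], 0
    while i < len(ids):
        batch = ids[i:i + batch_size]
        try:
            profiles.extend(api_get("/users/lookup",
                                    {"user_id": ",".join(map(str, batch))}))
            i += len(batch)  # only advance past ids that actually succeeded
        except requests.HTTPError as err:
            status = err.response.status_code if err.response is not None else 0
            if 500 <= status < 600 and batch_size > 1:
                batch_size = max(1, batch_size // 2)  # 100 -> 50 -> 25 ...
            else:
                raise  # not a 500, or already down to single requests
    return profiles
```

Combined with the caching above, this keeps the request volume low enough that the smaller batches rarely cost you much in total calls.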