Lead Generation: Data mining Twitter lists for the best leads

by Adam Green on February 5, 2014

Twitter lists don’t get the respect they deserve, but a well-curated list can be the source of great leads. You could just use the members of a list as leads themselves, but that will only give you a few hundred accounts. With the following procedure, you can turn a list into tens of thousands of accounts that are all interested in a specific subject.

Start with a good Twitter list whose members have been carefully selected. Ironically, Twitter doesn’t offer a way to search for its own lists, but Google does a good job of this. I find that a search like “twitter list [subject]” works best. With the Olympics in the news, let’s find a great list of sports journalists as a starting point. It can be used to find thousands of Twitter users who are avid consumers of sports news, a great lead list for anyone promoting sports-related products.

A Google search for “twitter list sports journalists” led me to the @SportSJA account. Clicking on the account’s Lists link gave me this excellent list.

The assumption I will make when data mining this list is that the more members of this list someone follows, the more interested they are in sports. Following just a few members may be an accident, but an account that follows a couple of dozen list members belongs to a sports fanatic. Those are the people I’m looking for.

The next step is to collect all the followers of every member of this list. I do this by writing code that first collects all of the list’s members with the /lists/members API call. This gives me the user ids of the list members, which are stored in a list_members table.
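The member list comes back a page at a time, so the collection code needs a paging loop. Here is a minimal sketch of one, assuming the usual cursor convention of Twitter's cursored calls (ask for cursor -1 first, stop when the next cursor is 0). The fetch_page function is hypothetical: in real code it would wrap the authenticated API request, and here it is replaced by a canned stand-in so the loop can run on its own.

```python
from typing import Callable, List, Tuple

# Hypothetical page fetcher: takes a cursor, returns (ids_on_page, next_cursor).
PageFetcher = Callable[[int], Tuple[List[int], int]]


def collect_all_ids(fetch_page: PageFetcher) -> List[int]:
    """Follow cursors until the API signals the last page."""
    ids: List[int] = []
    cursor = -1  # assumption: -1 asks for the first page
    while cursor != 0:  # assumption: a next cursor of 0 means no more pages
        page_ids, cursor = fetch_page(cursor)
        ids.extend(page_ids)
    return ids


def fake_fetch_page(cursor: int) -> Tuple[List[int], int]:
    """Stand-in for a real API call: serves three canned pages."""
    pages = {-1: ([101, 102], 7), 7: ([103, 104], 9), 9: ([105], 0)}
    return pages[cursor]


list_member_ids = collect_all_ids(fake_fetch_page)
```

The same loop shape should work for the follower collection in the next step, since only the fetcher changes.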

Then I have a script loop through the user ids in the list_members table and collect all the followers of each one with the /followers/ids request. This is the slowest part of the process, because only 60 follower requests can be made per hour, each retrieving up to 5,000 followers. Depending on the number of list members and the average number of followers each one has, this can take anywhere from a few hours to several days. Running a script that does 15 requests as a cron job once every 15 minutes will eventually chew through all the data.
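To get a feel for how long this stage will run, the arithmetic can be sketched in a few lines. This is only an estimate under the limits quoted above (60 requests per hour, 5,000 ids per request). The follower counts are made up for illustration, and the function names are my own.

```python
import math
from typing import Iterable

REQUESTS_PER_HOUR = 60   # follower requests allowed per hour (from the text)
IDS_PER_REQUEST = 5000   # maximum follower ids returned per request


def requests_needed(follower_count: int) -> int:
    """Requests to page through one member's followers (at least one)."""
    return max(1, math.ceil(follower_count / IDS_PER_REQUEST))


def estimate_hours(follower_counts: Iterable[int]) -> float:
    """Total hours to collect followers for every list member."""
    total_requests = sum(requests_needed(n) for n in follower_counts)
    return total_requests / REQUESTS_PER_HOUR


# Made-up follower counts for five list members, for illustration only.
sample_counts = [1200, 4800, 12000, 250000, 0]
hours = estimate_hours(sample_counts)
```

Note how one member with 250,000 followers accounts for 50 of the 56 requests in this sample. A handful of very popular list members can dominate the total run time.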

All of the follower user ids I get in this stage are added to a list_member_followers table. The important trick is that I allow duplicates. If someone follows 10 members of the list, their user id is added to the list_member_followers table 10 times.
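A minimal sketch of that table, using Python's built-in sqlite3 module as a stand-in for whatever database you run. The only design decision that matters is that user_id has no primary key or unique constraint, so repeats are kept. The store_followers helper and the sample ids are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No PRIMARY KEY or UNIQUE constraint on user_id: repeats are the point.
conn.execute("CREATE TABLE list_member_followers (user_id INTEGER NOT NULL)")


def store_followers(conn, follower_ids):
    """Append one list member's follower ids, duplicates and all."""
    conn.executemany(
        "INSERT INTO list_member_followers (user_id) VALUES (?)",
        [(uid,) for uid in follower_ids],
    )


# Made-up followers of three list members; user 1 follows all three.
store_followers(conn, [1, 2, 3])
store_followers(conn, [1, 3])
store_followers(conn, [1, 4])

total_rows = conn.execute(
    "SELECT COUNT(*) FROM list_member_followers"
).fetchone()[0]
rows_for_user_1 = conn.execute(
    "SELECT COUNT(*) FROM list_member_followers WHERE user_id = ?", (1,)
).fetchone()[0]
```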

Since users are added multiple times to this table, you need to count distinct user ids to get the total number of unique users:

SELECT COUNT(DISTINCT user_id)
FROM list_member_followers

In my experience, a list with close to 500 members (this one has 485) will deliver on the order of 500,000 to 1M unique followers.

I don’t want all the followers, just the best ones. The top 10% is usually enough, so if there are 500,000 unique followers, the 50,000 who follow the most list members are my target. I can get these with:

SELECT count(*) as cnt, user_id
FROM list_member_followers
GROUP BY user_id
ORDER BY cnt DESC
LIMIT 50000
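If you would rather not hard-code the limit, it can be computed from the unique follower count. The sketch below does that with Python's sqlite3 module, which is an assumption on my part (any SQL database would do). The top_followers helper and the sample data are hypothetical. Ties are broken by user_id only to make the result repeatable.

```python
import sqlite3


def top_followers(conn, fraction=0.10):
    """Return (user_id, cnt) rows for the top `fraction` of unique followers."""
    unique_users = conn.execute(
        "SELECT COUNT(DISTINCT user_id) FROM list_member_followers"
    ).fetchone()[0]
    limit = max(1, int(unique_users * fraction))
    return conn.execute(
        "SELECT user_id, COUNT(*) AS cnt FROM list_member_followers "
        "GROUP BY user_id ORDER BY cnt DESC, user_id ASC LIMIT ?",
        (limit,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE list_member_followers (user_id INTEGER NOT NULL)")
# Made-up data: user n appears n times, for n = 1..20.
rows = [(uid,) for uid in range(1, 21) for _ in range(uid)]
conn.executemany("INSERT INTO list_member_followers (user_id) VALUES (?)", rows)

leads = top_followers(conn)
```

With 20 unique users in the sample, the top 10% is the two accounts that appear most often.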

To make sure these are all good leads, I need to review their account stats, such as follower count and number of tweets. I can have a script read through these 50,000 accounts, use /users/lookup to collect the account info, and store it in a list_member_leads table. This request works on 100 users at a time and can be run 720 times an hour. That means collecting all the account data takes 500 requests (50,000 / 100), or about 0.7 hours (500 / 720). Not bad for 50,000 potential leads.
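The batching and the time estimate can be sketched like this. The batch size and hourly limit are the figures quoted above. The batches helper and the stand-in id list are assumptions for illustration, and the actual lookup request is left out.

```python
from typing import Iterator, List

BATCH_SIZE = 100         # user ids per lookup request (from the text)
LOOKUPS_PER_HOUR = 720   # lookup requests allowed per hour (from the text)


def batches(user_ids: List[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(user_ids), size):
        yield user_ids[start:start + size]


lead_ids = list(range(1, 50001))  # stand-in for the 50,000 top follower ids
lookup_batches = list(batches(lead_ids))
hours = len(lookup_batches) / LOOKUPS_PER_HOUR
```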

Finally, I can run a SQL query that deletes any lead accounts with poor stats, such as fewer than 50 friends, 50 followers, or 50 tweets. I also tend to delete accounts with egg avatars and no descriptions. This usually eliminates about 20% of the total.
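As a rough sketch, the cleanup could look like the following. The table layout is an assumption: the column names are borrowed from the field names in Twitter's user objects, and sqlite3 stands in for your database. I also assume that an egg avatar or an empty description is each enough on its own to drop an account. Use AND instead if you only want to drop accounts that have both problems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema for illustration; adjust to match your own leads table.
conn.execute(
    """CREATE TABLE list_member_leads (
           user_id INTEGER PRIMARY KEY,
           friends_count INTEGER, followers_count INTEGER,
           statuses_count INTEGER, default_profile_image INTEGER,
           description TEXT)"""
)
conn.executemany(
    "INSERT INTO list_member_leads VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 300, 450, 2000, 0, "Sports fan"),     # healthy account: kept
        (2, 10, 450, 2000, 0, "Few friends"),     # fewer than 50 friends
        (3, 300, 20, 2000, 0, "Few followers"),   # fewer than 50 followers
        (4, 300, 450, 5, 0, "Few tweets"),        # fewer than 50 tweets
        (5, 300, 450, 2000, 1, "Egg avatar"),     # default (egg) avatar
        (6, 300, 450, 2000, 0, ""),               # no description
    ],
)
conn.execute(
    """DELETE FROM list_member_leads
       WHERE friends_count < 50 OR followers_count < 50 OR statuses_count < 50
          OR default_profile_image = 1
          OR description IS NULL OR TRIM(description) = ''"""
)
remaining = [
    row[0]
    for row in conn.execute("SELECT user_id FROM list_member_leads ORDER BY user_id")
]
```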

The result is about 40,000 leads, all of whom have shown that they really want info on sports. And best of all, this data is free. Pretty amazing. Remember, automated following of a list like this is forbidden by Twitter’s terms of service, but there are many ways to track and engage with these users. What you need is a solid reporting and engagement management system to approach these users and record your engagement. Sort of like a CRM for Twitter leads. Luckily, I happen to have a book on just this subject.
