Streaming API enhancements, part 1: Keyword collection enhancements

by Adam Green on February 4, 2014

in 140dev Source Code, Streaming API

The next few posts will describe a set of enhancements to the streaming API framework that expand what the code can do when collecting tweets based on keywords. I’ll start with an overview of what I want to accomplish:

  • Add a collection_keywords table to the database to hold keywords to be used for collection.
  • Add an exclusion_keywords table to the database to hold words (typically curse words) that identify tweets to be rejected.
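  • Add a tweet_keywords table to the database to record the tweet_id of any tweet that contains a collection keyword. This will greatly speed up queries that get tweets for specific keywords, because they can look up matching tweet_ids directly instead of searching the text of every tweet.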
  • Modify get_tweets.php to collect tweets that contain the collection_keywords.
  • Modify parse_tweets.php to test each tweet and reject it if an exclusion_keyword is found.
  • Modify parse_tweets.php to record any keywords found in the tweet_keywords table.
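
To make the plan above more concrete, here is a minimal sketch of how the three tables and the parsing step could fit together. It is written in Python with SQLite only so that it runs on its own; the framework's scripts are PHP, and this is not their code. The column names, the tweets table, the helper names (words_in, parse_tweet, tweets_for_keyword), and the choice to match lowercase single-word keywords against whole words are all assumptions made for illustration.

```python
import re
import sqlite3

# Hypothetical schema: the post names these tables but not their columns.
# The tweets table is a stand-in for wherever parsed tweets are stored.
SCHEMA = """
CREATE TABLE collection_keywords (keyword TEXT PRIMARY KEY);
CREATE TABLE exclusion_keywords (keyword TEXT PRIMARY KEY);
CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, tweet_text TEXT NOT NULL);
CREATE TABLE tweet_keywords (
    tweet_id INTEGER NOT NULL,
    keyword  TEXT NOT NULL,
    PRIMARY KEY (tweet_id, keyword)
);
"""


def words_in(text):
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def load_keywords(db, table):
    """Read every keyword from one of the two keyword tables."""
    if table not in ("collection_keywords", "exclusion_keywords"):
        raise ValueError("unknown keyword table: " + table)
    return {row[0] for row in db.execute("SELECT keyword FROM " + table)}


def parse_tweet(db, tweet_id, text):
    """Store a tweet and its matched keywords; return False if rejected."""
    words = words_in(text)

    # Reject the tweet if any exclusion keyword appears in it.
    if words & load_keywords(db, "exclusion_keywords"):
        return False

    db.execute(
        "INSERT OR IGNORE INTO tweets (tweet_id, tweet_text) VALUES (?, ?)",
        (tweet_id, text),
    )

    # Record each collection keyword found in the tweet.
    for keyword in sorted(words & load_keywords(db, "collection_keywords")):
        db.execute(
            "INSERT OR IGNORE INTO tweet_keywords (tweet_id, keyword) VALUES (?, ?)",
            (tweet_id, keyword),
        )
    return True


def tweets_for_keyword(db, keyword):
    """Fetch tweets for a keyword through tweet_keywords, without scanning text."""
    rows = db.execute(
        "SELECT t.tweet_id, t.tweet_text "
        "FROM tweet_keywords k JOIN tweets t ON t.tweet_id = k.tweet_id "
        "WHERE k.keyword = ? ORDER BY t.tweet_id",
        (keyword.lower(),),
    )
    return rows.fetchall()
```

In this sketch the keyword tables are read on every call to keep the example short; a long-running parser would more likely load them once and refresh them periodically. Multi-word phrases are not handled here.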

I’m going to leave the current version of the framework code unchanged, so the enhanced scripts will be called get_tweets_keyword.php and parse_tweets_keyword.php. Once people have had a chance to test this code, I will integrate it into a new release version of the framework.

The next post in this series is available here.
