Adam Green
Twitter API Consultant
adam@140dev.com
781-879-2960
@140dev

Free Source Code – Twitter Database Server: get_tweets.php

Important Note: Do not run this script as a cronjob. There are detailed install instructions that explain how to run this script. Please read them.

The Phirehose library makes collecting tweets from the Twitter streaming API very simple, as shown in this script. All of the work of establishing and maintaining a continuous connection is done for you by Phirehose. Your code only has to override the enqueueStatus() method to save each tweet as it is received (lines 26-42). Because the tweet data is stored in a MySQL database, I also included code that opens the database connection once at launch and holds the connection object in the class member $oDB (lines 18-22). This is faster than reconnecting to MySQL each time a new tweet is inserted.
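The override pattern described above can be tried without a Twitter account. The following is a minimal sketch, not the Phirehose API: FakeStream and ArrayCollector are hypothetical names, the stream is simulated with a fixed array of JSON strings, and an in-memory array stands in for the database handle that get_tweets.php keeps in a class member.

```php
<?php
// Hypothetical stand-in for a streaming client; this is NOT the Phirehose API.
abstract class FakeStream
{
    // Raw JSON payloads that play the role of the live stream
    private $payloads;

    public function __construct(array $payloads)
    {
        $this->payloads = $payloads;
    }

    // Subclasses decide what to do with each raw status
    abstract public function enqueueStatus($status);

    // Deliver every payload to the subclass, one at a time
    public function consume()
    {
        foreach ($this->payloads as $payload) {
            $this->enqueueStatus($payload);
        }
    }
}

class ArrayCollector extends FakeStream
{
    // Created once and reused for every tweet, like the database
    // connection held in a class member in get_tweets.php
    public $store = array();

    public function enqueueStatus($status)
    {
        $tweet = json_decode($status);
        // Skip anything that does not carry a tweet id
        if (!isset($tweet->id_str)) {
            return;
        }
        $this->store[$tweet->id_str] = $status;
    }
}

$collector = new ArrayCollector(array(
    '{"id_str":"101","text":"first recipe"}',
    '{"notice":"a message that is not a tweet"}',
    '{"id_str":"102","text":"second recipe"}',
));
$collector->consume();
echo count($collector->store), " tweets stored\n";
```

The base class owns the delivery loop and the subclass only supplies enqueueStatus(), which is the same division of labor the script relies on.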

Once you have created an extension of the Phirehose class, starting up the collection process takes a few simple steps:

  • The Twitter streaming API uses OAuth authentication, so you must provide Phirehose with the user token and user secret for a valid Twitter account (line 45). These values are stored in 140dev_config.php.
  • The keywords used to select tweets from the API are passed as an array to Phirehose with the setTrack() function (line 53).
  • Finally, the consume() function is called to begin collection (line 57).
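Since the script reads OAUTH_TOKEN and OAUTH_SECRET as constants, the relevant part of 140dev_config.php presumably looks something like the sketch below. This is an assumption about the file's shape, not a copy of it: the values are placeholders, and the real file may define additional settings.

```php
<?php
// Hypothetical excerpt of 140dev_config.php; both values are placeholders
// to be replaced with the token and secret of a valid Twitter account.
define('OAUTH_TOKEN', 'your-user-token');
define('OAUTH_SECRET', 'your-user-secret');
```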
get_tweets.php (the line numbers cited above count from the opening <?php line):
<?php
/**
* get_tweets.php
* Collect tweets from the Twitter streaming API
* This must be run as a continuous background process
* Latest copy of this code: http://140dev.com/free-twitter-api-source-code-library/
* @author Adam Green <140dev@gmail.com>
* @license GNU Public License
* @version BETA 0.30
*/
require_once('140dev_config.php');

require_once('../libraries/phirehose/Phirehose.php');
require_once('../libraries/phirehose/OauthPhirehose.php');
class Consumer extends OauthPhirehose
{
  // A database connection is established at launch and kept open permanently
  public $oDB;
  public function db_connect() {
    require_once('db_lib.php');
    $this->oDB = new db;
  }
	
  // This function is called automatically by the Phirehose class
  // when a new tweet is received with the JSON data in $status
  public function enqueueStatus($status) {
    $tweet_object = json_decode($status);

    // Ignore tweets without a properly formed (all-digit) tweet id value
    if (!isset($tweet_object->id_str) || !ctype_digit((string) $tweet_object->id_str)) { return; }

    $tweet_id = $tweet_object->id_str;

    // Serialized data can be corrupted in storage when object elements contain ", ', :, or ;
    // so base64_encode() the serialized string before saving it
    $raw_tweet = base64_encode(serialize($tweet_object));

    $field_values = 'raw_tweet = "' . $raw_tweet . '", ' .
      'tweet_id = ' . $tweet_id;
    $this->oDB->insert('json_cache',$field_values);
  }
}

// Open a persistent connection to the Twitter streaming API
$stream = new Consumer(OAUTH_TOKEN, OAUTH_SECRET, Phirehose::METHOD_FILTER);

// Establish a MySQL database connection
$stream->db_connect();

// The keywords for tweet collection are entered here as an array
// More keywords can be added as array elements
// For example: array('recipe','food','cook','restaurant','great meal')
$stream->setTrack(array('recipe'));

// Start collecting tweets
// Automatically call enqueueStatus($status) with each tweet's JSON data
$stream->consume();

?>
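Because raw_tweet is stored as base64_encode(serialize(...)), any code that reads the json_cache table has to reverse both steps. Here is a minimal sketch of that round trip. The helper names encode_tweet() and decode_tweet() are hypothetical and not part of the original code, and the allowed_classes option assumes PHP 7.0 or later.

```php
<?php
// Encode a decoded tweet the same way get_tweets.php does before the insert
function encode_tweet($tweet_object)
{
    return base64_encode(serialize($tweet_object));
}

// Hypothetical helper: reverse the encoding when reading a row back
function decode_tweet($raw_tweet)
{
    // Strict mode returns false if the input is not valid base64
    $serialized = base64_decode($raw_tweet, true);
    if ($serialized === false) {
        return null;
    }
    // Only allow plain stdClass objects to be rebuilt (PHP 7.0+)
    $tweet = unserialize($serialized, array('allowed_classes' => array('stdClass')));
    return ($tweet === false) ? null : $tweet;
}

// A tweet whose text contains all of the troublesome characters
$tweet = json_decode('{"id_str":"101","text":"Mom\'s \"best\" recipe: eggs; flour"}');

$raw = encode_tweet($tweet);
$restored = decode_tweet($raw);

echo $restored->text, "\n";
```

The encoded string contains only letters, digits, +, / and =, so it can sit inside the double-quoted SQL value without escaping, at the cost of roughly one third more storage.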
