Processing Quote Tweets With Twitter API

I’ve been writing scripts to process Twitter streaming data via the Twitter API. One of those scripts looks for patterns in metadata and associations between accounts, as streaming data arrives. The script processes retweets, and I decided to add functionality to also process quote Tweets.

Retweets “echo” the original by embedding a copy of the Tweet in a field called retweeted_status:

Twitter’s API reference entry for retweeted_status

According to Twitter’s own API documentation, a quote Tweet should work in a similar way. (A quote Tweet is like wrapping your tweet around somebody else’s.) A Tweet object containing the quoted Tweet should be available in the quoted_status field.

Twitter’s API reference entry for quoted_status

I wrote some code to fetch and process quoted_status in the same way I was already processing retweeted_status, but it didn’t work. I “asked” Google for answers, but didn’t really find anything, so I decided to dig into what the API was actually returning in the quoted_status field.

It turns out it’s not a Tweet object. Here’s what a quoted_status field actually looks like:

{u'contributors': None, 
 u'truncated': False, 
 u'text': u'', 
 u'is_quote_status': False, 
 u'in_reply_to_status_id': None, 
 u'id': 0, 
 u'favorite_count': 0, 
 u'source': u'<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', 
 u'retweeted': False, 
 u'coordinates': None, 
 u'entities': {u'user_mentions': [], 
               u'symbols': [], 
               u'hashtags': [], 
               u'urls': []}, 
 u'in_reply_to_screen_name': None, 
 u'id_str': u'', 
 u'retweet_count': 0, 
 u'in_reply_to_user_id': None, 
 u'favorited': False, 
 u'user': {u'follow_request_sent': None, 
           u'profile_use_background_image': True, 
           u'default_profile_image': False, 
           u'id': 0, 
           u'verified': True, 
           u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/', 
           u'profile_sidebar_fill_color': u'FFFFFF', 
           u'profile_text_color': u'FFFFFF', 
           u'followers_count': 0, 
           u'profile_sidebar_border_color': u'FFFFFF', 
           u'id_str': u'0', 
           u'profile_background_color': u'FFFFFF', 
           u'listed_count': 0, 
           u'profile_background_image_url_https': u'https://abs.twimg.com/images/', 
           u'utc_offset': -18000, 
           u'statuses_count': 0, 
           u'description': u"", 
           u'friends_count': 0, 
           u'location': None, 
           u'profile_link_color': u'FFFFFF', 
           u'profile_image_url': u'http://pbs.twimg.com/profile_images/', 
           u'following': None, 
           u'geo_enabled': True, 
           u'profile_banner_url': u'https://pbs.twimg.com/profile_banners/', 
           u'profile_background_image_url': u'http://abs.twimg.com/images/', 
           u'name': u'', 
           u'lang': u'en', 
           u'profile_background_tile': False, 
           u'favourites_count': 0, 
           u'screen_name': u'', 
           u'notifications': None, 
           u'url': None, 
           u'created_at': u'Fri Nov 27 23:14:06 +0000 2009', 
           u'contributors_enabled': False, 
           u'time_zone': u'', 
           u'protected': False, 
           u'default_profile': True, 
           u'is_translator': False}, 
 u'geo': None, 
 u'in_reply_to_user_id_str': None, 
 u'lang': u'en', 
 u'created_at': u'Thu Jun 22 00:33:13 +0000 2017', 
 u'filter_level': u'low', 
 u'in_reply_to_status_id_str': None, 
 u'place': None}

So, it’s a data structure that contains some of the information you might find in a Tweet object, but it’s not an actual Tweet object. In my script it arrives as a plain dictionary, so its fields are read by key rather than as attributes. Kinda makes sense if you think about it. A quote Tweet can quote other quote Tweets, which can quote other quote Tweets. (Some folks created rather long quote Tweet chains when the feature was first introduced.) So, if the API returned a fully-hydrated Tweet object for a quoted Tweet, that object could contain another Tweet object in its own quoted_status field, and so on, and so on.
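
To make that nesting concrete, here’s a minimal sketch. It uses hand-built dictionaries rather than real API output, and the function name quote_chain_depth is made up for illustration. All it does is follow quoted_status links until they run out, which is the kind of chain a fully-hydrated response would have to carry.

```python
def quote_chain_depth(tweet):
    """Count how many levels of quoted_status are nested inside a tweet dict."""
    depth = 0
    current = tweet.get("quoted_status")
    # Keep following the chain for as long as the field holds another dict.
    while isinstance(current, dict):
        depth += 1
        current = current.get("quoted_status")
    return depth


# Hypothetical, hand-built data (not real API output): C quotes B, which quotes A.
tweet_a = {"id": 1, "text": "original"}
tweet_b = {"id": 2, "text": "quoting A", "quoted_status": tweet_a}
tweet_c = {"id": 3, "text": "quoting B", "quoted_status": tweet_b}

print(quote_chain_depth(tweet_c))
```

Every extra level in a chain like this would mean another complete object embedded in the payload.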

Here’s a small piece of Python code that looks for retweets and quote Tweets in a stream and, when it finds one, retrieves the screen_name of the user who published the original Tweet. It illustrates the differences between handling retweets and quote Tweets.

from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from tweepy import API

consumer_key="add your own key here"
consumer_secret="add your own secret here"
access_token="add your own token here"
access_token_secret="add your own secret here"

class StdOutListener(StreamListener):
    def on_status(self, status):
        screen_name = status.user.screen_name

        # retweeted_status is a parsed Tweet object, so its fields are attributes.
        if hasattr(status, 'retweeted_status'):
            retweet = status.retweeted_status
            if hasattr(retweet, 'user'):
                if retweet.user is not None:
                    if hasattr(retweet.user, "screen_name"):
                        if retweet.user.screen_name is not None:
                            retweet_screen_name = retweet.user.screen_name
                            print screen_name + " retweeted " + retweet_screen_name

        # quoted_status arrives as a plain dict, so its fields are read by key.
        if hasattr(status, 'quoted_status'):
            quote_tweet = status.quoted_status
            if 'user' in quote_tweet:
                if quote_tweet['user'] is not None:
                    if "screen_name" in quote_tweet['user']:
                        if quote_tweet['user']['screen_name'] is not None:
                            quote_tweet_screen_name = quote_tweet['user']['screen_name']
                            print screen_name + " quote tweeted " + quote_tweet_screen_name
        return True

    def on_error(self, status):
        print status

if __name__ == '__main__':
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    auth_api = API(auth)
    print "Signing in as: "+auth_api.me().name
    print "Preparing stream"

    stream = Stream(auth, l, timeout=30.0)
    searches = ['donald', 'trump', ]
    while True:
        # If the `searches` line above is commented out, fall back to the 1% sample.
        if 'searches' in locals():
            print "Filtering on: " + str(searches)
            stream.filter(track=searches)
        else:
            print "Getting 1% sample"
            stream.sample()
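
The two branches in on_status differ only in how they read fields: attributes for the retweet, dictionary keys for the quote. As a rough sketch (not part of my script), the lookup could be folded into one helper that accepts either shape. The names get_field, original_screen_name, and FakeStatus below are hypothetical, and the stand-in objects only mimic the shapes described above. Whether quoted_status shows up as a dict or as an object may depend on the library and version you use, so treat that as an assumption.

```python
def get_field(container, name):
    """Read `name` from a dict key or an object attribute; None if absent."""
    if isinstance(container, dict):
        return container.get(name)
    return getattr(container, name, None)


def original_screen_name(status, field):
    """Return the screen_name of the author of the embedded Tweet, or None."""
    embedded = get_field(status, field)
    user = get_field(embedded, "user") if embedded is not None else None
    return get_field(user, "screen_name") if user is not None else None


class FakeStatus(object):
    """Hypothetical stand-in for a parsed status object, for illustration only."""
    def __init__(self, **fields):
        self.__dict__.update(fields)


# A retweet whose embedded Tweet is an object with attributes.
retweet = FakeStatus(
    user=FakeStatus(screen_name="alice"),
    retweeted_status=FakeStatus(user=FakeStatus(screen_name="bob")),
)
# A quote Tweet whose embedded Tweet is a plain dict.
quote = FakeStatus(
    user=FakeStatus(screen_name="carol"),
    quoted_status={"user": {"screen_name": "dave"}},
)

print(original_screen_name(retweet, "retweeted_status"))
print(original_screen_name(quote, "quoted_status"))
```

The helper returns None whenever a link in the chain is missing, which replaces the ladder of nested checks with a single test on the result.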

