Visualization, D3.js, Python, Twitter, Django, AWS Elastic Beanstalk

February 15, 2019

In this project, I...

  • Connect to the Twitter Streaming API with a keyword filter using Python
  • Encode incoming tweets along dimensions such as sentiment, volume, and virality
  • Share a single Twitter listener among all users using Django and Elastic Beanstalk on AWS
  • Display encoded results in a dynamic real-time data visualization built with D3.js

This project attempts to visualize the global conversation around a topic on Twitter in real time, as it occurs. The volume of streaming tweets can be too overwhelming to read one by one, so each tweet is encoded across seven different dimensions:

  • Topic
  • Sentiment
  • Engagement
  • Virality
  • Controversiality
  • Visibility
  • Length
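
As a rough illustration of what an encoding like this could look like, here is a minimal Python sketch. It is not the project's actual code: the `encode_tweet` function, the tiny word lists, and every formula below are assumptions made up for the example, and the input is assumed to be a dict loosely shaped like a Twitter API tweet object (`text`, `retweet_count`, `favorite_count`, `reply_count`, `user.followers_count`).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class EncodedTweet:
    topic: str
    sentiment: float         # -1.0 (negative) to 1.0 (positive)
    engagement: int          # likes + retweets + replies
    virality: float          # retweets per follower
    controversiality: float  # replies relative to likes + retweets
    visibility: int          # author's follower count
    length: int              # characters in the tweet text


# Hypothetical toy word lists; a real encoder would use a proper sentiment model.
POSITIVE = {"love", "great", "good", "happy"}
NEGATIVE = {"hate", "bad", "awful", "sad"}


def encode_tweet(tweet: dict, keywords: Sequence[str]) -> EncodedTweet:
    """Reduce one tweet dict to seven numbers/labels (all formulas are assumptions)."""
    text = tweet.get("text", "")
    words = [w.strip(".,!?#@").lower() for w in text.split()]

    # Topic: the first tracked keyword that appears in the text, if any.
    lowered = text.lower()
    topic = next((k for k in keywords if k.lower() in lowered), "")

    # Sentiment: balance of positive and negative words.
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    sentiment = (pos - neg) / max(pos + neg, 1)

    retweets = tweet.get("retweet_count", 0)
    likes = tweet.get("favorite_count", 0)
    replies = tweet.get("reply_count", 0)
    followers = tweet.get("user", {}).get("followers_count", 0)

    return EncodedTweet(
        topic=topic,
        sentiment=sentiment,
        engagement=likes + retweets + replies,
        virality=retweets / max(followers, 1),
        controversiality=replies / max(likes + retweets, 1),
        visibility=followers,
        length=len(text),
    )
```

Each field of the result could then drive one visual channel of a dot (color, size, speed, and so on); which dimension maps to which channel is a design choice this sketch does not try to capture.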

As tweets are published, they are represented in the visualization as falling dots, which can be individually 'caught' and examined, or left alone to provide an overall visual impression of the conversation, like letting the ambient tone of hundreds of people talking in a public space wash over you.

The visualization is built in D3.js with a Python backend to connect to Twitter and relay the tweets. I originally built it with Flask on Google App Engine, but the requirements of the Flexible Environment meant that it would cost $120/month just to keep it running! I rebuilt it in Django to take advantage of the local database and moved it to Elastic Beanstalk on AWS, which is much more accommodating towards esoteric Python packages.

The Twitter Streaming API's limitations present some unique challenges, not least that my API key only allows one connection at a time. To work around this, the Twitter listener is pooled across ALL users of the visualization, so you may see keywords from concurrent viewers come and go. There is also the risk of being rate-limited when too many users connect at once, which will severely throttle or kill the flow of tweets.
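
One way to picture the pooled listener is as a reference-counted set of keywords: each viewer contributes one keyword, and the single shared stream tracks the union. The sketch below is an assumption about how that bookkeeping could be done, not the code running behind the visualization; `KeywordPool` and its methods are hypothetical names, and the actual Twitter connection is left out entirely.

```python
from collections import Counter


class KeywordPool:
    """Reference-counts keywords so one shared stream can serve many viewers."""

    def __init__(self):
        self._counts = Counter()  # keyword -> number of viewers tracking it
        self._by_user = {}        # user id -> that viewer's current keyword

    def subscribe(self, user_id, keyword):
        """Set a viewer's keyword; return True if the tracked set changed."""
        before = self.tracked()
        self._release(user_id)  # a viewer follows one keyword at a time
        keyword = keyword.strip().lower()
        self._by_user[user_id] = keyword
        self._counts[keyword] += 1
        return self.tracked() != before

    def unsubscribe(self, user_id):
        """Drop a viewer; return True if the tracked set changed."""
        before = self.tracked()
        self._release(user_id)
        return self.tracked() != before

    def _release(self, user_id):
        keyword = self._by_user.pop(user_id, None)
        if keyword is not None:
            self._counts[keyword] -= 1
            if self._counts[keyword] <= 0:
                del self._counts[keyword]  # nobody is watching it any more

    def tracked(self):
        """Sorted list of keywords the single listener should filter on."""
        return sorted(self._counts)
```

A `True` return value is the signal that the one allowed connection would need to be restarted with the new keyword list. Since reconnecting too eagerly is itself a plausible way to get rate-limited, a real implementation would probably batch or delay those restarts rather than reconnect on every change.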

Try It Out!