The BU-TCAT allows Boston University students and faculty, and even off-campus researchers, to collect Tweets from Twitter's Streaming API (the so-called "gardenhose" access to Twitter) and then process the data for network analysis and visualization in Gephi. With this open-source software, millions of Tweets can be sorted quickly by algorithms to identify influential people or prominent items on Twitter. Dr. Jacob Groshek teaches a range of network analysis techniques using the BU-TCAT. Contact him for more information.

As a resource to Boston University and the broader research community, the BU-TCAT opens up a host of analytic options that require no programming knowledge. These include:

  • Timeline of Twitter activity, with minute-by-minute timestamping
  • Tweet statistics like hashtag and retweet frequencies, geocoding and unique users
  • Specific user stats: number of friends, followers, favorites, and verification
  • Activity metrics such as user visibility by mention frequency
  • Hashtag frequency, hashtag-user activity, and word and identical-Tweet frequency
  • Lists of individual retweets and geocoded Tweets
  • Network graphs of mentions, co-hashtags, and hashtag-user relationships
  • Cascades, alluvial diagrams, and associational profiles

Unlike many other Twitter collection systems or software, BU-TCAT searches do not run out or expire; they keep collecting until they are turned off. To date, the BU-TCAT system has archived approximately 155 million Tweets (and counting), on topics such as Ferguson, data journalism, and the Massachusetts gubernatorial race.
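
For readers curious about what a mention-based network graph and "visibility by mention frequency" involve, the following is a minimal Python sketch. It is not the BU-TCAT's own code, and the BU-TCAT itself requires no programming. The function names, the sample Tweets, and the simple `@username` pattern are assumptions made for illustration. The output uses Source, Target, and Weight columns, a layout that Gephi's spreadsheet importer accepts as an edge list.

```python
import csv
import io
import re
from collections import Counter

# Simplified pattern for @mentions (an assumption; real Tweet parsing is stricter).
MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Count directed author -> mentioned-user pairs in (author, text) tuples."""
    edges = Counter()
    for author, text in tweets:
        for target in MENTION.findall(text):
            # Ignore self-mentions; compare names case-insensitively.
            if target.lower() != author.lower():
                edges[(author.lower(), target.lower())] += 1
    return edges


def visibility(edges):
    """Total number of times each user is mentioned by others."""
    counts = Counter()
    for (_, target), weight in edges.items():
        counts[target] += weight
    return counts


def to_edge_csv(edges):
    """Serialize edges as CSV text with Source, Target, Weight columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buf.getvalue()


# Hypothetical sample data, invented for this illustration.
sample = [
    ("alice", "Great thread by @bob and @carol"),
    ("bob", "Thanks @alice!"),
    ("dave", "RT @bob: network analysis is fun"),
    ("alice", "@bob agreed"),
]

edges = mention_edges(sample)
print(to_edge_csv(edges))
print(visibility(edges).most_common(1))
```

In this sample, the most-mentioned account would be the most "visible" user. Saving the CSV text to a file and importing it into Gephi as an edge table would reproduce the same graph for layout and further analysis.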