Twitter alerts: using twitter streaming API for fun and profit

23 Oct 2009

Twitter is a wonderful service, but, until now, you had to subscribe to third-party websites to be alerted when a selected word (maybe your trademark) was tweeted. We’ll try to develop a service that filters the Twitter API stream, stores the interesting tweets in our database, and shows them in the browser in real time.

If you want to try it, watch it in action, grab the code or read on…

NOTE: Now you have to use https:// instead of http:// for it to work.

What?

We’ll take twitter real time results for a given word (or words) and visualize them in a browser window, like monitter.com, but on our own servers and a bit more automatic. This is going to be useful to get our own alerts and work with them.

How?

We are going to use the Twitter streaming API to get the tweets containing a given word. Since the results come in JSON format, we’ll need to filter the data, take the interesting bits and store them. On the other side, we’re going to write a bit of JavaScript to read the data from the server into a web page using AJAX. We could use other technologies, like Comet, to push the data to the browser, but, while that would be a much cleaner implementation, I don’t think I’m ready to write about it yet (check this space ;) ).

We need to decouple reading the stream from Twitter from serving our clients, because we can only have a single active connection to the Twitter streaming API at a given time, and we’re going to leave the listening process on for a long time using a daemon/service approach. On top of that, we want to store the tweets in a database for later perusal and data mining.

The tools

We are trying to make a useful system. To do this, we’ll need some tools:

  • HTTP server: I’m currently using Apache, but any other would be OK.
  • Server side programming language: This time, PHP. While probably not as elegant or fast as Python or as cool as Ruby, it gets the work done and it’s available in almost all web hosting plans. And the documentation is pretty extensive.
  • A database to store all the info.
  • Client side programming: We are going to use HTML, JavaScript and the jQuery JavaScript library to simplify AJA(X) programming and for the effects.
  • A Twitter account: Your own is OK, but maybe you want to create a new one for this kind of task. Write down your user name and password; we are going to need them soon.
  • The Twitter API: Twitter people are so kind they have developed a RESTful API free for everyone to use. And it’s sub-zero cool.

The Twitter streaming API

Twitter has published a Streaming API which, in its own words, “allows near-realtime access to various subsets of Twitter public statuses”. In fact, this is just what we need. You can read all the documentation at https://twitterapi.pbworks.com/Streaming-API-Documentation, but I’ll try to cover the interesting parts for this project so you don’t need to yet.

We are going to use just one method (statuses/filter) to get results that include one or several words. This method can return a stream of data in XML or JSON format, has to be called using POST and accepts some parameters. If you have access to some kind of Unix, you can use it from the command line in the following way:

curl -d 'track=google' https://stream.twitter.com/1/statuses/filter.json -uuser:password

Where user and password are your twitter credentials.

It should return something like:

response.json

Until you exit it (with CTRL+C). This is a JSON stream and can be read and parsed by several means. In fact, it’s eval-uable JavaScript code that we could read from the browser. But right now, we’re going to use a server-side language to read it and work with the interesting parts.
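To make the format a bit more concrete, here is a minimal sketch in JavaScript of how such a stream could be cut into tweets. It assumes one JSON object per line, and that a chunk of data can end in the middle of a line. The name createLineParser is mine, not part of any API, and the sketch uses JSON.parse rather than eval, which is the safer choice for data coming from the network.

```javascript
// Sketch only: turns raw chunks of the stream into parsed tweets.
// Assumption: the stream carries one JSON object per line.
function createLineParser(onTweet) {
  let pending = ''; // text received so far that does not end in a newline yet

  return function feed(chunk) {
    pending += chunk;
    const lines = pending.split('\n');
    pending = lines.pop(); // the last piece may be an incomplete line

    for (const line of lines) {
      const trimmed = line.trim(); // also drops a trailing \r
      if (trimmed === '') continue; // ignore blank lines
      try {
        onTweet(JSON.parse(trimmed));
      } catch (err) {
        // Skip a malformed line instead of stopping the reader
      }
    }
  };
}
```

You would call the returned function with every chunk you read, and it calls onTweet once per complete line.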

Reading Twitter stream

As I told you in the tools section, we are going to use PHP as our server side language. For the first part, the reading of the twitter stream, we don’t even need a web server, since we can run it from the command line. And if we run it from the command line, we can convert it to a kind of daemon/service and leave it on for a long time. But first, some code:

basicstreaming.php

This is (almost) the simplest code that delivers what we want: a PHP-formatted stream of tweets that include our marked word. If you run it from your command line, it’ll show something like:

sample_tweets.php

Let’s look at the code:

We create an $opts array (in fact an array of arrays) that contains the parameters. In this particular case, we’re using two, the POST method and the search line (track=google). Then, we can treat the Twitter stream as a file, using stream_context_create and fopen, and just start reading lines. Each line is going to be a JSON encoded tweet, similar to what we saw when we called the API from the command line. Since we want to use the contents as easily as possible, we’ll use the json_decode function to parse it into PHP objects, print them and call flush just in case we’re calling the script from a browser.
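For readers who think better in JavaScript, the same request can be described as a plain object. This is only a sketch that mirrors the curl call shown earlier (POST, a track parameter, user and password sent as basic authentication). It builds the description and sends nothing. The function name buildFilterRequest and the shape of the returned object are my own assumptions.

```javascript
// Sketch only: describes the POST request, it does not open a connection.
// Assumption: basic authentication, as in the curl example of the article.
function buildFilterRequest(words, user, password) {
  const credentials = `${user}:${password}`;
  // btoa exists in browsers and recent Node versions; Buffer is the Node fallback
  const encoded = typeof btoa === 'function'
    ? btoa(credentials)
    : Buffer.from(credentials).toString('base64');

  return {
    url: 'https://stream.twitter.com/1/statuses/filter.json',
    method: 'POST',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      Authorization: `Basic ${encoded}`,
    },
    // Several words go in a single track parameter, separated by commas
    body: new URLSearchParams({ track: words.join(',') }).toString(),
  };
}
```

Note that the comma between words is URL-encoded in the body, so buildFilterRequest(['google', 'twitter'], ...) produces track=google%2Ctwitter.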

Storing the results

The best way to store the results for later perusal is a database. I’m using MySQL but any other database should be OK. To be able to store the data, we need to create a database with a single table.

We are only going to store the following data:

  • Text: This is the Twitter status. 140 chars max.
  • User screen name: The screen name of the poster. This is needed to create the link to Twitter.
  • Id: A unique id for the tweet. It’s a sequential number, so we can order the tweets according to it, use it along with the user screen name to build a link back to Twitter, and use it as our primary key.
  • Followers count: The number of people that are going to receive the tweet in their inboxes. We are using it to style the real-time viewer. Since my primary intention is to watch a trademark, I care about the number of people that are watching the messages.
  • The time of the tweet: basically for filtering purposes. We are going to store our server time to avoid lengthy conversions.

We could store several other fields, and a complete solution should probably take into account that there are different tweet types, but for the time being, these five fields should suffice.
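As a rough illustration of that selection, this JavaScript sketch picks the five values out of a decoded tweet. The JSON field names (text, id, user.screen_name, user.followers_count) are the ones I expect in the stream; the names of the resulting properties are hypothetical and do not have to match the columns of the real table. Messages that are not regular tweets are simply skipped.

```javascript
// Sketch only: maps a decoded tweet to the five values we want to store.
// Assumption: the output property names are illustrative, not the real column names.
function toRow(tweet, now = new Date()) {
  // Other message types (deletion notices, for example) carry no text or user
  if (!tweet || typeof tweet.text !== 'string' || !tweet.user) {
    return null;
  }
  return {
    // Keep the id as a string: very large ids do not fit safely in a JS number
    id: tweet.id_str !== undefined ? tweet.id_str : String(tweet.id),
    text: tweet.text,
    screen_name: tweet.user.screen_name,
    followers_count: tweet.user.followers_count,
    received_at: now.toISOString(), // our server time, not the time sent by Twitter
  };
}
```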

To create the table, we can run the following SQL script from the server:

createtable.sql

After that, we will have a single table database waiting for us to fill it with tweets… Let’s go:

storing_tweets_in_the_database.php

It’s as ugly as sin but it works. If you run it from the command line, it should start storing tweets in your database and keep on until you stop it. So, our database is starting to fill with tweets concerning our desired word. Now we need to be able to navigate them…

Creating the code from the server side

Now we need to publish the tweets in our browser. To do that, we need a small PHP script that returns the tweets when called. If we call it with a start parameter, it’ll return all the tweets with an id bigger than that. Otherwise, it’ll return the last ten tweets stored in our database. To do that, we will use two different queries, the first one to return the last ten results, and the second one to return all results since the given id. We use subqueries (SELECT from SELECT) to get the results in the order we want.
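The logic of the two queries can be modelled in a few lines of JavaScript working on an in-memory array instead of a table. This is not the SQL the script runs, just a sketch of the behaviour, and it assumes that the order we want is oldest first (which fits with new tweets being appended at the bottom of the page) and that ids are plain numbers. The name selectTweets is hypothetical.

```javascript
// Sketch only: an in-memory model of the two queries.
// Assumptions: rows have a numeric id, and results are wanted oldest first.
function selectTweets(rows, start) {
  const byId = (a, b) => a.id - b.id;

  if (start === undefined || start === null) {
    // No start given: take the ten newest, then present them oldest first
    return rows.slice().sort(byId).slice(-10);
  }
  // Start given: everything newer than that id, oldest first
  return rows.filter(row => row.id > start).sort(byId);
}
```

Taking the newest ten and then re-sorting them is exactly the reason a subquery is handy on the SQL side.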

server.php

We are going to poll this code every ten seconds and refresh the tweets list to show the most recent ones, using javascript, and the output format will be JSON.

Writing a front-end

Since, as Larry Wall said, one of the cardinal virtues of a programmer is laziness, we are going to use jQuery to construct the interface and the business logic. And we’re not even serving it ourselves, but linking to it from the Google CDN, as Dave Ward posted in his wonderful blog.

So, let’s start with the HTML:

barebones.html

Just some styles, one script link and a body containing one div, two links and a header. Simple, eh?

We need to add some javascript in the middle of the code to connect to the server:

logic.js

Let’s see… This is probably the most complex part of the article, so, I’ll try to go slow and explain every function:

getTweets(id)

This function calls the server using the getJSON jQuery method. Then, it takes each response line and calls addNew with it. If we call it with an id parameter, it’ll ask the server for all tweets with ids greater than that. Otherwise, it will grab the last ten tweets.
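Leaving jQuery aside, the two decisions this function makes can be sketched as small pure functions: which URL to ask for, and what to do with the list that comes back. The start parameter and server.php come from the article; the function names tweetsUrl and handleTweets, and the idea of returning the newest id, are my own assumptions for the sake of the example.

```javascript
// Sketch only: the URL to request, with or without a starting id.
function tweetsUrl(base, id) {
  if (id === undefined || id === null) {
    return base; // no id: the server answers with the last ten tweets
  }
  return `${base}?start=${encodeURIComponent(id)}`;
}

// Sketch only: hands every received tweet to addNew and
// returns the newest id seen, to be used in the next request.
function handleTweets(items, addNew, lastId) {
  let newest = lastId;
  for (const item of items) {
    addNew(item);
    if (newest === undefined || newest === null || item.id > newest) {
      newest = item.id;
    }
  }
  return newest;
}
```

With jQuery, the callback passed to getJSON would be the place to call something like handleTweets.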

addNew(item)

It takes an item (a tweet) as input. It hides the first tweet, removes its ‘tweet’ class, appends a new tweet at the bottom and shows it. It calls the renderTweet function to get the tweet in HTML format.

renderTweet(item)

Just one line to call getImportanceColor() and a return with the HTML code. It’s a bit long but that’s because we’re adding a couple of links to the tweet to be able to visit the original one.
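A simplified version could look like the sketch below. To keep it self-contained it receives the color as a parameter instead of calling getImportanceColor itself, and the exact markup is an assumption of mine, not the HTML from logic.js. It also escapes the text, treating it as untrusted plain text.

```javascript
// Sketch only: escapes the characters that are special in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Sketch only: returns the HTML for one tweet, with a link to the author
// and a link to the original status. The markup is illustrative.
function renderTweet(item, color) {
  const profile = `https://twitter.com/${encodeURIComponent(item.screen_name)}`;
  const status = `${profile}/status/${encodeURIComponent(item.id)}`;
  return `<div class="tweet" style="color:${color}">` +
    `<a href="${profile}">@${escapeHtml(item.screen_name)}</a>: ` +
    `${escapeHtml(item.text)} ` +
    `<a href="${status}">#</a>` +
    `</div>`;
}
```

If the text you stored is already HTML-encoded, you would skip the escaping of item.text to avoid showing the entities literally.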

getImportanceColor(number)

It takes a number of followers and returns an RGB color between total black, for people without followers, and total red, for Ashton Kutcher. It uses logarithms to scale between the two extremes, because there are 6 orders of magnitude between them. We will use it to paint (it) black the twitterers with few followers and red the Twitter stars.
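One possible way to write that scaling is sketched here. The upper limit of one million followers is my reading of the “6 orders of magnitude” and is exposed as a parameter; the exact formula in logic.js may differ.

```javascript
// Sketch only: logarithmic scale from black (no followers) to red (many followers).
// Assumption: maxFollowers = 1e6 marks the "total red" end of the scale.
function getImportanceColor(followers, maxFollowers = 1e6) {
  const count = Math.max(0, Number(followers) || 0);
  // +1 keeps the logarithm defined when there are no followers
  const ratio = Math.min(1, Math.log10(count + 1) / Math.log10(maxFollowers + 1));
  const red = Math.round(255 * ratio);
  return `rgb(${red},0,0)`;
}
```

With a linear scale almost everybody would look black; the logarithm spreads the common cases (tens, hundreds, thousands of followers) over visibly different shades.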

poll()

This is the timeout function that calls itself every 200ms and gets the new tweets.
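The self-rescheduling pattern can be sketched like this. The interval is a parameter, and the scheduler can be replaced (it defaults to setTimeout) so the loop can be tried without waiting. The names startPolling and fetchNew are hypothetical, and for simplicity fetchNew is synchronous here and returns the newest id it saw; with getJSON the rescheduling would happen inside the callback.

```javascript
// Sketch only: a function that calls itself again after every run.
// Assumption: fetchNew(lastId) returns the newest id seen, or nothing.
function startPolling(fetchNew, intervalMs, schedule = setTimeout) {
  let lastId = null;
  let stopped = false;

  function poll() {
    if (stopped) return;
    const newest = fetchNew(lastId);
    if (newest !== undefined && newest !== null) {
      lastId = newest; // next time, ask only for tweets after this one
    }
    schedule(poll, intervalMs); // come back later
  }

  poll(); // first run right away
  return function stop() {
    stopped = true;
  };
}
```

The returned stop function is an addition of the sketch; it is handy if you want a pause link in the page.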

The last block just starts the polling as soon as the document is loaded.

The Result

This is a small screen capture of my browser visiting the HTML/JavaScript page while running storing_tweets_in_the_database.php. It’s watching the word ‘twitter’ and, as you can see, it’s running too fast for the human eye (at least mine), but since we are keeping all the data in our database, it’s not lost forever :)

Limits

Right now, because of the Twitter API limits, just one instance of the watching process can be run at once. Anyhow, you can write several words, separated by commas, and Twitter will return results for all of them.

This code should not be used in production, since there are almost no security checks to avoid misuse. If you want to use it on a machine open to the public, you should check (twice) every input for misbehaviour.

The code

  1. Download the code and unzip it into a folder in your local webserver
  2. Edit config.php to add your twitter login data and the words you want to watch
  3. Create the database and the table with the SQL code above
  4. Run watch.php and leave it running for as long as you wish.
  5. Visit http://localhost/thefolderwhereyouunzippedthecode/ and watch the tweets coming.

Further work

Obviously, this is just a sample. It can be made much better looking, and we could even analyze the tweets and tweet back a response to any questions concerning our keywords. The watch module should be daemonized or converted to a service to be left unattended. The HTML page could be able to filter between two dates and so on. Keep on watching. We’ll try to keep on posting this kind of content.

Shameless plug

I’m part of Corunet, a web agency in Spain that can deliver consistently good results in all kinds of internet projects. You can visit our website http://coru.net/ or contact me at david@corunet.com if you have any special needs :)

You can follow me on twitter as @dei_biz

36 Responses to Twitter alerts: using twitter streaming API for fun and profit

plotti

October 29th, 2009 at 21:32

Very nice example, I like how you explained things simply and easy.
Do you know of a way to do twitter searches that go back more than 7 days?

David Pardo

October 30th, 2009 at 11:17

Thank you, plotti. Twitter has stated that they’re keeping the tweets, but they’re not available yet over the API. I’ll try to keep you informed if that changes.

Magda

November 30th, 2009 at 06:58

Wow. This is possibly the clearest explanation of streaming API for Twitter on the web.

I’m trying to decipher how to do this in real-time in Flash, but I might just do it your way.

Thank you.

Jed Herzog

January 26th, 2010 at 19:07

This is the best tutorial on the Twitter Streaming API for PHP out there. Awesome job. I was a little disappointed to get to the end, have working code, and then find out that you don’t think this is ready for production. What do you think it would take to get his code ready for production? What are the issues? Have you looked at phirehose before?

I am trying to use the twitter streaming API for a personal project. If you think you could help me for a reasonable amount of compensation please contact me.

David Pardo

January 26th, 2010 at 19:14

Hi Jed,
I don’t think it’s ready for production since it doesn’t take care of disconnections/reconnections, neither can update the stream for new filter words. Anyway, I’ve already used it with minor modifications for some customers. I’ve been trying phirehose lately and looks great, but can’t vouch for it yet.
If you want me to help you with a project, drop me a line to david@corunet.com with your idea and I will try to send you a budget.

Dan Goodwin

February 28th, 2010 at 22:24

That is a very helpful, very well written article. Thanks for taking the time to write it up and share.

Ispe

April 26th, 2010 at 10:31

David
Thank you sooooooo much – You rock

arsyan

August 18th, 2010 at 20:32

Hi there, great article, i refer to your code a lot. Now im trying to figure out one thing, instead of the new tweet appears at the bottom and the top vanishes, how do i reverse that? mening to say newer tweets will show at top and the bottom one vanishes like tweetdec style.

I tried renaming first to last last to first but doesnt seem to work. Is it more complicated to start from top of issit very simple that i couldnt see it yet.

If u can kindly guide on how to reverse the tweets would be great. Thanks!

A.G.

September 24th, 2010 at 11:49

Hi there,

“Text: This is the twetter status. 140 chars max”
Thats just what the user is allowed to enter. If you take a look at the message, it contains of a lot more characters. That’s because links etc. are send as html.
I’m not at home so I cant give you the max number of what I have counted so far :(

JS

December 21st, 2010 at 06:35

Great tutorial! I have one question: in logic.js, when calling renderTweet(), you pass the function a second argument, ‘hidden’. I see what it achieves by inspecting the DOM in Firebug, but how does this take effect in your code? it’s a neat trick, but could you explain it? Thanks!

aci cartagena

January 31st, 2011 at 13:58

used your code as part of a mashup, between twitter and google maps. thankyou! will credit you in the about page.

cmaciasg

February 23rd, 2011 at 21:37

Hi David,
I’m glad to see that nice things get made in Spain too. I’ve been looking for information about Google’s Streaming API and this is one of the best tutorials I’ve found, congratulations.
One question: can we do something similar without having to store the data in a database? That is, show it on the web as it arrives from cURL?
I understand that in this case PHP wouldn’t make sense, right? I understand that PHP isn’t something that stays “open” executing something; would it be more of a JavaScript thing, or can you think of something?
Many thanks and congratulations again.

Rico

June 25th, 2011 at 20:24

Hi
Thx for your amazing tutorial…
Actually when I launch your script nothing appears… (mySQL OK, Server OK..)

Has it had some change on twitter API side or the script should still work?
Thx for your help
Rico

Peter Richardson

August 1st, 2011 at 23:41

Quick note to Chrome users: you may have to modify storing_tweets_in_the_database.php to see the results of the flush(). Here’s my solution:
http://stackoverflow.com/questions/6001628/php-flush-not-working-in-chrome/6904643#6904643

Steve

September 2nd, 2011 at 14:37

Fantastic, well written post about connecting to Twitter API. I have spent the afternoon playing around with this and I’ve learned a lot. No problems so far. Thank you.

Steve

September 2nd, 2011 at 14:39

Also, a question – how can I get the frontend to keep going continuously even if i’m not running storing_tweets_in_the_database.php ?

Rico

October 3rd, 2011 at 12:19

Hi all !
thx for this amazing post !

It worked until last Friday, because of the SSL only access to the stream of twitter API…. (here : https://dev.twitter.com/blog/streaming-api-turning-ssl-only-september-29th)

Do you have a simple solution for non expert in SSL developer in php ? to make this code working again ?

Thx

Rico

Javier Jaramillo

October 12th, 2011 at 23:09

Rico: as the dev.twitter.com URL says, you just have to change the ‘http’ by ‘https’.
If it ain’t working yet (as happened to me yesterday on this windows machine), then its because of configuration/setup.
Check that the: php_openssl extension is loaded and that you set allow_url_fopen = On in your php.ini
That should do it.
Good luck.

Al

October 15th, 2011 at 12:55

Great article!

One question. What syntax should I use to combine predicates eg. add a track and a location? I’m trying things like… ‘track=’twitter&locations=360,180′ but not having success.

Thanks.
Al

Felipe Signorini

October 31st, 2011 at 21:03

Awesome, this tutorial is pratice and good. Really a like to see de continuos posts, phirehose….

Jose Manuel

November 13th, 2011 at 23:57

Brilliant, very interesting and easy! My admiration for sharing it.

Rico, to make it work with https, change in file watch.php, line 15:

$instream = fopen(‘http:

for

$instream = fopen(‘https:

and you are set!

cheers

Twitter Real Time Search « thewayofcode

November 30th, 2011 at 10:35

[...] Blog post that really helped me with the javascript part of the development: http://blog.corunet.com/twitter-alerts-using-twitter-streaming-api/ [...]

Pzelnip

December 14th, 2011 at 22:19

Perhaps I’m mistaken, but isn’t storing tweets into a database a violation of the Terms of Service (TOS) for the streaming API?

https://dev.twitter.com/terms/api-terms

Reads:

“You may export or extract non-programmatic, GUI-driven Twitter Content as a PDF or spreadsheet by using “save as” or similar functionality. Exporting Twitter Content to a datastore as a service or other cloud based service, however, is not permitted”

Seems to me that a SQL db would constitute a “datastore”, and that this datastore is used to drive a “service”.

David Pardo

December 14th, 2011 at 23:07

Pzelnip, you’re right. Seems that lately tweeter frowns upon storing the tweets in a database, at least as as permanent store. Anyway, when I first contacted them about two years ago, they told me that the technique described in the article was ok, or at least no reason for a ban. So, I guess that the spirit of the norm is that they don’t want people making tools to analyze past tweets using streaming API.
Thank you so much for your comment.

Mark

December 19th, 2011 at 15:56

hi! I tried to run your code but I’m getting this error:
Warning: fopen(http://…@stream.twitter.com/1/statuses/filter.json) [function.fopen]: failed to open stream: No connection could be made because the target machine actively refused it. in E:\wamp\www\twitter\app\demo\twitter_watch\watch.php on line 17

what could be the possible cause of this?

Michael

February 1st, 2012 at 23:37

Is it possible to increase the 60 second timeout on the watch.php file? I let it run, but get a Fatal error: Maximum execution time of 60 seconds exceeded in …/twitterwatch/watch.php on line 27

David Pardo

February 2nd, 2012 at 00:34

David Pardo

February 2nd, 2012 at 00:35

Mark, remember to use https instead of http

Jake

February 10th, 2012 at 04:42

So I’ve got everything working great, only problem is I can’t figure out how to stop to the watch/stream. I had to actually change the configuration files to a invalid username and password, then kill the process on the mySQL server.

Thanks for the help.

Simon

February 26th, 2012 at 04:13

I did exactly as the instructions said and downloaded the code. But, I find that it is not working.

Alberto

March 9th, 2012 at 18:03

Why you didn’t use the Consumer key, Consumer secret, Access token and Access token secret?

Thank you

David Pardo

April 11th, 2012 at 01:13

Hi Alberto,
The code predates the new access guides for a couple of years… XD

Bruce Markey

October 20th, 2012 at 20:06

Thanks for a great article. Using it for a little pet project just for fun. One question. I’m noticing some oddities with what is returned when I search for keyword x but when I search for the same keyword on twitter i get different results. Some are the same but not all.

Is there an easy to push the output from watch.php to the screen just to debug. Like I said everything works, just with some oddities.

Thanks much.

BEN

December 29th, 2012 at 16:17

To Bruce Markey:
Stream API only gives you 1/20 of matching tweets from Twitter.

Sylvain

December 30th, 2012 at 19:11

Hey

Thanks for this great article. Sadly, i couldn’t get the code to work, is it still possible to use this method considering the recent changes in the Streaming API (oauth …)? If so, what should be changed in order for it to work ?

THanks in advance

Sylvain

Barun Saha

February 21st, 2013 at 15:19

Nice tutorial! I liked the way you have presented the code snippets. It helps one to read the logic coherently without bothering about the implementation details.
