As advertised, I’m working on a visualizer for geo-tagged tweets. My biggest hurdle at the moment, however, is not displaying the points but actually getting the data. After much complaining and gnashing of teeth over the lack of geolocation flags in the Twitter public API, my coworker Ryan Kee pointed out that the Twitter streaming API does allow filtering by location.
So my issue becomes this: how to digest a stream in Processing. If I’m not mistaken, opening the Twitter stream creates a connection that doesn’t close on its own; it just keeps dumping in data. That’s all well and good, but my understanding of Processing’s HTTP capabilities is that it needs to finish loading a response before it can read and parse it.
So, I think I need to do some backend voodoo. More specifically, I think I need to open the stream for a limited amount of time, close it, and send on the results that I received in that time.
My initial quick-and-dirty plan is to create a script that will open the stream, collect the response as plaintext, close the stream after 5 seconds, then return the collected response. This means a long response time (roughly 5-6 seconds), but it’ll get the job done and allow me to start playing with the data.
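To make the 5-second plan concrete, here is a rough sketch in Python (not the PHP I eventually expect to write). The function name `collect_for` and its `clock` parameter are my own inventions for illustration, and the code assumes the stream can be treated as an iterable of lines:

```python
import time


def collect_for(lines, seconds=5.0, clock=time.monotonic):
    """Collect lines from an open-ended stream until `seconds` have elapsed.

    `lines` is any iterable that yields one line at a time. `clock` is
    injectable only so the time limit can be tested without waiting.
    """
    deadline = clock() + seconds
    collected = []
    for line in lines:
        # The deadline is only checked when a line arrives, so a stream
        # that goes silent needs its own socket timeout to unblock us.
        if clock() >= deadline:
            break
        # Skip blank lines, which a stream may send as keep-alives.
        if line.strip():
            collected.append(line)
    return collected
```

In practice `lines` might be the response object returned by `urllib.request.urlopen(url, timeout=...)`, which can be iterated line by line; the caller would close it once `collect_for` returns and send the collected lines back as the response body. Authentication and the actual stream URL are left out here because I haven't worked those out yet.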
A better long-term solution would be to have a constantly running script, perhaps on cron, that will (at fairly short intervals) open a stream, collect the response, then write those results to a static file that can be retrieved. That way I could have a static file that always holds 15-30 seconds of tweets and is never more than 30 seconds out of date.
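One wrinkle with the static-file idea is that a client could fetch the file while it is half written. A minimal sketch of one way around that, again in Python, is to write to a temporary file and then swap it into place. The name `write_snapshot` is hypothetical, and the `0o644` permissions are an assumption about what the web server needs in order to read the file:

```python
import os
import tempfile


def write_snapshot(lines, path):
    """Replace the file at `path` with `lines` in a single rename step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.writelines(lines)
        # mkstemp creates the file readable by its owner only; loosen
        # that so a web server running as another user can read it.
        os.chmod(tmp_path, 0o644)
        # os.replace overwrites any existing file at `path`.
        os.replace(tmp_path, path)
    except BaseException:
        # Don't leave a stray temp file behind if anything went wrong.
        os.unlink(tmp_path)
        raise
```

Readers then see either the old snapshot or the new one, never a partial file. Note that cron's finest schedule is once per minute, so hitting the 30-second freshness target would probably mean a script that loops and sleeps on its own rather than relying on cron alone.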
Status:
I’m really not a backend guy, but most of my backend experience is with the Django framework. I was thinking that my quickest bet might be a little Django view to do what I wanted, but I spent most of last night fighting with the Django install on my Dreamhost server: something about “Premature end of script headers.” I did some googling, but most of the chatter that actually offered solutions referred to FastCGI, and Dreamhost has since transitioned to Passenger. In the end, since nothing really relied on the subdomain I was running Django on, I completely nuked the subdomain, then recreated it and reinstalled. After all of that, I’m no closer to my goal. It might be time to look for different hosting, but that’s a whole different discussion.
Now, I’m not much of a PHP guy either, but I think my next attempt will be to write a PHP script to execute my quick-and-dirty 5-second response plan. Need to go learn about “curl” now…