bradKELLETT

Released: Twitter Timeline Export (TweetDumpr)

Read about some changes and updates.

The general response to my hesitation on the release of my Twitter timeline export tool was that I should, indeed, release it. So I have.

The tool now carries one of the most attractive names around: TweetDumpr. With it, you can export your entire Twitter timeline to a CSV (comma-separated values) file, which can be read by any spreadsheet application. To get around the lingering privacy issues, the tool now requires you to authenticate to Twitter first, which makes sure you are only dumping your own timeline and not someone else’s.

Currently, the tool only works on public timelines, but a new version is already in the works that handles protected users. Feel free to give it a go and report back on bugs that you encounter – it is still in the early stages of development.
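
As a rough illustration of the CSV export described above, here is a minimal Python sketch. It is not TweetDumpr's actual code; the field names (`id`, `created_at`, `text`), the function name `tweets_to_csv`, and the sample records are all assumptions made up for the example.

```python
import csv
import io

# Hypothetical column layout; the real tool's columns are not documented here.
FIELDS = ["id", "created_at", "text"]


def tweets_to_csv(tweets):
    """Return CSV text with a header row followed by one row per tweet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for tweet in tweets:
        # Copy only the known fields so unexpected keys cannot break the writer.
        writer.writerow({name: tweet[name] for name in FIELDS})
    return buffer.getvalue()


# Made-up sample data, including a comma and quotes that must be escaped.
sample = [
    {"id": 1, "created_at": "2008-04-01 10:00:00", "text": "Hello, world"},
    {"id": 2, "created_at": "2008-04-01 10:05:00", "text": 'He said "hi"'},
]
csv_text = tweets_to_csv(sample)
print(csv_text)
```

Using the `csv` module rather than joining strings by hand means commas and quotation marks inside a tweet are quoted correctly, so a spreadsheet application should read each tweet back into a single cell.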

36 Responses

  1. jv says:

    tweetdumpr is excellent. any chance of adding the option to dump “with others” as well?

  2. Thom says:

    Asking for passwords is wrong! It’s rude and dangerous. You could generate a code that people have to tweet in order to authenticate their account, and/or check the web link in their profile for an open.id server and get them to authenticate with it. Otherwise, it’s cool :)

  3. Brad Kellett says:

    jv: I’m working on a few things like that, but it gets very recursive. Eventually you just end up dumping all of Twitter ;)

    Thom: Are you serious? For a start, basically every Twitter application asks for your password. Most of them store it; TweetDumpr does not. Not only that, TweetDumpr does not actually use your password for anything locally – it just sends it to Twitter, and Twitter sends back whether it is correct or not. Ever used Twitter Karma, Twhirl, or any other app built on Twitter?

  4. Will says:

    Thom: Twitter does not support Open ID, or any sort of token-based authentication (like Flickr).

    Authenticating against some random site specified in a user’s stream is of limited use, and is likely to cause more security concerns than the current approach.

    If you really don’t want to hand your password over to third party sites – just use wget or a similar tool to mirror your Twitter account.

  5. Thom says:

    @brad asking for a password is particularly bad when there really isn’t any reason for it, and as Jeremy Keith puts it, it teaches people to be phished. The other apps you mentioned need the password in order to work, it’s not great but it has to be. Twitter are working on an OAuth style API to fix this.

    @Will, I know they don’t have open id, I was suggesting that the site linked from their profile may have open id and could be used to prove that user’s authority over that twitter stream. Not a random site from their stream. The other option I suggested was that TweetDumpr generates a unique code that the person tweets, which can then be verified to tie that twitter account to a TweetDumpr account; this is the method used by technorati to prove ownership of a blog (also similar to google webmaster tools and many more).

    I appreciate it’s a load of work and I don’t honestly expect you to implement it, but I felt I needed to draw your and your readers’ attention to the problem.

  6. Brad Kellett says:

    Thom: How is it not necessary? TweetDumpr needs to verify that you are you. Twitter made an API to do exactly this (and more). Why should I not make use of said API? Linking a Twitter account to a TweetDumpr account, as you say, is completely unnecessary – mainly because TweetDumpr doesn’t have accounts. Like I said, it does not store your information at all, just provides you the CSV file for download.

  7. Thom says:

    You don’t actually need the password to access the user’s stream, you can easily strip it without. Although it’s very well intentioned to insist the user downloading the csv is the owner of that stream, I don’t think end users necessarily see it that way, eg http://twitter.com/NicoleSimon/statuses/789110074

  8. Brad Kellett says:

    Thom: From the TweetDumpr site – “Your Twitter details are required to verify that it is you dumping your timeline.” It is a privacy issue to let anyone dump your timeline, so I verify that it is you doing it.

  9. Thom says:

    I understand that, and it’s well intentioned, but some people are suspicious of it, eg the person I linked to. It would be good if people didn’t need to enter their password. Anyway, it’s still a good idea! :)

  10. Yann Leroux says:

    It would be perfect to dump a world* so we could study how it spreads in the network !

    * freudian slip : i mean, a word.

  11. Ederic says:

    Is there any way the Twitter dump could be exported to WP? :)

  12. Brad Kellett says:

    Ederic: I really have no idea why that would be necessary. Twitter does not equal blog.

  13. [...] Update: I have now released the tool. [...]

  14. [...] took all the words in all the tweets I’ve done so far on [...]

  15. Neal says:

    If this works … it’s awesome! However, I submitted my request almost an hour ago and still waiting for that email.

  16. Brad Kellett says:

    Check your spam folder, the emails get caught in that pretty regularly. If it isn’t there, try it again, and if you still don’t get anything let me know your Twitter username.

  17. Neal says:

    It worked on the second try – maybe because I didn’t use the email address associated with the account the first time. This is great – thank you!

    Unfortunately, I was hoping it would go back in time further than 10 pages (200 tweets), but it looks like that’s all twitter makes available. I wonder why they would make such a frivolous decision, but that’s for another forum.

    Thanks!

  18. Brad Kellett says:

    The 10-page limitation is a new one from Twitter, I hope to find a way around it soon.

  19. Thanks for your work! This looks like a nice tool – except why enforce passwords on public Twitter accounts? I would like to dump the msgs of several accounts of users associated with our company – and limiting the tool for someone to just dump their “own” timeline seems like constraining your tool beyond what’s needed. In fact it makes your tool less useful for someone like me and what I’m looking for.

    If someone using Twitter doesn’t want his/her tweets to be public, he/she has the option to make his/her account private and prevent public access to his/her timeline. This need not really be of any concern to a tool like this.

  20. Brad Kellett says:

    Morten: It’s a privacy thing. Just because someone makes their tweets public doesn’t mean I should make it easy for a 3rd party to download all their tweets. I’m just trying to respect the Twitter users.

  21. Brad, I understand your point. But I don’t get why you have privacy concerns with a public medium? We’re not probing inside people’s private email accounts here, but into what they themselves have posted publicly and given public access to.

  22. Brad Kellett says:

    I don’t personally have concerns over it, but the Twitter users I talked to had an issue with making it so easy to mine that data. I think there is a difference to people between posting something publicly and making it really easy to get a full archive to use however one likes.

    To me, they are the same thing, but people don’t necessarily see posting to Twitter in public as the same as posting downloadable text.

  23. Btw, the 200-tweet archive limit is no longer enforced by Twitter (apparently). My personal archive of Tweets has been released from its unwilling hostage status on Twitter’s servers, and I am grateful for a tool such as yours, which allows me to retrieve ALL my data from Twitter, in whatever form possible, to prevent this from happening again.

    See this thread for more on the archive inaccessibility problems many users have experienced and on Biz Stone’s “lifting” of this severe limitation of Twitter: http://getsatisfaction.com/twitter/topics/why_cant_i_view_all_of_my_updates?

  24. Well, for one thing it raises my suspicions when a web tool such as yours asks for my account details as well as my email address for something which is basically public data. There’s no real reason for this.

    What people use Twitter for or would or would not like others to do with their data does not concern me. If you don’t like someone accessing something you post in public, it’s really easy not to post it in public, if it means a lot to you.

    What concerns me is the usability of the tools I have at hand – what they can do for me. And when you superimpose privacy concerns into what is basically mining public data, you get in my way, even if you want to help me. Great, your tool helps me export my own stuff, which is of high priority to myself right now. I’m happy you put your tool out here, and I found it.

    But it makes your own tool less useful, because you think people shouldn’t use it in ways you think violate your privacy concerns. Twitter is, in fact, a great way to analyze markets, if you can access all that data buried in its archives. I wonder if you can see what kind of value your tool may have, if you can develop and make it accessible in ways which make it easy for users to use your tool for such purposes. But right now you’re blinded (as I see it) by privacy concerns for something which is completely publicly accessible. And if you don’t create a tool which allows for versatile mining of this data, others will. In fact, I am surprised that Twitter do not seem to see this very clearly. If they do, why would they impose obscure, randomly picked hard limits on our accessibility to the gold in our own archives?

  25. Brad Kellett says:

    To be clear, I have no privacy concerns. To me, what is public is public. I put in the password requirement based on feedback from many other Twitter users. I am trying to respect the Twitter user base and respond to their concerns.

  26. Well, now you have the sentiments of another (perhaps former) member of the Twitter user base ;-)

  27. I think I’ve now waited about 30 mins at least for an email to arrive, especially since the TweetDumpr page says it ought to take just 10 mins to dump a five figure timeline. My timeline is not nearly as big, and yet still no response after half an hour or more. How long should I expect to wait to get an email response?

  28. [...] doings and wished to export my time-line for reformatting into a calendar format. Unfortunately TweetDumpr just retrieves the list of Tweets using a single fetch request which is limited by the Twitter API [...]

  29. Adam Franco says:

    Hi Brad,

    Just wanted to let you know that you can access more than 200 Tweets if you load them by page. I’ve posted a small export script here that shows how to do this:
    http://www.adamfranco.com/archives/88

    For now this script just dumps the raw XML and is not set up as an end-user service, but it can be copied and run by anyone who wants to back up all of their time-line.

  30. Adam Franco says:

    Oops, I guess you are already doing this. My bad. :-)

  31. Adam Franco says:

    Using the script I mentioned I was able to retrieve all 34 pages of my Tweets. Maybe HTML scraping the twitter site is the only place where the page limitation exists…

  32. Tamara says:

    I can never get TweetDumpr to work. I’ve tried putting my info in several times, and I never get a response.

  33. Brad Kellett says:

    Tamara: It isn’t working at the moment due to limitations from Twitter. I’ll fix it up soon, but it won’t be able to get more than about 1500 tweets even when I do.

  34. Oscar says:

    I just got back my TweetDumpr file, but it is only a txt file, without dates associated with the tweets. Is that how this tool functions now? Does it no longer output CSV files?
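
The page-by-page loading that Adam Franco describes in comment 29 can be sketched roughly as below. This is only an illustration under stated assumptions, not his script or TweetDumpr's code: `fetch_all_pages` and `fake_fetch_page` are hypothetical names, the network call is replaced by an in-memory stand-in, and the page size of 20 is taken from the "10 pages (200 tweets)" figure in comment 17.

```python
from typing import Callable, Dict, List

Tweet = Dict[str, str]


def fetch_all_pages(fetch_page: Callable[[int], List[Tweet]],
                    max_pages: int = 100) -> List[Tweet]:
    """Collect tweets page by page until an empty page or max_pages is hit."""
    tweets: List[Tweet] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            # An empty page means there is nothing further to load.
            break
        tweets.extend(batch)
    return tweets


# Stand-in for a real request: 45 made-up tweets served 20 per page.
FAKE_TIMELINE = [{"id": str(i), "text": f"tweet {i}"} for i in range(45)]


def fake_fetch_page(page: int, per_page: int = 20) -> List[Tweet]:
    """Return one page of the fake timeline (pages are numbered from 1)."""
    start = (page - 1) * per_page
    return FAKE_TIMELINE[start:start + per_page]


all_tweets = fetch_all_pages(fake_fetch_page)
print(len(all_tweets))
```

Passing the page fetcher in as a function keeps the loop independent of any particular API, and the `max_pages` cap is a simple way to model a service-side limit like the ones discussed in the comments above.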
