Some time ago somebody stole 1 million data records from StudiVZ, the German Facebook clone. I’m not exactly sure why people call the person a hacker who stole data, because it appears he simply wrote a tool that harvested the publicly available data from StudiVZ (which everyone with an account can view).
People on StudiVZ share all their data by default—contrary to Facebook which values a person’s private data a lot more. Thus by simply opening each profile from a dummy user and processing the HTML data from StudiVZ one can extract a lot and some more information from random people who probably don’t even know about it or don’t care.. so I’m not sure about the stealing part.
Apparently there are some captcha’s when you start browsing searches beyond a few pages. I guess that is where the hacking part comes in, because getting around a captcha probably constitutes hacking—maybe?
Anyway I think part of the media coverage is a bit ridiculous because anyone can write a simple harvester in an hour or two. It took me one and half hours, so I think I’m on the safe side with this estimate and I didn’t really have a clue about this stuff before either.
Since I don’t want to “hack”, I’ve only written a very tame harvester. It connects to your personal StudiVZ account, and retrieves the name and profile ID (and thus profile URL) of all your friends in the “Meine Freunde” pages.
It could do a lot more with that like retrieving everybody’s birthday or random pictures, but I’m too lazy to code that because you use the same pattern for extracting data over and over again and it stops being interesting quite fast.
You can download the project here. It is a one file C# project. I’m releasing it under GPL (whatever).
It’s really easy to explain how it works:
The code is quite ugly. Well, it’s not production code and this is only meant as a proof of concept.
Also note that I have at most violated the AGB of StudiVZ and not committed any criminal acts and I’m not planning to sell my friend’s profile IDs or data either
Maybe someone can extend the code and make it more useful. I guess it would be fun to automatically download all your pictures (including tags) and feed them into flickr or picasa… but someone else can do that.
Cheers,
Andreas
Tags: Profiles, Regex, StudiVZ
Would be cool to see a migration tool. I’d like to have an automated search in Facebook for the friends I had in SchülerVZ, StudiVZ and all the other VZ-trademarks they have.
btw: yes.. you did clearly violate their AGBs..