SeanColombo.com

My little corner of the internet.

Quick 3-question SiloSync Survey

SiloSync is a sizable undertaking and there are a number of different potential places to start from. I want to make sure I have a decent idea of where the demand is, so I’ve put together a quick 3-question survey. Please take a minute to fill it out for me! (you can be anonymous if you’d like)

To answer, please just leave a comment. I’ll leave my own answers in a comment as an example.

Question 1: What would you be most anxious to use SiloSync for?
A. Syncing up data & friendships between profiles on different sites so that they are all up-to-date.
B. Backing up data (photos, etc.) & friendship connections so that they never get lost and are all in one place.
C. Changing services if one of them does something unacceptable (along the lines of the Facebook Beacon debacle).
D. Quickly joining new services w/o the trouble of re-finding everyone and re-typing everything.

For this, please just type all of the letters you are interested in, ordered from highest priority to lowest.

Question 2: What services would you most like to be able to pull data into SiloSync from?
(examples: do you want to pull your data from Facebook, Flickr, LiveJournal, WordPress, Twitter, MySpace?)
Again, please type the most-desired first.

Question 3: What services are most important to export data to?
(examples: do you want to send-data-to/sync-data-with Facebook, LinkedIn, Twitter, or a bunch of new and exciting social networks we don’t know about yet?)

Hopefully that was quick! Thanks for taking some time to help me out :)

Quick Tip: Delete old log-files if you use MySQL replication

There was a bit of a mess over on LyricWiki off and on for a few days. The culprit was a known bug in MySQL which messes up master-slave replication if you run out of hard-disk space (which you eventually will if you’re using master-slave replication and never clean up the binary logs).

The preventative solution is to set up a daily cron-job which finds out which binary log the slave is currently using and deletes all of the binary log-files that are older than that file. The alternative is an immense pile of unneeded files which will eventually cause you to run out of space and completely break your replication. To give you some idea, LyricWiki (which has hundreds of times as many reads as writes) filled up 100 GB with log-files in about 2.5 months.
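To make the idea concrete, here is a minimal Python sketch of the selection logic only. It assumes binary logs carry a numeric suffix (for example "mysql-bin.000042") and that you have already looked up the file each slave is reading, e.g. the Master_Log_File column of SHOW SLAVE STATUS. The function names are hypothetical and made up for illustration, and nothing here connects to a database or touches the disk.

```python
import re


def binlog_sequence(name):
    """Return the numeric suffix of a binary log name such as 'mysql-bin.000042'."""
    match = re.fullmatch(r".+\.(\d+)", name)
    if match is None:
        raise ValueError(f"not a binary log name: {name!r}")
    return int(match.group(1))


def logs_safe_to_delete(all_logs, slave_current_logs):
    """List the logs that are strictly older than the oldest log any slave still reads.

    all_logs           -- every binary log name present on the master (assumed input)
    slave_current_logs -- the log each slave is currently reading (assumed input)
    """
    if not slave_current_logs:
        # Without knowing where the slaves are, the only safe choice is to delete nothing.
        return []
    oldest_in_use = min(binlog_sequence(name) for name in slave_current_logs)
    older = [name for name in all_logs if binlog_sequence(name) < oldest_in_use]
    return sorted(older, key=binlog_sequence)


def purge_statement(slave_current_logs):
    """Build a PURGE BINARY LOGS statement that keeps the oldest log still in use."""
    if not slave_current_logs:
        raise ValueError("need at least one slave position to purge safely")
    target = min(slave_current_logs, key=binlog_sequence)
    # PURGE BINARY LOGS TO removes every log listed before the named one.
    return f"PURGE BINARY LOGS TO '{target}';"


if __name__ == "__main__":
    logs = ["mysql-bin.000001", "mysql-bin.000002", "mysql-bin.000003"]
    print(logs_safe_to_delete(logs, ["mysql-bin.000003"]))
    print(purge_statement(["mysql-bin.000003"]))
```

Letting MySQL remove the files through PURGE BINARY LOGS, rather than deleting them by hand, also keeps the server's own log index in step with what is actually on disk.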

Hope that helps!

UPDATE: I just wrote this script and figured I’d release it publicly to save others some time. You can get the code from deleteOldBinLogs.txt (that’s just a .txt so you can view the code… save it as a “.pl” file). Once you’ve filled out the “configuration” part at the top and have uploaded the file to the “/root” directory on your Master database server, add this line to your crontab file (by typing “crontab -e” into the command line):
0 4 * * * perl /root/deleteOldBinLogs.pl
That will make the script run at 4am each morning.

Pitt talk was fun

Pitt was a fun venue for the talk I gave this week on SiloSync. Their Lunch-and-Learn series is a really cool idea and sounds like it’s getting even more interesting: next month’s talk is going to be given by a VP from Sun Microsystems. Prior to presenting, I jumped back into the SiloSync code and wrote the beginnings of the importer for Facebook.

As a side-note: one of the things that’s fascinating about this project is that I get to see all of the half-implemented security that different sites use. LiveJournal has a secure way of sending passwords, but shockingly stores passwords as plain-text (a big security faux pas). Similarly, I saw some left-over fields in Facebook’s login form, but it appears that they punted and used https (a secure web connection using SSL encryption) to encrypt the whole login.

Back to the crux of this post: I’ve been rather tempted lately to actually finish SiloSync, which I had previously shelved in hopes that OpenSocial and other big-name initiatives would fix the problem (they didn’t). Google, Facebook, and MySpace have all announced fake data portability initiatives in the last week or so, which shows that if we want our data to be free, we’re going to have to take it (see my previous post on freeing the social graph for why this is important).

I decided it would be best to make a habit of posting my slide-decks when I present (I appreciate it when other people do that), so here are the PowerPoint and Open Document (Open Office) versions. In the process of making the presentation, I ended up creating a visual representation of SiloSync which I think does a great job of summing up the whole idea for someone who hasn’t been exposed to it yet. That’s the picture above and to the right… click it to see the full-size version.

Interestingly, with these effectively useless announcements from the major Social Networks, a lot of non-technical press has been declaring that data is now free. Okay, cool, let’s all go home.

Fortunately, most of the technical press is calling them on it. Everyone from TechCrunch to David Recordon (of OpenID fame) is telling it like it is.

If you are interested in seeing SiloSync pushed to fruition (more than you’re interested in seeing Motive Suggest or doItLater v2.0), let me know so that I can weigh the interest in each of the several projects competing for my time. Also, feel free to leave comments with your thoughts on the various “fake” data portability initiatives. This seems to be the topic which always gets the most vocal response on my blog.

Speaking at University of Pittsburgh, May 14th.

I’ll be speaking at the University of Pittsburgh’s School of Pharmacy (in 810B) for a “Lunch and Learn” on May 14th. The talk will be on SiloSync (which will need to be updated quite a bit before then) and will probably go into a more general discussion of Social Networking and Freeing the Social Graph during Q&A.

From what I understand, the Lunch and Learn series is mostly attended by faculty and staff, but we’ll see. The last talk was by Jesse Schell of Schell Games, so I guess I’m in good company!

Thanks for inviting me, Pitt!