I was at a bookstore last night looking to learn more about learning itself. I was horrified by the state of things. I figured that a massive Barnes & Noble, which has entire aisles devoted to focused niches such as Manga or Personal Fitness, should have an entire area for Education – strange that I didn’t remember seeing it before, since I’m in bookstores quite often. I mean… it’s education; a bookstore should be just the place to find this info.

After poking around for a while, I had to ask for help. I was finally shown the single “Education” rack (there are about 5 racks on each side of a single aisle). I pored through the titles on the shelf and there wasn’t a single volume on pedagogy (roughly: the study of teaching) or epistemology (the study of how humans learn). All of the books I found were basically just mis-classified and belonged in the “teaching” section next to this rack. There were nice little memoirs about a teacher’s first day on the job, a book about the challenges of being an administrator in a poorly-funded inner-city school… but no actual scientific studies of how to teach people or how people learn.

Something is terribly wrong here.

Quick Tip: file_put_contents() permissions in PHP

Another quick tip that will hopefully be helpful to someone. When using file_put_contents() on a directory that has full write permissions, you may still get “Permission denied” errors. It turns out that in PHP (at least in some cases – I’m not sure if it’s always), you also need execute permissions on the destination directories (chmod them to 777) in order to be able to write to them using file_put_contents().
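
For illustration, here’s a minimal PHP sketch of the workaround (the directory and file names are just placeholders):

    <?php
    // Hypothetical paths, for illustration only.
    $dir  = '/var/www/uploads';
    $file = $dir . '/output.txt';

    // Directories need the execute bit (not just write) for PHP to
    // traverse into them; 0777 is the permissive setting mentioned above.
    if (!is_writable($dir) || !is_executable($dir)) {
        chmod($dir, 0777);
    }

    if (file_put_contents($file, "hello\n") === false) {
        echo "Write failed; check the directory's permissions.\n";
    }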

Open Letter to LiveJournal – Please protect my password :(

Dear LiveJournal,

It would appear that you are storing an md5 hash of each user’s password in your database. Although I certainly could be wrong, I have reason to believe this is your method (see below), and it worries me greatly. I am concerned that my password and the passwords of all other LiveJournal users are highly vulnerable to attack. In this day and age, this method is almost no different than simply storing my password in plaintext.

To reiterate what many of you probably know, the original purpose of storing an MD5 hash over plaintext is that the passwords would ideally be unrecoverable even in the event that an attacker was able to obtain a copy of your database. This security is needed because such attacks do happen successfully even to companies that take network security seriously.

With advances in rainbow-table attacks (e.g. “Rainbow Crack”), it is conceivable that any attacker capable of obtaining your database would have no trouble whatsoever converting all of these hashes back to their original passwords in a short amount of time with even very basic computing equipment.

I appreciate your efforts to not send passwords in cleartext even on non-encrypted connections. That is above and beyond the usual call of duty; however, the storage method is antiquated and no longer safe.

It would be rude for me to bring up the problem and just leave you hanging, so I will humbly make a recommendation: generate a random salt (with mt_rand(), not rand()) for each user, then hash the salt and password together using a more secure hashing method such as bcrypt. I will include references that explain the reasons for each of these choices.
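
As a rough sketch of what I mean in PHP (assuming CRYPT_BLOWFISH is available on the system; the cost factor and helper names here are just examples, not a definitive implementation):

    <?php
    // Build a bcrypt salt string from mt_rand(); '10' is an example cost factor.
    function make_salt() {
        $alphabet = './0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
        $salt = '';
        for ($i = 0; $i < 22; $i++) {
            $salt .= $alphabet[mt_rand(0, strlen($alphabet) - 1)];
        }
        return '$2a$10$' . $salt;
    }

    // The salt is stored as part of the resulting hash, so no separate column is needed.
    function hash_password($password) {
        return crypt($password, make_salt());
    }

    function check_password($password, $storedHash) {
        // crypt() re-uses the salt embedded in the stored hash.
        return crypt($password, $storedHash) === $storedHash;
    }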

LiveJournal has always been very proactive in adopting or even creating new technology to take care of serious issues like openness, scalability, and even security. I realize that many other large sites may be guilty of this oversight also, but that doesn’t make your users any safer. Please address this issue as soon as humanly possible. I – and, I’m sure, others – would be more than willing to provide more info if that would help you make the conversion even faster.

If I was wrong about how you are storing passwords, please correct me so that I can clear the air (and apologize profusely).

Thank you for your time,
– Sean Colombo

PS: Thanks for memcached, it makes running my own sites much more cost-effective.

EXTRA INFO:

  • Everyone is doing it… but they’re doing it wrong.
  • Since LJ uses PHP, please generate the random salts with mt_rand() instead of rand(). I don’t mean to patronize you if you already know about mt_rand; I’m just trying to cover all bases here.
  • More info than you’d ever want to know about securely storing passwords
  • A really solid implementation of using Rainbow Tables to crack md5 hashes: Ophcrack.
  • For the curious: my indication that the passwords are stored as a simple md5 hash comes from the code used to hash the password before sending it to the LiveJournal login code. It’s extremely nice that they do this, but its aim is to protect against packet-sniffing to find your password. At the same time, a site like LiveJournal has a nice juicy database full of millions of tasty passwords… enough to entice an attacker to steal the whole thing and compromise millions of identities instead of victimizing individual users, thus creating a much bigger problem.

    LJ sends out a ‘challenge’ with the login form. This challenge (chal) is combined with your password (pass) as follows before being sent (in ‘res’) to LiveJournal:
    var res = MD5(chal + MD5(pass));
    What this implies is that LJ has an md5 hash of each user’s password, which they then combine with the challenge and compare against your response. This is a good zero-knowledge proof that you know your password (or at least its md5 hash); a sketch of the implied server-side check follows. This “extra security,” while well-intentioned, actually means that an attacker could log into your LiveJournal account using your hash even before cracking it… but this is a very small problem, since the main reason we care about the way a password is stored is that you probably also use your password for other (possibly sensitive) accounts (such as your online banking, PayPal, etc.).
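
    For clarity, here’s what that implied server-side check would look like in PHP (this is my inference, not LJ’s actual code):

        <?php
        // The server never needs the plaintext password,
        // only the stored MD5 of it.
        function verify_login($chal, $res, $storedMd5OfPassword) {
            return md5($chal . $storedMd5OfPassword) === $res;
        }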

UPDATE: I thought I’d wait to get some verification that I was right that they store the passwords like this before bugging LJ, but I’d want someone to report things like this to me ASAP if one of my sites had a problem… so I sent this along to them now (as a support ticket on LiveJournal). I’ll update if they get back to me.

UPDATE: Remembered LiveJournal is open source… started browsing the code. Found out it’s perl, not PHP (oops).

UPDATE: This keeps getting worse. It turns out they store the passwords as plaintext! See the comments below for more info.

Freedom! – Opening the Social Graph and all of its data

Braveheart battle-cry

There has been tons of buzz lately over the “Social Graph”: an atrocious misnomer (I won’t get into why) which is used by Mark Zuckerberg to mean “the data which represents all of a user’s social connections”. Facebook is getting a $10 billion to $15 billion valuation because they “own” this graph, and the entire world of developers is supposed to be forced to bow and write all future social web-applications as Facebook apps.

While I would still consider Facebook a decent investment at this point because they have this data locked down, I cannot support this tyranny. It is not only intuitive, but now also the general internet consensus, that users own their own data.

So what on earth are we to do? Free the data! Brad Fitzpatrick of LiveJournal/OpenID fame and David Recordon, who worked with Brad on OpenID, stirred up this whole movement in Brad’s widely cited braindump. They laid the groundwork for a decentralized set of tools that use microformats and clever spidering to figure out a user’s ownership of several accounts from a single place and calculate all of their friendships to find missing links. A missing link would be, for example, if you have someone in your Gmail contact list and as a Facebook friend, but you don’t follow their Twitter account.

Subsequently, both of these hackers have built code and convinced their companies to open their data and have made announcements to that effect – Brad at Google and David at Six Apart.

I’ve been involved in the conversation a bit, and as I’ve mentioned before, I think that not just friendships but all the rest of a user’s data is equally important, and they need to own that too.

Right now, a user’s data is spread throughout many silos: their photos on Flickr, their blog posts on WordPress, etc. This is a major limitation and is starting to get on people’s nerves. As of right now, there is no übersilo where a user can sync up their info and social connections.

The solution? A commercial but completely open site which lets a user aggregate all of their friendship data AND all of their other data (photos, videos, blog posts, tweets, bookmarks, etc.). This data can then be pushed to other networks on the user’s demand. Furthermore, the user can export all of this data in a standard format and just up and leave the site if they don’t like how they’re being treated. Beyond that, new social applications will be able to implement an API that can pull the user’s data down for them (only with their permission, of course).

Side note: I bounced this idea off of Brad Fitzpatrick who said I should “go for it”… there really is no conflict of interest in being a commercial site in an open endeavor.

This solution would have to exhibit several traits:

  • No compliance required – to be useful, this tool has to work with the most popular networks, even before they explicitly open their data through APIs. Since users are accessing their own data, this doesn’t violate ethics or terms of service… it just takes more code to accomplish this.
  • Extensibility – it has to be easy to add an arbitrary number of new networks, even if the site doesn’t yet have any idea what those networks are. Likewise, it has to be equally easy to add new types of data. For instance, tweets were once a new concept… the system has to be able to sync up with entirely new types of data seamlessly.
  • Portability – it’s the problem we’re here to solve, so obviously this tool can’t lock down the data. It has to go to absurd lengths to make sure the data can be moved around easily.
  • Clarity – everyday users don’t know what all this “social graph”, “XFN”, “FOAF”, “microformat” talk is. The tool has to be extremely easy to comprehend for all users, not just über-geeks and technocrats.
  • Privacy & Control – the user has to be the one in control of the data. Not the tool… not the social networks accessing this übersilo… the user. They have to control what goes where, and they need to be able to easily control how this data will be accessed on other sites.

Sounds pretty sweet, huh? Well I’m not one to sit back and watch an important fight from the sidelines… I’m going to have to do something about this.

PNG Compression

Today I had a set of fairly sizable PNGs to compress, and I decided that now would be as good a time as any to benchmark a few of the PNG compression tools available out there.

The files I used were a set of 7 annotated screenshots that were intended to be used on the web… I don’t know the originating program (although I know it was on Mac OS X).

The various compression methods used were Pngcrush (a command-line utility), OptiPNG (a drag-and-drop utility), The GIMP (an image-editing program), and (oddly enough) Microsoft Paint for Windows Vista. For The GIMP & Paint, compression was achieved by just opening the file and then saving over it (doing a “Save As” in The GIMP and just a “Save” in Paint).

Long story short: OptiPNG is the best, but Paint had the exact same compression level. Here’s where it gets weird… to get the same compression out of Paint that OptiPNG achieved, you need to open and save the file 3 times. I do not know why.

THE STATS:
[Compression method]: [total size of all seven files after compression]

  • Uncompressed Files: 265k
  • The GIMP: 235k
  • Pngcrush: 230k
  • OptiPNG: 229k
  • Paint (one compression): 254k
  • Paint (two compressions): 233k
  • Paint (three compressions): 229k
  • Paint (four compressions): 229k

MS Paint must have undergone some serious changes to its PNG support for the Vista version. Even though it matches the best compression, I’m still fairly baffled as to why this takes three rounds.

In the end, OptiPNG is the winner in my semi-scientific test. If I had a larger data-set, this would be a little more valid. If people find this useful and are still curious, let me know and I’ll run the same tests on a much larger dataset.
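
If you want to reproduce the test yourself, a quick PHP script along these lines would do it (it assumes the optipng command-line tool is on your PATH; the directory is a placeholder):

    <?php
    // Run OptiPNG over each file in place, then total up the resulting sizes.
    $total = 0;
    foreach (glob('screenshots/*.png') as $file) {
        exec('optipng ' . escapeshellarg($file));
        clearstatcache();  // filesize() caches results, so reset it
        $total += filesize($file);
    }
    printf("Total size after compression: %dk\n", round($total / 1024));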

Hope this helps!

Quick Tip: Restart after installing ImageMagick

ImageMagick is a very useful suite of command-line tools for modifying images. I’ve used it in the past to do several things including the creation of thumbnails for doItLater.com (NSFW… it’s like CollegeHumor.com) and now I’m using it for another project. After installing ImageMagick on my Windows XP SP2 development box running Apache 2 and PHP 5, I realized that I could get convert, mogrify, and other ImageMagick tools to work just fine from the command line but not from PHP using exec().

I spent a lot of time debugging this, trying to get return values from the exec() call. After a lot of wasted time, and trips to several forums, I found, buried at the very bottom of a long forum discussion, that in the end all that fixed it was restarting. No amount of calls to system(), popen(), or setting of environment variables had helped before that. I restarted… now it works for me too.

So to sum it up: if you just installed ImageMagick and you can’t get it to work from PHP using exec() or something similar, restart your computer and it will probably work.
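
For reference, here’s a minimal sketch of the kind of call that was failing for me (the file names are placeholders); redirecting stderr makes the failures much easier to debug:

    <?php
    $output = array();
    $returnVal = 1;
    // Redirect stderr to stdout so error messages show up in $output.
    exec('convert input.jpg -thumbnail 100x100 thumb.jpg 2>&1', $output, $returnVal);
    if ($returnVal !== 0) {
        echo "convert failed:\n" . implode("\n", $output);
    }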

iHateCSS: “That’s it guys, I’m out.” – Token

Last night during a long coding session, I stumbled upon an old site of mine. It was beautiful. Everything looked exactly how it should to be the most usable, useful, beautiful application possible. This was before I got suckered into divs and validation by the standards-Gestapo. I was never given a direct or logical answer from the masses as to why I should be using standards-compliant code when the browsers themselves aren’t standards-compliant, when a great deal of things can only be done well in a non-standard way, and when a great many more will take hours to do the standard way and mere seconds to just work. “But everyone is doing it!” I figured that this meant that maybe the people I talked to just didn’t have the low-down, but that there really were advantages, and that some sort of payoff would come out of all of this time spent making sure I complied with standards and used the horribly-implemented divs with as-poorly-observed CSS instead of the beloved tables of old.
It’s been over a year now, and I still haven’t once said to myself “wow, I’m really glad I made that site valid” for any of my projects. Then it occurred to me… there is no payoff coming! Making your code standards-compliant and cross-browser now isn’t going to help you port to new browsers (since every page out there that is cross-browser now would have to be re-written for a browser that actually works how it is supposed to). This is just the typical human response of trying to create conformity. It isn’t helping fix the problems with browsers; it’s just giving coders another thing to waste their time on.

That beautiful page of mine from before used tables anywhere that made sense given what tables do. Now this may not be “semantically optimal”… but divs just don’t work like they’re supposed to. I’ve been using them for a LONG time now and have gotten to know them quite well inside and out. The truth is, their behavior is implemented so wildly differently that the same code will do drastically different things in many browsers. ‘Table’ is not the right word for the content much of the time, but tables are a lot closer to working than divs are. Divs just DO NOT FAIL GRACEFULLY, which is a major flaw in any programming system. And I’m not just talking about the borders/floating/wrapping issues that we are all familiar with… divs will make your browser act like a jittery crack-addict.

As an example, I’ll mention two bugs off the top of my head (that will probably get their own articles later): the FireFox heuristic machine and the IE ghost-footer.

  • In FireFox (yes, even 1.5.0.1, etc.), I have a fully valid page on Projectories with a two-column UL (the items wrap automatically… according to standards, anyway) and sometimes it will render with randomly-different wrapping. You heard me right… the EXACT SAME HTML will randomly render different ways. So this two-column list will have maybe one item in the first row, 2 in the second, 2 in the third, 1 in the fourth, etc. There is no reasoning with FireFox on this matter. Another coder and I have revisited this bug many times and have been unable to convince Fox to listen. Thanks, Fox, I didn’t need a “computer” anyway… a heuristic machine was probably all I needed. Oh, I forgot… FireFox is open source, so it’s probably my own fault since I haven’t fixed their bug myself, right? My bad.
  • In Internet Explorer, there is a page (on an as-yet-unreleased site) with a simple footer like you’d see on many pages. The main layout of the page is done with divs in a completely-valid, cross-browser way. In IE 6 and IE 7, however, there is one of the most insane and unacceptable bugs I have ever seen a browser pull. It seriously writes a random sub-string of the footer… again. Allow me to clarify: the last item in the footer is “Privacy”, and the page will sometimes display an extra “y” on a separate line (also properly linked, as if it were the same link as before), and it will sometimes display “vacy”, or “Privacy”, etc. Now the most obvious thought when seeing this would be “wow, I must have really messed up outputting that footer”. I checked though, then I double-checked. I rubbed my eyes and triple-checked. I had my co-worker check it, then we stared, confused, and checked it again. We looked at the source in 2 different browsers (because possibly the “view source” command could be messed up?) and we could come to only one conclusion: IE was written by handicapable children who don’t speak English and have bad eyesight. The source code NEVER showed these ghost-messages. The source code is very simple at that footer, so it’s not hard to verify. I know this sounds unbelievable, so this will get its own entry sometime, with the server and client code, an example page to reproduce it, and screenshots of what was rendered.

It all just got to be too much. I realized that making some standards-crusaders and maybe Tim Berners-Lee (whom I now hate) happy just is not worth the expense of the additional features, portability, and design that I could be giving to my users (whom I love). So that’s it. No more mister nice-tool. I’m gonna go make a table. :-P

FIQL.com using LyricWiki.org as lyrics source

FIQL.com has just completed their beta and released the full version of their site. Included on their playlist pages is now a link to visit LyricWiki.org for the lyrics if that song is already on our site. The way they determine whether the song exists is by using the API that was created at their urging, which is being expanded into a full webservice (still under construction) at the urging of a plugin developer who is working on plugins for media players (WMP, iTunes, WinAmp, etc.) that use this SOAP webservice. It’s exciting that the API is already being used, and it’s not even technically out yet.
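
As a hypothetical example of what a client call might look like in PHP (the WSDL location and method name here are my guesses for illustration, not confirmed details of the webservice):

    <?php
    // Hypothetical endpoint and method name, for illustration only.
    $client = new SoapClient('http://lyricwiki.org/server.php?wsdl');
    $exists = $client->checkSongExists('Tool', 'Schism');
    echo $exists ? "Song is on LyricWiki\n" : "Song not found\n";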

On another exciting note, I’ll be at Wikimania 2006 this weekend to promote LyricWiki and to learn more about the community. If you’re going to the conference, look for me… I’ll be wearing this shirt.

Lastly, we’ve been contacted by a ticket-sales website that wants to offer tickets through links similar to the Amazon links currently on the site. This is a welcome change, because the site is draining money fast, and I’d much prefer to put targeted links on individual pages where they are relevant than to slap some Google AdWords on the pages (which may end up happening eventually). As an example, you wouldn’t be bothered by links everywhere, but if you happened to be on the Tool page and Tool was on tour, you’d have a link to find those tickets you couldn’t seem to get your hands on. More on this to come!

New Site – LyricWiki.org

LyricWiki.org is my (proposed) solution to those annoying lyrics sites with banner ads and pop-ups everywhere. It uses the same MediaWiki software used to run Wikipedia with a few modifications.

Some notable things about the site:

  • I wrote bots to grab reliable lyrics from the internet and add more than 200,000 songs to start the site off
  • There are FireFox and Netscape search plugins available that can be installed in one click from the side menu
  • Wiki format will allow new songs to be added very quickly and old songs to be corrected by the community until they are super-reliable
  • I actually released the site yesterday and it had the biggest first day of any site I’ve made to date (yes, it even beat ChuckNorrisIsGod.com)

If you’re into music at all, give the site a gander. It’s free and has no banners (my company, Motive Force LLC, is going to absorb the costs), so check it out. If you really enjoy it, maybe you can contribute some song lyrics from your favorite bands (hint hint). Enjoy!

New site – JackBauerIsGod.com

JackBauerIsGod.com is a simple “Jack Bauer Facts” site. I bought the domain a while ago and tonight I finally got around to modifying the ChuckNorrisIsGod.com code and posting it.

Despite Fox’s blasphemy-bordering murder of Tony Almeida this season, every American that doesn’t have the Cordilla Virus, radiation poisoning, or convulsions from Sentox gas owes their life to Jack Bauer.

Here is a taste of what’s on the site:

Jack Bauer once forgot where he put his keys. He then spent the next half-hour torturing himself until he gave up the location of the keys.

and another one of my favorites:

Jack Bauer killed so many terrorists that at one point, the #5 CIA Most Wanted fugitive was an 18-year-old teenager in Malaysia who downloaded the movie Dodgeball.

Even though I have a full-blown Web 2.0 (AJAX and such) project-management web-application in the works, and another extremely useful site (see next post) that I just made, if the Chuck Norris site’s traffic is any indication, JackBauerIsGod will end up being more popular than either one! Enjoy.