Saturday, November 14, 2009

From Java to PHP: Observations

When I changed jobs a couple months ago I moved from a Java/Oracle environment to a PHP/Oracle environment. While I use Java for a variety of tasks, only one of which is creating web applications, I'll try to keep the comparisons inside the common domain that PHP and Java share. Following are some of the contrasts I find notable.

A little background

I've been creating websites as a hobby for about six years. I taught myself the most basic PHP and MySQL skills required to create database-driven websites, and wrote a lot of procedural code to enable fairly simple CRUD web applications. During this period, prior to my formal computer science education, I was motivated by entrepreneurial ideas rather than an interest in programming itself. As a result, my designs were (mostly) functional but not terribly pretty, efficient or maintainable. The only DRY code I wrote at the time was the ubiquitous header & footer includes.

Formally I learned object-oriented programming in Java, and after graduating a couple years ago I joined the aforementioned Java/Oracle outfit. This was an invaluable first job, where I was exposed to some of the requisite basics and standbys in the corporate Java/J2EE world: Spring, Struts, Hibernate. I was also introduced to utilities and libraries that many Java developers depend on, such as log4j, jdom, ant, jsch and a variety of other mature open source libraries that make our lives simpler and more productive.

Comparisons

In my travels on the web I have discovered that there are few absolutes when it comes to the limitations and capabilities of any one programming language. Sure, certain languages have traits which lend themselves to more easily solving certain problems, but most languages are capable of manipulating the file system, interacting with databases and, in general, doing what needs to be done. It might not be practical to write a Web 2.0 blogging web application in a good number of languages; but then again, I'm almost certain it's possible in any language.

Many impressive projects out there stand on their own feet as first-class citizens in a programmer's toolbox rather than as secondary or subservient to the languages in which they are implemented. Take Ant, for instance. While it is certainly not the first build tool created, it took on a life of its own outside the Java-as-language world and is used to build projects in many other languages.

In the end, arguments about the superiority of one language over another (of which there is an overabundance) in my experience never yield winners. Ruby [on Rails] vs PHP has been a favorite comparison ever since Rails became popular a few years back. I will hold my tongue in that comparison as I have not used Ruby or Rails enough to have a good assessment of it.

So, after that long-winded introduction, having done my best to convince you of my semi-impartiality and my rejection of the idea of loving a language for its own sake, I'll get on with the Java and PHP comparisons.

Compiled vs. Interpreted

I've never taken too much stock in the performance comparisons between [semi-]compiled languages and interpreted languages for web use. In my experience these are not considerations that come into play for most web applications. Clearly one should not set out to implement an application in an interpreted language if it requires extreme execution speeds; but when it comes to most web applications I've found that excessive response times are caused by poor design decisions rather than the language interpreter or compiler.

Java has that extra step between writing code and running code -- compilation from source code to byte code. When working in the Java world I really disliked the way this step interrupted the cycle of writing code and testing changes. It's not uncommon to make and test code changes many dozens of times per working day, so the break in concentration is not the only concern here: I often wondered how much time all those "several seconds" of downtime at compilation added up to at the end of the month.

PHP does not have that extra step, so testing changes to web applications is a bit more snappy. Ctrl + S and you're ready to run. What you save is what gets executed, so you don't have to deal with issues where your IDE caches an older version of what you were working on. The only potential downside to this simplified cycle is that at times it makes me feel less thoughtful about the changes I'm making, since I know I'm just a refresh button away from knowing if my loop counter should have started at 1 instead of 0. Sometimes it's quicker to debug a problem by fixing one error and retrying than by reading through a block of code, understanding everything that's going on and diagnosing the problem logically. I think this is not a good thing, because it lends itself to missing edge cases. This is a personal flaw that I have been working to correct, rather than a reflection on the language itself. I think interpreted languages like PHP win on this front.

Built-in Data Structures

With Java I spend a lot of time considering which data structures to use, as there are many to choose from in the standard library and many additional opportunities to configure or modify those that exist. Anonymous classes in Java make it simple to create one-off variants of classes without having to clutter up the source code with more boilerplate code. Even with all the complaints, I found myself regularly using and enjoying many parts of the Java Collections Framework.
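
To make the anonymous class point concrete, here is a minimal sketch of a one-off collection variant. The class name RecentItems and the helper boundedCache are made up for illustration; the only library piece relied on is LinkedHashMap and its removeEldestEntry hook from the standard library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RecentItems {
    // Hypothetical helper: returns a map that keeps only the most recently used entries.
    static <K, V> Map<K, V> boundedCache(final int maxEntries) {
        // The third constructor argument (accessOrder = true) orders entries by recency of access.
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after every put; returning true drops the least recently used entry.
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = boundedCache(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // touch "a" so that "b" becomes the eldest entry
        cache.put("c", 3);  // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

The variant lives right where it is used, with no separate named class cluttering the source tree.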

With PHP you're given the wondrous, do-all, be-all: array. Thus far I've managed to get by surprisingly well with the available data types, and I think this stems from the statelessness of PHP and most web applications -- you're generally not going to read thousands of records into memory in a single request. And once you've got your records, how often do you need to manipulate them or access them in a fashion more complicated than simple linear iteration? You rarely read more than a hundred records into memory at a time, so collections and arrays tend to act as mere temporary buckets for presorted database records.

Types vs. No Types

This is one place where I have a strong preference for Java over PHP. I find that being explicit about variable types reduces runtime exceptions significantly, leads to more self-documenting code, and makes basic manipulations of data less verbose. With PHP I find myself cluttering up my code with is_integer, gettype, ===, and other checks to ensure I have the correct data type, rather than handling this at very specific places (Integer.parseInt("3"), for example) and having a guarantee about the data type in my variables.
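
As a rough sketch of what "handling this at very specific places" can look like, the example below converts a raw request value to an int exactly once. PageRequest, fromParameter and offset are hypothetical names chosen for illustration; Integer.parseInt is the only library call.

```java
final class PageRequest {
    private final int page;

    private PageRequest(int page) {
        this.page = page;
    }

    // Hypothetical boundary method: the only place raw request text becomes a number.
    static PageRequest fromParameter(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("page is required");
        }
        try {
            return new PageRequest(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("page must be a whole number, got: " + raw, e);
        }
    }

    // Past the boundary the compiler guarantees an int, so no runtime type checks are needed.
    int offset(int pageSize) {
        return (page - 1) * pageSize;
    }

    public static void main(String[] args) {
        PageRequest request = PageRequest.fromParameter("3");
        System.out.println(request.offset(25)); // prints 50
    }
}
```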

The benefit of the typeless variable, presumably, is to make things simpler. I find it does anything but that. If I want to be absolutely sure variable a holds the same value as variable b, I have to be cognizant of the possibility that they're not even of the same type.

I really dislike typeless method signatures. A lot. I want to know exactly what data type a function is returning. I want to know what type its parameters should be. I want to be able to overload methods with different parameter type combos. I know there is some support for type hinting, but if I have a setter method I want it to only accept booleans and I want it to only return void. In PHP I can pass "3" into this setter and it will happily chug along. This is truly my biggest gripe and frustration about PHP. One has to be ever so careful, especially in situations where the difference between "0" and 0 and false, or 3 and "3" and true, is critically important.
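
For comparison, here is a small Java sketch of the setter and overloading behavior described above. The Newsletter class and the describe methods are invented for illustration only.

```java
final class Newsletter {
    private boolean subscribed;

    // Only a boolean compiles here; setSubscribed("3") or setSubscribed(3) is rejected by javac.
    void setSubscribed(boolean subscribed) {
        this.subscribed = subscribed;
    }

    boolean isSubscribed() {
        return subscribed;
    }

    // Overloads: the compiler picks the method from the static type of the argument,
    // so 3, "3" and true can never be confused with one another.
    static String describe(int value) {
        return "int " + value;
    }

    static String describe(String value) {
        return "String \"" + value + "\"";
    }

    static String describe(boolean value) {
        return "boolean " + value;
    }

    public static void main(String[] args) {
        Newsletter newsletter = new Newsletter();
        newsletter.setSubscribed(true);
        System.out.println(newsletter.isSubscribed()); // prints true
        System.out.println(describe(3));               // prints int 3
        System.out.println(describe("3"));             // prints String "3"
        System.out.println(describe(true));            // prints boolean true
    }
}
```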

Scope

In Java, scope is quite explicit. If you're accessing a variable myVar inside of a class method, you know exactly what you're dealing with. In PHP, things can bleed into places you might not expect them to. The most infamous example is the now-deprecated/removed register_globals setting. This is just something one has to be aware of, and can be dealt with effectively if you structure things with scope leak in mind. My preference is to encapsulate everything possible and only let each block of code have access to variables and functions in the same domain.
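
A tiny sketch of how unambiguous myVar is inside a Java method. ScopeDemo and its methods are hypothetical names, and shadowing a field like this is shown only to make the scoping rules visible, not as a recommended style.

```java
final class ScopeDemo {
    private int myVar = 10; // field: reachable only from inside ScopeDemo

    int addTo(int amount) {
        int myVar = amount * 2;    // local: shadows the field until the method returns
        return this.myVar + myVar; // this.myVar is the field, bare myVar is the local
    }

    int current() {
        return myVar; // no local in scope here, so this is the field
    }

    public static void main(String[] args) {
        ScopeDemo demo = new ScopeDemo();
        System.out.println(demo.addTo(5));  // prints 20
        System.out.println(demo.current()); // prints 10, the field was never modified
    }
}
```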

Language Evolution

PHP has evolved, or perhaps become more developed, considerably since it entered wide use. The jump from version 4 to 5 gave us a more complete object-oriented environment and other much-needed features. While Java still has a more robust and complete set of OO features, PHP continues to add cool new features like closures.

Java has some features that I really miss. Enumerations, for instance, are something I originally looked at as overkill when they were added to Java, but I've come to rely on them heavily. Sure, it's possible to create a poor man's enumeration in PHP, but then you add even more code to your project to act as low level language constructs, and the code around it becomes very dependent on that homegrown construct.
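
To show what I have come to rely on, here is a minimal sketch of a Java enum carrying data and behavior. OrderDemo, Status and the labels are made up for illustration; valueOf is the standard method every enum gets for free.

```java
import java.util.Locale;

final class OrderDemo {
    enum Status {
        PENDING("Awaiting payment"),
        SHIPPED("On its way"),
        DELIVERED("Delivered");

        private final String label;

        Status(String label) {
            this.label = label;
        }

        String label() {
            return label;
        }

        boolean isOpen() {
            return this != DELIVERED;
        }
    }

    // Hypothetical helper: valueOf throws IllegalArgumentException for unknown names,
    // so an invalid status can never slip through as a stray string or integer.
    static Status parse(String raw) {
        return Status.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Status status = parse("shipped");
        System.out.println(status + ": " + status.label()); // prints SHIPPED: On its way
        System.out.println(status.isOpen());                // prints true
    }
}
```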

I don't know what's currently on the horizon for PHP or Java, as I rarely have the opportunity to work in the newest versions, but I'm sure both have some interesting things in the works.

Conclusion

The above are just my initial observations after diving head first into PHP5 after a couple years in the Java world. I still have a soft spot in my heart for Java, and will continue to use it, but am enjoying my professional PHP experience thus far.
