Sunday, December 6, 2009

Yii Framework: choosing the best web application framework

After experimenting with a few different programming languages and even more frameworks, I've settled upon a PHP framework called the Yii Framework. This post describes how I arrived at Yii and what other options I considered before making my decision.

Several years ago I wrote a functioning, but ugly and difficult to modify, web application. The code that's in production right now works well, as I spent a lot of time making sure it was solid and had good cross-browser support. However, there are several new features I'd like to add and some issues I'd like to address before it takes off and makes me a millionaire (we all can dream, no?). But the convoluted and frustrating mess of spaghetti code in my original implementation has stalled these ambitions.

For the past few years I have been too busy with work and other projects, and this one has regretfully stagnated. From time to time I felt the itch and would set out to find the perfect language + framework in which to build my application. Following is a rough chronology of my various attempts to rewrite and extend my application using "the best" web application framework.

Attempt #1: PHP + Zend Framework beta

Having experience with PHP, and comforted by the fact that Zend Framework is built by the people who built PHP, my initial foray with ZF seemed like an good idea despite its pre-1.0 status.

The main problem I ran into with ZF was my own ignorance.

  1. I wanted my hand held, but Zend Framework was more of a toolkit than an MVC environment.
  2. This was my first experience with a sophisticated web application framework. Conceptually I thought I understood the Model-View-Controller pattern, but I'd never used one before and didn't fully grasp its nuances or the best practices.
  3. Despite having a few years of experience with Java, I'd only programmed strictly procedural PHP up to that point. In retrospect this really wasn't a major hurdle, but coupled with the other issues it was a contributing factor.

The quick start guide was not nearly as good as it is today, and getting everything set up was frustrating. I thought I was missing the how-to guide that would explain how to get everything the way I wanted and send me on my way. I didn't really understand what a framework was or what ZF offered.

Attempt #2: PHP + CakePHP

After being frustrated by the learning curve and the lack of guidance with Zend Framework I stumbled upon CakePHP and got excited once again. It offered a structured, this-is-how-you-do-it, approach to building web applications.

The lack of documentation in some key areas is what eventually drove me away from CakePHP. I spent many hours trying to get some forms to behave exactly how I wanted them to, and after several days without much to show for it coupled with a few ugly "hacks" in place to get the behavior I wanted, I didn't feel like things were going the way I wanted. Things slowly fizzled out as I became overwhelmed by other projects, and the rewrite halted.

Attempt #3: Ruby on Rails

While considering job prospects after my imminent graduation from college, I briefly got caught up in the Ruby on Rails phenomenon. I ordered a book, built a simple application, and was thoroughly blown away. Everything just clicked. The hype was justified and my only regret was not having taken a look at RoR sooner.

After a while, however, I ran into some issues that nagged me too much to continue on. Ruby on Rails was at first refreshing in its "one way -- the right way" approach to designing web applications. After the agnosticism of Zend Framework I was initially receptive to Rails.

But the considerable, backwards-incompatible differences between the dated version of Rails used in the books I purchased, and the more recent versions, caused several headaches. I spent hours trying to figure out how to do what I considered to be very simple things for common use cases. When I did find the answers I was usually surprised by their simplicity, but the hunt for them was the challenge, and the fact that I was learning Ruby at the same time made progress inch ahead at a snail's pace.

I think I could have been successful with Ruby on Rails if I was more patient at the time, but it required more of an investment than I was willing to give at the time. I did enjoy learning the basics of Ruby, but I was disappointed to discover that it was not my silver bullet.

Other concerns were limited web hosting support, more complicated deployment steps, and various claims of limited scalability. At the time, the first was (and may still be?) a legitimate concern. The other two may not have been issues so much as excuses I deferred to as I shifted my time to other pursuits and stopped working on the rewrite.

I realized at this point that programming frameworks have an up-front investment in time that cannot be ignored. One cannot simply port their code to a framework without learning and accepting the various quirks of the framework (and language). As I only had a small amount of time available to invest and I was looking for immediate results, I shelved the project yet again.

Attempt #4: Python + Django

Quite a bit of time passed since the previous attempt. After hearing about python for years I finally decided to see what it was all about one evening, and found myself glued to the very well-written introductory Python tutorial. After working in Java professionally for quite some time, learning python was really... fun. It seemed to have the interesting superpowers of those languages you learn in college but never use professionally, but with a large user-base, excellent documentation, few technical limitations, wide support and an inexplicable "wow, that's cool" factor.

I took a look at the small selection of viable Python web application frameworks and after much fretting over cutting edge vs. conservative options, settled upon Django. It was really quite easy to get started with but I eventually (inevitably) realized once again that I was starting down a path where I needed to be serious and willing to invest a lot of time if I was going to be successful. I didn't have the time to invest, so things stalled yet again.

Attempt #5: PHP + Yii Framework

After gaining professional experience in a variety of frameworks in a couple different languages, and feeling more disciplined and confident that I can master a chosen language/framework, I have selected the Yii PHP Framework. I make this choice after much research and contemplation, not because I believe Yii is the best framework out there, but because my initial experimentation has been quite enjoyable and when I attempted to solve the issues I faced in previous attempts I found that Yii offered simple, documented solutions.

To be honest, once I settled on PHP as the language this was really a toss-up between Yii and Zend Framework. I investigated ZF's progress since my initial attempt and found that it has really become a lot more solid and full-featured than it was a few years ago. If I had more experience with Python I'd probably be seriously considering that option, but, alas, I'm eager to get something finished quickly and I know I have the skills in PHP to be up and running quickly.

The main negative for Yii is the fact that it is relatively new, and is still technically in beta. Having kept an eye on it for quite some time now, seeing it implemented successfully by several companies, and knowing how brilliant the lead programmer is all give me some comfort that it is a legitimate contender. The other factor that pushed me to Yii is that I can still use Zend Framework components inside my Yii application if I choose.

Why not Code Igniter, newer versions of CakePHP, Symfony, or any of the other myriad of PHP frameworks out there? I can't claim to have fully vetted all options because I simply don't have the time. I took a look at documentation, examples, the size and vibrancy of communities, and even many flame wars between the fanatics of each, and felt that Yii was the best fit for my purposes.

P.S. Please do not take offense to my personal experiences if I had problems with your framework of choice. I know there are experts who can bend each of these frameworks to their will to accomplish almost any task, and I'm absolutely not saying that any of these frameworks is the best choice, or not a good choice, for anybody. These are simply my experiences with the version, documentation and time constraints I faced at the time I tried each of them out. I would enjoy going back and trying out some of these frameworks a second time to see how they have improved since I gave them a spin.

Sunday, November 22, 2009

Writing a recursive Java puzzle solver

A couple years ago I wrote a puzzle solver in Java. The puzzle in question goes by many names, and you can play various implementations of it on several different social networking sites and elsewhere. It probably began as a board game but I'm not familiar with its name. I think this is a nice, simple example of how to use a recursive algorithm to solve a problem using brute force.

I do not condone cheating and I ask that you not use my code for that purpose.

In order to "solve" a puzzle one must know what a solved puzzle looks like. The puzzle is a grid of letters, typically with dimensions between 4x4 and 5x5, but that is not a fast rule (our solver will allow the dimensions to be configured at runtime). Typically there is a time limit to find as many words as possible in the grid, after which point the round ends and all players' scores are tallied.

Here's an example 4x4 game grid (the numbers are just row/column labels and are not part of the board itself):

 1 2 3 4
[A E K C] 1
[M R F S] 2
[B D C I] 3
[L P V R] 4

The challenge is to find as many words as possible, using each space in the grid only once per word, and building words by hopping from space to contiguous space. The player can start at any place on the grid, and a contiguous space includes horizontal, vertical and diagonal moves. For example, in the grid above one word we could build is 'RISK' -- It starts on (r4, c4): R -> (r3, c4): I -> (r2, c4): S -> (r1, c3): K.

The first question we need to answer when designing our solver: what is a valid word? I know 'Risk' is a word, but how would the computer know that? The answer is, we need a dictionary of valid words which the program can refer to when solving the puzzle. This dictionary of allowed words varies in practice, but so long as our dictionary matches the game provider's dictionary, everything will work out nicely. I will not be providing a word list, as the actual words are irrelevant to the algorithm.

Our program will assume the dictionary is a flat text file with one word per line. It need not be alphabetical or sorted in any way.

Here's an example of how our file might look:

---------------- BEGIN FILE --------------------
----------------- END FILE ---------------------

Now that we know the rules for solving the puzzle and we have a list of words which are valid, we can write our solver!

I will break our solution into three different files/classes:

  • (This will be an in-memory representation of the game board)
  • (This will be the in-memory dictionary)
  • (This is where the algorithm itself will be implemented, and will also contain the main method)

First I'll start with the Dictionary. This is the data structure where we'll store the list of valid words. The reason we store them here rather than just access them directly from the text file above, is speed. If we had to open the flat file and search the entire list of words every time we were checking if a word was valid, the program would run extremely slowly.

Our goal here is to make word lookups very speedy since we'll be doing a lot of them every time we solve a puzzle. The data structure we use here will either make or break our implementation.

Here's the Dictionary class:

import java.util.HashSet;

 * @author Ryan LaHue
 * This is our data structure to hold the dictionary of valid words.

public class Dictionary extends HashSet {
 public Dictionary(String filename) throws FileNotFoundException, IOException {
 public void buildDictionary(String textFile) throws FileNotFoundException, IOException {
    //Takes in the name of a file containing words -- one per line.
    //Builds the dictionary in memory from this file.
    BufferedReader reader = new BufferedReader(new FileReader(textFile));
    String word = reader.readLine();

  while(word != null) {
    word = word.toLowerCase();
    word = reader.readLine();

As you can see, the Dictionary is a subclass of HashSet, which I picked as a reasonable data structure to hold the dictionary words. HashSet builds on top of the hash table data structure. Our Dictionary doesn't take long to build, has very quick access times (very important), and doesn't allow duplicates.

Here's the Board class:

 * @author Ryan LaHue
 * This is a representation of the puzzle board.
 * It is a grid consisting of rows and columns, with one letter per slot.

public class Board {
 private String [][] board;
 private int rows;
 private int cols;
 private BufferedReader keyboard;
 public Board() {
  keyboard = new BufferedReader(new InputStreamReader(;
 public int getRows() {
  return rows;
 public int getCols() {
  return cols;
 public String getLetterAt(int column, int row) {
  return board[column][row];
 public void buildBoard() {
  //Prompts the user for the dimensions and letters on the board
  rows = iQuery("Enter the number of rows: ");
  cols = iQuery("Enter the number of columns: ");

  board = new String[cols][rows];
  for(int i=0;i<rows;i++) {
   String rawRow = sQuery("Enter the first row: ");
   rawRow = rawRow.replace(" ","");
   for(int j=0;j<rawRow.length();j++) {
    //Convert each row of letters into an array
    board[i][j] = String.valueOf(rawRow.charAt(j));
 public void printBoard() {
  //Prints a representation of the board entered by the user
  System.out.println("---The Board---");
  for(int i=0;i<cols;i++)
   for(int j=0;j<rows;j++) {
    System.out.print( board[i][j] + " ");
 private String sQuery(String query) {
  //Returns a String typed by the user
  String line;
  try {
   line = keyboard.readLine();
  }catch(Exception e) {
   System.out.println("Invalid entry");
   return sQuery(query); //Keep prompting until a string is entered 
  return line;
 private int iQuery(String query) {
  //Returns an int typed by the user
  int answer = 0;
  try {
   answer = Integer.parseInt(keyboard.readLine());
  }catch(Exception e) {
   System.out.println("Invalid entry");
   return iQuery(query); //Keep prompting until a valid number is entered 
  return answer;

The Board class is the puzzle board. It also has some methods which are used to construct the board from user input.

Here's the Solver class:
import java.util.HashSet;
import java.util.Set;

 * @author Ryan LaHue
 * This class contains the main method, which takes as an argument the dictionary filename
 * After the dictionary is created it prompts the user for the board dimensions and board letters.
 * After the board is set up, it prints out unique words as it finds them while traversing the board recursively.
 * The rules and parameters:
 * 1) Build words between 3 and 9 letters long
 * 2) Use each slot on the board once per word
 * 3) Build words by moving to a letter in a slot adjacent to the last letter added to the current word.
public class Solver {
 private Dictionary dictionary;
 private Board board;
 private static final int MAX_WORD_LENGTH = 9;
 private static final int MIN_WORD_LENGTH = 3;
 private Set foundWords = new HashSet();
  private int numIterations = 0;
 public static void main(String[] args) throws Exception {
    Solver solver = new Solver();
    solver.dictionary = new Dictionary(args[0]); //args[0] = File containing all valid words
    solver.board = new Board();
 public void solve() {
  //This method solves the puzzle, utilizing the SBoard and SDictionary in memory

  for(int i=0;i<board.getCols();i++) {
   for(int j=0;j<board.getRows();j++) {
        //For each space on the board

        //Create a temporary map of the board; each space defaults to false (i.e. unused)
    boolean memoryBoard [][] = new boolean [board.getCols()][board.getRows()];

    //Solve the game, attempting to use each slot as the first letter
    solve("", memoryBoard, i, j);
  System.out.println("Number of iterations: " + numIterations);
 public void solve(String currentWord, boolean memoryBoard[][], int currentColumn, int currentRow) {

  //Immediately back up the memoryBoard and pass a new one in recursion;
  //Otherwise multiple recursions will be changing the same memoryBoard
  boolean [][] newMemoryBoard = new boolean [board.getCols()][board.getRows()];
  for(int i=0;i<board.getCols();i++) {
   for(int j=0;j<board.getRows();j++) {
    newMemoryBoard[i][j] = memoryBoard[i][j];

    //Exit if the potential word grows too long
  if(currentWord.length() == MAX_WORD_LENGTH) {

    if(!spaceExists(currentRow, currentColumn) || memoryBoard[currentColumn][currentRow]) {
      //If this space does not exist or it has already been used, end this recursion

    //Mark this space as used
    newMemoryBoard[currentColumn][currentRow] = true;

  //Add this slot's letter to the pre-existing word
  currentWord = currentWord + board.getLetterAt(currentColumn, currentRow);
  if(dictionary.contains(currentWord) && currentWord.length() >= MIN_WORD_LENGTH) {
   if(foundWords.add(currentWord)) {
        //Print this word out since it hasn't previously been discovered
  //Try building on this word by using every neighboring letter
  solve(currentWord, newMemoryBoard, currentColumn - 1, currentRow - 1); //lower-left
  solve(currentWord, newMemoryBoard, currentColumn, currentRow - 1); //left
  solve(currentWord, newMemoryBoard, currentColumn + 1, currentRow - 1); //upper-left
  solve(currentWord, newMemoryBoard, currentColumn - 1, currentRow); //below
  solve(currentWord, newMemoryBoard, currentColumn + 1, currentRow); //right
  solve(currentWord, newMemoryBoard, currentColumn - 1, currentRow + 1); //lower-right
  solve(currentWord, newMemoryBoard, currentColumn, currentRow + 1); //above
  solve(currentWord, newMemoryBoard, currentColumn + 1, currentRow + 1); //upper-right

  private boolean spaceExists (int row, int col) {
    //Determines whether this row/column exists on the board
    if(row < 0
    || col < 0
    || row >= board.getRows()
    || col >= board.getCols()) {
   //No such space exists
   return false;
    return true;

The Solver class is where all the logic exists. It's all built around the solve method, which is called recursively to exhaust all possibilities as it looks for words.

The limit to word length (set to 3-9 letters in the code) is useful for larger boards. If we didn't cap it at a reasonable length then boards as small as 5x5 would lead to a phenomenal number of recursions. On my computer, with a 4x4 board and no cap, it took about a minute to solve the puzzle with 96237136 recursions. A 5x5 board went several minutes before I stopped it. I do not know the formula to calculate recursions based on a grid of size (X,Y), but it appears to be exponential.

The word cap is a simple heuristic I implemented to stop recursions that are unlikely to reveal undiscovered words. While it means we'll miss any word longer than 9 letters, for larger boards it dramatically increases the speed with which words are discovered. The fact that our program misses 10-letter and longer words means that it doesn't guarantee complete solutions for boards larger than 3x3.

Is this a good solution? That depends on your definition. If you're a cheater who is trying to score the most points, this is a perfectly good solution as it starts printing out words almost immediately and fairly quickly (depending on letter distribution, of course). If you need every matching word in a puzzle, and your dictionary contains words 10 letters in length or longer, then it is a bad solution with the cap; but if you remove the cap then it will find these long words.

The trade-off here is speed vs completeness. Memory is a third sliding variable, which could be a factor if this needed to run in a low-resource environment.

Can we improve upon this solution? Absolutely. If memory was not a concern then we could store every substring of every dictionary word, in our dictionary (along with a flag stating whether the word is a substring or a full word). At each point if the word we're building isn't found in the dictionary as either a full word or a substring, we can stop that recursion short.

Another possibility would be to explicitly search for each word in the dictionary rather than to let the puzzle drive the search. This would probably call for a different data structure.

Is there a perfect solution that improves in the areas of speed, completeness and memory? Almost certainly. Is there a "best" solution that wins against all other solutions in all three areas? I doubt it, as these things tend to be trade offs.

Setting up SyntaxHighlighter version 2.x on Blogger

I ran into a couple snags when setting up SyntaxHighlighter on Blogger. You should be aware that there are major changes to the SyntaxHighlighter API between version 1.x and version 2.x. Since many of the tutorials floating around do not specify which version they're for, it can be a little frustrating to set up.

  1. I downloaded version 2 from the author's website.
  2. Followed the instructions here:
  3. Done!

And here's what it looks like:

echo 'This is a test!';

P.S. One thing to keep in mind when following the instructions is that if you turn off Blogger's automatic line break conversion (line breaks become <br/>), your existing posts will lose their line breaks. I had to go back and put explicit <p> and </p> tags around each of my paragraphs in my posts.

Saturday, November 14, 2009

From Java to PHP: Observations

When I changed jobs a couple months ago I moved from a Java/Oracle environment to a PHP/Oracle environment. While I use Java for a variety of tasks, only one of which is creating web applications, I'll try to keep the comparisons inside the common domain that PHP and Java share. Following are some of the contrasts I find notable.

A little background

I've been creating websites as a hobby (non-professionally) for about six years. I taught myself the most basic PHP and MySQL skills required to create database-driven websites, and wrote a lot of procedural code to enable fairly simple CRUD web applications. During this period prior to my formal computer science education I was motivated by entrepreneurial ideas rather than an interest in programming itself. As a result, my designs were (mostly) functional but not terribly pretty, efficient or maintainable. The only DRY code I wrote at the time was the ubiquitous header & footer includes.

Formally I learned object-oriented programming in Java, and after graduation a couple years ago I joined the aforementioned Java/Oracle outfit. This was an invaluable first job after graduation where I was exposed to some of the requisite basics and standbys in the corporate Java/J2EE world: Spring, Struts, Hibernate. I was introduced to invaluable utilities and libraries that many Java developers depend on such as log4j, jdom, ant, jsch and a variety of other mature open source libraries that make our lives simpler and more productive.


In my travels on the web I have discovered that there are few absolutes when it comes to the limitations and capabilities of any one programming language. Sure, certain languages have traits which lend themselves to more easily solving certain problems, but in general most languages are capable of manipulating the file system, interacting with databases and, in general, doing what needs to be done. It might not be feasible to write a Web 2.0 blogging web application in any number of languages; but then again, I'm almost certain it's possible in any language.

Many impressive projects out there stand on their own feet as first class citizens in a programmer's toolbox rather than as secondary or subservient to the languages in which they are implemented. Take Ant, for instance. While it is certainly not the first build tool created, it took on a life of its own outside the Java-as-language world and is used to build projects in many other other languages.

In the end, arguments about the superiority of one language over another (of which there are an over abundance) in my experience never yield winners. Ruby [on Rails] vs PHP tends to be a favorite comparison ever since Rails became popular a few years back. I will hold my tongue in that comparison as I have not used Ruby or Rails enough to have a good assessment of it.

So, after that long winded introduction, having done my best to convince you of my semi-impartiality and my rejection of the idea of loving a language for its own sake, I'll get on with the Java and PHP comparisons.

Compiled vs. Interpreted

I've never taken too much much stock in the performance comparisons between [semi-]compiled languages and interpreted languages for web use. In my experience these are not considerations that come into play for most web applications. Clearly one should not set out to implement an application in an interpreted language if it requires extreme execution speeds; but when it comes to most web applications I've found that causes for excessive response times are caused by poor design decisions rather than the language interpreter or compiler.

Java has that extra step between writing code and running code -- compilation from source code to byte code. When working in the Java world I really disliked the way this step interrupted the cycle of writing code and testing changes. It's not uncommon to make and test code changes many dozens of times per working day, so the break in concentration is not the only concern here: often times I wondered how much time all those "several seconds" of downtime at compilation added up to at the end of the month.

PHP does not have that extra step, so testing changes to web applications is a bit more snappy. Ctrl + S and you're ready to run. What you save is what gets executed, so you don't have to deal with issues where your IDE caches and older version of what you were working on. The only potential downside to this simplified cycle is that at times it makes me feel less thoughtful about the changes I'm making, since I know I'm just a refresh button away from knowing if my loop counter should have started at 1 instead of 0. Sometimes it's quicker to debug a problem by fixing one error and then retrying rather than reading through a block of code and understanding everything that's going on and diagnosing the problem logically, and I think this is not a good thing because it lends itself to missing edge cases. This is a personal flaw that I have been working to correct, rather than a reflection on the language itself. I think interpreted languages like PHP win on this front.

Built-in Data Structures

With Java I spend a lot of time considering which data structures to use, as there are many to choose from in the standard library and many additional opportunities to configure or modify those that exist. Anonymous classes in Java make it simple to create one-off variants of classes without having to clutter up the source code with more boilerplate code. Even with all the complaints, I found myself regularly using and enjoying many parts of the Java Collections Framework.

With PHP you're given the wondrous, do-all, be-all: array. Thus far I've managed to get by surprisingly well with the available data types, and I think this stems from the statelessness of PHP and most web applications -- you're generally not going to read thousands of records into memory in a single request. And once you've got your records, how often do you need to manipulate them or access them in a fashion more complicated than simple linear iteration? You rarely read in more than a hundred records into memory at a time, so collections and arrays tend to act as mere temporary buckets for presorted database records.

Types vs. No Types

This is one place where I have a strong preference for Java over PHP. I find that being explicit about variable types reduces runtime exceptions significantly, leads to more self-documenting code, and makes basic manipulations of data less verbose. With PHP I find myself cluttering up my code with is_integer, gettype, ===, and other checks to ensure I have the correct data type, rather than handling this at very specific places (Integer.parseInt("3"), for example) and having a guarantee about the data type in my variables.

The benefit of the typeless variable, presumably, is to make things simpler. I find it does anything but that. If I want to be absolutely sure variable a holds the same value as variable b, I have to be cognizant of the possibility that they're not even of the same type.

I really dislike typeless method signatures. A lot. I want to know exactly what data type a function is returning. I want to know what type its parameters should be. I want to be able to overload methods with different parameter type combos. I know there is some support type hinting, but if I have a setter method I want it to only accept booleans and I want it to only return void. In PHP I can pass "3" into this setter and it will happily chug along. This is truly my biggest gripe and frustration about PHP. One has to be ever so careful, especially in situations where the difference between "0" and 0 and false, or 3 and "3" and true, is critically important.


In Java, scope is quite explicit. If you're accessing a variable myVar inside of a class method, you know exactly what you're dealing with. In PHP, things can bleed into places you might not expect them to. The most infamous of which is the now-deprecated/ removed register globals. This is just something one has to be aware of, and can be dealt with effectively if you structure things with scope leak in mind. My preference is to encapsulate everything possible and only let each block of code have access to variables/function in the same domain.

Language Evolution

PHP has evolved, or perhaps become more developed, considerably since it entered wide use. The jump from version 4 to 5 gave us a more complete object-oriented environment and other much-needed features. While Java still has a more robust and complete set of OO features, PHP continues to add cool new features like closures.

Java has some features that I really miss. Enumerations, for instance, are something I originally looked at as overkill when they were added to Java, but I've come to rely on them heavily. Sure, it's possible to create a poor man's enumeration in PHP, but then you add even more code to your project to act as low level language constructs, and the code around it becomes very dependent.

I don't know what's currently on the horizon for PHP or Java, as I rarely have the opportunity to work in the newest versions, but I'm sure both have some interesting things in the works.


The above are just my initial observations after diving head first into PHP5 after a couple years in the Java world. I still have a soft spot in my heart for Java, and will continue to use it, but am enjoying my professional PHP experience thus far.