Sunday 31 December 2006

2006 Year End Thoughts

Can't believe another year has come and gone. It seems like only yesterday that I wrote the 2005 Year End Thoughts, and only a few days ago that I came back to China and started my career in Exoweb. Time really does fly when you are having fun, and I certainly have been having a great deal of fun in the last few years. Sometimes stressful and sleepless fun, but fun all the same :)

General summary of thoughts:

Things that I am happy about:

  • Exoweb
    • Teams improving, growing
    • Culture
  • Personal
    • Facing the different challenges over the last year
    • Learning to delegate

Things that need to be improved:

  • Exoweb
    • Giving back to FOSS
    • Creating a continuous learning culture

  • Personal
    • Working with people

Goals for 2007:

  • Make self (almost) redundant
  • Teach, not do

Things that I am happy about - Exoweb

Teams improving and growing - This last year has really seen a dramatic growth in the quality of the people within Exoweb. While continuously improving our HR process has helped us get some really good people, the major change has been the steady improvement of people already within Exoweb. There are various reasons for this but ultimately, I believe it is because we have finally passed a critical mass of smart people and everyone is now learning from each other. Processes/practices such as code reviews, ExoForums and blogging have helped encourage the sharing of information and it appears to be self-perpetuating. People are demanding high standards from each other and everyone brings in their own knowledge and skills. My primary concern is no longer raising standards but knowing when to step aside and let people more knowledgeable than me figure things out.

Culture - building a good company culture isn't easy and I'll be one of the first to admit that I haven't really got the faintest clue how to go about it. Culture building is all about people and people skills aren't my greatest strength. Perhaps it is because I'm so clueless about culture building that I'm amazed that Exoweb has somehow produced (possibly by accident) a relaxed, open culture that many people find really attractive and unique. It's not perfect by a long shot and there's still so much more we can improve. Yet it is quite gratifying to hear people tell me that they find the culture of Exoweb one of the key selling points of the company and that they have never worked in a better place. Not everyone feels that way of course, but enough do. Personally, I've never worked with a better bunch of people, in a more comfortable environment. I am naturally biased though.

Things that I am happy about - Personal

Facing and Surviving the Challenges of 2006 - 2004 and 2005 had different, lower level challenges for me. I joined Exoweb in 2004 as a senior team lead, so my main concern then was ensuring the success of a project with a small (5 devs total) team. The challenges then were a lot simpler! 2005 brought the challenge of maintaining quality in a team that was growing too large for me to personally review all code. 2006 was quite different - Exoweb continued to grow and the challenges that came up daily kept changing. The early part of 2006 was a battle to figure out how to scale technically. Or rather, how to ensure that the things I used to do were still being done when it was becoming obvious that one person could not possibly do it all.

The technical aspect of that problem was solved in early to mid 2006, partly through the growing abilities of the team and partly by instituting a more scalable version of code reviews (we created a cool trac plugin for this). This had the fortuitous side effect of further promoting learning from each other. With a great team that is continuously learning, most of my earlier technical challenges faded away.

Late 2006 was more an issue of scaling Project and Human Resources (HR) management responsibilities. The funny part was that as my team got better (and larger), the seniors within the team came to the conclusion that they did not want to do either PM or HR work, pushing all that to me. As a result, the workload in this aspect grew a lot faster than the team did in 2006. Surviving the PM aspect was done by the fine art of delegation (more about that later) while the HR aspect is still a serious work in progress.

Ultimately, I'm happy that I survived all the challenges of 2006. Ken in 2005, looking at these challenges, would have been quite intimidated. Perhaps it was just as well that I went into 2006 blind to the challenges ahead :). Looking back, I can certainly see many areas where I could have done things far better but at least I can say that I haven't made an absolute mess of things.

Learning to Delegate - Every book about management talks about how one needs to delegate. Yet they tend to gloss over how to delegate. One thing I learned long ago was that if you just pass a task to someone and hope that they will do it right, 90% of the time things turn out badly. It took some experimentation and trial and error, but in the end, my great lesson in 2006 was realizing that it all boiled down to figuring out who I could delegate what to. Everyone is different, with different strengths and weaknesses. The challenge was to find someone (or a combination of people) with the right strengths for the task at hand. Not a skill I am strong at (more on this later), so it was harder than it should have been. But as I write this today, I have a good team that functions very well together, with most of the critical tasks covered.

Yes, I realize this may seem blindingly obvious to some people but it wasn't that obvious to me.

Things that need to be improved - Exoweb

Giving back to FOSS - Despite being strong believers and users of Free/Open Source Software, we don't contribute back nearly enough. Sure, some of us personally have done some work in FOSS advocacy or have code contributions here and there. However, as a company, I am still quite dissatisfied with what we have contributed back. Besides contributing server space (python.cn, Beijing LUG, etc), software usage, bug reports and a few patches here and there, we have given very little back to the ecosystem that makes our business model possible. Despite having people who profess to genuinely believe in FOSS, despite having a contribute back policy and allocating a percentage of developers time to such activities, too little is contributed back. This is something that we will have to focus more on in 2007.

Creating a continuous learning culture - Possibly because most of the seniors of Exoweb possess either the Learner or Input talents (see First Break All The Rules), we tend to expect that everyone will be like us - given the opportunity, will always try to learn and improve themselves. Unfortunately, that is not really the case. Some really talented software developers don't seek out knowledge for the sake of learning but are satisfied with learning only what is required to complete their task. Or despite the best intentions, they need a little external pressure. So despite having a 10% time self-improvement/contribute back policy in Exoweb, too many people do not take advantage of this. Yet continuous learning is a vital aspect of continuously improving the abilities of the organization.

Things that need to be improved - Personal

Working with People - I've touched on this previously, but I'm basically much better at computers and hardware than people. To me, computers seem so predictable - given a fixed input, they typically produce a fixed output. Humans are so much more variable, with too many factors to consider. Yet management is about people, not about computers. According to the Peter Principle, I'm quickly rising to my level of incompetence :)

However, the level of understanding of people I'm looking for might be a bit higher than most. The ability to figure out a person's strengths and weaknesses and combine them with other people/processes that complement their strengths and compensate for their weaknesses is a very rare talent. If you look at most management practices today, they are built to solve this talent shortage. Most large organizations have processes that cater to the lowest common denominator - they allow the organization to survive mistakes made by less competent people, but they get in the way of the truly talented hitting their full potential. We sometimes call that bureaucracy.

Most managers are either poor at or unwilling to manage individuals. It is hard to manage individuals - you have to really understand your team and know how to combine them to achieve maximum results. Most prefer to assume that every human being is interchangeable, that one person can be swapped for another without problems. This only works if you are using people at the lowest of their abilities, so that the job can be done by almost anyone. It does not work when you are trying to make the most out of everyone's unique combination of strengths and talents.

This will probably be my primary challenge in 2007 - to become competent at managing individuals.

Goals for 2007

Make myself (almost) redundant - I have delegated a large amount of my work to others already. I hope to finish this in 2007. Might be a while before I can delegate all the HR management aspects, but all the technical and project management aspects should be possible within 2007. I certainly hope to organize things so that I can go on a month long vacation and no one would notice :)

Teach, not do - One thing that is really hard for me - delegating tasks to someone else instead of just rolling up my sleeves and getting it done in a couple of hours. The only problem is that there are only so many hours in a day and so many problems that need to get resolved. Making myself redundant requires that I restrain myself from digging into problems and instead focus on teaching/guiding others to take over from me. Teaching is a large investment of time - it is always faster to do it yourself than to teach. But without this investment, the organization will never scale.

2007 looks like it will be bringing quite some challenges, many in areas where my strengths do not lie. Still, I wouldn't have it any other way. Life isn't fun without challenges and I can't think of a better team of people to face the unknown with than the crew of Exoweb. Happy New Year everyone!

Thursday 28 December 2006

Time To Turn In My Geek License

Apparently, I'm spending too much time doing management and turning into a PHB. No one seems to think I'm technically competent anymore. Some recent conversations:

While training a junior project manager:

Me: "So remember, with new tickets, ask a technical senior to help you estimate how many hours are required to complete the task ..."

A few days later, during the usual morning meeting discussing what tickets to create, prioritize, etc:

Me: "... so I think we should put this task in Milestone x, priority critical, estimated time 8 hours"

Jr: "Ok, sounds good. I'll go get an estimate from senior x and take care of it."

Me: "Wha ... didn't I just give you an estimate?"

Jr: "Yes, but you said I needed an estimate from a senior ..."

While showing a relatively new developer the rankings of everyone in Exoweb:

Me: "... and here we have the mid-levels devs, split in 3 sub levels. Finally, these are the seniors ..."

Dev: "Wait, you're considered a senior?"

After relating the above two incidents to yet another developer:

Me: "... so apparently they don't think I'm a senior anymore ..."

Dev: "Well, if you don't tell them, it's not obvious ..."

If anyone wants me, I'll be out getting a haircut, buying some suits and gaining a lot of weight ...

Tuesday 5 December 2006

KISS (or why MS CS students have a bad time in interviews ...)

It's that time of the year when Master's students are hunting for jobs in China and we are flooded with resumes from students with Masters of Science in Computer Science (MScCS). We've spent the last couple of weekends interviewing the candidates that passed the initial screening interviews and it has not been pretty. In fact, it has been pretty sad.

The biggest problem we encounter with MScCS grads is made very apparent in how they approach one of our typical programming problems. This particular problem is fairly simple and, with a little bit of thought, can be solved in linear, or O(n), time (the computer performs a constant number of additional calculations for every element added to n). Most competent people will come up with an O(n^2) algorithm at first (the computer performs on the order of n additional operations for every element added to n, so the total number of calculations grows rapidly), then after a little thought, realize that there are a lot of duplicated operations, refactor them out and arrive at the linear solution.

MScCS grads are a little different - almost all the ones we have interviewed to date encountered this problem, probably came up with the O(n^2) algorithm ... then went off the deep end. They would inevitably come up with complex, fancy algorithms that utterly failed to solve the problem. These fancy algorithms would often handle the common case but fail on the boundary cases. They were fragile, easily tricked or prone to failure. When these flaws were pointed out, these candidates would come up with even more complex algorithms or add a lot more if/else checks ... resulting in even more brittle and unreliable code. None of them, unaided, could come up with the ideal solution.

Well, that's not really true. One particular candidate, after coming up with 4 complex, unworkable algorithms, finally said, "well, since you have given me such a tight time limit, I have no choice but to brute force it." He then proceeded to give the ideal solution. Linear time, handles all boundary cases. But he would rather give 4 non-working solutions than the working, "inelegant" solution.

What I find strange and rather disturbing about these interviews is that the failure seems tied to the mindset of those doing an MScCS. Bachelor level grads or people with several years of working experience rarely make this mistake. They either do it right or not at all. Yet for some strange reason, the MScCS students seem to value fancy algorithms over _working_ algorithms.

Maybe it's the MScCS curriculum here in China, which tends to focus on fancy algorithms. Maybe it's just that the MScCS students feel the need to prove that their abilities are above the norm for CS graduates. Maybe we just have incredibly bad luck with our candidates. Whatever the reason, extremely few MScCS fresh grads have passed our interviews. As you can imagine, in a production environment, we highly value working code and eschew fancy algorithms, especially algorithms that are needlessly complex. Simplicity and correctness are far more important in our craft.

The famous quote attributed to Brian Kernighan comes to mind:

"Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"

When writing algorithms, another quote from Agile development comes to mind - "do the simplest thing that can possibly work." Keep It Simple ... er, let's call it Keep It Stupidly Simple (KISS) and avoid insulting anyone :)

Wednesday 13 September 2006

Need your MacBook repaired? Wait a while ...

So after all the fun I had with the MacBook, it eventually fell victim to the usual MacBook discoloration and random shutdown problems. So did the other MacBook in Exoweb that was purchased at the same time. Last week, we sent the first of them in for repairs with Apple. Guess what the company sysadmin got back as a response from Apple?

"Apple said there's no hardware in the entire world to fix your laptop---they have to manufacture it first. "

So if you're like me and need all these issues fixed, I guess you're going to have to wait a while for the spare parts. At least for me, the random shutdown is limited to once per day, when I work the computer too hard after it has been idle for too long (either in suspend mode or just sitting there unused for several hours). Once it has shut down, pressing and holding the power button for 5 seconds restarts the computer just fine, and as long as I keep it busy for the rest of the day, the random shutdowns stay away.

Saturday 3 June 2006

Macbook Day 3

8 am saturday morning and I am up, messing around with development, writing code and tests. Big deadline coming up? No. Urgent bugfixes? No. Just testing out the MacBook as a development environment. So far, I must say I'm quite happy.

Since my projects all run on debian stable in production, my development needs are quite simple - a Unix-like environment and the usual array of FOSS tools. With a little bit of effort (about 4 hours, most spent just waiting for things to download), I managed to duplicate my production environment via darwin ports on OSX. The only thing that didn't match was the operating system - OSX does work a little differently from Debian linux.

This worked out decently. Writing code, especially when you're used to using emacs in a terminal, is the same everywhere. Running tests is usually the biggest problem. The problem I had with the old Powerbook was that it took 90 minutes to 2 hours to run the full test suite at the time, a bit too long for the write-test-commit cycle. On our server class machines (3 GHz Xeons), these tests take 20 minutes to run. Still too long (refactoring coming up shortly) but acceptable.

With the MacBook, tests take 30 minutes to run natively. Not quite a match for server class machines, but comparable to, and in some cases faster than, our development desktops. If we refactored the tests to run more in parallel, results would be better, as the current tests use only one of the MacBook's two CPUs.

However, there is one problem with running all this on OSX - it's not exactly our production environment and we have been bitten before with slight differences between behaviour on different systems. What works fine on Gentoo crashes and burns on Debian ... etc.

In comes the Parallels virtualization product, a virtual machine solution targeted for OSX on Intel. A quick download of the software, a few clicks and 30 minutes later, I had a minimal Debian stable install running. Another 30 minutes later, I had a full development environment installed and running inside the virtual machine. The virtual machine was fast, responsive and fully functional. The full suite of tests took 45 minutes to run - not zippy, but still far better than the powerbook and usable until the great test refactoring happens.

I'm really pleased with this laptop. All my needs are more than adequately met by this system for the foreseeable future. I just might need to upgrade the hard drive in a few months, as the 60G I'm starting with is a bit small. All these virtual machine images will eventually add up.

Thursday 1 June 2006

MacBook Day 1

I read PlanetPython a lot, and one of the interesting things I've noticed is the number of Mac posts I've seen there over time. Various Python personalities keep getting their latest MacBook (Pro) and talking about their great experiences with them.

Well, I gave in (really did not take much convincing) and joined the crowd. Got myself the lowest end MacBook, the 1.83 GHz model, with the lowest end specs. Then immediately maxed out the RAM to 2G. I figured I could live with the 60G hard drive (I don't actually pack very much) but I might one day need to run the Parallels virtualization software and more memory never hurts when you're trying to run multiple OSes at the same time.

Day one has been pretty good. It really didn't take all that long to get up and fully functional on the new system. All OSX systems include a nice utility to migrate your data over, but I lacked a firewire cable so I had to manually scp my home directory from my Powerbook over to the MacBook. After installing all the other apps I used regularly (Colloquy, Adium, Firefox, OpenOffice.org, darwin ports), everything else just worked like the old Powerbook, just a lot faster. It's nice how OSX puts almost all your user settings into your home directory, making moving everything really easy.

It's funny how the lowest end MacBook beats the crap out of an 8 month old Powerbook, but I guess that's just the power of the new Intel chips. Things are responding much faster now and I believe I can actually use this new system for serious development, rather than ssh'ing to servers to run CPU/IO intensive tasks. We'll see though. Ports is still building the dependencies needed to run my main project applications.

I suppose lots of other people have given their impressions on the MacBook, but my own impressions have all been really positive so far. The screen is much brighter and crisper than the old PB. I like the much larger trackpad and the new keyboard is decent, though it requires a little more force than the old PB. My fingers must have gotten weak.

Heat really doesn't appear to be that much of a problem. I tried running both CPUs at 100% for about 15 minutes and while it got warm, it was usable. In regular usage (darwin ports compiling stuff in one window, me writing in another), it gets a little warmer than the PB, but still usable on bare skin lap. Would be a lot nicer in winter than summer though.

I plan to use the system extensively for development, especially while traveling around Europe later this month. This laptop is going to get a much more intensive workout than the PB. After I found that the PB took up to 3 times longer than my regular desktop to run a full suite of tests on one of my projects, I stopped using it for actual development (was great as an emailing/document generating machine). We'll see how things turn out after darwin ports finishes compiling all the dependencies and I run the full suite of tests for the first time.

Tuesday 16 May 2006

Full Emacs Keybindings in OSX

One fun thing about reading blogs - you learn little gems that you wouldn't encounter otherwise, sometimes even when you search for it. When I got my Mac in August last year, I was thrilled to find that it supported emacs style key bindings in most Cocoa applications. However, it wasn't complete support - while all the control key bindings worked (^f, ^b, ^p, ^n, etc), the ALT/option button bindings did not. So I missed useful keys like page up (Alt-v) or forward one word (Alt-f).

I did spend a couple of hours searching on this, particularly in the help documents and knowledge base contained in the Mac and on google. No luck. I eventually resigned myself to working without the Alt keys and chugged along mostly happy. That is, until today, when I read Erica Sadun's blog on the Mac DevCenter RSS feed.

Turns out that all I really needed was this Apple Developer article on Key Bindings in OSX, including Emacs examples. It was as simple as adding my own custom definitions in ~/Library/KeyBindings/DefaultKeyBinding.dict.
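As a sketch of what that file can contain (the selector names below are standard Cocoa text-system actions, but treat this particular set of bindings as an illustrative example rather than a complete Emacs emulation):

```
{
    /* "~" means the Alt/option modifier in OSX key-binding syntax */
    "~f" = "moveWordForward:";
    "~b" = "moveWordBackward:";
    "~v" = "pageUp:";
    "~d" = "deleteWordForward:";
}
```

Save it as ~/Library/KeyBindings/DefaultKeyBinding.dict and restart the application; Cocoa apps pick it up on launch.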

The OSX key binding capability is actually quite impressive. You can even do multi-keystroke bindings, such as ^x^f (if you really miss Emacs that much).

Another day, another great functionality to rave about in OSX :)

Wednesday 29 March 2006

Interview Programming Problems

Another Saturday done, another interviewing round finished. Thought I would put down into words what I look for when reviewing the programming problems done by candidates. I don't really care if candidates know what I look for - if they can do it in an interview, they can do it in their daily work. Especially so when code reviewers are likely to be at least as watchful as I am in an interview.

As an illustration, I'm using a problem we previously used in our written tests. We replaced it recently because everyone answered it in almost the same way, making it useless as a differentiator between candidates. The problem is as follows:

Implement a function "intersection". The function takes two ASCII strings and returns an ASCII string containing only those characters that are simultaneously present in both arguments. The result should be as short as possible. For example:
intersection("abde", "bexy") may return "be" or "eb"
intersection("exoweb", "candidate") should return "e"

Almost all solutions (that work) are variants of the same form. Below was the minimum acceptable code to pass first round screening in Exoweb, in a prettied up python format:

def intersection(a, b):
    intersections = ""
    for char1 in a:
        for char2 in b:
            if char1 == char2:
                intersections = intersections + char1
    return intersections

(trivia - usage of the "in" keyword is up to 3 times faster than a str.find() call on my laptop)
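That trivia claim is easy to check with a quick micro-benchmark (absolute numbers will vary by machine and Python version; this is just a sketch of the comparison):

```python
import timeit

haystack = "abcdefghij" * 100  # 1000-char string; "z" is absent, the worst case

# Membership test using the "in" keyword
t_in = timeit.timeit('"z" in haystack', globals={"haystack": haystack}, number=100000)
# The equivalent test via str.find()
t_find = timeit.timeit('haystack.find("z") != -1', globals={"haystack": haystack}, number=100000)

print("in:  ", t_in)
print("find:", t_find)
```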

(trivia #2 - something like 50% of candidates who make it to the written test are unable to write even the above snippet. After this test, 90% of all candidates who have submitted their resume have been eliminated)

There are two problems with the code above, one pretty obvious (not reading the requirements) and the other a lot less so (a performance problem). The first is repeated characters in the returned string and the second is an algorithm that does not run in linear time.

The first problem is easy. Given "aaabbb" and "bbbccc", the algorithm above returns "bbbbbbbbb". The problem specification says "The result should be as short as possible." Failure to read the spec or forgetting to check for this is bad, but not fatal as long as one spots this quickly.

The second problem is one that less than 1% of the candidates manage to avoid - algorithmic complexity. If strings a and b are both of length n, the double for loop results in an O(n^2) algorithm. For every character added to each string, the computer ends up doing on the order of n additional comparisons. This quickly becomes impractical.

On my laptop, with a data set tweaked for the worst case scenario, I get the following execution times:

(n=1,000) 0.00372 seconds
(n=10,000) 0.29497 seconds
(n=100,000) 30.20992 seconds

For every time I increase n by an order of magnitude (*10), the execution time increases by roughly two orders of magnitude (*100). Following this progression, a value of n=1,000,000 would take around 3,000 seconds or 50 minutes!
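The blow-up is easy to reproduce. The sketch below times the naive nested-loop version on worst-case inputs (two strings with no characters in common, so the inner loop never finds a match); doubling n should roughly quadruple the elapsed time:

```python
import time

def intersection_quadratic(a, b):
    """The naive O(n^2) version: compare every character pair."""
    result = ""
    for char1 in a:
        for char2 in b:
            if char1 == char2:
                result = result + char1
    return result

for n in (500, 1000, 2000):
    a, b = "x" * n, "y" * n  # worst case: nothing in common
    start = time.time()
    intersection_quadratic(a, b)
    elapsed = time.time() - start
    print(n, elapsed)  # each doubling of n roughly quadruples elapsed
```

The same function also demonstrates the duplicates problem: given "aaabbb" and "bbbccc" it returns nine b's.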

This problem is relatively easily solved and there are multiple solutions. For those languages without rich libraries, one solution is to build a 128-element array (the problem specifies ASCII, which covers only 128 values) and to run through each string once, setting a value in the array to record that the character was found. Once complete, it's a matter of scanning all 128 elements to see which characters were found in both strings. All these operations run in linear time. This also has the advantage of ensuring that the returned result has no duplicates.
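A minimal sketch of that array approach (assuming the inputs really are 7-bit ASCII; the function name is mine, not from the original test):

```python
def intersection_flags(a, b):
    """Linear-time intersection using two 128-element flag arrays."""
    seen_a = [False] * 128
    seen_b = [False] * 128
    for char in a:
        seen_a[ord(char)] = True
    for char in b:
        seen_b[ord(char)] = True
    # A character belongs in the result only if it was seen in both strings;
    # scanning the fixed-size arrays also guarantees no duplicates.
    return ''.join(chr(i) for i in range(128) if seen_a[i] and seen_b[i])
```

For example, intersection_flags("aaabbb", "bbbccc") returns just "b".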

For languages with richer libraries or built-ins, you can also use hashed containers or even set data types. We disallow using sets in Python because it would simply be too trivial. In Python 2.4, the set data type is built in and the code would look like this:

def intersection(a, b):
    return ''.join(set(a).intersection(set(b)))

Sets in Java aren't quite so feature rich, lacking the intersection() method, so we still allow it in Java. A non-set method in Python, using just the standard built-in data types, might look like this:

def intersection(a, b):
    char_seen = {}
    intersections = {}
    for char in a:
        char_seen[char] = True

    for char in b:
        if char in char_seen:
            intersections[char] = True

    return ''.join(intersections.keys())

With an n=1,000,000 string size, this takes 1.7 seconds on my laptop, much faster than the 50 minutes expected from the inefficient O(n^2) algorithm. With an n=100,000 string size, it takes 0.18 seconds, showing the expected roughly linear scaling.

The algorithm above can certainly be optimized further, for different focus areas. Using two dictionaries does waste a bit of memory, and there are probably faster ways of doing this. There are probably readability tweaks too.

In our interviews, it does not matter if the code has flaws on the first try (it must work though), as long as the interviewee can understand the problem when pointed out and fix them. No one is perfect and mistakes are to be expected. We just try to minimize them and fix them as soon as possible.

Saturday 18 March 2006

Teaching Software Engineering

Heh. Having spent a not insignificant proportion of the last 1.5 years doing HR work, I feel a great deal of sympathy when reading about the plight of others doing HR. Some amusement too, as I recognize the problems and issues faced.

Today's fun article comes courtesy of planetpython.org, regarding teaching the Waterfall method in schools. I wince in sympathy because almost all of the people I interviewed, if they knew anything about development, knew only this method. Yet it is a method we (as in Exoweb) know doesn't work very well for us.

It's a nice, sunny Saturday afternoon so I'm too lazy to ruminate on why schools put too much emphasis on the Waterfall method and SEI methodologies, but I have been recently rambling to colleagues about a few complaints I had with my own college experience in software engineering:

  • Overly simplified
  • Short term projects
  • No Challenge

Overly Simplified

This is related to the Waterfall issue. I realize that colleges first try to teach us the basics, then try to teach us the more complex stuff. But sometimes, the basics are so overly simplified that we learn the wrong things. e.g. the Waterfall method. To me, the failure of the Waterfall method is the assumption that it is possible to get perfect requirements and that they will never change. Working life has taught me that no plan survives first contact with reality. That lesson was most painfully learned.

What is sad is that too many people I meet still stubbornly stick to what they were taught in college. I still see too many people/organizations spending months trying to gather all the requirements while competitors gain a head start by producing an imperfect but workable product. I see man-years of developer time spent haggling over little requirement details, only to find the client or market has changed requirements in the time it took for them to sort out the exact details.

Yes, requirements are important and it is the cheapest stage in the software development process to make changes. Cowboy hacking just as frequently leads to disasters. However, there is a point of diminishing returns and most people following the Waterfall process go way past this point. Agile Development offers the best middle ground that I have found to date.

So, to wrap up this section of the rant, if schools would quit simplifying stuff too much, the tragedy of the 1 year requirements gathering phase would not occur.

Short Term Projects

Almost all college projects are for the duration of a single class - a single semester of a few months in length. This means that a student typically spends an entire semester building a system that works, then forgets about it afterwards.

The problem with this approach is that, like construction, it is much easier to build a small shack than it is to build a skyscraper. If you are just slapping a few pieces of wood together to cover some random stuff in your backyard, you really aren't concerned about how good the foundation is or whether the darned thing collapses a few months later. It's not that hard to rebuild it. On the other hand, screw up the foundation of a skyscraper and very horrible things happen. As with software, those screw-ups become apparent very late, when the cost of changing things (or of failure) is very high. Yet one-semester projects mostly teach us the habits required to build small shacks.

Challenge

There is a quote from Peopleware that I enjoy about good builders:

"The minimum that will satisfy them is more or less the best quality they have achieved in the past."

This seems to be true for me (not that I consider myself a great builder) and for many great developers whom I respect. I cannot be sure it applies to everyone, but it seems true enough for most.

The problem is that most schools don't really hold their students to high standards, or even show them that such standards exist. If the "best" a student has ever produced is code that doesn't even compile (I know quite a few professors don't bother to check this), then they will always be satisfied producing crap, because they don't know any better.

I see this in some of the fresh grads I interview - they are, in theory, some of the smartest kids graduating from their college that year. They have the highest grades, they've achieved more than their peers ... they think they are kings of the world. The only problem is that, compared to the truly best in the world, they are crap. They don't automatically strive to improve their code, they use suboptimal algorithms, they miss various corner cases, and so on.

I have had classmates who graduated after 2 years of courses taught in C++, yet still did not know what a pointer is. I have interviewed candidates who graduated with a bachelor's degree in computer science but had never written a line of code in their lives. These schools do a great disservice to our profession and to society in general (i.e. think of the cost of all that crappy code out there).

I know this has been suggested before by others, but perhaps one thing that would make things better would be a minimum competency exam, administered by a certification board. Professions such as law, medicine and accounting all have professional organizations that set minimum standards and administer an exam that everyone must pass before practicing as a lawyer, doctor or certified public accountant. Perhaps we are approaching a time when software developers too must demonstrate a minimum competency before being allowed to work on things like nuclear power plant controls or medical equipment. I know I would sleep better at night knowing my pointer-incompetent classmate was not writing the code for medical equipment that would one day be used on me.

Wednesday 1 March 2006

HR at Exoweb

Greg and I got curious this morning about what our interviewees were writing about their experience on the web, and decided to do a bit of searching. That ended up with me getting curious about a batch of HR-related matters. The final result is a bunch of weird trivia:

Interview stats:

  • Distinct resumes received in February: 1308
  • Called for pre-screening interviews: 186
  • Passed pre-screening: 19
  • Job offers given: 3

Ouch. We have a huge attrition rate (only about 0.2% of applicants get offers). I will write in more detail in later blogs about the transitions from stages 1-2 and 2-3, and about what typically kills a candidate.

Other fun tidbits we found from scanning bbs posts:

"Those guys must be poor! They're sharing offices with another company! Don't work there!"

Heh. When we moved to this office in 2004, Exoweb was all of 12 people, but we found this large space to renovate into a great loft. We ended up inviting 2 other companies owned by good friends (and fellow FOSS users) to join us. Since then, all of us have at least doubled in size, filling up the entire loft space and overflowing. Although it doesn't look like it, we actually take up the entire top floor of our building, except for one stubborn company that refuses to move out and give us total control of the floor.

"They have an all you can drink policy! Bunch of drunkards!"

We have an all you can drink soft drinks benefit. But I guess it doesn't help that some of the pictures of our office posted on the web have included pictures of "herb liquor tasting party" or "empty bottles after christmas party".

Monday 20 February 2006

Circular Dependencies When Upgrading Debian Testing (Etch)

With an office of 30+ users who run debian testing on their desktops, it's no big surprise that any problem with debian testing can really come back and bite us. Recently, a few developers who had been particularly slow in upgrading hit a really bad circular dependency bug that stopped their upgrade dead in the water and prevented them from going any further. The bug in question is the initramfs-tools, kernel 2.6 and udev circular dependency.

The main problem is that udev requires a _running_ kernel >= 2.6.12 (soon to be >= 2.6.15) just to be installed. It is not enough that you are about to install the kernel; you must be running the new kernel, which means the kernel must already be installed. The kernels, on the other hand, depend on initramfs-tools ... which depends on udev. So udev will not install until you are running a kernel >= 2.6.12, but you cannot install those kernels unless udev is installed ... ouch.
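
The loop is easier to see laid out explicitly. The sketch below is a toy model of the cycle described above - it is not apt's actual resolver, and the node names are my own simplification:

```python
# Toy model of the dependency cycle -- not apt's resolver, just a sketch.
deps = {
    "udev": ["running-kernel>=2.6.12"],            # needs the new kernel *running*
    "running-kernel>=2.6.12": ["linux-image-2.6.15"],
    "linux-image-2.6.15": ["initramfs-tools"],
    "initramfs-tools": ["udev"],                   # ... which closes the cycle
}

def find_cycle(deps, start):
    """Follow dependencies from `start`; return the cycle if a node repeats."""
    seen, node = [], start
    while node not in seen:
        seen.append(node)
        nxt = deps.get(node, [])
        if not nxt:
            return None   # dead end, no cycle along this chain
        node = nxt[0]
    return seen[seen.index(node):] + [node]

print(find_cycle(deps, "udev"))
# ['udev', 'running-kernel>=2.6.12', 'linux-image-2.6.15', 'initramfs-tools', 'udev']
```

Every chain starting from udev comes back to udev, which is exactly why apt has nowhere to start.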

Those who upgraded frequently enough hit the sweet spot when the latest debian kernel was 2.6.12 but did not yet require udev, so everything could be installed just fine. It did require a reboot after installing the kernel before udev could be installed, as documented in the notes, but it was possible to continue. Those who waited too long, or who pinned their kernel to a particular version for various reasons, eventually hit this bug when they did upgrade.

In the end, a few of the developers that were not quite so familiar with debian ended up reinstalling their system from scratch (debian testing install CD drops a >= 2.6.12 kernel in right away, avoiding the problem). There is a way to break this circular dependency without reinstalling though.

The 2.6.15 kernel (and possibly earlier versions as well; I did not check) does not absolutely require initramfs-tools; it is only the default option. Running dpkg -I on a kernel package shows:

Package: linux-image-2.6.15-1-k7
Version: 2.6.15-4
Section: base
Priority: optional
Architecture: i386
Depends: module-init-tools (>= 0.9.13), initramfs-tools | yaird | linux-initramfs-tool

linux-initramfs-tool is a virtual package, so it is of no use to us here. However, yaird is also an acceptable alternative. The solution then is to install yaird first (removing initramfs-tools), then install the rest of the mess (linux-image-2.6.15-1-x, udev).

Users of kernels that are too old may still be out of luck though, as hints in the debian bug report suggest that even yaird requires a not-too-old 2.6 kernel.

Ah well. It is an unstable time again in debian testing, after the relative calm while sarge was being prepared for debian stable. There are quite a few circular dependencies now, and people are reporting problems upgrading. In some cases, those upgrading from rather old debian sarge systems to the latest testing report that their desktop environments (both gnome and kde) have become flaky. Switching to the other desktop, or purging and reinstalling the affected desktop, seems to fix things.

Amazingly though, KDE 3.5.1 has made it into debian testing a mere 20 days (or less, I only noticed it today) after its official release. Certainly not the slow debian days anymore.

Sunday 15 January 2006

Computer Science vs Software Engineering

This article, entitled Software Engineering, Not Computer Science (PDF), is probably the clearest definition of the difference between the two fields that I have seen to date. It also provides very interesting food for thought, because many of us yearn to do computer science, yet most of us are employed doing software engineering.

In a way, it is a pity that there are not more computer science jobs available. I have encountered a few really smart people who love the discipline and would no doubt advance the field of computer science if given the chance. They just made absolutely horrible software engineers, as they were not really interested in producing products - only in creating new things, no matter how unrelated or inapplicable to the task at hand.

Wednesday 11 January 2006

The Paranoid Programmer: From Junior to Mid

While chatting with a fellow developer, the question came up: "How does one go about raising one's skills?" The answer is different for every person - everyone has different talents and weaknesses and develops in different ways. At this moment, looking at the current Exoweb team, there are a few areas I would particularly emphasize:

  • Paranoia
  • Mapper vs Packer
  • Quality Plateau
  • Knowledge Portfolio Investing

This particular entry is written mostly for Exoweb developers, but any feedback, comments or suggestions are welcome. Update 2006-01-15: Changed the title. Besides the fact that I've previously written something on what makes a senior, what I've written here will only get someone up to a mid level developer in Exoweb. There are a lot more things I left out on what makes a senior, like the soft skills.

Paranoia

Paranoia is good in a developer. Or perhaps some would prefer to call it boundary checking. At any rate, it is always good for a developer to remember that Murphy's Law (anything that can go wrong _will_ go wrong) is something we encounter far too often in daily life. Once code is written and is being inspected for improvements (you do go through your code again to see if you can improve it, right?), it helps a lot if the developer considers what can go wrong and how to safeguard the code against it.

As an example, one area that developers typically forget in web programming is url encoding. For instance, some insert usernames into the url as a variable, e.g. /user/john/details or /user_details?username=john. However, they forget that usernames can often include characters that are not legal in urls, such as spaces, &, ? and others. Worse, the characters may not even be ascii; in our global environment, it is no longer uncommon to encounter unicode characters. This of course leads to much pain later. Competent developers learn from the first time they make this mistake and never repeat it. The superstars never make it in the first place.
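
A minimal sketch of the defensive habit, using Python's standard library. The url layout is the illustrative one from above, not any particular framework's, and the function name is my own:

```python
# Always encode user-supplied values before splicing them into a url.
from urllib.parse import quote

def user_details_url(username):
    # quote() percent-encodes spaces, '&', '?' and non-ascii characters
    # (as utf-8 bytes); safe="" ensures even '/' in a username is encoded.
    return "/user_details?username=" + quote(username, safe="")

print(user_details_url("john"))         # /user_details?username=john
print(user_details_url("mary & ann?"))  # /user_details?username=mary%20%26%20ann%3F
```

The paranoid habit is to run every user-supplied value through an encoder like this, even when today's data "could never" contain a funny character.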

Paranoia can only go so far - you will miss something. Fortunately, that's what code reviews, pair programming, even more paranoid seniors and users^H^H^H^H^H beta testers are for - helping you catch your errors. But it helps the user experience (and your career) a lot if you catch as many of the bugs as possible before anyone else sees them.

Mapper vs Packer

The terminology comes from the Programmer's Stone, and it refers to a mindset: are you a memorizer (one who packs information into your brain) or one who figures out the fundamental principles (maps the connections between data points)? Packers have a tough time making senior in Exoweb, because seniors are the ones who handle the newest, most unusual problems. Faced with such a problem, a packer first has to find and pack an appropriate response, which can be rather hard to do. Instead, we need people who adapt to new situations and can figure out solutions to problems. Nothing is more annoying than a person constantly bombarding you with questions that could easily be answered with a little thought.

The Quality Plateau

Yet another term from the Programmer's Stone, this time from Day 2 of the website (I consider the first two days the most valuable). I wish they had an HTML anchor for that particular section so I could link directly to it, instead of telling people to search for the heading. At any rate, the site shows how even code that is considered well written can be improved and made more readable. You may or may not agree with the style or the particular methods used, but it is an eye-opening experience - 26 lines of code reduced to 11 much more readable lines.

The Quality Plateau is not about eliminating unnecessary variables, cramming things into as small a space as possible or fighting holy wars over coding conventions. It is primarily about looking at your code and constantly finding ways to improve it. This mindset may seem expensive at first, as you spend time going over already functional code, but the long-run benefits are enormous. Each time you see a way to improve your code, you learn something new for the next bit of code you write. Over time, you get better and better, and you start producing top-notch code without much effort.
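
For a small flavor of what such an improvement pass looks like (this toy example is mine, much smaller than the one the Programmer's Stone walks through): both functions below do the same job, but the second says what it means.

```python
# Before: index-juggling and a mutable flag obscure a one-line idea.
def contains_negative_v1(numbers):
    found = False
    i = 0
    while i < len(numbers):
        if numbers[i] < 0:
            found = True
        i = i + 1
    return found

# After one improvement pass: same behaviour, but the intent is the code.
def contains_negative_v2(numbers):
    return any(n < 0 for n in numbers)
```

Neither version is wrong, but every pass like this teaches you a pattern you will reach for automatically the next time.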

Knowledge Portfolio Investing

If the Quality Plateau is about constantly improving your code, then Knowledge Portfolio Investing is about improving yourself. This particular phrase comes from The Pragmatic Programmer, a book I highly recommend. As knowledge workers whose tools are only our intelligence and a computer, our greatest value lies in what is in our heads. If we do not constantly invest in increasing that asset, we will one day find ourselves penniless in our job - we will not have the value left to justify the high salaries that we believe we deserve.

It is hard to take time out to invest in ourselves, to learn something new every day. Work is tiring, our personal lives often seem more interesting and something always seems to come up. But none of us would have gotten this far if we hadn't invested time and effort in improving ourselves. No one in Exoweb has ever studied only the bare minimum required of them in school. Work should be no exception.

This is one of the reasons why Exoweb allocates 10% of work hours to employee improvement and tries to minimize overtime - to give every one of our developers time to continue developing themselves. This is a win-win situation for all parties involved as Exoweb's time and resource investment results in more competent and skilled developers. However, this all depends on the developers actually taking advantage of this time.

Final Thoughts

There are plenty of other areas, and more will be added over time as the situation changes and other needs become more obvious. Of all of the above, the one that is ultimately most important is the last - Knowledge Portfolio Investing. In the end, a person who is constantly trying to develop themselves will learn all the other areas.

The future, our industry and our targets are always moving. We can never be satisfied with hitting all the goals we set today, because by the time we hit them, needs will have changed and new goals are needed. However, if we are at least moving in the right direction, there will be a much shorter distance to travel to the new target after we hit the old one.

Sunday 8 January 2006

Public Key Missing in Apt

Just a quick blog about some key weirdness in the debian testing apt-get setup. I am not sure exactly where the problem stems from, but since the new year started, all debian updates are signed with the 2006 gpg key, which my testing systems did not seem to have. So you would end up with this error after doing an update:

Get:1 http://box.exoweb.net testing Release.gpg [378B]
...
Fetched 2810kB in 24s (114kB/s)
Reading package lists... Done
W: GPG error: http://box.exoweb.net testing Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 010908312D230C5F

The problem is that the public key at the end is not recognized. The key management utility for apt (apt-key) did not appear to offer any simple way to download the correct key from the debian keyring, so I ended up using a bit of a kludge. These are the commands I had to run (as root):

gpg --keyserver keyring.debian.org --recv-key 2D230C5F
gpg --armor --export 2D230C5F | apt-key add -

The first line downloads the public key and adds it to the root user's list of public keys. The second line exports the key from the root user's keyring and pipes it to apt-key, which adds it to the keys apt trusts. The cleanest way to do this would probably be to use wget to fetch the actual key from its appropriate location and pipe it to apt-key (it would be a one-liner too). However, that is clunkier, since one has to look up the appropriate location of the key first. In the end, adding just one public key to root's keyring was no big deal.

Ah well, back to your regularly scheduled hacking ...