Category Archives: General

Duplicate Values in Binary Trees

Every once in a while I get an email from a reader pointing out a possible error in the book. There are definitely some errors in there. But sometimes they aren’t errors, but I treat them as such because there’s obviously something that needs to be clarified.

Here’s a recent email from a reader (hi Mark!) that warrants extra discussion:

Another possible error I spotted is that binary search trees usually do not contain duplicate nodes, so the sentences at the top of page 64 should say that the nodes to the left are all less than the nodes own value and the nodes to the right are all greater than the nodes own value. But, if duplicate nodes were allowed, then either the left or the right would have the “or equal to” but not both (the current wording has both sides having “or equal to.”.

It’s definitely true that most binary search trees (BSTs) you see in examples don’t contain nodes with duplicate keys. However, that doesn’t mean it isn’t possible. If you can only store one value in a node and you have multiple values that map to the same key then you’ll need multiple nodes with duplicate keys to represent all those values. (In real life, a BST will have a value associated with it, say a pointer to a structure. BST examples in books and interviews won’t add the extra complexity of value management, but it’s there in practice.)

What Mark is objecting to is the “less than or equal to” and “greater than or equal to” wording when describing how the nodes of a BST relate to their parents. He’s asserting that either:

  1. The left child must be less than or equal to the parent and the right child must be greater than the parent; or
  2. The left child must be less than the parent and the right child is greater than or equal to the parent.

Enforcing either statement will still give you a BST. But it will make it impossible to create a balanced tree. (Recall that in a balanced tree the difference in heights of the left and right subtrees of a node is either 0 or 1.)

Let’s say you have a small tree of 7 nodes, each with value “5”. Create a tree from them using either of Mark’s rules and what do you end up with? A tree with nothing but left or right children — effectively a linked list. The tree has a height of 7.

If you relax the rules to allow for “less than or equal to” and “great than or equal to” then you can create a balanced tree of height 3. Balanced trees are great because you can search them in O(log n) time.

Of course, all of this hinges on allowing duplicate keys in a BST. If you don’t allow duplicate keys then the point is moot since “less than or equal to” and “greater than or equal to” devolve to “less than” and “greater than”. This is the likeliest case, in fact, since most tree implementations I’ve seen treat the keys as a set, not just a collection.

Anyhow, obviously something that could use some clarification in the book. Thanks Mark!

Feel free to send questions and comments about anything you’ve read in the book to Sometimes I’m slow with my mail, so if you don’t get a response after a couple of days, resend it to remind me to reply…

The T-Shirt Signal

Occasionally I’m asked what’s the biggest difference between working at Google and working for my previous employer. There are some obvious answers like better compensation, better perks, more opportunity, smarter colleagues, impact-based promotion system, etc. But really it all boils down to the additions to my wardrobe. I’m talking t-shirts, folks!

It seems silly, but the number and variety of t-shirts you get as a software engineer is a strong signal as to what kind of company you’re working for. New t-shirts can mean:

  • The company has money to spare. (When it doesn’t, “frivolous” expenditures on things like t-shirts are the first to get chopped.)
  • The company values your team. (You get t-shirts designed specifically for your project and/or the members of your team.)
  • The company values its products. (You get t-shirts when a major new product or product update launches.)
  • The company supports its employees. (You get t-shirts for initiatives around women in computing, LGBT issues, diversity, etc.)
  • The company is a fun place to work. (You get silly t-shirts for silly reasons.)

It’s quite amazing how excited software engineers get about t-shirts. T-shirt distribution at our office is often a frenzied affair. It’s a simple and inexpensive perk that makes those software engineers feel just a little bit better about the work they do and the company they work for.