Backtracking algorithms are a class of algorithms designed to enumerate over a large number of possible solutions, searching for all combinations that satisfy a certain set of constraints.
In many cases, backtracking problems are easily recognizable: whenever a problem asks for one or more possible combinations of x such that all combinations satisfy some set of constraints, then it is probably a backtracking problem.
Backtracking solutions tend to have a similar structure and the process one can use to arrive at the solutions share a similar structure as well.
This article will detail the process one can use to approach backtracking problems, from problem-description to naive solution to optimized solution. I emphasize intuitive understanding over formality and readability over clever optimization strategies, so even the "optimized" solutions presented here, although much better than the naive ones, may not be as fast as possible. I’ll run through two different backtracking problems from LeetCode to demonstrate this problem-solving process in action.
The problem-solving process for backtracking problems is as follows:
- Solve a simple version of the problem manually
- Identify end-of-path conditions & dead-ends
- Identify success-cases
- Generate a naive algorithm to explore all possible combinations
- Optimize by avoiding combinations that will lead to dead-ends
Backtracking algorithms have a recursive, as well as an iterative, component to them.
A simple backtracking problem is: “Find all permutations of elements in an array”. A permutation of a collection is a specific arrangement of all elements in that collection. A permutation of
Finding all permutations of
[4,5,6] involves finding all permutations that start with 4, then all permutations that start with 5, then all permutations that start with 6.
Finding all permutations that start with 4 involves finding all permutations that start with 4, 5 and then all permutations that start with 4, 6. If the array was
[4,5,6,7] then finding all permutations that start with 4 would involve all that start with 4, 5, all that start with 4, 6, and all that start with 4, 7.
A pattern is starting to emerge from this tedious description: For each element i in the array of length n, we are going to find all permutations of the array of remaining elements (with length n-1), append those permutations to an array in which i is at the beginning, and place that permutation in the final result.
This description screams recursion: we are iterating over a collection of elements and, for each element, we are doing the same thing with a slightly smaller data set.
Applying the Process
So, let’s look at the permutations problem on LeetCode.
Step 1: Solve it manually
The first step to approaching this problem is to solve a simple version by hand. Luckily, the example is simple, so I’ll just draw out how I’d generate these solutions.
First, I’d start with 1:
Since I know that a permutation of a collection cannot hold duplicate elements, and since I’m starting with all permutations that begin with 1, I know that whenever I run into a 1, I’ve encountered a possibility that isn’t allowed: a “dead-end”.
When I add a 2 to the permutation I’m building up, I’m then searching for all permutations that begin with
[1,2]. Therefore, adding both another 1 or another 2 represent a dead-end.
Since I know that a permutation of a collection is the same size as the original collection, where the elements are in a different order, I know that once I’ve found 3 elements, where all elements are different, I’ve found a possibility that should be included in the solution: a “success-case”.
Let’s move on to all solutions that start with 2:
Similar to before, I can see that whenever I encounter a 2, I’ve hit a dead-end. Whenever I make it to three distinct elements, where each one is different, I know I’ve found another valid permutation.
I’d do the same thing with 3… I’ll skip the diagram for brevity.
Step 2: Identify end-of-path conditions & dead-ends
A dead-end in a backtracking algorithm is a possible combination that does not satisfy the specific constraints to which the solution must adhere.
Solving it manually has clearly revealed that a dead-end is characterized by any combination in which there are duplicates.
Further, we see that any path, regardless of whether it results in a successful permutation, ends once 3 elements have been collected. We term this condition the “end-of-path” condition.
Step 3: Identify success cases
Solving it manually has also revealed that a permutation is any combination of elements in which the number of elements is equal to the number of elements in the input and all elements are unique.
The end-of-path conditions, dead-ends and success-cases are collectively called the “constraints” of the solution. As I mentioned earlier, backtracking algorithms are algorithms that identify one or more possibilities such that each possibility satisfies a given set of constraints.
Step 4: Naive solution
After we have solved it manually and extracted the constraints, it’s time to generate a naive approach to the problem.
Notice that in the above hand-illustrations, I drew out the duplicate elements as possibilities even though I didn’t need to. In fact, I was still a bit clever: as soon as I found a duplicate, even if the duplicate was the 2nd element instead of the third, I disregarded — or “backtracked” from — that particular possible combination. My naive solution won’t even do that: when I start with 1, I am going to explore all other possibilities that start with 1. That means I am going to explore all possibilities that start with
1,3, and so on until I have a combination of 3 elements. Once I have a combination of 3 elements, I will decide whether or not to push it into the resulting collection.
In other words, the naive solution isn’t going to “backtrack” upon finding a bad combination; it isn’t going to stop immediately upon hitting a dead-end. Rather, it’s just going to assume every combination is successful until it reaches the primary success-case: that it has collected 3 elements.
It’s going to look something like this:
This solution has three functions in it.
allElementsUnique is just a convenience function to help us determine whether the end-of-path is a success-case or dead-end.
Most backtracking solutions have this structure: there is a top level function that initializes a solution set, a helper function that performs the backtracking, modifying the solution set, and then the top level function returns the solution set.
The backtracking solution to most problems will look similar to the backtracking function shown above. It has five distinct parts:
- The end-of-path and pass/fail check.
- The iteration over possible starting values
- The addition of a possible value into the current possible solution
- The recursive call
- The clean-up (removing what was done to the possible solution one level down)
if/else here between the end-of-path check and the iteration is needed because recursive calls will never terminate otherwise. This is not always the case. If there is logic surrounding the recursive call that:
a) will only make the recursive call when some condition is
b) will inevitably be false at some point
Then the recursive calls will stop at some point and the top-level
else clause can be removed. You’ll see this first-hand when I present a more efficient solution to the permutations problem below.
Step 5: Optimize by avoiding dead-end combinations
In the naive solution above, the algorithm tries all combinations, even those that could have been recognized as dead-ends prior to reaching three elements. As you can see, it passes on LeetCode but it’s slower than most passing submissions:
So, instead of trying all dead-end paths, a better approach would be to back out of following any path that will ultimately lead to a dead-end. That means keeping track of each element that the algorithm has already added to the current combination and preventing the algorithm from recursing down a path that adds an already-added element to the current combination.
The trick is to move complexity out of the success-case and add some complexity to the branching logic. Ideally, the algorithm should only reach the
end-of-path when the path is a success. With smart branching logic, we can actually prevent the algorithm from running into any dead-ends.
I’m going to remove the check for duplicates and instead add a cache for numbers that the algorithm has already-encountered. This will also allow me to remove the top-level
if/else because the iteration at each level is bound to terminate once all numbers are cached.
There are ways to further optimize this but the above modification illustrates the point: start with simple branching and complex success-case test. Then make the branching more sophisticated and simplify the success-case test.
Once again, the ideal algorithm is one that only reaches the end-of-path during success-cases and never during dead-ends.
Problem 2: Word Search
Solutions for the permutations problem on LeetCode has a 52% acceptance rate.
Let’s try one that’s a bit tougher, with only a 30.2% acceptance rate: Word Search.
An aside about understanding requirements
When solving programming problems, it behooves the problem-solver to fully explicate and understand requirements and edge cases. This came pretty naturally when solving permutations due to its simplicity, though it was still necessary to analyze the description and example-cases for implicit and explicit requirements. Analyzing the problem for the permutations problem revealed the following:
- A permutation of a collection is a re-arrangement of the elements in a collection
- Each element in the original collection should only appear in a permutation once
- The expected input is a single collection
- The expected output is an array of permutations of the input collection, including the original input collection (and, therefore, an ordered collection is a permutation of itself).
In fact, understanding requirements and sketching out test-cases is an important part of any problem-solving process: backtracking or otherwise. Check out this article by Launch School for a great systematic approach to breaking down and solving problems.
Okay, let’s return to the problem at hand: Word Search.
Analyzing the problem description reveals the following:
- The first input is a 2D grid of characters, which means each element is uniquely located with a coordinate:
- The second input is a string of characters
- The expected output is a boolean
- Our solution should determine whether the string can be constructed by combining letters in the grid.
- There are constraints placed on the combination. It is not enough to determine if the string can be constructed by combining the characters in the grid: the combination has to follow certain rules. The way the backtracking algorithm is going to check if a combination exists is by attempting to construct a valid combination according to the given constraints.
6. These are the individual constraints, or rules, demarcating valid character combinations from invalid ones:
- A character at a certain index can be used at most 1 time in any valid combination.
- For any character in any valid combination, the subsequent character must exist in a horizontally or vertically adjacent position.
- For any coordinate
(x,y), all of the following are considered adjacent positions:
- Further, a valid adjacent position
(a, b)of coordinate
(x, y)is one in which
a >= 0 && b >= 0 && a < numOfGridColumns && b < numOfGridRows. I say
<=because the grid is made up of arrays whose index starts at 0 rather than 1, so the elements in the last column of a grid with 5 columns will have an
xvalue of 4.
Now that the requirements are clear, I’ll start by solving the problem manually.
Step 1: Solve it by hand
Since this problem is more complex, I’ll solve it with a different target string as well. In this case, the target string will not actually exist. This is important because the manual solution contains the initial algorithm I want to implement. In the same way that writing down one’s thoughts provides new insight, solving the problem on paper reveals the structure of the problem. After solving it manually, all I have to do is act as a translator by converting the process I followed into code. The best part about this is that I didn’t have to know how to solve this problem manually before I started writing (and in fact this is the first time that I have ever personally looked at & attempted the Word Search problem, so you and I are in the same boat). Many processes outside of my conscious awareness are uniting to produce a solution that I will then translate into the language of the interpreter (since I will be implementing the solution in Ruby this time around). Solving by hand == solving by intuition. Once it’s solved by hand, I just need to make explicit all of the implicit steps I took.
This time, I tried to find a word that couldn’t be constructed according to the given rules. Notice that I stopped iterating through the coordinates once I passed all coordinates that contained “S”. This was to keep the drawing from appearing too cluttered and because it’s easy to see that the final row of cells would look the same as the first row of cells.
Now that I’ve solved two simple versions of the problem by hand, it’s time to proceed to the next step.
Step 2: Identify end-of-path conditions & dead-ends
I can see that I have reached the end of the path when I have explored a path of cells equal to the length of the target string.
So, I know that I’ve arrived at the end-of-path case when the path I’ve explored consists of a number of cells equal to the length of the target string.
The dead-end cases, as I can see, occur when:
- The contents of the cells I’ve collected, when appended together in the order in which I collected them, do not match the contents of the target string
- The current target coordinate is out of bounds of the grid
- Non-adjacent coordinates are used
- The same coordinate is reused
In either case, I will have to back up one step and continue my search down a different sub-path.
Step 3: Identify success cases
The success-case in this solution is encountered when:
- I’ve reached the end-of-path and
- The contents of the cells I’ve collected, when appended together in the order in which I collected them, do match the contents of the target string.
Step 4: Naive solution
Now it’s time to put it all together: the end-of-path case(s), the dead-end case(s), the success case(s), and implement the simple algorithm that I intuitively used to solve two instances of the problem by hand.
First, I’ll describe in words what I seemed to be doing when I was solving each instance of the problem by hand:
Describe the Manual Algorithm
For each remaining coordinate in the grid, I check if it contains the target character. If it doesn’t, I’ll skip to the next coordinate.
— — If I am at the final remaining coordinate and it does not contain the target character, I know to return false.
— — Else, I store that coordinate as the current coordinate and update the target character to the subsequent character in the target word. Then, for each of the current coordinate’s adjacent cells, if that cell has not yet been used in the current path and it contains the target character, I will repeat this process until I reach the end-of-path.
Notice that my solve-it-by-hand solution (as expressed in the drawings) uses a few “smart-branching” techniques. First, it uses some logic to skip over coordinates that don’t contain the target character. Second, it also skips over cells that have already been used in the current path. Third, for each cell at the top level, it is only looking through adjacent cells. This prevents a lot of unnecessary steps. For a problem of this limited scope, it’s probably fine to implement it like this right away. However, for more complex problems, this way of thinking can cause more trouble than it’s worth.
Remember: you play how you practice. I don’t want to try to jump straight to a smart solution, so I’ll keep some of the smart branching logic out of my initial solution despite the fact that doing so feels contrived.
In lieu of the above “smarter” solution, here is the algorithm I’ll be implementing first:
For each cell in the grid
Check stored cells length
If length of stored cells == length of target word and
contents of stored cells == target word and
all cells are unique (no coordinate is reused)
then return true Otherwise:
Store the current cell
Repeat this process with the updated cell-store
I’m not skipping cells that have already been used. I’m not skipping cells that don’t contain the target character. This algorithm will find every single combination of n cells (where n is the length of the target word).
Since the algorithm has to explicitly check for adjacent cells, make sure the cells are unique, and make sure the cells match the target word, the implementation ends up being a bit long. You can check out this gist to view the implementation of the naive solution.
It would be beneficial for you to grab the code in that gist and modify it to implement the “smarter” algorithm before reading on…
You can also test solutions to the problem in LeetCode here.
This naive solution is extremely slow. After submitting it to LeetCode, I can see that it passes all tests until it exceeds the time limit.
Step 5: Optimize by avoiding dead-end combinations
Now it’s time to optimize. The naive solution to Word Search was highly inefficient due to the fact that it would hit every end-of-path possible.
Recall that an ideal backtracking solution only reaches the end-of-path when that end-of-path will be a success-case: all dead-ends are avoided as early as possible.
The trick to optimizing these kinds of solutions is to look at the “constraint-satisfaction tests” (ie, the tests that check if an end-of-path is a success-case or dead-end) and move them to the branching logic.
It’s not as simple as copying and pasting, of course: you have to translate them from routines that check if a data structure has a set of predetermined target-properties once the algorithm reaches the end-of-path to logic that only allows the algorithm to construct “ideal” data structures in the first place.
In other words:
Before-optimization: The algorithm explores all possible paths to the end, building a candidate solution. Once it reaches an end-of-path, it checks if the candidate solution satisfies the predetermined constraints.
After-optimization: The algorithm explores a subset of paths to the end, building a candidate solution, such that it only reaches an end-of-path when the constructed candidate solution satisfies the predetermined constraints.
Recall that the naive solution (see gist here) explores all possible combinations of cells of length n (where n is the length of the target word). Constructing a combination of length n is the end-of-path condition. After the naive solution reaches an end-of-path, it will run a series of tests to determine whether the constructed solution satisfies the constraints. Namely, it will check if:
- The candidate does not reuse a cell.
- The characters at each cell on the given board match the target word when concatenated together in the order in which they appear in the candidate combination.
- Each subsequent cell is adjacent to the one before it.
Also, recall that the naive solution only passed 21 tests before exceeding the time limit placed on submissions.
I am going to optimize the naive solution step-by-step by transforming each constraint-satisfaction test into branching logic. Doing this will incrementally decrease the number of times the algorithm hits the end-of-path until it only hits the end-of-path if the candidate combination is a successful one.
I’ll start with the adjacent condition: Instead of testing that a candidate combination only contains cells adjacent to each other, I’ll only iterate through the adjacent cells of each current cell, to begin with.
This is closer to what I drew in the initial diagram: at the top level of iteration, the algorithm visits each cell in the grid. Each time it “hops” down a level to explore a path, it only visits cells adjacent to the cell at the level above. Therefore, there are two distinct types of iteration occurring. The solution is going to look something like this.
Reducing the number of paths the algorithm explores did improve the runtime, which is evident by looking at the results of submitting it to LeetCode. Instead of passing 21 tests before exceeding the time limit — as the naive algorithm did — it passes 28…
Next, I’ll remove the
cellsAreUnique test from the end-of-path to the branching logic, ensuring that the algorithm doesn’t explore any candidates that contain duplicates. Essentially, all this change entails is checking if a current cell’s adjacent cell has already been added to the candidate combination at an earlier time.
The resulting solution looks like this.
I’ll submit to LeetCode once again and, once again, I see a (minor) improvement: from 28 test cases passed to 30 test cases passed before the algorithm exceeds the time limit.
The final optimization will have to dramatically improve the algorithm’s runtime in order to make it all the way to 87. The final optimization is, of course, to only consider cells for any candidate combination if the contents of a cell match the correct character in the target word. I.e., the contents of a cell matches the i-th character, where i is the number of cells already pushed into the candidate combination.
The updated code is shown here. The performance of the solution dramatically improves, as shown by the LeetCode submission below.
Okay… so it still ran out of time for some inputs. So close, yet so far away. Let’s look at how else I can optimize.
The first thing I notice is that the solution continues to iterate through cells even when a successful candidate has been found. So I’ll add an early return between lines 9 and 10. That will prevent some unnecessary iterations.
The next thing I notice is that the
alreadySeen algorithm iterates through the candidate combination (denoted by the variable
seen) for every iteration over every adjacent cell during every recursive call. It’d be better to cache the cells that have already been collected such that the
already_seen check can be done in a single lookup.
What I am imagining is that this
seenCache will always remain consistent with the
seen collection. This means that whenever an element is added to — or removed from —
seenCache will update accordingly.
With the addition of a cache and an early return, I’ll resubmit the solution. You can see this version of the solution here.
Unfortunately, it still times out at 85/87 test cases.
It is definitely possible to rework this solution, while still performing backtracking, and make it really performant. But to do so will be to venture outside of this formulaic approach and cleverly rework a lot of it. However, 85/87 isn’t too shabby when we started out with less than half of the test cases passing before timeout.
Can you figure out how to write a faster solution while using backtracking?
There are some approaches that use a lot less code, written by people who are a lot more clever than I am. If you get stuck, there’s no harm in looking at a solution to understand how it’s done and “reverse engineer” the author’s mental model.