What is Branch and Bound Algorithm?
These notes follow the discussion of branch and bound algorithms in Computer Algorithms by E. Horowitz, S. Sahni and S. Rajasekaran.
Here we describe a mathematical model of the process of choosing the next node to expand. This model also includes a function for deciding when to abandon a search path before reaching the end of the path. Abandoning searches early attempts to minimize computational efforts to find the minimal solution.
The basis of branch and bound algorithms is a ranking function. The ranking function assigns a value to each node in the graph. At each step, a branch and bound algorithm uses the ranking function to decide which node to expand next. In contrast, the usual DFS (Depth First Search) and BFS( Breadth First Search) exploration algorithms perform a blind search of the graph. Ideally, the ranking function, c(x), ranks nodes based on the cost of a minimal solution reachable from node x. The problem with this ranking function is that the minimal solution must be known ahead of time. Instead, the ranking function uses an estimate, g(x), of the cost of a minimal solution reachable from node x. Using g(x) to rank nodes may require exploring unnecessary nodes in the graph?particularly if g(x) is not a good estimate.The final element of the ranking function measures the cost of reaching a node from the root. As the searches gets farther from the root node, the node falls in the ranking.
The function h(x) measures the cost of reaching node x from the root node. After including a function of h(x), the ranking function c(x) is
c(x) = f(h(x)) + g(x)
in which f determines how much significant to give the cost of reaching node x. If f(h(x))= 0 then the algorithm makes long, deep searches of the graph. If f(h(x))> 0 then the algorithm considers nodes close to the root before making long potentially fruitless forays into the graph. A node is live if its subgraph might contain a minimal solution. Live nodes are potential candidates for exploration. A node is dead if its subgraph can not contain a minimal solution. If the estimated cost of a minimal solution reachable from x is less than the actual minimal solution reachable from x then a node can be killed when the estimated cost is greater than the least upper bound on a solution. In symbols, if c(x)>= upper then an optimal solution is not reachable from x (assuming upper is the least upper bound on a solution).It should come as no surprise that tuning c by adjusting f and g requires empirical studies of representative data sets.
Traveling Salesperson Problem
Two branch and bound solutions to the TSP are described in this section. Each solution estimates the cost of a minimal tour using a reduced cost matrix. In the first solution, the nodes in a path through the search tree represent cities on a tour. In the second solution, nodes in the search tree represent the inclusion or exclusion of an edge between two cities in the optimal tour. Suppose the distances between n cities are given in an nxn matrix. A row (or column) in this matrix is reduced if at least one entry in the row is 0 while the others are all positive. A matrix is reduced if all of its rows are reduced.
For a non-leaf node S with parent R, the cost of a tour including a trip from R to S can be estimated by a reduced cost matrix A. Matrix A is constructed by:
set all entries of row R and column S to 1 to avoid making a second visit to R or S in an optimal tour.
set A(S, 1) to ∞ to prevent making a premature trip from S to 1.
For each row or column that does not contain all 1, subtract each entry in each row by the smallest value in each row. Subtract each entry in each column by the smallest value in each column.The estimated length of the shortest tour including a trip from R to S is given by:
c(S) = f(h(R) + 1) + c(R) + A(R, S) + r
where r is the total value subtracted in step 3. The optimal values of f and h are determined empirically.
In its solution, trees were built by deciding which node to visit next. For example, the nodes in the ith level of the graph are the options for the ith city in a tour. Another solution can be built by deciding whether or not to include a trip between a particular pair of cities in each step. In this solution, the 2 root node might represent the decision to include or non-include a trip between cities 6 and 8. The left child represents the inclusion this trip and the right child represents the exclusion of this trip. The same distance cost estimate function c is used in this solution. The efficient implementation of this algorithm requires a policy for choosing which trip between cities to consider next in the search. A good policy is to choose cities R, S such that a trip between R and S has a high probability of being included in the optimal tour. One way of choosing R and S is to choose cities which result in the highest approximate cost for not including a trip between R and S in the tour.
C# Sample Program:
The algorithm is coded in C# using the Data adapters and data sets for accessing databases for reading and writing sales person’s visited cities and their costs with different problem IDs. There is a set of modules that are designed to implement the Branch and Bound algorithm. The program is designed so that it takes input from a database named “tsp.mdf” with the path “C:/tsp.mdf”. The “tProblems” table contains the problem information and the “tMatrics” table contains the matrices that contain the cost of traveling between each pair of cities. The tables have the following structure:
field type comments key number problemID integer Identifier for this problem in the tMatrices and tSolutions tables cities integer Number of cities in this problem timeBound float How many seconds allowed for solving this problem
field type comments key number problem integer Identifier for this problem, matches problemID in tPRoblems row integer Which row of the cost matrix is this row of the table? contents string Space-separated list of costs.
The branch and bound algorithm is implemented to solve the tsp problem within the given time bound. If time expires before the algorithm finds a solution, then the program returns the best solution found so far and exit.
The output is written to a table called tSolutions. This table has the following fields:
field type comments key number This is automatically updated, you can ignore it problem integer Identifier for this problem, matches problemID in tPRoblems cost integer Cost of your solution path string Order in which you visited the cities in your solution. Start with the city in the first row. Name the cities A, B, C, … Z, a, b, … z where the city in the first row is A, the city in the second row is B and so forth exactSolution boolean Was your solution exact? timeSpent float Amount of time your algorithm spent solving this problem. If you run out of time, it is ok for this value to be slightly larger than the time bound (because you probably won’t check the time between every pair of instructions and you will need a little time to turn off your timer and output the solution).
Project Files : Branch_Bound_code.zip