A 3-approximation algorithm for the k-level uncapacitated facility location problem Karen Aardal∗ Fabi´an A. Chudak† David B. Shmoys‡ Abstract In the k-level uncapacitated facility location problem, we have a set of demand points where clients are located. The demand of each client is known. Facilities have to be located at given sites in order to service the clients, and each client is to be serviced by a sequence of k different facilities, each of which belongs to a distinct level. There are no capacity restrictions on the facilities. There is a positive fixed cost of setting up a facility, and a per unit cost of shipping goods between each pair of locations. We assume that these distances are all nonnegative and satisfy the triangle inequality. The problem is to find an assignment of each client to a sequence of k facilities, one at each level, so that the demand of each client is satisfied, for which the sum of the setup costs and the service costs is minimized. We develop a randomized algorithm for the k-level facility location problem that is guaranteed to find a feasible solution of expected cost within a factor of 3 of the optimum cost. The algorithm is a randomized rounding procedure that uses an optimal solution of a linear programming relaxation and its dual to make a random choice of facilities to be opened. We show how this algorithm can be derandomized to yield a 3-approximation algorithm. Introduction In the classical single-level uncapacitated facility location problem, we want to decide where to locate facilities, and to which open facility each client should be assigned, so as to minimize the total cost of setting up the facilities and of servicing the clients. In the k-level uncapacitated facility location problem, each client must be serviced by a sequence of k different facilities. This problem has applications, for instance, in more complex logistical systems where central depots ship goods through smaller depots, and further down the hierarchy, to the clients. We refer to Kaufman, Vanden Eede, & Hansen [10], Van Roy & Erlenkotter [15], Tcha & Lee [14], Barros [2], and Aardal, Labb´e, Leung, & Queyranne [1] for further applications of multi-level location problems, as well as optimization algorithms and heuristic approaches. Since the k-level problem is NP-hard even if k =1, an approximation algorithm is useful in order to quickly find feasible solutions of good quality. It was observed already in the 1970’s that the linear relaxation of the so-called “strong” integer programming formulation of the 1-level problem provides values close to the integer optimum value, ∗aardal@cs.uu.nl. Department of Computer Science, Utrecht University. Research partially supported by the ESPRIT Long Term Research Project nr. 20244 (Project ALCOM-IT: Algorithms and Complexity in Information Technology), by the project TMR-DONET nr. ERB FMRX-CT98-0202, both of the European Community, by NSF grant CCR-9307391 through David B. Shmoys, Cornell University, and by NSF through the Center for Research on Parallel Computation, Rice University, under Cooperative Agreement No. CCR-9120008. †chudak@cs.cornell.edu. IBM TJ Watson Research Center, Yorktown Heights, NY 10598. This research was done while the author was a graduate student at the School of Operations Research & Industrial Engineering, Cornell University, Ithaca, NY 14853. Research partially supported by NSF grants DMS-9505155 and CCR-9700029 and by ONR grant N00014-96-1-00500. ‡shmoys@cs.cornell.edu. School of Operations Research & Industrial Engineering, Cornell University. Research partially sponsored by NSF grants CCR-9700029 & DMS-9505155 and ONR grant N00014-96-1-0050O. and so it was natural to hope to find an approximation algorithm with a good performance guarantee, based on this relaxation. In practice, the dual-ascent based heuristic by Erlenkotter [7] produces excellent solutions, but in the worst case this algorithm performs arbitrarily poorly. We shall say that an algorithm for a particular optimization problem is a ρ-approximation algorithm if it is a polynomial-time algorithm that is guaranteed to find a feasible solution of objective function value within a factor of ρ of the optimum. For the 1-level problem where the objective is to maximize the “profit” of servicing the clients, a (1 − 1 )-approximation algorithm was developed by Cornu´ejols, e Fisher, & Nemhauser in 1977 [5], but for the version where we want to minimize the total cost, a constant performance guarantee remained elusive until recently. In 1997, Shmoys, Tardos, & Aardal [13] considered the special case in which the service costs satisfy the triangle inequality, and gave a 3.16-approximation algorithm based on the linear relaxation for the 1-level problem. This performance guarantee was improved by Guha & Khuller [9] to 2.408, and later by Chudak & Shmoys [3, 4], who presented a 1.736-approximation algorithm. These algorithms are based on the linear relaxation as well. We give a 3-approximation algorithm for the k-level facility location problem, which is the first constant performance guarantee for any k ≥ 2. (This improves on a preliminary version of this result, a 3.16-approximation algorithm for just the 2-level case [13]; the other results of this conference paper [13] will be published in a separate journal paper.) For a more extensive overview of algorithmic results for various facility location problems we refer to Cornu´ejols, Nemhauser, & Wolsey [6], Shmoys et al. [13], and Chudak [3]. 2The k-level uncapacitated facility location problem Let D be the set of demand points. Each client j ∈ D must be assigned to precisely one facility at each of the k levels. Let Fl be the set of sites where facilities on level l,1 ≤ l ≤ k may be located, and assume that the sets Fl are pairwise disjoint. We denote the set of all such sites, ∪k Fl,by F. l=1 The cost of setting up a facility at site i is fi. We assume that fi > 0for each i ∈ F. Whenever we use the the expression “facility i” we mean “facility at site i”. The cost of shipping between points i,j ∈F∪D is equal to cij . Throughout this paper, we make the following assumptions on the service costs: • cij ≥0, foreach i,j ∈F ∪D; • cij = cji,for each i,j ∈ F ∪D, i.e., the service costs are symmetric, • cik ≤cij + cjk ,for each i,j,k ∈ F ∪D, i.e., the service costs satisfy the triangle inequality. We shall use pto denote a sequence of facilities il ∈ Fl , l =1,...,k,and shallrefer to p as a path of facilities. The set of all possible paths is denoted by P. Each client must be assigned to precisely one path p ∈ P. The total service cost incurred by assigning client j to path p =(i1,i2,...,ik)is equal to cpj = ci1 ,i2 + ci2 ,i3 + ···+ cik−1 ,ik + cik ,j. Let xpj be equal to 1 if client j is assigned to path p, and 0 otherwise. Furthermore, let yil be equal to 1 if facility il is open at level l. If we relax the constraints that xpj ∈{0,1},for each p ∈ P and j ∈ D,and yil ∈{0,1},for each il ∈ Fl , l =1,...,k, we obtain the following linear programming relaxation of the k-level uncapacitated facility location problem: k =min l=1 il ∈F l p∈Pj∈D subject to cpj xpj (1) fil yil + zLP xpj =1, for each j ∈D, (2) p∈P xpj − yil ≤ 0, for each j ∈ D and il ∈ Fl , l =1,...,k, (3) p:pil xpj ≥ 0, for each p∈ P,and j ∈ D, (4) yil ≥ 0, for each il ∈ Fl,l =1,...,k. (5) In the integer programming formulation, constraints (2) impose that each client is assigned to precisely one path. To interpret constraints (3), fix a client j, and a facility il on level l.No assignment of client j to a path using facility il is possible unless facility il is open. If yil is equal to 0, then the sum of all assignment variables for client j to use paths containing facility il must also be 0. The upper bounds xpj ≤ 1 are not necessary in the model due to the combination of constraints (2) and (4). Similarly, we do not need to state the constraints yil ≤ 1 explicitly since the sum in constraints assume that the fixed cost fil Let vj and wil ,j denote the dual variables corresponding to the primal constraints (2) and (3), respectively. The dual problem corresponding to the linear programming relaxation is given below: (3), xpj , is bounded from above by the sum in constraints (2), = 1, and since we p∈P xpjp:pil >0, for each il ∈ Fl . =max zLP (6) vj j∈D subject to for each p∈ P,and j ∈ D, (7) vj − wil ,j ≤ cpj , il∈p wil ,j ≤ fil , for each 1 ≤ l ≤ k,and il ∈ Fl , (8) wil ,j ≥ 0, for each 1 ≤ l ≤ k, il ∈ Fl,and j ∈ D. ∈jD (9) Since the dual has a polynomial number of variables, we can use the ellipsoid algorithm to solve this linear program in polynomial time, given a polynomial-time separation algorithm [8]. In other words, we must design a polynomial-time algorithm that, given (v,w), either concludes that (v,w)is feasible, or else outputs a constraint that is violated by it. Constraints (8) can be checked explicitly (since there are only a polynomial number of them). For each j ∈ D, the constraints (7) can be checked by a straightforward shortest path computation. Notice that both edge and node weights need to be considered. Since the dual linear program can be solved in polynomial time, the primal can also be solved in polynomial time. Furthermore, the optimal solution found by this ellipsoid- based algorithm (or essentially any polynomial-time linear programming algorithm) will have only a polynomial number of positive variables. In fact, we could also assume that algorithm finds an optimal basic solution, see Gr¨otschel, Lov´asz, & Schrijver [8]. In the analysis of our algorithm, we will use the following lemma in order to bound the value of the service cost of a feasible integer solution. Lemma 1 Let (¯x,y¯) and (¯v,w¯), respectively, be optimal solutions to the primal and dual linear programs. Then x¯ pj >0 implies that cpj ≤ v¯ j ,for each p∈ P and j ∈ D. Proof: Consider the following set of complementary slackness conditions: x¯ pj (cpj − v¯ j + w¯ il ,j )=0, for each p∈ P,and j ∈ D, (10) il∈p which must hold for any pair of optimal solutions to the linear program and its dual. If x¯ pj > 0, then cpj − v¯ j + cpj ≤ v¯ j . il ∈p w¯ il ,j =0, or equivalently, cpj + il∈p w¯ il ,j =¯vj .Since w¯ il ,j ≥ 0, we have that The algorithm We next present a randomized algorithm that produces feasible solutions of expected value no more than 3 times the optimum value. After the description of the algorithm, we prove the claimed performance guarantee, and then describe how the algorithm can be derandomized. Description of the algorithm. We shall refer to the algorithm described below as Random Path. We start by solving the linear programs (1) – (5) and (6) – (9) to optimality. We denote the optimal solutions by (¯x, y¯)and (¯v, w¯), respectively. The aim of our algorithm is to construct so-called clusters of facility and demand locations. After rounding, each cluster consists of a path of facility locations, and a set of clients that are serviced uniquely by this path. Each cluster is centered at a certain demand point, as described below. During the execution of our algorithm we maintain: 1. A feasible fractional solution (x, y). Initially, we set (x, y):=(¯x, y¯). 2. Aset C of clusters. Initially, we set C := ∅. ¯¯ 3. Aset D of demand points that are not part of any cluster in C. Initially, D := D. ¯ In each iteration, we choose a client j ∈ D with minimum value of v¯ j +¯cj ,where c¯ j = p∈P cpj xpj . Let jt denote the client chosen in iteration t. Client jt is the center of the cluster Ct,which is computed in the following way. Let Pt be the set of paths of facilities that service client jt by any positive fraction, i.e., Pt = {p ∈ P : xp,jt > 0}. Also, let Ftl = {il ∈ Fl : il belongs to at least one path in Pt}, for each l =1,...,k,and Ft = ∪k Fl.Let Tt = {j ∈ D : xpj > 0 such that p contains at least one l=1 t facility in Ft}.Notice that jt ∈ Tt. For an illustration of the sets Pt and Tt, we refer to Figure 1. Let Ct be the set of all clients in Tt and all facilities in Ft, i.e., Ct = Tt ∪ Ft. level 1 level k jt level k pt :paths in Pt : clients in Tt Figure 1: The paths in Pt, the client jt, and the other clients in Tt. Next, choose a path p ∈ Pt with probability xp,jt .Callthe chosen path pt. Round all variables {yil }il∈pt to 1 and all variables {yil }il∈Ft −pt to 0. We assign each client in Tt to the path pt;that is, we set xpt ,j =1 for each j ∈ Tt and xpj =0 for each j ∈ Tt and p ∈ Pt −{pt}.The situation ¯¯ after rounding is illustrated in Figure 2. We now set D := D−Tt and C:= C∪{Ct}. We iterate this ¯ process until D = ∅. Notice that each client belongs to precisely one cluster in C when the algorithm terminates. level 1 level k jt pt : clients in Tt Figure 2: The solution created by rounding within the cluster Ct. We are now ready to prove the following theorem. Theorem 2 The algorithm Random Path runs in polynomial time, and produces a feasible integer solution whose expected total cost is no more than 3 times the optimum cost. Proof: We first show that our algorithm produces a feasible integer solution. In the rounding step of iteration t we create a cluster by first choosing a certain client jt as a center. Then we randomly choose one of the fractional paths pt ∈ Pt. Each variable yil : il ∈ pt is set to 1, so all facilities in path pt are open. Finally, we assign all clients in the set Tt to the path pt. The assignment of the ¯ clients in Tt to the path pt is integer and feasible. No client in D −Tt is fractionally serviced by any of the facilities in Ft, by definition, and so they are unaffected by the current rounding step. Hence, a feasible fractional solution (x,y) is maintained during the execution of the algorithm. Upon termination of the algorithm the solution (x,y)is integral. We now need to prove that the algorithm has the claimed performance guarantee. Consider the cluster Ct created in iteration t. Although the algorithm updates the solution (x,y) throughout its ¯ execution, at the start of iteration t, wehavethat xpj =¯xpj ,for each p ∈ Pt and j ∈ D,and yil =¯yil ,for each il ∈Fl t , l =1,...,k. Hence, the expected fixed cost incurred in iteration t is equal to k k ( fil )¯xp,jt = fil ( x¯ p,jt ) ≤ fil y¯ il , (11) p∈Pt il ∈pl=1 il∈F l p:pil l=1 il ∈F l tt where the inequality holds due to constraints (3). 5 Furthermore, the expected service cost for client jt is equal to cp,jt x¯ p,jt =¯cjt . (12) p∈Pt To obtain a bound on the expected service cost for each client j∈ Ct,j= jt, we need to observe what happens in the rounding step. Since client jbelongs to the set Tt, it is fractionally serviced through at least one facility that is also fractionally servicing the center jt of cluster Ct.Let p¯ be the path that fractionally services jand that has at least one facility in common with a path p ∈ Pt. Let facility iq be the facility with largest level index that belongs to both paths p¯ and p;see Figure 3. After rounding, client jis completely serviced by path pt. Due to the triangle inequality we can pt facility iq ¯pp j jt Figure 3: bound the cost cpt ,j from above by the sum of the cost of traversing p¯ from jto iq,and thecost of traversing p from iq to jt, and the cost of traversing all of pt from jt. Hence, by linearity of expectation and the fact that the costs are nonnegative and symmetric, we can bound from above the expected service cost for jby cpj¯ + cp ,jt +¯cjt ≤ v¯ j +¯vjt +¯cjt ≤ 2¯vj +¯cj , (13) where the first inequality follows from Lemma 1, and the second inequality follows from the fact that the choice of client jt ensures that c¯ jt +¯vjt ≤ c¯ j +¯vj . If we combine the expressions (11)–(13), then we conclude that the total expected cost for all clusters in C is bounded by kk f + (¯cj +2¯vj )=2 v¯ j + f + c pj x¯ pj =3zLP . ¯ ¯ y il il y il il l=1 il∈F l j∈Dj∈Dl=1 il∈F l j∈Dp∈P The running time of the algorithm is dominated by the time required to solve the linear programming relaxation and its dual. As noted above, these linear programs can be solved in polynomial time by the ellipsoid algorithm. Derandomizing the algorithm. We have proved that our randomized algorithm delivers a solution for which the expected cost is at most three times the cost of the optimal solution. There is a wide range of techniques known to “derandomize” such algorithms, including such approaches as the method of conditional probabilities; for an extensive discussion of these methods, the reader is referred to the textbook of Motwani & Raghavan [12]. The idea common to all of these methods is to deterministically find a set of “random choices” which lead to a solution of cost no more than the given expected value. For example, if one chooses the cheapest among a small set of choices, where the expectation can be expressed as some weighted average of the same set, then one is assured that the cost of the solution found is no more than the given expectation. It is precisely this line of reasoning that leads to a simple derandomization of our algorithm. Since the expected cost of cluster Ct created in iteration t by the algorithm Random Path is bounded by k fil y¯ il + (¯cj +2¯vj ), (14) l=1 il ∈F l j∈Tt t there must exist a path p ∈ Pt such that the cost of cluster Ct obtained by choosing path p is no tt more than the bound on the expected cost (14). Hence, in a derandomized version of our algorithm, we can choose any path in iteration t that yields a cost of cluster Ct that is less than or equal to the bound (14). One specific possibility is to evaluate the cluster cost for each path in Pt,and to take the path that yields the minimum cost. Since we have assumed that the fractional solution to be rounded has only a polynomial number of positive variables, there are only a polynomial number of paths to consider in each iteration. Theorem 3 The derandomized version of Random Path is a 3-approximation algorithm. Discussion An interesting open question is whether the techniques used by Chudak & Shmoys to improve the performance guarantee for the 1-level problem can be extended to the k-level problem as well. In our algorithm, as well as in the algorithm of Shmoys, Tardos & Aardal [13], the value of the fixed costs of the integer feasible solution does not increase, compared to the linear programming solution, so the increase in total cost over zLP is all due to the bound on the service cost. Chudak & Shmoys gave a more refined rounding procedure so that the service cost and the facility cost are better balanced. A related question is to obtain a tight bound on the duality gap, the worst-case ratio between the integer and fractional optimal values, for the k-level problem (and also the 1-level problem), given different assumptions on the objective function. It would also be interesting to investigate how our algorithm performs in practice. The most substantial obstacle is the solution of the linear programming relaxation, but it would be interesting to see if a cutting-plane approach to solving the dual linear program (and hence the primal) would be possible with reasonable computation times. Such an approach might be further enhanced by preprocessing the input to reflect the fact that most (expensive) paths are unlikely to be used in an optimal primal solution. This approach has been successful, for example, in computational work on the traveling salesman problem, where one typically includes only the ten shortest edges from each vertex of the underlying graph in the initial formulation. Once this initial linear program has been solved to optimality, one can compute the reduced cost of every excluded variable to verify whether the current solution is optimal with respect to the complete set of variables. Acknowledgments We wish to thank J. K. Lenstra and A. M. Verweij for useful comments on a preliminary version of this manuscript. References [1] K. Aardal, M. Labb´e, J. Leung, and M. Queyranne. On the two-level uncapacitated facility location problem. INFORMS J. Comput., 8:289–301, 1996. [2] A.I.M.B.de Barros. Discrete and Fractional Programming Techniques for Location Problems. Ph.D. Thesis, Erasmus Universiteit Rotterdam. Tinbergen Institute Research Series No. 89. Thesis Publishers, Amsterdam, 1995. [3] F. A. Chudak. Improved approximation algorithms for uncapacitated facility location. In: R.E. Bixby, E.A. Boyd, R Z. R´ıos-Mercado (eds.), Integer Programming and Combinatorial Optimization, 6th International IPCO Conference. Lecture Notes in Computer Science 1412, pages 180–194, Springer-Verlag, Berlin Heidelberg, 1998. [4] F. A. Chudak and D. B. Shmoys. Improved approximation algorithms for the uncapacitated facility location problem. Manuscript, 1999. [5] G. Cornu´ejols, M. L. Fisher, and G. L. Nemhauser. Location of bank accounts to optimize float: An analytic study of exact and approximate algorithms. Management Sci., 8:789–810, 1977. [6] G. Cornu´ejols, G. L. Nemhauser, and L. A. Wolsey. The uncapacitated facility location problem. In: P. Mirchandani and R. Francis (eds.), Discrete Location Theory, pages 119–171. John Wiley and Sons, Inc., New York, 1990. [7] D. E. Erlenkotter. A dual-based procedure for uncapacitated facility location. Operations Research 26:992-1009, 1978. [8] M. Gr¨otschel, L. Lov´asz, and A. Schrijver. Geometric Algorithms and Combinatorial Optimization. Springer, Berlin, 1988. [9] S. Guha, and S. Khuller. Greedy strikes back: improved facility location algorithms. In: Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 649–657, 1998. [10] L. Kaufman, M. vanden Eede, and P. Hansen. A plant and warehouse location problem. Operational Research Quarterly, 28:547–557, 1977. [11] P. B. Mirchandani and R. L. Francis (eds.). Discrete Location Theory. John Wiley and Sons, Inc., New York, 1990. [12] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, Cambridge, 1995. ´ [13] D. B. Shmoys, E. Tardos, and K. Aardal. Approximation algorithms for facility location problems. In 29th ACM Symposium on Theory of Computing, pages 265–274, 1997. [14] D. Tcha and B. Lee. A branch-and-bound algorithm for the multi-level uncapacitated location problem. European J. Oper. Res., 18:35–43, 1984. [15] T. J. Van Roy and D. Erlenkotter. A dual based procedure for dynamic facility location. Management Sci., 28:1091–1105, 1982.