Approximation Algorithms for Facility Location 
Problems 

David B. Shmoys 

Cornell University, Ithaca NY 14853, USA 

Abstract. One of the most flourishing areas of research in the design 
and analysis of approximation algorithms has been for facility location 
problems. In particular, for the metric case of two simple models, the 
uncapacitated facility location and the k-median problems, there are 
now a variety of techniques that yield constant performance guarantees. 
These methods include LP rounding, primal-dual algorithms, and local 
search techniques. Furthermore, the salient ideas in these algorithms and 
their analyses are simple-to-explain and reflect a surprising degree of 
commonality. This note is intended as companion to our lecture at CONF 
2000, mainly to give pointers to the appropriate references. 

1A tale of two problems 

In the past several years, there has been a steady series of developments in the 
design and analysis of approximation algorithms for two facility location problems: 
the uncapacitated facilily location problem, and the k-median problem. 
Furthermore, although these two problems were always viewed as closely related, 
some of this recent work has not only relied on their interrelationship, but 
also given new insights into the ways in which algorithms for the former problem 
yield algorithms, and performance guarantees, for the latter. 

In the k-median problem, the input consists of a parameter k,and n points in 
a metric space; that is, there is a set N and for each pair of points i, j ∈N,there 
is a given distance d(i, j) between them that is symmetric (i.e., d(i, j)= d(j, i), 
for each i, j ∈N), satisfies the triangle inequality (i.e., d(i, j)+ d(j, k) ≥d(i, k), 
for each i, j, k ∈N), and also has the property that d(i, i)= 0 for each i ∈N. 
The aim is to select k of the n points to be medians, and then assign each of the 
n input points to its closest median so as to minimize the average distance that 
an input point is from its assigned median. Early work on the k-median problem 
was motivated by applications in facility location: each median corresponds to a 
facility to be built, and the input set of points corresponds to the set of clients 
that need to be serviced by these facilities; there are resources sufficient to build 
only k facilities, and one wishes to minimize the total cost of servicing the clients. 

In the uncapacitated facility location problem, which is also referred to as the 
simple plant location problem, the strict requirement that there be k facilities is 
relaxed, by introducing a cost associated with building a facility; these costs are 
then incorporated into the overall objective function. More precisely, the input 


consists of two sets of points (which need not be disjoint), the potential facility 
location points F, and the set of clients C; in this case, we shall let n denote |F ∪ 
C|.For each point i ∈F, there is a given cost fi that reflects the cost incurred in 
opening a facility at this location. We wish to decide which facilities to open so as 
to minimize the total cost for opening them, plus the total cost of servicing each 
of the clients from its closest open facility. The uncapacitated facility location 
problem is one of the most well-studied problems in the Operations Research 
literature, dating back to the work of Balinski [2], Kuehn & Hamburger [11], 
Manne [14], and Stollsteimer [18, 19] in the early 60’s. 

Throughout this paper, a ρ-approximation algorithm is a polynomial-time 
algorithm that always finds a feasible solution with objective function value 
within a factor of ρ of optimal. Hochbaum [8] showed that the greedy algorithm 
is an O(log n)-approximation algorithm for this problem, and provided instances 
to verify that this analysis is asymptotically tight. In fact, this result was shown 
for the more general setting, in which the input points need not belong to a 
metric space. 

Lin & Vitter [12] gave an elegant technique, called filtering, for rounding 
fractional solutions to linear programming relaxations. As one application of this 
technique for designing approximation algorithms, they gave another O(log n)approximation 
algorithm for the uncapacitated facility location problem. Furthermore, 
Lin & Vitter gave an algorithm for the k-median problem that finds a 
solution for which the objective is within a factor of 1 + of the optimum, but is 
infeasible since it opens (1 + 1/ )(ln n +1)k facilities. Both of these results hold 
for the general setting; that is, the input points need not lie in a metric space. 
In a companion paper, Lin & Vitter [13] focused attention on the metric case, 
and showed that for the k-median problem, one can find a solution of cost no 
more than 2(1 + ) times the optimum, while using at most (1 + 1/ )k facilities. 

The recent spate of results derive algorithms that can be divided, roughly 
speaking, into three categories. There are rounding algorithms that rely on linear 
programming in the same way as the work of Lin & Vitter, in that they 
first solve the linear relaxation of a natural integer programming formulation 
of the problem, and then round the optimial LP solution to an integer solution 
of objective function value no more than factor of ρ greater, thereby yielding 
a ρ-approximation algorithm. The second type of algorithm also relies on the 
linear programming relaxation, but only in an implicit way; in a primal-dual 
algorithm, the aim is to simultaneously derive a feasible integer solution for the 
original problem, as well as a feasible solution to the dual linear program to its 
linear relaxation. If one can show that the objective function value of the former 
always is within a factor of ρ of the latter, then this also yields a ρ-approximation 
algorithm. Finally, there are local search algorithms, where one maintains a feasible 
solution to the original problem, and then iteratively attempts to make 
a minor modification (with respect to a prescribed notion of “neighboring” solutions) 
so as to yield a solution of lower cost. Eventually, one obtains a local 
optimum; that is, a solution of cost no more than that of each of its neighboring 
solutions. In this case, one must also derive the appropriate structural proper



ties in order to conclude that any locally optimal solution is within a factor of 
ρ of the global optimum. These three classes of algorithms will immediately be 
blurred; for example, some algorithms will start with an LP rounding phase, but 
end with a local search phase. 

One other class of algorithmic techniques should also be briefly mentioned. 
Arora, Rao, & Raghavan [1] considered these two problems for geometrically-
defined metrics. For the 2-dimensional Euclidean case of the k-median problem 
(either when the medians must be selected from among the input points, or 
when they are allowed to be selected arbitrarily from the entire space) and 
the uncapacitated facility location problem, they give a randomized polynomial 
approximation scheme; that is, they give a randomized (1 + )-approximation 
algorithm, for any fixed > 0. No such schemes are likely to exist for the general 
metric case: Guha & Khuller [7] proved lower bounds, respectively, of 1.463 
and 1+1/e (based on complexity assumptions, of course) for the uncapacitated 
facility location problem and k-median problems. 

2 LP rounding algorithms 

The first approximation algorithm with a constant performance guarantee for the 
(metric) uncapacitated facility problem was given by Shmoys, Tardos, & Aardal 
[17]. Their algorithm is an LP rounding algorithm; they give a natural extension 
of the techniques of Lin & Vitter [13] to yield an algorithm with performance 
guarantee equal to 3/(1 − e−3) ≈ 3.16. 

Guha & Khuller [7] observed that one can strengthen the LP relaxation 
by approximately guessing the proportion of the overall cost incurred by the 
facilities in the optimal solution, and adding this as a constraint. Since there are 
only a polynomial number of reasonably-spaced guesses, we can try them all (in 
polynomial time). Guha & Khuller showed that by adding a local search phase, 
starting with the solution obtained by rounding the optimum to the “right” LP, 
one obtains a 2.41-approximation algorithm. For their result, the local search 
is extremely simple; one checks only whether some additional facility can be 
opened so that the overall cost decreases, and if so, one adds the facility that 
most decreases the overall cost. 

The LP rounding approach was further strengthened by Chudak & Shmoys 
[5, 6] who used only the simple LP relaxation (identical to the one used by Lin 
& Vitter and Shmoys, Tardos, & Aardal), but relied on stronger information 
about the structure of optimal solutions to the linear programming relaxation 
to yield a performance guarantee of 1 + 2/e ≈ 1.74. Another crucial ingredient 
in their approach is that of randomized rounding, which is a technique introduced 
by Raghavan & Thompson [16] in the context of multicommodity flow; 
in this approach, the fractional values contained in the optimal LP solution are 
treated as probabilities. Chudak & Shmoys showed how to incorporate the main 
decomposition results of [17] so as to obtain a variant that might best be called 
clustered randomized rounding. Chudak & Shmoys also showed that the tech



nique of Guha & Khuller could be applied to their algorithm, and by doing so, 
one improves the performance guarantee by a microscopic amount. 

The first constant performance guarantee for the k-median problem also 
relied an LP rounding approach: Charikar, Guha, Tardos, & Shmoys [4] gave 
a 20/3-approximation algorithm. A wide class of combinatorially-defined linear 
programs has the property that there exists an optimal solution with the 
property that each variable is equal to 0, 1/2, or 1; as will be discussed by 
Hochbaum in another invited talk at this workshop, this property often yields 
a 2–approximation algorithm as a natural corollary. Unfortunately, the linear 
relaxation used for the k-median problem does not have this property. However, 
the algorithm of [4] is based on the idea that one can reduce the problem to an 
input for which there is such an 1/2-integral solution, while losing only a constant 
factor, and then subsequently round the 1/2-integral solution to an integer 
one. 

3 Primal-dual algorithms 

In one of the most exciting developments in this area, Jain & Vazirani [9] gave an 
extremely elegant primal-dual algorithm for the uncapacitated facility location 
problem; they showed that their approach yields a 3-approximation algorithm, 
and it can be implemented to run in O(n2 log n) time. In fact, Jain & Vazirani 
proved a somewhat stronger performance guarantee; they showed that if we 
overcharge each facility used in the resulting solution by a factor of 3, then 
the total cost incurred is still within a factor of 3 of the optimum cost for the 
unaltered input. In fact, this overcharging is also within a factor of the 3 of the 
value of the LP relaxation with the original cost data. Mettu & Plaxton [15] give 
a variant of this algorithm (which is not explicitly a primal-dual algorithm, but 
builds substantially on its intuition) that can be implemented to run in O(n2) 
time, which is linear in the size of the input; this variant does not have the 
stronger guarantee (but is still a 3-approximation algorithm). 

This stronger performance guarantee for the uncapacitated facility location 
problem has significant implications for the k-median problem. One natural 
connection between the two problems works as follows: the facility costs can 
be viewed as Lagrangean multipliers that enforce the constraint that exactly 
k facilities are used. Suppose that we start with an instance of the k-median 
problem, and define an input to the uncapacitated facility location problem by 
letting C = F = N , and setting the cost of each facility to be a common value 
φ.If φ = 0, then clearly all facilities will be opened in the optimal solution, 
whereas for a sufficiently large value of φ, only one facility will be opened in the 
optimal solution. A similar trade-off curve will be generated by any (reasonable) 
algorithm for the uncapacitated facility location problem. 

Jain & Vazirani observed that if their approximation algorithm is used, and 
φ is set so that the number of facilities opened is exactly k, then the resulting 
solution for the k-median problem has cost within a factor of 3 of optimal. 
Unfortunately, the trade-off curve need not be continuous, and hence it is possible 


that no such value of φ exists. However, if this unlucky event occurs, then one 
can generate two solutions from essentially equal values of φ, where one solution 
has more than k facilities and the other has fewer; Jain & Vazirani show how 
to combine these two solutions to obtain a new solution with k facilities of cost 
that is within a factor of 6 of optimal. Charikar & Guha [3] exploited the fact 
that these two solutions have a great deal of common structure in order to give 
a refined analysis; in this way, they derive a 4-approximation algorithm for this 
problem. 

Mettu & Plaxton [15] also show how to extend their approach to the uncapacitated 
facility location problem to obtain an O(n2)-time algorithm for the 
k-median problem; in fact, their algorithm has the miraculous property that it 
outputs a single permutation of the input nodes such that, for any k, the first 
k nodes constitute a feasible solution within a constant factor of the optimal 
k-node solution. Thorup [20] has also given linear-time constant-factor approximation 
algorithms for the k-median problem; if the metric is defined with respect 
to a graph with m edges, then he gives a 12+o(1)-approximation algorithm that 
runs in O(m)time. 

4 Local search algorithms 

Local search is one of the most successful approaches to computing, in practice, 
good solutions to NP-hard optimization problems. Indeed, many of the most 
visible methods to the general scientific community are variations on this theme; 
simulated annealing, genetic algorithms, even neural networks can all be viewed 
as coming from this family of algorithms. In a local search algorithm, more 
narrowly viewed, one defines a graph on the space of all (feasible) solutions, 
where two solutions are neighbors if one solution can be obtained from the other 
by a particular type of modification. One then searches for a node that is locally 
optimal, that is, whose cost is no more than each of its neighbors, by taking a 
walk in the graph along progressively improving nodes. 

Korupolu, Plaxton, & Rajaraman [10] analyzed a local search algorithm in 
which two nodes are neighbors if exactly one facility is added (or, symmetrically, 
deleted) in comparing the two solutions, or both one facility is added and one 
facility is deleted. They show that a local optimum with respect to this neighborhood 
structure has cost within a factor of 5 of optimal. One further issue 
needs to be considered in deriving an approximation algorithm: the algorithm 
needs to run in polynomial time. This can be accomplished in a variety of ways, 
but typically involves computing a “reasonably good” solution with which to 
start the search, as well as insisting that a step not merely improve the cost, 
but improve the cost “significantly”. In this way, they show that, for any > 0, 
local search can yield a (5 + )-approximation algorithm. Charikar & Guha [3] 
give a more sophisticated neighborhood structure that yields another simple 
3-approximation algorithm. Furthermore, they show that rescaling the relative 
weight of the facility and the assignment costs leads to a 2.415-approximation 
algorithm based on local search. Finally, they show that all of the ideas above 


can be combined: LP rounding (on multiple LPs augmented as in [7] and augmented 
with a greedy improvement phase), primal-dual algorithms (improved 
with the rescaling idea), and the improved local search algorithm. In this way, 
they improve the performance guarantee of 1.736 due to Chudak & Shmoys to 

1.728. This might be the best guarantee known as of this writing, but for these 
two problems, it seems unlikely to be the last word. 

References 

1. S. Arora, P. Raghavan, and S. Rao. Approximation schemes for Euclidean k-
medians and related problems. In Proceedings of the 30th Annual ACM Symposium 
on Theory of Computing, pages 106–113, 1998. 
2. M. L. Balinksi. On finding integer solutions to linear programs. In Proceedings 
of the IBM Scientific Computing Symposium on Combinatorial Problems, pages 
225–248. IBM, 1966. 
3. M. Charikar and S. Guha. Improved combinatorial algorithms for the facility location 
and k-median problems. In Proceedings of the 40th Annual IEEE Symposium 
on Foundations of Computer Science, pages 378–388, 1999. 
´ 


4. M. Charikar, S. Guha, E. Tardos, and D. B. Shmoys. A constant-factor approximation 
algorithms for the k-median problem. In Proceedings of the 31st Annual 
ACM Symposium on Theory of Computing, pages 1–10, 1999. 
5. F. A. Chudak. Improved approximation algorithms for uncapacitated facility location. 
InR. E.Bixby, E.A.Boyd, andR. Z.R´ıos-Mercado, editors, Integer 
Programming and Combinatorial Optimization, volume 1412 of Lecture Notes in 
Computer Science, pages 180–194, Berlin, 1998. Springer. 
6. F. A. Chudak and D. B Shmoys. Improved approximation algorithms for the 
uncapacitated facility location problem. Submitted for publication. 
7. S. Guha and S. Khuller. Greedy strikes back: Improved facility location algorithms. 
In Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, 
pages 649–657, 1998. 
8. D. S. Hochbaum. Heuristics for the fixed cost median problem. Math. Programming, 
22:148–162, 1982. 
9. K. Jain and V. V. Vazirani. Primal-dual approximation algorithms for metric 
facility location and k-median problems. In Proceedings of the 40th Annual IEEE 
Symposium on Foundations of Computer Science, pages 2–13, 1999. 
10. M. R. Korupolu, C. G. Plaxton, and R. Rajaraman. Analysis of a local search 
heuristic for facility location problems. In Proceedings of the 9th Annual ACMSIAM 
Symposium on Discrete Algorithms, pages 1–10, 1998. 
11. A. A. Kuehn and M. J. Hamburger. A heuristic program for locating warehouses. 
Management Sci., 9:643–666, 1963. 
12. J.-H. Lin and J. S. Vitter. -approximations with minimum packing constraint 
violation. In Proceedings of the 24th Annual ACM Symposium on Theory of Computing, 
pages 771–782, 1992. 
13. J.-H. Lin and J. S. Vitter. Approximation algorithms for geometric median problems. 
Inform. Proc. Lett., 44:245–249, 1992. 
14. A. S. Manne. Plant location under economies-of-scale-decentralization and computation. 
Management Sci., 11:213–235, 1964. 


15. R. R. Mettu and C. G. Plaxton. The online median problem. In Proceedings of 
the 41st Annual IEEE Symposium on Foundations of Computer Science, 2000, to 
appear. 


16. P. Raghavan and C. D. Thompson. Randomized rounding: a technique for provably 
good algorithms and algorithmic proofs. Combinatorica, 7:365–374, 1987. 
´ 


17. D. B. Shmoys, E. Tardos, and K. I. Aardal. Approximation algorithms for facility 
location problems. In Proceedings of the 29th Annual ACM Symposium on Theory 
of Computing, pages 265–274, 1997. 
18. J. F. Stollsteimer. The effect of technical change and output expansion on the 
optimum number, size and location of pear marketing facilities in a California 
pear producing region. PhD thesis, University of California at Berkeley, Berkeley, 
California, 1961. 
19. J. F. Stollsteimer. A working model for plant numbers and locations. J. Farm 
Econom., 45:631–645, 1963. 
20. M. Thorup. Quick k-medians. Unpublished manuscript, 2000.