Ian P. Gent and Toby Walsh
In recent years, new constraint satisfaction algorithms have often been benchmarked on hard, random problems. Indeed, the study of random constraint satisfaction problems has itself become part of mainstream research (see, for example, [1,3]). There are, however, many reasons for wanting a larger class of problems in our benchmark suites. In this paper, we motivate the need for a benchmark library for constraints. We identify why this is a more difficult task than in other areas like machine learning or theorem proving. Finally, we describe CSPLIB , the first release of a benchmark library which is available over the World Wide Web at http://csplib.org.
There are many motivations for developing a benchmark library for constraints.
Whilst there are many constructive benefits of a benchmark library, there are also several potential pitfalls.
The machine learning community has had a long established benchmark library, the UCI Machine Learning Repository. This library contains some 111 data sets, and these are widely used to compare machine learning algorithms. In 1993, Holte published results which raise interesting questions about this library . He showed that very simple classification rules (1-level decision trees which classify examples on the basis of just a single attribute) compare very favourably in accuracy with the much more complex rules learnt by the (then) state of the art C4 algorithm on 16 data sets commonly used in machine learning research, 14 of which are taken from the UCI Machine Learning Repository.
As Holte himself argues, the practical significance of this result depends on whether or not these data sets are representative of those that arise in practice. Since we might not expect ``real'' classification problems to be solved by very simple rules, we might doubt if the data sets used in this study are representative of the data sets that actually arise in practice. By extension, we might also doubt the representativeness of the whole repository. Holte accepts that many classification problems that arise in practice do not have simple solutions (for example, the prediction of the secondary structure of protein, or the classification of won and lost positions in chess) and that machine learning techniques optimized for easy problems will perform poorly on such ``hard'' problems. However, he argues that many real classification problems are not so hard. He notes that many of the data sets in the UCI repository are drawn from a real-life domain and have not been specially ``doctored'' to make them easier to classify. Although these data sets cannot represent the entire range of ``real'' problems (for example, they do not represent any of the hard problems mentioned earlier), he argues that the number and diversity of them suggests that they represent a class of problems that often arises. Whilst we accept much of his argument, we must nevertheless treat parts of it with caution. For instance, as many of the data sets are taken from studies of machine learning algorithms, there may be a bias to the sort of easy problems which current machine learning algorithms are able to solve.
Having taken on board lessons from this cautionary tale, what should a model constraints library look like? Some (though not all) of the desiderata of such a library include:
We might look to domains other than machine learning to see how to develop a benchmark library which meets these desiderata. In theorem proving, for example, the TPTP (Thousands of Problems for Theorem Provers) library was started by Suttner and Sutcliffe in 1993  and has since proved highly successful and influential. It is a valued resource in the theorem proving community, and there is an annual system competition based on problems drawn at random from TPTP. The library contains over four thousand different problems, in 28 different domains, as well as comprehensive references on their source and a variety of problem generators. A software tool converts the problems into input formats used by the major theorem proving systems. In addition, the library provides guidelines on how to evaluate theorem proving systems.
Whilst we can learn much from the success of TPTP, there is one major and very significant difference between theorem proving and constraint satisfaction. In theorem proving, there is little debate about representation. Users of TPTP are happy to accept a simple skolemized first-order language. In constraint satisfaction, representation is much more of an issue. Our ability to solve a constraint satisfaction problem may hinge upon exactly how we decide to represent it. A benchmark library for constraints cannot therefore be as prescriptive as in theorem proving about the representation of the problems in it. At the 1998 DIMACS Workshop on Constraint Programming and Large Scale Discrete Optimization, we discussed this problem and suggested that problems should be presented in natural language. We first suggested English, but the non-native speakers of English quickly and correctly pointed out the bias in such a choice. Any other representation may make assumptions about representation that are unwanted.
To be successful, the research community must become active users of the library and contributors to it. This is in fact the most important desideratum of all: if CSPLIB is not used and contributed to, the time spent in setting it up will have been wasted. However it is one aspect of the library beyond the control of the maintainers. We hope that researchers in constraints will come to view CSPLIB as a one stop shop for finding constraints benchmarks, and disseminating new benchmarks. We hope that when you read a paper at CP-2000 solving a new kind of problem with constraints methods, you will find the problem already posted at CSPLIB . More than that, we hope that CSPLIB will become a rich enough source of benchmarks that when you invent a new technique, you will surf to CSPLIB in the expectation of finding a challenge problem there which has the features which your technique can exploit. And more even than that, we hope that as you browse CSPLIB you will be inspired by the problems that remain open to invent new ways of using constraints to solve difficult problems. None of these possibilities can happen unless the community becomes active in submitting problems to the library. Our experience is that writing an entry takes less than an hour, and the service to the community is invaluable.
What kind of problems would now be found at CSPLIB if authors of papers at CP-98 had been able to contribute to it? Just from test instances reported in the proceedings  but not available in other benchmark libraries, CSPLIB would now contain instances from protein structure prediction, due date constraints in scheduling, ordinary differential equations, flow shop scheduling, temporal reasoning problems, key/lock configuration, handling aerial images, integration with a graphics editor, hoist scheduling problems, and vehicle routing problems. Some of these instances are available either in the web or detailed in published papers, but many are not easily available in any form.
From this list of problems, it is clear that we intend to be liberal in the variety of problems in CSPLIB . Here is a non-exhaustive list of some of the other ways in which we will be liberal:
To help users submit benchmark problems to CSPLIB , we have prepared some simple guidelines. We repeat these in full:
``We welcome submission of all your constraint problems. We hope that the library will thereby grow to become a valued resource within the constraints community.
A key factor in solving many constraints problems is the modelling of the constraints. We therefore specify all problems in CSPLIB using natural language. We also provide hints about modelling these problems.
As we want to help people benchmark their algorithms on problems in CSPLIB with minimum effort, we encourage users of the library to send us any tools that might be useful to others (e.g. C parsers for data files, AMPL specification of problems, ILOG Solver code, ...). All such code is placed in the library through the generosity of the authors, and comes with all the usual disclaimers.
To make comparison with previous work easier, we provide links to papers that use these benchmarks. If you use a problem in CSPLIB , please send us the URL to your paper so we can add it to the references section.
Finally, to make it easy to compare your results with others, we will provide a record of results. Please help us keep these records up-to-date by sending in your results promptly. ''
Entries in the benchmark library typically consist of an index page, a specification page, a reference page, a results page, one or more data files (where applicable), and one or more software tools. The specification page is a simple natural language description of the problem. Numerous links are provided to outside resources, including online version of papers which report results on the benchmark in question. Problems are divided into different subject categories. They range from simple combinatorial puzzles like the existence or non-existence of a Golomb ruler of certain size, to more realistic examples like scheduling the 1997/98 Atlantic Coast Conference basketball tournament. It is our intention that no problem will be too small or too big for inclusion in the library.
As an example of a typical entry, Barbara Smith has proposed template design, a problem extracted from the printing industry, as a possible benchmark for the constraints community. The problem appears as prob002 in CSPLIB , We reproduce the pages from CSPLIB concerned with this problem. One factor missing from the reproduction of these pages are the many hyperlinks to online sources of information. We believe this may be one of the most useful aspects of the library. We therefore encourage users to send us any relevant URLs.
CSPLIB was first released on March 4th, 1999. The release version contained fourteen problems, divided into five, overlapping subject areas: scheduling and related problems, bin packing and related problems, frequency assignment and related problems, combinatorial mathematics, and games and puzzles. The total size of the library is just over 1 MByte. In its first month of operation, the library has received more than 100 visitors each week. A low-volume mailing list is maintained to make announcements about significant changes to the library. In addition, users can email email@example.com to propose new entries, or to extend or correct existing ones. The site is located at http://www.csplib.org and is mirrored at http://www.cs.cornell.edu/home/selman/csplib in the United States to help improve the speed of access. We are very grateful to Bart Selman for his assistance with the mirror.
We have introduced CSPLIB , a benchmark library for constraints. With the support of the research community, we believe that this library can grow to be a very valued resource. Unlike many other domains (for example, theorem proving, or machine learning), representation remains a major issue in our ability to solve constraint satisfaction problems. All problems in the library are therefore specified in natural language so that users can devise their own representations. In addition, many entries contain example representations (e.g. CHIP or Solver code). The success or failure of the library depends on its uptake by the constraints community. We encourage you therefore to benchmark your algorithms on the problems contained within CSPLIB , to send us results and any other data of interest that can be included in the entries, to notify us of any errors, but most of all, to propose new problems. Without your help, we cannot succeed.
CSPLIB : a benchmark library for constraints
This document was generated using the LaTeX2HTML translator Version 95.1 (Fri Jan 20 1995) Copyright © 1993, 1994, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The command line arguments were:
latex2html -dir temp -split 0 csplib.tex.
The translation was initiated by Toby Walsh on Fri Oct 8 11:23:06 BST 1999