More advanced modeling features
Overview
This chapter introduces some more advanced features of the modeling language in Mosel. We shall not attempt to cover all its features or give the detailed specification of their formats. These are covered in greater depth in the Mosel Reference Manual.
Almost all large scale LP and MIP problems have a property known as sparsity, that is, each variable appears with a non-zero coefficient in a very small fraction of the total set of constraints. Often this property is reflected in the data tables used in the model in that many values of the tables are zero. When this happens, it is more convenient to provide just the non-zero values of the data table rather than listing all the values, the majority of which are zero. This is also the easiest way to input data into data tables with more than two dimensions. An added advantage is that less memory is used by Mosel.
The main areas covered in this chapter are related to this property:
- dynamic arrays
- sparse data
- conditional generation
- displaying data
We start again with an example problem. The following sections deal with the different topics in more detail.
A transport example
A company produces the same product at different plants in the UK. Every plant has a different production cost per unit and a limited total capacity. The customers (grouped into customer regions) may receive the product from different production locations. The transport cost is proportional to the distance between plants and customers, and the capacity on every delivery route is limited. The objective is to minimize the total cost, whilst satisfying the demands of all customers.
Model formulation
Let PLANT be the set of plants and REGION the set of customer regions. We define decision variables flowpr for the quantity transported from plant p to customer region r. The total cost of the amount of product p delivered to region r is given as the sum of the transport cost (the distance between p and r multiplied by a factor FUELCOST) and the production cost at plant p:
minimize
p PLANT
(FUELCOST · DISTANCEpr + PLANTCOSTp) · flowpr
r REGION
The limits on plant capacity are give through the constraints
p
PLANT:
flowpr
r REGION
PLANTCAPp
We want to meet all customer demands:
r
REGION:
flowpr = DEMAND_r
p PLANT
The transport capacities on all routes are limited and there are no negative flows:
p
PLANT, r
REGION: 0
flowpr
TRANSCAPpr
For simplicity's sake, in this mathematical model we assume that all routes p
r are defined and that we have TRANSCAPpr=0 to indicate that a route cannot be used.
Implementation
This problem may be implemented with Mosel as shown in the following (model file transport.mos):
model Transport uses "mmxprs" declarations REGION: set of string ! Set of customer regions PLANT: set of string ! Set of plants DEMAND: array(REGION) of real ! Demand at regions PLANTCAP: array(PLANT) of real ! Production capacity at plants PLANTCOST: array(PLANT) of real ! Unit production cost at plants TRANSCAP: array(PLANT,REGION) of real ! Capacity on each route plant->region DISTANCE: array(PLANT,REGION) of real ! Distance of each route plant->region FUELCOST: real ! Fuel cost per unit distance flow: array(PLANT,REGION) of mpvar ! Flow on each route end-declarations initializations from 'transprt.dat' DEMAND [PLANTCAP,PLANTCOST] as 'PLANTDATA' [DISTANCE,TRANSCAP] as 'ROUTES' FUELCOST end-initializations ! Create the flow variables that exist forall(p in PLANT, r in REGION | exists(TRANSCAP(p,r)) ) create(flow(p,r)) ! Objective: minimize total cost MinCost:= sum(p in PLANT, r in REGION | exists(flow(p,r))) (FUELCOST * DISTANCE(p,r) + PLANTCOST(p)) * flow(p,r) ! Limits on plant capacity forall(p in PLANT) sum(r in REGION) flow(p,r) <= PLANTCAP(p)</p> ! Satisfy all demands forall(r in REGION) sum(p in PLANT) flow(p,r) = DEMAND(r) ! Bounds on flows forall(p in PLANT, r in REGION | exists(flow(p,r))) flow(p,r) <= TRANSCAP(p,r) minimize(MinCost) ! Solve the problem end-modelREGION and PLANT are declared to be sets of strings, as yet of unknown size. The data arrays (DEMAND, PLANTCAP, PLANTCOST, TRANSCAP, and DISTANCE) and the array of variables flow are indexed by members of REGION and PLANT, their size is therefore not known at their declaration: they are created as dynamic arrays. There is a slight difference between dynamic arrays of data and of decision variables (type mpvar): an entry of a data array is created automatically when it is used in the Mosel program, entries of decision variable arrays need to be created explicitly (see Section Conditional variable creation and create below).
The data file transprt.dat contains the problem specific data. It might have, for instance,
DEMAND: [ (Scotland) 2840 (North) 2800 (SWest) 2600 (SEast) 2820 (Midlands) 2750] ! [CAP COST] PLANTDATA: [ (Corby) [3000 1700] (Deeside) [2700 1600] (Glasgow) [4500 2000] (Oxford) [4000 2100] ] ! [DIST CAP] ROUTES: [ (Corby North) [400 1000] (Corby SWest) [400 1000] (Corby SEast) [300 1000] (Corby Midlands) [100 2000] (Deeside Scotland) [500 1000] (Deeside North) [200 2000] (Deeside SWest) [200 1000] (Deeside SEast) [200 1000] (Deeside Midlands) [400 300] (Glasgow Scotland) [200 3000] (Glasgow North) [400 2000] (Glasgow SWest) [500 1000] (Glasgow SEast) [900 200] (Oxford Scotland) [800 *] (Oxford North) [600 2000] (Oxford SWest) [300 2000] (Oxford SEast) [200 2000] (Oxford Midlands) [400 500] ] FUELCOST: 17where we give the ROUTES data only for possible plant/region routes, indexed by the plant and region. It is possible that some data are not specified; for instance, there is no Corby – Scotland route. So the data are sparse and we just create the flow variables for the routes that exist. (The `*' for the (Oxford,Scotland) entry in the capacity column indicates that the entry does not exist; we may write '0' instead: in this case the corresponding flow variable will be created but bounded to be 0 by the transport capacity limit).
The condition whether an entry in a data table is defined is tested with the Mosel function exists. With the help of the `|' operator we add this test to the forall loop creating the variables. It is not required to add this test to the sums over these variables: only the flowpr variables that have been created are taken into account. However, if the sums involve exactly the index sets that have been used in the declaration of the variables (here this is the case for the objective function MinCost), adding the existence test helps to speed up the enumeration of the existing index-tuples. The following section introduces the conditional generation in a more systematic way.
Conditional generation — the | operator
Suppose we wish to apply an upper bound to some but not all members of a set of variables xi. There are MAXI members of the set. The upper bound to be applied to xi is Ui, but it is only to be applied if the entry in the data table TABi is greater than 20. If the bound did not depend on the value in TABi then the statement would read:
forall(i in 1..MAXI) x(i) <= U(i)Requiring the condition leads us to write
forall(i in 1..MAXI | TAB(i) > 20 ) x(i) <= U(i)The symbol `|' can be read as `such that' or `subject to'.
Now suppose that we wish to model the following
xi
MAXI i=1, Ai>20 15
In other words, we just want to include in a sum those xi for which Ai is greater than 20. This is accomplished by
CC:= sum((i in 1..MAXI | A(i)>20 ) x(i) <= 15Conditional variable creation and create
As we have already seen in the transport example (Section A transport example), with Mosel we can conditionally create variables. In this section we show a few more examples.
Suppose that we have a set of decision variables x(i) where we do not know the set of i for which x(i) exist until we have read data into an array WHICH.
model doesx declarations IR = 1..15 WHICH: set of integer x: dynamic array(IR) of mpvar end-declarations ! Read data from file initializations from 'doesx.dat' WHICH end-initializations ! Create the x variables that exist forall(i in WHICH) create(x(i)) ! Build a little model to show what esists Obj:= sum(i in IR) x(i) C:= sum(i in IR) i * x(i) >= 5 exportprob(0, "", Obj) ! Display the model end-modelIf the data in doesx.dat are
WHICH: [1 4 7 11 14]the output from the model is
Minimize x(1) + x(4) + x(7) + x(11) + x(14) Subject To C: x(1) + 4 x(4) + 7 x(7) + 11 x(11) + 14 x(14) >= 5 Bounds EndNote: exportprob(0, "", Obj) is a nice idiom for seeing on-screen the problem that has been created.
The key point is that x has been declared as a dynamic array, and then the variables that exist have been created explicitly with create. In the transport example in Section A transport example we have seen a different way of declaring dynamic arrays: the arrays are implicitly declared as dynamic arrays since the index sets are unknown at their declaration.
When we later take operations over the index set of x (for instance, summing), we only include those x that have been created.
Another way to do this, is
model doesx2 declarations WHICH: set of integer end-declarations initializations from 'doesx.dat' WHICH end-initializations finalize(WHICH) declarations x: array(WHICH) of mpvar ! Here the array is _not_ dynamic end-declarations ! because the set has been finalized Obj:= sum(i in WHICH) x(i) C:= sum(i in WHICH) i * x(i) >= 5 exportprob(0, "", Obj) end-modelBy default, an array is of fixed size if all of its indexing sets are of fixed size (i.e. they are either constant or have been finalized). Finalizing turns a dynamic set into a constant set consisting of the elements that are currently in the set. All subsequently declared arrays that are indexed by this set will be created as static (= fixed size). The second method has two advantages: it is more efficient, and it does not require us to think of the limits of the range IR a priori.
Reading sparse data
Suppose we want to read in data of the form
i, j, valueij from an ASCII file, setting up a dynamic array A(range, range) with just the A(i,j)= valueij for the pairs (i,j) which exist in the file. Here is an example which shows three different ways of doing this. We read data from differently formatted files into three different arrays, and using writeln show that the arrays hold identical data.
Data input with initializations from
The first method, using the initializations block, has already been introduced (transport problem in Section A transport example).
model "Trio input (1)" declarations A1: array(range,range) of real end-declarations ! First method: use an initializations block initializations from 'data_1.dat' A1 as 'MYDATA' end-initializations ! Now let us see what we have writeln('A1 is: ', A1) end-modelThe data file data_1.dat could be set up thus (every data item is preceded by its index-tuple):
MYDATA: [ (1 1) 12.5 (2 3) 5.6 (10 9) -7.1 (3 2) 1 ]This model produces the following output:
A1 is: [(1,1,12.5),(2,3,5.6),(3,2,1),(10,9,-7.1)]Data input with readln
The second way of setting up and accessing data demonstrates the immense flexibility of readln. The format of the data file may be freely defined by the user. After every call to read or readln the parameter nbread contains the number of items read. Its value should be tested to check whether the end of the data file has been reached or an error has occurred (e.g. unrecognized data items due to incorrect formating of a data line). Notice that read and readlninterpret spaces as separators between data items; strings containing spaces must therefore be quoted using either single or double quotes.
model "Trio input (2)" declarations A2: array(range,range) of real i, j: integer end-declarations ! Second method: use the built-in readln function fopen("data_2.dat",F_INPUT) repeat readln('Tut(', i, 'and', j, ')=', A2(i,j)) until getparam("nbread") < 6 fclose(F_INPUT) ! Now let us see what we have writeln('A2 is: ', A2) end-modelThe data file data_2.dat could be set up thus:
File data_2.dat:
Tut(1 and 1)=12.5 Tut(2 and 3)=5.6 Tut(10 and 9)=-7.1 Tut(3 and 2)=1When running this second model version we get the same output as before:
A2 is: [(1,1,12.5),(2,3,5.6),(3,2,1),(10,9,-7.1)]Data input with diskdata
As a third possibility, one may use the diskdata I/O driver from module mmetc to read in comma separated value (CSV) files. With this driver the data file may contain single line comments preceded with !.
model "Trio input (3)" uses "mmetc" ! Required for diskdata declarations A3: array(range,range) of real end-declarations ! Third method: use diskdata driver initializations from 'mmetc.diskdata:' A3 as 'sparse,data_3.dat' end-initializations ! Now let us see what we have writeln('A3 is: ', A3) end-modelThe data file data_3.dat is set up thus (one data item per line, preceded by its indices, all separated by commas):
1, 1, 12.5 2, 3, 5.6 10,9, -7.1 3, 2, 1We obtain again the same output as before when running this model version:
A3 is: [(1,1,12.5),(2,3,5.6),(3,2,1),(10,9,-7.1)]Note: the diskdata format is deprecated, it is provided to enable the use of data sets designed for mp-model and does not support certain new features introduced by Mosel.
If you have any comments or suggestions about these pages, please send mail to docs@dashoptimization.com.