[Previous chapter] [Next chapter]

More advanced modeling features

Overview

This chapter introduces some more advanced features of the modeling language in Mosel. We shall not attempt to cover all its features or give the detailed specification of their formats. These are covered in greater depth in the Mosel Reference Manual.

Almost all large scale LP and MIP problems have a property known as sparsity, that is, each variable appears with a non-zero coefficient in a very small fraction of the total set of constraints. Often this property is reflected in the data tables used in the model in that many values of the tables are zero. When this happens, it is more convenient to provide just the non-zero values of the data table rather than listing all the values, the majority of which are zero. This is also the easiest way to input data into data tables with more than two dimensions. An added advantage is that less memory is used by Mosel.

The main areas covered in this chapter are related to this property:

dynamic arrays

sparse data

conditional generation

displaying data

We start again with an example problem. The following sections deal with the different topics in more detail.

A transport example

A company produces the same product at different plants in the UK. Every plant has a different production cost per unit and a limited total capacity. The customers (grouped into customer regions) may receive the product from different production locations. The transport cost is proportional to the distance between plants and customers, and the capacity on every delivery route is limited. The objective is to minimize the total cost, whilst satisfying the demands of all customers.

Model formulation

Let PLANT be the set of plants and REGION the set of customer regions. We define decision variables flow_pr for the quantity transported from plant p to customer region r. The total cost of the amount of product p delivered to region r is given as the sum of the transport cost (the distance between p and r multiplied by a factor FUELCOST) and the production cost at plant p:

minimize

_{p PLANT}

_{r REGION}

(FUELCOST · DISTANCE_pr + PLANTCOST_p) · flow_pr

The limits on plant capacity are give through the constraints

p PLANT:

_{r REGION}

flow_pr PLANTCAP_p

We want to meet all customer demands:

r REGION:

_{p PLANT}

flow_pr = DEMAND_r

The transport capacities on all routes are limited and there are no negative flows:

p PLANT, r REGION: 0 flow_pr TRANSCAP_pr

For simplicity's sake, in this mathematical model we assume that all routes p r are defined and that we have TRANSCAP_pr=0 to indicate that a route cannot be used.

Implementation

This problem may be implemented with Mosel as shown in the following (model file transport.mos):
model Transport
 uses "mmxprs"

 declarations
  REGION: set of string                 ! Set of customer regions
  PLANT: set of string                  ! Set of plants

  DEMAND: array(REGION) of real         ! Demand at regions
  PLANTCAP: array(PLANT) of real        ! Production capacity at plants
  PLANTCOST: array(PLANT) of real       ! Unit production cost at plants
  TRANSCAP: array(PLANT,REGION) of real ! Capacity on each route plant->region
  DISTANCE: array(PLANT,REGION) of real ! Distance of each route plant->region
  FUELCOST: real                        ! Fuel cost per unit distance

  flow: array(PLANT,REGION) of mpvar    ! Flow on each route
 end-declarations
 
 initializations from 'transprt.dat'
  DEMAND
  [PLANTCAP,PLANTCOST] as 'PLANTDATA'
  [DISTANCE,TRANSCAP] as 'ROUTES'
  FUELCOST
 end-initializations
 
! Create the flow variables that exist
 forall(p in PLANT, r in REGION | exists(TRANSCAP(p,r)) ) create(flow(p,r))
 
! Objective: minimize total cost
 MinCost:= sum(p in PLANT, r in REGION | exists(flow(p,r))) 
            (FUELCOST * DISTANCE(p,r) + PLANTCOST(p)) * flow(p,r)
 
! Limits on plant capacity
 forall(p in PLANT) sum(r in REGION) flow(p,r) <= PLANTCAP(p)</p>

! Satisfy all demands
 forall(r in REGION) sum(p in PLANT) flow(p,r) = DEMAND(r)
 
! Bounds on flows
 forall(p in PLANT, r in REGION | exists(flow(p,r))) 
  flow(p,r) <= TRANSCAP(p,r)
 
 minimize(MinCost)                       ! Solve the problem
 
end-model
REGION and PLANT are declared to be sets of strings, as yet of unknown size. The data arrays (DEMAND, PLANTCAP, PLANTCOST, TRANSCAP, and DISTANCE) and the array of variables flow are indexed by members of REGION and PLANT, their size is therefore not known at their declaration: they are created as dynamic arrays. There is a slight difference between dynamic arrays of data and of decision variables (type mpvar): an entry of a data array is created automatically when it is used in the Mosel program, entries of decision variable arrays need to be created explicitly (see Section Conditional variable creation and create below).

The data file transprt.dat contains the problem specific data. It might have, for instance,
DEMAND: [ (Scotland) 2840 (North) 2800 (SWest) 2600 (SEast) 2820 (Midlands) 2750]

                     ! [CAP  COST] 
PLANTDATA: [ (Corby)   [3000 1700]
             (Deeside) [2700 1600]
             (Glasgow) [4500 2000]
             (Oxford)  [4000 2100] ]

                           ! [DIST CAP] 
ROUTES: [ (Corby   North)    [400 1000]
          (Corby   SWest)    [400 1000]
          (Corby   SEast)    [300 1000]
          (Corby   Midlands) [100 2000]
          (Deeside Scotland) [500 1000]
          (Deeside North)    [200 2000]
          (Deeside SWest)    [200 1000]
          (Deeside SEast)    [200 1000]
          (Deeside Midlands) [400  300]
          (Glasgow Scotland) [200 3000]
          (Glasgow North)    [400 2000]
          (Glasgow SWest)    [500 1000]
          (Glasgow SEast)    [900  200]
          (Oxford  Scotland) [800    *]
          (Oxford  North)    [600 2000]
          (Oxford  SWest)    [300 2000]
          (Oxford  SEast)    [200 2000]
          (Oxford  Midlands) [400  500] ]

FUELCOST: 17
where we give the ROUTES data only for possible plant/region routes, indexed by the plant and region. It is possible that some data are not specified; for instance, there is no Corby – Scotland route. So the data are sparse and we just create the flow variables for the routes that exist. (The `*' for the (Oxford,Scotland) entry in the capacity column indicates that the entry does not exist; we may write '0' instead: in this case the corresponding flow variable will be created but bounded to be 0 by the transport capacity limit).

The condition whether an entry in a data table is defined is tested with the Mosel function exists. With the help of the `|' operator we add this test to the forall loop creating the variables. It is not required to add this test to the sums over these variables: only the flow_pr variables that have been created are taken into account. However, if the sums involve exactly the index sets that have been used in the declaration of the variables (here this is the case for the objective function MinCost), adding the existence test helps to speed up the enumeration of the existing index-tuples. The following section introduces the conditional generation in a more systematic way.

Conditional generation — the | operator

Suppose we wish to apply an upper bound to some but not all members of a set of variables x_i. There are MAXI members of the set. The upper bound to be applied to x_i is U_i, but it is only to be applied if the entry in the data table TAB_i is greater than 20. If the bound did not depend on the value in TAB_i then the statement would read:
forall(i in 1..MAXI) x(i) <= U(i)
Requiring the condition leads us to write
forall(i in 1..MAXI | TAB(i) > 20 ) x(i) <= U(i) 
The symbol `|' can be read as `such that' or `subject to'.

Now suppose that we wish to model the following

^MAXI

_{i=1, A_i>20}

x_i 15

In other words, we just want to include in a sum those x_i for which A_i is greater than 20. This is accomplished by
CC:= sum((i in 1..MAXI | A(i)>20 ) x(i) <= 15
Conditional variable creation and create

As we have already seen in the transport example (Section A transport example), with Mosel we can conditionally create variables. In this section we show a few more examples.

Suppose that we have a set of decision variables x(i) where we do not know the set of i for which x(i) exist until we have read data into an array WHICH.
model doesx
 declarations
  IR = 1..15
  WHICH: set of integer
  x: dynamic array(IR) of mpvar
 end-declarations

! Read data from file
 initializations from 'doesx.dat'
  WHICH
 end-initializations

! Create the x variables that exist
 forall(i in WHICH) create(x(i))

! Build a little model to show what esists
 Obj:= sum(i in IR) x(i)
 C:= sum(i in IR) i * x(i) >= 5

 exportprob(0, "", Obj)               ! Display the model
end-model
If the data in doesx.dat are
WHICH: [1 4 7 11 14]
the output from the model is
Minimize
 x(1) + x(4) + x(7) + x(11) + x(14)
Subject To
C: x(1) + 4 x(4) + 7 x(7) + 11 x(11) + 14 x(14) >= 5
Bounds
End
Note: exportprob(0, "", Obj) is a nice idiom for seeing on-screen the problem that has been created.

The key point is that x has been declared as a dynamic array, and then the variables that exist have been created explicitly with create. In the transport example in Section A transport example we have seen a different way of declaring dynamic arrays: the arrays are implicitly declared as dynamic arrays since the index sets are unknown at their declaration.

When we later take operations over the index set of x (for instance, summing), we only include those x that have been created.

Another way to do this, is
model doesx2
 declarations
  WHICH: set of integer
 end-declarations

 initializations from 'doesx.dat'
  WHICH
 end-initializations

 finalize(WHICH)

 declarations
  x: array(WHICH) of mpvar        ! Here the array is _not_ dynamic
 end-declarations                 !  because the set has been finalized

 Obj:= sum(i in WHICH) x(i)
 C:= sum(i in WHICH) i * x(i) >= 5

 exportprob(0, "", Obj)
end-model
By default, an array is of fixed size if all of its indexing sets are of fixed size (i.e. they are either constant or have been finalized). Finalizing turns a dynamic set into a constant set consisting of the elements that are currently in the set. All subsequently declared arrays that are indexed by this set will be created as static (= fixed size). The second method has two advantages: it is more efficient, and it does not require us to think of the limits of the range IR a priori.

Reading sparse data

Suppose we want to read in data of the form

i, j, value_ij

from an ASCII file, setting up a dynamic array A(range, range) with just the A(i,j)= value_ij for the pairs (i,j) which exist in the file. Here is an example which shows three different ways of doing this. We read data from differently formatted files into three different arrays, and using writeln show that the arrays hold identical data.

Data input with initializations from

The first method, using the initializations block, has already been introduced (transport problem in Section A transport example).
model "Trio input (1)"
 declarations
  A1: array(range,range) of real
 end-declarations

! First method: use an initializations block
 initializations from 'data_1.dat'
  A1 as 'MYDATA'
 end-initializations

! Now let us see what we have
 writeln('A1 is: ', A1)
end-model
The data file data_1.dat could be set up thus (every data item is preceded by its index-tuple):
MYDATA: [ (1 1) 12.5 (2 3) 5.6 (10 9) -7.1 (3 2) 1 ]
This model produces the following output:
A1 is: [(1,1,12.5),(2,3,5.6),(3,2,1),(10,9,-7.1)]
Data input with readln

The second way of setting up and accessing data demonstrates the immense flexibility of readln. The format of the data file may be freely defined by the user. After every call to read or readln the parameter nbread contains the number of items read. Its value should be tested to check whether the end of the data file has been reached or an error has occurred (e.g. unrecognized data items due to incorrect formating of a data line). Notice that read and readlninterpret spaces as separators between data items; strings containing spaces must therefore be quoted using either single or double quotes.
model "Trio input (2)"
 declarations
  A2: array(range,range) of real
  i, j: integer
 end-declarations

! Second method: use the built-in readln function 
 fopen("data_2.dat",F_INPUT)
 repeat
  readln('Tut(', i, 'and', j, ')=', A2(i,j))
 until getparam("nbread") < 6
 fclose(F_INPUT)

! Now let us see what we have
 writeln('A2 is: ', A2)
end-model
The data file data_2.dat could be set up thus:

File data_2.dat:
Tut(1 and 1)=12.5
Tut(2 and 3)=5.6
Tut(10 and 9)=-7.1
Tut(3 and 2)=1
When running this second model version we get the same output as before:
A2 is: [(1,1,12.5),(2,3,5.6),(3,2,1),(10,9,-7.1)]
Data input with diskdata

As a third possibility, one may use the diskdata I/O driver from module mmetc to read in comma separated value (CSV) files. With this driver the data file may contain single line comments preceded with !.
model "Trio input (3)"
 uses "mmetc"                      ! Required for diskdata

 declarations
  A3: array(range,range) of real
 end-declarations

! Third method: use diskdata driver
 initializations from 'mmetc.diskdata:'
  A3 as 'sparse,data_3.dat'
 end-initializations

! Now let us see what we have
 writeln('A3 is: ', A3)
end-model
The data file data_3.dat is set up thus (one data item per line, preceded by its indices, all separated by commas):
1, 1, 12.5
2, 3,  5.6
10,9, -7.1
3, 2, 1
We obtain again the same output as before when running this model version:
A3 is: [(1,1,12.5),(2,3,5.6),(3,2,1),(10,9,-7.1)]
Note: the diskdata format is deprecated, it is provided to enable the use of data sets designed for mp-model and does not support certain new features introduced by Mosel.

[Next] [Index]

If you have any comments or suggestions about these pages, please send mail to docs@dashoptimization.com.