Yet Another Math Programming Consultant: 1 vs 4 threads

Wednesday, November 21, 2012

1 vs 4 threads

Running a small MIP using 1 thread:

Optimize a model with 177 rows, 549 columns and 3800 nonzeros
Presolve time: 0.02s
Presolved: 177 rows, 549 columns, 3944 nonzeros
Variable types: 1 continuous, 548 integer (496 binary)
Found heuristic solution: objective 949.0000000

Root relaxation: objective 4.990000e+02, 947 iterations, 0.03 seconds

     0     0 499.00000    0   63 949.00000 499.00000 47.4%     -    0s
     0     0 499.00000    0   93 949.00000 499.00000 47.4%     -    0s
     0     0 499.00000    0   63 949.00000 499.00000 47.4%     -    0s
     0     2 499.00000    0   63 949.00000 499.00000 47.4%     -    0s
H 135   117                     910.0000000 499.00000 45.2% 47.3    0s
* 204    96              44     901.0000000 499.00000 44.6% 47.1    0s
1963   797 891.00000   34   53 901.00000 578.12129 35.8% 46.1    5s
6122 2135 822.02658   39   78 901.00000 720.84013 20.0% 34.5   10s
12502 3801     cutoff   44       901.00000 796.00000 11.7% 27.4   15s
21404 5575 891.00000   51   42 901.00000 862.00000 4.33% 21.7   20s
34494 6391 891.00000   42   53 901.00000 891.00000 1.11% 17.1   25s
52100 4430 891.00000   67   60 901.00000 891.00000 1.11% 12.8   31s
68445 3439     cutoff   72       901.00000 891.00000 1.11% 10.7   35s
94101 2702 infeasible   56       901.00000 891.00000 1.11%   8.9   40s
115791 2055     cutoff   55       901.00000 891.00000 1.11%   8.0   45s
141811 1165 891.00000   40   81 901.00000 891.00000 1.11%   7.2   50s
168722   165 891.00000   44   59 901.00000 891.00000 1.11%   6.7   55s

Explored 177443 nodes (1164346 simplex iterations) in 56.54 seconds
Thread count was 1 (of 8 available processors)

Optimal solution found (tolerance 0.00e+00)
Best objective 9.010000000000e+02, best bound 9.010000000000e+02, gap 0.0%
MIP status(2): Model was solved to optimality (subject to tolerances).

Now solve the same model with the same solver on the same machine using 4 threads:

Root relaxation: objective 4.990000e+02, 947 iterations, 0.03 seconds

     0     0 499.00000    0   63 949.00000 499.00000 47.4%     -    0s
     0     0 499.00000    0   93 949.00000 499.00000 47.4%     -    0s
     0     0 499.00000    0   63 949.00000 499.00000 47.4%     -    0s
     0     3 499.00000    0   63 949.00000 499.00000 47.4%     -    0s
* 141    97              50     901.0000000 499.00000 44.6% 50.9    0s
6876 3503     cutoff   64       901.00000 558.96236 38.0% 30.4    5s
31149 7878 infeasible   58       901.00000 891.00000 1.11% 18.1   10s
72874 4957     cutoff   63       901.00000 891.00000 1.11% 12.2   18s
87864 4665 891.00000   55    8 901.00000 891.00000 1.11% 11.3   20s

. . . .
97009933 49020 897.00000   80   47 901.00000 897.00000 0.44%   5.8 9085s
97054184 47405 infeasible   86       901.00000 897.00000 0.44%   5.8 9090s
97103347 45433 infeasible   95       901.00000 897.00000 0.44%   5.8 9095s
97174397    76 infeasible   97       901.00000 900.91810 0.01%   5.8 9100s

Explored 97174474 nodes (560614904 simplex iterations) in 9100.06 seconds
Thread count was 4 (of 8 available processors)

Optimal solution found (tolerance 0.00e+00)
Best objective 9.010000000000e+02, best bound 9.010000000000e+02, gap 0.0%
MIP status(2): Model was solved to optimality (subject to tolerances).

Solution times go from 1 minute to 2.5 hours! Extrapolating this to 8 cores gives me the shivers….

6 comments:

Tobias AchterbergNovember 26, 2012 at 4:30 AM
I do not think that this points to any weakness in Gurobi's (or any MIP solver's) parallelization. My guess is that what you are observing is just another incarnation of what is known as "performance variability" in MIP. Namely, for some MIP models, seemingly innocent changes like changing the number of threads or even changing the random seed of the solver can have a dramatic impact on solvability of the problem instance. This is not so much a property of the MIP solver, but more one of the MIP model or the actual problem instance. The underlying cause is, of course, the NP-hardness of MIP. If the solver is lucky, it finds the magic cutting plane or the good branching split. If it is unlucky, it triggers the exponential blow-up of NP-hardness.

As I said, some MIP models are more susceptible to performance variability than others. For example, this has been investigated in the work of Emilie Dana and in the MIPLIB 2010 paper of Koch et. al.
Using CPLEX 12.5, you can measure the performance variability for a given MIP model yourself. Just solve the model multiple times, using a different value for the CPX_PARAM_RANDOMSEED parameter each time (use "set randomseed " in the interactive CPLEX shell). Now, the variance across the solve times gives you an indication about the performance variability of the model.
ReplyDelete
Replies
Erwin KalvelagenNovember 26, 2012 at 7:51 AM
See Performance variability in mixed integer programming and miplib5.

I think for this particular case we could reduce the embarrassment factor by solving the problem on 1 thread and on (n-1) threads concurrently.
ReplyDelete
Replies
AnonymousNovember 26, 2012 at 8:32 AM
I can relate to you... happens every time in SAT. I don't know how MT is done in MIP, but in SAT we have multiple solver threads solving the same instance in parallel, sharing information between them. There, the best way to reduce the impact of this problem is by making the different threads take different strategies, and hoping that one is good. However, even with that tactic it is not guaranteed at all that this will not happen. In fact, it happens anyway, but less often. Though in SAT this also happens even in single-threaded mode, if the random seed is changed from one run to the other. I take this as being normal until we don't have a better understanding of SAT and NP-complete problems in general.
ReplyDelete
Replies
Greg GlocknerNovember 26, 2012 at 6:16 PM
We've reviewed the model. This isn't just random variability: it involves how cuts are generated. With Presolve=1, the model solves in under 2 seconds.
ReplyDelete
Replies
Erwin KalvelagenNovember 26, 2012 at 6:26 PM
Wow, speed up of 9100/1.21 or 7520:

Explored 261 nodes (44284 simplex iterations) in 1.21 seconds
Thread count was 4 (of 8 available processors)
ReplyDelete
Replies
Erwin KalvelagenDecember 17, 2012 at 4:41 PM
See also exploiting_erraticism_in_search.pdf on how we could use these variabilities to our advantage.
ReplyDelete
Replies

Add comment