top | item 9438673

Black box optimization competition

32 points | silentvoice | 11 years ago | bbcomp.ini.rub.de

25 comments


darkmighty|11 years ago

I really dislike the term "Black box optimization". There's no such thing. You have to make assumptions about your function, so in the end this is just rewarding people whose optimizers happen to match the chosen functions; but those functions are not made explicit whatsoever. That doesn't make any sense.

For example, if the inputs/outputs are floating point numbers, then you can assume the domain/range is [-M,M]. Otherwise, even with the most clever algorithm you have no guarantee of ever approaching the optimum, even if the function is continuous. And even with a limited range there are no guarantees if the function is not well behaved -- so you have to again assume the function is well behaved. For any assumption you make there is a function for which it is terrible. There is no best assumption, or best algorithm, then. You could, for instance, assume the function is adversarial (trying to make your life difficult), for which the best algorithm is perhaps just sampling the domain at random, which is really a terrible algorithm -- but that's of course just another assumption, and a terrible one.

I would much prefer 'Typical function optimization', if you're optimizing unlabeled functions so frequently, or at least not try to hide the inevitable assumptions.

TL;DR: The contest may be useful, but the concept of "Black box optimization" is nonsense.

rer0tsaz|11 years ago

The domain is the unit cube [0, 1]^d using double precision floating point, see the documentation.

Making assumptions and testing them is very much part of the contest. You are even allowed to do this interactively.

jpfr|11 years ago

Yes, there is such a thing.

There exist many more techniques than trivially assuming some "template" function and fitting the function parameters against the data.

Have a look at nonparametric modelling techniques, for example kernel regression or Gaussian processes. You either don't make any assumptions, or you take an uninformative prior that spreads mass over all possible results.

This competition involves modelling, optimisation and the exploration/exploitation tradeoff. I'm sure there will be very interesting theory behind the winning entries...
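For a taste of how little needs to be assumed: plain kernel regression (one of the nonparametric techniques above) predicts straight from the observed samples, with no parametric form for the underlying function. A minimal NumPy sketch -- the sine objective and the bandwidth are arbitrary stand-ins for illustration:

```python
import numpy as np

def kernel_regression(x_train, y_train, x_query, bandwidth=0.1):
    """Nadaraya-Watson estimator with a Gaussian kernel.

    Each prediction is a locally weighted average of observed y values;
    nothing is assumed about the functional form of the target.
    """
    # Pairwise squared distances between query and training points.
    d2 = (x_query[:, None] - x_train[None, :]) ** 2
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return (w @ y_train) / w.sum(axis=1)

# Noisy samples of an "unknown" function (here sin, purely for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=50)

xq = np.linspace(0.0, 1.0, 5)
print(kernel_regression(x, y, xq))
```

The bandwidth is the one knob: it trades variance (small h) against bias (large h), which is exactly the kind of assumption the parent comment says is unavoidable.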

silentvoice|11 years ago

Maybe a better term would be "blind" rather than black-box. I think the goal is simply to hold optimization to the same level of reproducibility that is expected of most scientific fields today, and if a researcher is allowed to introduce a hundred tunable parameters that make their algorithm converge on all the standard test cases, then they haven't created a reproducible optimizer - they have created a benchmark solver.

enkico|11 years ago

What is "typical" for one can be "rare" for another. "Black box" suggests that the participants do not know what is inside; the organizers, for their part, should make sure that the content is of some interest for real-world problems/applications.

What you are describing is related to the "no free lunch" theorem, something one can attempt to deal with to get things working "in practice".

murbard2|11 years ago

It's a little strange that they do not have a track that provides gradient information, given that it is often available in real-world problems. Also, this basically allows unlimited time between evaluations... So this becomes a contest about:

- coming up with a distribution over R^n -> R functions

- finding the optimal evaluation points to do a Bayesian update

I predict the winner will use a mixture of Gaussian processes with various kernels and stochastic control (with a limited lookahead, otherwise it blows up) to pick the test points.
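The greedy one-step version of that loop (GP surrogate, then an expected-improvement rule to pick the next evaluation point) fits in a screenful. This is a sketch, not anyone's contest entry: the 1-D objective, the RBF lengthscale and the budget are all made up:

```python
import numpy as np
from scipy.stats import norm

def rbf(a, b, length=0.2):
    # Squared-exponential (RBF) kernel on 1-D inputs.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-6):
    # Textbook GP regression with a zero-mean prior.
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_cand, x_obs)
    alpha = np.linalg.solve(K, y_obs)
    mu = Ks @ alpha
    v = np.linalg.solve(K, Ks.T)
    var = 1.0 - np.sum(Ks * v.T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, y_best):
    # Expected amount by which each candidate beats the incumbent (minimization).
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# Hypothetical black-box objective on [0, 1]; the optimizer sees only outputs.
f = lambda x: (x - 0.3) ** 2

rng = np.random.default_rng(1)
x_obs = rng.uniform(0.0, 1.0, 3)
y_obs = f(x_obs)
cand = np.linspace(0.0, 1.0, 201)

for _ in range(10):  # tiny evaluation budget
    mu, sigma = gp_posterior(x_obs, y_obs, cand)
    x_next = cand[np.argmax(expected_improvement(mu, sigma, y_obs.min()))]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, f(x_next))

print(x_obs[np.argmin(y_obs)])  # best point sampled so far
```

Replacing the one-step expected-improvement rule with multi-step lookahead is where the "stochastic control" part comes in, and also where the cost blows up.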

pilooch|11 years ago

You can compute the gradient, it just has a high budget cost.

The usual winner is a flavor of CMA-ES, though they may have picked the functions to avoid this.
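For anyone unfamiliar with the family: the general shape of such evolution strategies is sample-select-update. The toy below is deliberately stripped down and is not actual CMA-ES (no covariance adaptation; the step-size rule and test function are made up) -- use a real library like libcmaes for anything serious:

```python
import numpy as np

def simple_es(f, dim, budget=2000, seed=0):
    """Toy (mu/mu, lambda) evolution strategy: sample a population
    around the current mean, keep the best, move the mean, adapt the
    step size. Real CMA-ES additionally adapts a full covariance
    matrix, which is what makes it work on ill-conditioned problems."""
    rng = np.random.default_rng(seed)
    mean = np.full(dim, 0.5)   # start at the centre of [0, 1]^d
    sigma = 0.3
    lam, mu = 12, 6            # offspring / parents per generation
    evals = 0
    while evals + lam <= budget:
        pop = mean + sigma * rng.normal(size=(lam, dim))
        pop = np.clip(pop, 0.0, 1.0)   # contest domain is the unit cube
        fit = np.array([f(x) for x in pop])
        evals += lam
        elite = pop[np.argsort(fit)[:mu]]
        new_mean = elite.mean(axis=0)
        # Crude step-size control: shrink as the mean stops moving.
        sigma = max(0.7 * sigma + 0.3 * np.linalg.norm(new_mean - mean), 1e-4)
        mean = new_mean
    return mean, f(mean)

# Hypothetical smooth objective with its optimum at (0.3, 0.7).
target = np.array([0.3, 0.7])
x, fx = simple_es(lambda v: np.sum((v - target) ** 2), dim=2)
print(x, fx)
```

Note the interface matches the contest's constraints: the objective is only ever queried point-wise, and the evaluation budget is counted explicitly.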

obstinate|11 years ago

Seems really interesting. Too mathy for my skillset.

If I may, I propose that the organizers remove the restriction on disassembling the client library or intercepting network connections. This restriction seems like it cannot benefit the organizers, unless the protocol is insecure. People are going to ignore this rule anyway, and you can't stop them or even detect them doing it. So why put it in there? It's only going to generate ill will.

rer0tsaz|11 years ago

It's probably insecure, because you don't want to do 75,850,000 sequential evaluations over a network. It would take over a week for a single track even with just a 10 ms response time.
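The back-of-the-envelope math, for anyone checking -- the key point being that the evaluations are sequential (each proposal depends on the previous result), so round-trip latency can't be hidden by parallelism:

```python
# Latency-bound lower bound on wall-clock time for sequential evaluations.
evals = 75_850_000          # figure quoted above
rtt_seconds = 0.010         # assumed 10 ms per round trip
days = evals * rtt_seconds / 86_400
print(f"{days:.1f} days")   # prints "8.8 days", i.e. "over a week"
```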

darklajid|11 years ago

Whoa. The servers for this competition are about 8km away. That's the most 'local' content I've ever seen on HN.

Unfortunately I have to agree with obstinate here. The pure math is too much for me and reverse engineering (still daunting, but interesting/possible) is not acceptable. If any HN person wins this contest, I offer beers close to the black box :)

pilooch|11 years ago

Shameless plug: anyone interested should be able to get baseline results and above easily by using libcmaes. I am one of the authors with no time to compete, but am interested in reports on how it goes. Also if you are a researcher or a student the lib should let you experiment easily with various custom strategies.

https://github.com/beniz/libcmaes

cshimmin|11 years ago

Wish I had seen something about this sooner. The competition began in January and ends on the 30th of this month.

nullc|11 years ago

That was my thought too! But instead of even clicking on the HN comments I went and wrote an entry. Within a couple hours all my runs will have completed; assuming no big power failure I'll make the deadline! :)

(Uh, no doubt I won't do well, since I had no time to ... like.. actually test my code on any functions except a couple trivial trials. :P ... I hope they put up some kind of ranking information as soon as it closes. I have no idea if my results are awful or merely bad :) (and I probably shouldn't share best numbers before it closes) )

enkico|11 years ago

There are two tracks; the second has its deadline at the end of June.

ramgorur|11 years ago

1. You do not know what the function looks like, and there is no gradient information.

2. You have a fixed number of probes M.

3. Out of M, you spend N probes to get the silhouette of the function (exploration).

4. Then with the remaining M - N probes, you need to find the optima (exploitation).

Sounds more like a pseudo-science than a math problem to me.
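Taken literally, the steps above are easy to sketch, math or not. Everything here is made up for illustration (the split point N, the step-size schedule, the test function); it's a naive baseline, not a serious optimizer:

```python
import numpy as np

def explore_then_exploit(f, dim, M=200, N=60, seed=0):
    """Spend N probes sampling the domain at random (exploration),
    then spend the remaining M - N probes on local search around the
    best point found so far (exploitation)."""
    rng = np.random.default_rng(seed)
    # Exploration: N random probes over the unit cube.
    X = rng.uniform(0.0, 1.0, size=(N, dim))
    y = np.array([f(x) for x in X])
    best_x, best_y = X[np.argmin(y)], y.min()
    # Exploitation: shrinking Gaussian steps around the incumbent.
    step = 0.1
    for _ in range(M - N):
        cand = np.clip(best_x + step * rng.normal(size=dim), 0.0, 1.0)
        fc = f(cand)
        if fc < best_y:
            best_x, best_y = cand, fc
        else:
            step *= 0.98   # tighten the search when a probe fails
    return best_x, best_y

# Hypothetical smooth objective, minimum at 0.42 in each coordinate.
x, fx = explore_then_exploit(lambda v: np.sum((v - 0.42) ** 2), dim=3)
print(x, fx)
```

How to choose N, and whether to interleave the two phases instead of running them back to back, is exactly the exploration/exploitation tradeoff the contest is about.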

Houshalter|11 years ago

Huh? Who said it was a math problem? And pseudoscience? Most real-world optimization problems are like this. Sometimes you don't get gradient information or unlimited trials.

The point of the task is to reward methods that work efficiently with limited trials and domain information, rather than who can run hillclimbing on the biggest computer or hand tune the parameters the best.