some questions about the code #1

meet-cjli · 2019-04-20T12:24:29Z

I recently read this paper and its code, and I have some questions about them. First, I can't find details about the TYPE_CONJUGATE branch, especially the early termination criterion. Moreover, there is no function called eval_z. Finally, I don't understand the initial values of alpha and beta. Why were these chosen as the initial values? Also, in the TYPE_CONJUGATE branch, you added 0.5 to the original value.
Also, in

    alpha = torch.log(X.new_ones(*size) / m)
    beta = torch.log(X.new_ones(*size) / m)
    exp_alpha = torch.exp(-alpha)
    exp_beta = torch.exp(-beta)

there is a negative sign. However, in Algorithm 2 it is a positive sign. I tried changing it to a positive sign, and it still works. Does that mean the initial values have little impact on the results?

riceric22 · 2019-04-22T07:46:38Z

Hey @Sudsakorn123! The following should hopefully answer all of your questions.

The TYPE_CONJUGATE branch is an implementation of Equation 14 in Section 5.3 of the paper, on provable defenses for Wasserstein balls.
The early termination criterion is not algorithmically special: it is an optimization that checks whether the objective can change at all, that is, whether it is possible for the objective to change within the ball under the local transport restriction. If there is no mass to move in areas with non-zero objective, the objective is the same throughout the ball, so there is no need to run the Sinkhorn iteration and we can terminate early.
eval_z seems to have been accidentally removed while I was cleaning up the code. I will add it back shortly, thanks!
The initial values for alpha and beta mirror the initial values used in the original Sinkhorn algorithm from Cuturi 2013 (Algorithm 1). Beyond that, there is no other justification.
In the Sinkhorn iteration, there is a factor of exp(-1). In the algorithm box of our paper and in the 2NORM branch, we folded this into the K variable; in other texts (e.g. Cuturi 2013) it is common to distribute this exp(-1) factor into the alpha and beta variables, with exp(-1/2) each. You rightly point out that the code is confusing, since the 2NORM branch uses the version in the paper whereas the conjugate code uses the latter form, but the two are equivalent.
The initializations of alpha and beta in the conjugate branch, as you pointed out, are indeed inconsistent with Algorithm 2, and the negative sign is a mistake. However, it still works, as you observed, because these are only initial values: the projected Sinkhorn iteration solves a strictly convex problem, and it can be shown that the coordinate descent procedure is guaranteed to converge to the optimal value. It would of course still be better for me to correct this initialization in the code.
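
To make the last two points concrete, here is a minimal sketch on a toy problem. It assumes the standard (unprojected) Sinkhorn scaling iteration rather than the projected Sinkhorn code in this repository, and every name in it (sinkhorn, max_diff, lam, the toy marginals) is hypothetical. It is only meant to suggest why a different initialization, or a constant factor such as exp(-1) moved in or out of K, should leave the resulting transport plan unchanged.

```python
import math

def sinkhorn(K, r, c, u0, n_iters=200):
    """Standard Sinkhorn scaling (NOT the projected variant): find u, v so that
    P = diag(u) K diag(v) has row sums r and column sums c."""
    n, m = len(r), len(c)
    u = list(u0)
    for _ in range(n_iters):
        # Column update: match the column marginals given the current u.
        v = [c[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        # Row update: match the row marginals given the new v.
        u = [r[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def max_diff(A, B):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Toy problem (assumed for illustration): 3 bins on a line, squared-distance cost.
lam = 0.5
C = [[(i - j) ** 2 for j in range(3)] for i in range(3)]
K = [[math.exp(-lam * C[i][j]) for j in range(3)] for i in range(3)]
r = [0.5, 0.3, 0.2]  # source marginal
c = [0.2, 0.3, 0.5]  # target marginal

# Same problem, two different initial scalings.
P_uniform = sinkhorn(K, r, c, u0=[1 / 3] * 3)
P_other = sinkhorn(K, r, c, u0=[3.0, 0.1, 7.0])

# Same problem with a constant exp(-1) factor folded into K: the scaling
# vectors absorb the constant, so the plan itself is unchanged.
K_scaled = [[math.exp(-1.0) * k for k in row] for row in K]
P_scaled = sinkhorn(K_scaled, r, c, u0=[1 / 3] * 3)

print(max_diff(P_uniform, P_other))
print(max_diff(P_uniform, P_scaled))
```

Both printed differences should be at the level of floating-point noise. The sketch says nothing about how quickly the iteration converges from a poor initialization, only that the limit is the same.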

I will leave this issue open until I've added back in the eval_z function, fixed the initialization for the conjugate branch, and made the location of the exp(-1) factor consistent in both branches.

ptpam · 2019-07-12T00:40:09Z

I want to ask something related to runtime. Is it the case that one step in an epoch takes around 1m30s for you as well?

riceric22 · 2019-07-18T18:52:36Z

Hey @ptpam, with a batch size of 128, one step of adversarial training takes about 30-40 seconds (that is, running PGD for at most 50 iterations on a single minibatch with the default parameters in adv_training_cifar.py on a single 2080 Ti).
