
element indexing for batchfillerimage2d does not follow C-memory convention #29

Open
twongjirad opened this issue Oct 1, 2018 · 1 comment


@twongjirad (Contributor) commented Oct 1, 2018

In batchfillerimage2d,

the array is defined with elements (caffe mode):

(batchsize,channel,nrows,ncols)

This implies that ncols is the "contiguous" dimension. For example, if I define a (4,3) 2D array in C++:

#include <iostream>

int main() {

  float foo[4][3] = { {0, 1, 2}, {3, 4, 5}, {6, 7, 8}, {9, 10, 11} };
  int nrows = 4;
  int ncols = 3;

  std::cout << "foo[2][1]=" << foo[2][1] << std::endl;
  // C (row-major) indexing: row * ncols + col recovers foo[2][1].
  std::cout << "foo[ncols*2 + 1]=" << *(&foo[0][0] + ncols * 2 + 1) << std::endl;
  // Swapping the roles of nrows/ncols lands on a different element.
  std::cout << "foo[nrows*1 + 2]=" << *(&foo[0][0] + nrows * 1 + 2) << std::endl;

  return 0;
}

The output of the above is:

twongjirad@blade:~/working/nutufts/larflow/ana$ g++ -o foo foo.cxx
twongjirad@blade:~/working/nutufts/larflow/ana$ ./foo 
foo[2][1]=7
foo[ncols*2 + 1]=7
foo[nrows*1 + 2]=6

However, in batchfillerimage2d, the tensor is defined with the following dims:

      if (_caffe_mode) {
        dim[0] = batch_size();
        dim[1] = _num_channels;
        dim[2] = _rows;
        dim[3] = _cols;
      }

but the way the image is indexed is

    size_t caffe_idx = 0;
    for (size_t row = 0; row < _rows; ++row) {
      for (size_t col = 0; col < _cols; ++col) {
        _caffe_idx_to_img_idx.at(caffe_idx) = col * _rows + row;
        ++caffe_idx;
      }
    }

Note that in "C" memory layout, the expectation would be

row * _cols + col

since the right-most dimension should be the most rapidly changing one.

This might be a historical artifact of caffe, but it can cause (and is causing) confusion when laying out tensors properly in training and analysis routines for pytorch and tensorflow, where we routinely reshape and manipulate the tensor.

What to do? I would like to change this, but changing it is a BREAKING change for everyone.
Are we stuck with this?

@drinkingkazu (Contributor) commented

My question:
I'm on vacation today/tomorrow so my response is slow :( I have a first question though: do you have to use caffe mode (BCHW)?

My opinion:
Now assuming the answer is yes... I don't think this mode is used, so I would be 100% for changing it, but it should obviously keep supporting (BCHW) mode. I have to look closely to give useful input, so that has to come a bit later. I am not sure I understand the "C" memory layout argument: this is 1D array filling, and the right ordering is determined solely by comparing how numpy/caffe/whatever and image2d interpret a 1D array as a 2D array. Or I am clearly missing something.

Not useful comments:
The code was migrated from larcv to larcv2, and I did test it right after the migration, but it has not been tested since. We may have changed the row vs. col definition since then (i.e. I recall this convention is never consistent between numpy, opencv, and the other 2D visualization libraries I know), and maybe that change was not propagated to this code...
