Patatrack integration - ECAL local reconstruction (7/N) #31719

Merged

Commits on Dec 29, 2020

  1. Implement a Heterogeneous version of Raw2Cluster and RecHit (#62)

      - reorganize `SiPixelRawToDigi` as `SiPixelRawToDigiHeterogeneous` using `HeterogeneousEDProducer`
          - output a `HeterogeneousEvent`
          - use `PixelThresholdClusterizer`
          - add `SiPixelDigiHeterogeneousConverter`
          - make cabling and gain transfers asynchronous
      - reorganize `SiPixelRecHits` as `SiPixelRecHitHeterogeneous`
      - move `PixelThresholdClusterizer` (back?) to interface+src in order to use it outside of RecoLocalTracker/SiPixelClusterizer
      - replace `__host__ __device__` with `constexpr` to avoid weird compilation failures
      - split clusters to their own converter
    makortel authored and fwyzard committed Dec 29, 2020
    6de171e
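
The `constexpr` item in the list above can be illustrated with a small sketch. It assumes the CUDA build passes `--expt-relaxed-constexpr` to nvcc, which lets device code call `constexpr` functions that carry no `__host__ __device__` annotation; whether this is the exact mechanism relied on in this commit is not stated in the message. The `sketch` namespace and the `pack`, `rowOf` and `colOf` helpers are hypothetical and are not taken from the pixel code.

```cpp
#include <cstdint>

namespace sketch {
  // Hypothetical helpers: pack a (row, column) pair into one 32-bit word.
  // No __host__ __device__ qualifiers; the functions are plain constexpr.
  constexpr std::uint32_t pack(std::uint32_t r, std::uint32_t c) { return (r << 16) | (c & 0xFFFFu); }
  constexpr std::uint32_t rowOf(std::uint32_t packed) { return packed >> 16; }
  constexpr std::uint32_t colOf(std::uint32_t packed) { return packed & 0xFFFFu; }

  // Evaluated at compile time, so a plain host compiler checks it as well.
  static_assert(rowOf(pack(3, 5)) == 3 && colOf(pack(3, 5)) == 5, "round trip");
}  // namespace sketch
```

Because the helpers are ordinary `constexpr` functions, the same header can be included from host-only translation units without any CUDA-specific macros.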
  2. Prototype for EventSetup data on GPUs (#77)

    Adds a prototype for dealing with EventSetup data on GPUs. The prototype is applied to the ES data used by Raw2Cluster (cabling map etc, gains) and RecHits (CPE).
    
    Now it is the `ESProduct` that owns the GPU memory. Currently each of the affected `ESProducts` has a method `getGPUProductAsync(cuda::stream_t<>&)` that allocates the memory on the current GPU device and transfers the data there asynchronously, if the data is not there yet. The bookkeeping of which devices already have the data, and the necessary synchronization between multiple threads (only one thread may do the transfer per device), are abstracted to a helper template in `HeterogeneousCore/CUDACore/interface/CUDAESProduct.h`.
    
    Technical changes:
      - `EventSetup`-based implementation for GPU cabling map, gains, etc
      - add support for multiple devices to `PixelCPEFast`
      - abstract the `EventSetup` GPU transfer
      - move `malloc` and transfer to the lambda
      - move `cudaFree` outside of the `nullptr` check
      - move files (back) to the plugins directory
      - rename `siPixelDigisHeterogeneous` to `siPixelClustersHeterogeneous`
    makortel authored and fwyzard committed Dec 29, 2020
    b7e339b
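
The bookkeeping described in this commit (one copy per device, with only one thread performing the transfer for a given device) can be sketched on the host as follows. This is a simplified illustration and not the content of `CUDAESProduct.h`: the names `PerDeviceProduct`, `dataForDevice` and `DevicePayload` are hypothetical, a `std::vector` stands in for device memory, and the sketch holds a lock during the copy, whereas the transfer described in the commit message is asynchronous.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the per-device copy of some EventSetup payload.
struct DevicePayload {
  std::vector<float> data;  // stands in for device memory
};

// Hypothetical helper: one slot per device, each filled at most once.
template <typename T>
class PerDeviceProduct {
public:
  explicit PerDeviceProduct(std::size_t devices) : slots_(devices) {}

  // Returns the payload for `device`; `transfer` runs only on the first
  // request for that device, and concurrent callers wait on the slot mutex.
  template <typename F>
  const T& dataForDevice(std::size_t device, F&& transfer) {
    assert(device < slots_.size());
    Slot& slot = slots_[device];
    std::lock_guard<std::mutex> lock(slot.mutex);
    if (!slot.filled) {
      transfer(slot.value);
      slot.filled = true;
    }
    return slot.value;
  }

private:
  struct Slot {
    std::mutex mutex;
    bool filled = false;
    T value;
  };
  std::vector<Slot> slots_;
};
```

Each device has its own mutex, so a transfer to one device does not block threads that need the data on another device.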
  3. Synchronise with CMSSW_10_2_0

    fwyzard committed Dec 29, 2020
    9af927e
  4. a4c526b
  5. fcf8d67
  6. Next prototype of the framework integration (#100)

    Provide a mechanism for a chain of modules to share a resource, that can be e.g. CUDA device memory or a CUDA stream.
    Minimize data movements between the CPU and the device, and support multiple devices.
    Allow the same job configuration to be used on all hardware combinations.
    
    See HeterogeneousCore/CUDACore/README.md for a more detailed description and examples.
    makortel authored and fwyzard committed Dec 29, 2020
    fa62e7d
  7. Various updates to pixel track/vertex DQM and MTV (#285)

    * Add DQM for pixel vertices
    
    * Add pT>0.9GeV pixel track collections to MTV
    
    * Add dzPV0p1, Pt0to1, Pt1 variants of pixel track DQM
    makortel authored and fwyzard committed Dec 29, 2020
    795efee
  8. Reimplement the ECAL multifit with CUDA, for an HLT-like configuration (#335)

    Reimplementation of the original CPU-based ECAL multifit to run on NVIDIA GPUs with CUDA,
    for an HLT-like configuration, where regression matrices are fixed at 10x10 (no dynamic pedestal, etc.).
    
    The main computation type is float by default, configurable at compile time, and the minimization (Cholesky + solvers, etc.) is implemented using the Eigen library.
    The timing computation is implemented, but is not run by default at HLT.
    
    The implementation:
      - uses one CUDA stream per EDM stream;
      - the EventSetup conditions are updated on the GPU only when they change, via the CUDAESProduct mechanism;
      - only the per-event data is transferred for each event;
      - the results are optionally copied back to the host and synchronised by the produce() method.
    
    A simple tool is available to validate cpu vs gpu results.
    
    Known issues and to do list:
      - add a module to convert from new format to the legacy format;
      - make use of the CUDAService framework for the device selection, stream handling and memory allocation;
      - investigate some instabilities in the Cholesky decomposition vs original cpu version, specifically [for fnnls](https://github.com/cms-patatrack/cmssw/pull/335/files#diff-ed446c49128ac6dc6f45eeebab079613R70) causing rare, but noticeable discrepancies between cpu and gpu versions.
    vkhristenko authored and fwyzard committed Dec 29, 2020
    7006648
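
A compile-time switch for the computation type, as mentioned in this commit, could look like the following minimal sketch. The macro name `ECAL_MULTIFIT_USE_DOUBLE`, the `sketch` namespace and the `SampleVector` type are assumptions made for illustration; only the `float` default and the size 10 come from the commit message.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {
#ifdef ECAL_MULTIFIT_USE_DOUBLE  // hypothetical macro, e.g. -DECAL_MULTIFIT_USE_DOUBLE
  using data_type = double;
#else
  using data_type = float;  // default
#endif

  constexpr std::size_t kSize = 10;  // fixed dimension, as in the 10x10 matrices

  struct SampleVector {
    data_type values[kSize];
  };

  // Sum of the first n entries, accumulated in the selected precision.
  inline data_type sum(const SampleVector& v, std::size_t n = kSize) {
    assert(n <= kSize);
    data_type total = 0;
    for (std::size_t i = 0; i < n; ++i)
      total += v.values[i];
    return total;
  }
}  // namespace sketch
```

Routing every computation through one alias keeps the precision choice in a single place, so rebuilding with the macro defined is enough to compare `float` and `double` results.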
  9. Extend the ECAL validation on GPU (#359)

    Extend the ECAL validation tool:
      - skip events with a mismatched number of hits, instead of aborting;
      - add 2D plots for energy and chi2 differences;
      - save all plots also in PDF format.
    
    Move it to the .../bin directory and rename it to makeEcalGpuValidationPlots.
    fwyzard committed Dec 29, 2020
    b803812
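
The "skip instead of abort" behaviour of the validation tool can be sketched as below. This is not the code of `makeEcalGpuValidationPlots`: the `EventHits` and `Summary` types and the `compareEvents` function are hypothetical, and the comparison is reduced to a single energy difference per hit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-event input: hit energies from the CPU and GPU reconstructions.
struct EventHits {
  std::vector<float> cpu;
  std::vector<float> gpu;
};

struct Summary {
  std::size_t compared = 0;  // events with matching hit counts
  std::size_t skipped = 0;   // events skipped because the counts differ
  float maxAbsDiff = 0.f;    // largest |cpu - gpu| energy difference seen
};

inline Summary compareEvents(const std::vector<EventHits>& events) {
  Summary s;
  for (const auto& ev : events) {
    if (ev.cpu.size() != ev.gpu.size()) {
      ++s.skipped;  // skip this event and keep going, rather than aborting
      continue;
    }
    ++s.compared;
    for (std::size_t i = 0; i < ev.cpu.size(); ++i)
      s.maxAbsDiff = std::max(s.maxAbsDiff, std::fabs(ev.cpu[i] - ev.gpu[i]));
  }
  assert(s.compared + s.skipped == events.size());
  return s;
}
```

Counting the skipped events keeps the mismatches visible in the summary even though they no longer stop the job.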
  10. 6e015d3
  11. 4d43970
  12. 249aa64
  13. Remove unused class_def rules

    fwyzard committed Dec 29, 2020
    8f7fc7a
  14. Fix clang warnings (#387)

    makortel authored and fwyzard committed Dec 29, 2020
    419aa82
  15. Replace use of API wrapper stream and event with plain CUDA, part 1 (#389)
    
    Replace cuda::stream_t<> with cudaStream_t in client code
    Replace cuda::event_t with cudaEvent_t in the client code
    Clean up BuildFiles
    makortel authored and fwyzard committed Dec 29, 2020
    93dd141
  16. Implement changes from the CUDA framework review (#429)

    Rename the cudautils namespace to cms::cuda or cms::cudatest, and drop the CUDA prefix from the symbols defined there.
    
    Always record and query the CUDA event, to minimize need for error checking in CUDAScopedContextProduce destructor.
    
    Add comments to highlight the pieces in CachingDeviceAllocator that have been changed wrt. cub.
    
    Various other updates and clean up:
      - enable CUDA for compute capability 3.5.
      - clean up CUDAService, CUDA tests and plugins.
      - add CUDA existence protections to BuildFiles.
      - mark thread-safe static variables with CMS_THREAD_SAFE.
    makortel authored and fwyzard committed Dec 29, 2020
    7cd3e8e
  17. b00d983
  18. Implement the ECAL unpacker on GPUs (#443)

    Implement the ECAL unpacker running on GPU from RAW data, and update the downstream modules to use the new data format.
    vkhristenko authored and fwyzard committed Dec 29, 2020
    6be25f4
  19. Clean up ECAL unpacker code (#444)

    Fix compilation warnings, remove commented out code, and apply code formatting rules.
    fwyzard committed Dec 29, 2020
    823d7fe
  20. ed9326c
  21. 92f357b
  22. b426a89
  23. Update the ECAL local reconstruction Tasks and Sequences

    Move the `ecalDigis` and `ecalMultiFitUncalibRecHit` modules to
    separate Tasks.
    
    Implement an `ecalOnlyLocalRecoTask` based on the `ecalLocalRecoTask`,
    without the trigger primitive-related modules.
    fwyzard committed Dec 29, 2020
    2c85634
  24. Implement the ECAL-only gpu workflows (#450)

    Let EcalCPUDigisProducer optionally produce dummy ECAL integrity collections.
    
    Implement the use of the "gpu" process modifier for ECAL-only reconstruction
    workflows:
      - 10824.512: TTbar, 2018 realistic conditions, no pileup;
      - 11634.512: TTbar, 2021 realistic conditions, no pileup.
    fwyzard committed Dec 29, 2020
    e4df1da
  25. a48be4c
  26. 6299135
  27. 7f23bbf
  28. da82b8d
  29. 33690a2
  30. Add customisations for profiling the ECAL-only workflow (#472)

    customizeEcalOnlyForProfilingGPUOnly:
      Customise the ECAL-only reconstruction to run on GPU.
      Currently, this means running only the unpacker and multifit,
      up to the uncalibrated rechits.
    
    customizeEcalOnlyForProfilingGPUWithHostCopy:
      Customise the ECAL-only reconstruction to run on GPU, and copy
      the data to the host.
      Currently, this means running only the unpacker and multifit,
      up to the uncalibrated rechits.
    
    customizeEcalOnlyForProfiling:
      Customise the ECAL-only reconstruction to run on GPU, copy the
      data to the host, and convert them to legacy format.
      Currently, this means running only the unpacker and multifit,
      up to the uncalibrated rechits, on the GPU - and the rechits
      producer on the CPU.
      The same customisation can be also used on the CPU workflow,
      running up to the rechits on CPU.
    fwyzard committed Dec 29, 2020
    bd26eba
  31. Implement ECAL rechits on GPU (#462)

    Include standalone executables for the validation of ECAL uncalibrated
    and calibrated rechits.
    amassiro authored and fwyzard committed Dec 29, 2020
    3360c67
  32. b370e33
  33. Fixes for ECAL rechits on GPU (#475)

    EcalLinearCorrectionsGPU:
      - add missing cudaFree()
    
    EcalRecHitBuilderKernels:
      - fix warnings due to unused variables
      - fix incorrectly skipping some channels due to the use of "return" in the kernel loop
      - add default values for dead/invalid channels
    amassiro authored and fwyzard committed Dec 29, 2020
    7833581
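
The `return` problem fixed in `EcalRecHitBuilderKernels` can be illustrated with a host-side emulation. The sketch assumes a strided loop in which each thread handles several channels; that loop shape, the function names and the constants `kNotWritten` and `kDefault` are assumptions for illustration and are not copied from the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr float kNotWritten = -1.f;  // marks outputs that no thread touched
constexpr float kDefault = 0.f;      // hypothetical default for dead/invalid channels

// Work done by one emulated thread: it owns channels thread, thread + nthreads, ...
inline void processChannels(std::size_t thread, std::size_t nthreads, bool useReturn,
                            const std::vector<bool>& good, std::vector<float>& energy) {
  assert(nthreads > 0 && good.size() == energy.size());
  for (std::size_t ch = thread; ch < good.size(); ch += nthreads) {
    if (!good[ch]) {
      if (useReturn)
        return;  // old behaviour: leaves the loop, later channels of this thread are lost
      energy[ch] = kDefault;
      continue;  // fixed behaviour: only this channel is skipped
    }
    energy[ch] = 1.f;  // stands in for the real rechit computation
  }
}

// Runs all emulated threads one after the other and returns the output array.
inline std::vector<float> runAllThreads(std::size_t nthreads, bool useReturn,
                                        const std::vector<bool>& good) {
  std::vector<float> energy(good.size(), kNotWritten);
  for (std::size_t t = 0; t < nthreads; ++t)
    processChannels(t, nthreads, useReturn, good, energy);
  return energy;
}
```

With two emulated threads and a bad channel at index 1, the `return` variant leaves channels 1, 3 and 5 unwritten, while the `continue` variant writes a default to channel 1 and processes channels 3 and 5 normally.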
  34. Enable ECAL rechits reconstruction on GPUs at HLT (#477)

    Extend the ECAL HLT customisation to run the ECAL rechit producer on GPU.
    
    Add an edm::LogError in case of too many channels for the rechits and uncalibrated rechits producers.
    amassiro authored and fwyzard committed Dec 29, 2020
    a63a202
  35. Restructure code to work around CUDA build limitations (#483)

    Move ECAL and HCAL CUDA code to plugins.
    General cleanup: remove unused code, apply clang-format and various include changes.
    Fix product labels for HCAL rechits on CPU.
    
    Co-authored-by: Andrea Bocci
    vkhristenko and fwyzard committed Dec 29, 2020
    e481300
  36. Apply code formatting (#486)

    makortel authored and fwyzard committed Dec 29, 2020
    244cfa0
  37. Comment unused variable eps_diff (#490)

    Comment instead of removing because there is commented code using eps_diff
    makortel authored and fwyzard committed Dec 29, 2020
    228625e
  38. Remove duplicate dictionary definitions (#489)

    Remove dictionary definitions for classes already defined in CUDADataFormats/StdDictionaries.
    
    Co-authored-by: Andrea Bocci
    makortel and fwyzard committed Dec 29, 2020
    167fd46
  39. Backport: add ECAL-only and HCAL-only workflows for MC and data (cms-sw#30350)
    
    Backport cms-sw#30105: add ECAL-only workflows for data.
    Backport cms-sw#30136: add HCAL-only workflows for MC and data.
    mariadalfonso authored and fwyzard committed Dec 29, 2020
    3d9b272
  40. Update ECAL and HCAL reconstruction to run on multiple GPUs [1/3] (#502)

    Use caching allocators for host and device CUDA memory.
    Use dedicated ESProducers to make part of the modules' configuration available on all GPUs.
    Rename hcal and hcal::common namespaces to calo::common.
    vkhristenko authored and fwyzard committed Dec 29, 2020
    765a48f
  41. Update ECAL and HCAL reconstruction to run on multiple GPUs [2/3] (#508)

    Add missing ESProducers for ECAL and HCAL GPU modules: add to the
    offline workflows and to the HLT customisations the ESProducers required
    to complement the configuration of the ECAL and HCAL GPU modules.
    fwyzard committed Dec 29, 2020
    3079f23
  42. Update ECAL and HCAL reconstruction to run on multiple GPUs [3/3] (#504)

    Fix fillDescriptions() for EcalRecHitParametersGPUESProducer.
    amassiro authored and fwyzard committed Dec 29, 2020
    2a384fa
  43. Apply code formatting

    fwyzard committed Dec 29, 2020
    3bbc28f
  44. 8a663d1
  45. Configure the number of ECAL barrel and endcap channels separately (#517)
    
    Fix memory allocation issues.
    
    Apply some code clean up:
      - remove outdated comments;
      - replace MYMALLOC macro with a lambda;
      - reuse named values from EcalDataFrame.
    amassiro authored and fwyzard committed Dec 29, 2020
    f384542
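
Replacing an allocation macro with a lambda, with separate barrel and endcap sizes, might look like the sketch below. It is a host-only illustration: `std::malloc` stands in for whatever allocation call the original macro wrapped, the macro shown in the comment is a guess at its general shape, and the `Buffers`, `allocate` and `release` names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// For contrast, a macro of this kind might have looked like (hypothetical):
//   #define MYMALLOC(ptr, type, n) ptr = (type*)std::malloc((n) * sizeof(type))

struct Buffers {
  float* ebAmplitudes = nullptr;  // barrel
  float* eeAmplitudes = nullptr;  // endcap
  int* ebFlags = nullptr;
  int* eeFlags = nullptr;
};

inline Buffers allocate(std::size_t nBarrel, std::size_t nEndcap) {
  // Generic lambda: the element type is deduced from the pointer itself,
  // so the cast and the sizeof can never disagree with the pointer type.
  auto alloc = [](auto*& ptr, std::size_t n) {
    using T = std::remove_reference_t<decltype(*ptr)>;
    assert(n > 0);
    ptr = static_cast<T*>(std::malloc(n * sizeof(T)));
    assert(ptr != nullptr);
  };
  Buffers b;
  alloc(b.ebAmplitudes, nBarrel);
  alloc(b.eeAmplitudes, nEndcap);
  alloc(b.ebFlags, nBarrel);
  alloc(b.eeFlags, nEndcap);
  return b;
}

inline void release(Buffers& b) {
  std::free(b.ebAmplitudes);
  std::free(b.eeAmplitudes);
  std::free(b.ebFlags);
  std::free(b.eeFlags);
  b = Buffers{};  // reset all pointers to nullptr
}
```

Unlike the macro, the lambda is scoped to the function that uses it and is type-checked by the compiler.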
  46. Apply code formatting (#524)

    VinInn authored and fwyzard committed Dec 29, 2020
    0962cd4
  47. Refactor common ECAL and HCAL code (#523)

    Move duplicated Eigen code to a common file, and use it for both ECAL and HCAL.
    Move HCAL general reconstruction code from the hcal::multifit to the hcal::reconstruction namespace.
    mariadalfonso authored and fwyzard committed Dec 29, 2020
    c15e453
  48. 2567694
  49. Further clean up after merging CMSSW_11_2_0_pre7 (#556)

    Minor bug fixes:
      - fix a typo in EventFilter/EcalRawToDigi/plugins/BuildFile.xml .
    
    Clean up:
      - remove obsolete ArrayShadow class;
      - remove obsolete profiling functions.
    fwyzard committed Dec 29, 2020
    20731ea
  50. Move multifit/MAHI common code to DataFormats/CaloRecHit (#557)

    Move multifit/MAHI common code to DataFormats/CaloRecHit/interface/MultifitComputations.h .
    Improve naming and description of fnnls parameters.
    Use Eigen preprocessor symbols instead of explicit CUDA keywords, and CUDA preprocessor symbols to protect CUDA-only functions.
    
    Co-authored-by: Andrea Bocci
    mariadalfonso and fwyzard committed Dec 29, 2020
    240d1d8
  51. Address ECAL review comments regarding preprocessor directives (#558)

    Adjust include guards.
    Adjust comments on preprocessor macros.
    fwyzard committed Dec 29, 2020
    3bef19e
  52. Reduce code duplication in CPU and GPU modules (#566)

    Move HCAL constants to a separate file, and update CPU and GPU code accordingly.
    
    Add an infinite-IOV record for GPU modules configuration: as an interim approach,
    some modules are using the EventSetup approach to copy complex configurations to
    the GPUs; the new "JobConfigurationGPURecord" should be used for those, to make
    it easier both to highlight the intent, and clean up the client code when a
    better solution is found.
    
    Replace ECAL and HCAL job configuration records with JobConfigurationGPURecord.
    fwyzard committed Dec 29, 2020
    9e3750f
  53. Address ECAL review comments regarding python files (#563)

    Clean up ecalRawDecodingAndMultifit.py .
    Rename ecalMultiFitUncalibRecHitGPUcfi.py to ecalMultiFitUncalibRecHitGPU_cfi.py .
    fwyzard committed Dec 29, 2020
    f5ad690
  54. Move common ESProducer templates to ConvertingESProducer(WithDependencies)T (#569)

    Move a few similar implementations of templated ESProducers
      - EcalESProducerGPU
      - EcalRawESProducerGPU
      - HcalESProducerGPU
      - HcalESProducerGPUWithDependencies
      - HcalRawESProducerGPU
    to a common implementation under HeterogeneousCore/CUDACore/ .
    
    Adapt all client code accordingly.
    
    Do not use transient handles to avoid ESProducers taking references to
    transient memory objects.
    fwyzard committed Dec 29, 2020
    04f3c46
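
The idea of one generic converting producer replacing several near-identical ones can be sketched without any framework code. Everything below is hypothetical (`HostPedestals`, `PedestalsGPU`, `ConvertingProducer`) and only illustrates the pattern of building a target product from a source product; it does not reproduce the interface of the templates under `HeterogeneousCore/CUDACore/`.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical host-side conditions object.
struct HostPedestals {
  std::vector<float> values;
};

// Hypothetical GPU-side wrapper, constructible from the host object.
// It copies the values, so it keeps no reference to memory owned elsewhere.
struct PedestalsGPU {
  explicit PedestalsGPU(const HostPedestals& host) : values(host.values) {
    assert(!values.empty());  // in this sketch an empty payload counts as an error
  }
  std::vector<float> values;  // stands in for data later copied to each device
};

// One generic converter in place of several near-identical producers:
// any Target constructible from a Source can reuse it.
template <typename Target, typename Source>
struct ConvertingProducer {
  std::unique_ptr<Target> produce(const Source& source) const {
    return std::make_unique<Target>(source);
  }
};
```

Because the target owns a copy of the data, the product stays valid after the source object is modified or destroyed, which is the concern behind avoiding references to transient objects.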
  55. e52e730
  56. Refactor ECAL and HCAL chi2 code (#567)

    Factor out the chi2 computation from the ECAL multifit and HCAL MAHI code,
    and move it to MultifitComputations.
    mariadalfonso authored and fwyzard committed Dec 29, 2020
    16b46a5
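
As a reminder of what a factored-out chi2 routine computes, here is a generic sketch for a linear model with a covariance matrix factorised by Cholesky decomposition: chi2 = (A x - b)^T C^{-1} (A x - b), evaluated as |L^{-1} (A x - b)|^2 with C = L L^T. This is a textbook formulation in plain C++ at a small fixed size; it is not the code in `MultifitComputations`, which per the earlier commits uses Eigen, and the `sketch` namespace and function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

namespace sketch {
  constexpr std::size_t N = 3;  // small fixed size for illustration
  using Vector = std::array<float, N>;
  using Matrix = std::array<std::array<float, N>, N>;

  // Lower-triangular L with cov = L * L^T (cov must be symmetric positive definite).
  inline Matrix cholesky(const Matrix& cov) {
    Matrix L{};
    for (std::size_t i = 0; i < N; ++i) {
      for (std::size_t j = 0; j <= i; ++j) {
        float s = cov[i][j];
        for (std::size_t k = 0; k < j; ++k)
          s -= L[i][k] * L[j][k];
        if (i == j) {
          assert(s > 0.f);
          L[i][i] = std::sqrt(s);
        } else {
          L[i][j] = s / L[j][j];
        }
      }
    }
    return L;
  }

  // chi2 of the residual A x - b, weighted by the inverse of cov.
  inline float chi2(const Matrix& A, const Vector& x, const Vector& b, const Matrix& cov) {
    const Matrix L = cholesky(cov);
    Vector y{};
    float sum = 0.f;
    for (std::size_t i = 0; i < N; ++i) {
      float r = -b[i];
      for (std::size_t j = 0; j < N; ++j)
        r += A[i][j] * x[j];  // r = (A x - b)[i]
      for (std::size_t k = 0; k < i; ++k)
        r -= L[i][k] * y[k];  // forward substitution for L y = A x - b
      y[i] = r / L[i][i];
      sum += y[i] * y[i];
    }
    return sum;
  }
}  // namespace sketch
```

Keeping the routine independent of the detector means the same function can serve both the ECAL multifit and the HCAL MAHI code, which is the point of the refactoring.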
  57. Clean up ECAL local reconstruction code (#591)

    Clean up ECAL local reconstruction code:
      - improve comments for documentation
      - remove commented out code, unused functions and unused header files
      - introduce better variable names
      - migrate to esConsumes
      - store ParameterSet by value instead of by reference
      - move __syncthreads() outside of the if blocks to avoid possibly blocked threads
      - replace ecal::abs() with std::abs()
    
    Co-authored-by: amassiro
    2 people authored and fwyzard committed Dec 29, 2020
    0214a0f
  58. 9a1a0a8
  59. e0092bd
  60. Update the default configuration of ECAL modules (#598)

    Update the default configuration of "EcalRawToDigiGPU" and "EcalUncalibRecHitProducerGPU":
      - enable the ECAL pulse timing computation in the offline workflow;
      - change the ECAL GPU digis label to match the CPU ones.
    
    Other clean up:
      - remove an unused cfi file;
      - rename "EcalUncalibRecHitMultiFitAlgo_gpu_new" to "EcalUncalibRecHitMultiFitAlgoGPU".
    fwyzard committed Dec 29, 2020
    9666459
  61. Apply code formatting

    fwyzard committed Dec 29, 2020
    01c1f96

Commits on Jan 15, 2021

  1. 92f3342