January 2022 Meeting Notes

The January 2022 LArSoft Offline Leads status update was handled via email and a Google document.

LArSoft – Erica Snider

  • The 2022 LArSoft workplan was approved at the December Steering Group Meeting.
  • The project team continues to make progress toward rolling out phase 1 of the spack migration, but we do not yet have a timeline for completion. The target is still to be ready by the end of January. We will have additional information at the January 25 LArSoft Coordination Meeting.
  • We have performed a number of updates to LArSoft.org, including the addition of information about LArSoft on HPC. If you have experience with running LArSoft on HPC resources, or are working on development toward that goal, please let us know so that we can include it on this page.
  • LArSoft Redmine wiki:  the migration of the LArSoft wiki to GitHub is under way, with the release notes and about a quarter of the remaining pages validated post-migration. Issues with the converted markdown currently limit the rate of progress. We will provide additional information at the January 25 LArSoft Coordination Meeting.
  • In response to numerous questions, we’ve added information on how to cite LArSoft. This is available at:  https://larsoft.org/citing-larsoft/ 

MicroBooNE – Herbert Greenlee

MCC9-related updates were merged into the larsoft and uboone suite integration releases as of version v09_41_00.  Refer to the talk at the Dec. 14, 2021 LArSoft Coordination Meeting.

SBND – Andrzej Szelc

Material, including videos, from the 6th UK LArTPC Software and Analysis workshop in November 2021 is available on the LArSoft training website.

This informal workshop was intended for LArTPC Software/Analysis beginners (mostly PhD students and post-docs). The aim was for new collaborators on LArTPC experiments to become familiar with the software and analysis tools commonly available to experiments such as MicroBooNE, SBND, DUNE, protoDUNE and ICARUS. The workshop was held in a hybrid mode at the University of Edinburgh and online.

SBN Data/Infrastructure – Joseph Zennamo, Wesley Ketchum

Working through large-scale production testing ahead of next major SBN release. There are major issues in SBND with memory, related to moving to the refactored LArG4 and lingering issues in the ‘rollup’ of truth information from showers. We met with Hans/others earlier in the month and developed a plan, but haven’t seen progress on that yet. This is an urgent need for simulation (and Dom Brailsford reports this is affecting DUNE as well).

DUNE – Heidi Schellman, Tingjun Yang, Michael Kirby

No Report

ICARUS – Daniele Gibin, Tracy Usher

No Report

LArIAT – Jonathan Asaadi

No Report

Please email Katherine Lato or Erica Snider for any corrections or additions to these notes.

August 2021 Meeting Notes

This Offline Leads status update was handled via email and a Google document.

LArSoft – Erica Snider

  • Making progress on art 3.09 migration, and have a third release candidate. We are aiming to transition LArSoft to art 3.09 during the week of Aug 9 or 16, depending upon what additional problems are found.
  • After art 3.09 is in place, we expect to be in a position technically to migrate LArSoft to a build system based on cetmodules with a spack back end that provides backwards compatible support for UPS. (See the presentation by Chris Green at the Feb 23, 2021 LArSoft Coordination Meeting for some discussion of this migration.) Work on this migration will begin immediately after the art 3.09 migration.
    • Prior to rolling out the new system, we will provide experiments an opportunity to review documentation and our user support, along with a release candidate with the new system. We will seek explicit sign-off from the experiments prior to migration.
    • After this migration, we will begin work on phasing out UPS in preparation for a move to the final spack-only system. Additional user resources will be provided prior to that change.
  • Progress on thread-safety has slowed. The current focus is on converting services that access the database to use the art concurrent caching support infrastructure.
  • Kyle is working on preparing a profiling and optimization presentation, as requested in issue #25831. He proposed three separate 30-minute sessions:
    1. Basics of CPU and memory usage (stacks, caches, heap allocations) and guidelines for their use
    2. Tools for profiling your programs
    3. Stepping through profile results of a sample program
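
The third proposed session, stepping through profile results of a sample program, can be illustrated with a small sketch. The notes do not say which tools the presentation will cover, so Python's standard cProfile and pstats modules are swapped in here purely for illustration. The workload and the function names (build_squares, sample_program, profile_report) are hypothetical.

```python
import cProfile
import io
import pstats


def build_squares(n):
    """Toy workload: allocate a list of n squares."""
    return [i * i for i in range(n)]


def sample_program():
    """Hypothetical sample program with one obvious hot spot."""
    total = 0
    for _ in range(50):
        total += sum(build_squares(2000))
    return total


def profile_report(func, top=5):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the callers that dominate the run come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_report(sample_program)
    print(report)
```

Reading the report from the top down shows which call chain accounts for most of the run time, which is usually the first question to answer before optimizing anything.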

DUNE – Andrew John Norman, Heidi Schellman, Tingjun Yang, Michael Kirby

DUNE has scripts to split up dunetpc, but is waiting for Dom Brailsford to commit a rearrangement of the services fcl files, which affects the ability of unit tests to run independently.  They plan on moving to GitHub for the new split repositories.  Heidi and Andrew are evaluating ways DUNE collaborators should use GitHub now that username/password access is disabled and tokens or SSH keys are required.

ICARUS – Daniele Gibin, Tracy Usher

No Report

LArIAT – Jonathan Asaadi

No Report

MicroBooNE – Herbert Greenlee

At the Aug. 24, 2021 LArSoft coordination meeting, Herb presented a plan for reconciling the LArSoft version of data product ParticleID (package lardataobj) with the MicroBooNE MCC9 version.  The long term goal is for MicroBooNE to merge its MCC9 production release updates into the develop branch.  Some follow up work is required to decide between the strategy of updating ParticleID on the develop branch to match MCC9, or adding an entirely new data product class.  The sticking point all along has been backward compatibility with data files written using older versions.

SBND – Andrzej Szelc

No Report

SBN Data/Infrastructure – Joseph Zennamo, Wesley Ketchum

SBN is preparing for the August production in advance of the larger October production push. 

As part of this, SBND has migrated to the latest refactored LArG4, where they have observed issues with the MCParticle collections containing non-unique TrackIDs and segmentation faults when trying to access trajectory information. They have followed up with experts.

Please email Katherine Lato or Erica Snider for any corrections or additions to these notes.

July 2021 Meeting Notes

Offline Leads meeting – July 15th, 2021

Attendees: Miquel Nebot-Guinot, Andrzej Szelc, Wesley Ketchum, Tom Junk,  Erica Snider, Katherine Lato

LArSoft:

  • Working on making services that access the database use the caching system. (What Kyle Knoepfel presented at the LArSoft Coordination Meeting in November, 2020.)
  • Have been working through issues related to art 3.09 migration.
    • Recently resolved two ROOT issues. One is still being tested.
    • It takes time to iterate on issues in the product stack.
    • Expect to be completed soon.
  • The first phase of the Spack migration requires art 3.09, and we expect it will follow relatively quickly after the migration to art 3.09. This phase will be compatible with UPS and mrb, so will not require major changes in how we do things.
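
The first item above, making database-backed services use a caching system, can be sketched in general terms. This is a minimal illustration of the idea and not the art concurrent caching infrastructure itself. The class name, the per-run cache key, and the fake query function are all assumptions made for the example.

```python
import threading


class CachedCalibrationService:
    """Hypothetical service that caches database lookups per run number."""

    def __init__(self, fetch):
        self._fetch = fetch            # callable standing in for a database query
        self._cache = {}
        self._lock = threading.Lock()

    def constants_for(self, run):
        # Holding the lock during the fetch is the simplest correct choice:
        # each run is queried at most once, at the cost of serializing misses.
        with self._lock:
            if run not in self._cache:
                self._cache[run] = self._fetch(run)
            return self._cache[run]


def demo():
    """Request constants from several threads and report which runs hit the database."""
    calls = []

    def fake_query(run):
        calls.append(run)
        return {"run": run, "gain": 1.0}

    service = CachedCalibrationService(fake_query)
    threads = [
        threading.Thread(target=service.constants_for, args=(run,))
        for run in (1, 1, 2, 1, 2)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(calls)


if __name__ == "__main__":
    print(demo())
```

Five requests across two runs result in only two database queries. A production design would likely avoid holding one global lock during the query, but the access pattern is the same.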

Round Robin:

  1. SBN Data/Infrastructure  (Wes, Miquel) 
    1. Need to think about the online systems for SBN. We use UPS local products, and run two environments — one on the DAQ side so more real-time/online, the other on data quality so more like offline. We need to get experience with this for the Spack transition, but nothing appears to be outside current methods. The way we do stuff in the online system mirrors what is done in the offline. 
    2. Looking to freeze the code and get things in order for the next few months. It is advantageous to us to have the new build as soon as possible. ICARUS major physics run in fall. Hoping to freeze the pieces for the ICARUS data reconstruction. When we freeze the code, we’re going to want to optimize it. May reach out for help on profiling. Hopefully in 2-3 weeks, we’ll have code that does what we want and will have dedicated time for optimizing. This will be our general pattern moving forward:  freeze functional production code, then dedicate time to optimization.
  2. SBND: (Andrzej)  Getting ready to move to new Geant4 framework. Made a module to convert the CRT output from the new format (SimAuxDetHits) to the old one (SimAuxDetChannels). The module takes hits and packages them as channels. Two objects that are effectively identical. Is this something LArSoft would be interested in? 
    1. Erica:  Contributing that to LArSoft would be good. 
    2. Andrzej: Will let Ivan know to get in touch with LArSoft.
  3. DUNE: (Tom) 
    1. Been working on chopping DUNE TPC into pieces because the builds are slow. Chopped into smaller pieces by taking directories out and assigning them to UPS products. Not too different from how LArSoft arranges things. Wrote a script to do the chopping since code changes while working on the split. One issue, the FHiCL files don’t factor as easily as the code because they are included often and include many other files. The FHiCL files can depend on things not there in the code dependency tree, so if I put a file higher on the tree, it depends on things that aren’t there. Can get around this by setting up the whole tree, but then it’s just like dunetpc now.  For LArSoft, can people set up subsets or must they run the whole thing?
      1. Erica: LArSoft depends on having experiment code (there is no native detector, for instance), so can’t run anything outside the context of an experiment. Doing an integration type test therefore requires a lot of repositories. I would expect to set up everything to do integration tests. For unit tests, the repositories should be stand-alone if done correctly. ‘mrb test’ runs unit tests at build time on one repository at a time. Historically, there were integration tests (so full art workflows) put in the unit test part. As an aside, would encourage DUNE to strip all that out, put all integration tests into CI workflows. Can define many workflows there if don’t want them all to be run automatically. Then make sure all unit tests are stand-alone, so testable one repository at a time with ‘mrb test’. 
    2. David Adams gave a talk yesterday. Again advocated structuring around art tools
      1. LArSoft MT work is de-servicing as much as we can, since at least some of the current services don’t need to be services (e.g., things where there is no need for global scope). They are really just there to take advantage of art state transitions.
      2. State transitions can be handled at module scope with tools in many cases. 
      3. Noted that ProtoDUNE pulls event data from a DB. Beam configuration is at the spill level (where a spill is ~15 sec long). Need to optimize DB access for these cases.
    3. Discussed FHiCL structures again w/in context of re-factoring repositories. Long discussion about trade-offs of aggregating configuration versus layering, shortcomings of the current scheme, other ways to organize the layering, the utility of base configurations,… Difficult to summarize, and no clear conclusions.
    4. Noted that since https access to Redmine repositories has been removed, many collaborators who want to develop code (international developers in particular) can no longer check out DUNE code.
      1. So want to deploy to GitHub. 
      2. Wes noted SBN was happy with move to GitHub and use of pull requests. Resulting in better and more stable code. Having pull request mechanism in place is helping to improve the quality and stability of the code. They are starting to do code reviews as the code comes in. Erica echoed similar situation for LArSoft. Particularly good that with pull requests, LArSoft is able to  test the code before merging. Tom not sure DUNE has the effort available. 
      3. Wes commented that there are instructions on Redmine for how to set up a mirror on GitHub.
    5. Also a discussion of factoring along functional versus detector axes. DUNE has a lot of detectors, so much of the organization is along detector lines. A lot of the code is detector-specific. Can’t use ProtoDUNE code for DUNE FD.
      1. Wes offered to share examples of using two detectors — SBND and ICARUS with common SBN underneath. Driven by how people work. Have two collaborations working together, though, which may not fit the DUNE model as well.
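
The SBND item above describes a module that takes hits and packages them as channels. The grouping step can be sketched as follows. The Hit and Channel classes and their fields are hypothetical stand-ins chosen for illustration and do not reproduce the actual LArSoft data products or the SBND module.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Hit:
    """Hypothetical stand-in for one simulated auxiliary-detector hit."""
    channel_id: int
    track_id: int
    energy: float


@dataclass
class Channel:
    """Hypothetical stand-in for the older per-channel container."""
    channel_id: int
    hits: list = field(default_factory=list)

    @property
    def total_energy(self):
        return sum(hit.energy for hit in self.hits)


def hits_to_channels(hits):
    """Group a flat list of hits into one Channel per channel ID."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit.channel_id].append(hit)
    # Sort by channel ID so the output order does not depend on hit order.
    return [Channel(channel_id, grouped[channel_id]) for channel_id in sorted(grouped)]


if __name__ == "__main__":
    hits = [Hit(2, 1, 0.5), Hit(1, 1, 1.0), Hit(2, 3, 0.25)]
    for channel in hits_to_channels(hits):
        print(channel.channel_id, len(channel.hits), channel.total_energy)
```

Because the two representations carry the same information, the conversion is a pure regrouping and no hit content is altered.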

Please email Katherine Lato or Erica Snider for any corrections or additions to these notes.

May 2021 Meeting Notes

This Offline Leads status update was handled via email and a Google document.

LArSoft – Erica Snider

  • The previously proposed rollback of hdf5 package will not be necessary. We have the required e20 builds, which required patches to an externally supported package. Thank you to those who followed up with testing the rollback.
  • The migration to art 3.09 is in progress, and is expected to be completed by mid-May. This new version comes with three associated changes:  
    • e20 as the default build qualifier 
    • A new version of ROOT that addresses a problem reading certain files (issue #25615). This version of art is also compatible with cetmodules, and will enable the first phase of migration to the Spack-based build system.
    • Tensorflow v2.3
  • SBN previously requested assistance and possibly a tutorial on profiling tools and techniques. The SciSoft team is prepared to provide this assistance. SBN should make a specific request via the Offline Leads Meeting or a Redmine issue ticket.
  • Update on the status of memory footprint increase reported by DUNE:  Tom Junk reported some progress on the DUNE side. There has been no further progress to report from the LArSoft side. Kyle Knoepfel is tasked with following up.
  • The project has no progress to report on geometry extensions for pixel detectors.

DUNE – Andrew John Norman, Heidi Schellman, Tingjun Yang, Michael Kirby

dunepdsprce, dune-raw-data and dunetpc have been compiled and tested with e20.  It took a little maintenance, as data read-in methods sometimes involved creating pointers to elements of packed structures, which may be unaligned; e20 emits a warning for these.  All are fixed, though if someday in the future 32-bit objects get padded unless we say packed, we could be in for more maintenance.  Tom’s progress with the memory footprint issue consisted of identifying software components that take more memory in larsoft v09_16_00 as compared with v09_15_00, and a lot of it seems to be what ROOT loads with it.
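
The alignment concern can be illustrated outside C++ with Python's standard struct module. This is only a sketch of the layout issue: the record format (a 1-byte flag followed by a 4-byte counter) is a hypothetical example and not the DUNE raw data format.

```python
import struct

# Hypothetical record: a 1-byte flag followed by a 4-byte unsigned counter.
PACKED = struct.Struct("<BI")  # standard sizes, no padding between fields
NATIVE = struct.Struct("@BI")  # native alignment, platform dependent


def read_record(buffer, offset=0):
    """Copy the fields out of a packed buffer instead of pointing into it."""
    flag, counter = PACKED.unpack_from(buffer, offset)
    return flag, counter


if __name__ == "__main__":
    raw = PACKED.pack(7, 1000)
    # The packed layout is 5 bytes, so the counter starts at byte offset 1.
    print(PACKED.size, NATIVE.size)
    print(read_record(raw))
```

In the packed layout the 4-byte field does not start on a 4-byte boundary, which is why taking a pointer to it is flagged. Copying the value out, as read_record does, sidesteps the problem.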

ICARUS – Daniele Gibin, Tracy Usher

No Report

LArIAT – Jonathan Asaadi

No Report

MicroBooNE – Herbert Greenlee

No Report

SBND – Andrzej Szelc

No Report

SBN Data/Infrastructure – Joseph Zennamo, Wesley Ketchum

From email: We have opened a request for a profiling tutorial for SBN developers:

https://cdcvs.fnal.gov/redmine/issues/25831

Please email Katherine Lato or Erica Snider for any corrections or additions to these notes.

April 2021 Offline Leads Meeting Notes

Offline Leads Meeting – April 22, 2021

Attendees: Joseph Zennamo, Tingjun Yang, Erica Snider, Katherine Lato

1) We had a request to migrate “best effort” Ubuntu support from LTS 18 to LTS 20. This requires building under gcc v9 (e20). This now works, so will begin “best effort” for LTS 20. Proposing to move to e20. DUNE and uB have products that will need to be rebuilt with e20.

Discussion: Joseph asked about the impact of shifting to e20.

There typically aren’t changes to interfaces, but compilers get better at enforcing the standard. Sometimes code that compiles with an earlier version of a compiler no longer compiles because the code wasn’t compliant with the standard. 

Tingjun noted that they tried to move to e20 for ArgoNeuT code. There were some issues with warnings in TensorFlow. Lynn provided a solution, which they’re going to test. DUNE may have similar issues and should start testing it.

LArSoft will migrate once experiments give the all-clear.

2) There is a request to migrate to TensorFlow v2.3. The project is ready to do this, but we need people from the experiments to check that everything works as required under the new version. Only larrecodnn uses TensorFlow within core LArSoft.  Both argoneutcode and dunetpc use tensorflow.

Discussion: Have expanded the scope of this migration to include moving to the next version of TensorRT (now re-branded as Triton) at the same time. 

Leigh Whitehead said in email several weeks back that  DUNE is ready for TF v2.3. Tingjun noted that some things have changed, so they need to run some of the tests again.

3) Rollback of hdf5 v1_12 to hdf5 v1_10. (Noted that the older version builds with e20)

Discussion: SBN has no immediate use for this, but given their drive to use HPC resources, expect that HDF5 conversions will be a part of the workflow at some point. No opinion at this time.

Discussion at last LCM suggested DUNE is ok with a temporary rollback to hdf5 v1_10. Need to confirm.

4) Round table:

Tingjun: ArgoNeuT and DUNE issues for LArSoft

  1. A producer module crashes when reading older data. Submitted an art ticket for that. Kyle was consulting, but it’s been a while and it may impact us soon. Urgent. https://cdcvs.fnal.gov/redmine/issues/25615
  2. Since LArSoft moved to the new ROOT version, we see a 20% increase in memory usage for DUNE production jobs. Reported this via LArSoft issue. Tom Junk is the contact. Not as urgent, but it has impact on our production since we have to request more slots. https://cdcvs.fnal.gov/redmine/issues/25512 
  3. Tom Junk additions after the meeting via email:
    1. dunetpc compiles (and links) with e20 but I have yet to run anything more than an event display with it.  There’s an e20 build of tensorflow v1_12_0d that is included in the dependency tree when I built dunetpc just now with e20.
    2. There were a couple of things in dune-raw-data and dunepdsprce that caused gcc v9_3_0 to emit new warnings but these have been straightened out.
    3. I have tested the rollback to hdf5 v1_10 with the raw data readin source we have in dune-raw-data and it works. There’s now a dune_raw_data v1_18_01 which builds with the older hdf5, ready to go when the rollback is deployed. DUNE also depends on hdf5 via hep_hpc, and there is a rolled-back version of that now (with e20 even.  Thanks, Lynn!) I am discussing with Kyle about how best to do delayed reading with HDF5.  This is important to keep memory consumption down for the DUNE far detector and even helps us with ProtoDUNE data, which I assume will be in HDF5 format moving forwards, if I read peoples’ slides right.  We had it working with ROOT, but it will take some design and coding to get it right with HDF5.
    4. Regarding the memory increase with larsoft v09_16_00 (ROOT v6_22), I ran valgrind and spotted a few things that were taking more memory.  I don’t have solutions, however.

SBN: 

  1. Just started a workflow group with SBN to digest how they’re going to do everything and think about it. There may be things that affect LArSoft in the future, but not right now.
  2. What kind of support on profiling SBN code is there?  For the first pass, is it possible to invite a profiling expert to an SBN meeting to help developers and analyzers learn how to use these tools? At least for the simpler ones. Need a discussion suitable for analyzers.

Erica: SciSoft team can assist with profiling. The lab provides a set of profiling tools, though it changes with time. There is expertise within SCD in using these tools. Will try to find someone to provide the requested tutorial.

Please email Katherine Lato or Erica Snider for any corrections or additions to these notes.

February 2021 Offline Leads Meeting Notes

The February 2021 LArSoft Offline Leads status update was handled via email and a Google document.

LArSoft – Erica Snider

LArSoft has migrated to art 3.06 as of LArSoft v09_16_00 released on Feb 4. The new version of art includes an update to root, v6_22_06a. Note that this version of root requires an additional set of ups qualifiers, e19:p383b:prof, etc.

In anticipation of the migration to the spack packaging system, we have begun work to migrate the LArSoft build to use cetmodules, which uses spack and CMake instead of cetbuildtools. cetmodules is backwards compatible with UPS, so can be used with either packaging system. There will be no impact on developers and end users when this change is introduced. More details will be provided once the full migration plan is completed.

There is a request to migrate to TensorFlow v2.3. The project is ready to do this, but we need people from the experiments to check that everything works as required under the new version. Only larrecodnn uses TensorFlow within core LArSoft. Both argoneutcode and dunetpc use tensorflow.

We also have a request to migrate “best effort” Ubuntu support from LTS 18 to LTS 20. This will probably require that we also migrate to gcc v9 (e20). Please send any comments you have regarding either of these changes. 

DUNE – Andrew John Norman, Heidi Schellman, Tingjun Yang, Michael Kirby

We upgraded DUNE’s stack to the new art and root with very few issues.  A development hdf5 reader module in dune-raw-data had used H5Cpp.h, which is now removed. We are looking for alternative methods in hep_hpc but are having trouble finding a method to access file attributes. We are testing the tensorflow v2.3 product.

SBN Data/Infrastructure – Joseph Zennamo, Wesley Ketchum

We did a minor refactoring of some of our experiment code to ease further development and deployment: this has meant setting up an ‘sbnana’ that has minimal dependencies on art/LArSoft, helping us push forward on the CAFAna analysis framework development. We’re working on being ready for e20 releases ASAP, to help the Ubuntu migration to LTS20 go through faster. We’re also looking forward to the updates to the Event Reweighting presented by S. Gardiner, as that will be necessary for handling reweighting systematics in SBN sensitivity studies.

ICARUS – Daniele Gibin, Tracy Usher

No Report

LArIAT – Jonathan Asaadi

No Report

MicroBooNE – Herbert Greenlee

No Report

SBND – Andrzej Szelc

No Report

Please email Katherine Lato or Erica Snider for any corrections or additions to these notes.

January 2021 Offline Leads Meeting Notes

Offline Leads Meeting – Jan. 14th

Tom Junk, Tingjun Yang, Erica Snider, Katherine Lato

LArSoft Update:

  • Spack Update: SciSoft, after getting art to work with Spack, is now focusing on LArSoft. Kyle is working on a draft of the migration plan. 
    • Tom has looked at a few examples, trying things. Did not seem to be “easier” than the current system. The hope is that it at least solves some of the technical issues with portability, etc, and is no worse an experience for users. 
    • Two phases for the transition. In the first phase, allow Spack to operate with the underlying UPS. The second will do away with UPS.
  • Tickets for “numerous” items from work plan discussion were mainly ICARUS and SBND, who weren’t at the meeting. Documentation for running in containers was a DUNE request. No one present had tried them, so no feedback on this.
  • Documentation in general. Who do we target? New people, or experts? Is it complete? LArSoft asked for guidance on what to focus on and missing pieces. Also help with identifying things that are out of date.
  • Mu Wei presented Algorithms for calculating number of ions and photons at the January 12th LArSoft Coordination meeting, which led to a proposal to drop support for the Separate algorithm (where number of electrons and photons are uncorrelated). Discussed with those present, who agreed dropping Separate is reasonable.

Round Robin, DUNE:

  • Tom asked about GitHub going to two-factor authentication in August. Pull request was awkward. User side documentation isn’t as obvious as the administrative, but Tom did find it on the web. Might be helpful to add some pointers in the LArSoft wiki to the correct documentation in GitHub.
  • Tingjun mentioned a computing tutorial they’re having next Friday that will be going through a lot of LArSoft wiki pages. LArSoft asked to let us know if things are out of date, or need updating prior to that.
  • DUNE has a plan to split dunetpc but had been waiting on the Spack migration first. They have since decided they don’t need to wait on that. Discussed some of the technicalities, such as needing to do the build themselves piece by piece or getting an FWBuild-like recipe. Biggest work is probably breaking up into pieces, which they can do at any time. Will keep LArSoft advised.
  • Tingjun asked about status on updating TensorFlow. Erica noted that as of mid-December, there was a Tensorflow v2.3 build available.  Will inquire about plans for migrating LArSoft, and see about expediting them. 

Please email Katherine Lato or Erica Snider for any corrections or additions to these notes.