Category Archives: news

LArSoft Workshop June 2019

 Workshop Overview

The annual LArSoft workshop was held on June 24 and 25, 2019, at Fermilab. There were three sessions:

  • Session 1:  LArSoft tutorial. 
    • Provide the basic knowledge and tools for navigating, using, writing and contributing LArSoft code.
  • Session 2:  Multi-threading and vectorization.
    • Multi-threading and vectorization targeting CPUs and grid processing, giving people the background and tools needed to approach the code and start thinking about making their code thread safe, trying to address memory issues, vectorizing, etc.
  • Session 3: Long-term vision for LArSoft.
    • To discuss ideas and concerns about how LArSoft should evolve with changes to the computing landscape as we move toward the era of DUNE data-taking.

The slides from the speakers can be found on Indico. All sessions were recorded; the recordings can be found at: https://vms.fnal.gov/.

Session 1:  LArSoft tutorial

Overview and Introduction to LArSoft 

Erica Snider began the LArSoft workshop with an introduction to LArSoft. The LArSoft collaboration consists of a group of experiments and software computing organizations that contribute to and share data simulation, reconstruction and analysis code for Liquid Argon TPC experiments. LArSoft also refers to the code that is shared amongst these experiments. The organizing principle for LArSoft is a layering of functionality and dependencies.

LArSoft is not stand-alone code. It requires experiment / detector-specific configuration. The same basic design pertains to the experiment code. Nothing in core LArSoft code depends upon experiment code.

Technical details, code organization

Saba Sehrish covered repositories, UPS products, setting up and running LArSoft and contributing to LArSoft. There are 18 repositories containing the LArSoft code; each experiment has at least one code repository for detector-specific code.

Simplify your code

Kyle Knoepfel discussed how code becomes complex. Over time, code becomes larger and larger. Ways to combat this include:

  • remove files that you know are not needed
  • remove unnecessary header dependencies
  • remove unnecessary link-time dependencies
  • remove unnecessary functions
  • use modern C++ facilities to simplify your code (see the sketch below)
  • reduce coupling to art
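
As a sketch of the "modern C++ facilities" point (generic C++, not code from the workshop), compare hand-rolled iterator loops and raw pointer ownership with range-based for, auto and std::make_unique:

    // Minimal sketch, not from the workshop material: modern C++ facilities
    // (range-based for, auto, std::make_unique) replacing iterator boilerplate
    // and manual new/delete.
    #include <iostream>
    #include <memory>
    #include <vector>

    struct Hit { double charge; };

    double totalCharge(std::vector<Hit> const& hits)
    {
      double total = 0.;
      for (auto const& hit : hits)   // no explicit iterator types needed
        total += hit.charge;
      return total;
    }

    int main()
    {
      auto hit = std::make_unique<Hit>();   // ownership is explicit, no delete needed
      hit->charge = 3.5;
      std::vector<Hit> hits{{1.0}, {2.5}, *hit};
      std::cout << "total charge: " << totalCharge(hits) << '\n';
    }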

Pandora tutorial

Andrew Smith discussed pattern recognition in LArTPC experiments using Pandora, a general-purpose, open-source framework for pattern recognition. It was initially developed for future linear collider experiments but is now well established on many LArTPC experiments.

Useful source material on Pandora:

  1. Multi-day Pandora workshop in Cambridge, UK – 2016
    • Talks about how the algorithms work and step-by-step exercises about how you might develop a new algorithm using Pandora.
  2. LArSoft workshop in Manchester, UK – 2018
  3. Workshop on advanced computing & machine learning, Paraguay – 2018
    • Talks and exercises about running and using Pandora within LArSoft, including tutorials on using Pandora’s custom event display
  4. Experiment-specific resources

Practical guide to getting started in LArSoft

Tingjun Yang presented a practical guide to getting started in LArSoft. He used ProtoDUNE examples that apply to most LArTPC experiments. A lot can be learned from existing code, by talking to people, and by asking for help on Slack.

How to tag and build a LArSoft patch release

Lynn Garren presented how an experiment can tag and build a patch release; MicroBooNE is already doing this. LArSoft provides tools, instructions and consultation. Up-to-date instructions are available at: How to tag and build a LArSoft patch release.

Session 2:  Multi-threading and vectorization

Introduction to multi-threading and vectorization

Matti Kortelainen discussed the motivation and the practical aspects of both multi-threading and vectorization.

There are two models of parallelism:

  • Data parallelism: distribute data across “nodes”, which then operate on the data in parallel.
  • Task parallelism: distribute tasks across “nodes”, which then run the tasks in parallel.
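
As a minimal illustration of the two models (generic C++, not from the talk), the data-parallel version splits one array across two threads, while the task-parallel version runs two unrelated tasks concurrently with std::async:

    // Generic C++ sketch of data vs task parallelism.
    #include <future>
    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main()
    {
      std::vector<double> data(1'000'000, 1.0);

      // Data parallelism: each thread sums half of the same array.
      double sum1 = 0., sum2 = 0.;
      auto const mid = data.begin() + data.size() / 2;
      std::thread t1{[&] { sum1 = std::accumulate(data.begin(), mid, 0.); }};
      std::thread t2{[&] { sum2 = std::accumulate(mid, data.end(), 0.); }};
      t1.join();
      t2.join();

      // Task parallelism: two independent tasks run concurrently.
      auto taskA = std::async(std::launch::async, [] { return 42; });
      auto taskB = std::async(std::launch::async, [] { return 3.14; });

      std::cout << "sum = " << sum1 + sum2
                << ", tasks = " << taskA.get() << ", " << taskB.get() << '\n';
    }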

Two threads may “race” to read and write. There are many variations on what can happen.

A software thread  is the “Smallest sequence of programmed instructions that can be managed independently by a scheduler.” [Wikipedia]
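
A small, self-contained example of such a race (generic C++, not LArSoft code): two threads increment a shared counter, and only the std::atomic counter is guaranteed to end up with the expected value:

    // Two threads updating shared counters. The plain counter can lose updates
    // (a data race, undefined behavior); std::atomic makes each increment indivisible.
    #include <atomic>
    #include <iostream>
    #include <thread>

    int main()
    {
      long unsafeCount = 0;              // plain variable: concurrent increments race
      std::atomic<long> safeCount{0};    // atomic: well-defined under concurrency

      auto work = [&] {
        for (int i = 0; i < 100000; ++i) {
          ++unsafeCount;                 // data race
          ++safeCount;                   // safe
        }
      };

      std::thread t1{work}, t2{work};
      t1.join();
      t2.join();

      std::cout << "unsafe: " << unsafeCount << "  safe: " << safeCount << '\n';
    }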

Vectorization works well for math-heavy problems with large arrays/matrices/tensors of data. It doesn’t work so well for arbitrary data and algorithms.

Making code thread-safe

Kyle Knoepfel discussed how to make code thread-safe. The difficulty of this task depends on the context.

Multi-threaded art

Kyle Knoepfel described multi-threaded art. Modules on one trigger path may not consume products created by modules that are not on that same path. The design is largely based on CMSSW’s design.

Experience learning to make code thread-safe

Mike Wang described his experience with making LArSoft code thread-safe. Except for the most trivial cases, do not expect to be able to home in on a piece of LArSoft code (such as a particular service) and work on it in isolation when attempting to make it thread-safe: you are dealing with an intricate web of interconnecting and interacting pieces. Understanding how the code works and what it does, tedious as it may seem, goes a long way toward facilitating the process of making the code thread-safe and helps avoid errors that would be very difficult to debug.

Session 3: Long-term vision for LArSoft

Overview

Adam Lyon noted that computing is changing (and the change has changed: GPUs over KNLs). The future is many-core, with limited power per core, limited memory per core, and memory bandwidth increasingly the limiting factor. The DOE is spending $2B on new “Exascale” machines.

The Fermilab Scientific Computing Division is committed to LArSoft for current and future liquid argon experiments:

  • Fermilab SCD developers will continue to focus on infrastructure and software engineering
  • Continue to rely on developers from experiments
  • Continue to interface to neutrino toolkits like Pandora
  • Need to confront the HPC evolution
  • Reduce dependency on the framework

Computing in the time of DUNE; HPC computing solutions for LArSoft

As Giuseppe Cerati noted, technology is in rapid evolution. We can no longer rely on frequency (CPU clock speed) to keep growing exponentially. We must exploit parallelization to avoid sacrificing physics performance.

Emerging architectures are about power efficiency, with technology increasingly driven by machine-learning applications.

Many workflows of LArTPC experiments could exploit HPC resources – simulation, reconstruction (signal processing), deep learning (training and inference), analysis.

Data management and workflow solutions needed

Mike Kirby discussed the data management and workflow solutions needed in the long term, based mainly on DUNE and MicroBooNE. “Event” volumes for DUNE are an order of magnitude beyond collider events, and the experiments are already reducing the data volume from raw data to just hits.

The LArSoft framework works wonderfully for processing artroot files, but there is no comparable “framework” for processing non-artroot files (plain ntuples, etc.), and this gap could be a problem. CAFAna is actively in use for DUNE and NOvA, but it is not a fully supported analysis framework.

With multiple Far Detector Modules and more than 100 Anode Plane Arrays possibly read out in a trigger record, the ability to distribute event “chunks” to multiple LArSoft processes/threads/jobs and reassemble them into reconstructed events should be explored.

DUNE perspective on long-term vision

Tom Junk started by discussing the near detector for DUNE. Individual photon ray tracing is time consuming; they are studying a solution using a photon library for the ArgonCUBE 2×2 prototype. LArSoft assumes “wires” at the core of its geometry design and in the APIs used to query geometry information. This needs to change. Gianluca Petrillo at SLAC has designed a generic “charge-sensitive element” to replace the current implementation in a non-disruptive manner. The goal is to run largeant for both wire and pixel geometries.

We have some concerns about external source control support. There’s a lot of open-source code out there. Do we have to maintain every piece a DUNE collaborator wants to use?

ICARUS perspective on long-term vision

Tracy Usher pointed out some of the areas where ICARUS is already stressing the “standard” implementation of LArSoft based simulation and reconstruction. ICARUS stands for Imaging Cosmic And Rare Underground Signals. It is the result of some 20+ years of development of Liquid Argon TPCs as high resolution particle imaging detectors from ideas first presented by Carlo Rubbia in 1977.

ICARUS has more sense wires than SBND or MicroBooNE, and has 4 TPCs compared to 2 for SBND and 1 for MicroBooNE. ICARUS has horizontal wires, not vertical, and they are split. It was originally optimized for detecting cosmic rays.

Conclusion

Slides are available at https://indico.fnal.gov/event/20453/other-view?view=standard. There were a variety of speakers and topics, from introductory to advanced HPC techniques, enabling people to attend the sections of most interest to them. The quality of the talks was quite high with a lot of new, interesting content which we can now use to update the documentation on LArSoft.org. It will also serve as an excellent basis for discussion moving forward. 

Thank you to all who presented and/or participated!

Updated Geometry Description – September 2017

Note: Information moved to: https://larsoft.org/important-concepts-in-larsoft/geometry/ on 12/13/21.

Thanks to the help of Erica Snider, Gianluca Petrillo and Thomas Junk, we have an updated Geometry description available in the LArSoft wiki here.  The geometry package contains classes related to the geometry representation such as planes, TPCs, cryostats, etc. The LArSoft geometry provides descriptions of the physical structures and materials in the detector. Some important specifiable parameters in the detector geometry include:

  • the position of the detector relative to the beam
  • the structure and material properties of the cathode planes
  • where individual photon detectors are
  • the placement of wires and the distance between them within a plane
  • the distance between the wire planes
  • the details of the material surrounding the cryostat
  • the composition of the overburden
  • transformations between coordinate systems attached to various elements and global coordinates

The geometry also provides a mapping between sensing elements such as wires or strips and DAQ channels.
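
As a rough sketch of how a module might query that mapping (this assumes the geo::Geometry art service and its ChannelToWire method; header paths and signatures vary between LArSoft versions):

    // Rough sketch; LArSoft header paths and interfaces may differ between versions.
    #include "art/Framework/Services/Registry/ServiceHandle.h"
    #include "larcore/Geometry/Geometry.h"

    #include <iostream>

    // Print every wire segment read out by a given DAQ channel.
    void printWiresForChannel(unsigned int channel)
    {
      art::ServiceHandle<geo::Geometry const> geom;

      // A single DAQ channel can map to more than one wire segment.
      for (geo::WireID const& wire : geom->ChannelToWire(channel)) {
        std::cout << "channel " << channel
                  << " -> cryostat " << wire.Cryostat
                  << ", TPC " << wire.TPC
                  << ", plane " << wire.Plane
                  << ", wire " << wire.Wire << '\n';
      }
    }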

LArSoft release 6.28 changed the geometry to support dual-phase TPCs, which caused several assumptions to be removed or to change:

  • the drift direction is no longer assumed to be along x, but can be along any axis
  • the projection of a point on a plane is no longer assumed to have coordinates (y,z)
  • views are no longer assumed to measure a coordinate that grows with z
  • the outer plane can no longer be assumed to have drift coordinate 0 (the same as the drift distance)

When updating code, understanding the assumptions at the time the code was written may help explain why certain options were chosen.

For more information, please go to the LArSoft wiki here.

–Katherine Lato

LArSoft Workshop on Tools and Technologies – June 2017

Welcome – Panagiotis Spentzouris

Panagiotis Spentzouris welcomed participants and said the emphasis for LArSoft is on moving forward, improving technologies. Looking at the agenda, that’s what LArSoft is doing. As Panagiotis said, “Improvements in performance have been accomplished … Looking at the path of this project, things are on track.” Panagiotis’ welcome is available as an audio file below.

Introduction and welcome –  Erica Snider

Erica provided an introduction to the workshop, explaining that the reason for the workshop is that, “We want things that make our work easier, that help us produce better code and make our code run faster/more efficiently.” The workshop explores parallel computing, Continuous Integration, a new build system for art/LArSoft called Spack, debugging and profiling tools. Erica’s presentation is available as slides or as a video.

Introduction to multi-threading – Chris Jones

Chris explained that multi-threading uses threads that can interact through shared memory, whereas multiple processes cannot. Multi-threading speeds up the program and enables the sharing of resources within the program itself. This is needed because CPU frequencies are no longer increasing. The presentation included examples of race conditions and the point that there are no benign race conditions. There are different levels of thread safety for objects:

  • thread hostile – not safe for more than one thread to call methods even for different class instances
  • thread friendly – different class instances can be used by different threads safely
  • const-thread safe – multiple threads can call const methods on the same class instance
  • thread safe – multiple threads can call non-const methods on the same class instance
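
A minimal sketch of the last two categories (generic C++, not from the talk): const methods that only read are const-thread safe, while a mutex makes non-const methods safe to call from many threads:

    // Calibration::get() is const-thread safe: concurrent const calls only read.
    // Counter::add() is thread safe: non-const calls are serialized by a mutex.
    #include <cstddef>
    #include <mutex>
    #include <vector>

    class Calibration {
      std::vector<double> gains_;
    public:
      explicit Calibration(std::vector<double> gains) : gains_(std::move(gains)) {}
      double get(std::size_t channel) const { return gains_.at(channel); }  // read-only
    };

    class Counter {
      std::mutex m_;
      long n_ = 0;
    public:
      void add(long delta) {                     // safe to call from many threads
        std::lock_guard<std::mutex> lock{m_};
        n_ += delta;
      }
      long value() {
        std::lock_guard<std::mutex> lock{m_};
        return n_;
      }
    };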

Chris also covered how to get ready for multi-threading: removing ‘global’ variables, not using mutable member data for event data products, and removing mutable state from services. A useful talk about the C++ memory model and threading is available at: https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutteratomic-Weapons-1-of-2

Chris’s presentation is available as slides or a video.

Vectorization and LArSoft – James Amundson

Single Instruction Multiple Data (SIMD) – a single instruction performs the same operation on multiple data items. We used to be able to just wait for hardware improvements to make code faster, but that is no longer true.

SIMD instructions have the potential to improve floating point performance two to 16 times. Jim showed examples of taking advantage of SIMD instructions, encouraging people to look at the different libraries that do this. Many widely-available math libraries include SIMD intrinsic support. He showed a LArSoft roofline analysis performed by Giuseppe Cerati using Intel Advisor.
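
As a sketch of the kind of loop this applies to (generic C++, not taken from the slides), a contiguous, branch-free kernel with independent iterations is what a compiler can auto-vectorize, for example with -O3 plus an architecture flag, and what SIMD libraries and intrinsics target:

    // y[i] = a*x[i] + y[i], the classic axpy kernel: independent iterations over
    // contiguous data, so the compiler is free to emit vector instructions.
    #include <cstddef>
    #include <vector>

    void axpy(double a, std::vector<double> const& x, std::vector<double>& y)
    {
      std::size_t const n = x.size();
      for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];   // no branches, no cross-iteration dependencies
      }
    }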

Jim’s presentation is available as slides or a video.

LArSoft CI System Overview – Vito Di Benedetto 

Continuous Integration is a software engineering practice in which changes to the software are immediately tested and reported. The goal is to provide rapid feedback to help identify defects introduced by code changes as soon as possible. Vito covered how to run CI and obtain results, with detailed examples. The questions included a request to see a sample script, so Vito brought up the Redmine page: https://cdcvs.fnal.gov/redmine/projects/lar_ci/wiki.

Vito’s presentation is available as slides or a video.

Spack and SpackDev build system – James Amundson

LArSoft and art are moving to the Spack and SpackDev based build system. This system does not have one-for-one replacements for existing tools. Spack is a package manager designed to handle multiple versions and variants – see https://spack.io/ and https://github.com/LLNL/spack. The original plan was to have a full demo at this workshop, but Spack development is behind schedule, so we could only explore Spack functionality. To try Spack, go to https://github.com/amundson/spackdev-bootstrap

Jim’s presentation is available as slides or a video.

Debugging tools – Paul Russo 

Paul explained the basics of using the gdb debugger to find and fix errors in code. The compiler must help the debugger by writing a detailed description of the code to the object file during compilation. This detailed description is commonly referred to as debugging symbols.

Paul’s presentation is available as slides or a video.

Profiling Tutorial – Soon Yung Jun

Soon Yung gave an introduction to computing performance profiling with an overview of selected profiling tools and examples. Performance tuning is an essential part of the development cycle. Performance benchmarking quantifies usage/changes of CPU time and memory (amount required or churn). Performance profiling analyzes hot spots, bottlenecks and efficient utilization of resources. Soon also gave profiling results of the LArTest application with IgProf and Open|Speedshop.

Soon’s presentation is available as slides or a video.

Feedback

Comments include:

  • I found most of it useful. I’m fairly new to all of this, so it was mostly all new, useful information.
  • Increasing the portions for beginners or have a stand alone workshop for beginners.
  • More hands-on, interactive tutorials for the audience, even if it means installing some software/programs in advance.
  • I would like to have some topics which may be aimed towards participants who are less-than experts with the hope that us physicists can be more than just second-rate programmers.
  • Mostly; I thought the topics were good choices but would have preferred it to be a bit more hands-on.
  • I think these topics were the right ones; maybe even more on running on HPC resources next time.

Thanks to all who presented and/or participated.

Updated Continuous Integration – May 2017

Information moved to: https://larsoft.org/continuous-integration/ on 12/13/21.

Various studies have shown that the later in a project errors are found, the more it costs to fix them.[1][2][3] The Continuous Integration (CI) approach of software development adopted by LArSoft can save time by finding errors earlier in the process. As Lynn Garren noted, “Using CI allows you to test your code in the complete LArSoft environment independent of other people, which can save you time.” Another important benefit is noted by Gianluca Petrillo, “By using the CI system, I gain confidence that my code changes will not break other people’s work since the tests in the CI system will notify me if something went awry.”

The LArSoft Continuous Integration system is the set of tools, applications and machines that allows users to specify and automate the execution of tests of LArSoft and experiment software. Tests are configured via text files in the various repositories and are launched in response to an http trigger initiated manually by a user or automatically on every git push command to the develop branch of the central repositories. Arguments in the trigger specify run-time configuration options such as the base release to use, the branch within each repository to build (the develop branch by default), and the test “suite” to run, where each suite is a workflow of tests specified in the configuration files. Prior to each test, the new code is built against the base release. The CI system reports status, progress, test results, test logs, etc. to a database and notifies users with a summary of results via email. A web application queries the database to display a summary of test results with drill-down links to detailed information.

As Erica Snider said, “People focus on the testing needed for their particular experiment.  One benefit of our CI system is that, every time a change is made, it automatically tests the common code against all of the experiment repositories, then runs the test suites defined by each of the experiments. This allows us to catch interoperability problems within minutes after they are committed.”

There have been a number of updates to CI in recent months aimed at making the system easier and more intuitive for users, so please try it.

You can find more information about CI at http://larsoft.org/continuous-integration/ with detailed instructions on how to run jobs at: https://cdcvs.fnal.gov/redmine/projects/lar_ci/wiki

Training – March 2017

There is a new training page on the larsoft.org website that provides useful information on training for LArSoft in one spot. While designed for people new to LArSoft, it has links to information that may be of value to all. It can be found at  http://larsoft.org/training/

A valuable source of training is the annual LArSoft workshop. In 2017, the workshop will be held on June 20. The topics will be:

  • SPACK build system tutorial – In-depth, hands-on exploration of features, how to configure builds
  • Introduction to concurrency – What this means, why it is important, and what it implies for your code. A general discussion in advance of multi-threaded art release
  • Debugging and profiling tutorial
  • Continuous Integration (CI)  improvements and new features

Information, including presentations, will be added to the Indico site.

–Katherine Lato

Expectations for Contributing Code to LArSoft – January 2017

Information moved to: https://larsoft.org/expectations-for-contributing-code-to-larsoft/ on 12/15/21.

While developing experiment-specific code, a developer may work on a feature (such as an algorithm, utility or improvement) that can be shared with the community of experiments that use LArSoft. In most cases this can be done easily, since the modified code does not affect the architecture and doesn’t break anything. But when the feature affects the architecture, a few more steps are needed to ensure that the contributed code can be properly shared across experiments and integrates well into the existing body of code. This article outlines the process for introducing both non-architectural features and architecture-affecting features into the core LArSoft code repositories.

The intent of the process is to achieve a smooth integration of new code, new ideas or improvements in a coordinated way, while at the same time minimizing any additional work required beyond the normal development cycle. Many changes will not require prior discussion. In cases with broad architectural implications, getting feedback and guidance from the LArSoft Collaboration or the core LArSoft team early in the development cycle may be required.

Process for contributing a non-architectural, non-breaking change to LArSoft:

  1. Become familiar with the design principles and coding guidelines of LArSoft.
  2. Develop the code including comments, tests and documentation.
  3. Offer it by talking to LArSoft team members about it, and perhaps by giving a presentation at the LArSoft Coordination Meeting.

The rest of this article addresses a breaking, architectural change to LArSoft. Remember, all other cases involve a simpler process.

Process for contributing an architectural, breaking change to LArSoft:

  1. Someone working on an experiment has an idea, an improvement, or a new feature that affects the LArSoft architecture that can be shared in the core LArSoft repositories.
  2. Developer contacts the LArSoft Technical Lead or other members of the core LArSoft team to discuss the idea. Discussion can include an email thread, formal meeting, chat in the hallway, phone call, or any communication that works.
    • May find that the feature, or a suitable alternative, is already in LArSoft, thus saving the need to develop it again.
    • At this point, a decision will be made as to whether further discussion or review is needed, and where that discussion should take place. If other experts are likely to be required, a plan for including them in the process will be developed. The division of labor for the architectural changes will be discussed as well.
  3. Developer learns/reviews the design principles and coding guidelines of LArSoft.
  4. Developer prepares a straw proposal or prototype for the change.
  5. For major changes as determined in step (2), the proposal should be presented at the biweekly LArSoft Coordination Meeting. Depending on the feedback, more than one presentation may be useful.
  6. The developer writes the code, including comments, tests, and examples as needed, and keeps the LArSoft team informed on the status of work.
    • Any redmine pages or other technical documentation should be written during this time as well.
    • For new algorithms and services, an entry in the list of algorithms  should be made.
  7. When development is completed,  request that it be merged onto the develop branch since this is a breaking change.

There is a cost in making things workable for other experiments, but the benefit is that other experiments develop software that is usable by all experiments. The more this happens, the more all experiments benefit.

When designing LArSoft code, it’s important to understand the core LArSoft suite and all the components used by it. It’s also important to follow the design principles and coding guidelines so that what is developed can be maintained going forward. A good place to start is at Important Concepts in LArSoft  and by reading the material and watching the video about Configuration. It is important to follow the guidelines to have configuration-aware code that is maintainable with easy-to-understand configurations. The less time that is spent on debugging, the more time that can be spent on physics.

Once LArSoft contributors are aware of how LArSoft works, including the principles and guidelines for developing software, and have discussed the new feature with the LArSoft team, they will usually be asked to make a presentation at the LArSoft Coordination meeting. The idea here is to share the plan and the approach to solicit more ideas. Treat the initial presentation as a straw proposal–something that is designed to solicit feedback, not the final implementation that must be defended. At the same time, if a suggestion would double or triple the work required to implement it, and there isn’t a strong need for that suggestion to be implemented, it can be noted and set aside. The contributor is in charge of what he or she implements. The goal is to share software to reduce the development effort overall. It also encourages communication across experiments. The more collaborations can benefit from each other’s work, the better off we all are.

Details on developing software for LArSoft can be found on the Developing with LArSoft wiki page. By contributing to the common LArSoft repositories, improvements to the code can then be used by everyone.

— Katherine Lato & Erica Snider

CERN LArSoft Tutorial Report – November 2016

Introduction

CERN had a one-day LArSoft tutorial designed for those who are joining reconstruction and analysis efforts in Liquid Argon experiments but are new to the LArSoft framework. These sessions were video recorded. The recordings are available along with the presentations – see the links at the end of this write-up.

Organizers of the event would like to acknowledge Marzio Nessi for supporting the event and Audrey Deidda and Harri Toivonen for their significant help with solving organizational issues.

LArSoft tutorial overview

Screen Shot from Video of the day

The participants came from many scientific research fields, not only from the Liquid Argon community, so in the morning session, the Liquid Argon TPC technology was presented. This introduction included discussion of a number of the challenges related to detection technology as well as challenges in  data reconstruction and analysis. This background was followed with a short introduction to the LArSoft framework.

The morning hands-on session was led by Theodoros Giannakopoulos, who gave an introductory talk about the organization of the Neutrino Cluster at CERN. He also covered part of the LArSoft installation and explained how to set up the environment on the Neutrino Cluster machines.

During the hands-on session, most of the participants logged into the Neutrino Cluster nodes using their own laptops with their own favorite operating system and shell. Having computing experts there was very important, especially Theodoros Giannakopoulos who was extremely helpful.

The purpose of the hands-on session was to learn how to run simulation and reconstruction jobs in LArSoft and how to visualize events to inspect the results. As an example, we chose a 2 GeV/c test-beam pion in the ProtoDUNE-SP geometry. Robert Sulej’s slides guided participants during the hands-on session. In many places we pointed to the in-depth materials from the Fermilab LArSoft tutorial and also linked to slides from the YoungDUNE tutorial; e.g. the event display slides were very helpful.

The afternoon session started with a talk by Wesley Ketchum titled “Reconstruction as analysis.” The reconstruction in Liquid Argon is a challenging task, but developing reconstruction algorithms to overcome a problem is also a great source of satisfaction. It is also very important to make progress in the Liquid Argon data analysis. As Wes said, it is  the “process of taking raw data to physically-meaningful quantities that can be used in a physics analysis.” Wes presented an overview of reconstruction algorithms in LArSoft and several use cases from MicroBooNE data analysis experience.

Diagram from Wesley Ketchum’s presentation

Wes prepared examples for the afternoon hands-on session, during which he introduced gallery as a light-weight tool for analyzing simulation and reconstruction results, also appropriate for exploratory work before code is moved into the LArSoft framework. A short introduction was followed by coding examples. The last part of the hands-on session explained how to “write and run your own module, step by step.”

Participants learned about the organization of LArSoft modules and algorithms and their configuration files. The aim of the coding exercise was to access information stored in data products (clusters) and associations between them (hits assigned to clusters) so participants could make histograms of simple variables such as the number of clusters. They also had the opportunity to learn about matching the reconstruction results to the simulation truth information.

Two weeks after the tutorial, several people are still in contact with us, progressing in their understanding of the framework and starting their real development work. The tutorial also reached many members of the dual-phase LArTPC community. This shows the advantage of a detector-agnostic approach, where development effort can be used efficiently. In the end we are all heading towards the same physics goals.


Links to Material

The video from the morning session is available here. It covers the following presentations:

  1. LArTPC detector basics, ProtoDUNE, DUNE goals – Dorota Stefan (CERN/NCBJ (Warsaw PL))
  2. Introduction to LArSoft and DUNE tools – Robert Sulej, (FNAL / NCBJ)
  3. Tutorial part I – Dorota Stefan (CERN/NCBJ (Warsaw PL)) , Wesley Ketchum (Fermi National Accelerator Laboratory) , Robert Sulej (FNAL / NCBJ)

Note: you need a CERN user account to access the cluster and do the exercises using the preinstalled software. It is possible to use a local installation on a laptop instead. See the information at the beginning of the Indico page about the session.

The video from the afternoon session is available here. It covers the following presentations:

  1. Reconstruction as analysis? – Wesley Ketchum (Fermi National Accelerator Laboratory)
  2. Tutorial part II – Wesley Ketchum (Fermi National Accelerator Laboratory) , Robert Sulej (FNAL / NCBJ) , Dorota Stefan (CERN/NCBJ (Warsaw PL))

— Dorota Stefan & Katherine Lato

LArSoft/LArLite integration – September 2016

This summer, I helped create the first Liquid Argon Software (LArSoft) shared algorithm repository. Shared repositories are git-maintained directories of code that do not require the underlying framework or other LArSoft dependencies. By using shared repositories, development of common software can occur both in LArSoft and in a third-party system. One such system is LArLite, a lightweight analysis framework developed by MicroBooNE scientist Kazuhiro Terao. Users of LArLite only need to set up the desired shared repositories to gain access to the code.

I worked on LArSoft/LArLite integration within the Scientific Computing Division (SCD), which provides core services to the multitude of experiments being conducted at Fermilab. By enlisting computer scientists as a separate group that is well connected to the experiments, expert knowledge is focused on providing experiment analysts with an abundance of purpose-built, robust computing tools. I used the groundwork laid by an SCD project finished in the spring of 2016 to support shared repositories in LArSoft.

The computing needs of each experiment are unique, though all share some common elements. For example, many experiments at Fermilab employ art as their event processing framework, producing well-documented and reproducible simulations. Since its creation, the development of LArSoft has been guided by the needs of experiments to provide a sturdy framework for simulation, reconstruction and analysis of particle physics interactions in liquid argon. LArSoft is a collection of algorithms and code that exist in many git repositories, and corresponding products managed by the Unix Product Support (UPS) system. The UPS system provides a simple and safe method of establishing working environments on Unix machines while avoiding dependency conflicts. This system is available on most computing systems provided by Fermilab and can also be easily set up by users on their own systems. UPS is  relied upon for ensuring consistency between software environments. To cater to users’ needs, LArSoft has to be flexible to incorporate useful code developed independently of Fermilab supported machines.

The LArLite code repository was written to allow for liquid argon analysis in a lightweight, Python-friendly system with a fast setup procedure. Development and progress on algorithms have been made in this framework particularly because of its quick setup time on personal computers that don’t have a well-populated UPS area. Many of the core algorithms used in LArLite, however, are either not available in LArSoft or follow a very similar design to those in LArSoft, causing a duplication of effort.

Having two copies of an algorithm exist in different build environments is undesirable as this creates discrepancies and incompatibilities. These discrepancies and incompatibilities become increasingly hard to rectify if changes are made on either side. For this reason, the shared repository must be able to be built with both native build systems. Another issue arises as LArLite was not developed with shared repositories in mind and has no tagged releases. LArLite only has a monitored master branch which may not be compatible with shared repositories in the future. For shared LArLite code to be reliably used within LArLite as well as LArSoft, a stable supported release needed to be created.

Working in the SCD, I created a first demonstration of using shared repositories for merging LArSoft and LArLite by moving the LArLite code GeoAlgo into the shareable repository larcorealg.  Building on the earlier groundwork, I ensured compatibility between the two native build systems, CetMake and GNUmake, and resolved dependency issues by using the UPS system and editing LArLite core files.  After this demonstration, a more complicated  LArLite algorithm, Flash Matching, was moved to a shared repository in coordination with the algorithm developer, Ariana Hackenburg. Here, LArLite services and data products were interchanged for their corresponding and newly shared versions  in LArSoft. An integrated LArSoft/LArLite system benefits all users, as it maintains the setup speed of LArLite to support a rapid development cycle, while allowing experiments like MicroBooNE to officially adopt and implement important algorithms as part of LArSoft with no additional effort. Most importantly, it allows users on both sides to access previously inaccessible algorithms and data.

The introduction of LArLite algorithms into shared git repositories demonstrates the potential for including other third-party software in a similar fashion. Collaborative liquid argon software in shared repositories will require the work and commitment of many physicists. It’s my hope that the effort is made to include more algorithms into shared repositories since it has the potential to improve liquid argon based experiments at Fermilab.

–Luke Simons

LArSoft Workshop Report – August 2016

The 2016 LArSoft Usability Workshop on June 22 and June 23 focused on usability, interfaces and code analysis. Over 50 people registered, filling the rooms. Remote attendance was supported as well. The section titles listed below are links to the material presented, where available.


Table of Contents:
1 – Videos
2 – Opening Remarks
3 – Recent Relevant LArSoft Efforts
4 – Gatekeeping
5 – Pandora
6 – Documentation
7 – SpackDev
8 – FHiCL-file
9 – FHiCL Case Study
10 – gallery
11 – Peer Review
12 – Code Analysis Process
13 – GENIE and Geant4
14 – Performance Profiling
15 – Feedback


During the meeting, a dozen issues were recorded at: https://cdcvs.fnal.gov/redmine/projects/larsoft/issues 
About half of the sessions were recorded. (Not by design. This was the first time we tried to capture video.) Note that these video recordings were not the main point of the workshop, and they have been edited to avoid side conversations at the beginnings and ends of presentations and beeps as people joined the bridge. Feedback is appreciated. Please contact klato@fnal.gov.


Erica Snider – Opening Remarks

The two-day workshop began with a review of usability and the conference by Erica Snider. LArSoft is the users’ code. The Fermilab LArSoft team is here to help and input from the users is critical to what we do. Interruptions were encouraged throughout the workshop.

Panagiotis Spentzouris – Welcome

Although Panagiotis Spentzouris was not able to attend the opening session, during his remarks on the second day, he said, “LArSoft is one of the centerpieces of our strategy for developing and supporting common tools for experiments with experiments.” It’s not an easy model. It requires discipline from the users and SCD support. The work of the experts on the SCD side and the users’ work has made it a success, and it is important to continue along this path.

Steve Brice – Steering Group Welcome

Steve Brice gave the steering group welcome. (You can listen to part of it below, if you like.)

The liquid argon effort involves many detectors in increasingly complex ways with experiments being able to learn from one another. The LArSoft team has achieved enormous success in connecting the different endeavors. Going forward, it will get more difficult. “Keep up the good work. It’s greatly appreciated and widely acknowledged.”

Gianluca Petrillo – Recent relevant LArSoft efforts

Usability is defined by ISO 9241-11 as, “The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.”

After last year’s workshop, LArSoft interviewed a number of users and stakeholders, identified several areas of intervention and collected a list of desirable items.  We selected some of them: examples and use of associations. There is a wiki page listing the new examples, and we’re happy to write more. The examples follow best practices including documentation in Doxygen format. The LArSoft team endorses test writing and can provide help in writing tests. Several questions/comments covered issues about recording tests, how tests evolve, the need for user help to make them better as well as making the code better by extending the interface. The video of the presentation is available here.

Erica Snider – Maintaining balance in LArSoft: gate-keeping, usability, progress

We have to balance the ability to introduce new ideas and new features quickly with writing production-quality code. Users depend on production quality and developers depend on agility. Both of these are important to usability. Discussing code changes at a coordination meeting is necessary for all changes that affect behavior, i.e. most cases.

There are policies, guidelines and standards at all levels: design principles, coding guidelines, git branching model and documentation guidelines. The point of this model is to produce shareable, relatively uniform code that is recent. That’s why we have weekly releases. The focus is on finding consensus across experiments for changes, which requires gate-keeping. We want to find the right balance- enough agility to keep people interested in producing code with enough gate-keeping to keep people using it. Are we in the right spot? The video of the presentation is available here.

John Marshall – Lessons learned from collaborative software development using Pandora

The Pandora multi-algorithm approach to pattern recognition uses large numbers of algorithms (80+), each designed to address specific topologies and gradually build-up a picture of events. It relies on functionality provided by the Pandora Software Development Kit (SDK), documented in EPJC 75:439. Algorithms are structured around a number of key operations and can be written in pseudo-code form; what differs between real algorithms is the precise logic, based on topological information, that determines when to request operations such as merging or splitting Clusters (collections of Hits). Ideas about Event Data Model requirements, developer training, communication and style guides were presented, alongside feedback from Pandora developers. The best features include the ability to quickly test new ideas, that the SDK services can be trusted completely, the simple XML-based configuration and the visual debugging functionality (seeing a problem presented visually can lead to rapid understanding). Things that haven’t worked so well include the build mechanics (a difficulty associated with inclusion in multiple software frameworks) and attempts to provide some globally reusable features (the current geometry model, and some Hit properties, still lean towards their collider-detector usage). The video of the presentation is available here.

Katherine Lato – Documenting LArSoft for ease of use and learning

Good documentation is needed at all levels. For example, when documenting a LArSoft algorithm, it’s important to have a high-level view available at http://larsoft.org/list, which includes a link to a detailed document, like a technical note available to the entire LArSoft community. Comments in the header and in the implementation code should be in a format that enables Doxygen to interpret them. It’s also important to describe important parts in the code. See LArSoft wiki (Redmine) page Code Documentation guidelines for a suggested template to follow. Don’t forget in-line comments in the code for maintainability. The video of the presentation is available here.
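
For example, a Doxygen-friendly header might look like the sketch below; the names are illustrative, not from an actual LArSoft algorithm:

    /// @file   MyClusterAlg.h
    /// @brief  One-line summary of what the algorithm does.
    /// @author A. Physicist (aphysicist@example.org)
    ///
    /// A longer description goes here: what the algorithm consumes and produces,
    /// its assumptions, and a link to the technical note with the details.

    #include <vector>

    struct Hit {};       // illustrative placeholders for real data products
    struct Cluster {};

    /// @brief  Groups hits into clusters.
    /// @param  hits  the input hits to be clustered
    /// @return one Cluster per group of associated hits
    std::vector<Cluster> MakeClusters(std::vector<Hit> const& hits);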

Patrick Gartung – SpackDev: a new development environment for LArSoft

We are looking at new build systems because the current system, which uses environment variables, has problems running on new operating systems and in Linux containers. Spack builds a stack of dependent software packages. It is open source, well documented and community supported, with information available online.

Spack generates environment modules that can be used to set up the environment. Each package build is identified by a different hash. There are still things that need to be worked out. The video of the presentation is available here.

Kyle Knoepfel – Configuration best practices, helpers and FHiCL-file validation

Kyle walked through setting up a FHiCL file, including how to set up a PROLOG, the #include facility and the rules about using it. The directory of a file to be included must be present on the FHICL_FILE_PATH environment variable. Don’t abuse #include. Only #include files that contain only prologs. A common frustration is how to know what parameters to specify for a given module. While looking at the source code is one solution, it would be better to devise a system that documents itself. Kyle demonstrated the ‘art --help’ and ‘art --print-available-modules’ facilities. For the ‘art --print-description’ facility, the allowed configuration is printed to the screen if users provide the appropriate C++ structure. The main point is that easy-to-maintain and understandable code means more time spent on physics instead of debugging coding errors.
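
A rough sketch of that C++ structure (using art’s fhiclcpp validation types; the module and parameter names here are invented for illustration) shows how the allowed configuration gets declared so that ‘art --print-description’ can report it:

    // Rough sketch of FHiCL validation in an art module; names are illustrative.
    #include "art/Framework/Core/EDAnalyzer.h"
    #include "art/Framework/Core/ModuleMacros.h"
    #include "art/Framework/Principal/Event.h"
    #include "fhiclcpp/types/Atom.h"
    #include "fhiclcpp/types/Comment.h"
    #include "fhiclcpp/types/Name.h"

    #include <string>

    namespace example {

      class HitDumper : public art::EDAnalyzer {
      public:
        // The configuration structure: each Atom documents one FHiCL parameter.
        struct Config {
          fhicl::Atom<std::string> hitLabel{fhicl::Name("HitLabel"),
                                            fhicl::Comment("label of the hit producer")};
          fhicl::Atom<double> threshold{fhicl::Name("Threshold"),
                                        fhicl::Comment("minimum charge to keep"), 10.};
        };
        using Parameters = art::EDAnalyzer::Table<Config>;

        explicit HitDumper(Parameters const& p)
          : EDAnalyzer{p}, hitLabel_{p().hitLabel()}, threshold_{p().threshold()} {}

        void analyze(art::Event const&) override {}

      private:
        std::string hitLabel_;
        double threshold_;
      };

    } // namespace example

    DEFINE_ART_MODULE(example::HitDumper)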

Robert Kutschke – A case study in using FHiCL to ensure single points of maintenance

When designing its use of FHiCL, Mu2e set two goals: parameters should have a single point of maintenance and FHiCL that can be run interactively should be runnable on the grid without editing. Rob’s talk showed several FHiCL fragments from actual Mu2e MC production FHiCL files. One technique is to define base configuration fragments and apply deltas to that base; this naturally leads to deep configuration hierarchies. The trick to making an interactive FHiCL file grid-ready is to design the grid scripts and the interactive FHiCL files together – the grid scripts do their job by applying prefixes and postfixes to the interactive FHiCL file. We have never needed to “reach in and edit” a FHiCL file as part of a grid job.

Marc Paterno – Using gallery for data access: with demo and discussion

gallery provides access to event data in art/ROOT files outside the art event-processing framework executable, without the use of EDProducers, EDAnalyzers, etc., and thus without the facilities of the framework (e.g. callbacks from framework transitions, writing of art/ROOT files). The distribution bundle larsoftobj was introduced to give a single-command installation of all the UPS products needed to use gallery to read LArSoft-created art/ROOT files. Installation instructions are at http://scisoft.fnal.gov/scisoft/bundles/larsoftobj/ (look for the newest version and view the HTML file for instructions). Marc provided a demo that read through 100 files and filled three histograms. He showed the code and talked through it, answering questions.
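
A minimal sketch of such a gallery loop (the input-file name and producer label are illustrative, and this assumes the larsoftobj products are set up):

    // Fill a histogram of hit charge from art/ROOT files, outside the art framework.
    #include "gallery/Event.h"
    #include "canvas/Utilities/InputTag.h"
    #include "lardataobj/RecoBase/Hit.h"

    #include "TH1F.h"

    #include <string>
    #include <vector>

    int main()
    {
      std::vector<std::string> const files = {"reco.root"};   // illustrative file name
      art::InputTag const hitTag{"gaushit"};                   // illustrative producer label

      TH1F chargeHist{"charge", "hit integral;ADC;hits", 100, 0., 1000.};

      for (gallery::Event ev{files}; !ev.atEnd(); ev.next()) {
        auto const& hits = *ev.getValidHandle<std::vector<recob::Hit>>(hitTag);
        for (recob::Hit const& hit : hits)
          chargeHist.Fill(hit.Integral());
      }
      chargeHist.SaveAs("charge.root");
    }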

‘Discussion of Ideas’ and ‘Using tips and techniques on your code.’

Several issues were recorded from the afternoon working sections.

Robert Kutschke – Peer review: It’s not just for physics anymore!

This is about things we should have been doing all along. The HEP analysis peer-review process works really well and is similar. Rob covered what we do well with HEP and integration testing, and some lessons learned from the Mu2e reviews, such as that much of the value came from the preparation work for the review. A lesson from the software development community is that many errors are found by the author while preparing for the review. Use the specialists. Since much of the value is in the preparation, that means having deadlines, profiling and a prep presentation.

Erica Snider – LArSoft code analysis process

Writing good code is a process, and things would get better if we reviewed our code. We want the code analysis process to be as light-weight as possible for the situation and to be performed collaboratively with the code author(s). Erica went through the five basic steps of a review and the recent example of the PMA analysis. LArSoft hopes to create a culture that seeks out code analysis. Thanks to Mike Wallbank and Bruce Baller for volunteering their code for the afternoon code analysis during the workshop.

Robert Hatcher – LArSoft use of GENIE and Geant4

If you’re going to generate events with GENIE, read https://cdcvs.fnal.gov/redmine/projects/nutools/wiki/GENIEHelper. “It will save you a world of hurt.” The improvement in usability and performance of the interface to Geant4 needs to be a cooperative effort between the PDS group and LArSoft. The video of the presentation is available here.

Christopher Jones – You too can do performance profiling

Types of profiling include timing and memory. The tool igprof can do both. Chris covered how to use igprof. The video of the presentation is available here.

Code and performance analysis working groups

Two groups focused on analyzing two code examples volunteered by their authors: the TrajCluster algorithm (Bruce Baller) and the BlurredCluster algorithm (Michael Wallbank).

Feedback

Comments include:

  • I enjoyed talking about LArSoft and the future of the project. I was very encouraged by all that was said and feel exciting times lie ahead for the software! Also enjoyed the code analysis and discussion of new feature of art/fhicl.
  • The LArSoft steering group welcome set a strong, positive tone for the workshop.
  • It was better than expected. I supposed it was something more introductory but actually it was oriented to people who wants to contribute to LArSoft and not only be users. So it is very helpful for my work.
  • It was a little more advanced/high-level than I’d anticipated, but very useful.
  • It did not have as many overview talks as I expected. And there seemed to be few talks by the experiments.
  • Useful how-to instructions on slides. In the code review section, Mike Wallbank was cutting and pasting igprof commands from Chris Jones’s highly useful slides.
  • I was able to discuss with the experts about the issues I didn’t understand or what I thought it was missing.

Thanks to all who presented and/or participated!

Opening the box: event reconstruction using Pandora – June 2016

by  Jack Weston for the Pandora Team, Cavendish Laboratory, University of Cambridge

The Pandora multi-algorithm approach to automated pattern recognition affords a powerful framework for event reconstruction in fine-granularity detectors, such as LAr TPCs. 

With its origins at the proposed International Linear Collider (ILC) experiment, the Pandora project began in 2007 as a particle flow calorimetry algorithm; that is, an algorithm that reconstructs the paths of individual particles, taking advantage of high-granularity tracking detectors and calorimeters. Since then, Pandora has grown into a flexible software framework for solving generic pattern recognition problems that involve points in time and space, particularly suited to the “photographic quality” images produced by LAr TPCs.

Behind Pandora’s approach to pattern recognition is the multi-algorithm paradigm: the idea that each particular topology can be addressed by one or more relatively straightforward, self-contained algorithms. Under this paradigm, solving a complex pattern recognition problem, such as reconstructing particles in a LAr TPC (Figure 1), requires a sizable chain of such algorithms, whose order of execution is able to exhibit data-dependent nonlinearity and recursion. Some algorithms are very sophisticated, others rather simple; together, they gradually build up a picture of the events. This approach marks a significant departure from the traditional practice of single algorithms for shower finding and track fitting, for example, and promises to provide a robust way of tackling intricate pattern recognition problems. It also affords a rich development environment since many algorithms, each addressing a given topology in a different way, can be safely bound together and work in harmony. This approach is able to provide a more sophisticated and accurate reconstruction than would be reasonably achievable by writing one algorithm alone.

Figure 1: A 3D Pandora reconstruction, projected into the w-view. The input hits are from a Monte Carlo simulation of a 0.8 GeV charged-current νμ event with resonant π0 production. The event shows a proton (grey), a muon (red), and two photons (green, blue). The event is visualised using the Pandora event display, where the x axis represents a distance derived from drift time and the w axis a distance derived from the number of the wire recording the ionisation signal.

Figure 2: The Pandora reconstruction output in the LArSoft Event Data Model (EDM). The PFParticle object is associated with the 3D hits, clusters, tracks, seeds and vertices of the particle it represents. In addition, PFParticle parent-daughter links allow representation of a full particle hierarchy.

In a LAr TPC, the readout provides three 2D images of the event within the active detector volume. Each image shares a common coordinate, derived from the drift time. The second coordinate is derived from the number of the wire recording the ionisation signal in a given plane. The reconstruction begins with cautious, track-oriented 2D clustering before using a series of topological-association algorithms that conservatively merge and split the 2D proto-clusters. Considering pairs of 2D clusters, an extensive list of possible 3D vertex positions is produced. A score is calculated for each candidate vertex and the best one is chosen. The 3D vertex position can then be projected back into the readout planes and used, for example, to split the 2D clusters at the projected vertex position if required. To reconstruct 3D tracks, Pandora uses track-matching algorithms to exploit the time-coordinate-overlap between all possible groupings of 2D candidate clusters in different planes to predict the position of the cluster in the third plane. The time-overlap span, the proportion of matching sampling points and a chi-squared value are all neatly stored in a rank-three tensor, where the three indices are the clusters in each of the three views. The tensor is interrogated by a series of algorithm tools which identify any matching ambiguities and make careful changes to the 2D clusters until the tensor is diagonal and the cluster combinations are unambiguous. These matches are stored in ‘particles’, which provide a convenient means for collecting together objects reconstructed in the three readout planes.

Showers can then be reconstructed (in 2D) by attempting to add branches to long clusters that represent shower spines—the showers are grown outwards from a spine by identifying branches, then branches-of-branches, and so on. To synthesize the 2D shower information into 3D showers, the tensor ideas can be re-used: the fitted shower envelopes for each pair of views are used to predict an envelope in the third view, and the enclosed hit fraction in this view is stored in a tensor, along with other cluster-matching properties. An analogous process of tensor-diagonalisation then yields the best 3D matches. Finally, 3D hits are created and the track, shower and vertex information is brought together to produce a full particle hierarchy, where each daughter particle comprises metadata, a list of 2D clusters, a 3D cluster, a 3D interaction vertex, and a list of any further daughter particles (Figure 2).

Figure 3: Pseudocode demonstrating procedures for creating and merging clusters and their associated API calls. This logic is almost identical between algorithms, regardless of the pattern recognition problem.

Such a multi-algorithm approach poses a significant software challenge. The Pandora Software Development Kit provides a software environment designed specifically for this purpose: it offers the basic building-blocks for abstracting high-level structure from a collection of points, such as clusters, as well as the means for users to adapt these building-blocks to suit their problem. It also provides the framework for running the complex, nonlinear chain of algorithms required and automates the handling of the building-blocks, allowing developers’ algorithms to focus on physics rather than technicalities like memory management. This facilitates rapid algorithm development. Instances of the objects in the Event Data Model (EDM) are owned by Pandora Managers, which are responsible for object and named list lifetimes. These Managers automate a complete set of low-level operations that facilitate the high-level operations required by pattern recognition algorithms. The algorithms contain the step-by-step instructions for finding patterns in the input data and use APIs to modify, create or destroy objects and lists. Much of the technical difficulty in writing highly object-oriented algorithms like these is alleviated by Pandora’s provision of APIs able to access and manipulate objects in the Pandora EDM, such as getting a named list of hits or creating new clusters. These APIs can be used in algorithms to interact with Managers and provide common low-level logic in a concise way. Examples of the API calls that would be encountered in algorithms for creating and merging clusters are shown in Figure 3. Such procedures are common to all pattern recognition problems, differing only in the logic that determines the ‘best’ clusters, which is specific to the problem at hand. Pandora also contains a ROOT-based event display that allows the navigation of particle hierarchies, 2D and 3D hits, clusters and vertices, as well as providing invaluable visual debugging possibilities within algorithms.

Pandora works in conjunction with LArSoft by providing the ideal framework for the pattern recognition step. An art producer module creates Pandora instances, and configures and registers algorithms. For each event, hit information is passed from LArSoft into Pandora and, at the end, the Pandora reconstruction output, in the form of PFParticles, is passed back to LArSoft. Whilst LArSoft remains the core framework for the simulation and reconstruction process, this multi-algorithm reconstruction cannot be easily realised using LArSoft alone: Pandora provides the tools necessary for the job, as well as an environment conducive to developing algorithms under this paradigm. All interaction between LArSoft and Pandora is handled by the LArSoft module LArPandora, which serves as their mutual interface as well as translating Pandora’s inputs and outputs into the required formats (Figure 4). This module depends only on the library LArPandoraContent which, in turn, depends only on the Pandora SDK and visualisation libraries.

Figure 4: Diagram indicating the structure of the Pandora packages in LArSoft. This includes the LArSoft module LArPandora, the LArPandoraContent library and the SDK and monitoring libraries. Note that monitoring library has a ROOT dependency. The SDK and monitoring library provide a framework that is independent of the pattern recognition problem to be solved: all the LAr TPC reconstruction logic resides in the LArPandoraContent library. The LArPandora module provides a LArSoft/Pandora interface for translating Pandora’s inputs and outputs.

At MicroBooNE, two different algorithm chains are used: PandoraCosmic, optimised for the reconstruction of cosmic rays and their daughter delta rays, and PandoraNu, optimised for the reconstruction of neutrino events. By running the PandoraCosmic chain first, cosmic rays can be identified and removed, providing the input for the PandoraNu reconstruction. Starting by manipulating the three sets of hits in 2D before synthesising these into 3D structures, Pandora gradually builds up a picture of the event through clustering, vertexing, and track and shower reconstruction. The full reconstruction currently consists of more than 80 different algorithms that, together, abstract from the input 2D hits an information-rich 3D particle hierarchy, including information like particle types, vertices and directions.

At the upcoming DUNE experiment, the detector will comprise multiple LAr TPC detector volumes, each of which will provide two or three 2D images. Events may cross the boundaries between volumes, posing a new challenge for the reconstruction. To solve this problem, a Pandora instance is created for each volume, which provides a MicroBooNE-like reconstruction of that detector region. Bringing together these independent reconstructions, logic is then required to decide which particles to stitch together across gaps—and to determine which particle is the ‘parent’. While algorithm re-optimisation to suit the different beam spectrum is required, the reusable Pandora interfaces ensure that MicroBooNE developments are directly applicable to DUNE.

Pandora provides a flexible development platform for developing, visualising and testing pattern recognition algorithms. The LArPandoraContent library offers an advanced and continually improving reconstruction of cosmic ray and neutrino-induced events in LAr TPCs and is used by the MicroBooNE and DUNE experiments. Pandora’s GitHub page can be found at https://github.com/PandoraPFA, with an accompanying paper describing the SDK on arXiv: http://arxiv.org/pdf/1506.05348v2.pdf (EPJC 75:439).

The Pandora Team. Photo courtesy Boruo Xu.