- Algorithm or Service name
- Image Pattern
- Author or Maintainer
- Robert Sulej
- one line description
- Set of algorithms, modules and scripts performing pattern recognition in 2D images made of wire ADC waveforms. The algorithms are based on convolutional neural networks (CNNs). There are tools for data preparation and for running the network in the training and inference modes.
The package consists of the following components:
1) TrainingDataAlg algorithm class and PointIdTrainingData/PointIdTrainingNuevent modules
These dump wire ADC waveforms to text files, which are used to prepare training sets for various applications. Waveforms are downscaled with a configurable factor and downscaling method. Currently the ADC waveform is deconvoluted and amplitudes are corrected for attenuation due to the electron lifetime (an option to skip these steps is to be added). Three files are produced for each event:
a) .raw file containing the downscaled ADC waveform; each line in the file corresponds to one wire;
b) .pdg file with PID/vertex MC truth information, with entries corresponding to those in the .raw file; the PID information is the PDG code of the particle that deposited the most energy in the “pixel”; the PID is saved in the 2 lower bytes and the vertex information in the 2 higher bytes; vertex flags are described in larreco/RecoAlg/ImagePatternAlgs/PointIdAlg/PointIdAlg.h;
c) .deposit file with MC truth energy deposits projected onto the 2D plane, with entries also corresponding to those in the .raw file.
The information in these files should be generic enough to allow development of a wide range of CNN applications.
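The byte layout of the .pdg entries described above can be illustrated with a short Python sketch. It assumes that each entry is stored as a whitespace-separated decimal integer holding a 32-bit word; that text layout, the function names and the sample values are hypothetical. How the lower 16 bits encode the PDG code in detail (for example, the sign) and what the individual vertex flags mean is defined in PointIdAlg.h and is not covered here.

```python
def split_pdg_entry(value):
    """Split one 32-bit .pdg entry into (pid_bits, vertex_flags)."""
    value &= 0xFFFFFFFF          # treat the entry as an unsigned 32-bit word
    pid_bits = value & 0xFFFF    # 2 lower bytes: particle ID information
    vertex_flags = value >> 16   # 2 higher bytes: vertex flags
    return pid_bits, vertex_flags


def parse_pdg_line(line):
    """Decode all entries of one line (one wire), assuming decimal integers."""
    return [split_pdg_entry(int(token)) for token in line.split()]


# Made-up sample line: 65549 == (1 << 16) | 13
example_line = "13 65549 0"
print(parse_pdg_line(example_line))  # [(13, 0), (13, 1), (0, 0)]
```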
2) prepare_data_cnn_* scripts
Python scripts for training set preparation. Currently the CNN is used to classify a point in the image on the basis of the surrounding patch. The scripts create such patches from the files produced in 1), selecting balanced amounts of the interesting classes and background. The outputs are .npy files with the CNN input and the desired output.
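The patch-based preparation described above can be sketched with NumPy as follows. This is only an illustration under stated assumptions, not the code of the prepare_data_cnn_* scripts: the function names, the patch size, the zero padding at the image edges, the class labels and the toy image are all hypothetical.

```python
import numpy as np


def extract_patch(image, wire, drift, half_size):
    """Return the square patch centred on (wire, drift), zero-padded at the edges."""
    padded = np.pad(image, half_size, mode="constant")
    size = 2 * half_size + 1
    # Pixel (wire, drift) sits at (wire + half_size, drift + half_size) after
    # padding, so the patch starts at (wire, drift) in padded coordinates.
    return padded[wire:wire + size, drift:drift + size]


def balanced_points(labels, n_per_class, rng):
    """Pick up to n_per_class random pixel positions for each label value."""
    selected = []
    for cls in np.unique(labels):
        positions = np.argwhere(labels == cls)
        n = min(n_per_class, len(positions))
        chosen = positions[rng.choice(len(positions), size=n, replace=False)]
        selected.extend((int(w), int(d), int(cls)) for w, d in chosen)
    return selected


rng = np.random.default_rng(0)
image = rng.normal(size=(20, 30)).astype(np.float32)  # stand-in for a .raw image
labels = np.zeros((20, 30), dtype=np.int32)            # 0 = background
labels[5:8, 10:20] = 1                                 # toy "track" region
labels[12:15, 3:9] = 2                                 # toy "shower" region

points = balanced_points(labels, n_per_class=10, rng=rng)
x = np.stack([extract_patch(image, w, d, half_size=4) for w, d, _ in points])
y = np.array([cls for _, _, cls in points])
# x and y could then be written with np.save(...) as the CNN input and target.
```

Sampling the same number of points per class keeps the abundant background pixels from dominating the training set.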
3) train_cnn_* scripts
Scripts that perform CNN training, based on the Keras toolkit. Model and weights files are produced.
4) Model conversion script
Script in larreco/RecoAlg/ImagePatternAlgs/Keras, used to convert the CNN model and weights files into a single text file which can be used to run the network in the inference mode back in the LArSoft module. This was the quickest solution to develop (though not the fastest to execute) for making CNN results available in LArSoft's reconstruction flow; in the future it should be replaced with more efficient, vectorized code.
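The idea of storing a network in a single text file can be illustrated with the round trip below. The text layout used here (a layer count, then one header line and one line of values per layer) is invented for this sketch and is not the format written by the actual conversion script; the function names and the toy layers are hypothetical as well.

```python
import io

import numpy as np


def write_layers(stream, layers):
    """Write (name, weights) pairs as text; names must not contain spaces."""
    stream.write(f"layers {len(layers)}\n")
    for name, weights in layers:
        shape = " ".join(str(n) for n in weights.shape)
        stream.write(f"{name} {weights.ndim} {shape}\n")
        # repr() of a Python float round-trips exactly.
        stream.write(" ".join(repr(float(v)) for v in weights.ravel()) + "\n")


def read_layers(stream):
    """Read back the layers written by write_layers."""
    n_layers = int(stream.readline().split()[1])
    layers = []
    for _ in range(n_layers):
        header = stream.readline().split()
        name, ndim = header[0], int(header[1])
        shape = tuple(int(n) for n in header[2:2 + ndim])
        values = np.array([float(v) for v in stream.readline().split()])
        layers.append((name, values.reshape(shape)))
    return layers


rng = np.random.default_rng(1)
toy_layers = [("dense_w", rng.normal(size=(3, 4))), ("dense_b", rng.normal(size=(4,)))]

buffer = io.StringIO()
write_layers(buffer, toy_layers)
buffer.seek(0)
restored = read_layers(buffer)
```

A plain-text file of this kind is easy to parse from C++ without extra dependencies, at the cost of file size and loading speed.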
5) PointIdAlg class and PointIdEffTest / EmTrackClusterId modules
Algorithm and modules that run a pre-prepared CNN model in the inference mode. The application is the distinction between EM shower-like and track-like signals. The EmTrackClusterId module takes the output of clustering algorithms as input and produces a new collection of clusters, which contains only clusters recognized as EM shower-like activity. Modules with a similar structure will be added for vertex and decay point identification.
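The cluster selection described above can be pictured with the following sketch. It assumes, purely for illustration, that the CNN yields an EM-likeness score between 0 and 1 for each hit and that a cluster is kept when the mean score of its hits exceeds a threshold; the actual decision rule in EmTrackClusterId may differ, and the function name, scores and threshold are hypothetical.

```python
def select_em_clusters(clusters, em_score, threshold=0.5):
    """Keep clusters whose mean per-hit EM score exceeds the threshold."""
    selected = []
    for hits in clusters:
        if not hits:
            continue  # skip empty clusters, there is nothing to classify
        mean_score = sum(em_score(hit) for hit in hits) / len(hits)
        if mean_score > threshold:
            selected.append(hits)
    return selected


# Made-up per-hit scores standing in for CNN outputs.
scores = {"h1": 0.9, "h2": 0.8, "h3": 0.1, "h4": 0.2, "h5": 0.7}
clusters = [["h1", "h2"], ["h3", "h4"], ["h5"]]

em_like = select_em_clusters(clusters, scores.get)
print(em_like)  # [['h1', 'h2'], ['h5']]
```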
- location in code
- LArReco repository larreco/RecoAlg/ImagePatternAlgs
- code analysis done
- improved code released