finemo
Fi-NeMo: Finding Neural Network Motifs.
A GPU-accelerated motif instance calling tool for identifying transcription factor binding sites from neural network contribution scores.
Fi-NeMo implements a competitive optimization approach using proximal gradient descent to identify motif instances by solving a sparse linear reconstruction problem. The algorithm represents contribution scores as weighted combinations of motif contribution weight matrices (CWMs) at specific genomic positions.
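The reconstruction idea described above can be illustrated with a small NumPy sketch. This is not Fi-NeMo's implementation (which runs batched on the GPU with PyTorch). It is a toy, dense version of non-negative sparse coding solved by plain proximal gradient descent (ISTA), assuming a single region, one-hot toy CWMs, and a single scalar penalty. The names `one_hot`, `build_dictionary`, `call_hits_ista`, `lam`, and `n_steps` are hypothetical and chosen only for illustration.

```python
import numpy as np


def one_hot(seq, alphabet="ACGT"):
    """Toy stand-in for a CWM: a (4, width) one-hot matrix."""
    out = np.zeros((len(alphabet), len(seq)))
    for i, base in enumerate(seq):
        out[alphabet.index(base), i] = 1.0
    return out


def build_dictionary(cwms, length):
    """Stack every (motif, position) placement of a CWM as one column."""
    n_motifs, n_bases, width = cwms.shape
    n_pos = length - width + 1
    atoms = np.zeros((n_bases * length, n_motifs * n_pos))
    for m in range(n_motifs):
        for p in range(n_pos):
            placed = np.zeros((n_bases, length))
            placed[:, p:p + width] = cwms[m]
            atoms[:, m * n_pos + p] = placed.ravel()
    return atoms, n_pos


def call_hits_ista(contribs, cwms, lam=0.1, n_steps=2000):
    """Minimize 0.5 * ||contribs - recon||^2 + lam * sum(coef), coef >= 0."""
    atoms, n_pos = build_dictionary(cwms, contribs.shape[1])
    target = contribs.ravel()
    # Step size 1 / L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(atoms, 2) ** 2
    coef = np.zeros(atoms.shape[1])
    for _ in range(n_steps):
        grad = atoms.T @ (atoms @ coef - target)
        # Proximal step: shrink by the L1 penalty, then clip at zero.
        coef = np.maximum(coef - step * (grad + lam), 0.0)
    return coef.reshape(cwms.shape[0], n_pos)


# Toy data: two motifs planted at known positions in a 30 bp region.
cwms = np.stack([one_hot("ACGTA"), one_hot("TTGCA")])
contribs = np.zeros((4, 30))
contribs[:, 10:15] += 2.0 * cwms[0]  # motif 0 at position 10
contribs[:, 20:25] += 1.5 * cwms[1]  # motif 1 at position 20

coefs = call_hits_ista(contribs, cwms)
# Keep placements with a clearly non-zero coefficient (threshold is arbitrary).
hits = [(int(m), int(p)) for m, p in np.argwhere(coefs > 0.5)]
print(hits)
```

In this sketch the L1 penalty is what makes placements compete: overlapping or similar motifs cannot all explain the same contribution scores without paying the penalty more than once, so only the best-fitting placements keep non-zero coefficients.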
Key Features
- GPU-accelerated hit calling using PyTorch
- Support for multiple input formats (bigWig, HDF5, TF-MoDISco)
- Competitive motif instance assignment
- Comprehensive evaluation and visualization tools
- Post-processing utilities for hit refinement
Modules
- hitcaller : Core Fi-NeMo algorithm implementation
- data_io : Data input/output utilities
- main : Command-line interface
- evaluation : Performance assessment tools
- visualization : Plotting and report generation
- postprocessing : Hit refinement and analysis
Examples
Basic hit calling workflow:
>>> import numpy as np
>>> import finemo
>>> from finemo import data_io, hitcaller
>>>
>>> # Load preprocessed data
>>> sequences, contribs, peaks_df, has_peaks = data_io.load_regions_npz('regions.npz')
>>> cwms, trim_masks = data_io.load_motif_cwms('motifs.h5')
>>>
>>> # Call hits
>>> hits_df, qc_df = hitcaller.fit_contribs(
...     cwms=cwms,
...     contribs=contribs,
...     sequences=sequences,
...     cwm_trim_mask=trim_masks,
...     use_hypothetical=False,
...     lambdas=np.array([0.7] * len(cwms)),
...     step_size_max=3.0,
...     step_size_min=0.08,
...     sqrt_transform=False,
...     convergence_tol=0.0005,
...     max_steps=10000,
...     batch_size=1000,
...     step_adjust=0.7,
...     post_filter=True,
...     device=None,
...     compile_optimizer=False
... )
See Also
TF-MoDISco : https://github.com/jmschrei/tfmodisco-lite
BPNet : https://github.com/kundajelab/bpnet-refactor
ChromBPNet : https://github.com/kundajelab/chrombpnet
from . import data_io
from . import hitcaller
from . import evaluation
from . import visualization
from . import postprocessing
from . import main

__all__ = [
    "data_io",
    "hitcaller",
    "evaluation",
    "visualization",
    "postprocessing",
    "main",
]