v0.13.0 (March - May 2025)

1. New Features

1.1 Core Functionality Expansion

  • Integration of PRS (Progressive Relaxation Sampling) Algorithm
    A new PRS analysis module is added, supporting progressive relaxation sampling calculations for trajectories, including mean calculation, coordinate repetition detection, and RMSD (Root Mean Square Deviation) analysis. It leverages numpy and MDAnalysis for efficient computations. (feat(algorithm): add PRS calculation functionality)
  • PCA (Principal Component Analysis) Functionality
    Implemented PCA for trajectory data in md_ana.py, enabling dimensionality reduction and scatter plot matrix visualization. A new pca class and related command - line options are introduced. (feat(md_ana): add PCA analysis)
  • Parallel Processing and Chain Selection
  • Added an n-workers parameter for parallel computing in Ramachandran and Janin plot analyses, and supported specifying chain IDs for targeted analysis. (feat(md_ana): add parallel processing and chain selection)
  • The md_task module now supports multi-chain selection, with updates to the filter_atoms_by_chains function and chain handling logic in related classes. (feat(md_task): allow multiple chains selection)

1.2 Enhanced Interaction Analysis

  • Bond Length Scoring Function
    Introduced the bond_length_score function, which calculates interaction scores based on bond length. Updated functions like merge_interaction_df to adopt this new scoring method. (feat(pml): add bond length scoring function)
  • Output Styles and File Extensions
  • Added a ligand output style for interaction analysis to format results from the ligand's perspective. (feat(ana_interaction): add ligand output style)
  • Introduced a suffix parameter to customize the suffix of output file names, facilitating the differentiation of results from multiple analysis runs. (feat(ana_interaction): add suffix option)

1.3 Algorithm and Tool Improvements

  • Enhanced RMSD Calculation
  • Added return_rot and return_rmsd options to the fit_to and batch_fit_to functions to support returning rotation matrices and RMSD values. (feat(rms): add return_rot option)
  • Supported torch and cuda backends for batch RMSD calculations and multi-coordinate alignment, improving computational efficiency. (feat(algorithm): add backend support for RMSD)
  • GMX Tool Integration
  • Added an -o parameter to specify the output file name in gmx_MMPBSA analysis and automatically included the -nogui option. (feat(ana_gmx): add output file name option)
  • Improved argument parsing, such as for the batch directory parameter in run_gmx and the source file type in distribute_file. (fix(run_gmx): update argument parsing)

2. Improvements and Optimizations

2.1 Code Refactoring and Performance Optimization

  • 🚀 GNM Module Optimization
  • Refactored functions like generate_matrix, introduced backend selection (numpy/torch), and optimized matrix calculation and SVD (Singular Value Decomposition) processes. (refactor(gnm): optimize generate_matrix)
  • Replaced all_pairs with valid_pair to reduce invalid computations and improve the performance of generate_close_matrix. (perf(gnm): optimize generate_close_matrix)
  • 🚀 Data Processing Optimization
  • Moved the PDB string conversion logic outside the trajectory loop in ana_gmx and introduced the FakeAtomGroup class for preprocessing, enhancing interaction analysis efficiency. (refactor(gmx): optimize PDB conversion)
  • Optimized the visualization of the plot_PDF function by enhancing label and color bar readability. (fix(ana_gmx): enhance plot labels)

2.2 Architecture and Compatibility

  • 🔄 Dependency Updates
  • Globally replaced mbapy with mbapy_lite and updated related import paths and function calls. (refactor(lazydock): update imports to use mbapy_lite)
  • Increased the chain ID length limit to 16 characters to support chain identification in complex complexes. (refactor(gmx): increase chainID length limit)
  • 🔄 Parallel and Resource Management
  • Added a timeout parameter (default 1 second) to pool.close() to prevent process deadlocks. (refactor(ana_gmx): add timeout parameter)
  • Optimized thread pool management to automatically close the thread pool after analysis completion. (fix(lazydock/scripts/md_ana.py): close thread pool)

3. Bug Fixes

3.1 Critical Function Fixes

  • 🐛 Data Validation and Exception Handling
  • Added error handling for network calculation results in md_task to prevent the processing of invalid data. (fix(md_task): add error handling for network results)
  • Fixed the time point selection error in interaction analysis in ana_gmx and handled empty interaction lists to avoid index out - of - bounds errors. (fix(ana_gmx): correct time point selection)
  • 🐛 File and Path Issues
  • Checked the existence of files like .tpr and .xtc to avoid crashes due to missing files during analysis. (fix(ana_gmx): add existence checks for files)
  • Corrected the group index calculation in gmx_MMPBSA to ensure the correct transmission of atom group ranges. (fix(mmpbsa): correct group index calculation)

3.2 Logical and Computational Errors

  • 🐛 Unit and Numerical Calculations
  • Fixed the isothermal compressibility calculation by using trajectory slices instead of single frames to improve accuracy. (fix(md_traj): correct isothermal compressibility)
  • Corrected the time unit conversion logic to ensure the correct normalization of plot_df. (fix(ana_gmx): correct time unit conversion)
  • 🐛 Argument Parsing and Function Calls
  • Fixed variable assignment errors for box dimensions in the pml_plugin and unified parameter naming. (fix(pml_plugin): correct variable assignment)
  • Added support for 2D reference coordinates in align.py and improved shape checking and error prompts. (fix(gmx/mda/align.py): support 2D ref_coords)

4. Documentation and Toolchain

4.1 Documentation Updates

  • 📚 Improved Installation Guide
  • Updated the installation section in the README, corrected the "install docs" link to "How to Install", and added dependency instructions for OpenBabel and gmx_MMPBSA. (docs(README): update installation guide)
  • Supplemented the conda installation command for gmx_MMPBSA and reminded users of potential library conflicts. (docs(install): update gmx_MMPBSA command)
  • 📚 Code Annotations and Docstrings
  • Followed the Google documentation specification to update the parameter descriptions and return value descriptions of RMSD - related functions. (docs(lazydock): update algorithm/rms.py docs)

4.2 Tool and Script Optimization

  • 🛠️ Default Parameters and Output
  • Set default names for common files (e.g., --main-name='md.tpr') and updated help information. (feat(scripts): add default file names)
  • Changed the PCA output format from Excel to CSV to improve compatibility. (refactor(lazydock): update PCA output to csv)
  • 🛠️ Enhanced Test Coverage
  • Expanded RMSD test cases to cover batch calculations, multiple backends (numpy/torch/cuda), and boundary conditions. (test(rms): expand test coverage)
  • Added unit tests to verify the correctness of PRS and vectorized sliding average functions. (test(algorithm): add unit tests)

5. Compatibility Notes

  • 🔥 Breaking Changes
  • Removed the conflicts parameter from the simple_analysis class and the unused gro-name parameter. (refactor(scripts): remove conflicts argument)
  • Renamed the generate_matrix function to generate_valid_pairs, requiring updates to function calls. (refactor(gnm): rename generate_matrix)