GMSO: Flexible storage of chemical topology for molecular simulation¶
This is the documentation for GMSO, the General Molecular Simulation Object. It is part of MoSDeF, the Molecular Simulation Design Framework.
Design Principles¶
Scope and Features of GMSO¶
GMSO is designed to enable the flexible, general representation of chemical topologies for molecular simulation. Efforts are made to enable lossless, bias-free storage of data, without assuming particular chemistries or models and without using any particular engine's ecosystem as a starting point. The scope is generally restricted to the preparation, manipulation, and conversion of input files for molecular simulation, i.e. before engines are called to execute the simulations themselves. GMSO currently does not support conversions between trajectory file formats for analysis codes. In the scope of molecular simulation, we loosely define a chemical topology as everything needed to reproducibly prepare a chemical system for simulation. This includes particle coordinates and connectivity, box information, force field data (functional forms, parameters tagged with units, partial charges, etc.) and some optional information that may not apply to all systems (e.g. the element associated with each particle).
GMSO enables the following features:
- Supporting a variety of models in the molecular simulation/computational chemistry community: No assumptions are made about whether an interaction site represents an atom or a bead; atomistic, united-atom/coarse-grained, polarizable, and other models are all supported!
- Greater flexibility for exotic potentials: The AtomType class (and analogous classes for intramolecular interactions) uses sympy to store any potential that can be represented by a mathematical expression. If you can write it down, it can be stored!
- Easier development of glue to new engines: Because GMSO is not designed for compatibility with any particular molecular simulation engine or ecosystem, it becomes more tractable for developers in the community to add glue for engines that are not currently supported (and even ones that do not exist at present)!
- Compatibility with existing community tools: No single molecular simulation tool will be a silver bullet, so GMSO includes functions to convert objects. These can be used in their own right to convert between objects in-memory and also to support conversion to file formats not natively supported at any given time. Currently supported conversions include ParmEd, OpenMM, mBuild, and MDTraj, with others coming in the future!
- Native support for reading and writing many common file formats (XYZ, GRO, TOP, LAMMPSDATA) and indirect support, through other libraries, for many more!
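As a rough illustration of what storing a potential as a mathematical expression means, the sketch below uses sympy directly. It does not use GMSO's AtomType API; the variable names and parameter values are assumptions chosen only for this example.

```python
import sympy

# Symbols: r is the independent variable; epsilon and sigma are parameters.
r, epsilon, sigma = sympy.symbols("r epsilon sigma", positive=True)

# The 12-6 Lennard-Jones potential written as a symbolic expression.
lj = 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)

# The expression itself records which parameters it depends on.
parameter_names = sorted(str(s) for s in lj.free_symbols - {r})

# Substituting (illustrative) parameter values leaves a function of r only,
# which sympy can turn into a plain numeric callable.
energy = sympy.lambdify(r, lj.subs({epsilon: 0.5, sigma: 0.3}))

# The force follows from symbolic differentiation of the same expression.
force = -sympy.diff(lj, r)
```

Because the functional form is data rather than hard-coded logic, a different potential only requires a different expression, which is the flexibility the feature list describes.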
Structure of GMSO¶
There are three main modules within the Python package:
- gmso.core stores the classes that constitute the core data structures.
- gmso.formats stores readers and writers for (on-disk) file formats.
- gmso.external includes functions that convert core data structures between external libraries and their internal representation.
Data Structures in GMSO¶
The following data structures are available within GMSO.
Core Classes¶
- gmso.Topology
- gmso.SubTopology
- gmso.Atom
- gmso.Bond
- gmso.Angle
- gmso.Dihedral
- gmso.Improper
- gmso.AtomType
- gmso.BondType
- gmso.AngleType
- gmso.DihedralType
- gmso.ImproperType
Topology¶
SubTopology¶
Atom¶
Bond¶
Angle¶
Dihedral¶
Improper¶
ForceField¶
Formats¶
This submodule provides readers and writers for (on-disk) file formats.
GROMACS¶
The following methods are available for reading and writing GROMACS files.
GSD¶
The following methods are available for reading and writing GSD files.
xyz¶
The following methods are available for reading and writing xyz files.
LAMMPS DATA¶
The following methods are available for reading and writing LAMMPS data.
External¶
This submodule includes functions that convert core data structures between external libraries and their internal representation.
Installation¶
Installing dependencies with conda¶
Dependencies of GMSO are listed in the file requirements.txt. They can be installed in one line:
$ conda install -c omnia -c mosdef -c conda-forge --file requirements.txt
Alternatively, you can add all the required channels to your .condarc file and then install the dependencies:
$ conda config --add channels omnia
$ conda config --add channels mosdef
$ conda config --add channels conda-forge
$ conda install --file requirements.txt
Note
These commands will likely change a configuration file on your computer and may affect the installation of packages in other projects you are working on. However, the recommended channel priority is fairly common (in particular, conda-forge having the highest priority) and should work well for most installations.
Installing dependencies with pip¶
$ pip install -r requirements.txt
Note
Compared to the conda installation, this is less tested. Some upstream dependencies may not be available on PyPI but can be installed from source or via conda.
Install an editable version from source¶
Once all dependencies are installed, GMSO itself can be installed. It is currently only available through its source code. It will be available through pip and conda in the future.
$ git clone https://github.com/mosdef-hub/gmso.git
$ cd gmso
$ pip install -e .
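One quick way to confirm that the editable install is visible to your interpreter is to ask Python whether the package can be located. The following is a minimal sketch using only the standard library; the helper name is_importable is hypothetical and not part of GMSO.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located by this interpreter."""
    # find_spec returns None when no module of that name is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Expected to report True once the editable install has succeeded.
    print("gmso importable:", is_importable("gmso"))
```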
Supported Python Versions¶
Python 3.7 is the recommended version for users. It is the only version on which development and testing consistently take place. Older (3.6) and newer (3.8+) versions of Python 3 are likely to work, but no guarantee is made and some dependencies may not be available for other versions. No effort is made to support Python 2 because it is considered obsolete as of early 2020.
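The policy above can be expressed as a small check, for example at the top of a setup or launcher script. This is only a sketch: the function name support_status and the three labels are assumptions made for illustration, not something GMSO provides.

```python
import sys


def support_status(version_info=sys.version_info):
    """Classify a Python version according to the policy described above."""
    major, minor = version_info[0], version_info[1]
    if major < 3:
        return "unsupported"  # Python 2 is not supported
    if (major, minor) == (3, 7):
        return "recommended"  # the version used for development and testing
    return "untested"  # other Python 3 versions: likely fine, no guarantee


if __name__ == "__main__":
    print("This interpreter is:", support_status())
```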
Testing your installation¶
GMSO uses py.test to execute its unit tests. To run them, first install some extra dependencies:
$ conda install --file requirements-test.txt
And then run the tests with the py.test executable:
$ py.test -v
Using GMSO with Docker¶
Much scientific software development happens on Unix platforms. To avoid quirks that depend on the system you use, we recommend using Docker or another containerization technology. This section is a how-to guide on using GMSO with Docker.
Prerequisites¶
A working Docker installation on your machine. See the official Docker documentation for installation instructions. If you are not familiar with Docker and want to get started, many good tutorials are available online.
Quick Start¶
After you have a working Docker installation, use the following commands to run a jupyter-notebook with all the dependencies for GMSO installed:
$ docker pull mosdef/gmso:latest
$ docker run -it --name gmso -p 8888:8888 mosdef/gmso:latest su anaconda -s \
/bin/sh -l -c "jupyter-notebook --no-browser --ip=0.0.0.0 --notebook-dir \
/home/anaconda/gmso-notebooks"
If everything works correctly, you should be able to start a jupyter-notebook server running in a Python environment with all the dependencies for GMSO installed.
Alternatively, you can also start a Bourne shell to use python from the container’s terminal:
$ docker run -it --name gmso mosdef/gmso:latest
Important
The instructions above will start a docker container but containers by nature are ephemeral, so any filesystem changes (like adding a new notebook) you make will only persist till the end of the container’s lifecycle. If the container is removed, any changes or code additions will not persist.
Persisting User Volumes¶
If you will be using GMSO from a Docker container, we recommend mounting what are called user volumes in the container. User volumes provide a way to persist all filesystem/code additions made to a container regardless of the container lifecycle. For example, you might want to create a directory called gmso-notebooks on your local system, which will store all your GMSO notebooks/code. To make that directory accessible to the container (where the notebooks will be created/edited), use the following steps:
- Create a directory in your filesystem
$ mkdir -p /path/to/gmso-notebooks
$ cd /path/to/gmso-notebooks
- Define an entry-point script. Inside gmso-notebooks on your local file system, create a file called dir_entrypoint.sh and paste in the following content.
#!/bin/sh
chown -R anaconda:anaconda /home/anaconda/gmso-notebooks
su anaconda -s /bin/sh -l -c "jupyter-notebook --no-browser --ip="0.0.0.0" --notebook-dir /home/anaconda/gmso-notebooks"
- Run docker image for GMSO
$ docker run -it --name gmso -p 8888:8888 --entrypoint /home/anaconda/gmso-notebooks/dir_entrypoint.sh -v /path/to/gmso-notebooks:/home/anaconda/gmso-notebooks mosdef/gmso:latest
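If you launch the container from a script, the same command can be assembled programmatically. The sketch below only builds the argument list; the helper name docker_run_args is hypothetical. It resolves the host directory to an absolute path because Docker interprets a non-absolute host part of -v as a named volume rather than a bind mount.

```python
from pathlib import Path

# Mount point inside the container, as used in the steps above.
CONTAINER_DIR = "/home/anaconda/gmso-notebooks"


def docker_run_args(host_dir, image="mosdef/gmso:latest", name="gmso", port=8888):
    """Build the `docker run` argument list for a bind-mounted notebook folder."""
    # Bind mounts need an absolute host path.
    host_path = Path(host_dir).expanduser().resolve()
    return [
        "docker", "run", "-it",
        "--name", name,
        "-p", f"{port}:{port}",
        "--entrypoint", f"{CONTAINER_DIR}/dir_entrypoint.sh",
        "-v", f"{host_path}:{CONTAINER_DIR}",
        image,
    ]
```

The resulting list can be passed to subprocess.run(docker_run_args("/path/to/gmso-notebooks"), check=True) from an interactive terminal.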
Cleaning Up¶
You can remove the created container by using the following command:
$ docker container rm gmso
Note
Instead of latest, you can use the image mosdef/gmso:stable to run the tutorials with the most recent stable release of GMSO.
Contributing¶
Contributions are welcomed via pull requests on GitHub. Developers and/or users will review requested changes and make comments. The rest of this file will serve as a set of general guidelines for contributors.
Features¶
Implement functionality in a general and flexible fashion¶
GMSO is designed to be general and flexible, not limited to single chemistries, file formats, simulation engines, or simulation methods. Additions to core features should attempt to provide something that is applicable to a variety of use-cases and not targeted at only the focus area of your research. However, some specific features targeted toward a limited use case may be appropriate. Speak to the developers before writing your code and they will help you make design choices that allow flexibility.
Version control¶
We currently use the “standard” Pull Request model. Contributions should be implemented on feature branches of forks. Please try to keep the master branch of your fork up-to-date with the master branch of the main repository.
Source code¶
Use a consistent style¶
It is important to have a consistent style throughout the source code. The following criteria are desired:
- Lines wrapped to 80 characters
- Lines are indented with spaces
- Lines do not end with whitespace
- For other details, refer to PEP8
To help with the above, there are tools such as flake8 and Black.
Document code with comments¶
All public-facing functions should have docstrings using the numpy style. This includes a concise paragraph-style description of what the class or function does, relevant limitations and known issues, and descriptions of arguments. Internal functions can have simple one-liner docstrings.
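For reference, a numpy-style docstring might look like the following sketch. The function lennard_jones_energy is a hypothetical example written for this guide and is not part of GMSO.

```python
def lennard_jones_energy(r, epsilon, sigma):
    """Evaluate the 12-6 Lennard-Jones potential at a given separation.

    Parameters
    ----------
    r : float
        Separation between the two interaction sites. Must be positive.
    epsilon : float
        Depth of the potential well, in the desired energy unit.
    sigma : float
        Separation at which the potential is zero, in the same unit as `r`.

    Returns
    -------
    float
        Potential energy, in the same unit as `epsilon`.

    Raises
    ------
    ValueError
        If `r` is not strictly positive.

    Notes
    -----
    No cutoff or tail correction is applied.
    """
    if r <= 0:
        raise ValueError("r must be strictly positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)
```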
Tests¶
Write unit tests¶
All new functionality in GMSO should be tested with automatic unit tests that execute in a few seconds. These tests should attempt to cover all options that the user can select. All or most of the added lines of source code should be covered by unit test(s). We currently use pytest, which can be executed simply by calling pytest from the root directory of the package.
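A small, fast test module in this spirit could look like the sketch below. The function under test, harmonic_energy, is a hypothetical stand-in rather than GMSO code; pytest collects any function whose name starts with test_ and reports failed assert statements.

```python
def harmonic_energy(r, k, r_eq):
    """Harmonic bond energy 0.5 * k * (r - r_eq)**2 (hypothetical example)."""
    return 0.5 * k * (r - r_eq) ** 2


def test_zero_at_equilibrium():
    # The energy vanishes when the bond is at its equilibrium length.
    assert harmonic_energy(1.5, 300.0, 1.5) == 0.0


def test_symmetric_about_equilibrium():
    # Stretching and compressing by the same amount cost the same energy.
    assert harmonic_energy(1.25, 300.0, 1.5) == harmonic_energy(1.75, 300.0, 1.5)


def test_known_value():
    # 0.5 * 300 * 0.25**2 = 9.375
    assert harmonic_energy(1.75, 300.0, 1.5) == 9.375
```

Each test checks one behavior and runs in well under a second, which keeps the full suite quick to execute.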