VMD: Visualisation, Atom Selection, and Scripting
When it comes to visualisation of MD simulations, there are two main programs used in the biomolecular areas: Visual Molecular Dynamics (VMD) and PyMOL. As I am much more experienced with VMD than PyMOL, that’s where this tutorial will focus; however, PyMOL is an extremely flexible program that offers similar functionality to VMD if you choose to get invested in it (and coding in Python instead of TCL is certainly an advantage for advanced users). The key purpose of these programs is to visualise trajectories generated from MD simulations - both to double-check a simulation for issues or interesting behaviour, and to generate high-quality images for sharing your results. However, the range of packages in VMD enables a wide variety of other possibilities - analysis of MD simulations, setting up simulations, and even the running of simulations can be performed through VMD. As this tutorial focuses on selecting and visualising atoms, further information on these other functions of VMD can be found in the online VMD Documentation. In addition to this, VMD itself offers tutorials on several key functions, including those discussed in this tutorial, with a list available here. As there is a great deal of information on how to use VMD online, I’ll be including links to appropriate tutorials and resources throughout, and I’ll mostly focus on the tips and tricks I’ve found most useful.
Installation and setup
I would encourage you to use the latest alpha version of VMD available from the official software location, as the latest ‘stable’ version at the time of writing - 1.9.3 - was last updated in 2016, and is not up to date with many features; for the macOS operating system, only the latest alphas work on versions of macOS since Version 10.15 (Catalina). After following the installation guidelines in the installation guide, we can begin to visualise a molecule of interest. A brief summary of all things visualisation in VMD can be seen in the VMD tutorial on the topic, if you want more detailed instructions and explanations of options. Just about any file type that conveys molecular geometric data can be read by VMD - going to the File tab and selecting the New Molecule option allows you to select an appropriate file, and after hitting load we can investigate the molecule in the display window. Alternatively, if you’re running VMD on macOS or a Linux distribution and followed installation instructions, you can navigate to the folder containing your file in the command line, and open the file with the vmd [filename]
command. Once you’re in the program, there are some useful hotkeys to make navigating the scene a little easier:
- = will reset the viewing window to be centred on the primary molecule, and zoom out to show all atoms involved. It’s a useful reset button.
- R will change the effect of the mouse to rotating the scene, which is the default when you first load up VMD
- T will change the effect of the mouse to moving the camera position - up, down, left, or right - rather than rotating the scene.
- 1 will change the effect of the mouse to selecting atoms when you click on them, allowing you to label atoms
- 2, 3, and 4 will select bonds, angles, and dihedrals respectively - just select the atoms in the order you want them. This will let you label the distance/angle of this selection; further information is present in Graphics -> Labels, and then selecting bonds/angles/dihedrals. You can even graph this value over time if you’ve loaded a trajectory!
- 5, 6, 7, and 8 will change the effect of the mouse to moving atoms, residues, fragments, and molecules respectively - useful if you’re making minor tweaks to atom locations, investigating changes in angles or dihedrals, or more. As a note here: this movement doesn’t get saved unless you save a molecule manually, either by right-clicking the molecule in the VMD Main window and hitting “Save Coordinates”, or using the scripting outlined in the VMD Scripting section.
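The distance that the 2 hotkey labels can also be read out in the Tk Console (Extensions -> Tk Console), which is handy once a trajectory is loaded. The following is only a minimal sketch: the atom indices 10 and 25 are hypothetical placeholders (use the 1 hotkey to find the indices of your own atoms), and it assumes the molecule of interest is the top molecule.

```tcl
# Hypothetical atom indices; replace with the two atoms you picked with the 1 hotkey
set atomA 10
set atomB 25

# Distance (in Angstrom) between the two atoms for every loaded frame
# of the top molecule; "frame all" returns one value per frame
set distances [measure bond [list $atomA $atomB] frame all]

# Print one "frame distance" pair per line
set frame 0
foreach d $distances {
    puts "$frame $d"
    incr frame
}
```

The printed columns can be pasted into any plotting program if you would rather not use the graphing option in the Labels window.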
Before further investigation is performed, I would recommend changing your display style from Perspective to Orthographic, in the Display tab; this turns off the distortion used to give a sense of perspective, which can make it difficult to confirm that molecules are behaving as expected. In addition to the change to an Orthographic camera, there are several small changes I’ve found to be helpful for visualisation in VMD. Removing the axes (Display -> Axes -> Off) gets rid of a distraction (and is essential for making figures), and changing the background colour to grey or white (Graphics -> Colors -> Display -> Background -> 2 gray / 8 white) is also one I find helpful, as the black colour makes for a confusing environment for visualisation. Setting a white background is essential when capturing images for use in publications. These settings - and many others - can be written to a .vmdrc file so that they are applied automatically at start-up: open the Extensions -> VMD Preferences menu, press “Query All VMD Settings”, and then press “Write Settings to VMDRC”.
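If you prefer to edit the start-up file by hand, the same display settings can be written as ordinary VMD commands. This is a minimal sketch of what such a .vmdrc could contain, assuming the file sits in your home directory; it is not the file that the VMD Preferences menu generates.

```tcl
# Sketch of a hand-written ~/.vmdrc; these commands run each time VMD starts
menu main on                      ;# make sure the VMD Main window is shown
display projection Orthographic   ;# orthographic rather than perspective camera
axes location Off                 ;# hide the axes
color Display Background white    ;# white background ("gray" gives ColorID 2)
```

The same lines can be pasted into the Tk Console to try them out before committing them to the file.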
Visualisation
Customisation of the visualisation seen by default is primarily done by opening the Representations window from the Graphics tab, which allows for modification of both the atoms being represented, and how they are being represented graphically. A wide variety of properties can be changed - the drawing method is the shape in which each atom is visually shown, the material is how the textures on the atoms respond to lighting, and the colouring method controls how the colour of an atom is determined. The NewCartoon representation (Graphics -> Representations -> Drawing Method -> NewCartoon) works well for proteins and biomolecules, showing their overall structure without overwhelming the viewer with visual information - and in the same screen, changing the material to Glass1/2/3 or Transparent can work well to de-emphasise (but still display) the overall structure in favour of a separate area of interest you’re showing. A representation of your area of interest works well in the Licorice Drawing Method, or CPK if you’re going for more similarities with a typical chemistry-style representation.
An image of the insulin molecule, showing the above-mentioned graphical representations as a simple example. The backbone of the insulin peptide is shown in grey using a NewCartoon representation, with the B Chain shown using the Glass1 material (as the transparency de-emphasises it). The A6-A11 disulfide linkage is in the centre of the image, and is shown with a CPK representation. The isoleucine10 residue is shown in the left of the image using a Licorice representation.
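Representations like the ones in this figure can also be built from the Tk Console rather than by clicking through the Graphical Representations window. The sketch below is an approximation, not the exact settings used for the image: the chain identifiers A and B are assumptions about how the insulin file is labelled, so adjust the selections to match your own structure.

```tcl
# Remove the default representation (rep 0) from the top molecule
mol delrep 0 top

# A chain backbone: grey (ColorID 2), opaque NewCartoon
mol representation NewCartoon
mol color ColorID 2
mol material Opaque
mol selection "protein and chain A"
mol addrep top

# B chain backbone: same style, but de-emphasised with the Glass1 material
mol material Glass1
mol selection "protein and chain B"
mol addrep top

# Residues 6 and 11 of the A chain (the disulfide pair) in CPK, coloured by atom name
mol representation CPK
mol color Name
mol material Opaque
mol selection "chain A and resid 6 11"
mol addrep top

# Residue 10 of the A chain in Licorice
mol representation Licorice
mol selection "chain A and resid 10"
mol addrep top
```

Each `mol addrep` call uses whatever representation, colour, material, and selection were set most recently, which is why only the changed settings are repeated.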
Once you’re happy with how the figure looks in the display window, it’s time to render the image: File -> Render. My recommendation is to render using Tachyon (internal) rendering - it tends to give the crispest-looking picture out of the options that come with VMD. Remember that the level of zoom you’re using in the Display window will be recreated in the render, so frame your image as you want it in the final shot! This image will be output as a .tga file, which is uncompressed - visually detailed, but large in size. You may want to convert it to a .png or a similar format if you’re going to be storing a lot of them, or putting them in something like a PowerPoint. If you want a particularly impressive-looking image, you can turn on Ray Tracing in the Display -> Display Settings window, turning Shadows and Ambient Occlusion on, and setting the ambient and direct lighting to your visual preference. These settings work best with the “AO” line of materials for your representations - AOShiny, AOChalky, and AOEdgy. Use of an advanced renderer, like the Tachyon renderer, is required for this to take effect. A more detailed tutorial on generating high-quality images is available here for further exploration of the topic.
An example of a rendered image using these settings is shown below, showing Binding Site 1 on the Insulin Receptor, with the insulin peptide bound to it. There is no objectively correct way to represent this, and to demonstrate this I have shown two similar versions - in both cases, a neutral colour for the receptor backbone is used to draw emphasis to the brightly-coloured insulin backbone representation, and the extent of glycosylation of this receptor is shown by including all glycans with an all-atom representation (as something like the backbone representation does not work for glycans). However, these glycans show a great deal of visual noise - this might be beneficial if they are important to your work, but if they are present primarily to demonstrate the extent of glycosylation while still focusing on the insulin, the image on the right may be more appropriate. In this case, a transparent material was used to draw emphasis to the insulin binding site, while still allowing for a demonstration of the glycosylation. An appropriate visualisation of your system is dependent on what you are trying to show with the visualisation, rather than there being set rules about how and what should be visualised.
Left: An image of the insulin receptor using ray tracing and ambient occlusion (with an ambient light strength of 0.9, and a direct light strength of 0.35), with the IR peptide backbone shown in a grey-coloured Cartoon representation and the AOEdgy material, the insulin peptide backbone shown in an orange-coloured Cartoon representation and the AOEdgy material, and the glycans shown in an all-atom CPK representation and the AOEdgy material. Right: The same image, but with the glycans using the Glass1 material for transparency to avoid cluttering the image.
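The lighting settings quoted in the caption can be applied from the Tk Console as well. This is a minimal sketch rather than the exact script behind the figure: it assumes the representation to restyle is representation 0 of the top molecule, and the output filename is a hypothetical placeholder.

```tcl
# Turn on shadows and ambient occlusion, using the light strengths from the caption
display shadows on
display ambientocclusion on
display aoambient 0.9
display aodirect 0.35

# Switch representation 0 of the top molecule to one of the AO materials
mol modmaterial 0 top AOEdgy

# Render the current view with the built-in Tachyon renderer
# (hypothetical filename; the output is an uncompressed .tga image)
render TachyonInternal insulin_receptor.tga
```

Repeat the `mol modmaterial` line with other representation numbers if you have several representations that should share the AO material.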
Atom Selection
In addition to customising the visual appearance of the atoms, it is often necessary to select specific groups of atoms, either to visualise only these atoms, or to give them a different visualisation to the other atoms in your system. An unlimited number of representations can be created, with more recent representations showing on top of older ones in the VMD display window (though this is not always followed by your renderer; if this is causing strange outcomes, either customise the unwanted representation to not include that region, or make the wanted representation larger so it is physically on top of the unwanted one). In addition to the visual customisation available in the “Draw Style” tab of the Graphical Representations window, the “Selections” tab allows for customisation of what atoms are selected. A set of single-word selections is available for common cases - protein
selects all residues with the appropriate backbone atoms included, backbone
selects only protein backbone, nucleic
selects only nucleic acids. These can be combined with the atom selection language rules to set up more advanced selections - not water and not ion
shows all atoms that aren’t water or ions, and backbone or nucleic
shows both protein backbone atoms and nucleic acid atoms (as a note here: it’s important to use ‘or’ instead of ‘and’ here, as ‘and’ requires both parts to be true - not water and protein
will show only not-water atoms that are also protein atoms). An extremely wide range of combinations are available here - VMD lists the single-word selections in one window, and keywords in another. Keywords require at least one additional value to be provided - for example, resname ALA
will select all alanine residues, or index 4561
will select the single atom with that index, whereas index 4561 to 4571
will select that range of indices. From my perspective, there are a few of these that I come back to consistently:
Selection | Context |
---|---|
protein | When selecting a single residue by number, or atoms by element, it’s often very helpful to make sure you’ve not also selected water or ion atoms |
backbone/sidechain | When visualising proteins, it’s often good to show the backbone in one representation (e.g. cartoon) and then specific side chains of interest with something more detailed (e.g. CPK) |
chain | Chains are a good way of showing only a specific region of the molecule you’re focusing on |
fragment | When chains haven’t been set properly, or you’re dealing with glycans or other badly-handled biomolecules, same fragment as ... is very useful |
resid | protein and resid 6 11 is definitely easier than protein and (resname CYS and not chain B) - use the 1 hotkey to find residue numbers |
resname | It’s very useful to be able to go protein and resname ASP GLU to visualise all negatively charged amino acids, as an example |
These allow for more flexible descriptions of selections - protein and name CA
will select only protein atoms with the name CA (alpha carbon) for their atom name. In addition to these selections created that define a set of atoms by their exact properties, selections can be a little more complicated. For example, (resname CYS and name SG)
selects only the sulphur atoms on cysteine residues, but same fragment as (resname CYS and name SG)
will select all atoms on the same fragment (a set of residues connected to one another by bonds, such as a peptide chain) as the cysteine residue. Alternatively, one could specify all within 5 of (resname CYS and name SG)
, and this would select all atoms within 5 Å of a sulphur atom in the cysteine residue. Finally, these can all be combined together to make a selection such as same residue as all within 5 of (resname CYS and name SG)
, which selects not only the atoms within this 5 Å distance, but also all the other atoms that are in the same residue as the atoms within the 5 Å distance.
A demonstration of this is shown in the image below, continuing the example of insulin bound to the insulin receptor.
Top left: Using the same visualisation settings as in the previous example, the insulin peptide backbone on its own is shown, using the chain G H
selection. Top Right: In addition to the insulin backbone, nearby atoms in the insulin receptor backbone are shown, using the (chain G H) or all within 2 of chain G H
selection. It is difficult to see what these atoms may be doing, as they are isolated from the surrounding atoms. Bottom Left: In addition to the insulin backbone, nearby atoms of the insulin receptor are shown in-context, using the (chain G H) or (same residue as all within 2 of chain G H)
selection. This allows for visualisation of the binding residues of the insulin receptor. Bottom Right: The same image as the bottom left, shown with an additional representation of the surrounding glycans shown with a transparent Glass1 material to demonstrate that the IR peptide is not the only region that may play a binding role in this process, while still separating peptide backbone from the more flexible glycans.
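Before committing a complicated selection to a representation, it can help to check what it actually picks up. The sketch below uses the atomselect command (covered in more detail in the VMD Scripting section) in the Tk Console, and assumes the same chain G/H labelling for insulin as in the figure above.

```tcl
# Receptor residues with any atom within 2 Angstrom of insulin (chains G and H),
# excluding insulin itself
set contacts [atomselect top "(same residue as within 2 of chain G H) and not chain G H"]

# How many atoms did the selection pick up?
puts "Atoms selected: [$contacts num]"

# 'get' returns one {resname resid chain} entry per atom;
# lsort -unique collapses these down to one entry per residue
puts "Residues: [lsort -unique [$contacts get {resname resid chain}]]"

# Free the selection once it is no longer needed
$contacts delete
```

If the residue list looks wrong, it is much quicker to adjust the selection text here than to re-render a figure.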
VMD Scripting
This powerful selection language allows for extremely detailed atom selections, which can be a significant boon during visualisation. However, these atom selections may also be of interest for purposes such as analysis of MD simulations, or exporting the atoms selected to a coordinate file. VMD contains a console in which scripting can be performed, and scripts can also be written out and re-used to perform a function without having to open VMD and put them in manually. This scripting is performed in the TCL language, with a few additions for VMD-specific functionality. A tutorial is available from the creators of VMD on the basics of scripting in VMD, but there are a few aspects that I think are particularly important to emphasise here. The console in which these commands are written can be opened through the Extensions -> Tk Console menu. If you’re using a Mac computer with the Mojave (or a more recent) OS, a technical change in the operating system has caused ongoing issues, and typing in the Tk Console can often miss keypresses - you may find it easier to write commands elsewhere and paste them into the console. An example that I’ve used previously is a script that flags the backbone atoms of a structure in the beta column and writes the result out as a .pdb file:
set all [atomselect top all]
$all set beta 0
set back [atomselect top backbone]
$back set beta 1
$all writepdb output.pdb
This script selects all the atoms in the system, sets their beta value to 0, and then sets the beta value of all backbone atoms to one, and writes the output; in this case, it’s being done because the value in the beta column can be used to select atoms later on in an MD simulation. The most important command to learn is to select a group of atoms, such as the first line in the above script - the selection language is the same as in the Graphical Representations tab, but a command is required: set sel [atomselect top "insert selection here"]
, where sel is the name of the selection being created. For example, set backbone [atomselect top "backbone"]
will create a selection named ‘backbone’ which contains only peptide backbone atoms. The same basic terminology can be used for creating other variables - set a 73
will create a variable named ‘a’ with a value of 73; the square brackets in TCL specify that there is a fully separate command contained within them. After atom selection, the likely next most common command that one will be using in VMD is getting information from this selection - the name of the atoms involved, the elements, the fragments they belong to, and more. This can be obtained with the following command: $sel get property
, where sel is the name of a previously created selection (and the $ is an indicator for TCL that we’re referencing an existing variable), and property is replaced with one of a large list of atom selection keywords available - for example, $sel get chain
will report the one-letter identifier for the chain of each atom in the selection. This can also be reversed, using $sel set property value
to change the value of the property to a new one for a selection - $sel set chain B
will set the chain’s identifier to B for all atoms in the selection. The final command mentioned here will be writing a selection of atoms to a new file - though this selection could be all the atoms if you just want to change the file format, or make the formatting consistent. This action is performed with a slight variation depending on the file format, as VMD has a custom function for each of the major file formats; for example, to write a new pdb file from the selection ‘sel’, the command would be: $sel writepdb filename.pdb
, whereas to write an xyz it would be $sel writexyz filename.xyz
. Combining these commands together, we would have something like the following to select all residues around a ligand, set their chain to ‘C’ for an ease of selection for visualisation later on, and write the new file:
set sel [atomselect top "same residue as all within 5 of resname LIG"]
$sel set chain C
$sel writepdb ligandCutout.pdb
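Commands like these can also be saved to a file and run from the command line, without opening the graphical interface:
vmd -dispdev text -pdb filename.pdb -e script.tcl
, where ‘script.tcl’ is the file containing the commands of interest, and ‘-pdb filename.pdb’ opens the file you want the commands performed on.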
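As a further sketch that combines `get`, a TCL loop, and `writepdb`, the following writes each chain of the loaded structure to its own .pdb file. It assumes every atom has a non-empty, single-letter chain identifier, and the `chain_X.pdb` output names are hypothetical.

```tcl
# Collect the unique chain identifiers in the top molecule
set all [atomselect top all]
set chains [lsort -unique [$all get chain]]
$all delete

# Write each chain to its own file, e.g. chain_A.pdb, chain_B.pdb, ...
foreach c $chains {
    set sel [atomselect top "chain $c"]
    $sel writepdb "chain_${c}.pdb"
    $sel delete   ;# free the selection before moving to the next chain
}
```

Saved as a file, this can be run in the same way as any other script, with the `-e` option on the command line.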
Adding Hydrogen Atoms
One task that may be helpful to do through scripting is the addition of hydrogen atoms to your molecule - it is not uncommon for crystal and NMR structures to be missing hydrogen atoms. When performing MD simulations, the addition of these hydrogen atoms is an automated part of just about all processes. Because of this, it is possible to use tools in VMD designed for MD simulation to add these atoms, even if your purpose for the final structure is for use in non-MM applications, so long as your molecules are biomolecules. Without using the scripting window, it is possible to do this by opening the Extensions -> Modeling -> Automatic PSF Builder tool. AutoPSF then lets you load in the default topology files by clicking the Load input files
button, split the chains while keeping the Protein selection (or something that fits your system if it isn’t a protein) with the Guess and split chains using current selections
button, and then lets you choose to delete any chains that may not be of interest to you - for example, crystallographic water - with the Create chains
button. Finally, apply any automatically generated patches to create a final .pdb file that will have appropriate hydrogens inserted. Once this has been completed, you can right click on the new [structureName]_autopsf.psf
file in the VMD Main window and hit Save Coordinates, change the file type to something like .xyz if it’s required, and you’ve got your coordinates. An alternative to this is to use the TCL console to script this. A simple line can do the procedure outlined above: autopsf -mol top -protein
. A simple script like the following could be used:
autopsf -mol top -protein    ;# Create the structure with hydrogen atoms
mol delete 0                 ;# Delete the original molecule (ID 0), so 'top' now refers to the new one
set all [atomselect top all] ;# Select all the atoms
$all writexyz output.xyz     ;# Save them as an .xyz
exit                         ;# Automatically quit
If the above file were saved as fixHydrogens.tcl
, then running the command vmd -dispdev text -pdb [input].pdb -e fixHydrogens.tcl
would be able to fix hydrogens without any user interaction required - just fix the name of the input to be appropriate for your system. However, this method does have a limitation - it only works on biomolecules. To my knowledge, there’s no trivial way to fix hydrogens in non-biomolecule contexts in VMD - PyMOL would be my recommendation there, where the following script can add hydrogens to any molecule:
h_add all
save output.pdb
This has not been a thorough explanation of VMD, or the ways it can be used - but hopefully it has been a useful introduction to the program. I’ve linked to documentation and tutorials where appropriate, and so if you’re in need of further detail on a topic, click through to those to find that detail. Good luck with your use of the program!