🧬 HP Model for Protein Folding

This project implements the HP model for protein folding in Python.

The HP model is a simplified approach to explore basic protein folding behaviors via Monte Carlo simulations on the free energy of protein bonds. It reduces protein sequences to two amino acid categories: H (hydrophobic) and P (polar). For more information, see this paper or the theory section below.

A command-line application is included, allowing users to input a protein sequence (using all 20 standard amino acids or H/P only), run HP model simulations at chosen temperatures, optionally apply annealing algorithms, and output the energy evolution, protein structures, minimum energy configurations, and compactness.

This project provides a first look at protein behavior and can be used to study transitions to native states as a function of temperature and binding energy. Various tests and comparisons are possible, such as observing the folding behavior when two random adjacent amino acids are swapped.

📋 Table of Contents

Installation & Running
Parameter Settings
Repository Structure
Theoretical Background
1. Folding Algorithm
2. Structure Acceptance
Execution Example

🚀 Installation & Running

From your terminal, navigate to your desired folder and clone this repository.

After that, move to the project directory:

cd HP_model

Then activate the virtual environment (if .venv does not exist yet, create it first with python -m venv .venv):

source .venv/bin/activate

Next, install all dependencies by running:

pip install -r requirements.txt

Finally, run the application:

python src/main.py

⚙️ Parameter Settings

You can modify protein sequences, structures, and other parameters by editing the config.yaml file or creating a new configuration file.

🔡 Insert the Protein Sequence

Write the sequence in the sequence field inside config.yaml (uppercase letters only, no quotes). Sequences can use just H/P monomers or all 20 amino acids (automatically converted to H/P).
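The conversion from the 20-letter alphabet to H/P can be sketched as follows. This is a minimal illustration, not the project's own code: the function name to_hp and the hydrophobic set below are assumptions (hydrophobicity classifications differ between sources), and the actual mapping in utils.py may differ.

```python
# Hypothetical hydrophobic set (an assumption for illustration only);
# the project's real mapping lives in src/utils.py and may differ.
HYDROPHOBIC = frozenset("ACFILMVW")
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def to_hp(sequence: str) -> str:
    """Reduce an uppercase amino acid sequence to an H/P string."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    # A sequence made only of H and P is treated as already reduced.
    if set(sequence) <= {"H", "P"}:
        return sequence
    unknown = set(sequence) - AMINO_ACIDS
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence)


print(to_hp("MGLSD"))  # prints HPHPP
```

Note that H and P are also the one-letter codes for histidine and proline, so a sequence containing only those two letters is ambiguous; this sketch reads it as an H/P sequence.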

🔢 Change the Number of Folding Steps

Set the number of folding steps via the folding_steps variable under the simulation section in config.yaml.

🛠 Other Parameters

Enable or disable annealing: set annealing: true or annealing: false under simulation
Use a specific initial structure: set use_structure: true and provide a list of coordinates in coordinates, both under structure. The sequence and structure lengths must match!
Set the initial temperature: temperature under simulation
Create a GIF of the process: set create_gif: true or create_gif: false under plot
Set a random seed: seed at the top level of config.yaml (or None for a random seed)
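The constraints on these parameters can be checked before a run. The sketch below assumes the configuration has already been loaded into a plain dictionary with the keys shown in the example config.yaml; the function name validate_config and the exact checks (positive step count, positive temperature, unit-spaced coordinates) are assumptions for illustration, not the validation actually performed in utils.py.

```python
def validate_config(cfg: dict) -> None:
    """Raise ValueError if the configuration is inconsistent (illustrative checks)."""
    seq = cfg["sequence"]
    if not isinstance(seq, str) or not seq.isalpha() or not seq.isupper():
        raise ValueError("sequence must be a string of uppercase letters")

    sim = cfg["simulation"]
    if int(sim["folding_steps"]) <= 0:
        raise ValueError("folding_steps must be positive")
    if float(sim["temperature"]) <= 0:
        raise ValueError("temperature must be positive")

    struct = cfg.get("structure", {})
    if struct.get("use_structure"):
        coords = [tuple(c) for c in struct.get("coordinates") or []]
        if len(coords) != len(seq):
            raise ValueError("sequence and structure lengths must match")
        if len(set(coords)) != len(coords):
            raise ValueError("structure contains overlapping monomers")
        # Consecutive monomers must sit on adjacent lattice sites.
        for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
            if abs(x0 - x1) + abs(y0 - y1) != 1:
                raise ValueError("consecutive monomers must be at distance 1")


example = {
    "sequence": "HPPH",
    "structure": {"use_structure": True,
                  "coordinates": [[0, 0], [0, 1], [1, 1], [1, 0]]},
    "simulation": {"folding_steps": 5000, "annealing": True, "temperature": 5.0},
    "plot": {"create_gif": True},
    "seed": 42,
}
validate_config(example)  # passes silently
```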

📝 Create a Custom Configuration File

Copy the syntax from config.yaml and adjust the parameters as needed. To use your file, update the configuration path in the main script. The file can have any extension, as long as its content is valid YAML that PyYAML can parse.

Example config.yaml structure:

sequence: MGLSDGEWQLVLNVWGKVEADVAGHGQEVLIRSHVWGECPVLPALLSGVRALSESHQKRLRKDSRDDDGDDGDGDNDNDDGDGDDDDGDDDGDNDNDDDDGDGDDDGDGDDDRDDSDGGGGDHADDDNGNDDGDDDGHPETLEKFDKFKHLKTADEMKASEDLKKHGNTVLTALGGILKKKGHHEAELKPLAQSHATKHKIPVKYLEFISDAIIHVLQSKHPGDFGADAQAAMNKALELFRNDMAAKYKELGFQG

structure:
  use_structure: false
  coordinates: [ [ 0,0 ], [ 0,1 ], ... ]  # only needed if use_structure: true

simulation:
  folding_steps: 5000
  annealing: true
  temperature: 5.0

plot:
  create_gif: true

seed: 42

📁 Repository Structure

HP_model/
├── output/
│   └── ...plots and outputs
├── config.yaml
├── src/
│   ├── __init__.py
│   ├── main.py
│   ├── plots.py
│   ├── protein_class.py
│   └── utils.py
├── test/
│   ├── __init__.py
│   ├── config_test.yaml
│   └── test.py
└── requirements.txt

Main files and directories:

output/: Directory containing plots and other outputs
config.yaml: Input configuration
src/main.py: Runs simulations and saves results to output/
src/protein_class.py: Defines the Protein class and key evolution methods
src/utils.py: Helper functions for validation, configuration, and sequence conversion
src/plots.py: Plotting functions for results visualization (energy, compactness, structures, GIF creation)
test/test.py: Test suite for code validation. To run the tests: pytest test/test.py
test/config_test.yaml: Configuration for test runs (do not modify)
requirements.txt: All dependencies for the project

📚 Theoretical Background

The HP model simplifies protein folding by categorizing amino acids as either hydrophobic (H) or polar (P). Hydrophobic amino acids cluster inside the protein to avoid water, while polar ones remain on the surface. This model is educational and helps introduce protein folding basics, but real folding involves many more factors. Researchers use more advanced models for accurate predictions.
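To make the idea of hydrophobic clustering concrete, the sketch below computes an energy in the way the HP model is commonly formulated: every pair of H monomers that sit on neighboring lattice sites without being consecutive in the chain contributes -epsilon. The function name hp_energy, the epsilon parameter, and the 2D square lattice are assumptions for illustration; the energy function used in protein_class.py may be defined differently.

```python
def hp_energy(sequence, coords, epsilon=1.0):
    """Return -epsilon times the number of non-bonded H-H lattice contacts."""
    index_at = {tuple(c): i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighboring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -epsilon * contacts if contacts else 0.0


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
line = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(hp_energy("HPPH", square))  # prints -1.0 (the two H ends touch)
print(hp_energy("HPPH", line))    # prints 0.0 (no H-H contact)
```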

🧩 Folding Algorithm

The folding algorithm is implemented in the Protein class (protein_class.py), with utility functions in utils.py. Each evolutionary step involves:

1. Select a random monomer (index from 1 to length-2).
2. Randomly choose a move type (tail_fold in utils.py):
   1 = 90° clockwise, 2 = 90° counterclockwise, 3 = 180° rotation, 4 = x-axis reflection, 5 = y-axis reflection,
   6 = symmetry on 1st/3rd quadrants, 7 = symmetry on 2nd/4th quadrants, 8 = diagonal move (if possible).
3. Validate the new structure (no overlaps, neighbor distances = 1). If invalid, repeat.
4. If valid, accept or reject the new folded structure according to the Metropolis criterion.
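The proposal part of these steps can be sketched as follows. This is an illustration under stated assumptions, not the code in utils.py: the names TRANSFORMS, apply_move, is_valid and propose_fold are hypothetical, the moves are assumed to act on the part of the chain after the selected monomer, and move 8 (the diagonal move) is left out.

```python
import random

# Moves 1-7 expressed as maps on a displacement (dx, dy) from the pivot monomer.
TRANSFORMS = {
    1: lambda dx, dy: (dy, -dx),    # 90 degrees clockwise
    2: lambda dx, dy: (-dy, dx),    # 90 degrees counterclockwise
    3: lambda dx, dy: (-dx, -dy),   # 180 degree rotation
    4: lambda dx, dy: (dx, -dy),    # reflection across the x-axis
    5: lambda dx, dy: (-dx, dy),    # reflection across the y-axis
    6: lambda dx, dy: (dy, dx),     # reflection across y = x (1st/3rd quadrants)
    7: lambda dx, dy: (-dy, -dx),   # reflection across y = -x (2nd/4th quadrants)
}


def apply_move(coords, pivot, move):
    """Transform every monomer after `pivot` around the pivot position."""
    px, py = coords[pivot]
    transform = TRANSFORMS[move]
    folded = [tuple(c) for c in coords[:pivot + 1]]
    for x, y in coords[pivot + 1:]:
        dx, dy = transform(x - px, y - py)
        folded.append((px + dx, py + dy))
    return folded


def is_valid(coords):
    """No two monomers overlap and consecutive monomers are at distance 1."""
    if len(set(coords)) != len(coords):
        return False
    return all(abs(x0 - x1) + abs(y0 - y1) == 1
               for (x0, y0), (x1, y1) in zip(coords, coords[1:]))


def propose_fold(coords, rng, max_tries=100):
    """Draw pivots and moves until a valid structure is found (or give up)."""
    for _ in range(max_tries):
        pivot = rng.randint(1, len(coords) - 2)  # inclusive bounds
        candidate = apply_move(coords, pivot, rng.randint(1, 7))
        if is_valid(candidate):
            return candidate
    return [tuple(c) for c in coords]  # unchanged if nothing valid was found


line = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(apply_move(line, 1, 2))  # prints [(0, 0), (1, 0), (1, 1), (1, 2)]
```

Because each move is a rigid transformation of the tail around the pivot, neighbor distances are preserved automatically; in this sketch only overlaps can make a candidate invalid, but both checks are kept for clarity.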

✅ Structure Acceptance

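After a new structure is generated, its energy is computed and the structure is accepted or rejected according to the Metropolis algorithm, where $\Delta E$ is the energy of the new structure minus the energy of the current one:

If the new structure's energy is lower or equal ($\Delta E \le 0$), accept it.
If it is higher ($\Delta E > 0$), accept it with probability: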

$$p = e^{-\frac{\Delta E}{k_B T}}$$

Notes:
$k_B$ (the Boltzmann constant) is set to 1, so temperatures in the config file are interpreted directly as $T$ in energy units.

Setting $k_B = 1$ simplifies the simulation and the numerical comparison between runs.
Keep all inputs (energy, temperature, etc.) consistently non-dimensionalized.
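The acceptance rule above, with $k_B = 1$, fits in a few lines. The function name metropolis_accept and the explicit rng argument are assumptions for illustration; the project's Protein class may organize this step differently.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Return True if a move with energy change delta_e is accepted (k_B = 1)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta_e <= 0:
        return True  # lower or equal energy: always accept
    # Higher energy: accept with probability exp(-delta_e / T).
    return rng.random() < math.exp(-delta_e / temperature)


rng = random.Random(42)
accepted = sum(metropolis_accept(1.0, 1.0, rng) for _ in range(10_000))
print(accepted / 10_000)  # close to exp(-1), roughly 0.37
```

At high temperature almost every uphill move is accepted and the chain explores freely; as the temperature drops, uphill moves become rare and the structure settles into low-energy configurations.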

🖥️ Execution Example

Example: Simulation of the Myoglobin (Camelus dromedarius) protein sequence.

Simulate 5000 folding steps as indicated in the config.yaml file.
Starting temperature: 5.0
Annealing: true

Results (found in the output/ folder):

Initial protein sequence
Evolution process
Final protein folding
Energy evolution
Compactness evolution
Minimum energy folding
Maximum compactness folding

The plots show how energy and compactness stabilize as the temperature decreases. Lowest energy does not necessarily correspond to the highest compactness.
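The cooling schedule used when annealing is enabled is not described here. As one common possibility, the sketch below builds a geometric schedule that lowers the temperature by a constant factor at every step. The function name geometric_schedule and the final temperature of 0.5 are assumptions chosen for illustration, not values taken from this project.

```python
def geometric_schedule(t_start, t_end, steps):
    """Return `steps` temperatures decreasing geometrically from t_start to t_end."""
    if t_start <= 0 or t_end <= 0:
        raise ValueError("temperatures must be positive")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [t_start]
    ratio = (t_end / t_start) ** (1.0 / (steps - 1))
    return [t_start * ratio ** k for k in range(steps)]


# 5000 folding steps starting at T = 5.0, as in the example run;
# the end temperature 0.5 is a hypothetical choice.
temperatures = geometric_schedule(5.0, 0.5, 5000)
print(temperatures[0])  # prints 5.0
```

Each temperature in the list would then be passed to the Metropolis acceptance test for the corresponding folding step.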
