Graph Neural Network for ALK inhibitors
Published:
Abstract
Updating…
Workflow
- Molecular representation with node and edge attributes
- Model architecture:
- 3 layers of GCN
- 3 layers of fully connected neural networks
- Cross validation and external validation
Requirements
This module requires the following modules:
Installation
Clone this repository to use
Note
Updating…
Folder segmentation
Finally the folder structure should look like this:
Mol2DSimi (project root)
|__ README.md
|__ MolGNN
|__ |__ GNN_architect.py
| |__ GNN_ext_valid.py
| |__ GNN_int_valid.py
| |__ GNN_train.py
| |__ MolData.py
| |__ savemodel.py
| |__ seed_everything.py
| |__ visualization.py
|__ Image (saved images)
|__ Tutorial_GNN.ipynb
|__ Model
|__ data
|__ LICENSE
|......
Usage
import sys
sys.path.append('./MolGNN')
from seed_everything import seed_everything
from visualization import historical_plot
from savemodel import SaveModel
from GNN_architect import GNN
from GNN_train import train
from GNN_int_valid import evaluate
from GNN_ext_valid import external_evaluate
from MolData import MoleculeDataset
# 1. Data interagion
## 1a. Creating dataset
seed_everything(42)
train_dataset = MoleculeDataset(root = "./data/",
filename = "ALK_train.csv")
test_dataset = MoleculeDataset(root = "./data/",
filename = "ALK_test.csv", test = True)
valid_dataset = MoleculeDataset(root = "./data//",
filename = "ALK_val.csv", valid = True)
## 1b Creating Data loader
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True,worker_init_fn=np.random.seed(42))
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=True,worker_init_fn=np.random.seed(42))
valid_loader = DataLoader(valid_dataset, batch_size=64, shuffle=True,worker_init_fn=np.random.seed(42))
# 2. Train
device = torch.device("cuda")
seed_everything(42)
model = GNN(feature_size=30)
model = model.to(device)
print(model)
import random
import sys
if not sys.warnoptions:
import warnings
warnings.simplefilter("ignore")
seed_everything(42)
model = GNN(feature_size=30)
model = model.to(device)
train_losses = []
validation_losses = []
train_f1s = []
val_f1s = []
train_aps = []
val_aps = []
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(),
lr=0.01,momentum=0.8,weight_decay=0.0001)
lr_scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.1)
save_best_model = SaveModel()
epochs = 50
for epoch in range(epochs):
train1_loss,train1_f1, train1_ap = train(model, criterion, optimizer, train_loader)
val1_loss,val1_f1, val1_ap,validation_loss = evaluate(model, criterion,valid_loader)
train_losses.append(train1_loss)
validation_losses.append(val1_loss)
train_f1s.append(train1_f1)
val_f1s.append(val1_f1)
train_aps.append(train1_ap)
val_aps.append(val1_ap)
lr_scheduler.step(val1_ap)
#save_best_model(val1_ap, epoch, model, optimizer, criterion)
if (epoch+1) % 5 == 0:
print("Epoch: {}/{}.. ".format(epoch+1, epochs),
"Training Loss: {:.3f}.. ".format(train1_loss),
"validation Loss: {:.3f}.. ".format(val1_loss),
"validation f1_score: {:.3f}.. ".format(val1_f1),
"validation average precision: {:.3f}.. ".format(val1_ap),
)
save_best_model(epochs, model, optimizer, criterion)
# 3. Evaluation
external_evaluate(model, criterion, test_loader)
Contributing
Please visit the GNN-ALK repository. Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. Please make sure to update tests as appropriate.