
Files for split_train, split_valid, and split_test arguments #36

Open
TangYiChing opened this issue Oct 20, 2024 · 2 comments

Comments

@TangYiChing

Hi,

I am running train.py, and this line of code fails because the input is a directory: complex_names_all = read_strings_from_txt(self.split_path)

The original function, read_strings_from_txt, expects a file:

def read_strings_from_txt(path):
    # every line will be one element of the returned list
    with open(path) as file:
        lines = file.readlines()
        return [line.rstrip() for line in lines]
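For illustration, this is the kind of input that function expects: a plain text file with one complex name per line. A minimal sketch, with an invented file name and made-up PDB IDs (not taken from the repo):

```python
from pathlib import Path

def read_strings_from_txt(path):
    # every line will be one element of the returned list
    with open(path) as file:
        lines = file.readlines()
        return [line.rstrip() for line in lines]

# hypothetical split file; the IDs below are examples only
split = Path("timesplit_example.txt")
split.write_text("6agt\n6qqw\n6hld\n")

print(read_strings_from_txt(split))  # -> ['6agt', '6qqw', '6hld']
```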

However, the default split_train, split_val, and split_test arguments point to directories, not files; see utils/parsing.py:

  • split_train='./data/splits/timesplit_no_lig_overlap_train',
  • split_val='./data/splits/timesplit_no_lig_overlap_val',
  • split_test='./data/splits/timesplit_test'

Can you tell me how to generate those files?

Below are the errors from train.py:

$ CUDA_VISABLE_DEVICES=6 python train.py
Available GPU count: 8
Namespace(config=None, log_dir='workdir', restart_dir=None, cache_path='./data/cache', data_dir='./data/mnt/nas/research-data/luwei/dynamicbind_data/pdbbind_v11//pocket_aligned_fill_missing/', info_path='./data/d3_with_clash_info_small.csv', finetune_data_path='./results/CACHE4/finetune_data.pkl', split_train='./data/splits/timesplit_no_lig_overlap_train', split_val='./data/splits/timesplit_no_lig_overlap_val', split_test='./data/splits/timesplit_test', test_sigma_intervals=False, val_inference_freq=5, finetune_freq=None, num_finetune_complexes=500, inference_steps=20, num_inference_complexes=100, inference_earlystop_metric='valinf_rmsds_lt2', inference_earlystop_goal='max', wandb=False, project='difdock_train', run_name='', cudnn_benchmark=False, num_dataloader_workers=0, pin_memory=False, n_epochs=400, batch_size=32, sample_batch_size=16, scheduler=None, scheduler_patience=20, lr=0.001, restart_lr=None, w_decay=0.0, num_workers=1, use_ema=False, ema_rate=0.999, only_test=False, limit_complexes=0, all_atoms=False, receptor_radius=30, c_alpha_max_neighbors=10, atom_radius=5, atom_max_neighbors=8, matching_popsize=20, matching_maxiter=20, max_lig_size=None, remove_hs=False, num_conformers=1, esm_embeddings_path=None, lddt_weight=0.99, affinity_weight=0.01, tr_weight=0.33, rot_weight=0.33, tor_weight=0.33, res_tr_weight=0.33, res_rot_weight=0.33, res_chi_weight=0.33, rot_sigma_min=0.03, rot_sigma_max=1.65, tr_sigma_min=0.1, tr_sigma_max=20, tor_sigma_min=0.0314, tor_sigma_max=3.14, res_rot_sigma_min=0.01, res_rot_sigma_max=1, res_tr_sigma_min=0.01, res_tr_sigma_max=1, res_chi_sigma_min=0.01, res_chi_sigma_max=1, no_torsion=False, num_conv_layers=2, max_radius=5.0, scale_by_sigma=False, ns=16, nv=4, distance_embed_dim=32, cross_distance_embed_dim=32, no_batch_norm=False, use_second_order_repr=False, cross_max_distance=80, dynamic_max_cross=False, dropout=0.0, embedding_type='sinusoidal', sigma_embed_dim=32, embedding_scale=1000)
Processing complexes from [./data/splits/timesplit_no_lig_overlap_train] and saving it to [./data/cache_torsion/limit0_INDEXtimesplit_no_lig_overlap_train_maxLigSizeNone_H1_recRad30_recMax10]
Traceback (most recent call last):
  File "/tools/DynamicBind/train.py", line 221, in <module>
    main_function()
  File "/tools/DynamicBind/train.py", line 165, in main_function
    train_loader, val_loader = construct_loader(args, t_to_sigma)
  File "/tools/DynamicBind/datasets/pdbbind.py", line 797, in construct_loader
    train_dataset = PDBBind(info=info,cache_path=args.cache_path, split_path=args.split_train, keep_original=True,
  File "/tools/DynamicBind/datasets/pdbbind.py", line 144, in __init__
    self.preprocessing()
  File "/tools/DynamicBind/datasets/pdbbind.py", line 203, in preprocessing
    complex_names_all = read_strings_from_txt(self.split_path)
  File "/tools/DynamicBind/utils/utils.py", line 61, in read_strings_from_txt
    with open(path) as file:
IsADirectoryError: [Errno 21] Is a directory: './data/splits/timesplit_no_lig_overlap_train'
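One way to get a clearer failure than the IsADirectoryError above is to check the split paths before launching training. This is only a sketch, not part of DynamicBind; check_split_paths is a hypothetical helper:

```python
import os

def check_split_paths(paths):
    """Return a human-readable problem for each path that is not a regular file."""
    problems = []
    for name, path in paths.items():
        if os.path.isdir(path):
            # exactly the situation that later triggers IsADirectoryError in open()
            problems.append(f"{name}: {path} is a directory, expected a text file of complex names")
        elif not os.path.isfile(path):
            problems.append(f"{name}: {path} does not exist")
    return problems

# the default split arguments from utils/parsing.py
splits = {
    "split_train": "./data/splits/timesplit_no_lig_overlap_train",
    "split_val": "./data/splits/timesplit_no_lig_overlap_val",
    "split_test": "./data/splits/timesplit_test",
}
for problem in check_split_paths(splits):
    print(problem)
```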
@patjiang

patjiang commented Oct 26, 2024

Make sure you clone the workdir!
e.g.
(click this link): https://github.com/user-attachments/assets/9258e083-04a2-4e64-93cb-893012a4309b
or look at this screenshot: [image attachment]

Hope this helps!

@aTMRz

aTMRz commented Dec 6, 2024

After downloading this, I still can't find the split files?
