Description
I tried to train the model on my own dataset and got the following error:
UserWarning: Please use the new API settings to control TF32 behavior, such as torch.backends.cudnn.conv.fp32_precision = 'tf32' or torch.backends.cuda.matmul.fp32_precision = 'ieee'. Old settings, e.g, torch.backends.cuda.matmul.allow_tf32 = True, torch.backends.cudnn.allow_tf32 = True, allowTF32CuDNN() and allowTF32CuBLAS() will be deprecated after Pytorch 2.9. Please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at C:\actions-runner_work\pytorch\pytorch\pytorch\aten\src\ATen\Context.cpp:85.)
Using a different number of positional encodings than DINOv2, which means we're not loading DINOv2 backbone weights. This is not a problem if finetuning a pretrained RF-DETR model.
Using patch size 16 instead of 14, which means we're not loading DINOv2 backbone weights. This is not a problem if finetuning a pretrained RF-DETR model.
Loading pretrain weights
Unable to initialize TensorBoard. Logging is turned off for this session. Run 'pip install tensorboard' to enable logging.
Not using distributed mode
git:
sha: 443f480e2406840b1024296e5b4199c74a70a0d0, status: has uncommited changes, branch: main
Namespace(num_classes=2, grad_accum_steps=4, amp=True, lr=0.0001, lr_encoder=0.00015, batch_size=4, weight_decay=0.0001, epochs=10, lr_drop=100, clip_max_norm=0.1, lr_vit_layer_decay=0.8, lr_component_decay=0.7, do_benchmark=False, dropout=0, drop_path=0.0, drop_mode='standard', drop_schedule='constant', cutoff_epoch=0, pretrained_encoder=None, pretrain_weights='rf-detr-nano.pth', pretrain_exclude_keys=None, pretrain_keys_modify_to_load=None, pretrained_distiller=None, encoder='dinov2_windowed_small', vit_encoder_num_layers=12, window_block_indexes=None, position_embedding='sine', out_feature_indexes=[3, 6, 9, 12], freeze_encoder=False, layer_norm=True, rms_norm=False, backbone_lora=False, force_no_pretrain=False, dec_layers=2, dim_feedforward=2048, hidden_dim=256, sa_nheads=8, ca_nheads=16, num_queries=300, group_detr=13, two_stage=True, projector_scale=['P4'], lite_refpoint_refine=True, num_select=300, dec_n_points=2, decoder_norm='LN', bbox_reparam=True, freeze_batch_norm=False, set_cost_class=2, set_cost_bbox=5, set_cost_giou=2, cls_loss_coef=1.0, bbox_loss_coef=5, giou_loss_coef=2, focal_alpha=0.25, aux_loss=True, sum_group_losses=False, use_varifocal_loss=False, use_position_supervised_loss=False, ia_bce_loss=True, dataset_file='roboflow', coco_path=None, dataset_dir='C:\codes\obstacle_detector\data\dataset', square_resize_div_64=True, output_dir='output', dont_save_weights=False, checkpoint_interval=10, seed=42, resume='', start_epoch=0, eval=False, use_ema=True, ema_decay=0.993, ema_tau=100, num_workers=2, device='cuda', world_size=1, dist_url='env://', sync_bn=True, fp16_eval=False, encoder_only=False, backbone_only=False, resolution=384, use_cls_token=False, multi_scale=True, expanded_scales=True, do_random_resize_via_padding=False, warmup_epochs=0.0, lr_scheduler='step', lr_min_factor=0.0, early_stopping=False, early_stopping_patience=10, early_stopping_min_delta=0.001, early_stopping_use_ema=False, gradient_checkpointing=False, patch_size=16, num_windows=2, positional_encoding_size=24, mask_downsample_ratio=4, tensorboard=True, wandb=False, project=None, run=None, class_names=['obstacle'], run_test=True, segmentation_head=False, distributed=False)
number of params: 30147076
[544]
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[544]
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
[544]
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Get benchmark
Start training
Grad accum steps: 4
Total batch size: 16
LENGTH OF DATA LOADER: 43
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "c:\codes\obstacle_detector\rfdetr\train.py", line 5, in
model.train(
File "C:\codes\obstacle_detector\rfdetr\venv\lib\site-packages\rfdetr\detr.py", line 83, in train
self.train_from_config(config, **kwargs)
File "C:\codes\obstacle_detector\rfdetr\venv\lib\site-packages\rfdetr\detr.py", line 191, in train_from_config
self.model.train(
File "C:\codes\obstacle_detector\rfdetr\venv\lib\site-packages\rfdetr\main.py", line 341, in train
train_stats = train_one_epoch(
File "C:\codes\obstacle_detector\rfdetr\venv\lib\site-packages\rfdetr\engine.py", line 88, in train_one_epoch
for data_iter_step, (samples, targets) in enumerate(
File "C:\codes\obstacle_detector\rfdetr\venv\lib\site-packages\rfdetr\util\misc.py", line 239, in log_every
for obj in iterable:
File "C:\codes\obstacle_detector\rfdetr\venv\lib\site-packages\torch\utils\data\dataloader.py", line 494, in iter
return self._get_iterator()
File "C:\codes\obstacle_detector\rfdetr\venv\lib\site-packages\torch\utils\data\dataloader.py", line 427, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "C:\codes\obstacle_detector\rfdetr\venv\lib\site-packages\torch\utils\data\dataloader.py", line 1170, in init
w.start()
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\context.py", line 336, in _Popen
return Popen(process_obj)
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\popen_spawn_win32.py", line 45, in init
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "C:\Users\0000018283959\AppData\Local\Programs\Python\Python310\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...

The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
The code I used was:
from rfdetr.detr import RFDETRNano

model = RFDETRNano(pretrain_weights='rf-detr-nano.pth', device='cuda')
model.train(
    dataset_dir='dataset',
    epochs=10,
    batch_size=4,
    grad_accum_steps=4,
    lr=1e-4,
    output_dir='output'
)
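For context, this is the standard Windows spawn-start error: the DataLoader workers (num_workers=2 in the config dump above) re-import the main module, and without a main guard the top-level model.train(...) call runs again inside each worker before bootstrapping finishes. A minimal sketch of the same script wrapped in the if __name__ == '__main__': guard that the RuntimeError asks for (same arguments and file names as above, nothing else changed) looks like:

from rfdetr.detr import RFDETRNano


def main():
    # Same model and arguments as in the failing script above.
    model = RFDETRNano(pretrain_weights='rf-detr-nano.pth', device='cuda')
    model.train(
        dataset_dir='dataset',
        epochs=10,
        batch_size=4,
        grad_accum_steps=4,
        lr=1e-4,
        output_dir='output',
    )


if __name__ == '__main__':
    # On Windows, multiprocessing uses the "spawn" start method, which
    # re-imports this module in every DataLoader worker. The guard keeps
    # the training call from being re-executed in those child processes.
    main()

Setting num_workers=0 in the training config should also sidestep spawning worker processes entirely, but the guard is the idiomatic fix the error message points at.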
Environment:
Windows 11
rfdetr 1.3.0
torch 2.9.0+cu130
torchvision 0.24.0+cu130