Training

Generic Training

This section describes the parameters common to all the models present in the Models class in NiftyTorch.

Parameters:

  • num_classes (int,required): The number of classes in the dataset.
  • in_channels (int,required): The number of channels in the input to the model.
  • data_folder (str,required): The path to the directory that contains the input data folder.
  • data_csv (str,required): The path to the CSV file containing each filename and its corresponding label.
  • data_transforms (torchvision.transforms,required): The torchvision transformations to be applied to the dataset.
  • filename_label (str,required): The label used to identify the input image filename.
  • class_label (str,required): The label used to identify the class information for the corresponding filename.
  • learning_rate (float,default = 3e-4): The learning rate to be used by the optimizer.
  • step_size (int,default = 7): The step size to be used by the step-based learning rate scheduler.
  • gamma (float,default = 0.2): The reduction factor to be used by the step-based learning rate scheduler.
  • cuda (str,default = None): The GPU to be used, for example 'cuda:3'.
  • batch_size (int,default = 1): The number of examples to be used in each gradient update.
  • image_scale (int,default = 128): The size to which each image is rescaled.
  • loss (torch.nn,default = nn.CrossEntropyLoss()): The loss function to be used for the task.
  • optimizer (torch.optim,default = optim.Adam): The optimizer used for updating the weights.
  • device_ids (list,default = []): The list of GPUs to be considered for data parallelization.
  • l2 (float,default = 0): The L2 regularization coefficient.
  • experiment_name (string,default = None): The full path to the TensorBoard directory.
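
To make the interaction between learning_rate, step_size and gamma concrete, the sketch below computes the learning rate at a given epoch under step decay. It assumes the trainer follows the rule used by PyTorch's StepLR scheduler (the rate is multiplied by gamma every step_size epochs); the helper name step_decay_lr is hypothetical and not part of NiftyTorch.

```python
def step_decay_lr(learning_rate, step_size, gamma, epoch):
    """Learning rate at a given epoch under step decay (assumed StepLR-style rule)."""
    # The rate is multiplied by gamma once for every completed block of step_size epochs.
    return learning_rate * gamma ** (epoch // step_size)


# Defaults from the parameter list: learning_rate = 3e-4, step_size = 7, gamma = 0.2
for epoch in (0, 6, 7, 14):
    print(epoch, step_decay_lr(3e-4, 7, 0.2, epoch))
```

With the defaults, the rate stays at 3e-4 for epochs 0 to 6, then drops by a factor of five at epoch 7 and again at epoch 14.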

Modules

AlexNet Training Parameters

Parameters:

  • channels (list,default = [1,2,2,2,1]): A list containing the out_channels of each convolutional layer; it must have length 5.
  • kernel_size (list,default = [3,5,3,3,1]): A list containing the kernel size of each convolutional layer; it must have length 5.
  • strides (list,default = [1,2,1,1,1]): A list containing the stride of each convolutional layer; it must have length 5.
  • padding (list,default = [0,1,1,1,1]): A list containing the padding of each convolutional layer; it must have length 5.
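
When choosing kernel_size, strides and padding, it can help to check how the spatial size of the feature map changes from layer to layer. The sketch below applies standard convolution arithmetic to the five layers. It assumes square inputs of side image_scale and a dilation of 1, and it ignores any pooling layers the actual NiftyTorch AlexNet may contain, so the real sizes can be smaller. The helper conv_output_sizes is hypothetical and not part of NiftyTorch.

```python
def conv_output_sizes(image_scale, kernel_size, strides, padding):
    """Spatial size after each convolution, ignoring any pooling layers."""
    if not (len(kernel_size) == len(strides) == len(padding) == 5):
        raise ValueError("kernel_size, strides and padding must each have length 5")
    sizes = []
    size = image_scale
    for k, s, p in zip(kernel_size, strides, padding):
        # Standard convolution arithmetic: floor((size + 2p - k) / s) + 1
        size = (size + 2 * p - k) // s + 1
        if size < 1:
            raise ValueError("feature map collapsed; use smaller kernels or strides")
        sizes.append(size)
    return sizes


# Default kernel_size, strides and padding with image_scale = 128
print(conv_output_sizes(128, [3, 5, 3, 3, 1], [1, 2, 1, 1, 1], [0, 1, 1, 1, 1]))
```

A ValueError from this check indicates that the chosen combination shrinks the feature map to nothing before the last layer.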

Usage:

import torch
from niftytorch.models.alexnet import train_alexnet
from torchvision import transforms

data_transforms = transforms.Compose([transforms.ToTensor()])

data_folder = "./data"
data_csv = "./data.csv"
train = train_alexnet()
train.set_params(
    num_classes=2, in_channels=1,
    data_folder=data_folder, data_csv=data_csv,
    channels=[1,2,4,2,1], kernel_size=[3,5,5,3,1],
    strides=[1,2,2,2,1], padding=[1,1,1,1,1],
    data_transforms=data_transforms,
    filename_label='Subject', class_label='Class',
    learning_rate=3e-4, step_size=7, gamma=0.1,
    cuda='cuda:3', batch_size=16, image_scale=128,
)
train.train()

VGGNet Training Parameters

Parameters:

  • version (str,default = "A"): The version can be 'A','B','D' or 'E'.
  • cfgs (list,default = same network parameters): A list containing the configurations.

Usage:

import torch
from niftytorch.models.vggnet import train_vggnet
from torchvision import transforms

data_transforms = transforms.Compose([transforms.ToTensor()])

data_folder = "./data"
data_csv = "./data.csv"
cfgs = {'B':[4, 'M', 8, 'M', 8, 8, 'M', 32, 32, 'M', 32, 64, 'M']}
train = train_vggnet()
train.set_params(
    num_classes=2, in_channels=1,
    data_folder=data_folder, data_csv=data_csv,
    version="B",
    data_transforms=data_transforms,
    filename_label='Subject', class_label='Class',
    learning_rate=3e-4, step_size=7, gamma=0.1,
    cuda='cuda:3', batch_size=16, image_scale=128,
    cfgs=cfgs,
)
train.train()
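
The cfgs entry in the usage above appears to follow the layer-list convention used by torchvision's VGG implementation: an integer is the out_channels of a convolutional layer and 'M' marks a max-pooling layer. Under that assumption, and further assuming that each convolution preserves the spatial size while each 'M' halves it, the sketch below summarizes a configuration. The helper summarize_cfg is hypothetical and not part of NiftyTorch.

```python
def summarize_cfg(cfg, image_scale):
    """Count conv layers and track spatial size for a VGG-style layer list."""
    conv_channels = []
    size = image_scale
    for entry in cfg:
        if entry == 'M':
            # Assumed: 'M' is a max-pooling layer that halves each spatial dimension.
            size //= 2
        else:
            # Any integer entry is taken as the out_channels of a convolutional layer.
            conv_channels.append(int(entry))
    return {
        "num_conv": len(conv_channels),
        "final_channels": conv_channels[-1] if conv_channels else None,
        "final_size": size,
    }


cfgs = {'B': [4, 'M', 8, 'M', 8, 8, 'M', 32, 32, 'M', 32, 64, 'M']}
print(summarize_cfg(cfgs['B'], image_scale=128))
```

For the configuration shown, five pooling steps reduce a 128-pixel input side by a factor of 32, which is worth checking before lowering image_scale.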

ResNet Training Parameters

Parameters:

  • block (default = BottleNeck): The type of network module to be used as the building block of the ResNet.
  • layers (list,default = [1,2,4,4]): The number of times each block is repeated in the ResNet.
  • stride (list,default = [2,1,2,2,2]): The stride to be used in each building block of the ResNet.
  • channels (list,default = [64,128,256,512]): The number of channels in each building block of the ResNet.

Usage:

import torch
from niftytorch.models.resnet import train_resnet
from torchvision import transforms
from niftytorch.layers.layers import bottleneck
data_transforms = transforms.Compose([transforms.ToTensor()])
# Note: the three lists below are not used by the call further down, which passes literal values.
layers = [1,2,1,1,2]
stride = [1,1,1,1,1]
channels = [32,64,64,32,32]
data_folder = "./data"
data_csv = "./data.csv"
train = train_resnet()
# Pass the imported bottleneck module as the building block.
train.set_params(
    num_classes=2, in_channels=1,
    data_folder=data_folder, data_csv=data_csv,
    block=bottleneck,
    data_transforms=data_transforms,
    filename_label='Subject', class_label='Class',
    layers=[1,2,4,4], stride=[2,1,2,2,2], channels=[64,128,256,512],
    learning_rate=3e-4, step_size=7, gamma=0.1,
    cuda='cuda:3', batch_size=16, image_scale=128,
)
train.train()

ShuffleNet Training Parameters

Parameters:

  • groups (default = 2): The number of groups to be used in the grouped 1x1 convolutions in each ShuffleUnit.
  • stage_repeats (list,default = [3,7,3]): The number of times each stage is repeated.

Usage:

import torch
from niftytorch.models.shufflenet import train_shufflenet
from torchvision import transforms
data_transforms = transforms.Compose([transforms.ToTensor()])
groups = 2
stage_repeats = [2,7,4]
data_folder = "./data"
data_csv = "./data.csv"
train = train_shufflenet()
train.set_params(
    num_classes=2, in_channels=1,
    data_folder=data_folder, data_csv=data_csv,
    groups=groups,
    data_transforms=data_transforms,
    filename_label='Subject', class_label='Class',
    stage_repeats=stage_repeats,
    learning_rate=3e-4, step_size=7, gamma=0.1,
    cuda='cuda:3', batch_size=16, image_scale=128,
)
train.train()

SqueezeNet Training Parameters

Parameters:

  • version (str,default = '1_0'): The version of squeezenet to be used.

Usage:

import torch
from niftytorch.models.squeezeNet import train_squeezenet
from torchvision import transforms
data_transforms = transforms.Compose([transforms.ToTensor()])
version = '1_1'
data_folder = "./data"
data_csv = "./data.csv"
train = train_squeezenet()
train.set_params(
    num_classes=2, in_channels=1,
    data_folder=data_folder, data_csv=data_csv,
    version=version,
    data_transforms=data_transforms,
    filename_label='Subject', class_label='Class',
    learning_rate=3e-4, step_size=7, gamma=0.1,
    cuda='cuda:3', batch_size=16, image_scale=128,
)
train.train()

XNOR NET Training Parameters

Parameters:

  • channels (list,default = [32, 96, 144, 144, 96, 7]): A list containing the out_channels of each convolutional layer; it must have length 5.
  • kernel_size (list,default = [11, 5, 3, 3, 3]): A list containing the kernel size of each convolutional layer; it must have length 5.
  • strides (list,default = [4, 1, 1, 1, 1]): A list containing the stride of each convolutional layer; it must have length 5.
  • padding (list,default = [0, 2, 1, 1, 1]): A list containing the padding of each convolutional layer; it must have length 5.
  • groups (list,default = [1, 1, 1, 1, 1]): A list containing the number of groups in the convolution filters of each convolutional layer; it must have length 5.

Usage:

import torch
from niftytorch.models.xnornet import train_xnornet
from torchvision import transforms
data_transforms = transforms.Compose([transforms.ToTensor()])
data_folder = "./data"
data_csv = "./data.csv"
train = train_xnornet()
train.set_params(
    num_classes=2, in_channels=1,
    data_folder=data_folder, data_csv=data_csv,
    channels=[32,96,144,42,6], kernel_size=[3,5,5,3,1],
    strides=[1,2,1,2,1], padding=[1,1,0,1,1],
    groups=[1,2,2,1,1],
    data_transforms=data_transforms,
    filename_label='Subject', class_label='Class',
    learning_rate=3e-4, step_size=7, gamma=0.1,
    cuda='cuda:3', batch_size=16, image_scale=128,
)
train.train()

Hyperparameter Training

Generic Hyperparameter Tuning parameters

For hyperparameter tuning, create a configuration dictionary whose keys are the parameters below.
These are the generic parameters for hyperparameter tuning:

  • learning_rate (bool/float): If False, the learning rate is tuned; for True or any float value it is not tuned.
  • lr_min (float): Minimum learning rate to be considered for tuning.
  • lr_max (float): Maximum learning rate to be considered for tuning.
  • batch_size (bool/int): If False, the batch size is tuned; for True or any int value it is not tuned.
  • data_folder (string): The path to the directory that contains the input data folder.
  • data_csv (string): The path to the CSV file containing each filename and its corresponding label.
  • gamma (float): The reduction factor to be used by the step-based learning rate scheduler.
  • num_classes (int): The number of classes in the dataset.
  • loss (bool): If False, the loss function is tuned; otherwise a single loss function is used.
  • step_size (float): The step size to be used by the step-based learning rate scheduler.
  • loss_list (list): The list of losses to be considered for training.
  • scheduler (bool): If False, the scheduler is tuned; otherwise a single scheduler is used.
  • scheduler_list (list): The list of schedulers to be considered for hyperparameter tuning.
  • optimizer (bool/torch.optim): If False, the optimizer is tuned; otherwise a single optimizer is used.
  • opt_list (list): The list of optimizers to be considered for optimization.
  • filename_label (str): The label used to identify the input image filename.
  • class_label (str): The label used to identify the class information for the corresponding filename.
  • in_channels (int): The number of channels in the input to the model.
  • num_workers (int): The number of threads to be used while loading the data.
  • image_scale (bool): The size of the image to be considered for rescaling the image.
  • image_scale_list (list): The list of image sizes to be considered for hyperparameter tuning.
  • device_ids (list): The list of devices to be considered for data parallelization.
  • cuda (str): The GPU to be considered for loading data.
  • l2 (float,default = 0): The L2 regularization coefficient.
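
As an illustration of the configuration dictionary described above, the sketch below fills in a subset of the generic keys with placeholder values. The entries that would hold torch objects (loss_list, scheduler_list, opt_list and their switches) are left out to keep the sketch dependency-free. Treating image_scale = False as "tune over image_scale_list" is an assumption made by analogy with learning_rate, and check_config is a hypothetical helper that is not part of NiftyTorch.

```python
# Illustrative tuning configuration; values are placeholders, not recommendations.
config = {
    "learning_rate": False,   # False: tune the learning rate between lr_min and lr_max
    "lr_min": 1e-5,
    "lr_max": 1e-2,
    "batch_size": 16,         # an int value: the batch size is not tuned
    "data_folder": "./data",
    "data_csv": "./data.csv",
    "filename_label": "Subject",
    "class_label": "Class",
    "num_classes": 2,
    "in_channels": 1,
    "step_size": 7,
    "gamma": 0.1,
    "image_scale": False,     # assumption: False means tune over image_scale_list
    "image_scale_list": [64, 128],
    "num_workers": 4,
    "device_ids": [],
    "cuda": "cuda:0",
    "l2": 0,
}


def check_config(cfg):
    """Hypothetical sanity checks run before handing the dictionary to a tuner."""
    problems = []
    if cfg["learning_rate"] is False and not cfg["lr_min"] < cfg["lr_max"]:
        problems.append("lr_min must be smaller than lr_max")
    if cfg["image_scale"] is False and not cfg["image_scale_list"]:
        problems.append("image_scale_list must not be empty when image_scale is tuned")
    return problems


print(check_config(config))
```

Model-specific keys, such as the AlexNet entries listed in the next subsection, would be added to the same dictionary.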

AlexNet Hyperparameter Tuning

These are hyperparameters for AlexNet:

  • channels (int/bool): If False, the hyperparameter tuning is considered for channels.
  • channels_1 (list): The list containing all values to be tested for channel 1.
  • channels_2 (list): The list containing all values to be tested for channel 2.
  • channels_3 (list): The list containing all values to be tested for channel 3.
  • channels_4 (list): The list containing all values to be tested for channel 4.
  • channels_5 (list): The list containing all values to be tested for channel 5.
  • strides (bool): If False, the hyperparameter tuning is considered for strides.
  • strides_1 (list): The list containing all values to be tested for strides 1.
  • strides_2 (list): The list containing all values to be tested for strides 2.
  • strides_3 (list): The list containing all values to be tested for strides 3.
  • strides_4 (list): The list containing all values to be tested for strides 4.
  • strides_5 (list): The list containing all values to be tested for strides 5.
  • kernel_size (bool): If False, the hyperparameter tuning is considered for kernel size.
  • kernel_size_1 (list): The list containing all values to be tested for kernel size 1.
  • kernel_size_2 (list): The list containing all values to be tested for kernel size 2.
  • kernel_size_3 (list): The list containing all values to be tested for kernel size 3.
  • kernel_size_4 (list): The list containing all values to be tested for kernel size 4.
  • kernel_size_5 (list): The list containing all values to be tested for kernel size 5.
  • padding (bool): If False, the hyperparameter tuning is considered for padding.
  • padding_1 (list): The list containing all values to be tested for padding 1.
  • padding_2 (list): The list containing all values to be tested for padding 2.
  • padding_3 (list): The list containing all values to be tested for padding 3.
  • padding_4 (list): The list containing all values to be tested for padding 4.
  • padding_5 (list): The list containing all values to be tested for padding 5.
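
Because every layer has its own candidate list, the number of possible settings grows multiplicatively. The sketch below enumerates the combinations for the five channels_N keys to show the size of the space. The candidate values are hypothetical, and full enumeration is used only for illustration; how NiftyTorch's tuner actually samples from these lists is not specified here.

```python
from itertools import product

# Hypothetical candidate lists for the five convolutional layers.
search_space = {
    "channels_1": [8, 16],
    "channels_2": [16, 32],
    "channels_3": [32, 64],
    "channels_4": [32, 64],
    "channels_5": [16, 32],
}

keys = sorted(search_space)
# Every combination picks one value per layer.
combinations = [
    dict(zip(keys, values))
    for values in product(*(search_space[k] for k in keys))
]

print(len(combinations))   # 2 ** 5 = 32 candidate channel settings
print(combinations[0])
```

Adding the strides, kernel_size and padding lists multiplies this count further, so short candidate lists keep the search tractable.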

ResNet Hyperparameter Tuning

These are hyperparameters for ResNet:

  • groups (bool): If True, groups is included in hyperparameter tuning.
  • groups_min (int): The minimum group value to be used for hyperparameter tuning.
  • groups_max (int): The maximum group value to be used for hyperparameter tuning.
  • block (bool): If True, the type of block in the ResNet is included in hyperparameter tuning.
  • block_list (list): The list of block types to be considered for hyperparameter tuning.
  • norm_layer (bool): If True, the type of normalization layer in the ResNet is included in hyperparameter tuning.
  • norm_layer_list (list): The list of normalization layer types to be considered for hyperparameter tuning.
  • width_per_group (bool): If True, the number of groups per layer in the ResNet is included in hyperparameter tuning.
  • width_per_group_list (list): The list containing different types of groups.

ShuffleNet Hyperparameter Tuning

These are hyperparameters for ShuffleNet:

  • groups (bool): If True, the number of groups per layer in the ShuffleNet is included in hyperparameter tuning.
  • groups_min (int): The minimum group value to be used for hyperparameter tuning.
  • groups_max (int): The maximum group value to be used for hyperparameter tuning.
  • stage_repeat (bool): If True, the number of stage repeats in the ShuffleNet is included in hyperparameter tuning.
  • stage_repeat_1 (list): The list containing all values to be tested for stage repeat 1.
  • stage_repeat_2 (list): The list containing all values to be tested for stage repeat 2.
  • stage_repeat_3 (list): The list containing all values to be tested for stage repeat 3.

XNOR NET Hyperparameter Tuning

  • channels (int/bool): If False, the hyperparameter tuning is considered for channels.
  • channels_1 (list): The list containing all values to be tested for channel 1.
  • channels_2 (list): The list containing all values to be tested for channel 2.
  • channels_3 (list): The list containing all values to be tested for channel 3.
  • channels_4 (list): The list containing all values to be tested for channel 4.
  • channels_5 (list): The list containing all values to be tested for channel 5.
  • strides (bool): If False, the hyperparameter tuning is considered for strides.
  • strides_1 (list): The list containing all values to be tested for strides 1.
  • strides_2 (list): The list containing all values to be tested for strides 2.
  • strides_3 (list): The list containing all values to be tested for strides 3.
  • strides_4 (list): The list containing all values to be tested for strides 4.
  • strides_5 (list): The list containing all values to be tested for strides 5.
  • kernel_size (bool): If False, the hyperparameter tuning is considered for kernel size.
  • kernel_size_1 (list): The list containing all values to be tested for kernel size 1.
  • kernel_size_2 (list): The list containing all values to be tested for kernel size 2.
  • kernel_size_3 (list): The list containing all values to be tested for kernel size 3.
  • kernel_size_4 (list): The list containing all values to be tested for kernel size 4.
  • kernel_size_5 (list): The list containing all values to be tested for kernel size 5.
  • padding (bool): If False, the hyperparameter tuning is considered for padding.
  • padding_1 (list): The list containing all values to be tested for padding 1.
  • padding_2 (list): The list containing all values to be tested for padding 2.
  • padding_3 (list): The list containing all values to be tested for padding 3.
  • padding_4 (list): The list containing all values to be tested for padding 4.
  • padding_5 (list): The list containing all values to be tested for padding 5.
  • groups (bool): If False, the hyperparameter tuning is considered for groups.
  • groups_1 (list): The list containing all values to be tested for groups 1.
  • groups_2 (list): The list containing all values to be tested for groups 2.
  • groups_3 (list): The list containing all values to be tested for groups 3.
  • groups_4 (list): The list containing all values to be tested for groups 4.
  • groups_5 (list): The list containing all values to be tested for groups 5.