Training
Generic Training
This section describes the parameters common to all the models present in the Models class in NiftyTorch.
Parameters:
- num_classes (int,required): The number of classes in the dataset.
- in_channels (int,required): The number of channels in the input to the model.
- data_folder (str,required): The path to the directory which contains the input data folder.
- data_csv (str,required): The path to the csv containing the filename and its corresponding label.
- data_transforms (torchvision.transforms,required): The transformations from torchvision to be applied to the dataset.
- filename_label (str,required): The label which is used to identify the input image file name.
- class_label (str,required): The label which is used to identify the class information for the corresponding filename.
- learning_rate (float,default = 3e-4): The learning rate used by the optimizer.
- step_size (int,default = 7): The step size to be used for the step based learning rate scheduler.
- gamma (float,default = 0.2): The reduction factor to be used in the step based learning rate scheduler.
- cuda (str,default = None): The GPU to be used, for example 'cuda:3'.
- batch_size (int,default = 1): The number of examples to be used in each gradient update.
- image_scale (int,default = 128): The size to which the image is rescaled.
- loss (torch.nn,default = nn.CrossEntropyLoss()): The loss function to be used for the required task.
- optimizer (torch.optim,default = optim.Adam): The optimizer which is used for updating weights.
- device_ids (list,default = []): The list of GPUs to be considered for data parallelization.
- l2 (float,default = 0): The l2 regularization coefficient.
- experiment_name (string,default = None): The entire path to the tensorboard directory.
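The interaction between learning_rate, step_size and gamma can be sketched in plain Python. This is a minimal illustration, assuming the step based scheduler behaves like PyTorch's StepLR (the rate is multiplied by gamma once every step_size epochs); the helper name step_decay_lr is hypothetical and not part of NiftyTorch.

```python
def step_decay_lr(initial_lr, step_size, gamma, epoch):
    """Learning rate at a given epoch under step based decay.

    Assumption: same rule as torch.optim.lr_scheduler.StepLR,
    i.e. the rate is multiplied by gamma every step_size epochs.
    """
    return initial_lr * gamma ** (epoch // step_size)


# Documented defaults: learning_rate = 3e-4, step_size = 7, gamma = 0.2
schedule = [step_decay_lr(3e-4, 7, 0.2, epoch) for epoch in (0, 6, 7, 14)]
for epoch, lr in zip((0, 6, 7, 14), schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.2e}")
```

With the defaults, the rate stays at its initial value for the first seven epochs and is then reduced by a factor of five at every seventh epoch.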
Modules
AlexNet Training Parameters
Parameters:
- channels (list,default = [1,2,2,2,1]): A list containing the out_channels for each convolutional layer, it must be of size 5.
- kernel_size (list,default = [3,5,3,3,1]): A list containing the kernel size of each convolutional layer, it must be of size 5.
- strides (list,default = [1,2,1,1,1]): A list containing the stride at each convolutional layer, it must be of size 5.
- padding (list,default = [0,1,1,1,1]): A list containing the padding for each convolutional layer, it must be of size 5.
Usage:
import torch
from niftytorch.models.alexnet import train_alexnet
from torchvision import transforms
data_transforms = transforms.Compose([transforms.ToTensor()])
data_folder = "./data"
data_csv = "./data.csv"
train = train_alexnet()
train.set_params(num_classes = 2, in_channels = 1, data_folder = data_folder, data_csv = data_csv,channels = [1,2,4,2,1],kernel_size = [3,5,5,3,1],strides = [1,2,2,2,1],padding = [1,1,1,1,1], data_transforms = data_transforms, filename_label = 'Subject',class_label = 'Class',learning_rate = 3e-4,step_size = 7, gamma = 0.1, cuda = 'cuda:3',batch_size = 16,image_scale = 128)
train.train()
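Because kernel_size, strides and padding are set per layer, it can help to check the resulting feature map sizes before training. The sketch below applies the standard convolution output size formula to each layer in turn. It assumes a dilation of 1 and ignores any pooling layers the actual AlexNet implementation may insert between convolutions, so the numbers are only indicative; conv_output_sizes is a hypothetical helper, not a NiftyTorch function.

```python
def conv_output_sizes(image_scale, kernel_size, strides, padding):
    """Spatial size after each convolution along one image dimension.

    Assumptions: dilation of 1 and no pooling between layers.
    """
    sizes = []
    size = image_scale
    for k, s, p in zip(kernel_size, strides, padding):
        # Standard formula: floor((size + 2 * padding - kernel) / stride) + 1
        size = (size + 2 * p - k) // s + 1
        if size < 1:
            raise ValueError("feature map collapsed; adjust kernel, stride or padding")
        sizes.append(size)
    return sizes


# Documented AlexNet defaults with image_scale = 128
default_sizes = conv_output_sizes(128, [3, 5, 3, 3, 1], [1, 2, 1, 1, 1], [0, 1, 1, 1, 1])
print(default_sizes)  # [126, 62, 62, 62, 64]
```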
VGGNet Training Parameters
Parameters:
- version (str,default = "A"): The version can be 'A','B','D' or 'E'.
- cfgs (list,default = same network parameters): A list containing the configurations.
Usage:
import torch
from niftytorch.models.vggnet import train_vggnet
from torchvision import transforms
data_transforms = transforms.Compose([transforms.ToTensor()])
data_folder = "./data"
data_csv = "./data.csv"
cfgs = {'B':[4, 'M', 8, 'M', 8, 8, 'M', 32, 32, 'M', 32, 64, 'M']}
train = train_vggnet()
train.set_params(num_classes = 2, in_channels = 1, data_folder = data_folder, data_csv = data_csv,version = "B", data_transforms = data_transforms, filename_label = 'Subject',class_label = 'Class',learning_rate = 3e-4,step_size = 7, gamma = 0.1, cuda = 'cuda:3',batch_size = 16,image_scale = 128,cfgs = cfgs)
train.train()
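The cfgs entry above can be read with a short helper. This sketch assumes the convention used by torchvision's VGG implementation, where each integer is the number of output channels of a convolution and 'M' marks a max-pooling layer that halves the spatial size; whether NiftyTorch follows exactly this convention is an assumption, and summarize_vgg_cfg is a hypothetical name.

```python
def summarize_vgg_cfg(cfg, image_scale):
    """Summarize a VGG-style configuration list.

    Assumptions: integers are convolution out_channels, 'M' is a
    2x2 max-pooling layer with stride 2, and the convolutions
    preserve the spatial size.
    """
    conv_channels = [item for item in cfg if item != 'M']
    num_pools = sum(1 for item in cfg if item == 'M')
    final_size = image_scale // (2 ** num_pools)
    return conv_channels, num_pools, final_size


cfg_b = [4, 'M', 8, 'M', 8, 8, 'M', 32, 32, 'M', 32, 64, 'M']
conv_channels, num_pools, final_size = summarize_vgg_cfg(cfg_b, image_scale=128)
print(conv_channels)  # [4, 8, 8, 8, 32, 32, 32, 64]
print(num_pools)      # 5
print(final_size)     # 4
```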
ResNet Training Parameters
Parameters:
- block (default = BottleNeck): The type of network module to be used as a building block in the resnet.
- layers (list,default = [1,2,4,4]): The number of times each block is repeated in the resnet.
- stride (list,default = [2,1,2,2,2]): The stride to be used in each building block of the resnet.
- channels (list,default = [64,128,256,512]): The number of channels to be maintained in each building block of the resnet.
Usage:
import torch
from niftytorch.models.resnet import train_resnet
from torchvision import transforms
from NiftyTorch.layers.layers import bottleneck
data_transforms = transforms.Compose([transforms.ToTensor()])
block = bottleneck  # building block passed to set_params below
layers = [1,2,1,1,2]
stride = [1,1,1,1,1]
channels = [32,64,64,32,32]
data_folder = "./data"
data_csv = "./data.csv"
train = train_resnet()
train.set_params(num_classes = 2, in_channels = 1, data_folder = data_folder, data_csv = data_csv,block = block, data_transforms = data_transforms, filename_label = 'Subject',class_label = 'Class',layers = [1,2,4,4],stride = [2,1,2,2,2],channels = [64,128,256,512],learning_rate = 3e-4,step_size = 7, gamma = 0.1, cuda = 'cuda:3',batch_size = 16,image_scale = 128)
train.train()
ShuffleNet Training Parameters
Parameters:
- groups (int,default = 2): The number of groups to be used in grouped 1x1 convolutions in each ShuffleUnit.
- stage_repeats (list,default = [3,7,3]): The number of times each stage is repeated.
Usage:
import torch
from niftytorch.models.shufflenet import train_shufflenet
from torchvision import transforms
data_transforms = transforms.Compose([transforms.ToTensor()])
groups = 2
stage_repeats = [2,7,4]
data_folder = "./data"
data_csv = "./data.csv"
train = train_shufflenet()
train.set_params(num_classes = 2, in_channels = 1, data_folder = data_folder, data_csv = data_csv,groups = groups, data_transforms = data_transforms, filename_label = 'Subject',class_label = 'Class',stage_repeats = stage_repeats,learning_rate = 3e-4,step_size = 7, gamma = 0.1, cuda = 'cuda:3',batch_size = 16,image_scale = 128)
train.train()
SqueezeNet Training Parameters
Parameters:
- version (str,default = '1_0'): The version of squeezenet to be used.
Usage:
import torch
from niftytorch.models.squeezeNet import train_squeezenet
from torchvision import transforms
data_transforms = transforms.Compose([transforms.ToTensor()])
version = '1_1'
data_folder = "./data"
data_csv = "./data.csv"
train = train_squeezenet()
train.set_params(num_classes = 2, in_channels = 1, data_folder = data_folder, data_csv = data_csv,version = version, data_transforms = data_transforms, filename_label = 'Subject',class_label = 'Class',learning_rate = 3e-4,step_size = 7, gamma = 0.1, cuda = 'cuda:3',batch_size = 16,image_scale = 128)
train.train()
XNOR NET Training Parameters
Parameters:
- channels (list,default = [32, 96, 144, 144, 96, 7]): A list containing the out_channels for each convolutional layer, it must be of size 5.
- kernel_size (list,default = [11, 5, 3, 3, 3]): A list containing the kernel size of each convolutional layer, it must be of size 5.
- strides (list,default = [4, 1, 1, 1, 1]): A list containing the stride at each convolutional layer, it must be of size 5.
- padding (list,default = [0, 2, 1, 1, 1]): A list containing the padding for each convolutional layer, it must be of size 5.
- groups (list,default = [1, 1, 1, 1, 1]): A list containing the number of groups in convolution filters for each convolutional layer, it must be of size 5.
Usage:
import torch
from niftytorch.models.xnornet import train_xnornet
from torchvision import transforms
data_transforms = transforms.Compose([transforms.ToTensor()])
data_folder = "./data"
data_csv = "./data.csv"
train = train_xnornet()
train.set_params(num_classes = 2, in_channels = 1, data_folder = data_folder, data_csv = data_csv,channels = [32,96,144,42,6],kernel_size = [3,5,5,3,1],strides = [1,2,1,2,1],padding = [1,1,0,1,1],groups = [1,2,2,1,1], data_transforms = data_transforms, filename_label = 'Subject',class_label = 'Class',learning_rate = 3e-4,step_size = 7, gamma = 0.1, cuda = 'cuda:3',batch_size = 16,image_scale = 128)
train.train()
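When groups is set per layer, PyTorch's grouped convolutions require both the input and the output channel count of a layer to be divisible by that layer's group count. The sketch below checks this for the lists used above. It assumes each layer's input channels equal the previous layer's out_channels, which may not match the actual XNOR NET wiring; check_groups is a hypothetical helper.

```python
def check_groups(in_channels, channels, groups):
    """Return the 1-based indices of layers whose groups value is invalid.

    Assumption: layer i receives the out_channels of layer i - 1.
    """
    problems = []
    previous = in_channels
    for index, (out_channels, group) in enumerate(zip(channels, groups), start=1):
        # Grouped convolution needs both counts divisible by the group count.
        if previous % group != 0 or out_channels % group != 0:
            problems.append(index)
        previous = out_channels
    return problems


# Values from the usage example above
problems = check_groups(1, [32, 96, 144, 42, 6], [1, 2, 2, 1, 1])
print(problems)  # []

# A single input channel cannot be split into two groups
print(check_groups(1, [32, 96, 144, 42, 6], [2, 1, 1, 1, 1]))  # [1]
```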
Hyperparameter Training
Generic Hyperparameter Tuning parameters
Hyperparameter tuning requires a configuration dictionary whose keys are the parameters below.
These are generic parameters for hyperparameter tuning:
- learning_rate (bool/float): If False, the learning rate is tuned; if True or a float value is given, it is not tuned.
- lr_min (float): Minimum learning rate to be considered for tuning.
- lr_max (float): Maximum learning rate to be considered for tuning.
- batch_size (bool/int): If False, the batch size is tuned; if True or an int value is given, it is not tuned.
- data_folder (string): The path to the directory which contains the input data folder.
- data_csv (string): The path to the csv containing the filename and its corresponding label.
- gamma (float): The reduction factor to be used in the step based learning rate scheduler.
- num_classes (int): The number of classes in the dataset.
- loss (bool): If False, the loss function is tuned; otherwise a single loss function is used.
- step_size (float): The step size to be used for the step based learning rate scheduler.
- loss_list (list): The list of losses to be considered for training.
- scheduler (bool): If False, the scheduler is tuned; otherwise a single scheduler function is used.
- scheduler_list (list): The list of schedulers to be considered for hyperparameter tuning.
- optimizer (bool/torch.optim): If False, the optimizer is tuned; otherwise a single optimizer function is used.
- opt_list (list): The list of optimizers to be considered for hyperparameter tuning.
- filename_label (str): The label which used to identify the input image file name.
- class_label (str): The label which is used to identify the class information for the corresponding filename.
- in_channels (int): The number of channels in the input to the model.
- num_workers (int): The number of threads to be used while loading the data.
- image_scale (bool): The size of the image to be considered for rescaling the image.
- image_scale_list (list): The list of the image sizes to be considered for hyperparameter tuning.
- device_ids (list): The list of devices to be considered for data parallelization.
- cuda (str): The GPU to be considered for loading data.
- l2 (float,default = 0): The l2 regularization coefficient.
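A configuration dictionary built from the keys above might look like the following sketch. All values are illustrative assumptions, not recommended settings, and the reading of the boolean flags (False meaning the parameter is tuned) follows the descriptions above. The list-valued keys loss_list, scheduler_list and opt_list are left out because they would hold PyTorch objects.

```python
# Hypothetical example values; only the key names come from the list above.
config = {
    "learning_rate": False,      # False: tune the learning rate
    "lr_min": 1e-5,              # lower bound of the search range
    "lr_max": 1e-2,              # upper bound of the search range
    "batch_size": 16,            # fixed value, not tuned
    "data_folder": "./data",
    "data_csv": "./data.csv",
    "gamma": 0.1,
    "num_classes": 2,
    "loss": True,                # use a single loss function
    "step_size": 7,
    "scheduler": True,           # use a single scheduler
    "optimizer": True,           # use a single optimizer
    "filename_label": "Subject",
    "class_label": "Class",
    "in_channels": 1,
    "num_workers": 4,
    "image_scale": 128,
    "device_ids": [],
    "cuda": "cuda:0",
    "l2": 0,
}

# Basic sanity check before handing the dictionary to a tuning routine.
if config["learning_rate"] is False:
    assert config["lr_min"] < config["lr_max"], "lr_min must be below lr_max"
print(sorted(config))
```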
AlexNet Hyperparameter Tuning
These are hyperparameters for AlexNet:
- channels (int/bool): If False, the hyperparameter tuning is considered for channels.
- channels_1 (list): The list containing all values to be tested for channel 1.
- channels_2 (list): The list containing all values to be tested for channel 2.
- channels_3 (list): The list containing all values to be tested for channel 3.
- channels_4 (list): The list containing all values to be tested for channel 4.
- channels_5 (list): The list containing all values to be tested for channel 5.
- strides (bool): If False, the hyperparameter tuning is considered for strides.
- strides_1 (list): The list containing all values to be tested for strides 1.
- strides_2 (list): The list containing all values to be tested for strides 2.
- strides_3 (list): The list containing all values to be tested for strides 3.
- strides_4 (list): The list containing all values to be tested for strides 4.
- strides_5 (list): The list containing all values to be tested for strides 5.
- kernel_size (bool): If False, the hyperparameter tuning is considered for kernel size.
- kernel_size_1 (list): The list containing all values to be tested for kernel size 1.
- kernel_size_2 (list): The list containing all values to be tested for kernel size 2.
- kernel_size_3 (list): The list containing all values to be tested for kernel size 3.
- kernel_size_4 (list): The list containing all values to be tested for kernel size 4.
- kernel_size_5 (list): The list containing all values to be tested for kernel size 5.
- padding (bool): If False, the hyperparameter tuning is considered for padding.
- padding_1 (list): The list containing all values to be tested for padding 1.
- padding_2 (list): The list containing all values to be tested for padding 2.
- padding_3 (list): The list containing all values to be tested for padding 3.
- padding_4 (list): The list containing all values to be tested for padding 4.
- padding_5 (list): The list containing all values to be tested for padding 5.
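Since every layer has its own list of candidate values, the number of possible configurations grows multiplicatively. The sketch below counts and enumerates the combinations for the channels lists only. The candidate values are made up for illustration, and the actual tuner may sample from these lists rather than enumerate them exhaustively.

```python
from itertools import product
from math import prod

# Hypothetical candidate values for each of the five convolutional layers.
channel_choices = {
    "channels_1": [8, 16],
    "channels_2": [16, 32],
    "channels_3": [32, 64],
    "channels_4": [32],
    "channels_5": [16, 32],
}

# Size of the search space is the product of the list lengths.
num_combinations = prod(len(values) for values in channel_choices.values())

# Each combination is one candidate `channels` list for the model.
combinations = [list(combo) for combo in product(*channel_choices.values())]

print(num_combinations)  # 16
print(combinations[0])   # [8, 16, 32, 32, 16]
```

Adding the strides, kernel_size and padding lists multiplies this count further, so keeping each list short keeps the search tractable.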
ResNet Hyperparameter Tuning
These are hyperparameters for ResNet:
- groups (bool): If True, the hyperparameter tuning is considered for groups.
- groups_min (int): The minimum group value to be used for hyperparameter tuning.
- groups_max (int): The maximum group value to be used for hyperparameter tuning.
- block (bool): If True, the type of block in ResNet is considered for hyperparameter tuning.
- block_list (list): The list containing different types of blocks to be considered for hyperparameter tuning.
- norm_layer (bool): If True, the type of normalization layer in ResNet is considered for hyperparameter tuning.
- norm_layer_list (list): The list containing different types of normalization layers to be considered for hyperparameter tuning.
- width_per_group (bool): If True, the number of groups per each layer in ResNet is considered for hyperparameter tuning.
- width_per_group_list (list): The list containing different types of groups.
ShuffleNet Hyperparameter Tuning
These are hyperparameters for ShuffleNet:
- groups (bool): If True, the number of groups per each layer in ShuffleNet is considered for hyperparameter tuning.
- groups_min (int): The minimum group value to be used for hyperparameter tuning.
- groups_max (int): The maximum group value to be used for hyperparameter tuning.
- stage_repeat (bool): If True, the number of stage repeats in ShuffleNet is considered for hyperparameter tuning.
- stage_repeat_1 (list): The list containing all values to be tested for stage repeat 1.
- stage_repeat_2 (list): The list containing all values to be tested for stage repeat 2.
- stage_repeat_3 (list): The list containing all values to be tested for stage repeat 3.
XNOR NET Hyperparameter Tuning
- channels (int/bool): If False, the hyperparameter tuning is considered for channels.
- channels_1 (list): The list containing all values to be tested for channel 1.
- channels_2 (list): The list containing all values to be tested for channel 2.
- channels_3 (list): The list containing all values to be tested for channel 3.
- channels_4 (list): The list containing all values to be tested for channel 4.
- channels_5 (list): The list containing all values to be tested for channel 5.
- strides (bool): If False, the hyperparameter tuning is considered for strides.
- strides_1 (list): The list containing all values to be tested for strides 1.
- strides_2 (list): The list containing all values to be tested for strides 2.
- strides_3 (list): The list containing all values to be tested for strides 3.
- strides_4 (list): The list containing all values to be tested for strides 4.
- strides_5 (list): The list containing all values to be tested for strides 5.
- kernel_size (bool): If False, the hyperparameter tuning is considered for kernel size.
- kernel_size_1 (list): The list containing all values to be tested for kernel size 1.
- kernel_size_2 (list): The list containing all values to be tested for kernel size 2.
- kernel_size_3 (list): The list containing all values to be tested for kernel size 3.
- kernel_size_4 (list): The list containing all values to be tested for kernel size 4.
- kernel_size_5 (list): The list containing all values to be tested for kernel size 5.
- padding (bool): If False, the hyperparameter tuning is considered for padding.
- padding_1 (list): The list containing all values to be tested for padding 1.
- padding_2 (list): The list containing all values to be tested for padding 2.
- padding_3 (list): The list containing all values to be tested for padding 3.
- padding_4 (list): The list containing all values to be tested for padding 4.
- padding_5 (list): The list containing all values to be tested for padding 5.
- groups (bool): If False, the hyperparameter tuning is considered for groups.
- groups_1 (list): The list containing all values to be tested for groups 1.
- groups_2 (list): The list containing all values to be tested for groups 2.
- groups_3 (list): The list containing all values to be tested for groups 3.
- groups_4 (list): The list containing all values to be tested for groups 4.
- groups_5 (list): The list containing all values to be tested for groups 5.