from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Values below follow the explanation further down (150x150 images, batches of 32);
# the directory path is a placeholder, adjust it to your dataset layout
# (one sub-folder per class).
train_data_dir = 'data/train'
img_height, img_width = 150, 150
batch_size = 32

train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.2)  # set validation split

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary',
    subset='training')  # set as training data

validation_generator = train_datagen.flow_from_directory(
    train_data_dir,  # same directory as training data
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary',
    subset='validation')  # set as validation data
Here is what the above code is doing:
1. We are using the ImageDataGenerator class to generate batches of tensor image data with real-time data augmentation.
2. We are rescaling the pixel values from the [0, 255] range to the [0, 1] interval (neural networks generally train better on small input values).
3. We are using the flow_from_directory method to generate batches of image data (and their labels) directly from the JPEG files in their respective class folders.
4. We are using the same generator (and the same directory) for both the training and validation data; the validation_split=0.2 argument reserves 20% of the images for the validation subset. A short sketch after this list shows how the two generators feed a model.
5. We are setting the class_mode argument to binary because we have only two classes to predict.
6. We are setting the batch_size argument to 32.
7. We are setting the target_size argument to the desired size of our images, in this case, 150×150.
8. We are setting the subset argument to “training” in order to use the training data.
9. We are setting the subset argument to “validation” in order to use the validation data.
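To make the flow concrete, here is a minimal sketch of how these two generators are typically consumed. Only the generator variables come from the snippet above; the small CNN, the epoch count, and the printed shapes are illustrative assumptions, not part of the original code.

from tensorflow.keras import layers, models

# Inspect one batch to confirm what the generator produces: images have shape
# (batch_size, img_height, img_width, 3) and labels have shape (batch_size,)
# because class_mode='binary'.
images, labels = next(train_generator)
print(images.shape, labels.shape)      # e.g. (32, 150, 150, 3) (32,)
print(train_generator.class_indices)   # folder-name -> 0/1 mapping

# A small illustrative CNN; any binary classifier ending in a single sigmoid
# unit would work here.
model = models.Sequential([
    layers.Input(shape=(img_height, img_width, 3)),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Both subsets come from the same directory, split 80/20 by validation_split=0.2.
model.fit(
    train_generator,
    validation_data=validation_generator,
    epochs=10)  # epoch count is arbitrary for this sketch

Because class_mode='binary', the labels arrive as a flat vector of 0s and 1s (one per image in the batch), which is why the loss here is binary_crossentropy and the model ends in a single sigmoid unit.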