Complete two-stage classification of the cats-vs-dogs dataset, compare the accuracy of the models obtained from the two training stages, and train a model using data augmentation
1. Import the Keras library and check its version:
import keras
keras.__version__
2. Download the dataset and place it in the corresponding directory. The original dataset contains 25,000 images of cats and dogs (12,500 per class) and is 543 MB compressed. After downloading and uncompressing it, we will create a new dataset containing three subsets: a training set with 1,000 samples per class, a validation set with 500 samples per class, and finally a test set with 500 samples per class. Here are a few lines of code to do this:
import os, shutil
root_dir = os.getcwd()
data_path = os.path.join(root_dir, 'data')
# The path to the directory where the original
# dataset was uncompressed
original_dataset_dir = os.path.join(data_path, 'train')
# The directory where we will
# store our smaller dataset
base_dir = os.path.join(data_path, 'cats_and_dogs_small')
os.mkdir(base_dir)
# Directories for our training,
# validation and test splits
train_dir = os.path.join(base_dir, 'train')
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir, 'validation')
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir, 'test')
os.mkdir(test_dir)
# Directory with our training cat pictures
train_cats_dir = os.path.join(train_dir, 'cats')
os.mkdir(train_cats_dir)
# Directory with our training dog pictures
train_dogs_dir = os.path.join(train_dir, 'dogs')
os.mkdir(train_dogs_dir)
# Directory with our validation cat pictures
validation_cats_dir = os.path.join(validation_dir, 'cats')
os.mkdir(validation_cats_dir)
# Directory with our validation dog pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs')
os.mkdir(validation_dogs_dir)
# Directory with our test cat pictures
test_cats_dir = os.path.join(test_dir, 'cats')
os.mkdir(test_cats_dir)
# Directory with our test dog pictures
test_dogs_dir = os.path.join(test_dir, 'dogs')
os.mkdir(test_dogs_dir)
# Copy first 1000 cat images to train_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_cats_dir, fname)
    shutil.copyfile(src, dst)
# Copy next 500 cat images to validation_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_cats_dir, fname)
    shutil.copyfile(src, dst)
# Copy next 500 cat images to test_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_cats_dir, fname)
    shutil.copyfile(src, dst)
# Copy first 1000 dog images to train_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_dogs_dir, fname)
    shutil.copyfile(src, dst)
# Copy next 500 dog images to validation_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_dogs_dir, fname)
    shutil.copyfile(src, dst)
# Copy next 500 dog images to test_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_dogs_dir, fname)
    shutil.copyfile(src, dst)
3. Count how many images are in each split (train/validation/test):
print('total training cat images:', len(os.listdir(train_cats_dir)))
print('total training dog images:',len(os.listdir(train_dogs_dir)))
print('total validation cat images:', len(os.listdir(validation_cats_dir)))
print('total validation dog images:',len(os.listdir(validation_dogs_dir)))
print('total test cat images:',len(os.listdir(test_cats_dir)))
print('total test dog images:',len(os.listdir(test_dogs_dir)))
4. Build the network. Note that the depth of the feature maps progressively increases in the network (from 32 to 128), while their size progressively decreases (from 148x148 down to 7x7). You will see this pattern in almost all convnets.
from keras import layers
from keras import models
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(64,(3,3), activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(128,(3,3), activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(128,(3,3), activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
5. The feature map dimensions change with each successive layer:
model.summary()
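As a sanity check on the summary, the spatial sizes can be computed by hand: each 'valid' 3x3 convolution shrinks a side by 2, and each 2x2 max-pool halves it (integer division). A minimal sketch of that arithmetic:

```python
# Hand-computed spatial sizes for the four Conv2D + MaxPooling2D stages above:
# a 'valid' 3x3 convolution shrinks each side by 2,
# a 2x2 max-pool halves it (integer division).
def conv3x3(side):
    return side - 2

def pool2x2(side):
    return side // 2

sizes = []
side = 150
for _ in range(4):
    side = pool2x2(conv3x3(side))
    sizes.append(side)

print(sizes)  # [74, 36, 17, 7] -- ending at 7x7, as noted above
```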
6. For the compilation step, we will use the RMSprop optimizer as usual. Because the network ends with a single sigmoid unit, we will use binary crossentropy as the loss:
from keras import optimizers
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
7. Data preprocessing
Our data currently sits on the drive as JPEG files, so the steps for getting it into the network are roughly as follows: read the picture files; decode the JPEG content into RGB grids of pixels; convert these into floating-point tensors; rescale the pixel values (between 0 and 255) to the [0, 1] interval.
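The rescaling step can be illustrated in isolation with a minimal pure-Python sketch (the real pipeline below does this via ImageDataGenerator):

```python
# A stand-in for one decoded JPEG row: integer pixel values in [0, 255]
pixels = [0, 51, 128, 255]

# The same rescaling that ImageDataGenerator(rescale=1./255) applies:
# map [0, 255] onto the [0, 1] interval
scaled = [p / 255.0 for p in pixels]

print(scaled[0], scaled[-1])  # 0.0 1.0
```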
from keras.preprocessing.image import ImageDataGenerator
# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    # This is the target directory
    train_dir,
    # All images will be resized to 150x150
    target_size=(150, 150),
    batch_size=20,
    # Since we use binary_crossentropy loss, we need binary labels
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')
The generator yields batches of 150x150 RGB images (shape (20, 150, 150, 3)) together with binary labels (shape (20,)); 20 is the number of samples in each batch (the batch size).
for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break
8. Fit the model to the data using the generator. Because the generator yields data endlessly, it needs to be told how many samples to draw per epoch: with batches of 20 samples, it takes 100 batches to reach our target of 2,000 training samples.
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)
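The batch arithmetic behind steps_per_epoch and validation_steps can be checked directly (a quick sketch of the numbers stated above):

```python
# With batches of 20 samples, 100 steps cover the 2000 training
# images and 50 validation steps cover the 1000 validation images.
batch_size = 20
train_samples = 2 * 1000    # 1000 cats + 1000 dogs
val_samples = 2 * 500       # 500 cats + 500 dogs

steps_per_epoch = train_samples // batch_size
validation_steps = val_samples // batch_size

print(steps_per_epoch, validation_steps)  # 100 50
```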