
CS231n Computer Vision Assignment 1, Q1: Writing a K-Nearest-Neighbor Classifier (How to Start the Assignment)

Updated: 2023-05-15 08:04:04


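1 Getting started

Download the assignment package from the course website.

2 Download the dataset

You need the CIFAR-10 dataset. On Linux you can run the following commands directly:

cd cs231n/datasets

./get_datasets.sh

On Windows you can run the script with Git, or simply open get_datasets.sh in Notepad. Its contents are: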

# Get CIFAR10

wget o.edu/~kriz/

tar -xzvf

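Here # marks a comment.

wget downloads a file; the download is started with wget url.

tar extracts the archive.

rm deletes the archive afterwards.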

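You can also paste the URL into a download manager such as Xunlei, or open it directly in a browser. After downloading and extracting, the archive ends up inside the datasets folder.

Note that the notebook later reads the dataset from cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'. That folder sits inside cifar-10-python, so it has to be moved out until it lies directly under cs231n/datasets.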
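The folder move described above can also be scripted. The following is a minimal sketch, not part of the assignment: the helper name lift_batches_dir is hypothetical, and it assumes the archive was unpacked into a wrapper folder called cifar-10-python, as described above. The demonstration runs on a temporary directory so that no real files are touched.

```python
import os
import shutil
import tempfile


def lift_batches_dir(datasets_dir):
    """Move cifar-10-batches-py up so it sits directly under datasets_dir.

    Assumption: the archive was unpacked into a wrapper folder named
    'cifar-10-python' inside datasets_dir.
    """
    target = os.path.join(datasets_dir, "cifar-10-batches-py")
    nested = os.path.join(datasets_dir, "cifar-10-python", "cifar-10-batches-py")
    if os.path.isdir(target):
        return target  # already where the notebook expects it
    if not os.path.isdir(nested):
        raise FileNotFoundError("cifar-10-batches-py not found under " + datasets_dir)
    shutil.move(nested, target)
    return target


# Demonstration on a throwaway directory that mimics the nested layout.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "cifar-10-python", "cifar-10-batches-py"))
    moved_to = lift_batches_dir(tmp)
    moved_ok = os.path.isdir(moved_to)
    moved_name = os.path.basename(moved_to)

print(moved_ok, moved_name)
```

On a real checkout the call would be lift_batches_dir('cs231n/datasets'), run from the assignment root.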

3 Start coding

# Run some setup code for this notebook.
import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt

# This is a bit of magic to make matplotlib figures appear inline in the notebook
# rather than in a new window.
%matplotlib inline
plt.rcParams['image.cmap'] = 'gray'

# Some more magic so that the notebook will reload external python modules;
# see the Stack Overflow question /questions/1907993/autoreload-of-modules-in-ipython
# In other words, after a function imported from an external file has been edited,
# calling it from the notebook runs the edited version.
%load_ext autoreload
%autoreload 2

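A problem came up at this step. Following advice found online, I downgraded scipy to version 1.20, but that did not help. Installing the pillow package resolved it.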

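Notes on the code (not code to run)

The two lines

%load_ext autoreload

%autoreload 2

load IPython's autoreload extension and switch it to mode 2, in which all imported modules are reloaded before each piece of code is executed.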

4 Load the raw CIFAR-10 data

# Load the raw CIFAR-10 data.
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'

# Cleaning up variables to prevent loading data multiple times (which may cause memory issue)
try:
    del X_train, y_train
    del X_test, y_test
    print('Clear previously loaded data.')
except NameError:
    # Nothing was loaded yet, so there is nothing to delete.
    pass

X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

# As a sanity check, we print out the size of the training and test data.
print('Training data shape: ', X_train.shape)
print('Training labels shape: ', y_train.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

load_CIFAR10 is defined in data_utils.py, which ships with the assignment package:

import os
import numpy as np

def load_CIFAR10(ROOT):
    """ load all of cifar """
    xs = []
    ys = []
    # The training set is split into five files, data_batch_1 to data_batch_5.
    for b in range(1, 6):
        f = os.path.join(ROOT, 'data_batch_%d' % (b,))
        X, Y = load_CIFAR_batch(f)
        xs.append(X)
        ys.append(Y)
    Xtr = np.concatenate(xs)
    Ytr = np.concatenate(ys)
    del X, Y
    Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, 'test_batch'))
    return Xtr, Ytr, Xte, Yte

def load_CIFAR_batch(filename):
    """ load single batch of cifar """
    with open(filename, 'rb') as f:
        datadict = load_pickle(f)  # a dict
        X = datadict['data']       # X: ndarray of pixel values
        Y = datadict['labels']     # Y: list of class labels
        # reshape: turn each flat row into 3 channels of 32x32 pixels,
        #          giving an array of shape (10000, 3, 32, 32)
        # transpose: reorder the axes to (image, row, column, channel)
        # astype: copy the array and cast it to the given type
        X = X.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")
        Y = np.array(Y)
        return X, Y

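load_CIFAR_batch relies on the load_pickle helper:

import pickle
import platform

def load_pickle(f):
    version = platform.python_version_tuple()
    if version[0] == '2':
        return pickle.load(f)
    elif version[0] == '3':
        return pickle.load(f, encoding='latin1')
    raise ValueError("invalid python version: {}".format(version))

The helper checks which Python version is running, and pickle.load deserializes the file contents into Python data types.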
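The reshape and transpose steps in load_CIFAR_batch can be checked on a small synthetic array without downloading anything. This is a sketch for illustration only: the helper name rows_to_images and the fake data are hypothetical, and it relies on the CIFAR-10 layout in which each 3072-value row stores 1024 red values, then 1024 green, then 1024 blue.

```python
import numpy as np


def rows_to_images(flat_rows):
    """Convert CIFAR-style rows of shape (N, 3072) to images of shape (N, 32, 32, 3)."""
    n = flat_rows.shape[0]
    # reshape splits each row into 3 channels of 32x32;
    # transpose moves the channel axis to the end.
    return flat_rows.reshape(n, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")


# Two synthetic "images": one pure red, one pure blue.
fake = np.zeros((2, 3072), dtype=np.uint8)
fake[0, :1024] = 255   # image 0: red channel fully on
fake[1, 2048:] = 255   # image 1: blue channel fully on

images = rows_to_images(fake)
print(images.shape)  # (2, 32, 32, 3)
```

After the conversion, images[0] holds 255 in channel 0 at every pixel and images[1] holds 255 in channel 2, which is the layout plt.imshow expects.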

5 Look at samples from the dataset

Here we randomly pick a few samples from each class in the training set and display them.

# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()

# Subsample the data for more efficient code execution in this exercise
num_training = 5000
mask = list(range(num_training))
X_train = X_train[mask]
y_train = y_train[mask]

num_test = 500
mask = list(range(num_test))
X_test = X_test[mask]
y_test = y_test[mask]

# Reshape the image data into rows
X_train = np.reshape(X_train, (X_train.shape[0], -1))
X_test = np.reshape(X_test, (X_test.shape[0], -1))
print(X_train.shape, X_test.shape)


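5.1 numpy.flatnonzero()

This function takes an array and returns the indices of the non-zero elements of the flattened array.

The comparison y_train == y has to be applied to a NumPy array. If y_train were a Python list, the comparison would evaluate to the single value False, and flatnonzero would return an empty result.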
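A small example makes the difference between an array and a list concrete. The label values below are made up for illustration.

```python
import numpy as np

# With a NumPy array, == compares element by element.
labels = np.array([3, 0, 3, 1, 3])
array_idx = np.flatnonzero(labels == 3)
print(array_idx)  # [0 2 4]

# With a plain list, == compares the whole list to the number and yields False.
labels_list = [3, 0, 3, 1, 3]
list_cmp = (labels_list == 3)
list_idx = np.flatnonzero(list_cmp)
print(list_cmp)   # False
print(list_idx)   # []
```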

5.2 np.random.choice

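In the visualization code, np.random.choice(idxs, samples_per_class, replace=False) draws samples_per_class entries from idxs at random. Because replace=False, the draw is without replacement, so no image is picked twice.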
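The behavior of replace=False can be seen in a short sketch. The index values and the fixed seed are arbitrary choices made only so the demonstration is repeatable.

```python
import numpy as np

np.random.seed(0)            # fixed seed only so the demo is repeatable
idxs = np.arange(100, 110)   # stand-in for the indices of one class
picked = np.random.choice(idxs, 7, replace=False)
print(picked)

# Every picked index is distinct and comes from idxs.
all_distinct = len(set(picked.tolist())) == len(picked)
all_from_pool = set(picked.tolist()) <= set(idxs.tolist())

# Asking for more samples than exist fails when replace=False.
try:
    np.random.choice(idxs, 11, replace=False)
    too_many_error = False
except ValueError:
    too_many_error = True

print(all_distinct, all_from_pool, too_many_error)  # True True True
```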

5.3 plt.subplot

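subplot(numRows, numCols, plotNum) divides the figure into a grid of numRows by numCols cells and selects cell plotNum. Cells are counted from 1, left to right and then top to bottom.

import numpy as np
import matplotlib.pyplot as plt

# Split the figure 2x2 and take the first cell: row 1, column 1
plt.subplot(2, 2, 1)

# Split the figure 2x2 and take the second cell: row 1, column 2
plt.subplot(2, 2, 2)

# Split the figure 2x1 and take the second cell: the whole second row
plt.subplot(2, 1, 2)

plt.show()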
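The same numbering explains the expression plt_idx = i * num_classes + y + 1 in the visualization loop: sample i of class y lands in row i, column y of the grid. The two helper functions below are hypothetical and exist only to make that arithmetic explicit.

```python
def subplot_index(i, y, num_classes):
    """1-based subplot number for sample row i and class column y (both 0-based)."""
    return i * num_classes + y + 1


def grid_position(plt_idx, num_classes):
    """Inverse mapping: 0-based (row, column) of a 1-based subplot number."""
    return divmod(plt_idx - 1, num_classes)


first_cell = subplot_index(0, 0, 10)   # top-left cell
third_cat = subplot_index(2, 3, 10)    # third sample of class 3 ('cat')
print(first_cell, third_cat)           # 1 24
print(grid_position(third_cat, 10))    # (2, 3)
```

With 10 classes and 7 samples per class, each class fills one column and each additional sample adds one row.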

6 Create the kNN classifier object

Remember that the kNN classifier does no processing at training time. It simply stores the training data.

from cs231n.classifiers import KNearestNeighbor

classifier = KNearestNeighbor()

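A problem came up at this step as well. Installing the future package resolved it. I recommend Anaconda, which makes it easy to change and configure packages at any time.

Now we can use the kNN classifier to classify the test data. The test procedure breaks into two steps. First, compute the distance from each test sample to every training sample.

Then, given the distance matrix, find the k training samples closest to each test sample and assign the class that occurs most often among them.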
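The two steps above can be sketched in a few lines of NumPy. This is not the assignment's KNearestNeighbor class, which you are expected to implement yourself. The class name TinyKNN, the choice of Euclidean (L2) distance, and the toy data are assumptions made for illustration.

```python
import numpy as np


class TinyKNN:
    """Hypothetical minimal kNN classifier; assumes non-negative integer labels."""

    def train(self, X, y):
        # "Training" only stores the data.
        self.X_train = X
        self.y_train = y

    def compute_distances(self, X):
        # Step 1: L2 distance from every test row to every training row,
        # using ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2.
        test_sq = np.sum(X ** 2, axis=1, keepdims=True)
        train_sq = np.sum(self.X_train ** 2, axis=1)
        sq = test_sq - 2.0 * (X @ self.X_train.T) + train_sq
        return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negative rounding errors

    def predict(self, X, k=1):
        dists = self.compute_distances(X)
        y_pred = np.empty(X.shape[0], dtype=self.y_train.dtype)
        for i in range(X.shape[0]):
            # Step 2: labels of the k nearest training samples, then majority vote.
            nearest = np.argsort(dists[i])[:k]
            votes = np.bincount(self.y_train[nearest])
            y_pred[i] = np.argmax(votes)  # ties go to the smaller label
        return y_pred


# Toy data: two well-separated clusters.
X_tr = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
y_tr = np.array([0, 0, 1, 1, 1])
X_te = np.array([[0.2, 0.1], [5.5, 5.2]])

knn = TinyKNN()
knn.train(X_tr, y_tr)
dists = knn.compute_distances(X_te)
pred_k1 = knn.predict(X_te, k=1)
pred_k3 = knn.predict(X_te, k=3)
print(dists.shape)       # (2, 5)
print(pred_k1, pred_k3)  # [0 1] [0 1]
```

The distance matrix has one row per test sample and one column per training sample, so for the subsampled data in this post it would have shape (500, 5000).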
