Yoon Kim's TextCNN Explained, with a TensorFlow Implementation: CNN Text Classification

Updated: 2023-05-12 02:21:00

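0x00: Motivation

I have recently been studying Yoon Kim's classic paper on convolutional neural networks for sentence classification. It is often described as the pioneering work on applying CNN models to text classification (he was not actually the first to do so, but Kim proposed several variants and tuned the hyperparameters in detail). WildML has a TensorFlow implementation of the paper, described in a blog post. That post is already very detailed, but for newcomers to TensorFlow some details of the code may still be hard to follow. I am a beginner too, so this is a short summary of my own understanding; I hope it is useful to other readers.

0x01: Start!

I mainly walk through the TextCNN class from that implementation.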

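When reading someone else's code I keep asking myself a few questions and look for the answers as I read; I find this the most efficient approach.

1 What is this class for?

The TextCNN class builds a basic CNN model with an input layer, a convolutional layer, a max-pooling layer and a final softmax output layer.

If anything above is still unclear, see WildML's other post, which explains it step by step.

In short, the purpose of this class is to build a CNN model for text data.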

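2 Some parameters

Since the TextCNN class follows Yoon Kim's approach, an important next step is to collect all the parameter settings mentioned in the paper. Some parameters describe the model; others concern training, such as the number of epochs, and have nothing to do with the model itself. Separating the two tells us which parameters the TextCNN class needs at initialization. Open the paper and look for the parameters carefully.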
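As a rough sketch of how such settings might be split into model parameters and training parameters, the dataclasses below collect the names used by the TextCNN code in this post. The filter sizes (3, 4, 5), 100 feature maps per size, dropout rate 0.5 and mini-batch size 50 are the values reported in Kim's paper. The class names ModelConfig and TrainConfig, and the values chosen for embedding_size, l2_reg_lambda, num_epochs, sequence_length and vocab_size, are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ModelConfig:
    """Parameters that define the network itself (names follow the post's code)."""
    sequence_length: int                         # length every sentence is padded to
    num_classes: int                             # number of output classes
    vocab_size: int                              # number of rows in the embedding matrix
    embedding_size: int = 300                    # assumed word-vector dimension
    filter_sizes: Tuple[int, ...] = (3, 4, 5)    # filter window heights
    num_filters: int = 100                       # feature maps per filter size
    l2_reg_lambda: float = 0.0                   # assumed default: no L2 penalty

    @property
    def num_filters_total(self) -> int:
        # Size of the flattened pooled feature vector fed to the output layer.
        return self.num_filters * len(self.filter_sizes)


@dataclass
class TrainConfig:
    """Parameters that only affect training, not the model's structure."""
    batch_size: int = 50
    dropout_keep_prob: float = 0.5
    num_epochs: int = 10                         # hypothetical value


# Hypothetical dataset sizes, chosen only to make the example runnable.
model_cfg = ModelConfig(sequence_length=56, num_classes=2, vocab_size=18000)
train_cfg = TrainConfig()
print(model_cfg.num_filters_total)
```

Only the fields of ModelConfig would need to be passed to the class constructor; the training settings belong to the training loop.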

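The dropout strategy is as follows:

During training, dropout is applied to the output of the max-pooling layer: each unit stays active with probability p, and only the active units are passed on to the softmax layer.

At test time the weights w have already been learned, but they cannot be applied directly to unseen sentences; they are multiplied by p first. There is no dropout at this stage, and all outputs are passed to the softmax layer.

4 Embedding Layer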

# Embedding layer
with tf.device('/cpu:0'), tf.name_scope("embedding"):
    W = tf.Variable(
        tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
        name="W")
    # Look up one row of W (one word vector) for every word index in input_x
    self.embedded_chars = tf.nn.embedding_lookup(W, self.input_x)

The matrix W that stores all the word vectors is initialized randomly, which corresponds to the first model in the paper, CNN-rand.

Training does not use the whole vocabulary at every step. Instead, each step produces one batch. A batch consists of sentences, and each sentence is recorded as the words it contains (with maximum length sequence_length), so the batch is effectively a two-dimensional list. This batch is input_x.

self.input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
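The lookup can be illustrated without TensorFlow. The following is a minimal NumPy sketch, not the post's actual code: it assumes toy sizes and made-up word indices, and uses integer-array indexing as a stand-in for an embedding lookup.

```python
import numpy as np

vocab_size, embedding_size, sequence_length = 10, 4, 5   # toy sizes (assumed)
rng = np.random.default_rng(0)

# Randomly initialised embedding matrix, uniform in [-1, 1) like the code above.
W = rng.uniform(-1.0, 1.0, size=(vocab_size, embedding_size))

# A batch of 2 sentences, each a row of word indices padded to sequence_length.
input_x = np.array([[1, 4, 2, 0, 0],
                    [7, 3, 3, 9, 0]], dtype=np.int32)

# Integer-array indexing picks one row of W per word index,
# which is what an embedding lookup does.
embedded_chars = W[input_x]

print(embedded_chars.shape)   # (2, 5, 4) = (batch, sequence_length, embedding_size)
```

Each sentence becomes a sequence_length by embedding_size matrix, so the whole batch is a three-dimensional array.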

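The lookup result embedded_chars has shape [None, sequence_length, embedding_size] (1). Once the input word vectors are available, the next step is to feed them into the convolutional layer with the tf.nn.conv2d function.

Here is the parameter list of conv2d:

input: [batch, in_height, in_width, in_channels] (2)

filter: [filter_height, filter_width, in_channels, out_channels] (3)

Comparing (1) and (2), only in_channels is missing, and the simplest version has just one channel (Yoon's fourth model uses multichannel).

So the dimensions must be expanded to meet the input requirement of conv2d. TensorFlow already provides this with tf.expand_dims: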

This operation is useful if you want to add a batch dimension to a single element. For example, if you have a single image of shape [height, width, channels], you can make it a batch of 1 image with expand_dims(image, 0), which will make the shape [1, height, width, channels].

Example:

# 't' is a tensor of shape [2]
shape(expand_dims(t, -1)) ==> [2, 1]

So a single call, tf.expand_dims(self.embedded_chars, -1), is enough to append in_channels=1 after the last dimension of embedded_chars.
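The effect on the shape can be checked with NumPy, whose np.expand_dims behaves the same way for this purpose. This is a small sketch with assumed toy sizes, not the TensorFlow code itself.

```python
import numpy as np

batch, sequence_length, embedding_size = 2, 5, 4   # toy sizes (assumed)
embedded_chars = np.zeros((batch, sequence_length, embedding_size))

# Append a trailing axis of size 1 to play the role of in_channels.
embedded_chars_expanded = np.expand_dims(embedded_chars, -1)

print(embedded_chars.shape)            # (2, 5, 4)
print(embedded_chars_expanded.shape)   # (2, 5, 4, 1)
```

The data is unchanged; only the shape now matches the [batch, in_height, in_width, in_channels] layout.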

5 Conv and Max-pooling

# Create a convolution + maxpool layer for each filter size
pooled_outputs = []
for i, filter_size in enumerate(filter_sizes):
    with tf.name_scope("conv-maxpool-%s" % filter_size):
        # Convolution Layer
        filter_shape = [filter_size, embedding_size, 1, num_filters]
        W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
        conv = tf.nn.conv2d(
            self.embedded_chars_expanded,
            W,
            strides=[1, 1, 1, 1],
            padding="VALID",
            name="conv")
        # Apply nonlinearity
        h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
        # Maxpooling over the outputs
        pooled = tf.nn.max_pool(
            h,
            ksize=[1, sequence_length - filter_size + 1, 1, 1],
            strides=[1, 1, 1, 1],
            padding='VALID',
            name="pool")
        pooled_outputs.append(pooled)

# Combine all the pooled features
num_filters_total = num_filters * len(filter_sizes)
# Pre-1.0 argument order; TensorFlow 1.x expects tf.concat(pooled_outputs, 3)
self.h_pool = tf.concat(3, pooled_outputs)
self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])

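First, a convolution is run for every filter window size in filter_sizes (each size produces num_filters feature maps), so the outer structure is one large for loop.

Next comes a less familiar function, tf.name_scope('xxx'). It places the ops created inside the block under a common name prefix, which keeps the graph organized.

Inside the for loop filter_size is fixed, so combining it with (3), [filter_height, filter_width, in_channels, out_channels], gives filter_shape = [filter_size, embedding_size, 1, num_filters].

The filter shape matters because the filter's weight matrix W has to be initialized:

W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")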

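Why use tf.truncated_normal() here? It draws values from a normal distribution, but any value more than two standard deviations from the mean is dropped and re-picked.

In other words, every generated value lies within [mean - 2 standard_deviations, mean + 2 standard_deviations].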
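The "drop and re-pick" rule can be sketched in NumPy as follows. This is an illustration of the idea under assumed toy sizes, not TensorFlow's implementation; the helper name truncated_normal is reused here only for readability.

```python
import numpy as np


def truncated_normal(shape, mean=0.0, stddev=0.1, seed=0):
    """Draw normal samples, re-drawing any that fall outside mean +/- 2 * stddev."""
    rng = np.random.default_rng(seed)
    values = rng.normal(mean, stddev, size=shape)
    outside = np.abs(values - mean) > 2 * stddev
    while outside.any():
        # Re-pick only the out-of-range entries.
        values[outside] = rng.normal(mean, stddev, size=int(outside.sum()))
        outside = np.abs(values - mean) > 2 * stddev
    return values


# [filter_size, embedding_size, 1, num_filters] with toy sizes (assumed)
filter_shape = [3, 4, 1, 8]
W = truncated_normal(filter_shape)
print(W.shape, bool(np.abs(W).max() <= 0.2))
```

With stddev=0.1 and mean 0, no initial weight can exceed 0.2 in absolute value, which avoids the occasional very large weight a plain normal draw could produce.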

[Figure: location of the range [mean - 2 standard_deviations, mean + 2 standard_deviations]]

What conv2d computes is only the w⋅x part. The bias term still has to be added with tf.nn.bias_add(conv, b), and the result passed through ReLU with tf.nn.relu, to obtain the convolutional layer's output h.

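So what exactly is the shape of the convolutional layer's output?

The official documentation explains what the convolution does in three steps: it flattens the filter into a 2-D matrix of shape [filter_height * filter_width * in_channels, output_channels]; it extracts image patches from the input to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels]; and for each patch it right-multiplies the filter matrix and the image patch vector.

After the right-multiply in the third step, the result has shape [batch, out_height, out_width, output_channels], but that still does not say what out_height and out_width are.

So let's see what WildML says:

"VALID" padding means that we slide the filter over our sentence without padding the edges, performing a narrow convolution that gives us an output of shape [1, sequence_length - filter_size + 1, 1, 1].

In other words, out_height and out_width depend on the padding mode. Here "VALID" is chosen, meaning no padding is added at the edges, so out_height = sequence_length - filter_size + 1. Because the filter is as wide as the embedding (filter_width = embedding_size), it fits in only one horizontal position, so out_width = 1.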

Putting the two explanations together, the output h after conv2d, bias and ReLU has shape [batch, sequence_length - filter_size + 1, 1, num_filters].
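To make this shape concrete, here is a minimal NumPy sketch of a narrow ("VALID") convolution followed by bias and ReLU. The function narrow_conv_relu and all sizes are hypothetical; the loop is written for clarity, not speed, and is only meant to mirror what conv2d, bias_add and relu compute for this layout.

```python
import numpy as np


def narrow_conv_relu(x, W, b):
    """VALID convolution of x [batch, seq_len, emb, 1] with W [fs, emb, 1, nf], then bias + ReLU."""
    batch, seq_len, emb, _ = x.shape
    filter_size, _, _, num_filters = W.shape
    out_height = seq_len - filter_size + 1
    h = np.zeros((batch, out_height, 1, num_filters))
    for t in range(out_height):
        # Window of filter_size consecutive words, full embedding width.
        window = x[:, t:t + filter_size, :, :]            # [batch, fs, emb, 1]
        # Sum over the window's height, width and channel axes for every filter.
        h[:, t, 0, :] = np.tensordot(window, W, axes=([1, 2, 3], [0, 1, 2]))
    return np.maximum(h + b, 0.0)


rng = np.random.default_rng(0)
batch, sequence_length, embedding_size = 2, 7, 4          # toy sizes (assumed)
filter_size, num_filters = 3, 6
x = rng.normal(size=(batch, sequence_length, embedding_size, 1))
W = rng.normal(scale=0.1, size=(filter_size, embedding_size, 1, num_filters))
b = np.full(num_filters, 0.1)

h = narrow_conv_relu(x, W, b)
print(h.shape)   # (2, 5, 1, 6) = (batch, sequence_length - filter_size + 1, 1, num_filters)
```

The filter can only slide along the sentence, so the second axis shrinks to 7 - 3 + 1 = 5 while the third axis collapses to 1.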

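The next step is max-pooling, done with the TensorFlow function tf.nn.max_pool.

Its two most important parameters are value and ksize.

value is the input to the max-pooling layer, which in this network is the h we just obtained. Its shape matches what the function expects, so it can be passed straight to the next layer.

The other parameter is ksize, which the official documentation describes as the window size along each dimension of the input tensor. In other words, it defines the region over which the maximum is taken. In images a common choice is a small 2*2 square slid over the whole feature map h. In NLP, as noted above, each feature map now has shape [batch, sequence_length - filter_size + 1, 1, num_filters], and we want the maximum of each output channel (each channel is a vector), i.e. its most important feature. So the window on the second dimension is set to sequence_length - filter_size + 1. [I feel this is not fully explained yet; to be explored further.]

Given this ksize and the shape of value, pooled has shape [batch, 1, 1, num_filters].

This is the result for one filter_size (say filter_size = 3). pooled stores, for each sentence, the num_filters most important features under the current filter_size. The result is appended to the pooled_outputs list, and the same operation is repeated for the next filter_size.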
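One way to see why the window height equals sequence_length - filter_size + 1: a window as tall as the whole feature map has exactly one valid position, so pooling reduces that axis to a single maximum per filter. The NumPy sketch below shows this under assumed toy sizes; it uses a plain max over axis 1 rather than a sliding-window pooling routine.

```python
import numpy as np

batch, sequence_length, filter_size, num_filters = 2, 7, 3, 6   # toy sizes (assumed)
rng = np.random.default_rng(0)
h = rng.random((batch, sequence_length - filter_size + 1, 1, num_filters))

# A pooling window as tall as the whole feature map reduces axis 1 to a single
# value: the maximum over all positions, taken separately for each filter.
pooled = h.max(axis=1, keepdims=True)

print(pooled.shape)   # (2, 1, 1, 6) = (batch, 1, 1, num_filters)
```

This is the max-over-time pooling described in the paper: each filter contributes one number per sentence, regardless of sentence position.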

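When the for loop ends, i.e. after every filter_size has gone through convolution and max-pooling, the pooled results must be joined: first all pooled results of the same filter_size, then the results of the different filter_size values. The final result should look like a two-dimensional array, [batch, all_pooled_result].

all_pooled_result has num_filters (100) * len(filter_sizes) (3) entries in total, i.e. 300.

The joining is done with tf.concat; the example in the official documentation is easy to follow.

The resulting h_pool_flat is therefore a tensor of shape [batch, 300].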
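The concatenate-then-reshape step can be sketched in NumPy as follows. The constant fill values are an assumption made only so that the origin of each column is visible after flattening.

```python
import numpy as np

batch, num_filters = 2, 100
filter_sizes = [3, 4, 5]

# One pooled tensor per filter size, each of shape [batch, 1, 1, num_filters].
# Each is filled with its own filter size so the blocks can be told apart.
pooled_outputs = [np.full((batch, 1, 1, num_filters), float(fs)) for fs in filter_sizes]

num_filters_total = num_filters * len(filter_sizes)
h_pool = np.concatenate(pooled_outputs, axis=3)        # join along the channel axis
h_pool_flat = h_pool.reshape(-1, num_filters_total)    # drop the two size-1 axes

print(h_pool.shape)        # (2, 1, 1, 300)
print(h_pool_flat.shape)   # (2, 300)
```

Columns 0 to 99 of h_pool_flat come from filter size 3, columns 100 to 199 from size 4, and columns 200 to 299 from size 5.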

6 Dropout

# Add dropout
with tf.name_scope("dropout"):
    self.h_drop = tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)

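As mentioned earlier in the dropout notes, dropout is applied only to the output of the hidden layer, so that the values of some nodes are not passed on to the softmax layer.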
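Note that tf.nn.dropout rescales the kept units by 1 / keep_prob during training ("inverted dropout"), which is why nothing needs to be multiplied by p at evaluation time; setting keep_prob to 1.0 is enough. The NumPy sketch below illustrates that behaviour with hypothetical helper names and toy data; it is not TensorFlow's implementation.

```python
import numpy as np


def dropout_train(x, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob, scale the rest by 1 / keep_prob."""
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)


def dropout_eval(x):
    """At evaluation time (keep_prob = 1.0) the features pass through unchanged."""
    return x


rng = np.random.default_rng(0)
h_pool_flat = np.ones((1000, 300))                 # toy features, all equal to 1
h_drop = dropout_train(h_pool_flat, keep_prob=0.5, rng=rng)

print(np.unique(h_drop))      # kept units are scaled to 2.0, dropped units are 0.0
print(h_drop.mean())          # close to 1.0, the mean of the undropped input
```

Scaling during training keeps the expected value of each feature the same in both phases, which has the same effect as the paper's scheme of multiplying the learned weights by p at test time.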

7 Output

# Final (unnormalized) scores and predictions
with tf.name_scope("output"):
    W = tf.get_variable(
        "W",
        shape=[num_filters_total, num_classes],
        initializer=tf.contrib.layers.xavier_initializer())
    b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
    # Accumulate the L2 penalty of the output layer's weights and bias
    l2_loss += tf.nn.l2_loss(W)
    l2_loss += tf.nn.l2_loss(b)
    self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
    self.predictions = tf.argmax(self.scores, 1, name="predictions")

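The output layer is really just a softmax classifier, so there is not much to explain, but note the L2 regularization (although one paper reports that adding L2 makes little difference).

One thing I still wonder about is why b is regularized as well.

Also, tf.nn.xw_plus_b() is not listed in the public API; see the related discussion on GitHub.

It can therefore be replaced with tf.matmul(self.h_drop, W) + b. Written with the + operator the result gets no explicit name, though tf.add(tf.matmul(self.h_drop, W), b, name="scores") restores one. (xw_plus_b does not raise an error either, so leaving it unchanged is fine.)

Another odd point: this layer should in principle be a softmax layer, yet the softmax function is never called, and Yoon's paper likewise takes the output directly. This works because softmax preserves the ordering of the scores, so the argmax is the same with or without it, and the loss function in section 8 applies softmax internally.

So the code follows the same approach: compute the scores for all classes and pick the class with the largest score (argmax).

The scores y have shape [batch, num_classes], so argmax takes the maximum of each row, i.e. dimension=1.

predictions therefore has shape [batch], one class index per sentence, while scores keeps the shape [batch, num_classes].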
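The shapes, and the claim that skipping softmax does not change the predicted class, can be checked with a small NumPy sketch. All sizes and the random weights are assumptions for illustration; the softmax helper is written here and is not part of the post's code.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
batch, num_filters_total, num_classes = 4, 300, 2      # toy sizes (assumed)
h_drop = rng.normal(size=(batch, num_filters_total))
W = rng.normal(scale=0.1, size=(num_filters_total, num_classes))
b = np.full(num_classes, 0.1)

scores = h_drop @ W + b                   # same computation as xw_plus_b
predictions = scores.argmax(axis=1)       # one class index per row

print(scores.shape, predictions.shape)    # (4, 2) (4,)
# Softmax is monotonic within a row, so it never changes the argmax.
print(np.array_equal(predictions, softmax(scores).argmax(axis=1)))
```

Softmax only matters when probabilities are needed, for example inside the cross-entropy loss.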

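8 Loss function

With the network's output in hand, i.e. y_prediction, we still have to compare it with the true labels y to judge how good the prediction is.

# Calculate mean cross-entropy loss
with tf.name_scope("loss"):
    # Keyword arguments are required in TensorFlow 1.x
    loss = tf.nn.softmax_cross_entropy_with_logits(logits=self.scores, labels=self.input_y)
    self.loss = tf.reduce_mean(loss) + l2_reg_lambda * l2_loss

The loss function is the usual cross-entropy. The last layer is fully connected, so to prevent overfitting an L2 regularization term, l2_loss, is added to the loss, with l2_reg_lambda controlling the strength of the penalty.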
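The loss can be reproduced by hand on a tiny example. The NumPy sketch below assumes one-hot labels, made-up scores and weights, and a hypothetical l2_reg_lambda of 0.01; the l2_loss helper uses the half-sum-of-squares convention of tf.nn.l2_loss.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)


def l2_loss(t):
    """Half the sum of squares, the convention used by tf.nn.l2_loss."""
    return 0.5 * np.sum(t ** 2)


scores = np.array([[2.0, 0.0], [0.0, 0.0]])       # made-up logits
input_y = np.array([[1.0, 0.0], [0.0, 1.0]])      # one-hot labels
W = np.array([[1.0, -1.0], [0.5, 0.5]])           # made-up output weights
b = np.array([0.1, 0.1])
l2_reg_lambda = 0.01                              # assumed value

losses = softmax_cross_entropy(scores, input_y)
loss = losses.mean() + l2_reg_lambda * (l2_loss(W) + l2_loss(b))
print(losses, loss)
```

The second example has equal scores for both classes, so its loss is log 2; the first example is confidently correct and its loss is much smaller.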

9 Accuracy

# Accuracy
with tf.name_scope("accuracy"):
    correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
    self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")

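tf.equal(x, y) returns a bool tensor that is true wherever x and y hold equal values and false otherwise. The resulting tensor has shape [batch].

tf.cast(x, dtype) converts the bool tensor to a float tensor so that the mean can be computed.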
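The same three steps (compare, cast, average) look like this in NumPy. The predictions and labels are made-up values for illustration.

```python
import numpy as np

predictions = np.array([1, 0, 1, 1])                    # argmax of the scores
input_y = np.array([[0, 1], [1, 0], [1, 0], [0, 1]])    # one-hot true labels

# Element-wise comparison gives a bool array of shape (4,):
# True, True, False, True for the values above.
correct_predictions = predictions == input_y.argmax(axis=1)

# Casting to float turns True/False into 1.0/0.0, so the mean is the accuracy.
accuracy = correct_predictions.astype("float32").mean()

print(correct_predictions.shape)   # (4,)
print(accuracy)                    # 0.75
```

Three of the four predictions match their labels, hence an accuracy of 0.75.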
