Common Errors Seen in Wireshark Captures
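1. TCP Out-Of-Order (TCP segments arrived out of order)
Answer:
1) There can be many causes, but most often it is network congestion: packets sent in order arrive at different times, are delayed too long, or are lost, so the data unit has to be reassembled. The packets may also have reached your machine over different paths.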
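2) Last week a colleague from CRM IT reported a problem to me: mail that their customer service system sends to customers through the mail server often failed to go out. Looking at the log, I found that for a normal message the server records the size of the received mail once it has finished receiving DATA and then starts the follow-up delivery processing. For the failing messages, however, the DATA transfer never completed; the server logged that it had read EOF and closed the connection, so the message was never counted as successfully delivered to the server.
A first look ruled out a timeout, because the connections were dropping well before the configured connection timeout. Since the CRM system was written by an outside vendor, the only way to clarify the problem was to capture packets and see whether the client itself was sending the command that ends the transfer.
The capture showed the following:
The whole mail transfer contained a large number of TCP Retransmission and Segment Lost entries, and later on TCP Out-Of-Order appeared as well, so it looked like a network problem. The usual advice online about TCP Out-Of-Order is that some packets may have been lost and then retransmitted. Another possibility is that there are two network paths between client and server, as in a load-balancing setup: if two packets take different paths and the later one arrives before the earlier one, Out-Of-Order occurs.
Having concluded that the network was the likely cause, and knowing that a colleague had bonded the CRM system's two network cards into one virtual interface, I asked him to remove the bonding and run on a single card. The problem went away, and the observed throughput was even higher than with the two cards bonded into one virtual interface. Whether Microsoft's NIC bonding needs some further tuning I cannot say, but at least the cause of the mass delivery failures was found.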
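To make the out-of-order case concrete, here is a minimal Python sketch that flags a segment arriving after a later one has already been seen. The `(seq, length)` tuples and the name `flag_out_of_order` are invented for illustration, and the rule is a simplification: Wireshark's own analysis also looks at timing and retransmissions.

```python
def flag_out_of_order(segments):
    """Label each (seq, length) segment against the data seen so far.

    Simplified assumption: a segment that starts below the highest byte
    already seen is treated as out of order.
    """
    labels = []
    highest_end = None  # one past the last byte seen so far
    for seq, length in segments:
        if highest_end is None or seq >= highest_end:
            # Either the first segment, a perfect continuation, or a jump ahead.
            labels.append("in-order" if highest_end in (None, seq) else "gap-before")
        else:
            labels.append("out-of-order")
        end = seq + length
        highest_end = end if highest_end is None else max(highest_end, end)
    return labels


# Two paths: the segment starting at 1 is overtaken by the one starting at 1461.
arrivals = [(1461, 1460), (1, 1460), (2921, 1460)]
print(flag_out_of_order(arrivals))
```

With two bonded links, as in the story above, the middle arrival is the kind of packet that would be flagged.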
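2. TCP segment of a reassembled PDU
Answer: 1) When a connection is established, each side announces its TCP maximum segment size (MSS) in its SYN packet; on a LAN this is usually 1460. If the data to send is longer than the MSS, it has to be split into several segments, and the resulting packets are marked "TCP segment of a reassembled PDU". Refer to the figure below: the marked packets all carry the same ACK number as the original packet, and their sequence numbers follow on from one another.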
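2) Last week at work I ran into a problem. Capturing the data our system reports to the network management station with Wireshark, I found that many packets were labelled "TCP segment of a reassembled PDU", and every one of those segments was 180 bytes. Seeing that label, I assumed it meant IP fragmentation and that the MTU of the system's interface had been set too small. But a query showed the MTU was 1500 and had never been changed, which puzzled me at the time.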
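Looking into it afterwards, I found my understanding was wrong. "TCP segment of a reassembled PDU" does not refer to fragmentation at the IP layer; Wireshark labels IP fragments as "Fragmented IP protocol". It means that the TCP layer received a large block of data from the upper layer and split it into segments before sending. That raises a question: TCP could simply hand the large block to the IP layer and let IP do the splitting, so why split at the TCP layer? The answer lies in TCP's MSS (Maximum Segment Size). In the first packet each side sends when opening a connection, the MSS option in the TCP header tells the peer the largest segment the local end can receive (this size counts the TCP payload only). On Ethernet the value is usually 1460, because 1460 bytes of payload + 20 bytes of TCP header + 20 bytes of IP header = 1500 bytes, which exactly matches the link layer's maximum packet size.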
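The MSS arithmetic can be checked with a short Python sketch. The function names are made up for illustration, and the header sizes assume IPv4 and TCP without options. Real stacks also take the congestion window, Nagle's algorithm and similar factors into account when deciding how much to send.

```python
IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20  # bytes, TCP header without options (assumption)


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


def split_into_segments(payload_len, mss):
    """Payload sizes produced when one large write is cut at MSS boundaries."""
    full, rest = divmod(payload_len, mss)
    return [mss] * full + ([rest] if rest else [])


print(mss_for_mtu(1500))                # 1460
print(split_into_segments(4000, 1460))  # [1460, 1460, 1080]
```

The same formula gives an MSS of 180 bytes for an MTU of 220 bytes.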
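How do you tell that a received packet is a "TCP segment"? If several packets all carry the same ACK number, their sequence numbers all differ, and each sequence number equals the previous sequence number plus the previous packet's payload length, they are almost certainly TCP segments of one PDU. When the ACK flag is not set, this check cannot be made.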
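As a rough illustration of that check, the sketch below tests whether a run of packets shares one ACK number and has contiguous sequence numbers. The `(seq, ack, payload_len)` tuple layout and the function name are assumptions made for this example, not anything Wireshark exposes.

```python
def looks_like_one_pdu(packets):
    """packets: list of (seq, ack, payload_len) tuples in capture order.

    True when every packet carries the same ACK number and each sequence
    number equals the previous sequence number plus the previous payload length.
    """
    if len(packets) < 2:
        return False
    for (seq, ack, length), (next_seq, next_ack, _) in zip(packets, packets[1:]):
        if next_ack != ack or next_seq != seq + length:
            return False
    return True


# Three 180-byte segments, all acknowledging the same byte from the peer.
sample = [(1, 500, 180), (181, 500, 180), (361, 500, 180)]
print(looks_like_one_pdu(sample))
```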
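Since the TCP packets received were all 180-byte segments, the PC side presumably announced an MSS of 180 bytes during negotiation. Why it did so can only be investigated after another capture confirms that the MSS is the issue. One other situation could produce the same result: the system under test setting its MSS to 180 bytes because its MTU is 220 bytes. That can be ruled out here because, as mentioned above, the MTU was checked and is 1500.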
3. TCP Previous segment lost (an earlier TCP segment was lost)
Answer:
(1) "TCP Previous segment lost" errors are not "fatal" errors. They simply indicate that the sequence number in the arriving packet is higher than the next-expected sequence number, indicating that at least one segment was dropped/lost. The receiving station remedies this situation by sending duplicate ACKs for each additional packet it receives until the sender retransmits the missing packet(s). TCP is designed to recover from this situation, which is why the image is downloaded correctly despite having a (briefly) missing packet.
If you are getting a large number of lost packets, then there is likely a communication problem between the sender and receiver. A common cause of this is mismatched duplex settings between the PC and the switch.
We (our lab) recently upgraded to Ethereal 0.10.14 with WinPCap 3.1. If I remember correctly, we had previously been using 0.10.2 with WinPCap 3.0. However, since the upgrade we have been noticing several issues.
The first issue is with "TCP Previous segment lost" and "TCP CHECKSUM INCORRECT" messages appearing in the Packet Listing window. We do not remember seeing these in the previous version of Ethereal, or at least not nearly as many as we are seeing now. For example, one task for the student instructional part of the lab involves visiting a website containing two images and observing the network activity. After the two GET requests are sent for the images, it is not uncommon for one image to be returned with a typical 200 OK response packet, but the response packet for the other image will be displayed as "TCP Previous segment lost." However, both images are downloaded and displayed perfectly fine in the browser. I would think that the segment lost error would mean the object wasn't returned correctly and shouldn't be able to be displayed, but apparently that is not the case. (The cache had been cleared when this was performed, so it was not defaulting to a local copy of the image.)
Another problem we've been noticing is that some packets simply aren't displayed in the Packet Listing window, even when they are obviously received. Using the same example as above, after the two GET requests are sent for the images, it is not uncommon for one image to be returned with a typical 200 OK response, but the other response will not appear. Yet both images are successfully displayed in the browser. Is this a problem with Ethereal not detecting the packets?
I'm not sure how typical this is, but we seem to be experiencing these issues often with 0.10.14 while we never did with 0.10.2. Could it also be an issue with WinPCap, and not necessarily Ethereal? I'm just trying to find some answers as to why we are seeing a sudden abundance of TCP related errors and uncaptured packets. Thanks.
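The rule quoted at the start of item (1), that a sequence number higher than the next expected one implies a missing segment, can be sketched in a few lines of Python. The function name and the `(seq, payload_len)` tuples are hypothetical, and sequence-number wraparound is ignored for simplicity.

```python
def find_gaps(segments, initial_seq):
    """Return (expected_seq, arrived_seq) pairs where data was skipped.

    segments: list of (seq, payload_len) tuples in arrival order.
    """
    gaps = []
    next_expected = initial_seq
    for seq, length in segments:
        if seq > next_expected:
            # At least one earlier segment has not been seen yet.
            gaps.append((next_expected, seq))
        next_expected = max(next_expected, seq + length)
    return gaps


# Bytes 201..300 never show up before the segment that starts at 301.
print(find_gaps([(1, 100), (101, 100), (301, 100)], initial_seq=1))  # [(201, 301)]
```

A gap like this only means the capture has not yet seen the segment; if it is retransmitted later, the transfer still completes, which matches the images loading correctly in the report above.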
(2) I have a network client application that runs fine while I am debugging (no TCP errors),
but when I run the release version, it runs incredibly slow. It runs as a series of
transactions, where each transaction is a separate connection to the server. Wireshark
analysis has determined that about 50% of all transactions involve the series:
TCP Previous Segment Lost
TCP Dup ACK
RST
The RST consumes 3 seconds per transaction, which is a Big Deal. So to prevent it, I must
prevent the initial "TCP Previous Segment Lost" (which seems, on the surface, to merely be
a time-out on a particular segment).
In the following clip, the SYN packet suffers from the "TCP Previous Segment Lost" condition.
0.000640 seconds seems like too short of a time to declare this condition, as many previous
successful transactions took much longer to be successfully SYN-ACK'ed.
Can somebody explain "TCP Previous Segment Lost" in this context to help me troubleshoot my
problem?
Any help would be appreciated.
Here is a clip of a problem transaction:
4. TCP ACKed lost segment (ACK for a lost segment)
5. TCP Window Update
6. TCP Dup ACK (TCP duplicate acknowledgment)
TCP may generate an immediate acknowledgment (a duplicate ACK) when an out-of-order segment is received. This duplicate ACK should not be delayed. The purpose of this duplicate ACK is to let the other end know that a segment was received out of order, and to tell it what sequence number is expected.
Since TCP does not know whether a duplicate ACK is caused by a lost segment or just a reordering of segments, it waits for a small number of duplicate ACKs to be received. It is assumed that if there is just a reordering of the segments, there will be only one or two duplicate ACKs before the reordered segment is processed, which will then generate a new ACK. If three or more duplicate ACKs are received in a row, it is a strong indication that a segment has been lost. TCP then performs a retransmission of what appears to be the missing segment, without waiting for a retransmission timer to expire.
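The three-duplicate-ACK rule can be illustrated with a small Python sketch of the sender side. The function name is invented for this example, and it models only the counting rule described above, not the congestion-control changes a real stack makes at the same time.

```python
DUP_ACK_THRESHOLD = 3  # "three or more duplicate ACKs" from the text above


def fast_retransmit_points(acks):
    """Return the ACK numbers at which a fast retransmit would be triggered.

    acks: ACK numbers in the order the sender receives them.
    """
    triggered = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                triggered.append(ack)  # resend the segment starting at `ack`
        else:
            last_ack = ack
            dup_count = 0
    return triggered


# The original ACK 201 is followed by three duplicates, so 201 is resent.
print(fast_retransmit_points([101, 201, 201, 201, 201, 601]))  # [201]
```

One or two duplicates, as produced by simple reordering, do not trigger anything in this sketch.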
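7. TCP Keep-Alive
TCP has a keep-alive mechanism for detecting dead connections. The principle is simple: after a connection has been idle for a certain time, TCP sends a probe to the peer:
1. If the host is reachable, the peer replies with an ACK and the connection is considered alive.
2. If the host is reachable but the application has exited, the peer replies with an RST and the TCP connection is torn down.
3. If the host is reachable but the application has crashed, the peer sends a FIN.
4. If the peer host answers with neither an ACK nor an RST, probes keep being sent until the timeout expires, and then the connection is torn down. The idle time before probing starts defaults to two hours.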
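uses WinSock2;

procedure TForm1.IdTCPServer1Connect(AThread: TIdPeerThread);
type
  // Layout of the tcp_keepalive structure expected by SIO_KEEPALIVE_VALS
  TCP_KeepAlive = record
    OnOff: Cardinal;              // non-zero enables keep-alive
    KeepAliveTime: Cardinal;      // idle time before the first probe, in ms
    KeepAliveInterval: Cardinal;  // delay between unanswered probes, in ms
  end;
var
  Val: TCP_KeepAlive;
  Ret: DWord;
begin
  Val.OnOff := 1;
  Val.KeepAliveTime := 6000;      // 6 s
  Val.KeepAliveInterval := 6000;  // 6 s
  // IOC_IN or IOC_VENDOR or 4 is the SIO_KEEPALIVE_VALS control code
  WSAIoctl(AThread.Connection.Socket.Binding.Handle, IOC_IN or IOC_VENDOR or 4,
    @Val, SizeOf(Val), nil, 0, @Ret, nil, nil);
end;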
——————————————————–
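The KeepAliveTime value controls how often TCP/IP tries to verify that an idle connection is still intact. If there is no activity for this long, a keep-alive signal is sent. If the network is working and the receiver is alive, it responds. If you need to be sensitive to losing the receiver, in other words to discover the loss sooner, consider lowering this value. If long-idle connections are common and lost receivers are rare, you may want to raise the value to reduce overhead. By default, Windows sends a keep-alive message when an idle connection has had no activity for 7200000 milliseconds (2 hours). 1800000 milliseconds is often the preferred value, so that half of the closed connections are detected within 30 minutes.
The KeepAliveInterval value defines how often TCP/IP resends the keep-alive signal when no response has been received from the receiver. Once the number of consecutive unanswered keep-alive signals exceeds the TcpMaxDataRetransmissions value, the connection is abandoned. If you expect long response times, you may want to raise this value to reduce overhead. If you need to shorten the time spent verifying whether the receiver is lost, consider lowering this value or TcpMaxDataRetransmissions. By default, Windows waits 1000 milliseconds (1 second) before resending a keep-alive message that went unanswered.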
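To see how the two values combine, the following Python sketch estimates how long a dead peer can go unnoticed. It is a rough upper bound under a simple model (idle time, then one interval per unanswered probe); the function name is made up, and `max_probes=5` is only an example value standing in for TcpMaxDataRetransmissions.

```python
def worst_case_detection_ms(keepalive_time_ms, keepalive_interval_ms, max_probes):
    """Rough upper bound on how long a dead peer goes unnoticed.

    Idle time before the first probe, plus one interval for each
    unanswered probe. max_probes stands in for TcpMaxDataRetransmissions.
    """
    return keepalive_time_ms + keepalive_interval_ms * max_probes


# Defaults quoted in the text: 2 hours idle, 1 second between probes.
print(worst_case_detection_ms(7_200_000, 1_000, 5))  # 7205000
# The 6-second settings used in the Delphi snippet.
print(worst_case_detection_ms(6_000, 6_000, 5))      # 36000
```

The estimate shows why lowering KeepAliveTime matters most: with the defaults, almost all of the delay is the two-hour idle period.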
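Set KeepAliveTime to whatever suits your needs, for example 10 minutes; remember to convert it to milliseconds.
XXX stands for the size of this interval value.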
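For comparison, here is a hedged Python sketch of the same configuration, including the minutes-to-milliseconds conversion. The helper names are hypothetical. On Windows it uses the same `SIO_KEEPALIVE_VALS` ioctl as the Delphi code; on Linux it falls back to `TCP_KEEPIDLE` and `TCP_KEEPINTVL`, which take whole seconds. On other platforms only `SO_KEEPALIVE` is set.

```python
import socket


def minutes_to_ms(minutes):
    """Convert minutes to the millisecond unit the keep-alive settings use."""
    return int(minutes * 60 * 1000)


def enable_keepalive(sock, idle_ms, interval_ms):
    """Turn on TCP keep-alive with the given idle time and probe interval."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: (on/off, idle time in ms, probe interval in ms).
        sock.ioctl(socket.SIO_KEEPALIVE_VALS, (1, idle_ms, interval_ms))
    elif hasattr(socket, "TCP_KEEPIDLE") and hasattr(socket, "TCP_KEEPINTVL"):
        # Linux: per-socket options expressed in whole seconds.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, max(1, idle_ms // 1000))
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, max(1, interval_ms // 1000))


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock, minutes_to_ms(10), 1000)  # probe after 10 idle minutes
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```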
8. TCP Retransmission
As a reliable transport protocol, the Transmission Control Protocol (TCP) requires the destination host to acknowledge each packet that the sending host transmits. If the sender does not receive that acknowledgment within a certain amount of time, it acts under the assumption that the packet did not reach its destination and retransmits the packet.
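The timeout rule can be sketched as a toy sender in Python. The `Sender` class and its fixed timeout are assumptions made for illustration; real TCP uses cumulative ACKs and adapts its retransmission timeout to the measured round-trip time.

```python
class Sender:
    """Tracks unacknowledged packets and reports which ones timed out."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds to wait for an ACK
        self.unacked = {}       # seq -> time the packet was last sent

    def send(self, seq, now):
        self.unacked[seq] = now

    def ack(self, seq):
        self.unacked.pop(seq, None)

    def retransmit_due(self, now):
        """Resend every packet whose ACK is overdue; return their seq numbers."""
        due = sorted(s for s, sent in self.unacked.items() if now - sent >= self.timeout)
        for seq in due:
            self.unacked[seq] = now  # restart the timer for the resent packet
        return due


sender = Sender(timeout=1.0)
sender.send(1, now=0.0)
sender.send(2, now=0.1)
sender.ack(1)                          # packet 1 is acknowledged in time
print(sender.retransmit_due(now=0.5))  # []
print(sender.retransmit_due(now=1.2))  # [2]
```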