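WhatWeb Source Code Analysis: Execution Flow
Having become familiar with parts of the WhatWeb source code, in this post I record a debugging session with WhatWeb and the execution flow I pieced together from it.
Before debugging, it helps to run WhatWeb's help to list every option it provides and get a rough idea of what WhatWeb can do: ruby whatweb -h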
.$$$ $. .$$$ $.
$$$$ $$. .$$$ $$$ .$$$$$$. .$$$$$$$$$$. $$$$ $$. .$$$$$$$. .$$$$$$.
$ $$ $$$ $ $$ $$$ $ $$$$$$. $$$$$ $$$$$$ $ $$ $$$ $ $$ $$ $ $$$$$$.
$ `$ $$$ $ `$ $$$ $ `$ $$$ $$' $ `$ `$$ $ `$ $$$ $ `$ $ `$ $$$'
$. $ $$$ $. $$$$$$ $. $$$$$$ `$ $. $ :' $. $ $$$ $. $$$$ $. $$$$$.
$::$ . $$$ $::$ $$$ $::$ $$$ $::$ $::$ . $$$ $::$ $::$ $$$$
$;;$ $$$ $$$ $;;$ $$$ $;;$ $$$ $;;$ $;;$ $$$ $$$ $;;$ $;;$ $$$$
$$$$$$ $$$$$ $$$$ $$$ $$$$ $$$ $$$$ $$$$$$ $$$$$ $$$$$$$$$ $$$$$$$$$'
WhatWeb - Next generation web scanner version 0.4.8-dev.
Developed by Andrew Horton aka urbanadventurer and Brendan Coles.
Homepage: /rearch/whatweb
Usage: whatweb [options] <URLs>
TARGET SELECTION:
<TARGETs> Enter URLs, hostnames, IP addresses,
filenames, or nmap-format IP address ranges.
--input-file=FILE, -i Read targets from a file. You can pipe
hostnames or URLs directly with -i /dev/stdin.
TARGET MODIFICATION:
--url-prefix Add a prefix to target URLs.
--url-suffix Add a suffix to target URLs.
--url-pattern Insert the targets into a URL.
< /%insert%/
AGGRESSION:
The aggression level controls the trade-off between speed/stealth and
reliability.
--aggression, -a=LEVEL Set the aggression level. Default: 1.
1. Stealthy Makes one HTTP request per target and also
follows redirects.
3. Aggressive If a level 1 plugin is matched, additional
requests will be made.
4. Heavy Makes a lot of HTTP requests per target. URLs
from all plugins are attempted.
HTTP OPTIONS:
--user-agent, -U=AGENT Identify as AGENT instead of WhatWeb/0.4.8-dev.
--header, -H Add an HTTP header. eg "Foo:Bar". Specifying a
default header will replace it. Specifying an
empty value, e.g. "User-Agent:" will remove it.
--follow-redirect=WHEN Control when to follow redirects. WHEN may be
`never', `http-only', `meta-only', `same-site',
`same-domain' or `always'. Default: always.
--max-redirects=NUM Maximum number of redirects. Default: 10.
AUTHENTICATION:
--user, -u=<user:password> HTTP basic authentication.
--cookie, -c=COOKIES Use cookies, e.g. 'name=value; name2=value2'.
PROXY:
--proxy <hostname[:port]> Set proxy hostname and port.
Default: 8080.
--proxy-user <username:password> Set proxy user and password.
PLUGINS:
--list-plugins, -l List all plugins.
--info-plugins, -I=[SEARCH] List all plugins with detailed information.
Optionally search with keywords in a comma
delimited list.
--search-plugins=STRING Search plugins for a keyword.
--plugins, -p=LIST Select plugins. LIST is a comma delimited set
of selected plugins. Default is all.
Each element can be a directory, file or plugin
name and can optionally have a modifier, +/-.
Examples: +/tmp/moo.rb,+/tmp/foo.rb
title,md5,+./plugins-disabled/
./plugins-disabled,-md5
-p + is a shortcut for -p +plugins-disabled.
--grep, -g=STRING Search for STRING in HTTP responses. Reports
with a plugin named Grep.
--custom-plugin=DEFINITION Define a custom plugin named Custom-Plugin,
Examples: ":text=>'powered by abc'"
":version=>/powered[ ]?by ab[0-9]/"
":ghdb=>'intitle:abc \"powered by abc\"'"
":md5=>'8666257030b94d3bdb46e05945f60b42'"
"{:text=>'powered by abc'}"
--dorks=PLUGIN List Google dorks for the selected plugin.
OUTPUT:
--verbose, -v Verbose output includes plugin descriptions.
Use twice for debugging.
--colour,--color=WHEN control whether colour is used. WHEN may be
`never', `always', or `auto'.
--quiet, -q Do not display brief logging to STDOUT.
--no-errors Suppress error messages.
LOGGING:
--log-brief=FILE Log brief, one-line output.
--log-verbose=FILE Log verbose output.
--log-errors=FILE Log errors.
--log-xml=FILE Log XML format.
--log-json=FILE Log JSON format.
--log-sql=FILE Log SQL INSERT statements.
--log-sql-create=FILE Create SQL database tables.
--log-json-verbose=FILE Log JSON Verbose format.
--log-magictree=FILE Log MagicTree XML format.
--log-object=FILE Log Ruby object inspection format.
--log-mongo-database Name of the MongoDB database.
--log-mongo-collection Name of the MongoDB collection.
Default: whatweb.
--log-mongo-host MongoDB hostname or IP address.
Default: 0.0.0.0.
--log-mongo-username MongoDB username. Default: nil.
--log-mongo-password MongoDB password. Default: nil.
PERFORMANCE & STABILITY:
--max-threads, -t Number of simultaneous threads. Default: 25.
--open-timeout Time in seconds. Default: 15.
--read-timeout Time in seconds. Default: 30.
--wait=SECONDS Wait SECONDS between connections.
This is useful when using a single thread.
HELP & MISCELLANEOUS:
--short-help Short usage help.
--help, -h Complete usage help.
--debug Raise errors in plugins.
--version Display version information.
EXAMPLE USAGE:
*
./
* slashdot with verbose plugin descriptions.
./whatweb - slashdot
* An aggressive scan detects the exact version of WordPress.
./whatweb -a 3
* Scan the local network quickly and suppress errors.
whatweb --no-errors 192.168.0.0/24
* Scan the local network for https websites.
whatweb --no-errors --url-prefix 192.168.0.0/24
* Scan for crossdomain policies in the Alexa Top 1000.
./whatweb -i \
--url-suffix /l -p crossdomain_xml
OPTIONAL DEPENDENCIES
--------------------------------------------------------------------------------
To enable MongoDB logging install the mongo gem.
To enable character set detection and MongoDB logging install the rchardet gem.
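As the output shows, WhatWeb offers a rich set of options. Here I run WhatWeb with the -v option to fingerprint one specific target, and use that run to trace WhatWeb's execution flow.
Set a breakpoint at line 680 of the whatweb source and start debugging.
The code from this point through line 741 initializes variables; within it, w builds the argument parser. Continuing downward, lines 743 through 933 parse the values supplied by the user, as shown below:
Since we passed only the -v option, set a breakpoint at line 745, where the following can be seen:
The variable verbose is incremented by 1. Execution then jumps to the terminal colour configuration, which is set according to the operating system type.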
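To make the "-v increments a counter" behaviour concrete, here is a minimal sketch. It uses Ruby's standard OptionParser purely for illustration, which is not necessarily the parser WhatWeb itself uses, and parse_verbosity is a hypothetical helper name that does not come from the WhatWeb source.

```ruby
require 'optparse'

# Hypothetical helper (not from WhatWeb): counts how often -v / --verbose
# appears and returns the remaining non-option arguments as targets.
def parse_verbosity(argv)
  verbose = 0
  parser = OptionParser.new do |opts|
    # Each occurrence of the flag bumps the counter by one.
    opts.on('-v', '--verbose', 'Increase verbosity') { verbose += 1 }
  end
  targets = parser.parse(argv) # non-destructive; returns leftover arguments
  [verbose, targets]
end
```

Calling parse_verbosity(['-v', 'example.com']) gives a verbosity of 1 with example.com left over as the target; passing the flag twice gives 2, matching the "use twice for debugging" note in the help text.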
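Stepping further, the plugins to use for detection are chosen according to a series of conditions. If no custom plugin is specified, the default plugins are loaded. Because I passed only -v, use_custom_plugin is false and plugin_selection is nil. Let's step into PluginSupport.load_plugins, the function that loads the plugin directories.
Inside load_plugins, the default directories are set.
Plugin detection files are searched for under these directories. The code below loads the matching plugin files.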
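The directory scan followed by loading each file can be sketched roughly as follows. This is an assumption-laden illustration, not the body of PluginSupport.load_plugins: the helper names find_plugin_files and load_plugin_files are hypothetical, and the real function may search different patterns or apply the +/- modifiers described in the help text.

```ruby
# Hypothetical helper: collect every Ruby file directly inside the
# given plugin directories, in a stable (sorted) order.
def find_plugin_files(dirs)
  dirs.flat_map { |dir| Dir.glob(File.join(dir, '*.rb')) }.sort
end

# Hypothetical helper: evaluate each plugin file with Kernel#load,
# the same mechanism as the `load f` call mentioned in this article.
def load_plugin_files(dirs)
  files = find_plugin_files(dirs)
  files.each { |f| load f }
  files
end
```

Using load rather than require means a file is evaluated every time it is loaded, which suits plugin scripts that register themselves as a side effect.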
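Step on into the load_plugin function and follow the load f call on line 298; see the three screenshots below.
Taken together, these make the assignment below easier to understand.
Further on there is a call to a function that optimizes the plugins:
Stepping in shows that it refines the plugin detection scripts further.
After returning from that function and continuing, we reach the definition of the HTTP request headers, which can be user-defined or left at their default values.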
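The header rules stated in the help output (a custom header replaces a default one, and an empty value removes it) can be sketched like this. The names DEFAULT_HEADERS and build_headers are hypothetical, and the default set here holds only the User-Agent string shown in the help text; WhatWeb's real defaults may contain more.

```ruby
# Assumed default set: only the User-Agent value shown in the help output.
DEFAULT_HEADERS = { 'User-Agent' => 'WhatWeb/0.4.8-dev' }.freeze

# Hypothetical helper: merge "Name:Value" strings over the defaults.
def build_headers(custom, defaults = DEFAULT_HEADERS)
  headers = defaults.dup
  custom.each do |line|
    name, value = line.split(':', 2)
    name = name.strip
    value = value.to_s.strip
    # Header names are case-insensitive, so drop any existing variant first.
    headers.delete_if { |key, _| key.casecmp?(name) }
    # An empty value means "remove this header", so only re-add non-empty ones.
    headers[name] = value unless value.empty?
  end
  headers
end
```

With this sketch, build_headers(['Foo:Bar']) keeps the default User-Agent and adds Foo, while build_headers(['User-Agent:']) returns an empty hash.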
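Nothing much to say here. Moving on, next comes the filtering of target URLs:
Step in to take a look:
The regular expressions here deserve attention. The one below matches IP range strings of the form 192.168.0.1-200, and the one after it is written so that it does not match a single IP address.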
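For readers without the screenshot, the idea of recognising a range such as 192.168.0.1-200 while leaving a single address alone can be sketched as below. The pattern RANGE_RE and the helper expand_ip_range are my own illustrations and are not the regular expressions from the WhatWeb source.

```ruby
# Illustrative pattern (not WhatWeb's): three octets, then "first-last"
# in the final octet, e.g. 192.168.0.1-200.
RANGE_RE = /\A(\d{1,3}\.\d{1,3}\.\d{1,3})\.(\d{1,3})-(\d{1,3})\z/

# Hypothetical helper: expand a range string into individual addresses.
# Anything that is not a range (such as a single IP) is returned unchanged.
def expand_ip_range(target)
  match = RANGE_RE.match(target)
  return [target] unless match

  prefix = match[1]
  first = match[2].to_i
  last = match[3].to_i
  (first..last).map { |host| "#{prefix}.#{host}" }
end
```

A single address has no hyphenated final octet, so it falls through the range pattern and is passed along as one target.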
Next, the URL is normalized:
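The exact normalization rules are only visible in the missing screenshot, so the following is a guess at the general shape rather than WhatWeb's logic: it assumes a bare hostname gets an http:// scheme and an empty path becomes /. The helper name normalize_target is hypothetical.

```ruby
require 'uri'

# Hypothetical helper: give a bare target a scheme and a root path.
# Both rules are assumptions made for illustration.
def normalize_target(target)
  url = target.strip
  # Prepend http:// when no scheme such as http:// or https:// is present.
  url = "http://#{url}" unless url.match?(%r{\A[a-z][a-z0-9+.-]*://}i)
  uri = URI.parse(url)
  uri.path = '/' if uri.path.empty?
  uri.to_s
end
```

Normalizing up front means later stages can treat every target as a full URL and do not need to special-case bare hostnames.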
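Now comes the most important part: processing the specified URL and obtaining its fingerprint. This is the core code. I spent a long time debugging this part and still find it somewhat confusing, because it is multithreaded and the thread switches make it easy to lose track. I may not explain it completely clearly, but here is an attempt.
This calls the next_target function to obtain the target URL. Step into next_target to take a look:
This is the assignment step. At the end there is a check on whether the number of recent targets exceeds 100:
After returning from next_target, execution continues into a code section resembling a do…while loop, and jumps to the check on whether the thread count exceeds the default limit.
Stepping further, execution jumps into the w(do) do |thistarget| block, shown below: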
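Because the thread switching is the hard part to follow in a debugger, a stripped-down sketch of the pattern may help: take the next target, wait while the number of live threads is at the limit, then hand the target to a new thread. Everything here (scan_all, the polling interval, the result list guarded by a Mutex) is an assumption for illustration and not the WhatWeb implementation.

```ruby
# Hypothetical sketch of a bounded thread-per-target loop.
# `work` is whatever should be done for one target.
def scan_all(targets, max_threads: 25, &work)
  queue = targets.dup
  results = []
  lock = Mutex.new # protects `results`, which all threads append to
  threads = []

  while (target = queue.shift)
    # Drop finished threads, then wait until there is a free slot.
    threads.select!(&:alive?)
    while threads.size >= max_threads
      sleep 0.01
      threads.select!(&:alive?)
    end

    # Passing `target` as an argument gives each thread its own copy of the
    # reference, so later loop iterations cannot change it underneath.
    threads << Thread.new(target) do |thistarget|
      outcome = work.call(thistarget)
      lock.synchronize { results << [thistarget, outcome] }
    end
  end

  threads.each(&:join)
  results
end
```

Results arrive in completion order, not submission order, which is one reason stepping through such code in a debugger feels like jumping between unrelated places.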
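Step in to reach the initialization function for the target:
While debugging continues, execution switches back and forth between several code sections. Keep stepping:
Here some parameters are set and the target URL is then requested, yielding the HTTP Response with all of its fields in detail.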
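As a rough picture of "set some parameters, request the URL, get a response with its fields", here is a sketch built on Ruby's standard Net::HTTP. The helper name fetch and the shape of the returned hash are hypothetical; the timeout defaults are borrowed from the --open-timeout and --read-timeout values in the help output.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: issue one GET and return the parts of the response
# a fingerprinting step would care about.
def fetch(url, headers = {}, open_timeout: 15, read_timeout: 30)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https',
                  open_timeout: open_timeout,
                  read_timeout: read_timeout) do |http|
    response = http.get(uri.request_uri, headers)
    {
      status: response.code.to_i,  # e.g. 200
      headers: response.to_hash,   # lowercase names mapped to arrays of values
      body: response.body
    }
  end
end
```

The status, headers and body are the raw material that the plugin matching step then inspects.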
Continuing, we arrive at the part that matches the HTTP response against the plugin files:
Stepping in shows the implementing code:
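The matching idea can be sketched with the two match types shown in the --custom-plugin help examples, :text and :version. This is a simplified illustration with hypothetical names (SKETCH_PLUGINS, match_plugins); the real matcher supports more match types and is structured differently.

```ruby
# Hypothetical plugin table using the :text and :version keys from the
# --custom-plugin examples in the help output.
SKETCH_PLUGINS = {
  'Custom-Plugin' => [{ text: 'powered by abc' }],
  'Versioned' => [{ version: /powered[ ]?by ab([0-9])/ }]
}.freeze

# Hypothetical helper: return the plugins whose matches fire on the body.
def match_plugins(body, plugins = SKETCH_PLUGINS)
  found = {}
  plugins.each do |name, matches|
    matches.each do |m|
      # :text is a plain substring test.
      if m[:text] && body.include?(m[:text])
        (found[name] ||= []) << { text: m[:text] }
      end
      # :version is a regexp; report the first capture group if there is one.
      if m[:version] && (md = m[:version].match(body))
        (found[name] ||= []) << { version: md[1] || md[0] }
      end
    end
  end
  found
end
```

A body containing "powered by ab7" fires only the Versioned entry and reports version 7, while "powered by abc" fires only Custom-Plugin.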
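The implementation above involves the use of locks.
Finally, the results are output.
The walkthrough of the execution flow above is still fairly rough; the next article will dig deeper into some of the implementation details.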