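Example: Sending HTTP Requests from Python Through a Specified Network Interface
Requirement: a machine has several network interfaces (NICs). How can a request to a given URL be sent through one specific interface?
$ curl --interface eth0 www.baidu.com # curl's --interface option selects the NIC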
Reading the source of urllib.py, the call chain can be traced as open_http -> httplib.HTTP -> httplib.HTTP._connection_class = HTTPConnection
HTTPConnection accepts a source_address when it is created.
When HTTPConnection.connect runs, it calls HTTPConnection._create_connection, which is socket.create_connection.
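The same chain exists in Python 3, where httplib was renamed http.client. As a minimal sketch (assuming Python 3 rather than the article's Python 2.7, and using the loopback address purely as a placeholder), the source address can be handed to HTTPConnection directly:

```python
import http.client

# Assumption: 127.0.0.1 stands in for the IP of the interface you want to use.
SOURCE_ADDRESS = ("127.0.0.1", 0)  # port 0 lets the OS pick a free source port

# Nothing is sent yet; the socket is only created when a request is issued.
conn = http.client.HTTPConnection(
    "example.com", 80, timeout=5, source_address=SOURCE_ADDRESS
)
print(conn.source_address)
```

When the connection is eventually opened, this tuple is what gets passed on to socket.create_connection.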
# First, look at the local network interfaces
$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
    options=3<RXCSUM,TXCSUM>
    inet6 ::1 prefixlen 128
    inet 127.0.0.1 netmask 0xff000000
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
    nd6 options=1<PERFORMNUD>
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether c8:e0:eb:17:3a:73
    inet6 fe80::cae0:ebff:fe17:3a73%en0 prefixlen 64 scopeid 0x4
    inet 192.168.20.2 netmask 0xffffff00 broadcast 192.168.20.255
    nd6 options=1<PERFORMNUD>
    media: autoselect
    status: active
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    options=4<VLAN_MTU>
    ether 0c:5b:8f:27:9a:64
    inet6 fe80::e5b:8fff:fe27:9a64%en8 prefixlen 64 scopeid 0xa
    inet 192.168.8.100 netmask 0xffffff00 broadcast 192.168.8.255
    nd6 options=1<PERFORMNUD>
    media: autoselect (100baseTX <full-duplex>)
    status: active
There are two NICs, en0 and en1, and both can reach the public internet. lo0 is the local loopback interface.
For a quick test, modify socket.py directly.
    def create_connection(address, timeout=_GLOBAL_DEFAULT_TIMEOUT,
                          source_address=None):
        """If *source_address* is set it must be a tuple of (host, port)
        for the socket to bind as a source address before making the
        connection. An host of '' or port 0 tells the OS to use the default.

        Note: if source_address is given it must be a (host, port) tuple;
        the default is ("", 0).
        """
        host, port = address
        err = None
        for res in getaddrinfo(host, port, 0, SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                sock = socket(af, socktype, proto)
                # Uncomment exactly one of these lines per test run:
                # sock.bind(("192.168.20.2", 0))   # en0
                # sock.bind(("192.168.8.100", 0))  # en1
                # sock.bind(("127.0.0.1", 0))      # lo0
                if timeout is not _GLOBAL_DEFAULT_TIMEOUT:
                    sock.settimeout(timeout)
                if source_address:
                    # Wrap the tuple so "%s" formats it as a single value
                    print "socket bind source_address: %s" % (source_address,)
                    sock.bind(source_address)
                sock.connect(sa)
                return sock

            except error as _:
                err = _
                if sock is not None:
                    sock.close()

        if err is not None:
            raise err
        else:
            raise error("getaddrinfo returns an empty list")
Following the docstring, run the test three times, each time binding the IP address of a different interface with the port set to 0.
    # Test en0
    $ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
    .148.245.16

    # Test en1
    $ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
    .94.115.227

    # Test lo0
    $ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
        return opener.open(url)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
        return getattr(self, name)(url)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
        h.endheaders(data)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
        self._send_output(message_body)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
        self.send(msg)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
        self.connect()
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
        self.timeout, self.source_address)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 578, in create_connection
        raise err
    IOError: [Errno socket error] [Errno 49] Can't assign requested address
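The test works: on a machine with several NICs, it is enough to bind the socket to the IP of the chosen NIC when it is created, with the port set to 0. The lo0 run fails as expected, because a socket bound to the loopback address cannot reach a public host. If the port is not 0, the second request raises an exception because the port is already in use: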
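The effect of a port of 0 can be checked without editing socket.py. The following is a minimal sketch, assuming Python 3 and using a throwaway loopback listener instead of a public site, so the addresses are placeholders rather than the article's NIC addresses:

```python
import socket

# A throwaway listener on loopback so the sketch needs no external network.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
target = listener.getsockname()

# Bind the client side to a chosen local IP; port 0 means "any free port".
client = socket.create_connection(
    target, timeout=5, source_address=("127.0.0.1", 0)
)
local_ip, local_port = client.getsockname()
print("connected from %s:%d" % (local_ip, local_port))

client.close()
listener.close()
```

The OS replaces the 0 with an ephemeral port, which is why repeated requests do not collide with each other.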
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
        return opener.open(url)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
        return getattr(self, name)(url)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
        h.endheaders(data)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
        self._send_output(message_body)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
        self.send(msg)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
        self.connect()
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
        self.timeout, self.source_address)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 577, in create_connection
        raise err
    IOError: [Errno socket error] [Errno 48] Address already in use
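The same errno can be reproduced locally. This is only a sketch of the underlying condition (two sockets claiming the same fixed local port), assuming Python 3 and no SO_REUSEADDR or SO_REUSEPORT options; it is not the exact sequence urllib goes through:

```python
import errno
import socket

first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))           # let the OS choose a port
fixed_port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", fixed_port))  # ask for the same port again
    bind_error = None
except OSError as exc:
    bind_error = exc.errno
finally:
    second.close()
    first.close()

print(bind_error == errno.EADDRINUSE)
```

A hard-coded source port behaves like the second bind here whenever an earlier socket still holds that port.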
In a real project there is no need to edit the standard library: it is enough to make sure the source_address parameter of socket.create_connection is set to (IP, 0) for the desired NIC.
    # test-interface_urllib.py
    import socket
    import urllib, urllib2

    # Keep a reference to the original function before replacing it
    _create_socket = socket.create_connection

    SOURCE_ADDRESS = ("127.0.0.1", 0)
    #SOURCE_ADDRESS = ("172.28.153.121", 0)
    #SOURCE_ADDRESS = ("172.16.30.41", 0)

    def create_connection(*args, **kwargs):
        # Positional layout is (address, timeout, source_address)
        in_args = False
        if len(args) >= 3:
            args = list(args)
            args[2] = SOURCE_ADDRESS
            args = tuple(args)
            in_args = True
        if not in_args:
            kwargs["source_address"] = SOURCE_ADDRESS
        print "args", args
        print "kwargs", str(kwargs)
        return _create_socket(*args, **kwargs)

    # Every caller of socket.create_connection now binds to SOURCE_ADDRESS
    socket.create_connection = create_connection

    print urllib.urlopen("http://ip.haschek.at").read()
Testing confirms that data is now sent through the specified NIC, and the reported IP address matches the one assigned to that NIC.
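A global patch like this stays in effect for the whole process. As a hedged variation (assuming Python 3; the helper name force_source_ip is hypothetical and not part of the original script), the patch can be scoped with a context manager that restores the original function on exit:

```python
import contextlib
import socket


@contextlib.contextmanager
def force_source_ip(source_ip):
    """Temporarily make socket.create_connection bind to (source_ip, 0)."""
    original = socket.create_connection

    def patched(address, *args, **kwargs):
        # Positional layout after address is (timeout, source_address).
        if len(args) >= 2:
            args = (args[0], (source_ip, 0)) + tuple(args[2:])
        else:
            kwargs["source_address"] = (source_ip, 0)
        return original(address, *args, **kwargs)

    socket.create_connection = patched
    try:
        yield
    finally:
        socket.create_connection = original


# Local demonstration against a throwaway loopback listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

with force_source_ip("127.0.0.1"):  # placeholder for a real NIC address
    sock = socket.create_connection(listener.getsockname(), 5)
    bound_ip = sock.getsockname()[0]
    sock.close()

listener.close()
print(bound_ip)
```

Code that captured socket.create_connection before the with block keeps using the original function, so such objects have to be created inside the block.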
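Next question: crawlers usually rely on requests. Does the same trick work there? Testing shows that it does not, because requests does not go through the standard library's socket.create_connection.
So how does requests create its socket connections? The way to find out is the same as for urllib: read the source and follow the calls. The details are omitted here.
With Python 2.7, the trail leads to the create_connection function in urllib3.util.connection.
The patch is much the same as the one for urllib.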
    import urllib3.connection

    _create_socket = urllib3.connection.connection.create_connection
    # pass
    urllib3.connection.connection.create_connection = create_connection
    # pass
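Running this may raise an exception:
requests.exceptions.ConnectionError: Max retries exceeded with .. Invalid argument
The exception does not occur every time and depends on the IP range; it appears to be caused by too many levels of redirect recursion. Removing socket_options from kwargs is enough to avoid it. With 127.0.0.1 the request always fails.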
    import socket
    import urllib
    import urllib2
    import urllib3.connection
    import requests as req

    _default_create_socket = socket.create_connection
    _urllib3_create_socket = urllib3.connection.connection.create_connection

    SOURCE_ADDRESS = ("127.0.0.1", 0)
    #SOURCE_ADDRESS = ("172.28.153.121", 0)
    #SOURCE_ADDRESS = ("172.16.30.41", 0)

    def default_create_connection(*args, **kwargs):
        # socket.create_connection does not accept socket_options, so drop it
        kwargs.pop("socket_options", None)
        in_args = False
        if len(args) >= 3:
            args = list(args)
            args[2] = SOURCE_ADDRESS
            args = tuple(args)
            in_args = True
        if not in_args:
            kwargs["source_address"] = SOURCE_ADDRESS
        print "args", args
        print "kwargs", str(kwargs)
        return _default_create_socket(*args, **kwargs)

    def urllib3_create_connection(*args, **kwargs):
        in_args = False
        if len(args) >= 3:
            args = list(args)
            args[2] = SOURCE_ADDRESS
            in_args = True
            args = tuple(args)
        if not in_args:
            kwargs["source_address"] = SOURCE_ADDRESS
        print "args", args
        print "kwargs", str(kwargs)
        return _urllib3_create_socket(*args, **kwargs)

    socket.create_connection = default_create_connection
    # The urllib3 version fails occasionally, so fall back to the wrapper
    # around the default socket.create_connection
    # urllib3.connection.connection.create_connection = urllib3_create_connection
    urllib3.connection.connection.create_connection = default_create_connection

    print " *** test requests: " + req.get("http://ip.haschek.at").content
    print " *** test urllib: " + urllib.urlopen("http://ip.haschek.at").read()
    print " *** test urllib2: " + urllib2.urlopen("http://ip.haschek.at").read()
Note: patching through the name urllib3.utils.connection did not seem to take effect.
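An alternative to monkeypatching, not used in the original script, is a transport adapter that passes source_address down to urllib3. The following is a sketch under assumptions: Python 3, an installed requests package, a hypothetical class name SourceAddressAdapter, and a placeholder loopback address.

```python
import requests
from requests.adapters import HTTPAdapter


class SourceAddressAdapter(HTTPAdapter):
    """Hypothetical adapter that pins outgoing connections to one local address."""

    def __init__(self, source_address, **kwargs):
        # Must be set before HTTPAdapter.__init__, which calls init_poolmanager.
        self.source_address = source_address
        super().__init__(**kwargs)

    def init_poolmanager(self, connections, maxsize, block=False, **pool_kwargs):
        # urllib3 forwards this keyword to each connection it opens.
        pool_kwargs["source_address"] = self.source_address
        super().init_poolmanager(connections, maxsize, block=block, **pool_kwargs)


session = requests.Session()
adapter = SourceAddressAdapter(("127.0.0.1", 0))  # placeholder IP, port 0
session.mount("http://", adapter)
session.mount("https://", adapter)
```

Requests made through this session would then use the given source address, while other sessions and other libraries in the same process are left alone.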
One more refinement: look up the IP automatically from the interface name.
    import random
    import subprocess

    def get_all_net_devices():
        # List interface names, e.g. ['eth0', 'eth1', 'lo'] (Linux only)
        output = subprocess.check_output("ls /sys/class/net", shell=True)
        net_devices = output.strip().splitlines()
        # Simple filter on the interface name; adjust to your needs
        net_devices = [i for i in net_devices if "ppp" in i]
        return net_devices

    ALL_DEVICES = get_all_net_devices()

    def get_local_ip(device_name):
        # Query the named device and take the address from its "inet" line.
        # Older Linux ifconfig prints "inet addr:<ip>"; adjust the parsing if needed.
        cmd = "/sbin/ifconfig %s | grep 'inet ' | awk '{print $2}'" % device_name
        return subprocess.check_output(cmd, shell=True).strip()

    def random_local_ip():
        return get_local_ip(random.choice(ALL_DEVICES))

    # code ...
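Then replace SOURCE_ADDRESS in args[2] = SOURCE_ADDRESS and kwargs["source_address"] = SOURCE_ADDRESS with (random_local_ip(), 0) or (get_local_ip("eth0"), 0), so that the value is still a (host, port) tuple.
What this is useful for is left to your imagination.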
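Shell pipelines built on grep and awk are sensitive to the exact ifconfig format. As a hedged alternative (assuming Python 3; the function names parse_ifconfig and source_address_for are hypothetical), the captured ifconfig text can be parsed in Python and turned straight into a source_address tuple:

```python
import re


def parse_ifconfig(text):
    """Map interface name -> list of IPv4 addresses found in ifconfig output."""
    addresses = {}
    current = None
    for line in text.splitlines():
        # Interface headers start in column 0, e.g. "en0: flags=..."
        header = re.match(r"^([A-Za-z0-9_.-]+):?\s", line)
        if header:
            current = header.group(1)
            addresses.setdefault(current, [])
            continue
        # Accept both "inet 1.2.3.4" and the older "inet addr:1.2.3.4"
        found = re.search(r"\binet (?:addr:)?(\d+\.\d+\.\d+\.\d+)", line)
        if found and current is not None:
            addresses[current].append(found.group(1))
    return addresses


def source_address_for(device_name, ifconfig_text):
    """Return (ip, 0) for the first IPv4 address of the named interface."""
    return (parse_ifconfig(ifconfig_text)[device_name][0], 0)


SAMPLE = """lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
\tinet6 ::1 prefixlen 128
\tinet 127.0.0.1 netmask 0xff000000
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
\tether c8:e0:eb:17:3a:73
\tinet 192.168.20.2 netmask 0xffffff00 broadcast 192.168.20.255
"""

print(source_address_for("en0", SAMPLE))
```

In practice the text would come from running ifconfig through subprocess; the sample here is an excerpt of the listing shown earlier in the article.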