This article shows how to send HTTP requests through a specific network interface (NIC) in Python.
Requirement: a machine has several NICs. When accessing a given URL, how do we send the data through a specific NIC?

    $ curl --interface eth0 www.baidu.com   # curl's --interface option selects the NIC
Reading the source of urllib.py, the call chain can be traced through open_http -> httplib.HTTP -> httplib.HTTP._connection_class = HTTPConnection.
HTTPConnection accepts a source_address when it is created.
When HTTPConnection.connect runs, it calls HTTPConnection._create_connection, which is socket.create_connection.
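The article works with Python 2.7. As a minimal sketch of the same idea in Python 3, where httplib became http.client, the source_address argument can be passed straight to HTTPConnection without touching the standard library. The helper name fetch_from and its parameters are made up for this illustration.

```python
import http.client


def fetch_from(source_ip, host, path="/", port=80, timeout=10):
    """GET http://host:port/path with the socket bound to source_ip.

    Port 0 in source_address lets the OS pick a free local port.
    fetch_from is a hypothetical helper, not part of any library.
    """
    conn = http.client.HTTPConnection(
        host, port, timeout=timeout, source_address=(source_ip, 0)
    )
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


if __name__ == "__main__":
    # 192.168.20.2 is the en0 address from the article's machine;
    # replace it with an address that exists on yours.
    status, body = fetch_from("192.168.20.2", "ip.haschek.at")
    print(status, body)
```

This only covers code that builds its own HTTPConnection. Libraries that create connections internally still need the patching approach described in the rest of the article.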
# First, look at the local NIC information
$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
    options=3<RXCSUM,TXCSUM>
    inet6 ::1 prefixlen 128
    inet 127.0.0.1 netmask 0xff000000
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
    nd6 options=1<PERFORMNUD>
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether c8:e0:eb:17:3a:73
    inet6 fe80::cae0:ebff:fe17:3a73%en0 prefixlen 64 scopeid 0x4
    inet 192.168.20.2 netmask 0xffffff00 broadcast 192.168.20.255
    nd6 options=1<PERFORMNUD>
    media: autoselect
    status: active
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    options=4<VLAN_MTU>
    ether 0c:5b:8f:27:9a:64
    inet6 fe80::e5b:8fff:fe27:9a64%en8 prefixlen 64 scopeid 0xa
    inet 192.168.8.100 netmask 0xffffff00 broadcast 192.168.8.255
    nd6 options=1<PERFORMNUD>
    media: autoselect (100baseTX <full-duplex>)
    status: active
There are two NICs, en0 and en1, and both can reach the public internet. lo0 is the local loopback interface.
To test, modify socket.py directly.
    def create_connection(address, timeout=_GLOBAL_DEFAULT_TIMEOUT,
                          source_address=None):
        """If *source_address* is set it must be a tuple of (host, port)
        for the socket to bind as a source address before making the connection.
        An host of '' or port 0 tells the OS to use the default.

        In other words: if source_address is given it must be a (host, port)
        tuple, and the default is ("", 0).
        """
        host, port = address
        err = None
        for res in getaddrinfo(host, port, 0, SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                sock = socket(af, socktype, proto)
                # Uncomment exactly one of these per test run:
                # sock.bind(("192.168.20.2", 0))   # en0
                # sock.bind(("192.168.8.100", 0))  # en1
                # sock.bind(("127.0.0.1", 0))      # lo0
                if timeout is not _GLOBAL_DEFAULT_TIMEOUT:
                    sock.settimeout(timeout)
                if source_address:
                    # Wrap the tuple so %-formatting treats it as one value
                    print "socket bind source_address: %s" % (source_address,)
                    sock.bind(source_address)
                sock.connect(sa)
                return sock
            except error as _:
                err = _
                if sock is not None:
                    sock.close()
        if err is not None:
            raise err
        else:
            raise error("getaddrinfo returns an empty list")
Following the docstring, run the test three times, each time binding the IP address of a different NIC with the port set to 0.
# Test en0
$ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
.148.245.16

# Test en1
$ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
.94.115.227

# Test lo0
$ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
    h.endheaders(data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 578, in create_connection
    raise err
IOError: [Errno socket error] [Errno 49] Can't assign requested address
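The test passes: en0 and en1 each report a different public IP, and the lo0 failure is expected because a loopback address cannot be used to reach a public host. So on a multi-NIC machine it is enough to bind the socket to the IP of the chosen NIC when it is created, with the port set to 0. If the port is not 0, the second request raises an exception because the port is already in use.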
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
    h.endheaders(data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 577, in create_connection
    raise err
IOError: [Errno socket error] [Errno 48] Address already in use
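The difference between port 0 and a fixed port can be reproduced without any network access. The following is a small Python 3 sketch (the article itself uses Python 2.7) that binds on the loopback address. The helper names bind_ephemeral and try_bind are made up for this illustration, and it uses two simultaneously open sockets rather than two consecutive requests, so it illustrates the same error code rather than the exact scenario above.

```python
import errno
import socket


def bind_ephemeral(ip):
    """Bind a TCP socket to (ip, 0) so the OS assigns a free local port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((ip, 0))
    return sock


def try_bind(ip, port):
    """Try to bind a fresh TCP socket to (ip, port).

    Returns 0 on success, otherwise the errno of the failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((ip, port))
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


def demo():
    first = bind_ephemeral("127.0.0.1")
    port = first.getsockname()[1]
    print("OS-assigned port:", port)
    # A second socket asking for that exact port is refused.
    code = try_bind("127.0.0.1", port)
    print("second bind:", errno.errorcode.get(code, code))
    first.close()


if __name__ == "__main__":
    demo()
```

With port 0 every socket gets its own local port, which is why repeated requests keep working.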
In a real project, there is no need to edit socket.py: just make sure the source_address parameter of socket.create_connection is set to (IP, 0) for the desired NIC.
    # test-interface_urllib.py
    import socket
    import urllib, urllib2

    # Keep a reference to the original function before replacing it
    _create_socket = socket.create_connection

    SOURCE_ADDRESS = ("127.0.0.1", 0)
    #SOURCE_ADDRESS = ("172.28.153.121", 0)
    #SOURCE_ADDRESS = ("172.16.30.41", 0)

    def create_connection(*args, **kwargs):
        # Positional layout is (address, timeout, source_address)
        if len(args) >= 3:
            args = list(args)
            args[2] = SOURCE_ADDRESS
            args = tuple(args)
        else:
            kwargs["source_address"] = SOURCE_ADDRESS
        print "args", args
        print "kwargs", str(kwargs)
        return _create_socket(*args, **kwargs)

    # Every caller of socket.create_connection now binds to SOURCE_ADDRESS
    socket.create_connection = create_connection

    print urllib.urlopen("http://ip.haschek.at").read()
Testing confirms that data is now sent through the specified NIC, and the reported IP matches the address assigned to that NIC.
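The patch above stays in place for the rest of the process. As a hedged variation in Python 3, the same replacement can be limited to a block of code and undone afterwards. The names force_source_address and outgoing_address are invented for this sketch.

```python
import contextlib
import socket


def force_source_address(original, source_address):
    """Wrap a create_connection-style callable so it always binds to source_address."""
    def wrapper(address, *args, **kwargs):
        # Positional layout is (address, timeout, source_address):
        # keep a positional timeout, drop any positional source_address.
        kwargs["source_address"] = source_address
        return original(address, *args[:1], **kwargs)
    return wrapper


@contextlib.contextmanager
def outgoing_address(source_ip):
    """Temporarily patch socket.create_connection to bind to (source_ip, 0)."""
    original = socket.create_connection
    socket.create_connection = force_source_address(original, (source_ip, 0))
    try:
        yield
    finally:
        # Restore the real function even if the block raised
        socket.create_connection = original


if __name__ == "__main__":
    import urllib.request

    # 192.168.20.2 is the en0 address from the article; use one of your own.
    with outgoing_address("192.168.20.2"):
        print(urllib.request.urlopen("http://ip.haschek.at").read())
```

One caveat: in Python 3, http.client.HTTPConnection stores a reference to socket.create_connection when the connection object is created, so the connection has to be created inside the with block for the patch to apply.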
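Next question: crawlers usually use requests. Does the same trick work there? Testing shows that it does not, because requests does not go through the create_connection function of Python's built-in socket module.
So how does requests create its socket connection? The way to find out is the same as for urllib, by reading the source, so the details are omitted here.
With Python 2.7, the trail leads to the connection helper in urllib3: the module urllib3.util.connection, which urllib3.connection imports under the name connection.
The change is much the same as for urllib.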
    import urllib3.connection

    _create_socket = urllib3.connection.connection.create_connection
    # pass
    urllib3.connection.connection.create_connection = create_connection
    # pass
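After this change, running the script may raise an exception:

    requests.exceptions.ConnectionError: Max retries exceeded with .. Invalid argument

The exception does not appear every time. It depends on the IP range and appears to be caused by too many levels of redirect recursion. Removing socket_options from kwargs is enough to avoid it. With 127.0.0.1 as the source address, the request always fails.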
    import socket
    import urllib
    import urllib2
    import urllib3.connection
    import requests as req

    # Keep references to both original functions
    _default_create_socket = socket.create_connection
    _urllib3_create_socket = urllib3.connection.connection.create_connection

    SOURCE_ADDRESS = ("127.0.0.1", 0)
    #SOURCE_ADDRESS = ("172.28.153.121", 0)
    #SOURCE_ADDRESS = ("172.16.30.41", 0)

    def default_create_connection(*args, **kwargs):
        # urllib3 passes socket_options, which socket.create_connection
        # does not accept, so drop it if present
        kwargs.pop("socket_options", None)
        if len(args) >= 3:
            args = list(args)
            args[2] = SOURCE_ADDRESS
            args = tuple(args)
        else:
            kwargs["source_address"] = SOURCE_ADDRESS
        print "args", args
        print "kwargs", str(kwargs)
        return _default_create_socket(*args, **kwargs)

    def urllib3_create_connection(*args, **kwargs):
        if len(args) >= 3:
            args = list(args)
            args[2] = SOURCE_ADDRESS
            args = tuple(args)
        else:
            kwargs["source_address"] = SOURCE_ADDRESS
        print "args", args
        print "kwargs", str(kwargs)
        return _urllib3_create_socket(*args, **kwargs)

    socket.create_connection = default_create_connection
    # The urllib3 version occasionally fails, so use the wrapper around
    # the default socket.create_connection for urllib3 as well
    # urllib3.connection.connection.create_connection = urllib3_create_connection
    urllib3.connection.connection.create_connection = default_create_connection

    print " *** test requests: " + req.get("http://ip.haschek.at").content
    print " *** test urllib: " + urllib.urlopen("http://ip.haschek.at").read()
    print " *** test urllib2: " + urllib2.urlopen("http://ip.haschek.at").read()
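Note: patching through the path urllib3.utils.connection did not seem to take effect. (In urllib3 the module is named urllib3.util.connection, without the trailing "s".)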
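The workaround above discards socket_options because the standard socket.create_connection does not accept that keyword. As an alternative sketch in Python 3, a replacement function can apply the options itself before binding and connecting. The function name create_bound_connection is invented here, and the (level, option, value) tuple layout is an assumption about how the caller supplies the options. Whether this avoids the "Invalid argument" error described above has not been verified.

```python
import socket


def create_bound_connection(address, timeout=None, source_address=None,
                            socket_options=None):
    """Connect to address, optionally binding to source_address first.

    socket_options is an iterable of (level, option, value) tuples that
    are applied with setsockopt before the socket is bound and connected.
    """
    host, port = address
    last_error = None
    for af, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, 0, socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(af, socktype, proto)
            for level, option, value in socket_options or ():
                sock.setsockopt(level, option, value)
            if timeout is not None:
                sock.settimeout(timeout)
            if source_address:
                sock.bind(source_address)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # Remember the failure and try the next resolved address
            last_error = exc
            if sock is not None:
                sock.close()
    if last_error is not None:
        raise last_error
    raise OSError("getaddrinfo returned an empty list")


if __name__ == "__main__":
    # 192.168.20.2 is the en0 address from the article; use one of your own.
    conn = create_bound_connection(
        ("ip.haschek.at", 80),
        timeout=10,
        source_address=("192.168.20.2", 0),
        socket_options=[(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)],
    )
    print(conn.getsockname())
    conn.close()
```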
One more refinement: look up the IP automatically from the NIC name.
    import random
    import subprocess

    def get_all_net_devices():
        # /sys/class/net exists on Linux only
        sub = subprocess.Popen("ls /sys/class/net", shell=True, stdout=subprocess.PIPE)
        out, _ = sub.communicate()
        net_devices = out.strip().splitlines()  # e.g. ['eth0', 'eth2', 'lo']
        # Simple filter on the NIC names; adjust to your needs
        net_devices = [i for i in net_devices if "ppp" in i]
        return net_devices

    ALL_DEVICES = get_all_net_devices()

    def get_local_ip(device_name):
        # Query the named device and take the second field of its "inet " line.
        # Assumes ifconfig prints "inet <ip> ..."; builds that print
        # "inet addr:<ip>" need different parsing.
        cmd = "/sbin/ifconfig %s | grep 'inet ' | awk '{print $2}'" % device_name
        sub = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
        out, _ = sub.communicate()
        return out.strip()

    def random_local_ip():
        return get_local_ip(random.choice(ALL_DEVICES))

    # code ...
Then replace SOURCE_ADDRESS in args[2] = SOURCE_ADDRESS and kwargs["source_address"] = SOURCE_ADDRESS with (random_local_ip(), 0) or (get_local_ip("eth0"), 0). The value must still be an (IP, port) tuple.
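The shell pipeline depends on the exact ifconfig output format. As a rough Python 3 sketch, the same lookup can be done by parsing the ifconfig text in Python, which handles both the "inet 192.168.20.2" and the older "inet addr:192.168.20.2" layouts. The function names parse_ipv4_by_interface and source_address_for are hypothetical, and only the first IPv4 address per interface is kept.

```python
import re
import subprocess

# Interface header lines start in column 0, e.g. "en0: flags=..." or "eth0   Link encap:..."
_HEADER = re.compile(r"^([A-Za-z0-9_.\-]+):?\s")
# Matches "inet 1.2.3.4" and "inet addr:1.2.3.4", but not "inet6 ..."
_INET = re.compile(r"\binet (?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})")


def parse_ipv4_by_interface(ifconfig_output):
    """Return {interface_name: first_ipv4_address} parsed from ifconfig text."""
    addresses = {}
    current = None
    for line in ifconfig_output.splitlines():
        if line and not line[0].isspace():
            match = _HEADER.match(line)
            current = match.group(1) if match else None
        match = _INET.search(line)
        if match and current and current not in addresses:
            addresses[current] = match.group(1)
    return addresses


def source_address_for(device_name, ifconfig_output):
    """Build the (ip, 0) tuple expected by source_address for one interface."""
    return (parse_ipv4_by_interface(ifconfig_output)[device_name], 0)


if __name__ == "__main__":
    # Assumes an ifconfig binary at /sbin/ifconfig, as in the article.
    output = subprocess.run(
        ["/sbin/ifconfig"], capture_output=True, text=True, check=True
    ).stdout
    print(parse_ipv4_by_interface(output))
```

Parsing in Python also removes the need for shell=True and the grep/awk pipeline.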