
The Python Requests Module
Published: 2018-08-03




Dealing with HTTP requests is not an easy task in any programming language. Python 2 comes with two built-in modules, urllib and urllib2, to handle HTTP-related operations (Python 3 merged them into the urllib package). The two modules offer different sets of functionality, and they often need to be used together. The main drawbacks of urllib are that it is confusing (a few methods are available in both urllib and urllib2), the documentation is not clear, and we need to write a lot of code to make even a simple HTTP request.

To make these things simpler, an easy-to-use third-party library known as Requests is available, and most developers prefer it over urllib / urllib2. It is an Apache2-licensed HTTP library powered by urllib3 and httplib.

Installing the Requests Module

Installing this package, like most other Python packages, is pretty straightforward. You can either download the Requests source code from GitHub and install it, or use pip:

$ pip install requests

For more information regarding the installation process, refer to the official documentation.

To verify the installation, try importing it:

import requests

If you don't receive any errors importing the module, then it was successful.


Making a GET Request

GET is by far the most used HTTP method. We can use a GET request to retrieve data from any destination. Let me start with a simple example. Suppose we want to fetch the content of a URL, in this case GitHub's public events API, and print out the resulting data. Using the Requests module, we can do it like below:

import requests

r = requests.get('https://api.github.com/events')  
print(r.content) 

This prints the response body as raw bytes. If you want to see the decoded text of the response, read the .text property of this object. Similarly, the status_code property holds the HTTP status code of the response:

import requests

r = requests.get('https://api.github.com/events')  
print(r.text)  
print(r.status_code) 


requests will decode the raw content and show you the result. If you want to check which encoding requests is using, print the value of the .encoding property. You can even change the encoding by assigning a new value to it. Now isn't that simple?
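To make the relationship between .content, .encoding and .text concrete, here is a minimal sketch that runs without a network connection. The Response object is assembled by hand from an in-memory byte stream, which is only a stand-in for what requests.get() would return; the sample text is made up for the illustration.

```python
import io
import requests

# Hand-built stand-in for a real response (an assumption made so the
# example runs offline); normally requests.get() returns this object.
r = requests.Response()
r.status_code = 200
r.raw = io.BytesIO("café".encode("utf-8"))

r.encoding = "utf-8"
utf8_text = r.text        # bytes decoded as UTF-8

r.encoding = "latin-1"
latin1_text = r.text      # the same bytes decoded with another codec

print(r.content)          # the undecoded bytes never change
print(utf8_text)
print(latin1_text)
```

The bytes in .content stay the same throughout; only the interpretation applied by .text changes when .encoding is reassigned.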


Reading the Response

The response to an HTTP request can contain many headers that hold different information.

httpbin.org is a popular website for testing different HTTP operations. In this article, we will use httpbin.org/get to analyse the response to a GET request. First of all, we need to find out what the response headers look like. You can use any modern web browser to find them, but for this example, we will use Google's Chrome browser.

  • In Chrome, open the URL http://httpbin.org/get, right click anywhere on the page, and select the "Inspect" option.
  • This will open a new window within your browser. Refresh the page and click on the "Network" tab.
  • This "Network" tab will show you all the different types of network requests made by the browser. Click on the "get" request in the "Name" column and select the "Headers" tab on the right.

The content of the "Response Headers" section is what we are after. You can see the key-value pairs holding various information about the resource and request. Let's try to parse these values using the requests library:

import requests

r = requests.get('http://httpbin.org/get')  
print(r.headers['Access-Control-Allow-Credentials'])  
print(r.headers['Access-Control-Allow-Origin'])  
print(r.headers['CONNECTION'])  
print(r.headers['content-length'])  
print(r.headers['Content-Type'])  
print(r.headers['Date'])  
print(r.headers['server'])  
print(r.headers['via'])

We retrieved the header information using r.headers, and we can access each header value using its key. Note that the keys are not case-sensitive.
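The case-insensitive lookup comes from the mapping type that requests uses for r.headers, CaseInsensitiveDict. The sketch below builds one by hand so it runs offline; the header values are made up, and the .get() call is a suggestion for headers that a server may not send rather than something the examples above rely on.

```python
from requests.structures import CaseInsensitiveDict

# Stand-in for r.headers; the values are invented for this example.
headers = CaseInsensitiveDict({
    "Content-Type": "application/json",
    "Content-Length": "265",
})

# Any capitalisation of the key finds the same value.
print(headers["content-type"])
print(headers["CONTENT-LENGTH"])

# Indexing a header that was not sent raises KeyError, so .get() with
# a default is the safer choice for optional headers.
via = headers.get("Via", "<not sent>")
print(via)
```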


Similarly, let's try to access the response body. The headers above show that the response is in JSON format (Content-Type: application/json). The Requests library comes with a built-in JSON decoder, and we can use requests.get('url').json() to parse the body into a Python object. The value for each key of the result can then be read easily, like below:

import requests

r = requests.get('http://httpbin.org/get')

response = r.json()  
print(r.json())  
print(response['args'])  
print(response['headers'])  
print(response['headers']['Accept'])  
print(response['headers']['Accept-Encoding'])  
print(response['headers']['Connection'])  
print(response['headers']['Host'])  
print(response['headers']['User-Agent'])  
print(response['origin'])  
print(response['url']) 

The code above prints output similar to the following:

{'headers': {'Host': 'httpbin.org', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'close', 'Accept': '*/*', 'User-Agent': 'python-requests/2.9.1'}, 'url': 'http://httpbin.org/get', 'args': {}, 'origin': '103.9.74.222'}
{}
{'Host': 'httpbin.org', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'close', 'Accept': '*/*', 'User-Agent': 'python-requests/2.9.1'}
*/*
gzip, deflate  
close  
httpbin.org  
python-requests/2.9.1  
103.9.74.222  
http://httpbin.org/get 


The first print statement, print(r.json()), printed the whole JSON value of the response. We stored the JSON value in the variable response and then printed out the value for each key. Note that, unlike in the previous example, the keys here are case-sensitive.

Similar to JSON and text content, we can use requests to read the response content as bytes, for non-text responses, using the .content property. Bodies sent with the gzip or deflate transfer-encodings are decoded automatically.

Passing Parameters in GET

In some cases, you'll need to pass parameters along with your GET requests, which take the form of query strings. To do this, we need to pass these values in the params parameter, as shown below:

import requests

payload = {'user_name': 'admin', 'password': 'password'}  
r = requests.get('http://httpbin.org/get', params=payload)

print(r.url)  
print(r.text)

Here, we are assigning our parameter values to the payload variable, and then to the GET request via params . The above code will return the following output:


http://httpbin.org/get?password=password&user_name=admin  
{"args":{"password":"password","user_name":"admin"},"headers":{"Accept":"*/*","Accept-Encoding":"gzip, deflate","Connection":"close","Host":"httpbin.org","User-Agent":"python-requests/2.9.1"},"origin":"103.9.74.222","url":"http://httpbin.org/get?password=password&user_name=admin"} 


As you can see, the Requests library automatically turned our dictionary of parameters into a query string and attached it to the URL.

Note that you need to be careful what kind of data you pass via GET requests since the payload is visible in the URL, as you can see in the output above.
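If you want to see the exact URL that a params dictionary produces before anything is sent, one option is to build a prepared request. The following is a small sketch, not part of the examples above: the "tags" key is invented to show how a list value is expanded into repeated query parameters.

```python
import requests

# "tags" is a made-up parameter used to show list handling.
payload = {"user_name": "admin", "tags": ["python", "http"]}

# Build the request without sending it; .prepare() applies the same
# URL encoding that requests.get(..., params=payload) would use.
prepared = requests.Request(
    "GET", "http://httpbin.org/get", params=payload
).prepare()

print(prepared.url)
```

Inspecting prepared.url this way is a cheap check that no sensitive value ends up in the query string by accident.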

Making POST Requests

HTTP POST requests are the opposite of GET requests: they are meant for sending data to a server rather than retrieving it. That said, POST requests can also receive data in the response, just like GET requests.

Instead of using the get() method, we need to use the post() method. For passing an argument, we can pass it inside the data parameter:

import requests

payload = {'user_name': 'admin', 'password': 'password'}  
r = requests.post("http://httpbin.org/post", data=payload)  
print(r.url)  
print(r.text) 

Output:

http://httpbin.org/post  
{"args":{},"data":"","files":{},"form":{"password":"password","user_name":"admin"},"headers":{"Accept":"*/*","Accept-Encoding":"gzip, deflate","Connection":"close","Content-Length":"33","Content-Type":"application/x-www-form-urlencoded","Host":"httpbin.org","User-Agent":"python-requests/2.9.1"},"json":null,"origin":"103.9.74.222","url":"http://httpbin.org/post"}

The data will be "form-encoded" by default. You can also pass more complicated payloads: a list of tuples if multiple values share the same key, a string instead of a dictionary, or a multipart-encoded file.
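To see what "form-encoded" means in practice, the sketch below prepares two POST requests without sending them and compares their Content-Type header and body. The json parameter is not covered in this article; it is shown here only as a contrast, on the assumption that the receiving server expects a JSON body.

```python
import requests

payload = {"user_name": "admin", "password": "password"}
url = "http://httpbin.org/post"

# data= produces a form-encoded body; json= serialises the dict as JSON.
form_req = requests.Request("POST", url, data=payload).prepare()
json_req = requests.Request("POST", url, json=payload).prepare()

print(form_req.headers["Content-Type"])
print(form_req.body)
print(json_req.headers["Content-Type"])
print(json_req.body)
```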


Sending Files with POST

Sometimes we need to send one or more files to the server in a single request. For example, a user may submit a form that includes several file-upload fields, such as a profile picture and a resume. Requests can handle multiple files in a single request. This can be achieved by putting the files in a list of tuples, like below:

import requests

url = 'http://httpbin.org/post'  
file_list = [  
    ('image', ('image1.jpg', open('image1.jpg', 'rb'), 'image/jpeg')),
    ('image', ('image2.jpg', open('image2.jpg', 'rb'), 'image/jpeg'))
]

r = requests.post(url, files=file_list)  
print(r.text) 

Each tuple containing a file's information has the form (field_name, file_info), where file_info is itself a tuple of the file name, the open file object, and the content type.
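As a rough illustration of what this tuple structure turns into, the sketch below prepares a multipart upload without sending it. The in-memory BytesIO objects and their contents are stand-ins for real image files, used so the example runs anywhere; with files on disk you would open them inside a with-block so the handles get closed.

```python
import io
import requests

# In-memory stand-ins for image1.jpg and image2.jpg (fake contents).
file_list = [
    ("image", ("image1.jpg", io.BytesIO(b"first fake image"), "image/jpeg")),
    ("image", ("image2.jpg", io.BytesIO(b"second fake image"), "image/jpeg")),
]

# Prepare the request only; nothing is sent over the network.
prepared = requests.Request(
    "POST", "http://httpbin.org/post", files=file_list
).prepare()

# requests picks the multipart boundary and sets the header itself.
print(prepared.headers["Content-Type"])
print(b'filename="image1.jpg"' in prepared.body)
print(b'filename="image2.jpg"' in prepared.body)
```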

Other HTTP Request Types

Similar to GET and POST, we can perform other HTTP requests like PUT, DELETE, HEAD, and OPTIONS using the requests library, like below:


import requests

requests.put('url', data={'key': 'value'})  
requests.delete('url')  
requests.head('url')  
requests.options('url') 

Handling Redirections

Redirection in HTTP means forwarding the network request to a different URL. For example, if we make a request to "http://www.github.com", it will redirect to "https://github.com/" using a 301 redirect.

import requests

r = requests.get("http://www.github.com")
print(r.url)  
print(r.history)  
print(r.status_code) 

Output:

https://github.com/  
[<Response [301]>, <Response [301]>]
200

As you can see, the redirection process is automatically handled by requests, so you don't need to deal with it yourself. The history property contains the list of all response objects created to complete the redirection. In our example, two Response objects were created with the 301 response code. HTTP 301 and 302 responses are used for permanent and temporary redirection, respectively.

If you don't want the Requests library to automatically follow redirects, then you can disable it by passing the allow_redirects=False parameter along with the request.
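A minimal sketch of what disabling redirects can look like is given below. The helper name first_hop and the 5-second timeout are choices made for this illustration, not part of the library; the function needs network access when it is actually called.

```python
import requests

def first_hop(url):
    """Return (status_code, redirect target) without following redirects."""
    r = requests.get(url, allow_redirects=False, timeout=5)
    # is_redirect is True only for a redirect status code that also
    # carries a Location header.
    location = r.headers.get("Location") if r.is_redirect else None
    return r.status_code, location
```

Calling first_hop("http://www.github.com") should then report the 301 response and its target instead of the final 200 page, assuming the site still redirects as in the output above.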


Handling Timeouts

Another important configuration is telling our library how to handle timeouts, or requests that take too long to return. We can configure requests to stop waiting for a response using the timeout parameter. By default, requests will not time out. So, if we don't set this parameter, our program may hang indefinitely, which is not the behavior you'd want in a process that keeps a user waiting.

import requests

requests.get('http://www.google.com', timeout=1)

Here, an exception will be thrown if the server does not respond within 1 second (which is still aggressive for a real-world application). To make this fail more often (for the sake of an example), set the timeout to a much smaller value, like 0.001.

The timeout can be configured for both the "connect" and "read" operations of the request using a tuple, which allows you to specify both values separately:

import requests

requests.get('http://www.google.com', timeout=(5, 14)) 

Here, the "connect" timeout is 5 seconds and the "read" timeout is 14 seconds. This allows your request to fail much more quickly if it can't connect to the resource, and if it does connect, it gives it more time to download the data.
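Since a timeout surfaces as an exception, calling code usually wraps the request. Below is one possible sketch; the helper name fetch_or_none and the default timeout values are assumptions chosen for the example. ConnectTimeout and ReadTimeout are the two exception classes requests raises for the connect and read phases.

```python
import requests

def fetch_or_none(url, connect_timeout=3.05, read_timeout=10):
    """Return the response, or None if either timeout was exceeded."""
    try:
        return requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        print("could not connect within", connect_timeout, "seconds")
    except requests.exceptions.ReadTimeout:
        print("no data received within", read_timeout, "seconds")
    return None
```

Returning None keeps the example short; a real application might retry or re-raise instead.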


Cookies and Custom Headers

We have seen previously how to access headers using the headers property. Similarly, we can access cookies from a response using the cookies property.

For example, the code below shows how to access a cookie named cookie_name:

import requests

r = requests.get('http://www.examplesite.com')  
print(r.cookies['cookie_name'])

We can also send custom cookies to the server by providing a dictionary to the cookies parameter in our GET request.

import requests

custom_cookie = {'cookie_name': 'cookie_value'}  
r = requests.get('http://www.examplesite.com/cookies', cookies=custom_cookie)

Cookies can also be passed in a RequestsCookieJar object. This allows you to provide cookies for different domains and paths.

import requests

jar = requests.cookies.RequestsCookieJar()  
jar.set('cookie_one', 'one', domain='httpbin.org', path='/cookies')  
jar.set('cookie_two', 'two', domain='httpbin.org', path='/other')

r = requests.get('https://httpbin.org/cookies', cookies=jar)  
print(r.text) 

Output:

{"cookies":{"cookie_one":"one"}}
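Only cookie_one appears in the output because cookie_two is bound to the /other path. The sketch below shows the same path matching without contacting the server: it uses Session.prepare_request() to build the outgoing headers for two paths and records which Cookie header each would carry. The sent dictionary is just a helper for this illustration.

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set('cookie_one', 'one', domain='httpbin.org', path='/cookies')
jar.set('cookie_two', 'two', domain='httpbin.org', path='/other')

session = requests.Session()
sent = {}
for path in ('/cookies', '/other'):
    req = requests.Request('GET', 'https://httpbin.org' + path, cookies=jar)
    # prepare_request() builds the headers but sends nothing.
    prepared = session.prepare_request(req)
    sent[path] = prepared.headers.get('Cookie')
    print(path, '->', sent[path])
```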

Similarly, we can create custom headers by assigning a dictionary to the request header using the headers parameter.

import requests

custom_header = {'user-agent': 'customUserAgent'}

r = requests.get('https://samplesite.org', headers=custom_header) 


The Session Object

The session object is mainly used to persist certain parameters, like cookies, across different HTTP requests. A session object can also reuse a single TCP connection for multiple requests and responses to the same host, which improves performance.

import requests

first_session = requests.Session()  
second_session = requests.Session()

first_session.get('http://httpbin.org/cookies/set/cookieone/111')  
r = first_session.get('http://httpbin.org/cookies')  
print(r.text)

second_session.get('http://httpbin.org/cookies/set/cookietwo/222')  
r = second_session.get('http://httpbin.org/cookies')  
print(r.text)

r = first_session.get('http://httpbin.org/anything')  
print(r.text)

Output:


{"cookies":{"cookieone":"111"}}

{"cookies":{"cookietwo":"222"}}

{"args":{},"data":"","files":{},"form":{},"headers":{"Accept":"*/*","Accept-Encoding":"gzip, deflate","Connection":"close","Cookie":"cookieone=111","Host":"httpbin.org","User-Agent":"python-requests/2.9.1"},"json":null,"method":"GET","origin":"103.9.74.222","url":"http://httpbin.org/anything"}

The httpbin path /cookies/set/{name}/{value} sets a cookie with the given name and value. Here, we set different cookie values on the first_session and second_session objects. You can see that, within a given session, the same cookie is sent with all subsequent requests.

Similarly, we can use the session object to persist certain parameters for all requests.


import requests

first_session = requests.Session()

first_session.cookies.update({'default_cookie': 'default'})

r = first_session.get('http://httpbin.org/cookies', cookies={'first-cookie': '111'})  
print(r.text)

r = first_session.get('http://httpbin.org/cookies')  
print(r.text)

Output:

{"cookies":{"default_cookie":"default","first-cookie":"111"}}

{"cookies":{"default_cookie":"default"}}

As you can see, default_cookie is sent with each request of the session. Cookies passed to an individual request through the cookies parameter are sent in addition to the session's cookies: here "first-cookie": "111" is sent alongside "default_cookie": "default". As the second output shows, such per-request cookies are not persisted in the session.
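Headers behave in a similar way to cookies. As a small sketch, the code below sets a header on the session, adds another on a single request, and inspects the merged result with prepare_request(), so nothing is sent. Both X- header names are hypothetical and exist only for this example.

```python
import requests

with requests.Session() as session:
    # Hypothetical header name, used only for this illustration.
    session.headers.update({"X-Demo-Header": "set-on-session"})

    request = requests.Request(
        "GET",
        "http://httpbin.org/headers",
        headers={"X-Request-Only": "just-this-once"},
    )
    # prepare_request() merges the session settings into the request
    # without sending anything over the network.
    prepared = session.prepare_request(request)

print(prepared.headers["X-Demo-Header"])
print(prepared.headers["X-Request-Only"])
print(prepared.headers["User-Agent"])
```

Using the session as a context manager closes its pooled connections when the block ends.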


Using Proxies

The proxies argument is used to configure a proxy server to use in your requests.

import requests

http = "http://10.10.1.10:1080"
https = "https://10.10.1.11:3128"
ftp = "ftp://10.10.1.10:8080"

proxy_dict = {
  "http": http,
  "https": https,
  "ftp": ftp
}

r = requests.get('http://sampleurl.com', proxies=proxy_dict)

The requests library also supports SOCKS proxies. This is an optional feature, and it requires the requests[socks] extra to be installed before use. Like before, you can install it using pip:

$ pip install requests[socks]

After the installation, you can use it as shown here:

proxies = {
  'http': 'socks5://user:pass@host:port',
  'https': 'socks5://user:pass@host:port'
}

SSL Handling

We can also use the Requests library to verify the HTTPS certificate of a website by passing verify=True with the request.

import requests

r = requests.get('https://www.github.com', verify=True)

This will throw an error if there is any problem with the SSL certificate of the site. If you don't want to verify, just pass False instead of True. This parameter is set to True by default.

Downloading a File

When downloading a file using requests, we can either stream the contents or download the entire thing at once. The stream flag selects between the two behaviors.

As you probably guessed, if stream is True, then requests will stream the content. If stream is False, all content will be downloaded into memory before it is returned to you.

For streaming content, we can iterate over the content chunk by chunk using the iter_content method, or line by line using iter_lines. Either way, the file is downloaded part by part.

For example:

import requests

r = requests.get('https://cdn.pixabay.com/photo/2018/07/05/02/50/sun-hat-3517443_1280.jpg', stream=True)
# The with-block closes the file even if the download fails part-way.
with open("sun-hat.jpg", "wb") as downloaded_file:
    for chunk in r.iter_content(chunk_size=256):
        if chunk:
            downloaded_file.write(chunk)

The code above will download an image from the server and save it in a local file, sun-hat.jpg.

We can also read raw data using the raw property and stream=True in the request.


import requests

r = requests.get("http://exampleurl.com", stream=True)  
r.raw

For downloading or streaming content, iter_content() is the preferred way.
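The iter_content() loop can be wrapped in a small reusable helper. The following is a sketch only: the function name save_stream and the 8192-byte default chunk size are choices made for this example, and the function expects a response that was requested with stream=True.

```python
import requests

def save_stream(response, path, chunk_size=8192):
    """Write a streamed response body to `path`; return bytes written."""
    written = 0
    # The with-block closes the file even if iteration raises.
    with open(path, "wb") as out:
        for chunk in response.iter_content(chunk_size=chunk_size):
            written += out.write(chunk)
    return written
```

With a live download this might be called as save_stream(requests.get(url, stream=True, timeout=10), "sun-hat.jpg"), where url is whatever file you want to fetch.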

Errors and Exceptions

requests throws different types of exceptions and errors if there is ever a network problem. All exceptions inherit from the requests.exceptions.RequestException class.

Here is a short description of the common errors you may run into:

  • ConnectionError is raised in case of DNS failure, a refused connection, or any other connection-related issue.
  • Timeout is raised if a request times out.
  • TooManyRedirects is raised if a request exceeds the configured maximum number of redirections.
  • HTTPError is raised for invalid HTTP responses, and by raise_for_status() when the response has a 4xx or 5xx status code.

For a more complete list and description of the exceptions you may run into, check out the official documentation.
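The exceptions listed above can be handled in a single try block. The sketch below is one way to arrange it; the function name describe_failure, the returned labels, and the 5-second timeout are all invented for the illustration. Timeout is caught before ConnectionError because a connect timeout belongs to both families.

```python
import requests

def describe_failure(url):
    """Return None on success, or a short label for what went wrong."""
    try:
        r = requests.get(url, timeout=5)
        r.raise_for_status()  # turns 4xx/5xx responses into HTTPError
    except requests.exceptions.HTTPError as exc:
        return "http error %s" % exc.response.status_code
    except requests.exceptions.Timeout:
        return "timeout"
    except requests.exceptions.TooManyRedirects:
        return "too many redirects"
    except requests.exceptions.ConnectionError:
        return "connection error"
    except requests.exceptions.RequestException:
        # Base class: catches anything else raised by requests.
        return "other requests error"
    return None
```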

Conclusion

In this tutorial I explained many of the features of the requests library and the various ways to use it. You can use the requests library not only for interacting with a REST API, but equally well for scraping data from a website or downloading files from the web.

Modify and try the above examples, and drop a comment below if you have any questions regarding requests.


Original article: http://stackabuse.com/the-python-requests-module/
