A browser is an HTTP client, and a Web server is an HTTP server.
The formats of the request and response messages are similar, and English-oriented.
Both kinds of messages consist of:
<initial line, different for request vs. response>
Header1: value1
Header2: value2
Header3: value3

<optional message body goes here, like file contents or query data;
 it can be many lines long, or even binary data $&*%@!^$@>
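The layout above can be sketched as a small parser. This is a minimal illustration, not a complete implementation: the function name parse_http_message is hypothetical, and it assumes CRLF line endings, no header lines folded across several lines, and no repeated header names.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP message into (initial_line, headers, body).

    Assumes CRLF line endings and one line per header (an assumption
    made for this sketch, not something every real message guarantees).
    """
    # The first blank line separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body
```

The body is left as bytes because it may hold binary data; only the initial line and headers are decoded to text.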
To retrieve the file at the URL
http://www.somehost.com/path/file.html
first open a socket to the host www.somehost.com, port 80.
Then, send something like the following through the socket:
GET /path/file.html HTTP/1.0
From: someuser@jmarshall.com
User-Agent: HTTPTool/1.0
[blank line here]
The server should respond with something like:
HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354

<html>
<body>
<h1>Happy New Millennium!</h1>
(more file contents)
  .
  .
  .
</body>
</html>
After sending the response, the server closes the socket.
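The exchange described here might look like the following in Python. It is a sketch under stated assumptions: build_get_request and fetch are hypothetical names, the request carries only a User-Agent: header, and the client reads until the server closes the socket, as an HTTP 1.0 server does.

```python
import socket


def build_get_request(path: str, user_agent: str = "HTTPTool/1.0") -> bytes:
    """Return the bytes of a simple HTTP 1.0 GET request."""
    # The two trailing empty strings produce the blank line that ends the headers.
    lines = [f"GET {path} HTTP/1.0", f"User-Agent: {user_agent}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str, port: int = 80) -> bytes:
    """Open a socket, send the request, and read the whole response."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the socket
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("www.somehost.com", "/path/file.html") would return the status line, headers, and body as one byte string.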
Request headers such as From: and User-Agent: help webmasters troubleshoot problems. They also reveal information about the user. When you decide which headers to include, you must balance the webmasters' logging needs against your users' needs for privacy.
If you're writing servers, consider including these headers in your responses:
Last-Modified: Fri, 31 Dec 1999 23:59:59 GMT
An HTTP message may have a body of data sent after the header lines. In a response, this is where the requested resource is returned to the client (the most common use of the message body), or perhaps explanatory text if there's an error. In a request, this is where user-entered data or uploaded files are sent to the server.
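If an HTTP message includes a body, there are usually header lines in the message that describe the body. In particular, the Content-Type: header gives the MIME type of the data in the body, such as text/html, and the Content-Length: header gives the number of bytes in the body.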
An HTTP proxy is a program that acts as an intermediary between a client and a server. It receives requests from clients, and forwards those requests to the intended servers. The responses pass back through it in the same way. Thus, a proxy has functions of both a client and a server.
When a client uses a proxy, it typically sends all requests to that proxy, instead of to the servers in the URLs. Requests to a proxy differ from normal requests in one way: in the first line, they use the complete URL of the resource being requested, instead of just the path. For example,
GET http://www.somehost.com/path/file.html HTTP/1.0
That way, the proxy knows which server to forward the request to (though the proxy itself may use another proxy).
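The difference between the two request lines can be shown with a short helper. The name request_line and its via_proxy flag are hypothetical and exist only for this illustration.

```python
from urllib.parse import urlsplit


def request_line(url: str, via_proxy: bool, version: str = "HTTP/1.0") -> str:
    """Build the first line of a GET request for a direct or proxied fetch."""
    if via_proxy:
        # A proxy needs the complete URL to know where to forward the request.
        target = url
    else:
        # A direct request carries only the path (plus any query string).
        parts = urlsplit(url)
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return f"GET {target} {version}"
```

With via_proxy=True the client would connect to the proxy's host and port rather than to www.somehost.com.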
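A POST request is used to send data to the server to be processed in some way, like by a CGI script. A POST request differs from a GET request in several ways: it sends a block of data in the message body, it usually carries extra headers such as Content-Type: and Content-Length: to describe that body, and its request URI usually names the program that handles the data rather than a resource to retrieve.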
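The most common use of POST, by far, is to submit HTML form data to CGI scripts. In this case, the Content-Type: header is usually application/x-www-form-urlencoded, and the Content-Length: header gives the length of the URL-encoded form data. For example:
POST /path/script.cgi HTTP/1.0
From: frog@jmarshall.com
User-Agent: HTTPTool/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 32

home=Cosby&favorite+flavor=flies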
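A form submission like this one could be assembled as follows. The sketch assumes the form data has already been URL-encoded, and build_post_request is a hypothetical name; the point is that Content-Length: is computed from the body rather than typed in by hand.

```python
def build_post_request(path: str, encoded_body: str) -> bytes:
    """Return an HTTP 1.0 POST request carrying URL-encoded form data."""
    body = encoded_body.encode("ascii")
    head = [
        f"POST {path} HTTP/1.0",
        "User-Agent: HTTPTool/1.0",
        "Content-Type: application/x-www-form-urlencoded",
        # Content-Length counts the bytes of the body only.
        f"Content-Length: {len(body)}",
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(head).encode("ascii") + b"\r\n\r\n" + body
```

Passing "/path/script.cgi" and "home=Cosby&favorite+flavor=flies" yields a Content-Length: of 32, matching the request shown in the text.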
Improvements include:
To comply with HTTP 1.1, clients must
Starting with HTTP 1.1, one server at one IP address can be multi-homed, i.e. the home of several Web domains. For example, "www.host1.com" and "www.host2.com" can live on the same server.
A complete HTTP 1.1 request might be
GET /path/file.html HTTP/1.1
Host: www.host1.com:80
[blank line here]
except the ":80" isn't required, since that's the default HTTP port.
If a server wants to start sending a response before knowing its total length (like with long script output), it might use the simple chunked transfer-encoding, which breaks the complete response into smaller chunks and sends them in series. You can identify such a response because it contains the "Transfer-Encoding: chunked" header. All HTTP 1.1 clients must be able to receive chunked messages.
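To show what receiving a chunked message involves, here is a minimal decoder. The wire format it assumes comes from the HTTP 1.1 specification rather than from this text: each chunk is a hexadecimal size line, the chunk data, and a CRLF, and a chunk of size zero ends the body. The name decode_chunked is hypothetical, and any footers after the last chunk are ignored.

```python
def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body of a chunked message (footers are ignored)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].decode("ascii")
        # The size is hexadecimal and may be followed by ";extension" text.
        size = int(size_field.split(";", 1)[0].strip(), 16)
        if size == 0:
            break  # a zero-length chunk marks the end of the body
        start = line_end + 2
        body += data[start:start + size]
        # Skip the chunk data and the CRLF that follows it.
        pos = start + size + 2
    return bytes(body)
```

A real client would read chunks from the socket as they arrive instead of holding the whole response in memory first.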
To comply with HTTP 1.1, servers must:
Because of the urgency of implementing the new Host: header, servers are not allowed to tolerate HTTP 1.1 requests without it. If a server receives such a request, it must return a "400 Bad Request" response, like
HTTP/1.1 400 Bad Request
Content-Type: text/html
Content-Length: 111

<html><body>
<h2>No Host: header received</h2>
HTTP 1.1 requests must include the Host: header.
</body></html>
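On the server side, the check might be sketched like this. The function name check_host_header is hypothetical, and the sketch assumes the request line and headers have already been parsed into a string and a dict.

```python
def check_host_header(request_line: str, headers: dict):
    """Return a 400 response for an HTTP 1.1 request without Host:, else None."""
    version = request_line.rsplit(" ", 1)[-1]
    # Header names are case-insensitive.
    has_host = any(name.lower() == "host" for name in headers)
    if version == "HTTP/1.1" and not has_host:
        body = (
            b"<html><body>\n"
            b"<h2>No Host: header received</h2>\n"
            b"HTTP 1.1 requests must include the Host: header.\n"
            b"</body></html>\n"
        )
        head = (
            "HTTP/1.1 400 Bad Request\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(body)}\r\n\r\n"
        )
        return head.encode("ascii") + body
    return None
```

A return value of None means the request passed this check and can be handled normally; HTTP 1.0 requests are not rejected, since the requirement applies only to HTTP 1.1.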
Caching is an important improvement in HTTP 1.1, and can't work without timestamped responses. So, servers must timestamp every response with a Date: header containing the current time, in the form
Date: Fri, 31 Dec 1999 23:59:59 GMT
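In Python, a header in this form can be produced with the standard library's email.utils.formatdate. The wrapper name date_header is hypothetical.

```python
from email.utils import formatdate


def date_header(timestamp=None) -> str:
    """Return a Date: header for the given Unix timestamp (default: now)."""
    # usegmt=True makes the zone read "GMT" instead of "-0000".
    return "Date: " + formatdate(timestamp, usegmt=True)
```

For instance, date_header(946684799) gives the example line shown above.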
All time values in HTTP use Greenwich Mean Time.
To avoid sending resources that don't need to be sent, thus saving bandwidth, HTTP 1.1 defines the If-Modified-Since: and If-Unmodified-Since: request headers. The former says "only send the resource if it has changed since this date"; the latter says "only send the resource if it has not changed since this date". Clients aren't required to use them, but HTTP 1.1 servers are required to honor requests that do use them.
Unfortunately, due to earlier HTTP versions, the date value may be in any of three possible formats:
If-Modified-Since: Fri, 31 Dec 1999 23:59:59 GMT
If-Modified-Since: Friday, 31-Dec-99 23:59:59 GMT
If-Modified-Since: Fri Dec 31 23:59:59 1999
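A server that honors these headers has to accept all three formats. The sketch below tries each in turn with datetime.strptime and then answers the If-Modified-Since: question. The names parse_http_date and modified_since are hypothetical, and the day and month names are assumed to be parsed under an English (C) locale.

```python
from datetime import datetime, timezone

# One strptime pattern per date format listed in the text.
_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # Fri, 31 Dec 1999 23:59:59 GMT
    "%A, %d-%b-%y %H:%M:%S GMT",  # Friday, 31-Dec-99 23:59:59 GMT
    "%a %b %d %H:%M:%S %Y",       # Fri Dec 31 23:59:59 1999
)


def parse_http_date(value: str) -> datetime:
    """Parse an HTTP date in any of the three formats into a GMT datetime."""
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
        # All time values in HTTP are GMT.
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError(f"unrecognized HTTP date: {value!r}")


def modified_since(header_value: str, last_modified: datetime) -> bool:
    """True if the resource changed after the date in If-Modified-Since:."""
    return last_modified > parse_http_date(header_value)
```

When modified_since returns False, the server can skip sending the resource body.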
HTML form data is usually URL-encoded to package it in a GET or POST submission. In a nutshell, here's how you URL-encode the name-value pairs of the form data:
name1=value1&name2=value2&name3=value3
name=Lucy&neighbors=Fred+%26+Ethel
with a length of 34.
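Python's standard library performs this encoding with urllib.parse.urlencode, which can be used to check the example. The wrapper name encode_form is hypothetical, and the unencoded value "Fred & Ethel" is inferred from the encoded string above.

```python
from urllib.parse import parse_qsl, urlencode


def encode_form(fields: dict) -> str:
    """URL-encode name-value pairs as name1=value1&name2=value2..."""
    # urlencode turns spaces into "+" and other unsafe characters into %xx.
    return urlencode(fields)


encoded = encode_form({"name": "Lucy", "neighbors": "Fred & Ethel"})
print(encoded, len(encoded))

# parse_qsl reverses the encoding, which is what a form-handling script does.
print(parse_qsl(encoded))
```

The length printed here is the value that would go in the Content-Length: header of a POST submission.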