C#: Using HttpWebRequest to interact with websites

HttpWebRequest is a .NET class (System.Net) for making requests to web servers through HTTP (and HTTPS). HttpWebResponse is used to handle the response.

In this article I’m going to show how this pair of classes works and how you can use them to interact with websites. I’m going to focus on retrieving text responses (such as the source code of a web page) and not binary files. I’m also not going to cover WebClient or HttpClient; they may be an article for another time.

At the end of the article I’m sharing a helper class, but more about that later.

Let’s get started.

 

1. Making a GET request

This is the simplest request: retrieving a web page from a URL.


// Requires: using System.Net;
// Holds cookies set by the server so later requests can reuse them
CookieContainer cookieJar = new CookieContainer();

HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.example.com");
webRequest.AllowAutoRedirect = true;
webRequest.CachePolicy = new System.Net.Cache.RequestCachePolicy(System.Net.Cache.RequestCacheLevel.NoCacheNoStore);
webRequest.CookieContainer = cookieJar;
webRequest.UserAgent = "My Thirsty Browser";

I’ve set a few properties that I consider best practice, but they aren’t strictly required. I’ve added the cookie jar so that any cookies set by the web page can be reused or examined; this is especially useful for retaining user sessions.

  • AllowAutoRedirect | Follow redirect headers if the URL has a redirect
  • CachePolicy | Don’t cache or store the request
  • CookieContainer | Store and re-use cookies
  • UserAgent | Some web servers will ignore requests that don’t have a user agent

Some code for generating a more extensive user agent is available in the helper class.

 

2. Get the response

This is the basic response pattern. In this case we’re concerned with getting the status code and the response text (source code).


// Requires: using System.IO; using System.Net;
int statusCode = 0;
string sourceCode = string.Empty;

// The using blocks close the response and the reader even if reading fails
using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
using (StreamReader readContent = new StreamReader(webResponse.GetResponseStream()))
{
  statusCode = (int)webResponse.StatusCode;
  sourceCode = readContent.ReadToEnd();
}

You’ll notice that we’re not handling time-outs or any errors. GetResponse throws a WebException if the server takes too long to reply or returns an error status (such as 404 or 500), so the code above will fall over in either case.
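One way to stop a slow server from stalling the program is to set the request’s time-outs before calling GetResponse. This is a minimal sketch: the class and method names (TimeoutExample, CreateWithTimeout) are my own for illustration, and using the same limit for both properties is just an assumption to keep it short.

```csharp
using System;
using System.Net;

public static class TimeoutExample
{
    // Builds a request that gives up after the supplied number of milliseconds.
    public static HttpWebRequest CreateWithTimeout(string url, int timeoutMs)
    {
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);

        // Timeout applies to GetResponse and GetRequestStream
        // (the documented default is 100,000 ms).
        webRequest.Timeout = timeoutMs;

        // ReadWriteTimeout applies to reads and writes on the streams
        // (the documented default is 300,000 ms).
        webRequest.ReadWriteTimeout = timeoutMs;

        return webRequest;
    }
}
```

Calling TimeoutExample.CreateWithTimeout("http://www.example.com", 5000) would give a request that waits at most five seconds. When the limit is hit, GetResponse throws a WebException whose Status is WebExceptionStatus.Timeout, so it still needs to be caught.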

 

3. Get the response with error handling

This code is more resilient to hard and soft errors. GetResponse throws a WebException when the server returns an error status code. These are soft errors: they aren’t caused by any error in the code but by the status reported by the web server. The code is somewhat long, but in every branch I keep the focus on the status code and the source code (which doubles as the error message).


int statusCode = 0;
string sourceCode = string.Empty;

try
{
  // The using blocks close the response and the reader even if reading fails
  using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
  using (StreamReader readContent = new StreamReader(webResponse.GetResponseStream()))
  {
    statusCode = (int)webResponse.StatusCode;
    sourceCode = readContent.ReadToEnd();
  }
}
catch (WebException xc)
{
  HttpWebResponse rs = xc.Response as HttpWebResponse;
  if (rs != null)
  {
    // Soft error: the server replied with an error status (404, 500 etc.)
    using (rs)
    using (StreamReader readContent = new StreamReader(rs.GetResponseStream()))
    {
      statusCode = (int)rs.StatusCode;
      sourceCode = readContent.ReadToEnd();
    }
  }
  else
  {
    // Hard error: no HTTP response (time-out, DNS failure etc.), so this
    // stores a WebExceptionStatus value rather than an HTTP status code
    statusCode = (int)xc.Status;
    sourceCode = xc.Message;
  }
}
catch (Exception xc)
{
  // Anything else: statusCode stays 0 and the message explains why
  sourceCode = xc.Message;
}
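Because the fallback branch stores xc.Status in statusCode, the same integer can hold either an HTTP status code or a WebExceptionStatus value. The sketch below shows one way a caller might tell them apart. It assumes that HTTP status codes are always 100 or higher and that WebExceptionStatus values are well below that; the class and method names (StatusCodeExample, Describe) are hypothetical.

```csharp
using System;
using System.Net;

public static class StatusCodeExample
{
    // Turns the combined status integer back into a readable description.
    public static string Describe(int statusCode)
    {
        // 0 is the initial value, left untouched by the generic catch block.
        if (statusCode == 0)
        {
            return "No status recorded";
        }

        // HTTP status codes start at 100 (e.g. 200, 404, 500).
        if (statusCode >= 100)
        {
            return "HTTP " + statusCode + " " + (HttpStatusCode)statusCode;
        }

        // Anything smaller came from WebException.Status (e.g. Timeout).
        return "Transport error: " + (WebExceptionStatus)statusCode;
    }
}
```

Passing the statusCode from the code above to StatusCodeExample.Describe gives a short label that can sit alongside the source code or error message in a log.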

 

4. Practical applications

The code above gives you a pattern for retrieving a web page. This is the basis for many practical applications, such as:

  • Checking to see if there is an update for your favourite software
  • Checking to see if your e-commerce store has that item you want back in stock
  • Analysing the on-page SEO for a website
  • Implementing your own API for a website
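As a rough illustration of the stock-check idea, the sketch below looks for a marker phrase in the retrieved source code. The marker text is a made-up assumption (each site words this differently), and the names StockCheckExample and IsInStock are hypothetical.

```csharp
using System;

public static class StockCheckExample
{
    // Returns true when the page loaded successfully and contains the marker text.
    public static bool IsInStock(int statusCode, string sourceCode, string marker)
    {
        // Only trust the page content when the server reported 200 OK.
        if (statusCode != 200 || string.IsNullOrEmpty(sourceCode))
        {
            return false;
        }

        // Case-insensitive search for the phrase the site shows for available items.
        return sourceCode.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With the statusCode and sourceCode from point 3, a call such as StockCheckExample.IsInStock(statusCode, sourceCode, "Add to basket") would report whether that phrase appears on the page.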

Please note: certain websites may restrict or prohibit automated bots (or software) from accessing their resources. Please check the terms and conditions of the respective sites before poking them.

 

5. Making a POST request

You can think of a POST request as submitting a (search or login) form. It’s very similar to point 1 but we’re sending additional data to the server in the request stream.


// Requires: using System.IO; using System.Net; using System.Text;
CookieContainer cookieJar = new CookieContainer();

HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.example.com");
webRequest.AllowAutoRedirect = true;
webRequest.CachePolicy = new System.Net.Cache.RequestCachePolicy(System.Net.Cache.RequestCacheLevel.NoCacheNoStore);
webRequest.CookieContainer = cookieJar;
webRequest.UserAgent = "My Thirsty Browser";
webRequest.Method = "POST";
webRequest.ContentType = "application/x-www-form-urlencoded";

// Encode the form fields as bytes; ContentLength must match the byte count
byte[] buffer = Encoding.UTF8.GetBytes("username=johnsmith&password=1234");

webRequest.ContentLength = buffer.Length;

// Write the form data to the request stream; the using block closes it
using (Stream dataStream = webRequest.GetRequestStream())
{
  dataStream.Write(buffer, 0, buffer.Length);
}

The POST data is sent in the request stream. The code is the same for other HTTP methods that send a body, such as PUT.

  • Method | The HTTP method: POST, PUT, DELETE etc.
  • ContentType | Type of content being sent, we’re sending a form here
  • ContentLength | Length of the data being sent, in bytes; this must be accurate
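The hard-coded form string works for simple values, but a value containing characters such as & or = would corrupt the form data unless it is escaped first. The sketch below builds the body with Uri.EscapeDataString; the class and method names (FormEncodingExample, BuildFormBody) are my own. Note that this method encodes a space as %20 rather than +, which servers generally decode the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormEncodingExample
{
    // Builds an application/x-www-form-urlencoded body from name/value pairs.
    public static byte[] BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var pairs = new List<string>();

        foreach (KeyValuePair<string, string> field in fields)
        {
            // Escape both sides so reserved characters cannot split a field.
            pairs.Add(Uri.EscapeDataString(field.Key) + "=" + Uri.EscapeDataString(field.Value));
        }

        // Join with & and convert to the bytes written to the request stream.
        return Encoding.UTF8.GetBytes(string.Join("&", pairs));
    }
}
```

The returned array can stand in for buffer in the code above, so that ContentLength is still taken from its Length.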

 

6. Get the response

See point 3; the code is identical to the GET response.

 

7. Practical applications

Let’s say you wanted to get a webpage that is only accessible while a user is logged in. By using a shared cookie container, you could submit the login form (using a POST request) and retain the authentication cookies. Any subsequent GET requests you make will be under the user’s session.

This is very useful if a website doesn’t offer an API: you can effectively write your own interface without having access to the back-end.
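The key to this is that every request is given the same CookieContainer instance. Here is a minimal sketch of that wiring; the class and method names (SharedCookieExample, CreateRequest, CookiesFor) are hypothetical, and any login or account URLs you pass in are placeholders for whatever the real site uses.

```csharp
using System;
using System.Net;

public static class SharedCookieExample
{
    // Creates a request that reads from and writes to the supplied cookie jar.
    public static HttpWebRequest CreateRequest(string url, CookieContainer cookieJar)
    {
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);

        // Cookies set by the login response are stored in cookieJar and
        // sent automatically by any later request that shares it.
        webRequest.CookieContainer = cookieJar;

        return webRequest;
    }

    // Returns the Cookie header the jar would send to the given URL,
    // which is handy for checking that a session cookie was retained.
    public static string CookiesFor(string url, CookieContainer cookieJar)
    {
        return cookieJar.GetCookieHeader(new Uri(url));
    }
}
```

In practice you would create one CookieContainer, pass it to CreateRequest for the login POST, read that response, and then pass the same container to CreateRequest for each GET that follows.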

I’m planning at least one article on a practical application of this code. Check back later.

 

8. Helper class

This code is pretty simple but reasonably lengthy. I’ve put together a helper class to let me use HttpWebRequest at a higher level. Examples of its usage are below.

• GET Request-Response

public int GetResponse(out string sourceCode, string url, string referrerURL = "")


HttpWeb webClient = new HttpWeb();

int statusCode = 0;
string sourceCode = string.Empty;

statusCode = webClient.GetResponse(out sourceCode, "http://www.example.com");

• Create Request

public HttpWebRequest CreateRequest(string url)
public HttpWebRequest CreateRequest(string url, string referrerURL)


HttpWeb webClient = new HttpWeb();

HttpWebRequest webRequest = webClient.CreateRequest("http://www.example.com");

• GET Response

public int GetResponse(out string sourceCode, HttpWebRequest webRequest)


HttpWeb webClient = new HttpWeb();

int statusCode = 0;
string sourceCode = string.Empty;

HttpWebRequest webRequest = webClient.CreateRequest("http://www.example.com");
statusCode = webClient.GetResponse(out sourceCode, webRequest);

• Send DATA and GET Response

public int GetPOSTResponse(out string sourceCode, HttpWebRequest webRequest, string postData)


HttpWeb webClient = new HttpWeb();

int statusCode = 0;
string sourceCode = string.Empty;

HttpWebRequest webRequest = webClient.CreateRequest("http://www.example.com");
webRequest.Method = "POST";
webRequest.ContentType = "application/x-www-form-urlencoded";

statusCode = webClient.GetPOSTResponse(out sourceCode, webRequest, "username=johnsmith&password=1234");

• Send Basic Authentication

public static HttpWebRequest AddBasicAuthentication(HttpWebRequest webRequest, string username, string password)


HttpWeb webClient = new HttpWeb();

int statusCode = 0;
string sourceCode = string.Empty;

HttpWebRequest webRequest = webClient.CreateRequest("http://www.example.com");
// AddBasicAuthentication is declared static, so it is called on the class
webRequest = HttpWeb.AddBasicAuthentication(webRequest, "johnsmith", "1234");

statusCode = webClient.GetResponse(out sourceCode, webRequest);
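For context, HTTP Basic authentication sends the username and password as a Base64-encoded "username:password" string in the Authorization header. The sketch below shows how a method with the same signature could do that. It is an illustration under that assumption, not the helper class’s actual code, and the class name BasicAuthExample is my own.

```csharp
using System;
using System.Net;
using System.Text;

public static class BasicAuthExample
{
    // Adds an "Authorization: Basic ..." header to the request and returns it.
    public static HttpWebRequest AddBasicAuthentication(HttpWebRequest webRequest, string username, string password)
    {
        // Basic authentication encodes "username:password" as Base64.
        byte[] raw = Encoding.UTF8.GetBytes(username + ":" + password);
        string credentials = Convert.ToBase64String(raw);

        webRequest.Headers["Authorization"] = "Basic " + credentials;

        return webRequest;
    }
}
```

Base64 is an encoding and not encryption, so credentials sent this way are only protected when the request goes over HTTPS.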

• Download

Source Code

Post Information

Posted on Mon 1st Jan 2018

Modified on Sat 5th Aug 2023