C#: Using WebBrowser to take screenshots of web pages programmatically

Do you need to take screenshots of numerous web pages? Yes? Then this article is for you.

You can take screenshots of web pages programmatically using the WebBrowser class (System.Windows.Forms).

Let’s get straight to the code.

 

The code

The first step is to instantiate the WebBrowser and load a URL. A wait loop then runs in place, processing pending window messages until the page has finished loading. Once it has, a screenshot is taken and the WebBrowser object is disposed of.

I found that a common problem was pages that failed to load or took too long to load all their assets. I added a loading timeout (in seconds) to ensure that the code doesn’t get stuck in the wait loop.


// Namespaces needed at the top of the file:
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Windows.Forms;

// Loads the URL in an off-screen WebBrowser and saves a PNG screenshot.
// The timeout is in seconds. Returns true on success, false on timeout or error.
protected bool takeScreenshot(string url, int width, int height, int timeout = 10)
{
  bool hasError = false;
  DateTime startTime = DateTime.Now;

  // Build a file name from the URL by replacing characters that are not valid in file names.
  string screenshotFilename = url;
  foreach (char c in Path.GetInvalidFileNameChars())
  {
    screenshotFilename = screenshotFilename.Replace(c, '_');
  }

  WebBrowser webBrowser = new WebBrowser();
  webBrowser.Navigate(url);
  webBrowser.Width = width;
  webBrowser.Height = height;
  webBrowser.ScrollBarsEnabled = false;
  webBrowser.AllowNavigation = false;
  webBrowser.ScriptErrorsSuppressed = true;

  // Process pending messages until the page has loaded or the timeout expires.
  while (webBrowser.ReadyState != WebBrowserReadyState.Complete)
  {
    if (DateTime.Now.Subtract(startTime).TotalSeconds > timeout)
    {
      hasError = true;
      break;
    }

    Application.DoEvents();
  }

  // Only take the screenshot if the page loaded in time.
  if (!hasError)
  {
    Bitmap bmp = new Bitmap(width, height, PixelFormat.Format32bppRgb);
    try
    {
      webBrowser.DrawToBitmap(bmp, new Rectangle(0, 0, width, height));
      bmp.Save(string.Concat("C:\\", screenshotFilename, ".png"), ImageFormat.Png);
    }
    catch (Exception)
    {
      hasError = true;
    }
    finally
    {
      bmp.Dispose();
      bmp = null;
    }
  }

  webBrowser.Dispose();
  webBrowser = null;

  return !hasError;
}
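
One thing to watch for when calling this outside a Windows Forms UI thread: WebBrowser wraps an ActiveX control, so it has to be created on a single-threaded apartment (STA) thread. The following is a minimal sketch of one way to arrange that from a console application or worker thread. StaRunner and Run are hypothetical names I made up for illustration; they are not part of the code above or of the .NET Framework.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of the original code): runs a delegate
// on a dedicated STA thread and returns its result to the caller.
public static class StaRunner
{
  public static T Run<T>(Func<T> work)
  {
    T result = default(T);
    Exception error = null;

    Thread thread = new Thread(() =>
    {
      try
      {
        result = work();
      }
      catch (Exception exc)
      {
        // Capture the exception so it can be rethrown on the calling thread.
        error = exc;
      }
    });

    // The apartment state must be set before the thread is started.
    thread.SetApartmentState(ApartmentState.STA);
    thread.Start();
    thread.Join();

    if (error != null)
    {
      throw new InvalidOperationException("Work on the STA thread failed.", error);
    }

    return result;
  }
}
```

Assuming takeScreenshot is accessible from the calling code, a call could then look like StaRunner.Run(() => takeScreenshot("https://example.com", 1920, 3240)), where the URL is only a placeholder. The wait loop inside takeScreenshot keeps processing messages on that thread, so no separate message loop is needed for this sketch.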

 

Throttling

Some websites don’t like continuous requests, so here is a quick way to throttle them. The example below processes three pages, then waits for three seconds before continuing. In practice, webpageList would not be empty.


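// Requires: using System; using System.Threading;
string[] webpageList = new string[0];

int throttleRun = 3;   // number of pages to process before pausing
int throttleWait = 3;  // length of the pause in seconds

int runCounter = throttleRun;

for (int i = 0; i < webpageList.Length; i++)
{
  takeScreenshot(webpageList[i], 1920, 3240);

  // Throttling is disabled if either setting is zero or negative.
  if ((throttleRun > 0) && (throttleWait > 0))
  {
    runCounter--;
    if (runCounter <= 0)
    {
      Thread.Sleep(new TimeSpan(0, 0, throttleWait));

      // Start counting the next batch of pages.
      runCounter = throttleRun;
    }
  }
}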

 

I hope this helps someone.

Post Information

Posted on Wed 3rd Jan 2018

Modified on Sun 13th Mar 2022