How to create a screenshot from a website or html with PhantomJS in Node.js


PhantomJS is a headless WebKit browser that is scriptable through a JavaScript API. It is multiplatform and available on the major operating systems: Windows, Mac OS X, Linux, and other Unices. It has fast and native support for various web standards: DOM handling, CSS selectors, JSON, Canvas, and SVG.

In this article, you will learn how to manipulate the PhantomJS CLI with Node.js using the webshot module.

Requirements

You will need PhantomJS (installed or as a standalone distribution) accessible from the PATH (on Windows, this means adding the folder that contains the executable to the PATH environment variable). If it isn't available in the PATH, you can specify the path to the PhantomJS executable in the configuration later.
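To check whether PhantomJS can be reached from the PATH, a small script can try to run it with the --version flag. The following is only a sketch: the phantomVersion function name and the 5 second timeout are choices made for this example, not part of PhantomJS or webshot.

```javascript
var childProcess = require('child_process');

// Returns the version printed by `<command> --version`, or null when the
// command cannot be started, times out, or exits with an error code.
function phantomVersion(command) {
    var result = childProcess.spawnSync(command || 'phantomjs', ['--version'], {
        encoding: 'utf8',
        timeout: 5000 // arbitrary limit chosen for this example
    });

    if (result.error || result.status !== 0) {
        return null;
    }

    return result.stdout.trim();
}

var version = phantomVersion('phantomjs');

console.log(version
    ? 'PhantomJS ' + version + ' is available in the PATH'
    : 'PhantomJS was not found in the PATH');
```

If the function returns null, you will probably need the phantomPath option described later in this article.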

You can obtain PhantomJS for every platform (Windows, Linux, MacOS etc.) from the download area of the official website.

Note: on most platforms there's no installation process, as you'll get a .zip file with two folders, examples and bin (which contains the PhantomJS executable).

Implementation

PhantomJS is a command line tool (CLI), therefore using it from Node.js would require a child process. However, there's no need to reinvent the wheel: to make this task easier, we'll use a third party module, node-webshot. Node Webshot provides a simple API for taking webpage screenshots. The module is a light wrapper around PhantomJS, which utilizes WebKit to perform the page rendering.
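For comparison, this is roughly what the manual child process approach could look like without webshot. It is a minimal sketch that assumes the rasterize.js script from the examples folder of the PhantomJS distribution, which takes a URL and an output filename as arguments. The buildRasterizeArgs and rasterize functions are hypothetical names used only for this illustration.

```javascript
var childProcess = require('child_process');

// Builds the argument list for: phantomjs rasterize.js <url> <output>
function buildRasterizeArgs(scriptPath, url, outputFile) {
    return [scriptPath, url, outputFile];
}

// Runs PhantomJS as a child process and reports the result through callback(err)
function rasterize(phantomPath, scriptPath, url, outputFile, callback) {
    var args = buildRasterizeArgs(scriptPath, url, outputFile);

    childProcess.execFile(phantomPath, args, function (err) {
        callback(err || null);
    });
}

// Example call (the paths are placeholders, adjust them to your setup):
// rasterize('phantomjs', 'examples/rasterize.js', 'http://example.com', 'example.png', function (err) {
//     console.log(err ? 'Failed: ' + err.message : 'Saved example.png');
// });
```

The webshot module takes care of this plumbing (and of many rendering options), which is why the rest of the article relies on it.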

To install this module in your project, execute the following command in your terminal:

npm install webshot

Note: the webshot module includes a prebuilt copy of PhantomJS as a dependency, located (on Windows) in your-project/node_modules/phantomjs-prebuilt/lib/phantom/bin/phantomjs.exe. It is used automatically when no phantomPath is provided, therefore webshot should work without any extra configuration.

Add the --save parameter to the install command if you want to store the module as a dependency of your project. After the installation, you'll be able to load the module using require('webshot').

As mentioned previously, PhantomJS needs to be accessible from the command line. If it isn't, specify the full path to the executable by providing the phantomPath option:

var webshot = require('webshot');

var options = {
    phantomPath: "C:\\Users\\sdkca\\Desktop\\phantomjs-2.1.1-windows\\bin\\phantomjs.exe"
};

// Use webshot here with the options object as third parameter
// Example :
webshot('google.com', 'google.png', options, (err) => {
    // screenshot now saved to google.png
});

Webshot tries to use the binary provided by the phantomjs NPM module, and falls back to the phantomPath if the module isn't available.
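Hardcoding the path makes the script hard to move between machines. One possible approach, sketched below, is to set phantomPath only when a configured path really exists and otherwise leave the lookup to webshot. Note that PHANTOMJS_PATH is a hypothetical environment variable invented for this example, and withPhantomPath is just an illustrative helper.

```javascript
var fs = require('fs');

// Returns a copy of the options that includes phantomPath only when the
// candidate path exists on disk; otherwise the options are left untouched
// so that webshot falls back to its default lookup.
function withPhantomPath(options, candidatePath) {
    var result = Object.assign({}, options);

    if (candidatePath && fs.existsSync(candidatePath)) {
        result.phantomPath = candidatePath;
    }

    return result;
}

// PHANTOMJS_PATH is an assumption of this example, not something webshot reads itself
var options = withPhantomPath({}, process.env.PHANTOMJS_PATH);

console.log(options);
```

The resulting object can be passed to webshot as the third parameter, as in the example above.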

Create screenshot from website

You can create a screenshot from any website. Just provide the website URL as the first parameter and the output file as the second parameter:

var webshot = require('webshot');

webshot('ourcodeworld.com', 'ourcodeworld-image.png', (err) => {
    if(err){
        return console.log("An error occurred ", err);
    }
    // screenshot now saved to ourcodeworld-image.png
});
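If you prefer promises over callbacks, the call above can be wrapped in a Promise. This is a sketch and not part of the webshot API: screenshotAsync is a hypothetical helper that receives the screenshot function as its first argument, so it works with any function that follows the (site, file, options, callback) convention.

```javascript
// Wraps a callback-style screenshot function (such as webshot) in a Promise
// that resolves with the output filename or rejects with the error.
function screenshotAsync(shoot, site, outputFile, options) {
    return new Promise(function (resolve, reject) {
        shoot(site, outputFile, options || {}, function (err) {
            if (err) {
                return reject(err);
            }

            resolve(outputFile);
        });
    });
}

// Usage with webshot (requires the module to be installed):
// var webshot = require('webshot');
//
// screenshotAsync(webshot, 'ourcodeworld.com', 'ourcodeworld-image.png')
//     .then(function (file) { console.log('Saved ' + file); })
//     .catch(function (err) { console.log('An error occurred', err); });
```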

Create screenshot from html file or plain html string

You can create a screenshot from an HTML string. Just provide the markup as a string in the first parameter, the output filename as the second parameter, and specify in the options that you're using plain HTML:

var webshot = require('webshot');

var options = {
    siteType:'html'
};

webshot('<html><body>Hello World</body></html>', 'hello_world.png', options, (err) => {
  // screenshot now saved to hello_world.png
});
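In practice the markup is often generated from data instead of being written by hand. The sketch below builds such a string and escapes the dynamic values. Both escapeHtml and buildPage are hypothetical helpers written for this example.

```javascript
// Escapes characters that have a special meaning in HTML
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Builds a complete HTML document around a title and a message
function buildPage(title, message) {
    return '<html><body>' +
        '<h1>' + escapeHtml(title) + '</h1>' +
        '<p>' + escapeHtml(message) + '</p>' +
        '</body></html>';
}

var markup = buildPage('Report', 'Sales < 100 & rising');

console.log(markup);
```

The generated string can then be passed to webshot as the first parameter, together with the siteType option set to html.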

You can create a screenshot from an HTML file by setting the siteType to file and providing the absolute path to the file as the first parameter of the webshot function:

var webshot = require("webshot");
var path = require("path");

var options = {
    siteType: "file"
};

// Build the absolute path of index.html, located next to this script
var file = path.join(__dirname, "index.html");

webshot(file, "ourcodeworld-file.png", options, (err) => {
    if(err){
        return console.log(err);
    }

    console.log("Image successfully created");
});

Alternatively, you can read the content of the file using the filesystem module and set the siteType to html:

var webshot = require('webshot');
var fs = require("fs");

var options = {
    siteType:'html'
};

webshot(fs.readFileSync("index.html", "UTF-8"), 'hello_world.png', options, (err) => {
  // screenshot now saved to hello_world.png
});

You can set more options in the object, see the available options in the docs of the webshot repository.

Change format of the screenshot

The generated screenshot can be saved as png, jpg or jpeg. To change the output format, set the streamType option to a string with the format (note that the output filename needs to have the same extension):

var webshot = require("webshot");

var options = {
    streamType: "jpeg"
};

webshot("ourcodeworld.com", "ourcodeworld-file.jpeg", options, (err) => {
    if(err){
        return console.log(err);
    }

    console.log("Image successfully created");
});
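Because the extension and the streamType have to match, one option is to derive the streamType from the output filename. The following is a sketch: optionsForFile is a hypothetical helper, and the list of formats is taken from the values mentioned above.

```javascript
var path = require('path');

// Formats mentioned in this article for the streamType option
var SUPPORTED_TYPES = ['png', 'jpg', 'jpeg'];

// Returns a copy of the options whose streamType matches the file extension
function optionsForFile(outputFile, options) {
    var extension = path.extname(outputFile).slice(1).toLowerCase();

    if (SUPPORTED_TYPES.indexOf(extension) === -1) {
        throw new Error('Unsupported screenshot format: ' + (extension || '(none)'));
    }

    return Object.assign({}, options, { streamType: extension });
}

console.log(optionsForFile('ourcodeworld-file.jpeg'));
```

The returned object can be passed to webshot together with the same filename, so the two values cannot drift apart.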

Webshot options

In the same way you do with the CLI of PhantomJS, you can set options dynamically within an object for the webshot module (and for PhantomJS). The following table shows all the available options for webshot and for PhantomJS:

Option Default Value Description
windowSize { width: 1024, height: 768 } The dimensions of the browser window. screenSize is an alias for this property.
shotSize { width: 'window', height: 'window' } The area of the page document, starting at the upper left corner, to render. Possible values are 'window', 'all', and a number defining a pixel length. 'window' causes the length to be set to the length of the window (i.e. the shot displays what is initially visible within the browser window). 'all' causes the length to be set to the length of the document along the given dimension.
shotOffset { left: 0, right: 0, top: 0, bottom: 0 } The left and top offsets define the upper left corner of the screenshot rectangle. The right and bottom offsets allow pixels to be removed from the shotSize dimensions (e.g. a shotSize height of 'all' with a bottom offset of 30 would cause all but the last 30 rows of pixels on the site to be rendered).
phantomPath 'phantomjs' The location of phantomjs. Webshot tries to use the binary provided by the phantomjs NPM module, and falls back to 'phantomjs' if the module isn't available.
phantomConfig {} Object with key value pairs corresponding to phantomjs command line options. Don't include `--`. For example: `phantomConfig: {'ignore-ssl-errors': 'true'}`
cookies [] List of cookie objects to use, or null to disable cookies.
customHeaders null Any additional headers to be sent in the HTTP request.
defaultWhiteBackground false When taking the screenshot, adds a white background if not defined elsewhere.
customCSS '' When taking the screenshot, adds custom CSS rules if defined.
quality 75 JPEG compression quality. A higher number will look better, but creates a larger file. Quality setting has no effect when streaming.
streamType 'png' If streaming is used, this designates the file format of the streamed rendering. Possible values are 'png', 'jpg', and 'jpeg'.
siteType 'url' siteType indicates whether the content needs to be requested ('url'), loaded locally ('file'), or is being provided directly as a string ('html').
renderDelay 0 Number of milliseconds to wait after a page loads before taking the screenshot.
timeout 0 Number of milliseconds to wait before killing the phantomjs process and assuming webshotting has failed. (0 is no timeout.)
takeShotOnCallback false Wait for the web page to signal to webshot when to take the photo using window.callPhantom('takeShot');
errorIfStatusIsNot200 false If the loaded page has a non-200 status code, don't take a screenshot, cause an error instead.
errorIfJSException false If a script on the page throws an exception, don't take a screenshot, cause an error instead.
captureSelector false Captures the page area containing the provided selector and saves it to file.
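To see how several of these options combine, here is a sketch of an options object for a full-page screenshot. The option names come from the table above, while the concrete values (sizes, delays, timeout) are arbitrary choices for this example, and withOverrides is a hypothetical helper.

```javascript
// Shared options for a full-page screenshot (values are examples only)
var fullPageOptions = {
    windowSize: { width: 1280, height: 800 },
    // Render the window width, but the whole document height
    shotSize: { width: 'window', height: 'all' },
    // Leave out the last 30 rows of pixels at the bottom of the page
    shotOffset: { left: 0, right: 0, top: 0, bottom: 30 },
    defaultWhiteBackground: true,
    renderDelay: 500,   // wait 500 ms after the page loads
    timeout: 30000,     // give up after 30 seconds
    errorIfStatusIsNot200: true,
    phantomConfig: { 'ignore-ssl-errors': 'true' }
};

// Merges per-call overrides on top of the shared options (shallow copy)
function withOverrides(overrides) {
    return Object.assign({}, fullPageOptions, overrides);
}

console.log(withOverrides({ renderDelay: 0 }));
```

Keeping one shared object and overriding single values per call avoids repeating the whole configuration for every screenshot.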

PhantomJS callbacks

Arbitrary scripts can be run on the page before it gets rendered by using any of Phantom's page callbacks, such as onLoadFinished or onResourceRequested. For example, the script below changes the text of every link on the page:

var webshot = require('webshot');

var options = {
    onLoadFinished: function() {
        var links = document.getElementsByTagName('a');

        for (var i=0; i<links.length; i++) {
            var link = links[i];
            link.innerHTML = 'My custom text';
        } 
    }
};


webshot('google.com', 'google.png', options, (err) => {
  // screenshot now saved to google.png
});

Note that the script will be serialized and then passed to Phantom as text, so all variable scope information will be lost.
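The effect of losing the scope can be reproduced in plain Node.js. The sketch below mimics the serialization with Function.prototype.toString and rebuilds the function from its text. This is only an illustration of the principle: the exact mechanism used by webshot may differ, and roundTrip and demo are hypothetical names.

```javascript
// Turns a function into source text and rebuilds it in a fresh scope,
// similar to what happens when the text is evaluated in another process.
function roundTrip(fn) {
    var source = fn.toString();

    return new Function('return (' + source + ');')();
}

function demo() {
    var customText = 'My custom text';

    // This callback depends on a variable from the enclosing scope
    var original = function () { return customText; };
    var rebuilt = roundTrip(original);

    var result = { original: original() };

    try {
        result.rebuilt = rebuilt();
    } catch (err) {
        // The rebuilt function can no longer see customText
        result.rebuiltError = err.name;
    }

    return result;
}

console.log(demo());
```

For this reason, values needed by a callback such as onLoadFinished should be written directly inside the function body instead of being read from outer variables.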

Happy coding!
