Javascript Snippets

Using Javascript, you can clean up your scraped results and do more sophisticated data extraction than is possible with Recipe Creator selectors and XPath alone. Data Miner passes the scraped data to a Javascript function that you provide. You can then modify the data and pass it back to Data Miner to be saved in your data collection.

With custom Javascript you can:

  • Extract Email addresses from text
  • Remove unwanted text from scraped data
  • Change currency type or units
  • Separate or join column data
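
As an illustration of the first item above, here is a minimal sketch of a cleanup function that keeps only the first email address found in column 0. It assumes each row has a "values" array, as in the examples on this page. The helper name extractEmail and the simplified email pattern are illustrative assumptions, not part of Data Miner, and plain forEach is used so the sketch runs without jQuery.

```javascript
// Hypothetical helper: returns the first email-like substring in text, or "".
// The pattern is deliberately simple and will not cover every valid address.
function extractEmail(text) {
    var match = String(text).match(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/);
    return match ? match[0] : "";
}

var cleanup = function(results) {
    // Replace column 0 of every row with the email address found in it.
    results.forEach(function(row) {
        row.values[0] = extractEmail(row.values[0]);
    });
    return results; // return modified results
};
```

Rows whose column 0 contains no email-like text end up with an empty string in that column.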

A) Clean up scraped data with Javascript:

Example of a data cleanup script:

        var cleanup = function(results) {
          // loop through each row of results and change each column

          //debugger;

          $.each(results, function(){
            this.values[0] = "xxxx -" + this.values[0];
            this.values[1] = this.values[1] + "- yyyyy";
          });

          return results; // return modified results
        };
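
A cleanup function can also convert values, such as changing the currency of a price column. The following is a minimal sketch under stated assumptions: prices are in column 1 as text like "$1,234.50", and USD_TO_EUR is a placeholder rate chosen only for illustration. The helper parsePrice is hypothetical and not part of Data Miner.

```javascript
var USD_TO_EUR = 0.5; // placeholder rate for illustration only, not a real exchange rate

// Hypothetical helper: strips currency symbols and thousands separators.
// Returns a number, or null when no number can be read from the text.
function parsePrice(text) {
    var cleaned = String(text).replace(/[^0-9.\-]/g, "");
    var n = parseFloat(cleaned);
    return isNaN(n) ? null : n;
}

var cleanup = function(results) {
    results.forEach(function(row) {
        var usd = parsePrice(row.values[1]);
        if (usd !== null) {
            row.values[1] = (usd * USD_TO_EUR).toFixed(2); // converted price as text
        }
    });
    return results; // return modified results
};
```

Values that cannot be parsed as a number are left unchanged, so a stray "N/A" does not break the run.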
                        

B) Click on elements before scraping

You can provide your own Javascript function that Data Miner will run before it scrapes the data. Pre-scrape and post-scrape snippets let you run any work you need before or after scraping is performed.

Examples of how Pre and Post snippets can help you:

  • With Pre-Scrape, you can wait for an element to be present on the page before starting the scrape process.
  • With Pre-Scrape, you can fill a form and submit it before scraping the page.
  • With Pre-Scrape, you can click on an item on the page or do AJAX calls.
  • With Post-Scrape, you can clean up your data or click on a button.

C) Filling forms with Javascript:

Scrape Search Page
        var workflow = {
            paginationType: "ajax",

            fillForm: function(context, resolve) {
                console.log("starting POST hook");

                if (!context.inputData)
                    context.inputData = {
                        name: "pizza",
                    };

                return [{
                    type: "text",
                    selector: "input[name$='find_desc']",
                    value: context.inputData.name,
                    waitAfter: 1
                }, {
                    type: "button",
                    selector: ".main-search_submit",
                    done: function() {
                        resolve();
                    }
                }];
            }
        };             

With Data Miner you can fill forms automatically by uploading a CSV that contains the form's URL and your values into your Collections and then using a form filling recipe. To create a form filling recipe, include the Javascript snippet and update the selectors to match the attributes on your site. In addition, make sure the CSV column titles exactly match the Javascript key names (for example, the first name Allen has a key of "first").

Once the recipe is complete, run a job with the CSV as the source collection and your new form filling recipe as the recipe. The job will visit the form, inject the values, and then scrape the results, accumulating the data in the Data Collections output file.
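
To make the matching between CSV column titles and Javascript keys concrete, here is a minimal sketch that assumes a CSV with columns titled "first" and "last". The step fields (type, selector, value, waitAfter, done) follow the example above. The selectors and the fallback sample values are hypothetical and must be replaced with ones that fit your site.

```javascript
var workflow = {
    paginationType: "ajax",

    fillForm: function(context, resolve) {
        // Fallback values for trying the recipe without a CSV (assumed sample data).
        if (!context.inputData)
            context.inputData = { first: "Allen", last: "Smith" };

        return [{
            type: "text",
            selector: "input[name='first_name']", // hypothetical selector
            value: context.inputData.first,       // CSV column titled "first"
            waitAfter: 1
        }, {
            type: "text",
            selector: "input[name='last_name']",  // hypothetical selector
            value: context.inputData.last,        // CSV column titled "last"
            waitAfter: 1
        }, {
            type: "button",
            selector: "button[type='submit']",    // hypothetical selector
            done: function() {
                resolve(); // tell Data Miner the form has been submitted
            }
        }];
    }
};
```

If a column were titled "First" or "first name" instead of "first", context.inputData.first would be undefined and the field would not receive the expected value.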




See even more examples of Javascript Snippets below:



        /* --------------------------------------------------------------------------------------------------------------------

        Here is an example of a pre-scrape hook. In this example an element is found using the jQuery selector ".tsd_name > a".
        Then the element is clicked. Then we wait for 2 seconds for the page to change and then we tell Data Miner to continue
        to scrape the page.

        */

        var workflow = {

            "preScrape": function(request, callBack) {
                console.log("starting Pre-scrape hook");

                var $el = $(".tsd_name > a"); // Element to click on.
                var waitTime = 2; // Wait for n seconds and then continue to scrape the page

                if ($el.length > 0) {
                    $el[0].click()
                }

                setTimeout(function() {
                    callBack();
                }, waitTime * 1000);
            }
        }


        /* --------------------------------------------------------------------------------------------------------------------

        Here is another example of a pre-scrape hook. In this example the pre-scrape hook waits up to roughly 5 seconds for
        the element specified by the jQuery selector #footer to appear on the page. In a loop we test for the presence of #footer;
        if it is not present we wait for 1 second and repeat the loop. Once the element is found, or the time limit is reached,
        we call callBack, which transfers control back to Data Miner to continue scraping.

        */

        var workflow = {

            /* --------------------------------------------------------------
            preScrape function:
                Will be executed before any scraping is done. Must callBack to give the execution control back to Data Miner

            Input:
                request: Context for the request. URL, scraping, parameter etc.
                callBack: callback function to be called when all the pre-scraping work is done.
            Return:
                nothing

            ------------------------------------------------------------- */
            "preScrape": function(request, callBack) {
                console.log("starting Pre-scrape hook");
                //debugger;

                var condition = "#footer"; // Wait for presence of this element before scraping
                var loopCounterMax = 5; // Maximum number of seconds to wait before giving up
                var loopCounter = 0; // Number of one-second waits so far

                var wait = function() {
                    var $test = $(condition);

                    if ($test.length > 0 || loopCounter > loopCounterMax) {
                        if (callBack)
                            callBack();    // Must be called at the end when all the PreScrape work is done

                    } else {
                        loopCounter++;
                        setTimeout(wait, 1000);
                    }
                };

                wait();
            },
        }


        /* --------------------------------------------------------------------------------------------------------------------

        Here is an example of a post-scrape hook. In this example you are given the data that was scraped from the page in
        the form of an array. You can then modify the results and return the array to Data Miner.

        */

        var workflow = {

            /* --------------------------------------------------------------
            postScrape function:
                Will be executed after the scraping is finished. You will get the scraped results and can
                clean up or modify them

            Input:
                results: Scraped data array
            Return:
                results: Modified data array

            ------------------------------------------------------------- */
            "postScrape": function(results) {
                console.log("starting Post-scrape hook");

              // loop through each row of results and change each column

              //debugger;

              $.each(results, function(){
                this.values[0] = "xxxx -" + this.values[0];
                this.values[1] = this.values[1] + "- yyyyy";
              });

              return results; // return modified results

            }
        };


        /* --------------------------------------------------------------------------------------------------------------------

        Here is an example of a scrape hook. You can replace the scrape functionality of Data Miner by providing your own
        scrape function, which will be called instead of the scrape function of Data Miner.

        */

        var workflow = {

            /* --------------------------------------------------------------
            scrape function:
                Will be executed instead of the default [originalScrape] scrape function of Data Miner.
                The XPaths in the Data Miner UI will be ignored. However, the number of columns of data returned
                must match the number of columns specified in the UI.

            Input:
                request: Context for the request. URL, scraping, parameter etc.
                originalScrape: the default scrape function of Data Miner.
                callBack: callback function to return the results.
            Return:
                nothing; the results array is passed to callBack

            ------------------------------------------------------------- */
            "scrape": function(request, originalScrape, callBack) {
                console.log("starting scrape hook");

                var results = [];
                results.push({
                    "values": [
                        "1234", "1234"
                    ]
                });

                callBack(results); // hand the rows back to Data Miner
            }
        };

        /* --------------------------------------------------------------
        For splitting names (splits at the first space)
        Use cleanup. Assumes columns 2 and 3 both hold the same full name.
        ------------------------------------------------------------- */
        var cleanup = function(results) {
            //debugger;
            $.each(results, function() {
                var x = this.values[2].indexOf(" ");
                if (x < 0) return; // no space found: leave this row unchanged
                this.values[2] = this.values[2].substring(0, x); // part before the space
                this.values[3] = this.values[3].substring(x + 1); // part after the space
            });
            return results; // return modified results
        };

        /* --------------------------------------------------------------
        Split names that are in "Last, First" format at the comma and space; also removes the comma.
        Assumes columns 1 and 2 both hold the same full name.
        --------------------------------------------------------------*/
        var cleanup = function(results) {
            //debugger;
            $.each(results, function() {
                var full = this.values[1];
                var x = full.indexOf(" ");
                var y = full.indexOf(",");
                if (x < 0 || y < 0) return; // not in "Last, First" form: leave this row unchanged
                this.values[1] = full.substring(x + 1); // first name
                this.values[2] = this.values[2].substring(0, y); // last name, without the comma
            });
            return results; // return modified results
        };
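
The same "Last, First" split can also be written without jQuery or index arithmetic. The following is a minimal sketch, assuming columns 1 and 2 both hold the full name as in the snippet above. The helper name splitLastFirst is hypothetical and not part of Data Miner.

```javascript
// Hypothetical helper: splits "Last, First" into its two parts.
// Returns null when the text has no comma, so callers can skip the row.
function splitLastFirst(fullName) {
    var parts = String(fullName).split(",");
    if (parts.length < 2) return null;
    return {
        last: parts[0].trim(),
        first: parts.slice(1).join(",").trim()
    };
}

var cleanup = function(results) {
    results.forEach(function(row) {
        var name = splitLastFirst(row.values[1]);
        if (name) {
            row.values[1] = name.first; // column 1 becomes the first name
            row.values[2] = name.last;  // column 2 becomes the last name
        }
    });
    return results; // return modified results
};
```

Splitting on the comma rather than on the first space keeps multi-word last names such as "van der Berg, Anna" intact.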

         /* --------------------------------------------------------------
        Replace any character that is not a letter, digit, parenthesis, or underscore with "-"
        --------------------------------------------------------------*/
            var cleanup = function(results) {
                //debugger;
                $.each(results, function() {
                    this.values[1] = this.values[1].replace(/[^a-z0-9()_]/gi, '-');
                });
                return results; // return modified results
            };
         /* --------------------------------------------------------------
        Click a Button
        --------------------------------------------------------------*/
        var workflow = {
            "preScrape": function(request, callBack) {
                console.log("starting Pre-scrape hook");
                var condition = "a[class~='xxxx']";
                var $test = $(condition);
                if ($test.length > 0) {
                    $test[0].click();
                    var wait = function() {
                        callBack();
                    };
                    setTimeout(wait, 3000);
                } else callBack();
            }
        };

        /* --------------------------------------------------------------
        Button click and close
        --------------------------------------------------------------*/
        var workflow = {
            "preScrape": function(request, callBack) {
                console.log("starting Pre-scrape hook");
                //debugger;
                var condition = "button[data-lira-action~='edit-contact-info']"; // Element to click before scraping
                //debugger
                var $test = $(condition);
                if ($test.length > 0) {
                    $test[0].click();
                    var wait = function() {
                        callBack();
                    };
                    setTimeout(wait, 3000);
                } else callBack();
            },
            "postScrape": function(results) {
                console.log("starting Post-scrape hook");
                var $close = $(".dialog-close");
                if ($close.length > 0) {
                    $close[0].click();
                }
                return results;
            }
        };

        /* --------------------------------------------------------------
        Filter Data Miner results
         --------------------------------------------------------------*/
        var workflow = {
            "postScrape": function(results) {
                console.log("starting Post-scrape hook");
                // loop through each row of results and change each column
                //debugger;

                var results2 = [];
                $.each(results, function() {
                    if (this.values[2] !== "banana")   // filter column 2 values and exclude "banana"
                        results2.push(this);
                });
                return results2; // return modified results
            }
        };


        /* --------------------------------------------------------------
        Auto Scrolling with an interval and a max(twitter)
         --------------------------------------------------------------*/
        var workflow = {
            "preScrape": function(request, callBack) {
                console.log("starting Pre-scrape hook");
                //debugger;
                var waitTime = 3000; // milliseconds
                var maxLoopCount = 50;
                var count = 0;
                var loopCount = 0;

                function loop() {
                    loopCount++;
                    if ($("li[class~='stream-item']").length !== count && loopCount < maxLoopCount) {
                        window.scrollTo(0, document.body.scrollHeight);
                        count = $("li[class~='stream-item']").length;
                    } else {
                        clearInterval(tid); // stop polling so callBack is only called once
                        if (callBack) callBack();
                    }
                }
                var tid = setInterval(loop, waitTime);
            }
        };


        /* --------------------------------------------------------------
        Isolate data by Index
         --------------------------------------------------------------*/
        var cleanup = function(results) {
            //debugger;
            $.each(results, function() {
                this.values[3] = this.values[3].substring(0, 13);
                this.values[4] = this.values[4].substring(14, 30);
            });
            return results; // return modified results
        };


        /* --------------------------------------------------------------
        Using form filling with drop down menus. The following javascript will click to open a form,
        click a drop down menu, select an item from within the list and then click submit.

        A CSV with a column titled "location", and then data below it from 0 through the number of elements in the drop down menu,
        can be injected into a selector. This lets you select different items in a drop down and search them when injecting
        basic text doesn't work.
         --------------------------------------------------------------*/
        var workflow = {
            paginationType: "ajax",

            fillForm: function(context, resolve) {
                console.log("starting POST hook");

                if (!context.inputData)
                    context.inputData = {
                        location: "0", //starting from 0, the location is where the item lives within the list in the drop down.

                    };

                return [{
                    type: "button",
                    selector: "a[class~='XXXX']", //open button selector
                    waitAfter: 2
                },{
                    type: "button",
                    selector: "*[class~='XXXX']", //form button selector
                    waitAfter: 2
                },{
                    type: "button",
                    // inputData.location is the number defined above or injected from the CSV;
                    // it is then added to the drop down item selector.
                    selector: "*[id~='XXXX" + context.inputData.location + "']",
                    waitAfter: 2
                },{
                    type: "button",
                    selector: "button[name~='skipandexplore']", //submit button selector
                    done: function() {
                        resolve();
                    }
                }];
            }
        };

                    

Get the 1.0 version of Data Miner


Note: We recommend that you run the Production version and the 1.0 version side by side. If you find a blocking issue in either version, let us know. Each version runs independently, and they don't interfere with each other.

Download Data Miner 1.0 version.
Updated: 5/2/2018 by Ben