Give it a try with the steps below or continue reading to learn more.
Public recipes are a great way to learn the tool and get instant results. They are created and maintained by our users. Recipes whose names start with an asterisk (*) are the most popular and tend to be the most successful.
To use a public recipe, you must first find the data you want to scrape. Data Miner can only extract data when it's visible in your browser. Next, open up the Data Miner extension and find the recipe that best describes your data. Then click "Run". Once the recipe runs, click download, pick your format and open it in Excel. If there aren't any recipes under the public tab, this means no other users have created any yet and you will have to create your own.
You should always find a few generic recipes on any page. These are designed to extract simple items and should work for any site. Below is the pop up where you'll find the public recipes.
If you're trying to scrape a page that doesn't have any public recipes or they don't capture the data you're after, you can always easily create your own recipe. Our recipe creator has a smart find tool that allows you to extract data without having to write any code.
The video above will walk you through the basic process of creating a recipe for a list page. When creating a recipe, you must first determine whether the page is a list page or a detail page. List pages have rows and look like search results pages. Detail pages do not have rows and look like profile or product pages. Once you have determined the type of page, you can begin selecting your data. Data is captured using selectors. These are pieces of HTML taken straight from the page that tell Data Miner where to look. Below are a few selectors and tips to get you started.
| Selector type | Example | Notes |
|---|---|---|
| id | #name | IDs are the most specific type of selector and in most cases will select only one item. Good for detail pages, bad for list pages. |
| class | .name | Classes are the best type of selector. They are general enough to capture your element on a detail page or a list page, but may select too many items from time to time. |
| HTML element | div | These are HTML tags used to structure the page. They are used frequently and typically return too many results. Not suggested for use. |
| Link | <a> | These are HTML tags used for links. Keep an eye out for them when capturing URLs and adding the next page element. |
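To make the trade-offs between the selector types concrete, here is a minimal Python sketch that counts how many elements each kind of selector would match on a small page. The sample HTML, the `SimpleSelectorCounter` class, and the `count_matches` helper are all hypothetical and written only for illustration. This is not how Data Miner evaluates selectors internally, and it handles only a single `#id`, `.class`, or tag name.

```python
from html.parser import HTMLParser

# Hypothetical list page: one container with three result rows.
SAMPLE_HTML = """
<div id="results">
  <div class="row"><a class="name" href="/p/1">Ada</a></div>
  <div class="row"><a class="name" href="/p/2">Grace</a></div>
  <div class="row"><a class="name" href="/p/3">Alan</a></div>
</div>
"""


class SimpleSelectorCounter(HTMLParser):
    """Counts start tags matching one simple selector: #id, .class, or tag."""

    def __init__(self, selector):
        super().__init__()
        self.selector = selector
        self.count = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if self.selector.startswith("#"):
            matched = attr.get("id") == self.selector[1:]
        elif self.selector.startswith("."):
            matched = self.selector[1:] in (attr.get("class") or "").split()
        else:
            matched = tag == self.selector
        if matched:
            self.count += 1


def count_matches(selector, html=SAMPLE_HTML):
    """Return how many elements in html match the simple selector."""
    parser = SimpleSelectorCounter(selector)
    parser.feed(html)
    return parser.count


for selector in ("#results", ".name", "div", "a"):
    print(selector, count_matches(selector))
```

On this sample the id matches a single element, the class matches one element per row, and the bare `div` tag also picks up the outer container, which is why tag selectors tend to capture more than you want.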
A pink dotted line represents the current element Data Miner is looking at
A solid green line represents the containers/rows for your data
A solid purple line represents the data
A solid blue line represents the next page element that will be clicked
If you are having trouble finding a good selector, you can try the "Select Parent" button. This button shifts the focus outward to the parent HTML element and provides additional selectors. Please note: the further out you move the focus, the more data could be captured. If you capture too much, you will have to refine the selector. We will cover this more in the advanced recipe writing section.
If your selector is good but captures too many elements, you can try the Sibling Element arrows. These arrows turn the matched elements into an ordered list, so you can pick which highlighted element you want by clicking the arrows up or down. Clicking once gets you the first element, clicking twice the second element. Click the arrows until your desired data is highlighted. The list starts at zero, and the first few clicks might not highlight data that is in view, so you might not see movement right away. Please note: the numbering of elements is specific to the page you built the recipe on, so only use the sibling arrows when you are confident the elements won't change from page to page. Otherwise, if one piece of data is missing, the order changes and your data could shift into the wrong position.
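The shifting problem described above can be shown with a small Python sketch. The page contents, the `PHONE_INDEX` constant, and the `pick_by_position` helper are hypothetical. The sketch only mimics the idea of choosing an element by its zero-based position and is not Data Miner's own logic.

```python
def pick_by_position(values, index):
    """Return the value at a zero-based position, or None when out of range."""
    return values[index] if 0 <= index < len(values) else None


# Hypothetical detail pages. The recipe was built on page_a, where the
# phone number is the element at position 2.
page_a = ["Ada Lovelace", "ada@example.com", "555-0100"]
page_b = ["Grace Hopper", "555-0101"]  # this page has no email address

PHONE_INDEX = 2
EMAIL_INDEX = 1

phone_a = pick_by_position(page_a, PHONE_INDEX)  # the phone number, as intended
phone_b = pick_by_position(page_b, PHONE_INDEX)  # nothing at position 2
email_b = pick_by_position(page_b, EMAIL_INDEX)  # the phone number, in the wrong column

print(phone_a, phone_b, email_b)
```

Because the second page is missing one element, every later element moves up by one position, so a position-based pick returns either nothing or the wrong field.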
When building the next page feature in the Nav tab, you will be on the lookout for the <a> tag. If your class is outside of the <a> tag, you can simply add an a to your selectors. For example:
If the above does not work, another selector that works for many occasions is:
You can also replace the "next" with an arrow if that is what you see on your page.
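The two ways of locating a next page link described above can be sketched in Python: find an `<a>` nested inside an element that carries a class, or find an `<a>` by its visible text. Everything here is an assumption made for illustration, including the pager markup, the class name `next`, and the helpers `link_inside_class` and `link_with_text`. It is not Data Miner's selector syntax, and the simple parser assumes well-formed HTML without unclosed tags.

```python
from html.parser import HTMLParser

# Hypothetical pager: the "next" class sits on the <li>, not on the <a>.
PAGER_HTML = """
<ul class="pager">
  <li class="prev"><a href="/results?page=1">Previous</a></li>
  <li class="next"><a href="/results?page=3">Next »</a></li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Records each <a> with its href, text, and the classes of its ancestors."""

    def __init__(self):
        super().__init__()
        self.open_classes = []  # one list of class names per open element
        self.links = []
        self.current = None     # the <a> currently being read, if any

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a":
            ancestors = {c for classes in self.open_classes for c in classes}
            self.current = {"href": attr.get("href"), "text": "", "ancestors": ancestors}
        self.open_classes.append((attr.get("class") or "").split())

    def handle_data(self, data):
        if self.current is not None:
            self.current["text"] += data

    def handle_endtag(self, tag):
        if self.open_classes:
            self.open_classes.pop()
        if tag == "a" and self.current is not None:
            self.links.append(self.current)
            self.current = None


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def link_inside_class(html, class_name):
    """First <a> nested inside an element with class_name (class outside the link)."""
    for link in collect_links(html):
        if class_name in link["ancestors"]:
            return link["href"]
    return None


def link_with_text(html, needle):
    """First <a> whose visible text contains needle, ignoring case."""
    for link in collect_links(html):
        if needle.lower() in link["text"].lower():
            return link["href"]
    return None


print(link_inside_class(PAGER_HTML, "next"))
print(link_with_text(PAGER_HTML, "next"))
print(link_with_text(PAGER_HTML, "»"))
```

Matching on the link text is the fallback when no useful class is available, and passing an arrow character instead of "next" covers pagers that show only an arrow.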
If you'd like to learn more about Data Miner, we'd recommend checking out our advanced features in the Advanced Learning section.