How To Scrape Google Image Search?
Artificial intelligence has become a major part of our lives these days. If you are planning to train an image classifier, you first need a set of training images, and scraping Google Images is one way to collect them. Wondering how you can scrape the web for images? Good news: you have landed on the right spot. Let us go through the steps right away without further ado.
Open Your Browser
To scrape images from the web, the first thing you need to do is open your web browser. The technique works in most web browsers, but to be safe we suggest that you stick with Google Chrome, since the console shortcut used later in this guide is the Chrome one.
Disable Ad Blocker
Secondly, since you are going to download a large number of images directly from their URLs, you want no interruptions. Go to your web browser settings and disable the ad blocker so that none of the image URLs are blocked when they are requested.
Use Reverse Image Search Function
Next, search for the images that you want. In most cases a plain search term is enough: for instance, if you want images of mountains, type mountains in the search bar. If you instead want images similar to one that you already have, you can use the reverse image search function in Google Images to get your desired results.
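If you plan to repeat the search for several terms, the results page can also be opened from a URL instead of the search bar. The sketch below only builds that URL. It assumes the `tbm=isch` query parameter, which has historically selected the Images tab and is not guaranteed to stay stable, and `image_search_url` is a hypothetical helper name, not part of any library.

```python
from urllib.parse import urlencode


def image_search_url(term: str) -> str:
    """Build a Google Images search URL for one search term."""
    # Assumption: "tbm=isch" selects the Images tab. Google may change this.
    query = urlencode({"q": term, "tbm": "isch"})
    return "https://www.google.com/search?" + query


# Spaces in the term are encoded automatically.
print(image_search_url("snowy mountains"))
```

Paste the printed URL into the browser's address bar, then continue with the next step.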
Scroll To The Bottom Of The Page
Once the results page has loaded, scroll down to the bottom so that all the images are loaded and ready to be collected. This step also lets you check whether the loaded images match your requirements.
Open Console
Next, you need to open the JavaScript console. Not everyone has dedicated scraping software, so using the browser's built-in console is the easiest way to go. In Chrome, press Ctrl + Shift + J at the same time (Cmd + Option + J on macOS), and the console panel will open.
Enter The Following Command
Now you need to run a command to copy the URLs of all the images displayed in your Google Image Search results. To do so, paste the following commands into the console and run them. Once they have run, a file with all the URLs will be downloaded to your default Downloads folder.
// Collect the original image URL ("ou") from each result's metadata.
// The .rg_di and .rg_meta class names depend on Google's page markup and may change.
urls = Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el => JSON.parse(el.textContent).ou);
window.open('data:text/csv;charset=utf-8,' + encodeURIComponent(urls.join('\n')));
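The downloaded file is a plain list of URLs, one per line, so it is worth a quick sanity check before downloading anything. The following is a minimal Python sketch, not part of the original recipe: `clean_urls` is a hypothetical helper that drops blank lines, entries that are not http(s) links, and repeated URLs.

```python
from urllib.parse import urlparse


def clean_urls(lines):
    """Return the unique http(s) URLs from an iterable of text lines, in order."""
    seen = set()
    cleaned = []
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip data: URIs and anything that is not a web link
        if url in seen:
            continue  # skip URLs that already appeared
        seen.add(url)
        cleaned.append(url)
    return cleaned


sample = ["https://example.com/a.jpg\n", "", "https://example.com/a.jpg"]
print(clean_urls(sample))
```

To apply it to the real file, pass the open file object, for example `clean_urls(open('/path/to/download/file'))`.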
Download The Images
Now that you have the URLs, it is time to download all the images at once rather than manually one by one. This step runs in Python, not in the browser console, and uses the fastai library. Copy the following code into your Python environment and adjust the arguments: the first parameter is the path to the downloaded file of URLs, and the second is the destination folder.
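# This call matches the fastai v1 signature; other versions may differ.
from fastai.vision import *
# First argument: file of URLs; second argument: folder to save the images in.
download_images('/path/to/download/file', 'destination_folder')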
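If fastai is not installed, the same step can be approximated with Python's standard library alone. This is a rough sketch and not the fastai implementation: `fetch_bytes` and `download_all` are hypothetical helper names, files are simply numbered in order, and a URL that fails for any reason is recorded and skipped.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def fetch_bytes(url, timeout=10):
    """Download one URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, dest, fetch=fetch_bytes):
    """Save each URL into dest as a numbered file.

    Returns (saved_paths, failed_urls).
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    saved, failed = [], []
    for index, url in enumerate(urls):
        try:
            data = fetch(url)
        except Exception:
            # One bad URL (network error, timeout, HTTP error) should not stop the run.
            failed.append(url)
            continue
        # Keep the extension from the URL when there is one; assume .jpg otherwise.
        suffix = Path(urlparse(url).path).suffix or ".jpg"
        target = dest / f"{index:08d}{suffix}"
        target.write_bytes(data)
        saved.append(target)
    return saved, failed
```

A call would look like `download_all(Path('/path/to/download/file').read_text().splitlines(), 'destination_folder')`. The `fetch` parameter exists so the download logic can be tried with a stand-in function before touching the network.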
You Are Done!
And you are all done! If you are looking for more images, one trick is to translate your search term into different languages, since each translation returns different results. You can remove the duplicates later when finalizing your data.
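Removing duplicates can be sketched as follows, under one assumption: two files count as duplicates only when their bytes are identical, so resized or re-encoded copies of the same picture are not detected. `find_duplicates` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def find_duplicates(folder):
    """Return the files whose bytes exactly match an earlier file in the folder."""
    first_seen = {}
    duplicates = []
    # Sort so that the file kept for each group is always the first by name.
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in first_seen:
            duplicates.append(path)
        else:
            first_seen[digest] = path
    return duplicates
```

Review the returned list first, then delete the entries with `path.unlink()` once you are satisfied that they really are repeats.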