This repo records some useful skills for searching with Google.
Almost every result returned by Google has three parts: the title, the URL and the body. As the following picture shows, the content in the red, light green and blue boxes is the title, URL and body respectively.
The skills to be introduced may be roughly divided into three categories: location-directed, content-directed and others.
In this way, we can filter pages according to their location. Specifically, we can use the site and URL of a page to control the location.
As the name suggests, we can limit the results to a specific location, and by location I mean the domain of the web pages. In Google, we can use the site operator to achieve this.
For example, to search for pages that contain javascript prototype, we can type a search like this:
As we can see in the picture above, the result pages do contain javascript and/or prototype. Now suppose we want pages only from one domain, say stackoverflow.com, because pages from this domain are more likely to have the high-quality content we need. We can tune our search like this:
Now the result pages all come from the stackoverflow.com domain.
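As a minimal sketch of the same idea in code, the snippet below builds a site-restricted query and turns it into a search URL. The helper names `with_site` and `search_url` are made up for illustration, and the `https://www.google.com/search?q=...` URL form is an assumption about how a query typed in the browser is encoded.

```python
from urllib.parse import urlencode

# Assumed base URL of the Google results page.
GOOGLE_SEARCH = "https://www.google.com/search"


def with_site(keywords, domain):
    """Append a site: restriction to a plain keyword query (hypothetical helper)."""
    return f"{keywords} site:{domain}"


def search_url(query):
    """URL-encode the query into the q parameter (hypothetical helper)."""
    return f"{GOOGLE_SEARCH}?{urlencode({'q': query})}"


query = with_site("javascript prototype", "stackoverflow.com")
print(query)
print(search_url(query))
```

The first line printed is the text we would type into the search box, and the second is a link that could be opened directly in a browser.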
We can also filter on the URL of pages. Google has two related operators for this: inurl and allinurl.
For example, if we want pages whose URLs contain zju and/or cs, we can construct our search like this:
The key difference between inurl and allinurl is that the latter requires all the keywords to be present in the URL, while the former requires at least one of the keywords to be present in the URL. The following picture shows the result using allinurl:
As far as I know, tuning the site and URL of pages in this way brings the results closer to what we expect in terms of page location.
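To make the difference concrete, here is a toy model of the two operators as described above: "at least one keyword" versus "all keywords" in the URL. It is only a sketch of that description, not how Google actually evaluates queries, and the sample URLs are invented for illustration.

```python
def inurl(url, keywords):
    """Toy model: True if at least one keyword appears in the URL."""
    url = url.lower()
    return any(k.lower() in url for k in keywords)


def allinurl(url, keywords):
    """Toy model: True only if every keyword appears in the URL."""
    url = url.lower()
    return all(k.lower() in url for k in keywords)


# Invented sample URLs, used only to exercise the two functions.
urls = [
    "https://example.com/zju/cs/index.html",
    "https://example.com/zju/news",
    "https://example.com/about",
]

for url in urls:
    print(url, inurl(url, ["zju", "cs"]), allinurl(url, ["zju", "cs"]))
```

Only the first URL passes both checks; the second passes `inurl` alone, and the third passes neither.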
In this way, we can tune our search according to the title, body and file type of a result page. Google has the corresponding operators intitle, allintitle, intext, allintext and filetype for this.
Search for pages whose titles match some keyword.
For example, to search for pages whose titles contain the zju keyword, we can do this:
We can check the HTML source to see whether the title truly contains the zju keyword. As the following picture shows, it does.
Note: keywords are case-insensitive.
As I mentioned earlier, the operators prefixed with all require all the keywords to be present at the same time.
For example, if we want to search for pages whose titles contain the zju keyword, we can do this:
Similarly, we can check the html source shown below:
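The manual check of the HTML source can also be scripted. The sketch below pulls the `<title>` text out of a page with Python's standard `html.parser` module and tests, case-insensitively, whether it contains every keyword. The `TitleParser` class, the `title_contains_all` helper and the sample HTML string are all made up for illustration.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def title_contains_all(html, keywords):
    """True if every keyword appears in the page title, ignoring case."""
    parser = TitleParser()
    parser.feed(html)
    title = parser.title.lower()
    return all(k.lower() in title for k in keywords)


# Invented sample page: "physics" appears in the body but not in the title.
sample = (
    "<html><head><title>ZJU Computer Science</title></head>"
    "<body>zju physics</body></html>"
)

print(title_contains_all(sample, ["zju"]))
print(title_contains_all(sample, ["zju", "physics"]))
```

The second check fails because a keyword in the body does not count towards the title.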
Search the body of an HTML page.
For example, to search for pages whose body contains google search skills, we can write this:
You may notice that I also visited the fourth result in the picture above and found that it doesn't contain the skills keyword. This is not surprising, because intext only requires at least one of the keywords.
If we want the body of the pages to contain all three keywords, we can use allintext:
And if we want the three keywords to appear consecutively in the body of the pages, we can use double quotes, which will be introduced later:
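The gap between "all keywords somewhere in the body" and "the keywords next to each other" can be sketched with two small text checks. This is a toy model of the behaviour described above, not Google's implementation; the helper names and the two sample sentences are invented for illustration.

```python
import re


def words(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def contains_all(text, keywords):
    """Toy allintext: every keyword appears somewhere in the text."""
    present = set(words(text))
    return all(k.lower() in present for k in keywords)


def contains_phrase(text, phrase):
    """Toy quoted search: the words of the phrase appear consecutively."""
    haystack = " " + " ".join(words(text)) + " "
    needle = " " + " ".join(words(phrase)) + " "
    return needle in haystack


# Invented sample bodies.
scattered = "Skills you need to search Google effectively"
adjacent = "A list of Google search skills"

print(contains_all(scattered, ["google", "search", "skills"]))
print(contains_phrase(scattered, "google search skills"))
print(contains_phrase(adjacent, "google search skills"))
```

The first sample satisfies the all-keywords check but not the phrase check, which is why adding quotes narrows the results further.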
Find and download different kinds of documents.
For example, to search for PDF files that contain the javascript keyword, we would type a search like this:
But if we don't use the filetype operator and just type javascript pdf, we are less likely to get the expected results:
Some operators can be used in both location-directed and content-directed searches, such as the minus sign and double quotes.
Eliminate irrelevant results.
Search an exact phrase.
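As a rough sketch of how these operators combine with the earlier ones, the hypothetical helper below assembles a query string from plain terms, an exact phrase, excluded words and an optional site. The function name, its parameters and the sample keywords are illustrative assumptions, not part of any Google API.

```python
def build_query(terms, phrase=None, exclude=(), site=None):
    """Join plain terms, a quoted phrase, -excluded words and a site: filter."""
    parts = list(terms)
    if phrase:
        parts.append(f'"{phrase}"')  # double quotes: exact phrase
    parts.extend(f"-{word}" for word in exclude)  # minus sign: drop results with this word
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


# Illustrative keywords only.
print(build_query(
    ["javascript"],
    phrase="prototype chain",
    exclude=["jquery"],
    site="stackoverflow.com",
))
```

Note that the minus sign is attached directly to the word it excludes, with no space in between.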
Use an asterisk within quotes to specify unknown or variable words
Here’s a lesser known trick: searching a phrase in quotes with an asterisk replacing a word will search all variations of that phrase. It’s helpful if you’re trying to determine a song from its lyrics, but you couldn’t make out the entire phrase (e.g. "imagine all the * living for today"), or if you're trying to find all forms of an expression (e.g. "* is thicker than water").
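To see what the asterisk stands for, the sketch below translates such a phrase into a regular expression in which each `*` matches one or more words. That reading of the wildcard is an assumption made for this example, and the helper names are invented; it is a local illustration, not a description of Google's matching.

```python
import re


def wildcard_to_regex(phrase):
    """Turn a phrase with * placeholders into a compiled, case-insensitive regex."""
    parts = []
    for token in phrase.split():
        if token == "*":
            parts.append(r"\w+(?:\s+\w+)*")  # assumed: one or more words
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s+".join(parts), re.IGNORECASE)


def matches(phrase, text):
    """True if the text contains something that fits the wildcard phrase."""
    return wildcard_to_regex(phrase).search(text) is not None


print(matches("imagine all the * living for today",
              "Imagine all the people living for today"))
print(matches("* is thicker than water", "Blood is thicker than water"))
print(matches("* is thicker than water", "is thicker than water"))
```

The last check fails because nothing fills the placeholder, which mirrors the idea that the asterisk stands in for a missing word.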
- every word in a query matters
- word order matters
- words are case-insensitive
- never trust one source
- Keep in mind that we should tune our search to get good results. If you are not satisfied with the results returned, keep tuning the search until you find what you expect.
For more advanced search skills, you can visit the following two links.