Author Topic: The Search Tutorial Thread  (Read 23242 times)

Offline Otto Puzzell

  • Founder and
  • Editor
  • *
  • Posts: 31556
  • Country: us
  • Puzzle Points 444
  • Open field, with a window.
    • AutoPuzzles
The Search Tutorial Thread
« on: September 07, 2011, 06:16:19 AM »
As our site has grown since its humble beginnings in 2006, the volume of solved puzzles and other information has become increasingly difficult to search. Many of you have noticed that running searches for solved puzzles has become troublesome.

With this tutorial thread, I hope to proffer some tips that will make searching the Auto Puzzles site easy and effective, whether you use Google, Yahoo, or Bing as your primary search tool. Before I get into the nuts and bolts, let's take a look at the spark plug of search engine usage - the Boolean search string.

Boolean search traces its origins - and name - to English mathematician George Boole (2 November 1815 – 8 December 1864), who spent the latter part of his career in Cork, Ireland. 'Twas he who said:

Quote
... no general method for the solution of questions in the theory of probabilities can be established which does not explicitly recognize ... those universal laws of thought which are the basis of all reasoning

Logic. Pure and simple. And with his very clear and concise observation, Boole established himself as the father of computer science, though only the most rudimentary of computing devices had been conceived by the time he passed away.

So, what does this have to do with searching AutoPuzzles? Luckily, in the 147 years since Mr. Boole shuffled off his mortal coil, a lot of very bright minds have been working at developing the hardware that makes computers and the internet possible, and at applying Boolean logic to the same end. More importantly, applying Boolean logic to online searches has been made incredibly easy, thanks to the work of organizations like AltaVista (once part of Digital Equipment), Ask.com (once known as AskJeeves) and the powerhouse Big 3 of internet searches: Google, Yahoo! and Bing. These companies have labored long and hard to take much of the guesswork out of searching, to the point that they have manipulated Boolean search to behave much like natural search.

How does natural search differ from pure Boolean? Today, if you type a question into a Google, Yahoo! or Bing search bar, the search engines behind them make very educated guesses as to what you mean, and what you don't mean. For instance, you might type the question:

Where are the solved puzzles on AutoPuzzles?

Within the first page of search results, Google presents a link to our solved puzzles index (it's the sixth result), as do Yahoo! and Bing. The latter two present the link as the very first search result. You'll find that Yahoo! and Bing will always deliver similar results, as Yahoo! search is now powered by the Bing search engine. Many people who use search engines as a tool in making a living now gravitate toward Bing, as its natural search capabilities are often better than Google's.

At my employer, finding information about people and companies is necessary to succeed, and it requires deft wielding of search engines. As my role has migrated toward business development (sales), I have become a bit rusty compared to the 'doers' in our company, who hammer out search strings all day. But over the last 10 years, it's become increasingly necessary to overcome the 'natural' search capabilities of the search sites and use well-crafted Boolean search strings to find the information we're after. So, how does a Boolean search string work?

(don't worry, we'll get to the "easy to do" part later.)  ;)

Boolean logic consists of three logical operators:

•   OR
•   AND
•   NOT


Let's look first at the logical operator "OR".

Question: I would like information about college.

If I run this search in an online search tool, I will retrieve records in which AT LEAST ONE of the search terms is present. We are searching on the terms college and also university since documents containing either of these words might be relevant.

OR logic is most commonly used to search for synonymous terms or concepts. This search would be simply run by typing the following text into a search engine search box:

college OR university

With the application of natural search tendencies, when I just type the words college and university (without the OR operator), the Big 3 search sites assume that I mean college AND university, and deliver a list of results that include both words. If, however, I include the OR operator, I get more results. With common terms like college and university, too many results, most likely. But what if we were looking for two obscure words? Much more useful!
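
For anyone who thinks better in code, here is a minimal Python sketch of the difference between OR and the AND that the search sites assume. It is only an illustration: the four "documents" are made up, and a real search engine does far more than match whole words.

```python
# Toy document collection (hypothetical text, for illustration only).
docs = {
    1: "college admissions guide",
    2: "university rankings for this year",
    3: "college and university tuition compared",
    4: "classic car auction results",
}

def matching(term):
    """Return the ids of documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}

college = matching("college")
university = matching("university")

either = college | university  # OR: at least one of the terms is present
both = college & university    # AND: what a bare "college university" implies

print(sorted(either))  # [1, 2, 3]
print(sorted(both))    # [3]
```

OR can never return fewer documents than either term on its own, which is why it suits synonyms and obscure words better than common ones.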
« Last Edit: September 07, 2011, 06:23:44 AM by Otto Puzzell »
You wanna be the man, you gotta Name That Car!

Offline Otto Puzzell

Re: The Search Tutorial Thread
« Reply #1 on: September 07, 2011, 06:17:21 AM »
Tomorrow:

AND, OR, NOT and beyond!

Offline Otto Puzzell

Re: The Search Tutorial Thread
« Reply #2 on: September 14, 2011, 05:00:28 AM »
OK - a bit late, but here it is:

AND logic

Question: I'm interested in the relationship between Ford and Mustang.

In this search, we retrieve records in which BOTH of the search terms are present.

Here is an example of how AND logic works:

Ford = 1,270,000,000 results
Mustang = 223,000,000 results
Ford AND Mustang = 32,200,000 results

The more terms or concepts we combine in a search with AND logic, the fewer results we will retrieve.

For example:

Ford = 1,270,000,000 results
Mustang = 223,000,000 results
Ford AND Mustang = 32,200,000 results
Ford AND Mustang AND Shelby AND convertible AND Delaware = 644,000 results
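
The narrowing effect can be sketched in a few lines of Python. The five documents below are invented for the example, and the counts are obviously nothing like the real figures quoted above; the point is only that each extra AND term can shrink the result count, but never grow it.

```python
# Hypothetical mini-index of page titles (not real search results).
docs = [
    "ford mustang shelby convertible for sale",
    "ford mustang coupe road test",
    "ford pickup towing guide",
    "mustang horse breeds of the american west",
    "shelby cobra history",
]

def count_all(*terms):
    """Count documents that contain every one of the given terms (AND logic)."""
    return sum(all(term in doc.split() for term in terms) for doc in docs)

counts = [
    count_all("ford"),
    count_all("ford", "mustang"),
    count_all("ford", "mustang", "shelby"),
    count_all("ford", "mustang", "shelby", "convertible"),
]

print(counts)  # [3, 2, 1, 1]
```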

In addition, search engines make use of the proximity operators NEAR and AROUND. A proximity operator specifies how close the terms must be to one another within the text of a source document. NEAR and AROUND are restrictive ANDs. The closeness of the search terms is determined by the particular search engine. Most search engines apply some form of proximity searching by default. More on this in a future installment.

NOT logic

Question: I want information about Shelby, but I don't want to see anything about Ford.

In this search, we retrieve only records in which the term we want is present and the term we have excluded is absent.

Here is an example of how NOT logic works:

Shelby = 116,000,000 results
Ford = 1,270,000,000 results
Shelby NOT Ford = 17,700,000 results

NOT logic excludes records from your search results. Be careful when you use NOT: the term you do want may be present in an important way in documents that also contain the word you wish to avoid. For example, consider a Web page that includes the statement "Shelby, after severing ties with Ford, collaborated with Chrysler." The search illustrated above would exclude this document from your results.
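
Here is a small Python sketch of that pitfall, using the sentence quoted above as one of three pages. The other two pages and the helper names are made up for the illustration, and real engines are more sophisticated than this word matching.

```python
import re

# Page "a" is the sentence quoted above; "b" and "c" are hypothetical.
pages = {
    "a": "Shelby, after severing ties with Ford, collaborated with Chrysler.",
    "b": "Shelby Cobra production numbers",
    "c": "Ford announces new pickup",
}

def words(text):
    """Lower-case the text and split it into a set of words, dropping punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def search_not(include, exclude):
    """Return ids of pages that contain `include` but do not contain `exclude`."""
    return sorted(
        page_id
        for page_id, text in pages.items()
        if include in words(text) and exclude not in words(text)
    )

result = search_not("shelby", "ford")
print(result)  # ['b']
```

Page "a" is lost even though it is mostly about Shelby, simply because the word Ford appears in it once.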

Combined AND and OR logic

Question: I want information about the displacement of XK-E engines.

Search: displacement AND (e-type OR XK-E)

You can combine both AND and OR logic in a single search, as shown above.

The use of parentheses in this search is known as FORCING THE ORDER OF PROCESSING. In this case, we surround the OR words with parentheses so that the search engine will process the two related terms as a unit. The search engine will then use AND logic to combine this result with the other concept (displacement). Using this method, we are assured that the semantically related OR terms are kept together as a logical unit.
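
To see why the grouping matters, the Python sketch below compares the search as written with the same three terms grouped the other way. The page titles are hypothetical, and how a given search engine groups a query that has no parentheses at all is up to that engine; the sketch only contrasts the two explicit groupings.

```python
# Hypothetical page titles, for illustration only.
docs = {
    1: "e-type engine displacement figures",
    2: "xk-e engine displacement in litres",
    3: "xk-e paint colours",
    4: "displacement of a ship hull",
}

def matching(term):
    """Return the ids of documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}

displacement = matching("displacement")
e_type = matching("e-type")
xk_e = matching("xk-e")

# displacement AND (e-type OR XK-E): the OR terms are treated as one unit.
grouped = displacement & (e_type | xk_e)

# (displacement AND e-type) OR XK-E: every XK-E page gets in, on any subject.
ungrouped = (displacement & e_type) | xk_e

print(sorted(grouped))    # [1, 2]
print(sorted(ungrouped))  # [1, 2, 3]
```

With the other grouping, the page about paint colours slips in, because nothing forces XK-E pages to mention displacement.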

More to come...

Offline Otto Puzzell

Re: The Search Tutorial Thread
« Reply #3 on: September 28, 2011, 05:48:13 AM »
Mr. T tells me there is more to come, soon

Offline pguillem

  • Professional
  • *
  • Posts: 4027
  • Country: ca
  • Puzzle Points 516
  • Designer
Re: The Search Tutorial Thread
« Reply #4 on: October 02, 2011, 01:50:25 PM »
Back to basics is always refreshing.  It also helps to use the "  " function (quotation marks), which lets you search for a specific phrase, and, where needed, to look only on specific sites with Google advanced search.  Otto will probably explain that.

But there is a more general question which sometimes makes research a little more difficult.  Whatever it is, an object has many properties, and the description of the object pinpoints just some of them.  For example, I could label the same photo "Citroën Méhari", "Méhari Orange", "Citroënade", "affichefin", "Mouzeil" or even "DSCN24015792" if the indexing is done by my camera.  That's what makes the search so difficult, even with Google Images, as long as you use keywords instead of the little camera in Google Image Search.  Sometimes the descriptors can also be implicit: for example, in Allemano "Faceless" Professional Puzzle, one of the trucks is described as an Elektro-Auto.  In fact, the truck is an ÖAF, but maybe ÖAF trucks are so well known in Austria that the museum didn't find it useful to be so precise, or maybe it was just interested in the technology, so that ÖAF was hidden.

Another major difficulty: all objects are described in particular languages, and the descriptors differ from one language to another.  For example, a truck is a "lorry" in British English, a "camion" in French and Spanish, a "caminhão" in Portuguese, an "autocarro" in Italian, an "LKW" in German but a "Lastwagen" in Swiss German, a "teherautó" in Hungarian and so on.  There is no easy solution here: you just have to learn these words, and the easiest way is to explore sites in foreign languages.  Needless to say, you'll have great fun with the foreign alphabets and with the ideograms.  But Google Search accepts them, doesn't it?
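
Once you have learned the words, one way to put them to work is to fold them into a single OR group, as in Otto's parentheses example earlier in the thread. The Python sketch below builds such a search string from a few of the words listed above; the helper name or_group is made up for the illustration, and whether a given search engine honours every operator in the result is not guaranteed.

```python
def or_group(terms):
    """Join terms with OR inside parentheses, quoting any multi-word term."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"

# A few of the words for "truck" from the list above.
truck_words = ["truck", "lorry", "camion", "autocarro", "LKW", "Lastwagen", "teherautó"]

query = "ÖAF AND " + or_group(truck_words)
print(query)
# ÖAF AND (truck OR lorry OR camion OR autocarro OR LKW OR Lastwagen OR teherautó)

# Multi-word labels are wrapped in quotation marks so they stay together.
labels = or_group(["Citroën Méhari", "Méhari Orange"])
print(labels)
# ("Citroën Méhari" OR "Méhari Orange")
```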

That's what makes Auto-Puzzles so fun, as long as you don't use the little camera.  But Auto-Puzzlers are very cunning at finding efficient ways of making that impossible.  So keep up with the old-fashioned way!
« Last Edit: October 02, 2011, 05:32:51 PM by pguillem »