Conquer the Information Age

Employers Value Web Research Skills


Become a Web mining expert

Find Golden Information Nuggets

Computer skills are an absolute necessity for virtually everyone in any professional career.  Today, more than ever before, information is a high-value commodity used by businesses to make good decisions for both short- and long-term strategies and to conduct daily activities.

If you, as an employee, can bring strong information gathering skills to the job, your value to the employer increases significantly, which may translate into greater pay.  The Internet is a goldmine of information, but like mining for gold, you have to know where to look and how to separate the plain old rocks and "fool's gold" from the real nuggets. Following are some tricks to make an otherwise daunting task easy and efficient while making you look good to your employer.

Learn to speak Internetese. Know the terminology of the Internet. A glossary of Internet terms is at the end of this activity, and hyperlinks are provided in the text to terms in the glossary. Use the "Back" button to return to the activity from the glossary.

Tap into the "Mother Lode"

Search engines. The key to finding the "Mother Lode" on the World Wide Web is the search engine.  Hundreds of search engines are available on the web, but only a handful have huge databases cataloging the majority of web sites in cyberspace.  Your best bet is to start with one of the larger search engines, such as:

  1. http://google.com
  2. http://altavista.com
  3. http://excite.com
  4. http://go.com
  5. http://northernlight.com
  6. http://yahoo.com
  7. http://nbci.com

Google is a relative newcomer to the search engine business, but it has become one of the best because its database is huge, it's easy to use, it has an uncluttered presentation, and it's fast.

Mine the Lode

Keywords. In this activity, you'll be searching for the growth forecast for your career field using one or more of the major search engines. To appreciate the value of using search refinement techniques, you will start with a broad keyword search term that returns thousands of web site hits. You will then increasingly refine your keyword search string to better focus the search results.

Step 1:

Select a search engine from the list above—you may repeat the activity on different search engines to practice your searching skill.
 

The desired target of your search is the Occupational Outlook Handbook published by the U.S. Department of Labor, Bureau of Labor Statistics. For the purposes of this exercise, you'll assume that you do not know the title of the document or the agency that produces it.  Therefore, you'll have to use search words that are general in nature.

Step 2:

  • In the search window of the search engine, type in the keyword "job" without the quotation marks and click the Search or Go button. Depending on the search engine you use, you will get anywhere from almost 9 million to almost 27 million web site hits. (NOTE: If you used Yahoo, make sure you select the Web Page search option.)
  • Searching through 27 million web sites isn't efficient, so let's add another keyword to the search.  Go back to the search engine search window and type a plus sign (+) in front of "job," add a space after "+job" and type in "+labor" (again without the quotation marks). The search string will look like this: +job +labor. Click the Search button again. You will get between 8,000 and 1.2 million hits.
     
Inclusive or exclusive? The plus sign is used to make it mandatory that the word is contained on the target web page.  Without the plus sign, the web page can contain any rather than all of the keywords. The plus sign is one of several logical operators that can be used to refine keyword searches.
  • The number of web site hits generated by your "+job +labor" search string is certainly smaller, but the list is still too long to search efficiently.  Go back to the search engine window, add a space after "+labor" and type in "+statistics." Your search string will look like this: +job +labor +statistics. Click the Search button again. You will get about 630 to 471,000 hits depending on the search engine you used.
  • After using just these three keywords on most of the search engines, you will see the Occupational Outlook Handbook in the first ten listings on the result page (most will list the Handbook in the number one through number three positions). (If you used Northernlight, look on the menu on the left for Bureau of Labor Statistics and click on the link.)
  • Now click on the hyperlink for the Occupational Outlook Handbook. Once you're on the web site that contains the information you need, you still need to find the specific information about your career field within the web site. 
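
The keyword refinement in Step 2 can also be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the function names, the address http://search.example/find, and the "q" parameter are all made up for the example. Note that the sketch puts a plus sign on every keyword, while Step 2 starts with the bare word "job."

```python
from urllib.parse import urlencode


def build_query(required, excluded=()):
    """Join keywords into a search string using + (include) and - (exclude)."""
    parts = ["+" + word for word in required]
    parts += ["-" + word for word in excluded]
    return " ".join(parts)


def build_search_url(base_url, query, param="q"):
    """Attach the search string to a search address.

    base_url and param are hypothetical; real search engines use their
    own addresses and parameter names.
    """
    return base_url + "?" + urlencode({param: query})


# Each refinement in Step 2 adds one more required keyword.
keywords = ["job", "labor", "statistics"]
for count in range(1, len(keywords) + 1):
    print(build_query(keywords[:count]))

print(build_search_url("http://search.example/find", build_query(keywords)))
```

The plus signs and spaces are encoded when the search string is placed in a URL, which is why the address your browser shows after a search looks different from what you typed.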

Step 3:

  • The better web sites that contain large amounts of information often provide you with yet another convenient search engine to find web pages within the web site. With others, you may have to browse through the web site to find the page containing the information you are seeking—look for navigation menus or buttons to help locate likely web pages. (NOTE: Some search engines offer the capability, under their advanced search options, to search all the web pages associated with a particular URL.)
  • Using the search engine provided on the first page or homepage of the Occupational Outlook Handbook, type into the search window one keyword that is most likely to be contained in all job descriptions of your career field.
  • If you get several hits, choose the one that most closely describes your career and click on it.  If you don't get any hits, try another keyword for your search, or select the job title index provided on the site, and browse through the listings to find the one that most closely matches your career.

Step 4:

  • You're almost there! Now you're going to search for the growth forecast for your career field.
  • You could scroll down the web page looking for the information, but it's far more efficient to search the web page itself for your keyword.
  • Hold down the Ctrl key on your keyboard and tap the F key.  In the search window that pops up on your screen, type in the keyword "outlook" without the quotation marks, and click the Find Next button twice.  There you have it, the growth forecast for your career is at your fingertips—you've quickly found your "golden nugget"!
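
What the Find dialog box does in Step 4 can be approximated with a short Python sketch. The function name and the sample text are assumptions made for illustration (the sample is not quoted from the Handbook), and a real browser's Find feature is more elaborate.

```python
def find_keyword(page_text, keyword, occurrence=1, context=40):
    """Return the text around the Nth match of keyword, ignoring case.

    Returns None when the page has fewer than `occurrence` matches.
    """
    haystack = page_text.lower()
    needle = keyword.lower()
    position = -1
    for _ in range(occurrence):
        # Like clicking Find Next: continue from just past the last match.
        position = haystack.find(needle, position + 1)
        if position == -1:
            return None
    start = max(0, position - context)
    end = min(len(page_text), position + len(needle) + context)
    return page_text[start:end]


# Made-up sample text standing in for a long web page.
sample_page = (
    "Occupational Outlook Handbook. Nature of the Work. Working Conditions. "
    "Job Outlook: this sentence is placeholder text for the example."
)

# The second match skips the word in the title, like clicking Find Next twice.
print(find_keyword(sample_page, "outlook", occurrence=2))
```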

Plain Old Rocks and Fool's Gold

Web site credibility. No web search lesson would be complete without a few words of caution—VERIFY information credibility! Anyone can put anything on the Internet. It's your duty to ensure that the information you've found is the real thing: in our gold mining analogy, what we're doing would be an assay.

Carefully check the credibility of any site that ends in ".com," ".net," or ".org."  Anyone can acquire these domain names.  Sometimes disreputable web sites acquire a domain name very similar to that of real organizations to "spoof" the inexperienced web miner—"fool's gold" of the web.  While ".org" web sites are usually reputable organizations, remember to verify that the site is indeed the real web site for the organization.  The ".com" organizations are usually for-profit businesses and may skew information to support their business markets.  If you use these sites, it is probably a good idea to look at multiple sources to verify the accuracy of the information.

Domain names in the United States registry that end in ".edu," ".gov," ".mil," and ".us" are controlled, and the domain registrar must verify the domain registrant.  These sites have a reasonably high assurance of providing accurate information; however, look very carefully at information on student-created web sites located on ".edu" domains.
 

United States Root Domains

.com   normally for-profit businesses
.net   normally networks, but may also be businesses
.org   normally nonprofit organizations, but may be businesses
.edu   verified educational institutions
.mil   verified US military web sites
.gov   verified US government agencies
.us    verified state and local government agencies


You will also find many web sites ending in a two-letter country code. These are usually web sites registered outside the United States.  For instance, ".uk" is the United Kingdom, ".de" is Germany, and ".jp" is Japan.  If you use these web sites, it is also a good idea to verify the accuracy of the information by checking other web sites.
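
As a rough illustration of the root domain check described above, here is a small Python sketch that reads the root domain from a URL and reports which group it falls into. The function names and category labels are assumptions for this example, and http://example.de is a made-up address. The result only tells you how the domain was registered; it does not replace checking the information itself.

```python
from urllib.parse import urlsplit

# Groupings taken from the root domain list in this activity.
VERIFIED_ROOTS = {"edu", "gov", "mil", "us"}
OPEN_ROOTS = {"com", "net", "org"}


def root_domain(url):
    """Return the letters after the last period of the host name.

    The URL must include the protocol (for example http://); otherwise
    urlsplit cannot identify the host and an empty string is returned.
    """
    host = urlsplit(url).hostname or ""
    if "." not in host:
        return ""
    return host.rsplit(".", 1)[1]


def registration_type(url):
    """Classify a URL by how its root domain is registered."""
    root = root_domain(url)
    if root in VERIFIED_ROOTS:
        return "verified"
    if root in OPEN_ROOTS:
        return "open"
    if len(root) == 2 and root.isalpha():
        return "country code"
    return "unknown"


for url in ("http://google.com", "http://ucla.edu", "http://example.de"):
    print(url, "->", registration_type(url))
```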

For future reference and to document the source in your research file, also copy and paste the URL of the web site where you found the information. Here is the easy way of doing this:

  1. Place the cursor on the URL in the browser address window and click the left mouse button once so the URL is highlighted.
  2. Click the right mouse button once while the cursor is on the highlighted URL.
  3. Select the Copy option from the popup menu.
  4. Paste the URL into the research compilation document you create using a word processor or text editor.

Panning for Gold: Tips and Tricks

Logical operators. Use of logical or Boolean operators makes web searches even more efficient. Almost every search engine provides a listing of logical operators that can be used along with instructions for using them.  Some search engines have an "Advanced Search" feature or something similar which automatically formulates your search using logical operators—it's pretty painless because you don't have to remember the logical operators or how they work. The following table contains commonly used operators and usage.
 

Logical Operators

No operators Search results will contain pages that contain any of the words listed.
Quotation marks The search engine looks for web pages that have an exact match for words between the double quotation marks. Remember our search for the Occupational Outlook Handbook? If you know the exact title of a document, you could place the title within quotes and search for it. Try a search with the Handbook title to see how it works.
+ This is the "include" symbol: all words marked with the plus sign must appear on the web page in order for the search engine to list it in the search results.
- This is the "exclude" symbol: any web page that contains an excluded word will not be returned in the search results. For instance, a search for +pets +dogs -cats will list pages that contain "pets" and "dogs" but no pages that also list "cats" on the page.
AND or & Use of AND between keywords is similar to using the include plus sign. You can also use the & symbol.
OR or | Use of OR between keywords is similar to using no operators.  AND and OR are useful in combination.  For instance, a search for pets AND (dogs OR cats) will list all pages that contain the word "pets" and either "dogs" or "cats"; the parentheses make the grouping explicit, because search engines may differ in how they group mixed AND and OR terms. You can also use the | (vertical bar) symbol.
AND NOT or ! Use of AND NOT is like using the exclude minus sign operator.  For instance, pets AND NOT dogs lists all pages that contain the word "pets" excluding any page that contains the word "dogs." You can also use the ! (exclamation mark) symbol.
NEAR or ~ This operator returns web sites with words that are located close to each other on the page (usually with no more than ten words between them).  This would be handy to use for phrase searches where the phrase always contains the two words listed but may contain other words in between. You can also use the ~ (tilde) symbol.
Parentheses Parentheses are used to designate an operation to be performed first. For instance, a search for (family AND pets) AND (dogs OR cats) would return all pages that contain the words "family, pets, dogs" or "family, pets, cats" or both combinations.
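
To see how the operators in the table combine, here is a minimal Python sketch that applies them to three short, made-up "pages." The function names and the sample pages are assumptions for illustration; real search engines index pages very differently and their exact rules vary.

```python
import re


def words(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def has_all(page, *terms):
    """+ or AND: every term must appear on the page."""
    found = set(words(page))
    return all(term.lower() in found for term in terms)


def has_any(page, *terms):
    """No operator or OR: at least one term must appear on the page."""
    found = set(words(page))
    return any(term.lower() in found for term in terms)


def has_phrase(page, phrase):
    """Quotation marks: the words must appear together, in order."""
    tokens, target = words(page), words(phrase)
    last_start = len(tokens) - len(target)
    return any(tokens[i:i + len(target)] == target for i in range(last_start + 1))


def is_near(page, first, second, max_between=10):
    """NEAR: both words appear with at most max_between words between them."""
    tokens = words(page)
    firsts = [i for i, word in enumerate(tokens) if word == first.lower()]
    seconds = [i for i, word in enumerate(tokens) if word == second.lower()]
    return any(abs(i - j) - 1 <= max_between for i in firsts for j in seconds if i != j)


# Made-up sample pages.
pages = {
    "a": "Family pets: caring for dogs",
    "b": "Pets and cats at home",
    "c": "Dogs and cats living together with family pets",
}

# +pets +dogs -cats
print([name for name, text in pages.items()
       if has_all(text, "pets", "dogs") and not has_any(text, "cats")])

# (family AND pets) AND (dogs OR cats)
print([name for name, text in pages.items()
       if has_all(text, "family", "pets") and has_any(text, "dogs", "cats")])
```

With these sample pages, the first search lists only page "a," and the second lists pages "a" and "c."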


In addition to the logical operators listed, some search engines have even more commands available to refine your search.  These usually vary from search engine to search engine so you will have to consult the advanced search help page of that search engine for more information.

Web Gold Mining Recap

  • Select a good search engine.  The larger the database of web sites that the search engine uses, the more likely it is that you'll find what you're looking for.
  • Use combinations of keywords (usually three to four good keywords) along with appropriate logical operators which will return a manageable number of web sites with the highest probability of containing the information you are seeking.
  • Use searches or menus on the selected web site to assist in quickly locating the information within the web site.
  • Use page searches on web pages that contain a large amount of information to quickly jump to the area of the page which contains the object of your search.  Remember: hold down the Ctrl key and tap F to pop up the Find dialog box.  Your first clue that the page contains a lot of information is a very small scroll bar on the right of the page window.

You've got it!

Go for the GOLD: Practice web searches to polish your skills and increase your employability!

 

Internet Terminology

browser The application program used on a computer to connect to and view web pages. Internet Explorer and Netscape are the two most commonly used browsers.
directory A directory is also commonly found on many of the larger search engines.  It only contains locations of subscriber web sites that are cataloged according to content. Many of the directory services are now charging a fee for a listing (or preferred listing).  Yahoo.com is an example of a directory service which now also incorporates a web search engine.
DNS Domain Name System (also called Domain Name Service or Server): A service database that translates domain names into IP addresses in order to locate an Internet resource.
domain name The name associated with a particular Internet resource.  For instance, microsoft.com and ucla.edu are both registered domain names. Public domain names used on the Internet are reserved for a fee from domain registrars.
e-mail Electronic mail: The portion of the Internet used for messaging communication.  Two types of e-mail systems are used.
  • SMTP/POP3: True Internet messaging using Simple Mail Transfer Protocol for outgoing messages and Post Office Protocol version 3 for incoming messages. An e-mail program such as Outlook or Netscape Communicator is used to send and receive messages.
  • HTTP: Messaging through a dynamically constructed web page to transfer mail using WWW resources. A browser is the program used to send and receive messages.
FTP File Transfer Protocol: One of the methods of transferring files over the Internet. It is most commonly used to transfer large files because it is faster.
hit In search engine parlance, a hit is a successful result of a search: each item listed on the search result page is one hit.
HTML HyperText Markup Language: The coding used most commonly to develop web pages.  You can view the HTML for a web page by right-clicking your mouse in a blank area of the page and selecting View Source.  In addition to HTML, web pages may contain dynamic content through Active Server Pages (.asp), Java, scripting languages such as Perl, or other coding techniques.
HTTP HyperText Transfer Protocol: The method used to send web pages to a browser such as Internet Explorer or Netscape.
hyperlink or link Text (usually underlined), a button, or an icon that is used to jump to another place in the current page, another web page, another web site, or to download a file. The cursor normally changes to a pointing finger when it is over a hyperlink.
Internet The network of resources available world-wide to share information and files.  It includes the World Wide Web, electronic mail, over-the-network collaboration, and file transfers.
IP Internet Protocol: The method used on the Internet to locate and navigate to an Internet resource, such as a web site. A resource's IP address is written as four numbers (each from 0 to 255) separated by periods.  Think of it as a phone number or street address for an Internet resource.  You can type the IP address of a web site directly into your browser address window if you know it. For instance, http://206.47.20.65 will take you to the home page for Corel.
keyword A word selected to perform a search on the Internet. Normally, multiple keywords in conjunction with logical operators are used to refine the search.
root domain The grouping of letters following the last period on the right of all domain names. Each root domain is a separate database which facilitates the location of resources for a domain name.
search engine An Internet service which facilitates location of Internet resources by using keywords and optionally logical operators to refine search results. Logical operators are symbols such as + or - and words such as AND or OR.
TCP/IP Transmission Control Protocol/Internet Protocol: The official designation for the method used to connect computers on the Internet and exchange information.
URL Uniform Resource Locator: The full text entry used to locate an Internet resource. For instance, http://www.microsoft.com and ftp://ftp.corel.com/pub/ are both URLs.
VPN Virtual Private Network: A method used to connect computers privately to a network by "tunneling" through the Internet. Depending on the methodology, a very secure connection can be achieved.
WWW World Wide Web: The portion of the Internet used for web site resources.
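
Several of the terms above (URL, domain name, IP) fit together, and a short Python sketch can show how. It splits a URL into its parts and checks whether the host is written as a domain name or as an IP address of four numbers, each from 0 to 255. The function name and the dictionary keys are assumptions made for this example; the sketch does not contact the Internet or perform a DNS lookup.

```python
import ipaddress
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into protocol, host, and path, and flag IP-address hosts."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    try:
        # Accepts only four numbers from 0 to 255 separated by periods.
        ipaddress.IPv4Address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "protocol": parts.scheme,
        "host": host,
        "path": parts.path,
        "host_is_ip": host_is_ip,
    }


# URLs taken from the glossary entries above.
for url in ("http://www.microsoft.com", "ftp://ftp.corel.com/pub/", "http://206.47.20.65"):
    print(describe_url(url))
```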

 


© Copyright 2002,  South-Western Educational Publishing, a division of Thomson Learning. All rights reserved.