 Newsletter - August, 1999

In This Issue
WF Workshops
Seminars on WF
Intarka ProspectMiner
WF at Siebel
Focused Crawling
Copernic Server
XML Extender in DB2
WF for Resumes and Jobs
Searchable Public Records
WF Book Review
Globalization To The Max
Latest Web Assessment

R. Hackathorn

WF Workshop Series 

    A Web Farming workshop series starts on September 9 with a one-day treatment of Global Discovery Services: Use and Misuse. Frustrated with HotBot, AltaVista and the like? This is a MUST for you!
    Sponsored by US West Advanced Technologies, it is conducted at their Boulder facilities.

Seminars on Web Farming 

    Three-day seminars on Building Web Farming Systems: Methods & Tools are offered on:
    - September 15-17 in San Francisco
    - November 10-12 in Dallas
    For full details and registration, see the DCI online brochure. Sign up NOW. Space is limited. Content is awesome!

Intarka ProspectMiner 

    Intarka is a startup focused on using web farming technology in various settings. Their first product is ProspectMiner, which discovers and filters sales prospects using a combination of keyword searching, relevance analysis, and learning feedback. Funded by New Enterprise Associates, they have 38 employees at their San Jose and West Bengal offices. Read their white paper on ProspectMiner. It is one of the better business justifications for a topic-specific web farming application.

WF at Siebel 

    Siebel is following in SAP's footsteps (see the May issue). As a major vendor of sales automation and similar systems, Siebel offers InterActive Briefings, which gathers information on company profiles, business news, subsidiaries, and affiliates from web-based sources. In addition, Siebel just signed an agreement with Dun & Bradstreet to enhance their external data.

Focused Crawling 

    Another IBM Almaden project is leading the way to better web farming technology. Winner of the Best Paper Prize at the recent WWW8 Conference, the project's paper describes the use of custom crawling and link analysis to generate lists of topic-specific websites.
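
    To give a feel for what topic-specific crawling means, here is a minimal sketch in Python. It is not the method from the IBM paper: it uses a made-up in-memory "web", a crude keyword-overlap relevance score, and no link analysis at all. Every name in it (TOY_WEB, TOPIC, relevance, focused_crawl) is hypothetical and exists only for illustration. The one idea it shows is that links are followed only out of pages judged on-topic, so off-topic regions are never explored.

```python
import heapq

# Hypothetical toy web: URL -> (page text, outgoing links). Not real sites.
TOY_WEB = {
    "a.example": ("web farming data warehouse", ["b.example", "c.example"]),
    "b.example": ("data warehouse tools and web farming", ["d.example"]),
    "c.example": ("cooking recipes", ["e.example"]),
    "d.example": ("warehouse data discovery on the web", []),
    "e.example": ("more recipes", []),
}

# Assumed topic description: a plain set of keywords.
TOPIC = {"web", "farming", "data", "warehouse"}


def relevance(text):
    """Fraction of topic keywords that appear in the page text."""
    words = set(text.split())
    return len(TOPIC & words) / len(TOPIC)


def focused_crawl(seed, threshold=0.5, limit=10):
    """Best-first crawl that only follows links out of on-topic pages."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    on_topic = []
    while frontier and len(on_topic) < limit:
        _, url = heapq.heappop(frontier)
        text, links = TOY_WEB[url]
        score = relevance(text)
        if score < threshold:
            continue  # off-topic page: do not expand its links
        on_topic.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits its parent's score as its crawl priority.
                heapq.heappush(frontier, (-score, link))
    return on_topic


if __name__ == "__main__":
    for url, score in focused_crawl("a.example"):
        print(url, score)
```

    In this toy graph the recipe page is fetched once, scored as off-topic, and its outgoing link is never queued. A real system would replace the keyword score with a trained classifier and add the link analysis the paper describes.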

Copernic Server 

    Copernic has repackaged their search tool into a server for businesses wishing to offer their own specialized meta-search site. See the MetNets site as an example.

XML Extender in DB2 

    The June issue briefly mentioned the new XML Extender in IBM DB2 UDB. Here is a longer description of the functionality supported by this extender, as it is going through beta testing.

WF for Resumes/Jobs 

    The management recruiting industry is hot! With a tight labor market for high-tech personnel, these firms are under pressure to find the right people quickly. Guess what information resource they are using: the Web!

Searchable Public Records 

    Search Systems offers a directory of 982 searchable databases containing land records, court cases, licensing registrations, and much more for most of the U.S. state governments.

WF Book Review 

    The June issue of Enterprise Systems Journal contained a review by Prof. Elliot King at Loyola College of the book Web Farming for the Data Warehouse. He has captured the spirit, along with the content, of the book.

Globalization To The Max  

    A recent book, The Lexus and the Olive Tree by Thomas Friedman, has created quite a stir. Regardless of your political orientation, the book contains insights into the instant global world in which we now live. Great airplane reading. And be sure to read the reader reviews.

WebData Databases  

    ExperTelligence offers WebData, another comprehensive guide to searchable databases. Excellent Yahoo-style organization. They specialize in "finding, categorizing and organizing online databases."

Latest Web Assessment

    Ever wonder how big the Web really is? The latest assessment comes from Steve Lawrence and Lee Giles of the NEC Research Labs. Based on a random sampling of IP addresses, they estimated that there are 2.8 million active websites containing 800 million indexable webpages in 15 TB of data (only 6 TB of text if you strip out the HTML junk). More importantly, they estimated that no single major search engine covers more than about 16% of the indexable pages, with a strong bias towards popular, U.S. and commercial sites. Request the free report.
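
    For a sense of scale, the quoted estimates can be turned into a few per-site and per-page averages. The short Python sketch below does only that arithmetic. It assumes decimal terabytes (10^12 bytes), which the report summary above does not specify, and the function name derived_figures is made up for this illustration.

```python
# Estimates quoted in the item above.
SITES = 2.8e6      # active websites
PAGES = 800e6      # indexable webpages
RAW_TB = 15        # total size including HTML markup
TEXT_TB = 6        # size after stripping HTML
COVERAGE = 0.16    # share of indexable pages covered

BYTES_PER_TB = 10**12  # assumption: decimal terabytes


def derived_figures():
    """Averages implied by the quoted estimates (simple division only)."""
    return {
        "pages_per_site": PAGES / SITES,
        "raw_kb_per_page": RAW_TB * BYTES_PER_TB / PAGES / 1000,
        "text_kb_per_page": TEXT_TB * BYTES_PER_TB / PAGES / 1000,
        "markup_share": 1 - TEXT_TB / RAW_TB,
        "pages_at_16_percent": COVERAGE * PAGES,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:,.2f}")
```

    Under those assumptions the averages come to roughly 286 pages per site, about 19 KB per page (7.5 KB of it text), markup making up 60% of the bytes, and 16% coverage corresponding to about 128 million pages.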