- Human Search Assistants -
The interactive search for content on the world wide web
Critical evaluation plan about the financial implementation of:
" The combination of call-centres on the Internet, search engines and videoconferencing"
Statement : "Being an expert and being an expert at finding things on the world wide web
are two different things."
Vision : "To make the web into a library managed by everybody!"
Mission : "Financing the peer to peer chain of human intervention for localisation on the web"
Preface : the False Mirror of the World
The World Wide Web gives a false image of the world because so much information is
hidden, not accessible or not up to date. Even when you find the information you want,
you have to be careful to remember where to find it again.
The databases of the entire world wide web, with their web sites and archives, could be compared to an artificial world of knowledge, changing every day but never up to date:
the location of the archives changes, and web sites must be updated at least every three months.
A frequently used search engine such as "Yahoo" returns outdated results.
Television, by contrast, is a current and direct medium: it changes while we are watching.
I still spend more time watching television than at my computer browsing the Net.
If I want to go into detail, I look up a URL or a more specific web site.
Could it be possible to combine specific information while watching television? Anyway,
the internet is a transaction channel that is underestimated as an information-channel.
The model is based on the business plan of "ARJIL & Associés Banque"
(without "Investment Highlights" and without "Financial Model"); during her course
Mrs. Jessica Pavec-Burke also provided us with a real example of a business plan in
the New Economy and explained her critical evaluation.
Unfortunately, a comprehensive comparison with the other Belgian company involved in
human search assistance on the Web, "Web-diggers", was not possible because the curator,
Mr. Leplat, would not give me the necessary information and figures.
As a result, this business proposal must be considered an exercise and has no purpose other than to
provide satisfactory insight into the practice of making a business proposal.
The figures in the financial model may have changed by the time of the presentation in Valencia, as I am still awaiting data and figures.
By then I expect to have refined the financial model (in Excel) and to have added the strategic plan and the SWOT analysis.
A file with the patent will be at your disposal at the presentation, and I am confident that the drawings of the patent will make everything clear.
-------------------------------------------------------
1. Introduction - Statement: "The Internet is Not Our Friend!"
Definitions and developments:
a. What is the Internet?
a1. The Internet is the world's single biggest networked community; it spans every continent,
thousands of networks, millions of computers, and hundreds of millions of people.
From a marketing perspective, the more people who use the Internet, the more important it is
to be represented on it, and in recent years businesses have acquired Internet connections,
electronic mail addresses, and Web sites in an attempt to make themselves accessible to the
Internet community.
a2. The Internet is "the global information system that
1. is logically linked together by a globally unique address space based on the Internet Protocol (IP) or its subsequent extensions/follow-ons,
2. is able to support communications using the Transmission Control Protocol/Internet Protocol (TCP/IP) suite or its subsequent extensions/follow-ons, and/or other
IP-compatible protocols, and
3. provides, uses or makes accessible, either publicly or privately, high-level services layered on the communications and related infrastructure described herein." (Federal Networking Council, in a resolution passed on October 24, 1995).
a3. "a chaotic repository for the collective output of the world's digital 'printing presses'"
(Lynch 1997). This third definition is the closest to ours.
Question: how do you find a place where you can register your stolen camera world-wide?
b. What is the World Wide Web?
The World Wide Web is an open ended networked hypertext system.
Links in a document can refer to other documents on other machines in other parts of the World.
The Web can provide a uniform interface to other information services.
The Web can deliver data such as images, video and sound as well as HTML and text, and can now also host virtual systems such as multicast.
The Web is fundamentally client/server in concept, but this has changed in recent years.
Until now, Internet communities have been limited by the flat interactive qualities of email and network newsgroups,
where people can exchange recommendations and ideas but have great difficulty commenting on one another's postings, structuring information, performing
searches, and creating summaries. Peer to peer challenges the authority of the client/server model, allowing shared information to reside instead with producers and users. Peer to peer
networks empower the user to collaborate on producing and consuming information, adding to it, and building communities around it.
Because this is a peer to peer project, many companies will be involved.
Peer to peer is a class of applications that take advantage of resources (storage, cycles, content, human presence)
available at the edges of the Net.
Because accessing these decentralised resources means operating in an environment of unstable connectivity and unpredictable IP addresses,
peer to peer nodes must operate outside the DNS and have significant or total autonomy
from central servers.
The finding of information with "human search assistants" will be structured as a peer to peer network.
Recent developments of peer to peer:
- Example 1: Napster has demonstrated that systems that work with
the economic logic of the Internet rather than against it can have astonishing growth.
Power is gradually shifting to the individual for things like stock brokering and buying
airline tickets.
Media businesses that have assumed such shifts wouldn't affect them are going to be taken by surprise when millions of passive consumers are replaced by millions
of one-person media channels.
Such architecture erases the people-based distinction between
provider and consumer just as surely as it erases the computer-based distinction between
server and client.
- Example 2: Freenet. It is reported (Nature 401, 1999, "Diameter of the W.W.W.") that the Web is a small-world network with a characteristic path length of 19 clicks. This means that from any given web page it is possible to surf to any other of the 800 million reachable pages in existence in about 19 clicks on average.
Such a path can only be constructed by an intelligent agent such as a human, while an unintelligent robot choosing links at random would get nowhere.
The only option for such an unintelligent robot is brute-force indexing: a robot attempting to locate a web page at a distance of 19 hops would need to index at least a full 10% of the Web.
The Freenet-network has an index of local files stored in neighboring servers and when a Freenet-server receives a query it first checks to see if the query can be satisfied locally.
If it cannot, it uses the local index to decide which server to forward the request to.
Freenet discards unpopular data, so people flooding the system with unpopular data are ignored.
- Example 3: Gnutella. Each client in the network is concerned only with the files it has stored locally. If the query can be satisfied locally,
the client sends a response; if not, it does not respond at all.
The query is essentially broadcast to all computers on the Gnutella network. Each Gnutella client can interpret the query however it sees fit.
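The two query strategies described in Examples 2 and 3 can be contrasted in a small sketch. It is only an illustration under stated assumptions, not the actual Freenet or Gnutella protocol: the five-node network, the file names, the routing hints in INDEX and the hop limits (ttl, max_hops) are all hypothetical and were added so that the example is finite and runnable.

```python
from collections import deque

# Hypothetical toy network: node name -> names of its neighbours.
NEIGHBOURS = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

# Hypothetical local stores: node name -> file keys held locally.
FILES = {
    "A": set(),
    "B": {"poem.txt"},
    "C": set(),
    "D": set(),
    "E": {"song.mp3"},
}

# Hypothetical local indexes (Freenet-style): for each node, the
# neighbour it believes is the best one to forward a given key to.
INDEX = {
    "A": {"song.mp3": "C", "poem.txt": "B"},
    "B": {"song.mp3": "D"},
    "C": {"song.mp3": "D"},
    "D": {"song.mp3": "E", "poem.txt": "B"},
    "E": {},
}


def gnutella_flood(start, key, ttl=4):
    """Broadcast the query to every node within `ttl` hops of `start`.

    Each node looks only at its own files: it answers on a hit and
    stays silent otherwise. Returns the sorted names of the nodes
    that answered. The hop limit `ttl` is an assumption of this sketch.
    """
    answered = []
    seen = {start}
    queue = deque([(start, ttl)])
    while queue:
        node, hops_left = queue.popleft()
        if key in FILES[node]:
            answered.append(node)
        if hops_left == 0:
            continue
        for nxt in sorted(NEIGHBOURS[node]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return sorted(answered)


def freenet_forward(start, key, max_hops=10):
    """Forward the query along one path, one node at a time.

    Each node first checks its own files; on a miss it consults its
    local index and forwards to a single neighbour. Returns the path
    travelled up to the node holding the key, or None on a dead end.
    """
    path = [start]
    while True:
        node = path[-1]
        if key in FILES[node]:
            return path
        nxt = INDEX[node].get(key)
        if nxt is None or nxt in path or len(path) > max_hops:
            return None
        path.append(nxt)


if __name__ == "__main__":
    print(gnutella_flood("A", "song.mp3"))
    print(freenet_forward("A", "song.mp3"))
```

In the flooding version every reachable node sees the query, whereas the forwarding version contacts only the nodes on a single path. Neither has a point where the question is interpreted by someone with domain knowledge.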
c. Those three examples miss something: a hierarchy.
If we could build a system that introduces into the network a collecting point of high-value interpretation and domain knowledge, we could intervene in the chaotic node
structure and develop a network that runs from the top down and provides a chain
of answers to precise questions.
The relation between a head-librarian and a librarian
will provide this truncated hierarchy.
The fact that there is no longer a client/server relation does not mean that there should no longer be any kind of hierarchy.
Answer to the stolen-camera question above: serial.stealitback.com
This could be answered by a head-librarian!
2. The Project
a. Description of the problem
Reference definition: the system = the network of head-librarians = InternetV (company name)
1. Introduction
Wandering around the Web might take an hour or two.
Or half a weekend.
Here comes the challenge.
What a (professional) searcher needs is someone to make this kind of problem go away.
We all hear about the wonderful, evolved powers of today's search engines, particularly those that grind away at the Web and even try to pluck the veil from
the "Invisible Web".
I want a specialised search engine to climb over the barriers (registration, fees, passwords, formatted question-taking screens, etc.),
to conduct a serial search of all sites with archives, and even to throw in other news media when possible.
If necessary, a fixed fee could be paid for each successful search, plus fees per article.
The engine could even pass through ads when going to the original site, or add ads throughout the results.
In any case, no charges if nothing is found.
When, for instance, you look for newspaper archives, you always remain locked in to commercial sources, where the speed of archival searching is definitely cost-effective. Even if you go through all the newspaper archives available,
you still end up back at the commercial sources for the newspapers that do not have archives on their Web sites, or whose own Web site archives do not extend as far back as the commercial versions.
2. Can search engines find everything?
In summer 2000 BrightPlanet, a company specialised in search technology for the "invisible Web", revealed that the Web did not contain 1 billion web pages, as initially thought,
but 550 billion. The makers of search engines know this: it is information that is not accessible to the current generation of spiders, which only look for static web pages reachable by links.
But there is more than HTML pages alone: more and more information sits in databases, whose extent, unlike that of static web pages, can no longer be verified.
One can only access these databases "on the fly", when the user asks for the information.
Pages from the telephone book, the online encyclopaedia, museum information, government publications, statistics, archives, flash animations, PDF etc. are generated directly from databases, and the user is not aware that these grow even faster than the traditional sites.
An insight into those databases is given by Complete Planet, which references 27,000 of the 200,000 databases that are not known.
3. The objective
To build a financial structure able to organise a world-wide peer to peer network of independent human librarians capable of indicating the exact location of information on the World Wide Web.
4. Definition (American patent):
Method for searching on the internet using at least one human search assistant who helps a user
find information on the internet, whereby this assistant is an expert in searching on the internet.
The human assistant tells the user where the information he is looking for can be found on the World Wide Web, or gives him the information found on the World Wide Web.
Method in which the human search assistant has such expertise in searching on the internet that he can be considered a web librarian,
and is able to give more information than the place to look on the World Wide Web and is able to supervise the user consulting the internet.
Method in which the human search assistant preferably makes use of search engines for searching on the internet, and in which the communication between the user and the human search assistant is on-line, visual and in real time.
Method wherein the communication between the user and the human search assistant works by voice recognition.
Method wherein the human search assistant is consulted via one of the following devices:
A computer, a mobile phone, a palmtop or an interactive television apparatus or the set-top box associated therewith.
The method in which the user makes contact with the same search assistant by means of voice
recognition, iris recognition or fingerprint recognition.
Method using several human search assistants on the Web, whereby one or more head human search assistants (head librarians) have a number of adjunct human search assistants who
each have their specialised domain, and whereby the head librarian can channel the question of the user to a librarian.
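The last claim, in which a head librarian channels the user's question to an adjunct librarian with a specialised domain, can be sketched as a small data structure. This is only an illustration: the patent text above does not say how the head librarian chooses, so the keyword-overlap rule, the class names and the example domains are all assumptions of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Librarian:
    """An adjunct human search assistant with a specialised domain."""
    name: str
    domains: set  # hypothetical keywords describing the speciality


@dataclass
class HeadLibrarian:
    """A head librarian who channels questions to adjunct librarians."""
    name: str
    adjuncts: list = field(default_factory=list)

    def channel(self, question):
        """Return the name of whoever should handle the question.

        Assumed rule: pick the adjunct whose domain keywords overlap
        most with the words of the question; if no adjunct matches,
        the head librarian keeps the question.
        """
        words = {w.strip(".,;:?!\"'") for w in question.lower().split()}
        best, best_score = None, 0
        for librarian in self.adjuncts:
            score = len(words & librarian.domains)
            if score > best_score:
                best, best_score = librarian, score
        return best.name if best else self.name


# Hypothetical example network.
HEAD = HeadLibrarian(
    name="head librarian",
    adjuncts=[
        Librarian("art librarian", {"painting", "museum", "sculpture"}),
        Librarian("law librarian", {"patent", "court", "statute"}),
    ],
)

if __name__ == "__main__":
    print(HEAD.channel("Which museum holds this painting?"))
    print(HEAD.channel("Where can I register a stolen camera?"))
```

In practice the choice would be made by a person; the sketch only shows the shape of the hierarchy, with one collecting point in front of several specialised domains.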
5. Why collecting data from the Web is difficult
- Changing events and various types of dynamic content (ephemeral pages, dynamic database selections, superimposed versions).
- Lack of authentication mechanisms and rights management.
- Deceptive games played by some webmasters to increase their pages' visibility in search engines (Alta Vista 1999, Excite 1999).
- Separation in time and space of metadata and data.
It must also be stressed that human intervention can get around the shortcomings of computer language processing, which cannot always make sense of the user's request.
On this point, see in particular "Searching the Web for art", chapter 5, b, examples.
6. What does it mean: "The Internet is not our friend"?
It means that it is no longer the objective instrument that its founders intended it to be,
because for commercial reasons some webmasters will do anything to appear at the top of the list in the major search engines.
Examples:
1. Index spam: misleading material in meta tags
2. White text on white background at a small point size.
3. Link cardinality misuse.
4. Payment for cardinality: Ask Jeeves, Google, Yahoo.
5. Paid indexing: Goto.
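As an illustration of example 2 in the list above, the following sketch flags text whose font colour equals the page background colour. It is a simplified check under stated assumptions: it only looks at the old-style bgcolor attribute on body and the color attribute on font, compares the colour strings literally, and ignores style sheets. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HiddenTextFinder(HTMLParser):
    """Collect text inside <font color=X> where X equals <body bgcolor>."""

    def __init__(self):
        super().__init__()
        self.background = None   # value of <body bgcolor="...">, if any
        self.color_stack = []    # colours of the currently open <font> tags
        self.hidden = []         # text fragments judged invisible

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "body":
            self.background = (attributes.get("bgcolor") or "").lower() or None
        elif tag == "font":
            self.color_stack.append((attributes.get("color") or "").lower())

    def handle_endtag(self, tag):
        if tag == "font" and self.color_stack:
            self.color_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if (text and self.background and self.color_stack
                and self.color_stack[-1] == self.background):
            self.hidden.append(text)


def find_hidden_text(html):
    """Return the text fragments drawn in the page's background colour."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden


if __name__ == "__main__":
    page = (
        '<html><body bgcolor="#FFFFFF"><p>Visible text</p>'
        '<font color="#ffffff" size="1">cheap flights cheap flights</font>'
        "</body></html>"
    )
    print(find_hidden_text(page))
```

A human librarian would spot such a page at a glance; a spider has to be told explicitly what to look for, and each new trick needs a new rule.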
This is an abridged version of the original text by Francis De Smet. The original text can be requested from the author by mail.