A free lunch after all?

by David Bradley

Developers of web search engines and data mining tools expend vast sums attempting to find the most efficient ways for users to search. Mighty algorithms travail indexes with great speed and seemingly great efficacy, narrowing a keyword search to a "short" list of results within seconds. However, the "No Free Lunch Theorem" holds that there simply is no algorithm that can perform better than a random selection on the set of all problems. In other words, no matter how hard those developers try they will simply never beat a randomly selected set of "results".

In the present paper, Palotai, Farkas, and Lörincz seek to understand how the topological structure of the environment influences algorithmic efficiency and whether or not there might be a "free" lunch, after all. They compare the performance of algorithms in unearthing news from a large news website, using basic learning techniques - selective learning, reinforcement learning, and their combinations, in random, scale-free, and scale-free small world (SFSW) environments. Their empirical results suggest that selective learning is the most efficient in SFSW topology, but in non-small world topologies, combining selective and reinforcement learning algorithms gave the best results.

Evolving systems, both natural and artificial (like the Web), exhibit scale-free or scale-free small world properties. Previous researchers have shown that there is no performance difference between optimization or search algorithms if the algorithms are tested on every possible problem. This implies that differences in the performance of specific algorithms are simply a result of specific properties of the problem being looked at. By uncovering these properties, it should then be possible to develop optimized search approaches, despite the no free lunch theorem. The structure of a database or index is an important property and others have already demonstrated that an evolutionary algorithm that adapts to the results it obtains performs better in artificial scale-free environments than in lattice, small world, or random environments. Knowing this could be exploited usefully in developing more effective data mining and web search tools.

Everyday millions of users carry out millions of searches on the Web, looking, on the whole, for novel information. The search itself therefore bears its own natural reward function, explains Lörincz, the number of novel web pages or documents that the algorithms can find. The fact that the researchers did not, and could not, specify the temporal and structural details of this reward system is perhaps the most important aspect of the current paper. With this in mind the team set about designing an experiment to compare algorithms approaches.

Their controllable experiment was designed so that the complexity of experiments and the number of designer specifiable parameters were minimal and looked at the structure of the search environment, the algorithms, and the fitness values. They chose the web as the most appropriate data set as it is a large source of rapidly changing data and because of its organic evolution has an almost entirely scale-free small world (SFSW) structure. They chose a large news website containing time stamped entries and downloaded a portion of this with which to experiment. They could thus preserve the temporal structure of the rewards, and at the same time modify the underlying connectivity structure so that the way in which novel information can be found could be changed to test different search methods. They used a group of computer agents - Web crawlers. These travel from link to link and forage for new, not-yet-seen information. The agents used the simplest possible algorithms: selective learning concerned the selection and memorization of good links. They also used a simple version of reinforcement learning (RL). RL was used alone and in conjunction with selective learning in separate experiments.

There are two types of agents - foragers and reinforcing agents (RA). Foragers crawl the web and send back the addresses (URLs, uniform resource locators) of the selected documents to the reinforcing agent. The RA is a simple machine that sends out foragers one after the other an equal number of times. It acts as a central agent that can determine which forager finds new information first and then reinforces that forager. The RA sends reinforcements to the foragers based on the received URLs. Lörincz and his colleagues have employed a fleet of foragers to study the competition among individual foragers. The foragers then compete with each other to find the most relevant documents and so efficiently collect new relevant documents without interacting directly with each other. Foragers adapt based on reinforcements using their learning algorithms. The central agent has been used to model real life, where there is no actual agent, but where food that is eaten in any given area by one forager is obviously then unavailable to foragers that follow, so they are not reinforced by searching the same region.

The researchers then applied a set of constraints on finding the minimal set of algorithms. As such, the algorithms should allow the identification of unimportant parameters, support the specialization of individual foragers, allow the melding of evolutionary learning and individual learning, and minimize communication as much as possible. The team deployed sixteen foragers to search the CNN website for relevant documents; that is pages that are less than 24 hours old and that have not been previously marked as seen. In this the download part of the experiment, the goal was to assimilate a database that is unbiased from the point of view of later modifications to its structure. The first eight forages used both selection based and reinforcement learning. The other eight foragers used only selection based learning. The selection based algorithm, known as the weblog update algorithm, learns the best starting URLs of searches from which the forager can periodically restart its search. The reinforcement learning based URL ordering update algorithm modifies the crawling directions of the foragers, i.e., the type of food (here, topic) they like. Once foraging was complete the team investigated the link structure of returned documents and found it to obey a power law function, in other words it was scale free.

The next step was to repeat the experiment but in a simulation mode rather than on the web. For this, the forager architecture was implemented in Matlab as if it were running on the same computer used in the "real" experiments. They conducted simulations with four different kinds of foragers in each environment: WR foragers used both the weblog update and the reinforcement learning based URL ordering update algorithms. WL foragers used only the weblog update algorithm without URL ordering update; these foragers each had a different variable weight factor. RL foragers used only the reinforcement learning based URL ordering update algorithm without the weblog update algorithm. Finally, fix foragers did not use the weblog update and the reinforcement learning based URL ordering update algorithms. These foragers had fixed starting URLs and fixed weights.

The researchers found that the efficiency of the algorithms depends strongly on the weight vectors associated with each forager in the simulation. In contrast, the number of starting points has little impact on efficiency. Two foragers with twenty different starting points were comparable in efficiency to sixteen foragers using 160 different starting points. The number of starting points and the number of foragers are less important than the weight vectors in the simulations. Larger numbers of crawlers and having more than one computer can only change the situation weakly, which gives the researchers confidence that their studies of structural dependencies would be valid.

They found that freshness and age values of different foragers change in the SFSW environment while for the other three environments the values are almost the same for the different type of foragers. RL and Fix foragers can get trapped in clustered environments as they cannot find outbound links in less relevant documents. The weblog algorithm provides a way to reach the best clusters that have been found and so escape the worst cluster. As such, the RL and Fix foragers in the SFSW environment perform worse than WL and WR foragers.

Finding new documents is hardest in the SFRandom environment for all foragers as incoming link degree distribution is exponential which means that there are relatively more small degree pages than in the other three scale-free environments. It is thus more difficult for foragers to reach those pages that have lots of links to many pages. Overall, finding relevant documents is easiest and download efficiency is best in the SFSW environment for all foragers. This is because of the high number of found relevant documents compared to the other three environments. However, finding new URLs is hardest in the SFSW environment because of its clustered nature. It seems that selection fits the SFSW structure and vice versa, which the researchers suggest could explain the abundance of emergent SFSW structures in nature and perhaps even the organic nature of the web itself.

This finding can be understood in terms of the no free lunch theorem by considering it in reverse. If we find that a particular algorithm is more efficient than random search or than any other algorithm, then this winning algorithm "knows" (or fits) the underlying problem better than the others. The researchers conclude from this that highly clustered SFSW structures and simple selective learning algorithms intrinsically match each other.

Consider data mining on the web. If the goal is to harvest the largest number of novel documents within the shortest time interval then, according to this study, the best approach is to use the selective weblog algorithm, provided that the domain of interest is SFSW. The scenario may be very different, however, if the goal is simply to download all novel documents. This is an intriguing problem that calls for the examination of the graphs of the not-downloaded documents for the different algorithms and, possibly, for the examination of the virtual graph between the foragers, explains Lörincz. In this virtual graph, two foragers are connected to each other if they have sent back the same document. The connection is strong (they are close to each other) if they send back a large number of identical documents. Foraging may be optimized if the foragers learn which foragers are close to them and share the knowledge about their recent paths. Distributed communication between close foragers may improve data mining without too much communication overhead. This issue is the subject of the current research efforts of Lörincz and colleagues.

This article originally accompanied the paper by Lörincz et al. in Karger's journal Complexus for which David Bradley is an editorial board member.