There are statistical distributions of every imaginable type in life, some simpler than others. Yet, no matter how complex the pattern, someone will try to fit a curve to the data and postulate a theory. The authors start with Zipf's law, applied to the popularity of Web sites, under which the frequency of visits to a site is an inverse power of its rank. By ranking sites in order of popularity and computing a best-fit exponent, you can predict how often each site will be visited.
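As a sketch of the fitting procedure described above (my illustration, not the authors' code), the exponent of a Zipf-like law f(r) = C / r^s is usually estimated as the slope of a least-squares line in log-log space, since log f = log C - s log r. The visit counts here are synthetic.

```python
import numpy as np

# Hypothetical visit counts for sites ranked 1..100 by popularity,
# generated to follow Zipf's law exactly: f(r) = C / r**s.
C, true_s = 1000.0, 1.0
ranks = np.arange(1, 101, dtype=float)
freqs = C / ranks**true_s

# Best-fit exponent from a least-squares line in log-log space:
# log f = log C - s * log r, so the exponent is minus the slope.
slope, intercept = np.polyfit(np.log(ranks), np.log(freqs), 1)
s_hat = -slope
```

On real traces the points scatter around the line, and the fitted exponent is what the authors report as varying from study to study.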
From a literature review, the authors show that the exponent tends to vary widely. By adding two extra parameters, one an additive factor to a Web site’s visit frequency and the other an additive factor to its rank, they are able to stabilize the exponent to a mean of 1.02 with a standard deviation of 0.05. This tight range for the exponent is shown to be consistent across varying sample periods, sizes, and locations.
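One plausible reading of the two-parameter modification (my parameterization for illustration, not necessarily the authors' exact equation) is f + c = C / (r + b)^a, where b shifts the rank and c shifts the frequency. The sketch below shows the stabilizing effect on synthetic data: a plain Zipf fit that ignores the shifts biases the exponent, while a fit with the shifts applied recovers it.

```python
import numpy as np

# Hypothetical data following the shifted law f + c = C / (r + b)**a,
# with an exponent near the authors' reported mean of 1.02.
C, a, b, c = 5000.0, 1.02, 3.0, 2.0
ranks = np.arange(1, 201, dtype=float)
freqs = C / (ranks + b)**a - c

# A plain Zipf fit (ignoring b and c) gives a biased exponent,
# because the shifts bend the log-log plot at both ends.
slope_plain, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)

# With the additive factors applied, log(f + c) is exactly linear
# in log(r + b), and the fitted exponent recovers a.
slope_shift, _ = np.polyfit(np.log(ranks + b), np.log(freqs + c), 1)
```

This illustrates why adding the two factors can pull scattered exponent estimates toward a common value, which is the effect the authors observe across data sets.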
The authors then postulate, but do not prove, what these two additional factors mean. The additive factor for the frequency is interpreted as a correction for the finite sample size, while the additive factor for the rank accounts for the Web sites that are never queried because their pages are served from the local machine’s cache. However, all of this is irrelevant if the model does not improve predictability.
The variability in the exponent, which ranged from 0.60 to 1.03, has been reduced at the cost of introducing two orders of magnitude of variability in the frequency parameter, and a range of 0.45 to 17.82 in the rank parameter. That, at least to my mind, is an example of a worse model, not a better one.