John of Holywood was a monk of English or Irish origin, who taught astronomy at Paris in the first half of the thirteenth century [JH]. Also known by his latinized name Joannes de Sacrobosco (Sacrobusto, Sacrobuschus), he owes his fame to the Tractatus de Sphæra, a popular textbook on spherical geometry which was widely used at European universities over the next 300 years. He also wrote de Algorismo, a treatise on the arithmetic of positive integers --- including the extraction of square and cube roots --- using Arabic numerals [DS].
The Hollywood Constant, which is the subject of this page, has no relationship whatsoever to good old single-elled John. Denoted henceforth by H*, it is defined as the smallest non-negative integer that has never been used in the title of a movie.
Some misguided souls have tried to discredit this line of research by pointing out that, since new movies are continually being made, the Hollywood "Constant" is bound to increase with time. Therefore, they argue, the search for H* would be the pointless pursuit of an ever moving target.
Those critics cite, for instance, the movies A Quarter Million Teenagers (1964) [135], Half a Million Teenagers (1970) [136], and One Million Teenagers (1985) [137], produced by the US health authorities to educate youngsters about venereal diseases. They also note that the Hindi movie Four Hundred Million, released in 1946 [138], was eventually superseded by One Billion in 1991 [139]. They claim to see in these examples a general trend that, over the XXth Century (Fox), has led to the replacement of million [140] by billion [141] in many other important applications. This phenomenon is undoubtedly a consequence (or perhaps cause?) of similar inflationary trends in world population, consumer prices, military budgets, national debts, and corporate swindles.
Those critics also attribute to global warming the fact that movie titles recorded shade temperatures of 90 degrees F in 1965 [90], 92° F in 1975 [92], and a whopping 40° C (104° F) in 1976, at an unspecifiable Italian location [142].
Even if those criticisms were valid, they would not be reason enough to dismiss H* as a worthless concept. After all, physicists are quite comfortable with the idea that other universal parameters, like the gravitational "constant", may vary with time. Besides, if the interval between changes in H* is longer than the attention span of the average reader of this page, that parameter may be regarded as constant, for all its practical uses. (Which are not exactly overwhelming, are they?)
Moreover, the variability of H* is by no means proved, or even probable. For instance, the "global warming" theory has been largely discredited by recent data. In 1986 --- ten years after the Italian record above --- cinematographic temperatures had dropped slightly to 37.2 degrees C [143]. By 1997 they were down to 26° C [144], and three years later they were a mere 75° F (23.89° C), in full Texas Summer [145]. Considering that, back in 1955, 40 degrees C were recorded at Rio [146], we see no reason to worry about global warming in movies. Besides, most theaters have very good air conditioning these days.
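Readers who prefer to check the thermometer themselves can put all the cited readings on a single scale. The sketch below is only illustrative: the list `readings` transcribes the years and temperatures quoted in the text, and the helper name `to_celsius` is our own invention.

```python
# Cinematographic temperature readings quoted in the text: (year, value, unit).
readings = [
    (1955, 40.0, "C"),
    (1965, 90.0, "F"),
    (1975, 92.0, "F"),
    (1976, 40.0, "C"),
    (1986, 37.2, "C"),
    (1997, 26.0, "C"),
    (2000, 75.0, "F"),
]

def to_celsius(value, unit):
    """Convert a reading to degrees Celsius (hypothetical helper)."""
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unknown unit: {unit!r}")

# Readings in chronological order, all expressed in Celsius.
celsius_by_year = {
    year: round(to_celsius(value, unit), 2)
    for year, value, unit in sorted(readings)
}

for year, temp in celsius_by_year.items():
    print(f"{year}: {temp:.2f} C")
```

On this common scale the 1955 and 1976 readings tie at 40° C, and the 2000 reading comes out at 23.89° C, as stated.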
On a more fundamental level, statistical studies of the viewable movie universe have revealed a steady increase in the rate of re-makes, re-hashes, re-releases and re-runs of old movies. After trying hard but in vain to explain away these disturbing results, many researchers have abandoned the popular idea of an unbounded movie universe, and postulate instead a gradual slow-down of its expansion, leading toward a limiting repertoire of finite size --- or maybe even to a phase of contraction and collapse, whereby what once was a vast, mostly empty-headed market will be ultimately crushed into a tiny theater showing only flickering, hand-cranked black-and-white shorts. In that case, H* will be not only well-defined, but a parameter of cosmological significance.
Theoretical research on H* has been rather inconclusive so far. The pigeonhole principle [PP] would seem to imply that H* cannot exceed the number M of movies ever made; but the magnitude of this parameter is uncertain as well. Researchers at the Internet Movie Database project [MD] are confident that M does not exceed 10^7, but have offered no justification for this guess. It is widely believed that M does not exceed the product P × D, where D is the number of days since the invention of the moving picture, and P is the number of people who were alive in the world at some time within that interval; which would imply H* < 10^14. However this estimate assumes that no director has produced more than one movie per day, on the average, over his lifetime; an assumption that responsible researchers in the field are reluctant to make.
Moreover, scientists have recently discovered that a single movie title may contain two or more numbers [147]. Since there seems to be no upper bound on the length of a movie title [11], this unwelcome discovery clearly invalidates the pigeonhole argument. Sad to say, the theory of H* is now back to square zero.
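The pigeonhole argument, and its collapse, can be illustrated with a toy computation. The sketch below assumes that each title has already been reduced to the set of integers it contains; the function name `smallest_unused` and the sample data are hypothetical, chosen only to show the two cases.

```python
def smallest_unused(titles):
    """Smallest non-negative integer absent from every title's number set."""
    used = set()
    for numbers in titles:
        used.update(numbers)
    n = 0
    while n in used:
        n += 1
    return n

# One number per title: M titles cover at most M integers, so the
# smallest unused integer can never exceed M.
one_number_each = [{0}, {1}, {2}, {7}]
bounded = smallest_unused(one_number_each)

# A single title carrying several numbers breaks the bound.
multi_number = [{0, 1, 2, 3, 4}]
unbounded = smallest_unused(multi_number)

print(bounded, len(one_number_each))
print(unbounded, len(multi_number))
```

In the first universe the result stays within the number of titles; in the second, one long-winded title is enough to push it past M.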
The disappointing lack of progress on the theoretical front has been partly offset by extensive experimental work [1..84], which concluded that
»»» H* = 85 «««
Strictly speaking, the experimental work is still incomplete, so 85 is only a lower bound for H*. While American movies have been well researched already, thanks to the extensive database available on the net [MD], coverage of European movies is still rather patchy; and most of the other film cultures of the World remain virtually unexplored. According to some researchers, this lower bound is bound to increase, by leaps and bounds, as more movies are examined. (After 85, the next candidate values for H* are 118 [86..117] and then 126 [119..125].)
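The list of candidate values can be reproduced from the citation ranges alone. The sketch below assumes that 0, together with the cited ranges [1..84], [86..117] and [119..125], are the only numbers observed up to 126; that assumption, and the function name `candidates`, are ours and not part of the experimental record.

```python
def candidates(used, limit):
    """Return the integers in 0..limit that are missing from `used`, in order."""
    return [n for n in range(limit + 1) if n not in used]

# Numbers assumed observed: 0, plus the citation ranges given in the text.
used = {0} | set(range(1, 85)) | set(range(86, 118)) | set(range(119, 126))

gaps = candidates(used, 126)
print(gaps)  # [85, 118, 126]
```

Should a valid 85 ever turn up, adding it to `used` promotes 118 to the head of the list.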
On the other hand, the experimental data strongly suggest that H* is indeed 85. Sifting through more than 3 × 10^5 runs, and an unbearable number of re-runs, scientists found not a single valid observation of the number 85 in a movie title. Even more remarkable is the fact that the next 32 numbers after 85, from 86 through 117, have all been used in bona-fide movie titles. This combination of a single null result followed by 32 non-nulls cannot possibly be due to chance, and screams (in Dolby Digital Surround) for a scientific explanation.
It must be no coincidence, either, that 85 is also the atomic number of astatine --- the first element of the periodic table that has not been detected anywhere in the natural universe. Given all this evidence, even the most skeptical reader will surely agree that there is a universal prejudice against 85 --- not only from Universal Studios and other movie producers, but from God Himself.
Whatever repulsive force is keeping movies away from 85, it seems to have affected nearby numbers as well. A thorough literature search turned up only two occurrences of 86: by E. Shroff in India (1986) [86], and by S. Ishii in Japan (1989) [148]. These experiments from remote parts of the world haven't been adequately reviewed by Western researchers, so their merit (scientific as well as artistic) is still open to question. (We cannot avoid noticing that 86 is exactly twice 43, the atomic number of technetium, which is the first element with no stable natural isotopes; and it too has been seen only in remote parts of the Universe.) There is also only one dubious occurrence of 87, recorded by N. Jani in Malaysia (1987) [87] --- which may be a repeat of the 1982 Stanford Monopole Event [MM].
To some physicists, the narrow spectral gap around 85 is a sure sign of a new family of subatomic particles to be found, and billion-dollar science projects to be launched. One group is already seeking financial support for such a project, whose goal is the artificial synthesis of a full-length movie to be named The 85 Resonance. They are careful to point out, however, that this artificially produced movie may be very difficult to observe, since it is expected to vanish from theaters within 10^-6 seconds from its creation.
Researchers who plan to move into this exciting field should acquaint themselves with its fundamental concepts and paradigms. First, even though the constant is named after a small town in North America, it is meant to be a fundamental cosmological parameter like pi, c, or 42. Hence, in its definition one should consider not just Hollywood productions, but all movies ever made in this universe, of all genera.
However, as in any respectable science, a movie only qualifies if it has been published in respectable theaters, and was reviewed (or at least viewed) by a substantial segment of the movie-going community. So home movies do not count, and neither do unfinished movie projects, imaginary movies, movies-within-movies, etc. Moreover, we note that in all scientific fields there is a sharp distinction between technical journals like Nature, that may be cited, and popular science magazines like Discovery, that should not. Accordingly, in measurements of the Hollywood Constant, one should only consider real full-length features made for the theater. TV movies --- including shows, serials, soap operas, short documentaries, newscasts, and the like --- do not count. So you can forget about Three's Company, Sixty Minutes, and even Babylon 5 --- unless they come out in "real movie" versions, of course. For the same reason, short films and newsreels --- which are analogous to abstracts and short communications in other fields --- do not qualify as scientific evidence. (One may make an exception for very old short movies, which were "main features" by the standards of their time.)
Researchers generally agree that, in measurements of H*, cardinal numbers [149] and ordinal numbers [7] are equally acceptable. The numbers may be written in digits [102], in words [138], or a mix of the two [150]. Numbers-in-words include ordinary number phrases [41], digit by digit spellings [123], or, again, a mix of the two [151]. Roman numerals like VIII [8] are OK, but watch out --- not every XXX [152] is a 30 [153].
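Reading Roman numerals is mechanical enough to sketch. The function below, `roman_to_int`, is a hypothetical helper that applies the usual subtractive rule without validating its input. It reports the nominal value only, and cannot tell whether a given XXX was meant as a number; that remains a matter for the researcher's judgment.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    """Nominal value of a Roman numeral, using the subtractive rule."""
    total = 0
    previous = 0
    # Scan right to left: a symbol smaller than the last added one is subtracted.
    for symbol in reversed(numeral.upper()):
        value = ROMAN_VALUES[symbol]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total

print(roman_to_int("VIII"))
print(roman_to_int("XXX"))
```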
This being an exact science, one may --- indeed, must --- evaluate any arithmetic expressions appearing in the title, like 87 + 11 [98], 3 × 3 [154], 4 × 4 [155], 8 × 8 [64], 0.5 × 6 [156], 11 × 14 [157], sqrt(0) [158], 0/oo [159], etc. Common numerical paraphrases like a pair of [160] and dozen [161] must be replaced by their numerical values. If better evidence is lacking, one may assume that, say, Seven Wives for Seven Brothers [14] is an additive operation that yields 14 people up a mountain somewhere in America [162]. Needless to say, oo (infinity) [163] is not an integer, and oo - 4 is still oo [164].
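The simpler expressions can be evaluated mechanically. The sketch below is one possible approach, not an established procedure of the field: the function name `evaluate_title_expression` is hypothetical, it handles only the four basic operations on plain numbers, and it deliberately leaves sqrt, infinity, and verbal paraphrases to human researchers.

```python
import ast
import operator

# Binary operators allowed in a title expression.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate_title_expression(text):
    """Evaluate a simple arithmetic expression such as '87 + 11' or '3 × 3'."""
    # The multiplication sign used in titles is mapped to Python's '*'.
    tree = ast.parse(text.replace("×", "*"), mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        # Anything else (names, function calls, ...) is rejected.
        raise ValueError(f"unsupported expression: {text!r}")

    return walk(tree)

print(evaluate_title_expression("87 + 11"))
print(evaluate_title_expression("8 × 8"))
```

Walking the syntax tree, rather than calling `eval`, keeps the evaluator from executing anything other than arithmetic.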
In any case, an occurrence of a number in a movie title only qualifies as a valid datum if it is an integral part of the title itself --- as opposed to the serial number identifying an episode or sequel. Thus, Richard III [165] is a valid instance of the number 3, while Star Trek IV [166] is not an instance of 4. This stern rule holds even for title-less artsy-type movies that are known only by their serial number [167]. For the same reason, one should disregard year numbers that merely distinguish the movie from previous versions, like Boccaccio '70 [168], Boccaccio '91 [169] or Airport '77 [170]. A year number only counts if that specific date plays a significant role in the movie, as in Woodstock '94 [94] or 2001: A Space Odyssey [171].
Citable occurrences must be explicitly and completely defined in the title itself. Thus, The Brand of Satan [172] is not a valid occurrence of 666. In general, one should ignore any number whose value is unspecified [173], or only partly specified, such as handful [174], odd number [175], wrong number [176], etc. This rule also excludes formulas with undefined variables, such as N - 27 [177] or unspecified factors, whether they are constant [178], variable [179], or accidentally omitted [180]. In particular, one should avoid numbers that are intentionally generic [181]. Please ignore those befuddled mathematicians who claim that N is a Number [182], and steer clear of "numbers" that are defined as non-numbers [183] --- lest you become another casualty of Russell's Barber Paradox [BP]. And, needless to say, death is not a number [184].
Measurement units [75] and numerical classifiers [149] should be ignored: just take the number and ignore the unit. (Watch out, however, for the difference between the SI unit of time [185] and the ordinal two [186].) It is not allowed to recast such titles in alternative measurement units, as in *96,561 Kilometers Under the Sea [187], *637 Yen a Minute [188], or *Celsius 232 [189] --- unless the conversion was actually used as the official release title in some other country [31]. The rule applies even to currency conversions within the same measurement system: for this research, Le diamant de cent sous [190] is not worth a single franc. Along the same line, it is illegal to numeralize words like nickel [191], July [145], midnight [192], millennium [193], platoon [194], etc.: while they imply a definite position in a series, or a definite quantity of something, they are not abstract numbers that could be used for other things. And, finally, note that Nine to Five [195], which definitely counts for both 9 and 5, does not cover 10, 11, 12, 1, 2, 3, and 4 as well.
Sometimes the technical details of what counts as a number become research problems in themselves. For instance, how should we handle fractions like ½ × ½ × ¾ [196] and 9/10 [197]? Should they be truncated down with floor(), or rounded off to the nearest integer? Rounding is definitely more accurate, but then shall we count Fellini's classic 8½ [198] as 8, or as 9? And, when evaluating 8½ × 11 [199], should we round the fraction before or after the multiplication? Should we read seven percent [200] as the integer 7, or as the fraction 0.07? If we opt for the latter, consider 99 and 44/100% [99] --- is that 99.44 or 99.0044? And how should we handle compound non-metric measurements like 6'4'' [201]? Is Planet X [202] located three orbits beyond planet VII, or halfway between planets W and Y? Are we allowed to count 2 + 2 = 6 [203] under 4 as well as 6? Is 007 [204] a signed int, or an octal char? And so on.
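The stakes of the rounding question are easy to quantify for 8½ × 11. The sketch below uses exact fractions and assumes, purely for illustration, that halves round upward; Python's built-in `round` sends halves to the nearest even integer instead, which would give yet another set of answers. The helper name `round_half_up` is ours.

```python
from fractions import Fraction
import math

def round_half_up(x):
    """Round to the nearest integer, with halves going up (an assumed convention)."""
    return math.floor(x + Fraction(1, 2))

a = Fraction(17, 2)  # 8½
b = 11

results = {
    "floor before multiplying": math.floor(a) * b,
    "floor after multiplying": math.floor(a * b),
    "round before multiplying": round_half_up(a) * b,
    "round after multiplying": round_half_up(a * b),
}

for rule, value in results.items():
    print(f"{rule}: {value}")
```

The four conventions yield 88, 93, 99, and 94 respectively, so a single title could be claimed for four different integers.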
Naturally, there are many other parameters of scientific interest that can be used to further characterize the relevant phenomena. For instance, the second-order Hollywood constant H*_2 is defined as the smallest integer that has not been used in two or more different movie titles. This concept trivially generalizes to the kth order constant H*_k. A preliminary analysis of the data strongly suggests that H*_2 = 59 [58,205,206].
Conversely, the Hyperbolic Hollywood Constant, denoted H* (read H-sup-superstar), is the largest integer that has ever been used in a movie title. At this time we only know that H* is at least 10^12 [207]. Obviously, this concept too can be generalized to higher orders, and it has already been established that H*_2 is at least 10^9 [208,139,141,209].
Actually, the Hollywood Constant H* is merely the first non-negative integer zero of the Hollywood Function h*(n), which is the number of movies that use the number n in their titles. Note that all the generalized constants above are merely features of h*.
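The relationship between h* and the various constants can be made concrete on a miniature movie universe. The sketch below is a toy model under stated assumptions: each title is represented by the set of integers it contains, the data in `toy_titles` are invented, and the three function names are hypothetical.

```python
from collections import Counter

def hollywood_function(titles):
    """Map each integer n to the number of titles that use it, h*(n)."""
    h = Counter()
    for numbers in titles:
        h.update(set(numbers))  # a title counts once per distinct number
    return h

def hollywood_constant(h, k=1):
    """Smallest non-negative n used in fewer than k titles (k-th order constant)."""
    n = 0
    while h[n] >= k:
        n += 1
    return n

def hyperbolic_constant(h, k=1):
    """Largest n used in at least k titles."""
    return max(n for n, count in h.items() if count >= k)

# Hypothetical miniature movie universe: the numbers found in each title.
toy_titles = [{0}, {1}, {1, 2}, {0, 3}, {2}, {9}]
h = hollywood_function(toy_titles)

print(hollywood_constant(h), hollywood_constant(h, k=2))
print(hyperbolic_constant(h), hyperbolic_constant(h, k=2))
```

In this toy universe the first-order constant is 4 and the second-order constant is 3, while the hyperbolic constants of first and second order are 9 and 2.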
Although we still do not know the value of h*(n) for any n, it is conjectured that the function is as irregular as the sequence of prime numbers, and therefore just as fascinating and useful. Most of the basic questions about h* are still quite open. Certain numbers, like 7, 12, and 418, seem to be much more popular than others: is this a feature of h*, or merely an experimental bias? If those peaks are real, are their locations algorithmically computable? And what is the explanation for those occasional titles with extremely high random-looking values, like 16643225059 [210] and 701112078 [211]: experimental errors, cosmic rays [CR], or sporadic simple groups [SG]?
On the other hand, if h* is actually smooth, does it decay according to Zipf's Law [ZL], or does it thin out as 1/log(n), like the density of prime numbers [PD]? Does it satisfy any simple recurrence or differential equation? Does the inverse of its summation outgrow Ackermann's function [AF]? Are all its nontrivial complex zeros located on the line Re(z) = ½ [RH]?
Presumably, the function h*(n) admits an analytic continuation to the whole real line. This extension may provide a sounder framework for the study of movie titles with fractions [212] and non-standard real numbers [213]. At present, those movies are ascribed to measurement errors, but they may turn out to be as fundamental and revolutionary for this field as the fractional-charged quarks were for nuclear physics.
The analytic extension would also allow the study of h*(n) for negative numbers [214] or even over the whole complex plane [215]. Another exciting prospect is the generalization of h*(n) to encompass vector-valued [216] and tensor-valued [217] movie titles.
The study of h* is a terribly important problem that will only be solved through intensive international cooperation and interdisciplinary research. Properly managed and marketed, it could be used to swindle huge quantities of grant money from governing agencies. Its challenging intellectual aspects could fill a lifetime --- or at least a rainy afternoon. So what are you waiting for?
Links to movie descriptions are a courtesy of The Internet Movie Database [MD], which was essential for this research. Initial data gathering for this project was supported by petty cash from my Dad.
Original text written January 15, 2004.
Minor corrections made on May 15, 2005.
Some broken links fixed on May 25, 2007.
Last edited on 2007-05-25 21:15:25 by stolfi