which may be thought of as a generalization of a harmonic number. In the limit as N approaches infinity, this becomes the Hurwitz zeta function ζ(q,s). For finite N and q = 0, the Zipf–Mandelbrot law becomes Zipf's law. For infinite N and q = 0, it becomes a zeta distribution.
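As a minimal sketch of the quantities just described, the following Python code computes the generalized harmonic number and the resulting probability mass function. It assumes the usual form of the law, in which the probability of rank k is 1/(k+q)^s divided by the sum of 1/(i+q)^s over i = 1, ..., N. The function names are hypothetical and chosen only for illustration.

```python
def generalized_harmonic(N, q, s):
    """Sum of 1/(i + q)**s for i = 1..N (assumed normalizing constant)."""
    return sum(1.0 / (i + q) ** s for i in range(1, N + 1))


def zipf_mandelbrot_pmf(k, N, q, s):
    """Probability of rank k under the assumed Zipf-Mandelbrot form."""
    if not 1 <= k <= N:
        raise ValueError("rank k must satisfy 1 <= k <= N")
    return (1.0 / (k + q) ** s) / generalized_harmonic(N, q, s)


def zipf_pmf(k, N, s):
    """Zipf's law: the special case q = 0 with finite N."""
    return zipf_mandelbrot_pmf(k, N, 0.0, s)
```

With q = 0 and s = 1 the normalizing constant reduces to the ordinary harmonic number, so `generalized_harmonic(3, 0, 1)` is 1 + 1/2 + 1/3. For any valid parameters the probabilities over k = 1, ..., N sum to one.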
If one plots the frequency rank of the words in a large text corpus against their number of occurrences (or relative frequencies), one obtains a power-law distribution with an exponent close to one (but see Gelbukh and Sidorov 2001).
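The rank-frequency relationship can be checked numerically. The sketch below is only illustrative: it counts words by naive whitespace splitting and estimates the exponent with an ordinary least-squares fit of log frequency against log rank. Both choices are assumptions made here, not a method taken from the cited literature, and the function names are hypothetical.

```python
import math
from collections import Counter


def rank_frequencies(text):
    """Word counts sorted in descending order (naive whitespace tokenizer)."""
    counts = Counter(text.lower().split())
    return sorted(counts.values(), reverse=True)


def estimate_exponent(frequencies):
    """Least-squares slope of log(frequency) vs log(rank), sign-flipped."""
    freqs = sorted(frequencies, reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two frequencies")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx
```

For frequencies that follow an exact power law, such as `[1000 / r for r in range(1, 51)]`, the estimate is 1 up to floating-point error. On real corpora a simple log-log regression of this kind is sensitive to the low-frequency tail, so the result should be read as a rough indication only.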
References and links
B. Mandelbrot (1965). "Information Theory and Psycholinguistics", in B.B. Wolman and E. Nagel: Scientific Psychology. Basic Books. Reprinted as
B. Mandelbrot [1965] (1968). "Information Theory and Psycholinguistics", in R.C. Oldfield and J.C. Marshall: Language. Penguin Books.