


 

The Zipf-Mandelbrot law (also known as the Pareto-Zipf law) is a power-law distribution on ranked data, named after the Harvard linguistics professor George Kingsley Zipf (1902-1950), who observed the regularity in texts, and the mathematician Benoit Mandelbrot (born November 20, 1924), who generalized it.
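
For reference, the law is usually stated as a probability mass function over ranks. The symbols here are not defined in the text above and follow the common textbook convention: k is the rank, N the number of ranked items, s > 0 the exponent, and q >= 0 the shift that Mandelbrot's generalization adds.

```latex
f(k; N, q, s) = \frac{1/(k+q)^{s}}{H_{N,q,s}},
\qquad
H_{N,q,s} = \sum_{i=1}^{N} \frac{1}{(i+q)^{s}},
\qquad k = 1, 2, \dots, N
```

Setting q = 0 recovers Zipf's law, in which the frequency of the item at rank k is proportional to 1/k^s.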

The distribution of words ranked by their frequency in a random corpus of text is generally a power-law distribution, known as Zipf's law.

If one plots the frequency rank of the words contained in a large corpus of text data against their number of occurrences (or relative frequencies), one obtains a power-law distribution with exponent close to one (but see Gelbukh and Sidorov 2001).
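
The rank-versus-frequency plot described above can be checked numerically. The following is a minimal sketch, not a reference implementation: it counts words, ranks them by frequency, and estimates the exponent as the negated least-squares slope of log(count) against log(rank). The function names and the synthetic token generator are hypothetical, introduced only for illustration. A real corpus would also need tokenization, and a straight-line fit on log-log axes is only a rough estimator.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return word counts sorted from most to least frequent (rank 1 first)."""
    return [count for _, count in Counter(tokens).most_common()]


def fit_exponent(counts):
    """Negated least-squares slope of log(count) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def synthetic_zipf_tokens(vocab_size=200, scale=10000, s=1.0):
    """Toy data (an assumption): the k-th word occurs about scale / k**s times."""
    tokens = []
    for k in range(1, vocab_size + 1):
        tokens.extend([f"w{k}"] * max(1, round(scale / k ** s)))
    return tokens


if __name__ == "__main__":
    counts = rank_frequency(synthetic_zipf_tokens())
    print(f"estimated exponent: {fit_exponent(counts):.2f}")
```

To try it on real text, replace the synthetic tokens with the words of a corpus, for example the result of lowercasing the text and calling split() on it. The estimate then depends on the language and the corpus.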
