Science  People  Locations  Timeline
Index: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Home > String (computer science)


 Contents
In various branches of mathematics and computer science, strings are sequences of various simple objects (symbols, tokens, characters, etc.). One may consider sequences of characters (a character string, string literal, or literal string), tokens in a formal grammar, states in automata, representations of DNA, or bits (a binary string).

1 Formal theory

One starts with a non-empty finite set Σ called an alphabet. Elements of this alphabet are called characters. A string (or word) over Σ is any finite sequence of characters from Σ.

A particularly important string is the sequence of no characters, called the empty string. The empty string is often denoted ε or λ. Note that one does not allow infinite sequences of characters.

For example, if Σ = {0, 1}, strings over Σ are of the form

ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, …

The set of all strings over Σ is denoted Σ*. One can define a binary operation on Σ* called string concatenation. If s and t are two strings, their concatenation, denoted st, is defined as the sequence of characters in s followed by the sequence of characters in t.

For example, if s = bear and t = hug then st = bearhug and ts = hugbear.

String concatenation is an associative, but non- commutative operation. The empty string serves as the identity elementIn mathematics, an identity element (or neutral element is a special type of element of a set with respect to a binary operation on that set. It leaves other elements unchanged when combined with them. The term identity element is often shortened to ident. In algebraicAbstract algebra Abstract algebra is the field of mathematics concerned with the study of algebraic structures such as groups, rings and fields. The term "abstract algebra" is used to distinguish the field from " elementary algebra" or "high school algebr terms, the set Σ* forms a monoidIn abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single, associative binary operation and an identity element. In other words, it is a unital semigroup. Definition A monoid is a magma (M, ), i. a set M with binary op under string concatenation. In fact, Σ* is the free monoid generated by Σ.

The length of a string is the number of characters in the string. The length can be any natural numberNatural number can mean either a positive integer ( 1, 2, 3, 4,. or a non-negative integer ( 0, 1, 2, 3, 4,. Natural numbers have two main purposes: they can be used for counting ("there are 3 apples on the table"), or they can be used for ordering ("this. The length of the empty string is 0. Algebraically speaking, the length function defines a monoid homomorphism from Σ* to N (the natural numbers with addition).

A string s is said to be a substring of t if there exist two strings u and v such that t = usv. One, or both, of u and v can be empty. The relationIn mathematics, the concept of binary relation is exemplified by such ideas as "is greater than" and "is equal to" in arithmetic, or "is congruent to" in geometry, or "is an element of" or "is a subset of" in set theory. Definition Formally, a binary rela "is a substring of" defines a partial order on Σ*, the least element of which is the empty string.

More often, especially in computing applications, one is interested in a different kind of ordering on the set of strings. If the alphabet Σ is well-ordered (cf. alphabetical order) one can define a well-ordering on Σ* called lexicographical order. The empty string is also the least element with respect to this ordering.

A set of strings over Σ (i.e. a subset of Σ*) is called a formal language over Σ. Note that while the alphabet is a finite set and every string has finite length, a language may very well have infinitely many member strings. In fact, Σ* itself is always an infinite language. Important examples of formal languages include regular expressions and formal grammars.



Read more »

Non User