The Alphabets, Words, and Languages of Finite State Machines

The idea of a finite state machine comes from an interdisciplinary branch of mathematics called formal language theory. Its roots are also in computer science and linguistics.

Looking at the structure of our own written English language, we can describe it as being made up of building blocks called words. Words, when they are written down, are made up sequences or strings of the 26 letters in the alphabet. Not every sequence or string of letters that we put together forms a word that we recognize or accept as part of the English language.

The finite state machine can be used to check strings of letters (or any symbols) and determine whether they belong to a language or not. English, like all other natural languages that people speak, is very complex. Some other languages are simpler to think about.

When we sit at a computer keyboard and type strings of letters such as "copy"or "save", the reason that the computer complies with these requests is not because the computer recognizes what you want to do and is trying to help you out. The computer has been programmed to recognize certain strings of letters as part of its language, and to perform certain electronic functions on the basis of which string was typed in. If we misspell "save" and type "saev" or type in "keep" instead, the computer cannot guess what we mean. Instead it puts out a sometimes frustrating message to the effect of, "That word is not in my language."

Languages that are made of "words" which are "strings" of "letters" from an "alphabet" are found in many areas of science. Biologists, for example, know that proteins are made up of discrete building blocks called amino acids that can only occur in certain combinations. A DNA molecule is a long chain made up of only four building blocks, but the patterns and ordering of the elements of this alphabet is used to write out the "words" that describe the genetic material of all living things.

Whenever pattern recognition is vital to scientific inquiry, finite state machines may play an important role. Although a finite state machine cannot identify and recognize all types of patterns, it is still very powerful. It is the simplest and most basic pattern-recognizer and pattern-describer that computer scientists use.

The input to a finite state machine is always a string of symbols from the alphabet that the machine recognizes. We usually represent these symbols by a, b, c, etc, although any other symbols could be used.

The finite state machine has a finite number of states . The states are represented by the circles in many pictures of finite state machines.

The lines with arrows are called transitions . They show how the next letter of input will cause the state to change. Or, if you were an ant walking on the lines of the finite state machine, the letters would tell you which line to take as you left one state to head to another.

In the picture of Machine 1, the states or circles marked with a large dot are accept states. If this is the final state that the machine is in after all the letters of the input have been read, then the input has been accepted. When a the input is accepted, that particular string of symbols is said to be a word in the language that this particular machine recognizes.

Computer scientists team up with scientists from other disciplines and design ways to use the computer to analyze their data to recognize and identify patterns. The diagrams that depict the finite state machine are mathematical models that the scientists use to figure out how to design a system that will tell them what they need to know. Then these systems are turned into computer programs that analyze the data for the scientists.

The best way to experience the workings and power of finite state machines is to experiment with them, trying a variety of inputs on a variety of machines and seeing what happens. When the finite state machine is drawn on the floor large enough so children walk around on it, they can experience the abstract concept of transitions from state to state as moving from circle to circle.

After experimenting for a while, it will become evident that for any finite state machine, the symbols of the alphabet , the number of states, and the possible transitions are fixed and finite. Even so, an infinite number of words can be tried as input. Not only that, for some finite state machines, of the infinite number of input words you can try, an infinite number of those will be accepted, and an infinite number of them won't! This is one aspect of the power and versatility of finite state machines, as well as one of the paradoxes associated with infinity.

Article from the Los Alamos National Laboratory MegaMath Site.

The Alphabets, Words, and Languages of Finite State Machines

Jobs saved by The Boss Button: