# Talk:Entropy of a probability distribution

Fixed by me! --Robert W King 11:04, 27 June 2007 (CDT)

## question

H is the scientific symbol for enthalpy; S is the symbol for entropy. I think this H should really be an S, as that is how it appears in my books on probability distributions. Unless it changed within the last year, I think it is still S. I could even understand using A, F or G, but not H. Also, according to the universal definition of entropy, it is unitless. Robert Tito | **Talk** 19:58, 4 July 2007 (CDT)

- Interesting. I hadn't thought of that. Traditionally, *H* is used for the entropy function in coding theory. I'm not sure about the history behind that, but now that I think back to my physics classes, you are right about *S* being entropy. Greg Woodhouse 20:38, 4 July 2007 (CDT)

- To put it further: A is the free energy of a system, F the free enthalpy, and G the Gibbs energy (but that is used solely in thermodynamics). What struck me as odd is the unit: the bit is no unit I know of, only a quantity of information and for that reason energy/entropy. When I teach statistics I use S for entropy, but then I use it in statistical chemistry/physics, and I prefer consistency in units and symbols. Robert Tito | **Talk** 21:00, 4 July 2007 (CDT)

- Well, entropy is used in coding theory as a measure of information content. A channel is defined as a potentially infinite sequence of symbols from a fixed alphabet. We assume there is a probability distribution telling us how likely each symbol is to occur. The entropy is then (unsurprisingly):

  $$H = -\sum_i p_i \log_2 p_i$$

  where $p_i$ is the probability of the $i$-th symbol and the logarithm is taken to base 2. (If the logarithms are taken to base *e*, the units of entropy are sometimes called "nats".) Intuitively, this is just a measure of how much the bit stream can be compressed due to redundancy. The logarithm to base 2 of $1/p_i$ is the number of bits that are needed to encode a symbol that occurs with probability $p_i$. People care about this, among other things, because it gives you a framework for calculating the data rate you can hope to get from a channel with a given bandwidth, assuming you use the best possible encoding algorithm (this is the famous Shannon limit). I completely sympathize with your dislike of the inconsistency here. Greg Woodhouse 21:34, 4 July 2007 (CDT)

- Shannon, in his 1946 and 1948 papers, defined the perfect enigma as the enigma that leaves zero Shannon information in the encoded result. Here the analogy with entropy breaks down, as entropy always seeks its maximum in order to reach the lowest-energy (sometimes called equilibrium) state. Robert Tito | **Talk**

- I'm not familiar with the Shannon quote, but I wonder if what he meant is that a channel consisting of purely random bits has no more capacity to convey information than one consisting of all 0's (pure noise is no more meaningful than silence). It's not at all clear to me how far the metaphor of entropy can really be pushed, but in physics/chemistry, I think of entropy as quantifying the extent to which energy in a system (motion of molecules) can be harnessed to do useful work, and in information theory - the flip side of coding theory - I think of it as quantifying the extent to which the "motion" (fluctuation) of bits can be harnessed to convey useful information. Even more intriguing (to me) is that you can take this all a step further and ask about the capacity to do useful computational work. (This was the topic, for example, of the *Feynman Lectures on Computation*.) Greg Woodhouse 09:16, 5 July 2007 (CDT)
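For anyone following the thread, the coding-theory entropy discussed above can be tried numerically. This is only a minimal sketch: it assumes the usual Shannon form, H = -Σ p_i log(p_i), and the function name `shannon_entropy` and its `base` parameter are made up for illustration, not taken from the article.

```python
import math


def shannon_entropy(probs, base=2.0):
    """Entropy of a discrete distribution, assuming H = -sum(p_i * log(p_i)).

    base=2 gives the result in bits; base=math.e gives it in nats.
    The name and signature are hypothetical, chosen for this sketch only.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing (p * log p -> 0 as p -> 0).
    nats = -sum(p * math.log(p) for p in probs if p > 0)
    # Changing the logarithm base only rescales the result.
    return nats / math.log(base)


fair_coin = shannon_entropy([0.5, 0.5])              # about 1 bit per symbol
skewed = shannon_entropy([0.5, 0.25, 0.25])          # about 1.5 bits per symbol
certain = shannon_entropy([1.0])                     # no uncertainty at all
in_nats = shannon_entropy([0.5, 0.5], base=math.e)   # about ln 2 nats
```

The only difference between bits and nats here is the constant factor ln 2, which is one way to read the question about units raised earlier in the thread.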

## Shannon information

Shannon defined a perfect enigma as one that produces a result (nowadays most likely a file) that contains no Shannon information, meaning no link can be made to the originating document or from the originating document. In layman's terms: a Shannon enigma encrypts in a way that produces perfect white noise. That was what he meant to say. Hope this is clearer now. Robert Tito | **Talk** 09:59, 5 July 2007 (CDT)

## that is the entropy S :)

the definitions are identical, hence I wonder if *H* **is** the actual symbol. Robert Tito | **Talk**
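To make the comparison concrete, here is a sketch of how the two definitions line up. It assumes the Gibbs form of the thermodynamic entropy, which is not quoted anywhere in this thread, with $k_B$ the Boltzmann constant and $p_i$ the probability of the $i$-th state or symbol:

$$S = -k_B \sum_i p_i \ln p_i, \qquad H = -\sum_i p_i \log_2 p_i$$

Since $\ln p_i = \ln 2 \cdot \log_2 p_i$, it follows that

$$S = (k_B \ln 2)\, H$$

Under that assumption the two expressions have the same form and differ only by the constant factor $k_B \ln 2$, which also carries the units (J/K), while *H* in bits is a pure number.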