1. Originally Posted by tashirosgt Consider a communication channel that transmits a data stream of bits. Each "message" of N bits is generated according to a stochastic process. The values of the random variable in question are the possible messages, and let's say we know the probability of each of these values. Then we compute the entropy of that random variable. That would allow us to say something about the information content of a message.

Take N = 6. It is not correct to say that a particular message, like ( 1,0,1,1,0,1) has more or less information than another message like (1,1,1,1,1,1). In fact, suppose that the six bits are chosen according to the layman's idea of "randomly", i.e. suppose that each bit is chosen independently of the others and that 0 and 1 are each chosen with probability 0.5. Knowing the first 5 bits of a message are 1 does not make it more or less probable that the last bit is 1.
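A minimal sketch of this point in Python, under the assumption stated above (6 independent bits, each 0 or 1 with probability 0.5): every particular message, (1,0,1,1,0,1) and (1,1,1,1,1,1) alike, has the same probability, and the entropy of the message distribution works out to N bits.

```python
import math

# With 6 independent fair bits there are 2**6 equally likely messages,
# so every specific message -- (1,0,1,1,0,1) included -- has the same
# probability, 1/64. No message is individually "more random".
N = 6
p_message = 0.5 ** N          # probability of any one specific 6-bit message

# Shannon entropy of the message distribution: sum of -p*log2(p)
# over all 2**N equally likely messages.
H = sum(-p_message * math.log2(p_message) for _ in range(2 ** N))

print(p_message)   # 0.015625
print(H)           # 6.0  (bits)
```

The entropy lives in the distribution, not in any one outcome, which is exactly the distinction being drawn above.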

Likewise it is not correct to say that the sequence (1,1,1,1,1,1) is more compressible than the sequence (1,0,1,1,0,1) unless you specify a method of compression. For example, a person might use a code where "X" is transmitted to represent (1,0,1,1,0,1) and other messages are represented by transmitting a "Y" for each 0 and a "Z" for each 1. (The use of acronyms in common speech is a good example of how such conventions arise in a non-mathematical setting.)
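The deliberately odd code described above can be sketched in a few lines of Python, just to make the point concrete: under this code the "random-looking" sequence compresses to a single symbol while the all-ones sequence does not, so compressibility is a property of the agreed-upon code, not of the sequence alone.

```python
# The special-cased message from the example above.
SPECIAL = (1, 0, 1, 1, 0, 1)

def encode(msg):
    """Encode per the (intentionally perverse) convention in the text:
    the one special message becomes "X"; every other message is spelled
    out with "Y" for each 0 and "Z" for each 1."""
    if msg == SPECIAL:
        return "X"
    return "".join("Z" if bit else "Y" for bit in msg)

print(encode((1, 0, 1, 1, 0, 1)))   # "X"       -- one symbol
print(encode((1, 1, 1, 1, 1, 1)))   # "ZZZZZZ"  -- six symbols
```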

To say that a completely black image has zero information conveys an idea, but it is not precise. Perhaps what is meant is that if you are dealing with a population of images where each pixel tends to be the same color as the other pixels, then the amount of information per pixel in the data stream is less than in a situation where the pixels are independent random variables. Or perhaps what is meant is that if we are dealing with a population of images, each of which is all black, then transmitting the image data conveys zero information.

Thermodynamic entropy has a different definition than Shannon entropy. However, is there a way to analyze thermodynamic entropy as a Shannon entropy? I thought statistical physics had found a way to do this.
Ok, look, you argue that one must have a way to predict a pattern.

Indeed one must. The autocorrelation function, which is the Fourier dual of the power spectrum, TELLS YOU EXACTLY WHAT KIND OF PREDICTOR CAN BE USED AS WELL AS HOW MUCH INFORMATION THERE WILL BE AFTER YOU USE THE PREDICTOR.
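One illustrative sketch of this claim, in Python with standard-library code only (the AR(1) process and its 0.9 coefficient are my choices, not anything from the thread): estimate the autocorrelation of a correlated signal, read the optimal one-tap linear predictor coefficient straight off it, and see how little variance is left in the prediction residual.

```python
import random

# Generate a correlated test signal: an AR(1) process
# x[n] = 0.9*x[n-1] + gaussian noise (illustrative choice).
random.seed(0)
x, prev = [], 0.0
for _ in range(100_000):
    prev = 0.9 * prev + random.gauss(0, 1)
    x.append(prev)

def autocorr(sig, lag):
    """Sample autocorrelation of sig at the given lag."""
    n = len(sig) - lag
    return sum(sig[i] * sig[i + lag] for i in range(n)) / n

R0, R1 = autocorr(x, 0), autocorr(x, 1)
a = R1 / R0   # optimal one-tap predictor coefficient, read off the autocorrelation

# Prediction-error variance: what information is left after the predictor.
resid = sum((x[i] - a * x[i - 1]) ** 2 for i in range(1, len(x))) / (len(x) - 1)

print(a)            # close to 0.9: the predictor recovered from R(1)/R(0)
print(resid / R0)   # close to 1 - 0.9**2 = 0.19: most of the variance predicted away
```

The autocorrelation alone determined both the predictor and how much residual (hence how much remaining information, under a Gaussian model) survives it.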

Yes, I'm shouting. You've missed this point, oh, 6 or 12 times already and I'm kinda fed up with it. We're past that. We've been past that since the 1960's if not the 1950's.

As to showing individual, short sequences, you're simply confused again: information, compression, etc., can be measured only over an ensemble of data, i.e. a complete image, a waveform of some length, a bit pattern of some length, etc. ONE and only one message in and of itself has zero information, because it's the only message, and you convey no further information by sending it, since the receiver already knew it. Only if you have the possibility of sending one of n messages can you define information flow. And if you are going to send one of many decorrelated messages, summing -p log2(p) over the messages gives the average amount of information you've actually sent, and if you use Huffman coding to send it, that is within 1 bit of exactly how many bits you WILL use to send it in the real world. If you're sending correlated messages, you need something like LZW if it's bits, or something like LPC (or spectral coding, VQ, etc.) if it's a waveform.
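The entropy-vs-Huffman claim above can be checked directly. A sketch in Python (the four-message distribution is my illustrative choice): compute the entropy sum of -p log2(p), build a Huffman code by repeatedly merging the two least probable nodes, and confirm the expected code length lands within 1 bit of the entropy.

```python
import heapq
import math

# An illustrative ensemble of four decorrelated messages with known probabilities.
probs = {"A": 0.5, "B": 0.25, "C": 0.15, "D": 0.10}

# Shannon entropy of the ensemble: sum of -p*log2(p).
H = sum(-p * math.log2(p) for p in probs.values())

# Huffman coding: repeatedly merge the two least probable nodes.
# The counter breaks ties so the heap never compares the dict payloads.
heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
heapq.heapify(heap)
counter = len(heap)
while len(heap) > 1:
    p1, _, c1 = heapq.heappop(heap)
    p2, _, c2 = heapq.heappop(heap)
    merged = {sym: "0" + bits for sym, bits in c1.items()}
    merged.update({sym: "1" + bits for sym, bits in c2.items()})
    heapq.heappush(heap, (p1 + p2, counter, merged))
    counter += 1
code = heap[0][2]

# Expected bits per message under the Huffman code.
L = sum(probs[sym] * len(code[sym]) for sym in probs)

print(H)   # about 1.743 bits
print(L)   # 1.75 bits: within 1 bit of the entropy, as claimed
```

Here H <= L < H + 1 holds, which is the textbook bound the post is invoking.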

Jayant and Noll, "Digital Coding of Waveforms", would be an interesting thing for you to digest in this regard, but it is quite dated in terms of modern algorithms, as it has to be. (Look at the date of publication.)

2.
Do I get to be "fed up" too? I think it's fair to point out that you too are being stubborn in avoiding the correct foundation for discussing Shannon entropy. "Ensemble of data" is about as far as you go. As to what is mainstream thinking on Shannon entropy, look at the Wikipedia articles on it. You will see it defined in the context of random variables.

It is an elementary concept of probability theory that a set of data (e.g. a "complete image"), however large, is not a probability distribution. It is an elementary point of statistics that functions computed from a set of data are values of "estimators"; they are not parameters of a probability distribution.
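The estimator-versus-parameter distinction above can be made concrete with a small Python sketch (the fair-coin setup is my illustrative choice): the entropy of the distribution is a parameter, fixed at 1 bit, while the "plug-in" entropy computed from a data set is only an estimate of it, and a biased one for small samples.

```python
import math
import random

random.seed(1)
true_H = 1.0   # parameter: entropy of a fair coin, in bits

def plugin_entropy(sample):
    """Estimator: entropy computed from observed frequencies in a data set."""
    n = len(sample)
    counts = {}
    for s in sample:
        counts[s] = counts.get(s, 0) + 1
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

small = [random.randint(0, 1) for _ in range(10)]
big = [random.randint(0, 1) for _ in range(100_000)]

print(plugin_entropy(small))   # an estimate; for small samples it tends to undershoot 1.0
print(plugin_entropy(big))     # a much better estimate, close to the parameter 1.0
```

The data set never "has" an entropy in the parametric sense; only the generating distribution does, and the computation on the data estimates it.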

No doubt there is a probability model that implies that one should perform the waveform analysis that you recommend. The standard approach to developing prescriptions for calculations that are performed on "ensembles of data" is to assume a probability model that generates the data. Then we investigate the computations that are good estimators for the parameters of the model. What you are doing is quoting a collection of results that come from such an analysis, but you refuse to acknowledge (or state) the underlying model. The results you are quoting are conclusions of mathematical theorems but you have never stated the premises of the theorems. You are stating the conclusions as if they are self-evident universal truths that involve no assumptions about the stochastic process being analyzed.

3. Originally Posted by tashirosgt Do I get to be "fed up" too? I think it's fair to point out that you too are being stubborn in avoiding the correct foundation for discussing Shannon entropy.
Since I'm not doing that, well, I don't see how we can communicate further.
