Post by speakertoanimals on Jan 28, 2011 13:08:21 GMT 1
I could give you YET AGAIN the quote from the Shannon paper which says probabilities have to be computed from the ensemble of all possible messages, NOT just for the particular message you are going to send.
You are still misunderstanding the basics of information theory.
Now, as regards your point about CHANGE. Suppose we tried to use that strategy to transmit the results of N tosses of a fair coin.
So, we would first have to say whether the first toss was H or T. Except let's suppose we don't actually care, and we will treat HTT as the same as THH, since both give the identical change sequence -- it's only 1 bit overall anyway.
So, change or no change? This is still a BINARY state that needs to be transmitted. 0 or 1 again, except that 0 now means change (why not? I can encode it however I like!), and 1 means no change.
Except that for a fair coin, a change (the next toss differing from the previous one) is EXACTLY AS LIKELY (over the ensemble of all possible coin-toss sequences) as no change.
Hence we find, yet again, that the information content of a string of N coin tosses is 1 bit for the initial H or T, then N-1 bits for change/no change. We make no gain whatsoever compared to just sending H or T at each toss.
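Don't take my word for it -- here is a quick Python sketch (my own illustration, not from any paper) that you can run yourself:

import random
from collections import Counter

# Encode N fair coin tosses as (first toss) + (change/no-change bits),
# then check how likely "change" actually is.
random.seed(0)
N = 100_000
tosses = [random.choice("HT") for _ in range(N)]

# change bit: 1 if this toss differs from the previous one, else 0
changes = [int(a != b) for a, b in zip(tosses, tosses[1:])]

p_change = Counter(changes)[1] / len(changes)
print(f"P(change) ~ {p_change:.3f}")  # comes out ~0.5

The change bits are themselves a fair coin, so you still need 1 bit for the first toss plus N-1 full bits for the changes. No saving.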
The ONLY case where we can make a gain is where we have correlation, so that a long run of no change is more probable than it would be for a totally random process. This is the case in telephone communication, which is probably why you are under the mistaken impression that it is the general case.
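To see exactly when the change encoding DOES pay off, here is a sketch (the numbers are purely illustrative) using the binary entropy function:

from math import log2

def binary_entropy(p: float) -> float:
    """Bits per symbol for a binary source with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

print(binary_entropy(0.5))  # 1.0    -> fair coin: each change bit costs a full bit
print(binary_entropy(0.1))  # ~0.469 -> correlated source (long runs): under half a bit

Only when P(change) is skewed away from 1/2 -- i.e. when successive symbols are correlated -- does each change/no-change symbol carry less than one bit, and only then can a clever code beat sending the raw symbols.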
And yet again, I refer you back to the quote from Shannon -- perhaps you would care to explain WHY your computation of the information content of a string of zeros was correct, given that quote?
Except you can't! You're just plain wrong, and you don't have the intelligence or the balls to see why, admit it, and learn something.
I pity your poor bloody students; they aren't being taught information theory properly.
The point about a string of zeros containing as much information as any other random string of 0s and 1s is an important one, because it is WHY some people have tried to go beyond Shannon's definition of information content, pointing out precisely that we intuitively feel a repeated string contains less information. I quote:
Thus, we have discovered an interesting phenomenon: the description of some strings can be compressed considerably, provided they exhibit enough regularity. However, if regularity is lacking, it becomes more cumbersome to express large numbers. For instance, it seems easier to compress the number “one billion,” than the number “one billion seven hundred thirty-five million two hundred sixty-eight thousand and three hundred ninety-four,” even though they are of the same order of magnitude.

We are interested in a measure of information that, unlike Shannon’s, does not rely on (often untenable) probabilistic assumptions, and that takes into account the phenomenon that ‘regular’ strings are compressible. Thus, we aim for a measure of information content of an individual finite object, and in the information conveyed about an individual finite object by another individual finite object. Here, we want the information content of an object x to be an attribute of x alone, and not to depend on, for instance, the means chosen to describe this information content. Surprisingly, this turns out to be possible, at least to a large extent. The resulting theory of information is based on Kolmogorov complexity
Source: homepages.cwi.nl/~paulv/papers/info.pdf, page 11, end of section 2.1.
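A crude way to see that intuition in action (again my own illustration, using zlib as a rough stand-in for Kolmogorov complexity, which is uncomputable):

import os
import zlib

n = 10_000
zeros = b"0" * n
# random '0'/'1' characters, one fair bit per character
random_bits = bytes(ord("0") + (b & 1) for b in os.urandom(n))

print(len(zlib.compress(zeros)))        # a few dozen bytes: the regularity is exploited
print(len(zlib.compress(random_bits)))  # on the order of n/8 bytes: ~1 bit per symbol

The all-zeros string has a short description, the random one does not -- that is the Kolmogorov view. Shannon's measure, computed from the source, charges both strings the same.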
As regards the point about the probability from the source (as opposed to what you did, which was to take the probability FROM the MESSAGE), the same paper states (page 2, first paragraph):
In the Shannon approach, however, the method of encoding objects is based on the presupposition that the objects to be encoded are outcomes of a known random source—it is only the characteristics of that random source that determine the encoding, not the characteristics of the objects that are its outcomes.
And the same paper then quotes Shannon again:
Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages. The system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design.
Which is what I said before: probability from the source, NOT from the particular message, according to Shannon.
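In code, the distinction you keep mangling looks like this (my own sketch; the function name is mine):

from math import log2

def source_info_bits(message: str, p_zero: float) -> float:
    """Bits to encode `message` under the SOURCE model P('0') = p_zero."""
    p = {"0": p_zero, "1": 1.0 - p_zero}
    return sum(log2(1.0 / p[c]) for c in message)

msg = "0" * 100  # a string of 100 zeros

# Correct (Shannon): the source is a fair coin, so this message, like every
# other 100-symbol message, costs 100 bits.
print(source_info_bits(msg, p_zero=0.5))  # 100.0

# Your mistake: estimating P('0') from the message itself gives P('0') = 1,
# and a nonsensical information content of 0 bits.
print(source_info_bits(msg, p_zero=1.0))  # 0.0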
QED and all that, and you can kiss my shiny metal arse..................