|
Post by abacus9900 on Feb 4, 2011 0:12:13 GMT 1
Well, we've dealt with logs so now we can go on to examine Shannon's mathematical ideas regarding the probabilities inherent in information transmission.
UKIP? Colourful.
|
|
|
Post by speakertoanimals on Feb 4, 2011 2:44:28 GMT 1
The basic idea is fairly straightforward. Suppose you and the receiver have agreed a codebook beforehand. What you want to send is one of some agreed set of messages. Suppose the possible messages MEAN:
tea, coffee, cereal, or nothing.
It's your breakfast order! The point is, you want to send an order every day, but you usually pick tea only 1/8 of the time, coffee 3/8 of the time, cereal 1/4, and nothing 1/4.
Suppose I now ask what you want, but only using yes/no questions.
If the questions were:
do you want tea? do you want coffee? cereal?
That's 3 questions. If the answer to the first is yes, I don't have to ask any more -- I can just send 1 for yes. So tea is encoded as just 1. And that happens 1/8 of the time.
If coffee, the answers are no then yes, hence 01, 3/8 of the time.
Cereal is no, no, yes -- 001, 1/4 of the time.
And nothing takes no no no, 000, 1/4 of the time.
So we can work out the average length message I have to send:
1x1/8 + 2x3/8 + 3x1/4 + 3x1/4 = 19/8 = 2 3/8 bits on average.
Except I can do better than that!
1) Is it one of coffee or tea? 2) If yes to that, do you want coffee, or if no, do you want cereal?
More complicated questions, but now I can do it with just the answers to these 2 questions.
So, for tea, I answer yes no. For coffee, yes, yes.
For cereal, no, yes, and for nothing no, no.
Hence I only ever have to send 2 answers -- 10, 11, 01, or 00. Hence on average 2 bits.
What Shannon did was work out the shortest possible average message length (how many answers I had to send to order), based on the probabilities of each choice. Hamming worked out how to calculate the best possible code.
Shannon said that the way to structure the questions depends on the probabilities of each choice. So, if I had tea 99% of the time, the best first question would just be -- do you want tea, because 99% of the time I would say, yes, and just have to send 1. Only 1% of the time would I have to go onto further questions, and send a longer string.
The EXACT way that best-possible codeword length depends on the probability of each choice is given by the log -- minus log p to be exact. So more probable choices get SHORTER codewords, with the longer codewords reserved for choices I don't make very often.
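In case anyone wants to check the numbers, here's a quick Python sketch -- my own illustration, using the breakfast probabilities from above, nothing more official than that:

import math

# Breakfast choices and how often each gets ordered
probs = {"tea": 1/8, "coffee": 3/8, "cereal": 1/4, "nothing": 1/4}

# -log2(p) is the ideal codeword length for a choice of probability p
for choice, p in probs.items():
    print(f"{choice:8s} p = {p:.3f}  ideal length = {-math.log2(p):.3f} bits")

# Averaging those lengths, weighted by probability, gives the entropy
H = -sum(p * math.log2(p) for p in probs.values())
print(f"best possible average = {H:.3f} bits per order")

Notice that coffee, the most likely order, gets the shortest ideal length, and the average comes out just under the 2 bits of my two-question scheme.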
|
|
|
Post by Progenitor A on Feb 4, 2011 8:47:32 GMT 1
Well, we've dealt with logs so now we can go on to examine Shannon's mathematical ideas regarding the probabilities inherent in information transmission. UKIP? Colourful.
Hi Abacus. Do you wish to continue with the thread we were following, or divert toward the interjected explanation? I do not mind. There is simply no point in trying to follow 2 separate explanations simultaneously, especially when one explanation will arrive at the conclusion that for information transmission there is as much information in a 20GB string of zeros as there is in a 20GB string of encoded Encyclopaedia Britannica.
(Perhaps you can see some difficulty in the 'simplified' interjected explanation in comprehending exactly what information is to be sent [some might say impossibility -- why on earth would you send a daily breakfast order specifying coffee 3/8 of the time, for example?], not to mention the errors? At least that indicates it is not copied from a book.)
|
|
|
Post by abacus9900 on Feb 4, 2011 11:41:03 GMT 1
I think I get what STA is saying. It's kind of like playing a game of twenty questions in the most efficient way possible based on the probabilities of the answers. It also reminds me of the way a doctor would ask questions of a patient in order to reach a diagnosis. The doctor would know the most probable answers based on experience. That right, STA?
Don't you agree, naymissus?
|
|
|
Post by Progenitor A on Feb 4, 2011 13:10:26 GMT 1
I think I get what STA is saying. It's kind of like playing a game of twenty questions in the most efficient way possible based on the probabilities of the answers. It also reminds me of the way a doctor would ask questions of a patient in order to reach a diagnosis. The doctor would know the most probable answers based on experience. That right, STA? Don't you agree, naymissus?
The principle is fine. The questions to be asked in the example do not make sense, though, the number of bits is not minimised, and the result has, of course, no practical use. And Hamming codes add redundancy (they do not minimise the number of bits but add more). She is probably thinking of the Huffman code, which guarantees to encode a string to within one bit of its entropy, but only generally for very long strings of digits.
As you can see for this example, the minimum number of bits required is 2, but that was self-evident as there were 4 states and 2^2 = 4. In theory we could transmit the info with about 1.9 bits, but there is no such thing as 0.9 of a bit -- which is why the example has no practical utility. In fact, to get the full minimisation in practice we would need a substantial string length.
|
|
|
Post by speakertoanimals on Feb 4, 2011 13:56:17 GMT 1
I think I get what STA is saying. It's kind of like playing a game of twenty questions in the most efficient way possible based on the probabilities of the answers. It also reminds me of the way a doctor would ask questions of a patient in order to reach a diagnosis. The doctor would know the most probable answers based on experience. That right, STA?
That's EXACTLY right! You and the questioner know before you start WHAT the questions are, and have checked that they cover all possibilities in this case. You have agreed how the answers to the questions will be sent -- that is what your codebook is, in effect -- 001 means cereal, or whatever it was.
|
|
|
Post by speakertoanimals on Feb 4, 2011 14:06:17 GMT 1
The questions to be asked in the example do not make sense, though
Rubbish! They may sound long and complicated, BUT there is a unique sequence of answers for each choice, which is all that is required.
...the number of bits is not minimised, and the result has, of course, no practical use... In theory we could transmit the info with about 1.9 bits, but there is no such thing as 0.9 of a bit -- which is why the example has no practical utility. In fact, to get the full minimisation in practice we would need a substantial string length.
A strawman. ANY mathematical expression used in information theory has to deal with the fact that the maths used is continuous, whilst answers in the real world (such as the length of a codeword) have to be an integer. Hence your supposed objection is NONSENSE.
Well, it was nearly 3am; I misremembered Huffman as Hamming -- both begin with H. Lucky I didn't think of Top Gear and type Hammond...
Point is, the example IS useful, because it helps with how we go from 'I want tea' to a string of binary, the fact that codewords can be related to probability, and how we decide on a set of unambiguous codewords. The twenty-questions scenario is, of course, just a friendly version of a binary tree. So, I think just sour grapes on your part...
And as for the 20GB string of zeros -- that is precisely what the twenty-questions case IS telling you! 000 is as valid and as informative a set of answers as 001; BOTH enable the receiver to send the right breakfast! Just add more choices and more questions, and you inevitably end up with a long string of zeros that contains information.
Come on NM, if it was so EASY to PROVE it was nonsense, you would have done so. Except you can't, and instead just resort to a favourite tactic of creationists -- argument from incredulity. It SOUNDS daft that a string of zeros can carry information, hence I'll use that. Because you are still unable to admit that your specific background has led you to a mistaken interpretation of the basics of information theory, as my many posts have demonstrated.
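And since the twenty-questions tree came up: here's a tiny Python sketch (mine, using the two-question codebook from my earlier post) that decodes a string of answer bits -- including a string that is all zeros:

codebook = {"10": "tea", "11": "coffee", "01": "cereal", "00": "nothing"}

def decode(bits):
    # Walk the bits, emitting a breakfast each time a codeword completes --
    # exactly the walk down the yes/no question tree.
    orders, word = [], ""
    for b in bits:
        word += b
        if word in codebook:
            orders.append(codebook[word])
            word = ""
    return orders

# Four days of answering no to everything: a decodable, informative message
print(decode("00000000"))   # ['nothing', 'nothing', 'nothing', 'nothing']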
|
|
|
Post by Progenitor A on Feb 4, 2011 15:02:03 GMT 1
Well Abacus, the start of this thread was that absurd proposition that a 20GB string of 0's contains as much information as a 20GB string of encoded Encyclopaedia Britannica.
The only way to show that this is false is to look at the concept of entropy, which we started to do, and an essential part of that was the manipulation of logs to base 2.
We were going along that path
Now you have STA giving her take on it. It is no good mixing the two. The only way to proceed is to build the picture step by step.
It is fruitless trying to consider 2 opposing views at once when learning a subject. You must decide to go with STA's explanation or mine. I do not mind. Come to me after STA has finished if you wish.
I am not interested in getting into squabbles with STA. I can build an irrefutable picture in small understandable steps. That is the art of getting across complex concepts.
However you must choose.
I really do not mind
|
|
|
Post by abacus9900 on Feb 4, 2011 15:17:03 GMT 1
Please continue, naymissus. I'm sorry if I appeared to 'go off' - was simply trying to appear open to all ideas.
|
|
|
Post by Progenitor A on Feb 4, 2011 15:33:49 GMT 1
Please continue, naymissus. I'm sorry if I appeared to 'go off' - was simply trying to appear open to all ideas.
No problem, I can understand that. Unfortunately I am off now -- an early meal and then a concert in town.
We will come back together tomorrow and consider how the entropy of a source is related to the amount of information that is transmitted over a channel of capacity C, and how we can guarantee (through Huffman coding) that a source of a certain H can send its information over a channel C with the minimum number of information bits being sent. We will see how the amount of information transferred depends upon the changes in a string.
|
|
|
Post by abacus9900 on Feb 4, 2011 15:55:30 GMT 1
|
|
|
Post by speakertoanimals on Feb 4, 2011 16:45:28 GMT 1
I suspect he's gone off to look at his books again, although today he DOES seem to be correctly talking about entropy OF THE SOURCE, rather than in previous posts where he was using entropy of a single message.
As I keep saying, the entropy of the SOURCE involves how often the various events occur (like tosses of a coin). The entropy of a message involves how often particular symbols (like H or T) occur in that particular message, which NEED NOT be the same frequency as they occur at the source, although they can be used as an estimate of it.
I think NM is also wrong to push logs -- you don't NEED to know what a logarithm is exactly to get this, just simple things such as: the more frequent an event, the shorter the codeword (or the fewer questions in the 20-questions set-up). You don't NEED to know that the exact relation is a log...
As for the claim that a string of zeros carries no information -- I can directly contradict that with my breakfast scenario. One breakfast involved all no answers (000 in binary). That certainly contained information, enough to get me the correct breakfast! If I'd had a wider menu, there would be more questions and longer strings -- BUT I could still have strings of no answers to all the questions, and I'd still get the right breakfast...
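Here's a little sketch of that source-versus-message distinction -- the fair coin and both sample messages are my own made-up illustration. The SOURCE entropy is fixed (1 bit per toss for a fair coin), but the frequencies in any one message can be quite different:

from collections import Counter
import math

def empirical_entropy(msg):
    # Entropy computed from the symbol frequencies in this one message
    counts = Counter(msg)
    n = len(msg)
    return -sum(c/n * math.log2(c/n) for c in counts.values())

print(empirical_entropy("HTHTHTHTHT"))  # 1.0 -- matches the fair-coin source
print(empirical_entropy("HHHHHHHHHT"))  # ~0.47 -- same source, unusual message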
|
|
|
Post by abacus9900 on Feb 4, 2011 17:14:36 GMT 1
Ok, STA, I get the general principle of what you are saying but it might be helpful if you could provide an actual 'real-world' example of the processes used.
|
|
|
Post by Progenitor A on Feb 5, 2011 14:43:50 GMT 1
Well Abacus, I was going to look at the entropy of a source and apply it to some examples, but STA has come up with the basis of Huffman coding, so I will stick with her example (even though it is not at all clear what message she wishes to send to the kitchen!)
Huffman coding followed upon Shannon's statement that a source of entropy H can be accommodated by a channel of capacity C given that HN <= C, where N is the number of digits in the channel.
So the race started to get an encoding system that would suit that purpose, and Huffman came up with the near-perfect coding system that guarantees to meet that requirement of Shannon's.
All this sounds a bit abstruse, and there is no need to take it all in, but investigating Huffman coding tells us lots of lovely things.
Now, although she did not apparently realise it, STA was using the basis of Huffman coding to predict a minimum number of bits for each encoded word -- the words being Coffee, Cereal, Tea, and Nothing. It is self-evident that 2 bits will suffice, as there are 4 states and 2^2 = 4, but is that the minimum number of information bits that must be transmitted per word?
She gave probabilities for including these words in an unspecified message: Coffee 3/8, Cereal 1/4, Nothing 1/4, Tea 1/8.
Now from our discussions of old we know that the more probable (or predictable) an event, the less information is needed to convey it. (Weather tomorrow, India: same as today.)
So in the context of this messaging, Coffee needs less information to transmit than Nothing (i.e. 'nothing is required for breakfast').
Indeed the Huffman coding for Coffee is 1 and for Nothing is 001
So the very probabilities of the words STA used are already telling us something about the information content necessary to transmit them, and Huffman simply makes a code for each probability.
This is not really surprising, as Huffman was using Shannon's mathematics to derive his code.
Because Shannon tells us that if we know the probabilities of a set of words (or symbol set, or letters) then we can calculate a brand new quantity which he called H, the entropy.
And entropy, H, simply quantifies the information content necessary to be transmitted to send a word or symbol set or letter.
With the probabilities in this example,
H = -(3/8 log2 3/8 + 1/4 log2 1/4 + 1/4 log2 1/4 + 1/8 log2 1/8) = about 1.906 bits per symbol.
(Note that this is very close to the common-sense conclusion that we need 2 bits per word to send 4 words, as 2^2 = 4.
The reason that Shannon's result is smaller is that some words are sent more often than others and hence we have unequal probabilities of words; if all the words had equal probabilities (1/4), then we would require 2 bits per word on average.)
In other words, the average number of bits we need to transmit per word to get any of these words, with these probabilities, across a communication link is about 1.9 bits per word. (Some words require more bits, others fewer.)
Now Huffman coding tells us nearly exactly the same thing, and works out a coding that achieves close to the average number of bits per word that Shannon's entropy tells us is necessary, BUT ONLY IF THE CODE WORDS ARE LONG. In our case, as the code words have a maximum length of 3 bits, the Huffman approximation is not that close to the Shannon prediction. If we work out the Huffman coding for these four words, it gives an average number of bits per word somewhat higher than the Shannon average (2 bits against about 1.906), but as we make code words longer and longer, Huffman approaches Shannon on the information we need to transmit to convey a word or a selection of words.
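If you want to see this in action before tomorrow, here is a small Python sketch -- my own illustration, the probabilities being STA's -- that builds one Huffman code for the four words and compares its average length with the entropy:

import heapq, math

probs = {"coffee": 3/8, "cereal": 1/4, "nothing": 1/4, "tea": 1/8}

# Each heap entry is (probability, tie-breaker, {word: code-so-far})
heap = [(p, i, {w: ""}) for i, (w, p) in enumerate(probs.items())]
heapq.heapify(heap)
tick = len(heap)
while len(heap) > 1:
    # Merge the two least probable nodes, prefixing 0/1 onto their codes
    p1, _, c1 = heapq.heappop(heap)
    p2, _, c2 = heapq.heappop(heap)
    merged = {w: "0" + c for w, c in c1.items()}
    merged.update({w: "1" + c for w, c in c2.items()})
    heapq.heappush(heap, (p1 + p2, tick, merged))
    tick += 1

codes = heap[0][2]
avg = sum(probs[w] * len(c) for w, c in codes.items())
H = -sum(p * math.log2(p) for p in probs.values())
print(codes)
print(f"average {avg:.3f} bits/word against entropy {H:.3f}")
# Ties mean several equally good trees exist -- e.g. Coffee=1, Cereal=01,
# Nothing=001, Tea=000 -- but every one of them averages 2 bits/word,
# above the 1.906-bit entropy, just as argued above.

The tie-breaking determines which optimal tree you get; the Coffee=1, Nothing=001 code I quoted earlier is one of them, and all of them average 2 bits per word for this little alphabet.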
We are in quite interesting territory here
However, Huffman coding does not always give us the most efficient coding for information transfer, as we shall see.
Enough for today, I think. Some practical examples of Huffman coding tomorrow.
The essential point is: the less probable an event, the more information is needed to transmit it, and the transmitted information is the number of changes in a transmitted signal.
|
|
|
Post by abacus9900 on Feb 5, 2011 18:44:03 GMT 1
|
|