|
Post by abacus9900 on Jan 26, 2011 18:06:58 GMT 1
> I don't pretend to know anything about the transmission of information, but on a quick look it seems to have something to do with sending the maximum amount of information (originally over the telephone) in terms of 'bits' while reducing the random interference or 'noise'. Did Shannon find the optimum method for achieving this?

> Yes, basically that is it. It is optimising a transmission channel (by suitable encoding) to get the maximum amount of information down a channel C with the minimum of errors in a 'noisy' environment. To do that, Shannon analysed the nature of information, and from that analysis we know just how much 'information' is contained in a message source. He performs this analysis on binary messages; most message sources are not binary, but any information source can be converted into a binary signal. So by examining a binary sequence of digits we can calculate the information content of that sequence, and hence know the channel capacity C needed to transfer that information between two (or more) places. Thus an information string of all 0s, all 1s, or any variation between the two can be analysed for its information content. Now, if you wish me to go further I can do so, and then you will be able to calculate the information content of any binary sequence of any finite length. Note that 'information' in this context is devoid of meaning: for example, if an encrypted message is converted to a binary sequence, we will be able to calculate the amount of information in the sequence, but we will not have a clue what the information is saying or means!

Ok, you've whetted my appetite naymissus. How do you work out the information encoded in binary?
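The claim in the quoted reply that any information source can be changed into a binary signal is easy to make concrete. A minimal Python sketch (the function name to_bits is illustrative, not from the thread) that turns an arbitrary text message into a string of bits:

```python
def to_bits(message: str) -> str:
    """Convert any text message into a binary string, 8 bits per byte."""
    return "".join(f"{byte:08b}" for byte in message.encode("utf-8"))

print(to_bits("Hi"))   # '0100100001101001' -- 16 bits for two ASCII characters
```

Once the source is in this form, its information content can be analysed exactly as the quoted reply describes.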
|
|
|
Post by speakertoanimals on Jan 26, 2011 18:25:46 GMT 1
Except the quote I gave from Shannon (the chap who invented the subject!) already gave you the answer (for arbitrary random strings)!
Stop playing silly buggers.
In fact for binary RANDOM strings, the answer is OBVIOUS. Imagine a series of coin tosses, each one random, heads as likely as tails, where knowing the previous 10 or 10 million tosses tells you NOTHING about the next toss.
Hence the ONLY way to send a sequence of N tosses is by sending the N results, the string of N heads or tails (or zeros and ones). So any such string contains as much information as any other (N tosses), and each message is exactly the same length -- N bits.
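A quick sketch of that arithmetic, assuming the fair-coin model described above: every particular sequence of N tosses has probability (1/2)^N, so taking -log2 of that probability gives N bits regardless of which sequence occurred.

```python
import math

N = 8
p_string = (1 / 2) ** N          # probability of any particular N-toss sequence
info = -math.log2(p_string)      # Shannon information content in bits
print(info)                      # 8.0 -- the same for HHHHHHHH as for HTTHTHHT
```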
|
|
|
Post by Progenitor A on Jan 26, 2011 18:26:25 GMT 1
> Ok, you've whetted my appetite naymissus. How do you work out the information encoded in binary?

Well, I have demonstrated how Shannon tells us to calculate the information content of our 64-bit signal (or computer-stored data) when we have an equal number of 0s and 1s. Let's see what he tells us for all zeros in our store.

If all the bits are 0, then probability p1 = 0 and probability p0 = 1. Shannon tells us that the entropy of our data is:

H = -(p0 log2 p0 + p1 log2 p1)    [logarithms are base 2]

Substituting, and using the convention that 0 log2 0 = 0:

H = -(1 log2 1 + 0 log2 0) = -(1 x 0 + 0) = 0 bits per symbol

Information content = H x number of symbols = 0 x 64 = 0 bits.

There is no information content in this data, and the size of the channel C needed to transmit it is 0 bits/s. Hence modern communications systems do not send data while it is in this state: one or two bits are sent to indicate the all-zero condition until the stored data changes, when the new value is sent.
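A minimal Python sketch of that calculation (the helper name H is illustrative), including the convention that 0 log2 0 is taken as 0:

```python
import math

def H(p0: float, p1: float) -> float:
    """Binary entropy -(p0*log2(p0) + p1*log2(p1)), with 0*log2(0) taken as 0."""
    h = 0.0
    for p in (p0, p1):
        if p > 0:                  # skip the 0*log2(0) term by convention
            h -= p * math.log2(p)
    return h

print(H(1.0, 0.0))         # 0.0 bits per symbol: the all-zeros store
print(H(1.0, 0.0) * 64)    # 0.0 bits in the whole 64-bit store
print(H(0.5, 0.5) * 64)    # 64.0 bits when 0s and 1s are equally frequent
```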
|
|
|
Post by abacus9900 on Jan 26, 2011 18:44:13 GMT 1
> Well, I have demonstrated how Shannon tells us to calculate the information content of our 64-bit signal [...] one or two bits are sent to indicate the all-zero condition until the stored data changes, when the new value is sent.

Thanks for trying naymissus, but frankly it seems way above my head. It's the maths that I can't handle.
|
|
|
Post by speakertoanimals on Jan 26, 2011 18:45:54 GMT 1
Wrong. You have ASSUMED that the ONLY messages you will ever want to send are strings of zeros. Then of course you can take p1 = 0, p0 = 1, and discover there is no information to be sent!
However, if you want to send arbitrary strings (such as a sequence of coin tosses from a fair coin), then you have to use instead the fact that p0 = 1/2 and p1 = 1/2. So the probability of ANY string of length N is just 1/2^N, which, when you take -log P as information content, gives you N as the information content of ANY random string of length N, whether all zeros or not.
Thank you for finally demonstrating that you have NO IDEA what the probabilities used by Shannon actually ARE. In short, you have to consider not just the message you are sending NOW, but what other possible messages you might like to send. If you want the average length, over many such messages, to be as small as possible, then you use the Shannon codeword-length rule -log P, but the probability MUST be computed from the entire ensemble of possible messages, NOT any particular single message.
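A short sketch of the contrast being drawn here, under the fair-coin assumption (the helper name per_message_entropy is illustrative): estimating probabilities from one message's own frequencies gives zero for the all-zeros string, while the ensemble probability 2^-64 gives 64 bits for every string, all-zeros included.

```python
import math
from collections import Counter

def per_message_entropy(msg: str) -> float:
    """Entropy estimated from ONE message's own symbol frequencies."""
    n = len(msg)
    h = 0.0
    for count in Counter(msg).values():
        p = count / n
        h -= p * math.log2(p)
    return h

print(per_message_entropy("0" * 64))   # 0.0 -- the per-message estimate

# For a source of fair random bits, the ensemble probability of any
# particular 64-bit string is 2**-64, so the -log P rule gives:
print(-math.log2(2.0 ** -64))          # 64.0 bits, whatever the string contains
```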
Thank you again for finally making totally clear how little you understand about the subject!
As Shannon himself says:
|
|
|
Post by speakertoanimals on Jan 26, 2011 18:53:48 GMT 1
Wrong. As I explained earlier, telephone messages differ from the case considered by Shannon in that they have correlations, AND periods of talking interspersed with periods of silence. The silence ISN'T random bits of silence scattered amongst the talking, if you like, but extended periods of silence.
Hence you should not confuse 0 with silence, for starters.
Now you are in a position to say: I can reconstruct the signal exactly IF I know what was happening in the talking parts, and how long the silences in between were. I can do this using the fact that silence is correlated with silence. Hence there is LESS actual information to transmit than a naive analysis would indicate. And I discover this correlation by looking at MANY actual conversations (the ensemble of possible messages).
In effect, it says: the silences contain NO SPEECH, hence I need not send them, just the actual speech parts and how far apart to space them.
You can't do the same trick with H's and T's from a coin toss, because unlike silence, heads don't appear like that; they are instead randomly dispersed amongst the T's: no correlations, and a different ensemble probability.
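A toy illustration of the difference, assuming a simple run-length scheme (not anything Shannon specifies, just the most obvious way to exploit the correlation): long correlated stretches collapse into a few (symbol, length) pairs, while uncorrelated coin-toss data yields almost as many pairs as symbols.

```python
from itertools import groupby

def run_length_encode(bits: str) -> list[tuple[str, int]]:
    """Collapse runs of identical symbols into (symbol, run_length) pairs."""
    return [(sym, len(list(run))) for sym, run in groupby(bits)]

# Speech-like signal: extended stretches of silence (0) between bursts of talk (1).
phone = "0" * 40 + "1" * 15 + "0" * 30 + "1" * 15
print(len(run_length_encode(phone)))   # 4 pairs describe 100 symbols

# Coin-toss-like signal: no correlations, runs stay short, no saving to be had.
coin = "0110100110010110"
print(len(run_length_encode(coin)))    # 11 pairs for 16 symbols
```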
|
|
|
Post by Progenitor A on Jan 26, 2011 18:55:20 GMT 1
> Wrong. You have ASSUMED that the ONLY messages you will ever want to send are strings of zeros. [...] the probability MUST be computed from the entire ensemble of possible messages, NOT any particular single message.

You really are a fool. Shannon's equations tell us that the information content of a 64-bit binary stream varies between a maximum value and a minimum value; therefore a channel C that can handle the maximum information can accommodate any data combination that the 64-bit stream can hold. Why do you continue to make foolish, ill-informed statements that are arrant nonsense? I am tired of your nonsense.
|
|
|
Post by Progenitor A on Jan 26, 2011 18:59:00 GMT 1
> Thanks for trying naymissus, but frankly it seems way above my head. It's the maths that I can't handle.

It really isn't so bad. The trouble is that you do not have the incentive to take in this (quite tedious) stuff. I did have the incentive -- it paid my wages! If I can assist you further then please ask.
|
|
|
Post by robinpike on Jan 26, 2011 19:04:49 GMT 1
> Yes. It describes how to get maximum information down a noisy channel. Also, how much information is actually in a signal or data, which is why lossless data compression software such as WinZip works, and why data compression for pictures, video files, and music tracks works. All rather significant in today's world; we'd be stuffed without it. So hopefully SOME people do understand it, even if some who think they do don't...

Right, fine, so far so good. But how can information be represented by just a string of zeros? Here is just one example. Suppose I want to transmit a message about the colours used in a picture, of which I am able to measure 2 million different possible colours. Whether a colour is used or not, I will mark a bit on or off in the message string, and each possible colour of the 2 million colours has a particular position in the message string. A zero means that the colour was found in the picture, and a 1 means that the colour was not found in the picture. So now a message of 2 million zeros indicates that every one of the 2 million measured colours was found in the picture. That, to me, looks like information represented by a string of zeros.
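A sketch of the scheme robinpike describes, with illustrative names, keeping the post's convention that 0 means the colour was found and 1 means it was not:

```python
NUM_COLOURS = 2_000_000   # one bit position per measurable colour

def presence_message(colours_found: set) -> str:
    """Position i is '0' if colour i was found in the picture, '1' if not."""
    return "".join("0" if i in colours_found else "1" for i in range(NUM_COLOURS))

# A picture containing every measurable colour encodes as 2 million zeros:
msg = presence_message(set(range(NUM_COLOURS)))
print(msg == "0" * NUM_COLOURS)   # True -- all zeros, yet the string carries a report
```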
|
|
|
Post by Progenitor A on Jan 26, 2011 19:29:09 GMT 1
> Right, fine, so far so good. But how can information be represented by just a string of zeros? [...] That, to me, looks like information represented by a string of zeros.

Good point, Robin. Yes, a string of zeros can represent an analogue value. Do you want to discuss how Shannon deals with this case?
|
|
|
Post by abacus9900 on Jan 26, 2011 19:30:06 GMT 1
> Right, fine, so far so good. But how can information be represented by just a string of zeros? [...] That, to me, looks like information represented by a string of zeros.

Ooooohhhh, I see. You are using binary in a different way. Thanks.
|
|
|
Post by speakertoanimals on Jan 26, 2011 20:32:37 GMT 1
You're still talking nonsense and disagreeing with Shannon! Did you miss the exact thing I quoted? I'll give it to you AGAIN:
So, when you calculated probabilities of zeros based on a SINGLE message (all zeros), to give probability 1 for zeros, you directly contradicted exactly what Shannon said. If all possible messages are ALL and ANY random strings of 0's and 1's, then you MUST use p=1/2 for computing the information content of any ONE message -- which gives you the value of N bits for any one of them.
The probability is for the ENSEMBLE of messages, NOT for the single message.
And you very obviously don't understand the -log P formula, and WHY it takes that form!
But let's take the SIMPLEST possible case -- a fair coin thrown N times, and the values sent as a message.
The only logical assignment of information is that ALL such strings have equal information content, because the information they contain is N independent and totally random tosses of a coin. They CANNOT contain less, because if they did, then that would be saying, for example, that 8 coin tosses actually only contain 4 bits of information, which is equivalent to saying I could ALWAYS predict the exact result of 8 coin tosses given only 4!
Note the ALWAYS; that's the kicker that Shannon tells you you must not forget!
And IN GENERAL (when considering ALL possible sequences), you CANNOT do that! As Shannon says, you have to consider ANY message from the ensemble when designing the system, not just one!
If I only ever sent strings of heads, then of course I get zero information content, because I can predict every head! But ONLY if you NEVER send any other messages, because if I wanted to send HTTTHHHH my transmission scheme is screwed: the scheme would instead try to send HHHHH..., which would be wrong.
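The 8-versus-4 point is just counting, as a quick check makes plain:

```python
# Pigeonhole check: no fixed scheme can ALWAYS encode 8 tosses in 4 bits.
outcomes_8_tosses = 2 ** 8   # 256 distinct sequences to distinguish
codewords_4_bits = 2 ** 4    # only 16 distinct 4-bit messages available
print(outcomes_8_tosses > codewords_4_bits)   # True -- some sequences must collide
```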
Don't let your pride blind you to the fact that you have MISUNDERSTOOD. Don't let the fact that you have accepted money for teaching it blind you either, now that it turns out to be wrong. Just READ bloody Shannon, even if you think I'm a total obnoxious idiot -- even obnoxious idiots can be right once in a while (with the possible exception of abacus...).
|
|
|
Post by speakertoanimals on Jan 26, 2011 20:57:44 GMT 1
To put it another way, -log P gives us the optimum codeword length for an event that occurs with probability P.
BUT we must use the ensemble probability, NOT the per-message one. So if we have coin tosses (head or tail probability 1/2; it's a fair coin), we KNOW the ensemble probability of H is 1/2, and THAT fixes the codewords we use for ANY sequence that then occurs.
Calculating the probability PER MESSAGE is in effect using a DIFFERENT coding scheme for each message, which is specifically NOT what Shannon was talking about. He said you can only design the system by looking at the ensemble, because until you set it up, you don't know what exact message you will be required to send. And your method of computing probabilities would have you redesigning the scheme for EVERY message, which is just plain wrong -- who's going to send the receiver the new codebook, for starters? They'll get screwed if they try to decode using the old one...
By saying 00000 is zero information, you are in effect using the codebook 'everything is a head, don't bother', which won't give the right message when what you actually want to send is HTTTHHHHTTTT...
I KNOW your result seems to make sense intuitively; I understand the possibility of confusion with the telephone case, where there are correlations and long periods of silence; but it is still WRONG for the coin case.
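A rough empirical illustration of the codebook point, using a general-purpose compressor (a loose analogy, not Shannon's construction): zlib's scheme is fixed in advance and designed for an ensemble in which long runs are likely, so it squeezes the all-zeros message; uniformly random bytes, drawn from a fair-coin-like ensemble, do not compress at all.

```python
import os
import zlib

zeros = bytes(64)               # 64 bytes of 0x00 -- one long predictable run
random_data = os.urandom(64)    # 64 bytes from the OS randomness source

print(len(zlib.compress(zeros)))        # around a dozen bytes: the run collapses
print(len(zlib.compress(random_data)))  # typically > 64 bytes: nothing to exploit
```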
I might not care about you, but I do care about your students and any readers of this board!
Anyone out there back me up? Do my arguments make sense? Is anyone else convinced that, for coin tosses, a string of heads contains just as much information as any other string of tosses of the same length from a fair coin?
|
|
|
Post by Progenitor A on Jan 27, 2011 18:31:18 GMT 1
No one to help a lady in distress! Shame on you all! Still, if the lady ventures out of her depth, why should you risk your reputation trying to stop her from drowning?
Now Robin, as you have not commented further on a sequence such as 00000000 possibly representing an analogue value, I will undertake to comment myself.
Now Shannon tells us that, for example, a sequence of 64 0s contains no information. And as Shannon was quite a clever chap, it is wise to accept that (just for the sake of argument if need be), because interesting things pop out.
But if it has no information content, can it have meaning?
Let us accept that this sequence of 64 0s represents the colour red; as Robin suggested, it might be a colour, and red is convenient.
Well, let's take Shannon at his word, accept that the sequence has no information, and transmit it anyway (we can do that, of course, if we wish).
So we send a sequence of 64 0s, and the other end decodes and receives 64 0s.
It knows that this value represents red, so it shows red wherever red is to be displayed.
Looks as if we have sent information, doesn't it?
Now let's send the sequence for red again, but this time (because Shannon tells us we do not have to) we do not send any of the 0s.
The far end doesn't receive any of these 0s, so it decides the missing values must be 0s (binary has only two values, 0 and 1).
So the decoder thinks it has received 0000000... = 'red', and red is displayed.
How can no information = red?
Well, it is commonplace for the absence of information to have meaning.
For example, a WW2 SOE agent in France might be told to transmit a message back to London every Thursday at 7pm. He might be told that it is essential to transmit a message, any message, at that time, because the absence of a message will mean that something has gone dreadfully wrong and the operation must be closed down.
So one Thursday at 7pm London does not receive a message, and that means that the operation must be closed down immediately.
Note this: no information has been transferred, but the absence of information has meaning.
This is the essential difference, in Shannon's view, between information and meaning. He is not concerned with transferring meaning, just information, and does not care if no information has meaning. Therefore he says the sequence of all 0s contains no information.
Now let's look at this transfer of 'red' (a string of 64 sequential 0s) in practical terms. If we represent an LED shining on as '1' and an LED off as '0' (this is nearly what happens in optical fibre transmission), and send a sequence of 64 alternating 0s and 1s down the fibre ...0101010101..., then if we look with the eye (not recommended) at the far end of the fibre, we will see pulses of light -- ON, OFF, ON, OFF -- as information is transferred. Now if in the next sequence we send the coding for RED, 000000000..., what will we see at the far end?
Why, nothing of course. We will see no information being transferred.
Colloquially we would say that the transmitter is not sending any information. That is what Shannon would say too. But when the fibre is connected to a decoder, the decoder will interpret 64 'LED off' pulse periods as RED, and RED will be displayed on a monitor.
To repeat: no information may have meaning.
Now there are some practical difficulties to overcome, so we include control bits to remove ambiguities.
For example, if the transmitting LED failed, we would have continuous RED decoded at the far end, and we can't have that now, can we?
There is one other apparent paradox (it isn't really one). If the sequence of 64 ON/OFF pulses is all 1s, Shannon tells us that it too contains no information; 64 1s therefore need not be transmitted. But then the far end will decode the 64 missing pulses as 64 0s and will display red.
If we decide not to transmit long strings of digits because they have no information content, we must tell the far end (with control bits) what state the NO INFORMATION is in.
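A toy version of that suppression-plus-control-bit idea (the format here is invented for illustration, not a real line protocol): a constant block is replaced by a control bit plus the repeated symbol, so the receiver can tell suppressed zeros from suppressed ones, as the post requires.

```python
def encode(block: str) -> str:
    """Send '1' + the repeated symbol for a constant block, '0' + the block otherwise."""
    if block == "0" * len(block):
        return "10"         # suppressed; the repeated symbol was 0
    if block == "1" * len(block):
        return "11"         # suppressed; the repeated symbol was 1
    return "0" + block      # literal block follows the control bit

def decode(msg: str, block_len: int = 64) -> str:
    if msg[0] == "1":
        return msg[1] * block_len   # rebuild the constant block locally
    return msg[1:]

print(encode("0" * 64))            # '10' -- two bits stand in for sixty-four
print(decode("10") == "0" * 64)    # True: the far end restores 'red' unambiguously
```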
|
|
|
Post by Progenitor A on Jan 28, 2011 9:22:58 GMT 1
Continuing in an attempt to illustrate Shannon's distinction between meaning and information, I shall consider a 1-byte (8-bit) store that is waiting to be transmitted.
Below are successive states of the store that are transmitted (decimal value in brackets), together with what Shannon says about each:

1st byte     00000000  (0)    no information content
2nd byte     00000001  (1)    yes, information content
3rd byte     00000010  (2)    same information content
4th byte     00000011  (3)    yes, information content
5th byte     00000100  (4)    same information as 1 and 2
...
171st byte   10101010  (170)  maximum information
173rd byte   10101100  (172)  less than maximum information
Now something apparently odd is going on here. The information content is not related to the value of the number.
Why should 4 have the same information content as 1 or 2? Why should 170 have more information content than 172?
Is 4 really more significant than 0? No, of course it is not (especially if 0 is the last number on the winning lottery ticket!). It simply takes more changes of state to transmit 4 (00000100) than it does 0 (00000000); therefore Shannon states that 4 has more information content than 0.
If we look at 170 (10101010), we see it has the maximum number of changes of state (0-to-1 and 1-to-0 transitions), so Shannon tells us it has the maximum information content.
You see, Shannon is not concerned with the meaning of the information but with the changes required to transmit it. The changes define the information content of a message, not the significance of the message. This is very sensible, because the significance of the message is only apparent to the two (or more) end users.
Let me illustrate this by allocating decimal numbers in a random fashion to those bytes we have just looked at (we are free to encode our data as we wish):
1st byte     00000000  (now meaning 16)   no information content (Shannon)
2nd byte     00000001  (48)               yes, information content (Shannon)
3rd byte     00000010  (32)               same information content (Shannon)
4th byte     00000011  (108)              yes, information content (Shannon)
5th byte     00000100  (256)              same information content as before
...
171st byte   10101010  (0)                maximum information (Shannon)
173rd byte   10101100  (1)                less than maximum information (Shannon)
Note that the information content of the ordered bytes has not changed, but the meaning of each byte has changed.
We are not concerned with meaning, simply information content.
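For what it's worth, the transition count this post uses as its measure is easy to compute. Note that this counting of changes of state is the post's own framing; Shannon's entropy is defined from symbol probabilities, not transitions.

```python
def transitions(bits: str) -> int:
    """Count the 0<->1 changes of state across the byte."""
    return sum(a != b for a, b in zip(bits, bits[1:]))

for b in ("00000000", "00000001", "00000100", "10101010", "10101100"):
    print(b, transitions(b))
# 00000000 -> 0, 00000001 -> 1, 00000100 -> 2, 10101010 -> 7, 10101100 -> 5
```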
|
|