Pickups - physics or cooking?
-
Well, my point is that if you claim to have a compressor that works on random data, you can't just pick the sequences that compress well and reject the rest. Maybe you think you can make some kind of scheme that picks out the compressible bits and sends the rest as is. (This is more or less how FLAC works.) But you can show from the statistics of random noise that any such scheme wouldn't make back its encoding overhead; it would actually make the average lump of random data bigger. Proof:
To achieve compression, a shorter sequence must be used to stand for a longer one, but if this sequence itself occurs in the data, it must be "escaped" by adding an escape character, which makes it longer. Since this second sequence is shorter, it must be more common, by the statistics of random noise. (If the symbol set were the alphabet, making a sequence one character shorter would make it appear 26 times more often.) All appearances of the escape character must also be escaped. So the overhead of escaping wipes out the gain from compressing the long sequence.
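A standard way to make the same point is the counting argument: there are 2^n distinct bit strings of length n, but only 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1 strings of length less than n. A lossless encoder has to be invertible, so it cannot map all 2^n inputs into that smaller set; at least one input must come out the same size or bigger, and on uniformly random input the average length cannot shrink.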
Any sequence capable of being stored in a computer must be the output of some algorithm. Maybe compression could be achieved by calculating, say, which LFSR it is the output of. However, I think this is the same kind of problem as cracking RSA.
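If you want to see the overhead rather than take it on faith, here is a quick sketch in Python using the standard-library zlib module (any general-purpose compressor behaves much the same): "compressing" random bytes actually makes them slightly bigger.
Code:
import os
import zlib

data = os.urandom(1_000_000)        # one megabyte of OS-supplied random bytes
packed = zlib.compress(data, 9)     # DEFLATE at its highest effort setting

print(len(data))                    # 1000000
print(len(packed))                  # slightly larger than the input
DEFLATE falls back to "stored" blocks when it finds nothing to match, so the penalty is small, but on incompressible input it is never a gain.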
-
Originally posted by Steve Conner View Post
Well, my point is that if you claim to have a compressor that works on random data, you can't just pick the sequences that compress well and reject the rest. [...]
Compression takes advantage of patterns in the data. A particular sample of a random process such as white noise is unlikely to be sufficiently patterned to compress significantly.
-
Originally posted by Steve Conner View Post
To achieve compression, a shorter sequence must be used to stand for a longer one
What you said is true in all cases: if the map, whatever the map is, takes more space than the data, the compression is inefficient and may even increase the size. When compressing random data with a given algorithm, you may get no compression or a lot of compression; it depends on the distribution of your random bits and on the algorithm you're employing. There are tons of algorithms, one for each purpose, and there are people in labs testing new compression algorithms for different purposes as we speak.
Originally posted by Steve Conner View Post
Any sequence capable of being stored in a computer must be the output of some algorithm.
Originally posted by Steve Conner View Post
Maybe compression could be achieved by calculating, say, which LFSR it is the output of. However, I think this is the same kind of problem as cracking RSA.
If the key were generated by your computer without true random data, the state of your computer could be duplicated by an algorithm and the key generated again elsewhere, so there would be no need to crack RSA at all; you could just use the key. That is the role of random data in any asymmetric-key system (and in every key-generation procedure of every encryption scheme that I know of).
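A sketch of that point in Python, using the deliberately non-cryptographic random module; the seed value and the derive_key helper are made up for illustration. Anyone who can duplicate the generator's state gets the same "secret" key with no cryptanalysis at all.
Code:
import random

def derive_key(seed, nbytes=16):
    # Hypothetical key generator driven only by a deterministic PRNG state
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(nbytes))

alice_key = derive_key(1234)   # "secret" key generated on one machine
eve_key = derive_key(1234)     # same state reproduced somewhere else
print(alice_key == eve_key)    # True: nothing needed cracking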
Random data is, in fact, unbreakable encryption. If you use random data as your key, and nobody has ever used that same data as a key before, then simply shifting the characters of another message by that key produces a message that can only be read by obtaining your key. That type of encryption is called a "one-time pad" and is unbreakable by any algorithm; you would need to torture the secret holder to death to get it. It was used on the "red phone" between the Kremlin and the White House, and the key tapes were transported by the military, so to break that encryption you'd need one hell of an army. Random data can be a powerful tool indeed.
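A minimal one-time-pad sketch in Python (XOR instead of character shifting, but the principle is identical). As long as the pad is truly random, at least as long as the message, and never reused, the ciphertext gives an attacker nothing to work with.
Code:
import os

def otp(data: bytes, pad: bytes) -> bytes:
    # XOR each message byte with the matching pad byte;
    # running the same function again with the same pad decrypts.
    assert len(pad) >= len(data), "the pad must be at least as long as the message"
    return bytes(d ^ k for d, k in zip(data, pad))

message = b"MOSCOW CALLING"
pad = os.urandom(len(message))    # one-time key, never to be reused
cipher = otp(message, pad)
print(otp(cipher, pad))           # b'MOSCOW CALLING'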
-
Originally posted by jmaf View Post
Cracking RSA is a mathematical problem that involves the difficulty of factoring large numbers.
The actual difficulty is better sung to "Supercalifragilisticexpialidocious"
How to Give a Math Lecture at a Party.
> Superpolynomial subexponential runtimes.
> Even though in practice it would take you several lifetimes,
> If you ran it long enough you'd always find those two primes.
> Superpolynomial subexponential runtimes
>
> E to the root-log root-log-log [4x]
...and that's just the session key, i.e., the introduction.
The actual data stream uses a different encryption method.
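For the curious, the "E to the root-log root-log-log" line matches the heuristic running time usually quoted for sieve-style factoring methods such as the quadratic sieve, roughly

exp( (1 + o(1)) * sqrt(ln n * ln ln n) )

where n is the modulus being factored: superpolynomial in the number of digits, yet subexponential. The general number field sieve improves the exponent to about (ln n)^(1/3) * (ln ln n)^(2/3).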
-
You still don't get it. There are lots of different compression algorithms, but they all work only in so far as the data is patterned, as Mike put it. Each one is tuned to take advantage of the kind of pattern in its data. Trivial example: ASCII text only uses about 70 of the 256 available codes.
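As a rough Python illustration of that point: the zeroth-order entropy of the byte histogram of English-ish text comes out well under the 8-bit ceiling, and that slack is exactly what a compressor cashes in. The sample string below is made up; any plain text behaves similarly.
Code:
import math
import os
from collections import Counter

def bits_per_byte(data: bytes) -> float:
    # Empirical (zeroth-order) Shannon entropy of the byte histogram
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

sample = b"To achieve compression, a shorter sequence must stand for a longer one. " * 50
print(bits_per_byte(sample))               # roughly 4 bits per byte for text like this
print(bits_per_byte(os.urandom(100_000)))  # very close to the 8-bit maximum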
Lossy compression also exploits the limitations of the human senses, so the algorithms are interesting in so far as they have to mimic our own perception mechanisms. Maybe the guts of the MP3 codec have something to tell us about music theory.
But random data can't be compressed; this is as fundamental to information theory as the laws of thermodynamics are to regular engineering. The lack of patterns means that any scheme you can dream up will fail in general. Patterns can appear by chance, but the odds of any given pattern are so slim that the coding overhead outweighs the occasional gain.
The meaning of "random" is a deeper puzzle still. Shannon's information theory was concerned with distinguishing signal from noise, which is a very similar idea. The encrypted data on the Kremlin hotline would have been indistinguishable from random noise by a wiretapper, but if you have the matching key tape, it's not random at all.
-
Originally posted by Steve Conner View Post
You still don't get it. [...] But random data can't be compressed; this is as fundamental to information theory as the laws of thermodynamics are to regular engineering. [...]
The Turing quote was taken out of context and refers to generating putative random sequences by arithmetic methods. His contention was simple: purely deterministic algorithms do not generate random sequences; they make long deterministic sequences. Example: the linear congruential rand() function in Microsoft's VC++ Express release.
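To make the "long deterministic sequence" point concrete, here is a linear congruential generator sketched in Python, using the multiplier and increment constants commonly reported for Microsoft's C runtime rand() (treat the exact constants as an assumption, not a specification). Same seed in, same "random" numbers out, every time.
Code:
def lcg_rand(seed, count):
    # Linear congruential generator: state <- state*214013 + 2531011 (mod 2**32),
    # returning the top 15 bits of the state on each call.
    state = seed
    values = []
    for _ in range(count):
        state = (state * 214013 + 2531011) & 0xFFFFFFFF
        values.append((state >> 16) & 0x7FFF)
    return values

print(lcg_rand(42, 5))
print(lcg_rand(42, 5))   # identical output: deterministic, not random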
In practice, most random sequences are less random than they seem. That is, perfect randomness does not exist except as a mathematical abstraction, and it should not be assumed to be absolute except for purposes of discussion.
Data resistant to RLE, Huffman, or LZW compression will often squeeze down a little with bzip2 or a fractal algorithm. It depends on the timescale and how much computational smarts you want/need to throw at the problem.
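Here is a rough Python comparison of two of those families on the same made-up input, DEFLATE-style zlib versus block-sorting bz2; which one does better depends entirely on what kind of structure the data has, which is the whole point.
Code:
import bz2
import zlib

# Mildly structured test data: interleaved short cycles of different periods.
data = b"".join(bytes([i % 7, i % 11, i % 13]) for i in range(200_000))

print("zlib:", len(zlib.compress(data, 9)))
print("bz2: ", len(bz2.compress(data, 9)))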
Does it matter? Not at all in a pickup makers' forum.
-
OK, I agree with that, and with the doubtful sanity of starting this thread in the first place. It was useful for me, though: I bought a copy of the Harold McGee book and am loving it.
Would you accept this milder version of my original claim:
Perfectly random data would be incompressible if it existed, except it doesn't.
And the corollaries:
The closer data is to being perfectly random, the less it can be compressed by any conceivable algorithm.
The better a compression algorithm is, the harder its output is to distinguish from random data. (Unless you have the decompressor.)
In information theory terms, you could say that lossless compression must preserve the total entropy of the data, so if you want to reduce the number of bits, the entropy per bit must increase. However, perfectly random data already has the maximum possible entropy per bit.
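In symbols: a source emitting N distinct symbols with probabilities p_i has entropy

H = -sum_i p_i * log2(p_i) <= log2(N),

with equality only when every symbol is equally likely, and Shannon's source-coding theorem says no lossless code can average fewer than H bits per symbol. Uniformly random bytes sit exactly at 8 bits per byte, so there is nothing left for a compressor to shave off on average.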
I am open to suggestions to move this whole mess to the Parking Lot.
-
Originally posted by Mike Sulzer View Post
I am reminded of Freud. Starting from scratch and working with a tiny subset of highly neurotic people in a particular European subculture, he set out to explain the workings of the human mind.
OK, so I exaggerate a little bit.
The ancients knew this in the form that the lengths of strings yielding musical tones were the ratios of small integers. What they didn't know was why.
-
Originally posted by fieldwrangler View Post
What was Helmholtz's answer to this?
The long form: it gets pretty complex, especially the explanation of why some chords sound bad. But Helmholtz's theory of this seems to be standing up even in this day of computer music.
I'd suggest reading (or at least skimming) the book, but Helmholtz was a German Professor, and I'll grant the book can be a bit of a slog.
I found his experimental methods quite interesting and inventive. Remember, this was decades before the invention of the vacuum tube - there was no such thing as electronics in Helmholtz's day.
-
A message written in an alphabet of infinitely many symbols is what I think you want... If that is the case, it should be simple to prove that a message of infinite size made from an alphabet of infinitely many symbols is probably incompressible. I don't know of a source with more of that theory to point you to, though. It's not the case of white noise, images, music or text, all of which have finite alphabets.
Every message must be made from a finite set of symbols: the alphabet. Using the English alphabet, the main Twitter feed is a source of perfectly random data.
If every person on Twitter played musical notes at the same intensity, rather than writing English, it would probably sound like white noise, but I just made that up and have no proof to offer.
-
Originally posted by jmaf View Post
It's not the case of white noise, images, music or text, all of which have finite alphabets.
There are some subtleties relating to types of patterns and lossy or lossless compression, but enough of this.
-
Originally posted by Steve Conner View Post
On a related note: The more complicated jazz is, the more the hipsters like it. So why is white or pink noise not praised as the ultimate form of jazz? Being totally random, it has the maximum possible algorithmic complexity.
By contrast, music, natural languages, genome sequences, and the like, all follow 1/f statistics, also known as pink noise.
Jazz may be a paler shade of pink, but it's still pink.
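For anyone curious what that means in practice, here is a rough Voss-McCartney-style sketch in Python: summing white-noise rows that refresh at octave-spaced rates gives an approximately 1/f ("pink") spectrum, whereas a single random value per sample would be white noise. The function name and row count are arbitrary choices.
Code:
import random

def pink_noise(n_samples, n_rows=12):
    # Voss-McCartney approximation: row r is refreshed only every 2**r samples,
    # so fast-updating rows supply the treble and slow-updating rows the bass.
    rows = [random.uniform(-1.0, 1.0) for _ in range(n_rows)]
    samples = []
    for i in range(n_samples):
        for r in range(n_rows):
            if i % (2 ** r) == 0:
                rows[r] = random.uniform(-1.0, 1.0)
        samples.append(sum(rows) / n_rows)
    return samples

noise = pink_noise(44_100)   # one second's worth at 44.1 kHz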
-
Originally posted by Mike Sulzer View Post
White noise, the analog signal, has infinite bandwidth, infinite power (variance), and infinite peak power with vanishing probability.
Originally posted by Mike Sulzer View Post
but enough of this.