
How long can digitally stored information last?



Philippe Lemay
2010-May-15, 07:22 AM
I remember hearing somewhere that a person who was (hypothetically) stored in cryostasis would have a limited shelf life. After a few hundred to a few thousand years, the naturally decaying isotopes in his/her body would cause radiation damage to his/her genetic material.

This got me thinking... a computer, when it's unplugged, is kind of like a person. Only instead of cognitive and genetic information, you have semiconductors and digital information. So I just wanted to know, would a computer be subject to similar long-term limitations? After a number of years, would a computer that had just been sitting there idle suffer from data loss? Would keeping it plugged in help somehow?


That's the first part of my reasoning. Taking it further, I started to wonder about machine immortality.
We all know the concept: if our bodies will eventually die, why not craft artificial bodies that can live forever? I wonder if the limitation on computer storage might mean that, even if we could download ourselves into computers or machine bodies, our minds, our sense of self, our "souls" would gradually decay and be lost.

And if that is the case, which is likely to be more resilient to the passage of time: genetically, neurologically, or digitally stored information? Or maybe... something like analog? lol, storing a human mind on nanoscopically small spools of tape.

Geo Kaplan
2010-May-15, 08:25 AM
I remember hearing somewhere that a person who was (hypothetically) stored in cryostasis would have a limited shelf life. After a few hundred to a few thousand years, the naturally decaying isotopes in his/her body would cause radiation damage to his/her genetic material.

This got me thinking... a computer, when it's unplugged, is kind of like a person. Only instead of cognitive and genetic information, you have semiconductors and digital information. So I just wanted to know, would a computer be subject to similar long-term limitations? After a number of years, would a computer that had just been sitting there idle suffer from data loss? Would keeping it plugged in help somehow?

Data storage devices are indeed the most volatile. Both "flash" memories (based on charge storage) and magnetic memories (disk drives) degrade even when not powered. Current design practice for consumer devices is to provide a 10-year lifetime at elevated temperature (e.g., 100 or 125 °C). The fundamental limiting error mechanisms are thermally-driven charge loss in flash devices and thermally-driven demagnetization in hard drives. Longer lifetimes are possible if larger memory cells are used (larger data signal-to-thermal-noise ratio), the operating temperature is lowered, and additional error correction (including redundancy) is employed. Extrapolated lifetimes of centuries have been achieved for some devices.
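To get a feel for how strongly temperature enters those retention numbers, here is a minimal Python sketch of the Arrhenius-type acceleration model commonly used for such extrapolations. The 1.1 eV activation energy is just an assumed placeholder; real devices have several competing failure mechanisms, so treat the output as an order-of-magnitude illustration, not a spec.

[code]
import math

def acceleration_factor(t_use_c, t_stress_c, ea_ev=1.1):
    """Arrhenius acceleration factor between a stress temperature and a use
    temperature, for an assumed single activation energy ea_ev (in eV)."""
    k_b = 8.617e-5                 # Boltzmann constant, eV/K
    t_use = t_use_c + 273.15       # convert Celsius to kelvin
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / k_b) * (1.0 / t_use - 1.0 / t_stress))

# A part rated for 10 years of retention at 125 degrees C would, for the same
# (assumed) mechanism, hold its charge far longer at room temperature:
af = acceleration_factor(25, 125)
print(f"acceleration factor: {af:,.0f}")
print(f"extrapolated retention at 25 C: {10 * af:,.0f} years")
[/code]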

Philippe Lemay
2010-May-15, 05:25 PM
That's really interesting...
So, coming back to my cryo-stasis analogy, keeping a computer very, very cold would help it preserve the information it holds for far longer?

Thanks for clarifying those details; that should help my thought process. Especially that lifetime-of-centuries estimate, that'll really help me.

dgavin
2010-May-15, 07:09 PM
Holographic storage (HDSS), which IBM has been working on for years now, has a longer lifetime and is more resistant to thermal variances, but eventually the medium the holograms are stored in will degrade. It's expected, though, that a holocube will have about a thousand-plus-year life on data retention.

However, IBM hasn't released their HDSS for commercial use yet, even though it's been in operation on some of their systems for about 5 years. While the holocubes are small (from about the size of a grain of sugar up to a 3/4 inch by 3/4 inch cube), the apparatus for storing and retrieving the information in them is about the size of a desk.

clint
2010-May-15, 07:25 PM
I wonder if the limitation on computer storage might mean that, even if we could download ourselves into computers or machine bodies, our minds, our sense of self, our "souls" would gradually decay and be lost.
In a scenario where we can download ourselves into artificial machines, decay would only limit the lifetime of our current "container", wouldn't it?
If it couldn't be repaired or updated anymore, we could just transfer to the next one.

This could still be a problem, if for any reason there were no replacement containers available when needed.

Philippe Lemay
2010-May-15, 08:14 PM
There's also the problem of signal degradation; "diminishing returns" is the technical term, I think.

Every time you transfer from one body to another you are effectively making a copy of your mind and downloading it into the new body. And as you know, after making a copy of a copy of a copy, signal degradation starts to creep in. You might be able to come up with some kind of lossless format for your mind's information, but then you would probably need so much storage capacity that your artificial "body" would be the size of a building.

01101001
2010-May-15, 09:36 PM
And as you know, after making a copy of a copy of a copy, signal degradation starts to creep in.

I don't know that. Digital copies are bit for bit identical -- unless the copier is careless.

Philippe Lemay
2010-May-15, 10:53 PM
Are you sure? They're perfectly identical?

What about things like JPG damage, would that fall under careless copying?

Grashtel
2010-May-15, 10:56 PM
Are you sure? They're perfectly identical?

What about things like JPG damage, would that fall under careless copying?
Yes, digital copies, assuming that the copier is reliable (and there are ways to cope with unreliable copiers), will be perfectly identical.

The damage from JPEG compression happens because it uses a lossy algorithm; in other words, it works by deliberately throwing away information. That is why JPEGs can be so much smaller than the same image compressed losslessly: there is simply less information to encode.
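A quick way to convince yourself of the bit-for-bit claim is to copy a file and compare cryptographic hashes. A minimal Python sketch (the file names are just placeholders); note that re-encoding a JPEG is a different operation from copying it:

[code]
import hashlib
import shutil

def sha256_of(path):
    """Return the SHA-256 digest of a file's raw bytes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

shutil.copyfile("photo.jpg", "photo_copy.jpg")                 # a plain byte copy
print(sha256_of("photo.jpg") == sha256_of("photo_copy.jpg"))   # True: identical bits

# Opening the JPEG and saving it again re-runs the lossy encoder and throws
# more information away each time; that is the "JPG damage", not the copying.
[/code]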

Geo Kaplan
2010-May-16, 01:52 AM
There's also the problem of signal degradation; "diminishing returns" is the technical term, I think.

The wonderful thing about the digital paradigm is that it is possible to make "perfect" copies. More precisely, there is a definite prescription for achieving an arbitrarily low (but nonzero) error rate. The tradeoff is in cost, size, power, etc. Standard practice in consumer goods is to make the thing as cheap as possible, given some minimally acceptable reliability. If the goal were to shift to "Assure an rms error of less than 1 part in 10^big in a millennium", that could be accommodated too, with appropriate design choices. So, it's not so much a question of "is it possible", but rather of "how much would it cost?"
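One deliberately crude version of that prescription is plain replication with majority voting: store each bit N times and the chance of losing it falls off rapidly with N. A minimal Python sketch, with the per-copy error probability as an assumed input (real systems use far more efficient codes such as Reed-Solomon, but the cost-versus-reliability tradeoff has the same shape):

[code]
from math import comb

def majority_vote_error(p_copy, n_copies):
    """Probability that a majority of n_copies (odd) independent copies are
    wrong, when each copy is corrupted with probability p_copy."""
    need = n_copies // 2 + 1
    return sum(comb(n_copies, k) * p_copy**k * (1 - p_copy)**(n_copies - k)
               for k in range(need, n_copies + 1))

# Even on a sloppy medium (say a 1% chance of losing any one copy),
# piling on redundancy drives the residual error rate down very fast:
for n in (1, 3, 5, 9, 15):
    print(n, majority_vote_error(0.01, n))
[/code]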

Philippe Lemay
2010-May-16, 03:39 AM
So... assuming we wanted to keep a digitally stored database intact for a few millennia, rather than designing a big, complex mechanism that is super-cooled to prevent that thermal loss thing, one could just swap the information onto a fresh hard drive (or flash drive) every few years, in order to stave off the 10-year "thermally-driven demagnetization" or "thermally-driven charge loss".

Or, if you wanted to be REALLY sure that no data would be lost, you could use the bulky super-cooled storage device AND swap it out every few years.

Edit:
Do you think the process could be automated somehow? Assuming a computer has enough spare internal memory to shuffle its files around, could it do so when it detects that a file has gone unused for a few years and is approaching a high probability of thermally-driven loss? Might they already have implemented this in modern computers..?
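Something along these lines is what storage people call "scrubbing": periodically re-read (and, for media that need it, rewrite) everything so latent decay never accumulates. A minimal Python sketch of the detect-and-rewrite idea, assuming a plain rewrite is enough to refresh the medium; the two-year threshold and the /archive path are arbitrary placeholders, and real systems (ZFS scrubs, SSD controllers) do this below the file system, where checksums can be verified as well:

[code]
import os
import time

REFRESH_AFTER = 2 * 365 * 24 * 3600   # rewrite anything untouched for ~2 years

def scrub(root):
    """Walk a directory tree and rewrite long-untouched files so the
    underlying charges/magnetic domains get refreshed."""
    now = time.time()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if now - os.path.getmtime(path) < REFRESH_AFTER:
                continue
            with open(path, "rb") as f:
                data = f.read()        # this is also where a checksum would be verified
            with open(path, "wb") as f:
                f.write(data)          # rewrite in place to refresh the stored bits

scrub("/archive")                      # placeholder path
[/code]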

01101001
2010-May-16, 04:00 AM
[...] one could just swap the information into a fresh hard-drive (or flash-drive) every few years.

It's pretty much required with modern digital media. Formats change. Devices change. Infrastructure changes. It's really hard to read old media without the old machines. Hand in glove.

mugaliens
2010-May-16, 05:52 AM
When digital information is stored in certain types of RAID arrays, the data, both duplicated and distributed over multiple drives, is subjected to integrity algorithms which can both detect errant data and restore correct data.

On simple RAID arrays, there's still a small chance of a catastrophic event causing corrupt data. However, the concept can be taken to the extreme, including multiple/redundant servers, power systems, backups, locations, protection software and hardware, and even file systems. At some point, you have a greater chance of a neuron in your brain going supernova when struck by the God (or is it OMG?) particle than of permanently losing a single bit.

Bearded One
2010-May-16, 05:53 AM
Modern hard drives are actually quite error prone. They store error correction data that allows them to rebuild the original data if read errors occur. They also monitor the error levels for sectors and if they exceed a preset threshold they map out those sectors and replace them with spares. If they run out of spares you start to get those "S.M.A.R.T." disk warnings telling you your drive may be about to fail. Then you are supposed to buy a new drive. Add mirror copies and shadow reads then repeat. Indefinite storage.
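A toy model of that remapping logic, in Python. The error threshold and spare count are made-up numbers and real firmware is vastly more involved, but the bookkeeping looks roughly like this:

[code]
class ToyDrive:
    """Minimal model of sector remapping: a sector whose error count crosses a
    threshold is transparently redirected to a spare; running out of spares is
    what would trigger a S.M.A.R.T.-style warning."""

    ERROR_THRESHOLD = 10

    def __init__(self, n_sectors=1000, n_spares=50):
        self.error_counts = [0] * n_sectors
        self.remap = {}                          # logical sector -> spare index
        self.free_spares = list(range(n_spares))

    def report_read_error(self, sector):
        self.error_counts[sector] += 1
        if self.error_counts[sector] >= self.ERROR_THRESHOLD and sector not in self.remap:
            if not self.free_spares:
                print("S.M.A.R.T.-style warning: out of spares, back up now")
                return
            self.remap[sector] = self.free_spares.pop()

    def physical_location(self, sector):
        """Where a read for this logical sector actually goes."""
        return ("spare", self.remap[sector]) if sector in self.remap else ("data", sector)
[/code]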

Geo Kaplan
2010-May-16, 06:16 AM
Modern hard drives are actually quite error prone. They store error correction data that allows them to rebuild the original data if read errors occur. They also monitor the error levels for sectors and if they exceed a preset threshold they map out those sectors and replace them with spares. If they run out of spares you start to get those "S.M.A.R.T." disk warnings telling you your drive may be about to fail. Then you are supposed to buy a new drive. Add mirror copies and shadow reads then repeat. Indefinite storage.

Indeed; without error correction, CDs, DVDs, hard disks and flash drives would either perform badly or be prohibitively expensive. Only about 50-70% of a hard disk surface contains user data. The rest is used for servo and error correction. As I said earlier, these devices are optimized for cheapness. They would look quite a bit different if longevity were the goal instead. But the main point is that, thanks to error correction (merci, M. Galois), extremely good reliability is possible.

harkeppler
2010-May-16, 09:17 AM
Several libraries are now having problems with commercially produced CD-ROMs from the '80s: they are decomposing because the plastic is somewhat unstable. Magnetic tapes from the '60s are demagnetizing and decaying to dust. Plastics often are not very long-lived; in most cases, chemicals in the plastic wreak havoc. A gramophone record has a better probability of survival.

Maybe for "extremly long term use" another methode should be developed. Assuming very small platin points embedded in a quarz-matrix produced by chemical vapor deposition methods which are read out by a laser will last very long, perhaps billion of years.

Hornblower
2010-May-16, 10:58 AM
In a nutshell, it appears that ongoing maintenance is a must. A good grade of archival paper and printer's ink may outlast our high-tech media.

HenrikOlsen
2010-May-16, 01:21 PM
There's a parallel in movies: the very oldest movies are preserved because copyright legislation at the time required a paper copy to be filed with the Library of Congress, so contact negatives on paper were made of the movies and saved. (For information retention, think cuneiform on fired clay.)
Paper prints last a long time compared to celluloid; if the latter is simply put in film cans and left on a shelf, it'll crumble to dust in a decade or two.
As a result there's a middle section of movie history which is basically gone, with no possibility of recovery.

cjameshuff
2010-May-16, 05:58 PM
That's really interesting...
So, coming back to my cryo-stasis analogy, keeping a computer very, very cold would help it preserve the information it holds for far longer?

Not too cold. Changes in the electrical and magnetic properties with temperature could lead to data loss as well.

Also, as mentioned, the error rate on modern high density hard drives is substantial. They are usable because they incorporate redundant storage and error detection and correction codes, and are thus able to achieve high reliability on an unreliable medium, while also achieving far higher storage densities. The same could be done with other storage media to greatly extend useful lifetime, putting off data corruption until the hardware is almost completely unusable. I suspect the estimate of centuries is for inert, unpowered flash devices using these techniques.

In flash memory, the stored charge that determines what a cell holds dissipates over time. If the memory is in a powered device, reading and re-writing it would refresh these charges. Flash has a limited number of read/write cycles it can endure before starting to fail, but if you're only rewriting every few years to refresh data, that's infrequent enough that the eventual failure mode will be something unrelated.

There are also devices such as FeRAM, which store data in a different manner (it's essentially integrated-circuit magnetic core memory) and tolerate far more read/write cycles. They have data retention of 10 years, but perform a refresh operation as part of their read operation. With redundant storage and error correction, any data not kept refreshed will be that which hasn't been accessed in decades.

And yes, digital copies are absolutely identical. You can copy a JPEG file as many times as you want. It's the encoding step that loses data.

If long-term maintenance is not practical or desired, dedicated archival data formats could be made that would achieve extreme longevity. Etch pits on sheets of some stable metal foil (stainless steel, perhaps), with lasers or photolithographic techniques, store in dry nitrogen underground, where it's protected from temperature swings. For archival purposes, write-once storage is not a problem, and something like this could be made relatively easy to read, at low densities a common scanner and some special software could do the job. This could achieve indefinite storage lifetimes, mainly limited by the geological stability of the chamber it's stored in. It isn't "live" storage, though, it's specifically built for long-term archival with low access rates.

The electronics will age as well, especially if powered. Semiconductors rely on trace dopant materials being embedded in extremely fine patterns in the substrate. These will diffuse through the substrate over time, faster at higher temperatures. Worse, when powered, electromigration effects will drive diffusing atoms to move along electrical fields. These effects will be stronger in higher density circuitry and in higher power circuitry, and will eventually cause failure. Lower density, lower power circuitry will last longer.

neilzero
2010-May-16, 09:44 PM
If we store the digital data on stone tablets, it has a shot at being readable in 10,000 years, even longer on platinum tablets. From a practical standpoint, recovery is expensive after about 5 years. Few of us own a working drive to read the floppy disks we might have made 5 years ago, just before we retired our 1999 computer. Neil

Geo Kaplan
2010-May-16, 10:20 PM
If we store the digital data on stone tablets, it has a shot at being readable in 10,000 years, even longer on platinum tablets. From a practical standpoint, recovery is expensive after about 5 years. Few of us own a working drive to read the floppy disks we might have made 5 years ago, just before we retired our 1999 computer. Neil

Yes, there is indeed the separate (and serious) problem of longevity of standards, on top of the one of longevity of the bits themselves. Even if the data in my flash memory remains intact for a millennium, will our descendants be able to understand the file formats? That question is a big one for archivists.

harkeppler
2010-May-16, 10:54 PM
It would be useful to store information about the data formats somewhere that isn't subject to media decay...

Maybe a large black stone would come in handy: inscribe all the format information on it in plain text and put it in an unproblematic environment, maybe the lunar regolith.

swampyankee
2010-May-17, 01:25 AM
There are two problems. One is the possibility of deterioration, which can be minimized by proper choices of substrates and storage conditions. The second is that digital data can only be recovered if the proper hardware and software exists to read the data. Those IBM HDSS cubes are completely useless, regardless of how long they can keep data, if the machinery to read them is no longer available.

Jens
2010-May-17, 02:09 AM
That's the first part of my reasoning. Taking it further, I started to wonder about machine immortality.


I think a difficult issue here is that we naturally suffer from data loss in everyday life, and are constantly gaining new information, so in a sense we are not exactly the same person that we were ten minutes earlier. So the idea that you need to be perfectly preserved to be immortal may be an unnecessary assumption. Perhaps you can experience data loss but still gain new data and keep a stream of self.

mugaliens
2010-May-17, 06:30 AM
In a nutshell, it appears that ongoing maintenance is a must. A good grade of archival paper and printer's ink may outlast our high-tech media.

While you're right, the equivalent of what's stored in my system, printed on paper, would probably be more than enough to fill my entire apartment. Meanwhile, I can copy from my dual RAID arrays to new ones every couple of years, and two 1 TB drives providing rotating offsite storage answer the issues with natural disasters.

Philippe Lemay
2010-May-17, 01:30 PM
Maybe for "extremly long term use" another methode should be developed. Assuming very small platin points embedded in a quarz-matrix produced by chemical vapor deposition methods which are read out by a laser will last very long, perhaps billion of years. Hey, hold on a minute. Quartz is basically just silicon oxide... so is a quartz crystal also a semiconductor, like regular silicon?


It would be useful to store information about the data formats somewhere that isn't subject to media decay...

Maybe a large black stone would come in handy: inscribe all the format information on it in plain text and put it in an unproblematic environment, maybe the lunar regolith.

lol, 2001 monolith reference?

mike alexander
2010-May-17, 01:59 PM
Active maintenance of information with error correction and redundant storage has proven remarkably versatile. Life has outlasted the collisions of continents.

01101001
2010-May-17, 02:19 PM
It would be useful to store information about the data formats somewhere that isn't subject to media decay...

The big brains at the Long Now Rosetta Project (http://rosettaproject.org/about/) seem to have gone gaga for laser-etched glyphs on nickel under glass for a prototype. It's just a small amount of data, though: textual information about all the world's languages, some 14,000 pages.

jj_0001
2010-May-17, 07:06 PM
Are you sure? They're perfectly identical?


The bits are identical, unless the copier is bad. Period. No ifs, ands, or buts.




What about things like JPG damage, would that fall under careless copying?

This is "lossy encoding damage" and has nothing at all to do with digital storage robustness or copying.

jj_0001
2010-May-17, 07:38 PM
If I may butt in for a minute, this (digital signal processing) being my livelihood and all that...

Analog storage (by which is meant a continuous-time/space, continuous-level analog) incurs damage at every copy, and thermal noise, etc., will degrade it over time on any kind of medium as well. Perfect copying is impossible, due to simple, obvious stuff like the charge on the electron and the fact that we do not operate at absolute zero. All it takes, by the way, is the electron charge, and you're sunk. There is a similar problem, say, for audio in the atmosphere: the momentum and quantity of air molecules set a limit on the noise floor, and one that isn't too far below human hearing ability, either. Ditto for photon capture, where the energy per photon is a limit, etc. At the lowest level, everything is quantized and probabilistic, and therefore noisy. Ergo, no exact analog copy, and yes, the noise levels are entirely high enough to matter in everyday life.

Digital storage (by which is meant a sampled-time/space, quantized-level analog of an original) exists as discrete points. Even as discrete points in an inefficient representation, these are more error-immune. Converted to bits, the error immunity goes up. In either case, they can be copied over and over and over without any loss whatsoever BECAUSE THE SAMPLING POINTS AND LEVELS THAT ARE PERMITTED ARE KNOWN IN ADVANCE. You can "correct" the data on every pass, though. There is, however, a gotcha, in that information is lost (bandwidth constraints and noise floor constraints) AT THE FIRST CAPTURE, and you're stuck with that forever.

Bits, by themselves, are a very robust storage mechanism, because each one uses the entire range of the voltage/magnetism/whatever that encodes it. However, in order to get space-efficient storage, one reduces the "range" until the storage becomes less robust again (see the discussion of data on disc above for an example), and then "FEC" (forward error correction) is added to restore the original data.

The effect of FEC, etc, is that until you have a certain amount of noise growth, you get the data back perfectly. Exceed this by a teensy-tiny bit, and it's all gone, so long, farewell. The good news is that the analog signal would be seriously hurting by that point as well.

Basically, what happens in digital capture of a signal (or a signal that originates in bits) is that the amount of degradation required to cause any loss beyond the original capture is much, much more than for an analog capture; HOWEVER, once you hit that level, it's gone. Poof.

The key to keeping digital storage working is occasional copying; that way you get all the data (bits) back to known levels. Some kind of optical storage, perhaps with a better mask than aluminium, ought to offer substantial robustness, especially if a much greater diversity (both in interleaving and FEC) than present CD/DVD/Blu-ray is used. (Using buffering and interleaving means you don't take losses from scratches, holes, etc.: each block of data is spread out so that the FEC can still work even if 'x' amount is missing.)

For instance, if you take at least one 32 bit RSC and use only the first 16 bits of it, keeping the other 16 bits fixed in a given pattern, you can detect and correct up to 25% total errors in a given block. So if you lose 24% you're cool. 26% and you have no idea what your data was, but at least you'll detect that you lost it. So, you store twice as much data as you have information in order to make sure you get it back.

This kind of system leads, in modem setups, to a situation where a drop of .1dB in SNR causes you to go from lossless to 1 mistake/minute and a drop of another .1dB means "there is no data here". The cliff is very, very steep. Either you get it all, or you just don't get anything.

There is something called the Shannon bound that describes the maximum information you can transmit; good modems and storage mechanisms get pretty close, but it takes infinite time and processing to actually hit the Shannon bound, of course. :)
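For reference, the bound being invoked here, for a band-limited channel with additive white Gaussian noise, is C = B log2(1 + SNR). A minimal Python sketch with made-up bandwidth and SNR numbers; note that the cliff described above is not in this formula itself (capacity falls smoothly with SNR), it comes from running a fixed-rate code very close to capacity, so that a tiny SNR drop pushes the channel below the rate the code needs:

[code]
import math

def shannon_capacity(bandwidth_hz, snr_db):
    """Shannon capacity of an additive-white-Gaussian-noise channel, bits/s."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

# A phone-line-like channel of ~3.1 kHz bandwidth at a few nearby SNRs:
for snr_db in (30.0, 29.9, 29.8):
    print(f"{snr_db} dB -> {shannon_capacity(3100, snr_db):,.0f} bit/s")
[/code]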

mugaliens
2010-May-18, 02:56 AM
You can "correct" the data every pass though. There is, however, a gotcha, in that information is lost (bandwidth constraints and noise floor constraints) AT THE FIRST CAPTURE, and you're stuck with that forever.

If you're simply downloading the data, say, from online, various algorithms have been used since 1986's ZMODEM protocol (http://www.omen.com/zmdmev.html) to ensure that what's saved to one's hard drive matches what existed on the originating drive.


For instance, if you take at least one 32 bit RSC and use only the first 16 bits of it, keeping the other 16 bits fixed in a given pattern, you can detect and correct up to 25% total errors in a given block. So if you lose 24% you're cool. 26% and you have no idea what your data was, but at least you'll detect that you lost it. So, you store twice as much data as you have information in order to make sure you get it back.

Similarly, at a higher level, if you XOR two groups of data, whether byte-wise or block-wise, and store the result, known as the parity set (a 33-1/3% hit in terms of storage space), you possess the ability to recreate any one of the three groups of data by performing an XOR on the other two. Keeping each of the three on separate drives is the simplest version of RAID 5, although more common implementations use four disks. Thus, the failure of any single drive results in zero loss to the data set. The similar, but more redundant, RAID 6 can survive the loss of two drives without loss of data.
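A minimal Python sketch of that XOR-parity reconstruction (the block contents are made up, and real RAID 5 rotates the parity across the drives and works at the sector level rather than on tidy byte strings):

[code]
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))

# Two data blocks on two "drives", plus their parity block on a third.
d0 = b"first data block"
d1 = b"2nd data block!!"              # same length as d0
parity = xor_blocks(d0, d1)

# Suppose the drive holding d1 dies. XORing the survivors rebuilds it:
rebuilt_d1 = xor_blocks(d0, parity)
assert rebuilt_d1 == d1
print(rebuilt_d1)
[/code]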

cjl
2010-May-18, 03:05 AM
Indeed; without error correction, CDs, DVDs, hard disks and flash drives would either perform badly or be prohibitively expensive. Only about 50-70% of a hard disk surface contains user data. The rest is used for servo and error correction. As I said earlier, these devices are optimized for cheapness. They would look quite a bit different if longevity were the goal instead. But the main point is that, thanks to error correction (merci, M. Galois), extremely good reliability is possible.

I'm almost positive (though I know someone in the hard drive industry, and I'll ask them just to make sure) that modern hard drives have far, far less than 30-50% of the disk reserved for servo and error correction. If I remember right, it's more like a few percent. I'll ask to make sure though.

Geo Kaplan
2010-May-18, 04:46 AM
I'm almost positive (though I know someone in the hard drive industry, and I'll ask them just to make sure) that modern hard drives have far, far less than 30-50% of the disk reserved for servo and error correction. If I remember right, it's more like a few percent. I'll ask to make sure though.

You might be confusing file system overhead with what I'm referring to. In any case, I'd be very interested in what your friend says. Even in the old days without sophisticated error correction, there was already a ~20% overhead for non-user bits/servo. It's only gotten worse as the industry has pushed for larger areal densities. There is no other consumer technology that walks to the edge of the cliff of death on so many fronts simultaneously. Hard drives are a miracle.

cjameshuff
2010-May-18, 12:08 PM
Hey, hold on a minute. Quartz is basically just silicon oxide... so is a quartz crystal also a semiconductor, like regular silicon?

And water's mostly oxygen by mass. Can you breathe it? And table salt's made of sodium and chlorine...sounds like pretty nasty stuff. And is aluminum oxide a soft metallic substance?
No, it's a hard, transparent insulator. And so is silica. One of the things that makes silicon valuable is not only its electrical properties, but the fact that it is easy to form insulating layers by oxidizing it.

Geo Kaplan
2010-May-18, 01:33 PM
Hey, hold on a minute. Quartz is basically just silicon oxide... so is a quartz crystal also a semiconductor, like regular silicon?

Quartz (crystalline silicon dioxide) is actually an excellent insulator. It also exhibits an extremely valuable property that silicon itself lacks: piezoelectricity. Discovered by the brothers Curie (before Pierre met Marie), piezoelectricity is what allows quartz to provide the precise timing needed by everything from laptops to wristwatches to cell phones. Quartz is chemically very stable, and so it can maintain oscillation frequencies to ppm accuracy over long periods of time.

In its amorphous form, silicon dioxide provides the critical insulation in integrated circuits. Arguably, the ease with which silicon forms this excellent insulator is a major reason why silicon, and not "superior" semiconductors (such as GaAs), is the favored material for electronics today.

jj_0001
2010-May-18, 06:07 PM
If you're simply downloading the data, say, from online, various algorithms have been used since 1986's zmodem protocol (http://www.omen.com/zmdmev.html) to ensure the integrity of what's saved to one's hard drive matches what existed on the originating drive.


This is a fairly primitive protocol, but one that works rather well (as do most). It has nothing to do with the information lost AT FIRST CAPTURE when you digitize something, so I'm not sure why you point this out?

Parity codes are, in my opinion, a simplification of things like RSC's, and RISC's. Each is good in its own place.

Ara Pacis
2010-May-19, 04:27 PM
Digital storage can last a long time and be easily translated. Consider the digital gesture of off-off-on-off-off. Most people in the world can understand that data with the Mark I eyeball.

Geo Kaplan
2010-May-19, 05:40 PM
Digital storage can last a long time and be easily translated. Consider the digital gesture of off-off-on-off-off. Most people in the world can understand that data with the Mark I eyeball.

You are conflating two things. Yes, discerning the bits themselves is not necessarily a big deal; as has been discussed in this thread, there are known paths to preserving them for essentially arbitrary lengths of time. But discerning bits, and discerning their meaning, are two very different things. If I give you a program written for, say, the PDP-1, you would find running it on your Windows PC a bit of a challenge, especially if I handed you the program on a roll of paper tape.

On top of that, file formats have changed a lot over the short lifetime of the digital age. Etc. The bits can live on forever, but whether or not we can do anything with them is a separate question entirely.

jj_0001
2010-May-19, 06:30 PM
On top of that, file formats have changed a lot over the short lifetime of the digital age. Etc. The bits can live on forever, but whether or not we can do anything with them is a separate question entirely.

This is a separate problem. It is indeed a problem, but it is not a fundamental question of how digital data can be captured and preserved; it is a question of maintaining hardware to do so.

While I don't have a paper tape drive any more, by the way, http://www.aracnet.com/~healyzh/pdp1emu.html will give you a good PDP 1 emulator.

Now, where's my DDP224 emulator?

Ara Pacis
2010-May-20, 04:23 AM
Too subtle?

Philippe Lemay
2010-May-20, 07:07 AM
And water's mostly oxygen by mass. Can you breathe it? And table salt's made of sodium and chlorine...sounds like pretty nasty stuff. And is aluminum oxide a soft metallic substance?
No, it's a hard, transparent insulator. And so is silica. One of the things that makes silicon valuable is not only its electrical properties, but the fact that it is easy to form insulating layers by oxidizing it.
Alright alright.. no need to get patronizing, I just liked the idea of super fancy future crystal computer circuits (of the Future!!).

How about carbon nanotubes? I hear those are looking interesting for integrated circuit applications, might they one day replace good ol' silicon?

Geo Kaplan
2010-May-20, 07:23 AM
How about carbon nanotubes? I hear those are looking interesting for integrated circuit applications, might they one day replace good ol' silicon?

Doubtful (how do you arrange for them to be where you want?), but their unrolled cousin, graphene, has a shot (at least as a supplement, if not as an out-and-out replacement). But I'm betting that silicon will dominate for a good while longer (decades).

cjameshuff
2010-May-20, 12:59 PM
Alright alright.. no need to get patronizing, I just liked the idea of super fancy future crystal computer circuits (of the Future!!).

How about carbon nanotubes? I hear those are looking interesting for integrated circuit applications, might they one day replace good ol' silicon?

Today's computers are built on crystalline silicon.
Carbon nanotube computers are unlikely. Even if we figure out how to make high quality carbon nanotube transistors, the nature of their manufacture is a problem. Silicon integrated circuits are a product of various etching and deposition processes done on big wafers containing many copies of a circuit. There's no analogous process for carbon nanotube circuitry...each circuit will have to be wired up a component at a time. Parallelizing the process involves duplicating the machinery. Production of nanotube circuitry will be quite expensive, and expense will scale with the complexity of the circuitry.

They'll probably find use in sensors and perhaps some other situations where a simple circuit is all that's needed, but I don't see computers being made from them. Maybe eventually a simple processor will be made as a lab stunt.

Graphene's a better candidate. It can be produced in layers on a substrate, where it may be feasible to cut it up and form circuitry. And diamond can perform as a semiconductor, and has benefits in heat dissipation. Both have the problem that carbon oxides are gaseous, so you can't just oxidize the surface to form an insulating layer.

cjl
2010-May-21, 12:21 AM
You might be confusing file system overhead with what I'm referring to. In any case, I'd be very interested in what your friend says. Even in the old days without sophisticated error correction, there was already a ~20% overhead for non-user bits/servo. It's only gotten worse as the industry has pushed for larger areal densities. There is no other consumer technology that walks to the edge of the cliff of death on so many fronts simultaneously. Hard drives are a miracle.

OK, I talked to the person I know, and she said that the current overhead is something like 20% of the disk's total space. It's quite a bit worse than I thought, but nowhere near 50% either. They definitely are impressive devices, that's for sure.

Ara Pacis
2010-May-22, 05:37 AM
Alright alright.. no need to get patronizing, I just liked the idea of super fancy future crystal computer circuits (of the Future!!).

How about carbon nanotubes? I hear those are looking interesting for integrated circuit applications, might they one day replace good ol' silicon?

How would you form them? I haven't heard about CNTs for logic circuits. Can they be mass-produced somehow, with some sort of stamping or photo technique?

It's a little off-topic, but I heard Michio Kaku talking about using CNT for high-density batteries.

mugaliens
2010-May-24, 10:14 AM
"graphene" aka graphite has been around for a couple hundreds of years and has found all sorts of solutions. Carbon nanotube not so much, but stay tuned, as research is being done on all fronts.

jj_0001
2010-May-24, 07:39 PM
OK, I talked to the person I know, and she said that the current overhead is something like 20% of the disk's total space. It's quite a bit worse than I thought, but nowhere near 50% either. They definitely are impressive devices, that's for sure.

Except that the overhead isn't "bad", per se. In fact, in order to crowd in the greatest amount of "final", i.e. useful, information, it is necessary to use higher densities than can be reliably decoded without FEC (forward error correction) and then use the FEC to recover the data reliably. It's an inevitable consequence of the Shannon capacity theorem, too. A bit hard to get around. :)

Geo Kaplan
2010-May-24, 10:51 PM
OK, I talked to the person I know, and she said that the current overhead is something like 20% of the disk's total space. It's quite a bit worse than I thought, but nowhere near 50% either. They definitely are impressive devices, that's for sure.

Thanks for relaying your friend's info. Please do note that I said "30-50%." So, the value she gave was close to the lower end of the range I gave, and roughly an order of magnitude larger than the value you cited. Also, it's not clear what's included in her calculation of 20%. If she's assuming headerless servo tracks, then 20% sounds (barely) reasonable to me. I don't know if those are now universally used, but in drives that don't, the overhead can exceed 30% rather easily.

And yes, disk drives are very impressive. It's hard to believe that they really work.

cjl
2010-May-25, 12:45 AM
True enough.

As for the overhead, the 20% figure included all disk space not visible to the user (I just confirmed it with her). Also, I just asked her about the "headerless servo tracks", and she was kind of perplexed as to what you meant - there aren't any dedicated servo tracks in modern drives, and the techniques used are fairly complex (and I don't understand all the details). It's definitely not 30% any more, though I didn't ask about past drives and overhead, so it might have been that high on previous models.

Geo Kaplan
2010-May-25, 12:55 AM
True enough.

As for the overhead, the 20% figure included all disk space not visible to the user (I just confirmed it with her). Also, I just asked her about the "headerless servo tracks", and she was kind of perplexed as to what you meant - there aren't any dedicated servo tracks in modern drives, and the techniques used are fairly complex (and I don't understand all the details). It's definitely not 30% any more, though I didn't ask about past drives and overhead, so it might have been that high on previous models.

There actually are still many drives (high-performance server class) that have a dedicated servo surface, but consumer drives almost always use embedded servos for cheapness. In the latter, the corresponding term of art would be "headerless servo packets" or some such thing.

cjl
2010-May-25, 04:45 AM
It has nothing to do with cheapness, and no modern drive has a dedicated servo surface (and they haven't for more than a decade). The track densities are high enough that if the head on one surface is dead on, the other heads could be dozens of tracks off. You have to use servo data interwoven with the disk data because of this.

Geo Kaplan
2010-May-25, 05:33 AM
{snip} no modern drive has a dedicated servo surface (and they haven't for more than a decade).

Thanks for bringing me up to date -- I do appreciate it. I appear to be stuck in the last decade.

jj_0001
2010-May-25, 05:08 PM
It has nothing to do with cheapness, and no modern drive has a dedicated servo surface (and they haven't for more than a decade). The track densities are high enough that if the head on one surface is dead on, the other heads could be dozens of tracks off. You have to use servo data interwoven with the disk data because of this.

Auto-synchronization is typical of any modern extremely-high-rate (meaning close to the Shannon Bound) modem signal, and requires overhead that can often also be used (at least most of it) for FEC. Modern discs effectively write a binary modem signal on to the disc, it's part of how the density is achieved. The small, fast processors that help the disc read actually do things like deconvolution of the head gap, etc, which also provides information on bit synchronization, before block, track, etc, synchronization are even involved, and long before interleaving and FEC get involved.

Philippe Lemay
2010-Jun-01, 12:40 AM
Question related to the initial topic of debate.

You know those external hard drives that are all so popular? I've been meaning to get one, and I noticed that there are small ones that can fit in your pocket and only need to be plugged into a USB port. I also saw others that were a bit bulkier, that held much more data, but that needed to be plugged into both a USB port and a wall socket. Strangely though, the bulky ones are cheaper than the USB-only ones. I guess it's due to a smarter architecture or something that uses the power from the USB port more effectively. And the pocket ones do seem more convenient.

So my question is this: do the more compact USB-only drives have a better/worse chance of losing data since they aren't plugged directly into the wall? Or does having a wall plug not really matter?


Also, do the bigger socket-plug-in drives have any extra features? Like maybe... password protection or encryption capabilities? Or maybe some of those animated file-browsing programs. Sorry if it seems to be getting off-topic, but someone did mention earlier that keeping a charge in the flash drive would help preserve the data. So I feel it important to ask: does the socket plug-in really help?

Geo Kaplan
2010-Jun-01, 01:03 AM
So my question is this: do the more compact USB-only drives have a better/worse chance of losing data since they aren't plugged directly into the wall? Or does having a wall plug not really matter?

Because the power available from a USB socket is rather limited, drives that derive power solely from that source have a tough constraint to satisfy. The chief concession is spindle speed, so performance suffers. Reliability stats are devilishly hard to come by (the MTBF numbers provided by manufacturers appear to be generated by trained monkeys), but I don't think that reliability is greatly affected by power-related concessions (it's a complicated tradeoff space, though).

As to the idea that keeping things plugged in helps, I am dubious, at least when it comes to flash drives. Minimizing read-write cycles (the latter, primarily) matters most, which would argue for keeping the thing powered off as much as possible. Flash drives intended as hard drive replacements implement sophisticated wear-leveling strategies, to avoid damaging a block of cells prematurely from repeated hammering. But I'm unaware of any "refreshing" going on in the background (but I am happy to learn otherwise) that would require constant power-on.

Philippe Lemay
2010-Jun-01, 02:16 AM
I'm also a little comforted by the thought of a USB-only drive that I can unplug, due to the thought that in the unlikely event of a... thunderstorm or something, where a lightning bolt hits the power box causing a huge surge that blows up the tower, or whatever, lol. In that event, if my USB drive was unplugged, chances are the data would be safe. (Yes, I am a little paranoid by nature.)

But anything that can blow the tower through a power surge must be at least somewhat capable of damaging the plugged in USB drive. That's what I figure anyway.

cjameshuff
2010-Jun-01, 03:06 AM
But anything that can blow the tower through a power surge must be at least somewhat capable of damaging the plugged in USB drive. That's what I figure anyway.

You can unplug that one too. The USB connection is the same as it is for USB-only hard drives. Same goes for FireWire. The power connection just means there's more to unplug/plug in.

cjl
2010-Jun-01, 03:43 AM
Because the power available from a USB socket is rather limited, drives that derive power solely from that source have a tough constraint to satisfy. The chief concession is spindle speed, so performance suffers. Reliability stats are devilishly hard to come by (the MTBF numbers provided by manufacturers appear to be generated by trained monkeys), but I don't think that reliability is greatly affected by power-related concessions (it's a complicated tradeoff space, though).

As to the idea that keeping things plugged in helps, I am dubious, at least when it comes to flash drives. Minimizing read-write cycles (the latter, primarily) matters most, which would argue for keeping the thing powered off as much as possible. Flash drives intended as hard drive replacements implement sophisticated wear-leveling strategies, to avoid damaging a block of cells prematurely from repeated hammering. But I'm unaware of any "refreshing" going on in the background (but I am happy to learn otherwise) that would require constant power-on.

Generally, the smaller, USB-only drives are based around notebook hard drives, while the larger capacity and size ones that require a separate power plug are based around desktop drives. Both should be roughly equally reliable.

Murphy
2010-Jun-01, 05:12 PM
So modern hard drives are wonders of engineering, are they... Well, not in my recent experience.

Just yesterday, I had a major problem with the hard drives on my PC. A totally unexpected fault appeared telling me that my RAID array had failed, the computer is currently unusable (writing this from a laptop), though thankfully all the data is ok and I've backed it up on an external hard drive.

Spent most of yesterday talking to Dell's Indian technicians about the problem; as is usual Dell policy, if they can't find the source of the problem (which they couldn't) they just offer to replace the part. It seems it's easier for them to just give you a new one than to try to investigate or fix the current one, so I'll be getting replacement drives in a few days, hopefully.

And that's not the only time something like this has happened. Last month my uncle's PC completely stopped working due to a total hard drive failure; luckily he didn't have much data on it, and it was an old system he got second-hand. He's now got a laptop, but they seem to be prone to the same thing. Before that, the hard drive on my sister's laptop (which was only a year old) also broke down. We tried reinstalling Windows and it seemed to work OK for about a week, but then stopped working again, so eventually we bought a new hard drive and that fixed it.

So damn you hard drives! You are the bane of my computing life!:mad:

Geo Kaplan
2010-Jun-01, 05:34 PM
So modern hard drives are wonders of engineering, are they... Well, not in my recent experience.
{snip}
So damn you hard drives! You are the bane of my computing life!:mad:

Yes, these things fail, and it's a huge inconvenience when they do. But it's really miraculous that these work at all, so for me, the question is as much "why do these things ever actually work" as "why do these things fail?" :)

But sorry that you suffered a crash. It's no fun, I know. And this probably won't be the last time you experience one, sorry to say.

sabianq
2010-Jun-02, 12:18 PM
hi all,
i would like to point out that information is information, and the way it is stored is not really indicative of the information that is stored..

what?

a bar graph is digital information.
so are punch cards
and digital audio signals (remember the (bzzzzzz brttt beep beep beep grzzzzzzzzzz) of the 300 baud modem attached to your TRS-80)
they can be recorded onto a vinyl or gold record and launched into space

i can carve a bar graph into a solid granite surface and the information could potentially be around for millions of years
i can use a concrete drill bit to drill holes into a rock face, duplicating the punch card, and that could be around for a very long time.

while at the same time a magnetic recording can easily be an analogue recording like an 8 track tape, or cassette tape.

the old laser disks were analogue..

so the medium of recording is completely independent of the information..

however, i think that the OP was asking how long information can stay in a memory device.
i have heard everything from 5 years for MLC SSDs to 1025 years for SLC SSDs
i doubt anyone knows exactly how long information can last in such non-volatile memory sticks.
only time will tell..

cheers!

cjameshuff
2010-Jun-02, 12:51 PM
so the medium of recording is completely independent of the information..

This is not true. There are numerous storage media that are inherently digital. Even hard drives are nearing that point, if not already there...densities are high enough that there just aren't enough magnetic domains in the area where a bit is recorded to support anything like a continuum of magnetization levels.

And bar graphs are typically analog representations of data...

jj_0001
2010-Jun-02, 11:14 PM
This is not true. There are numerous storage media that are inherently digital. Even hard drives are nearing that point, if not already there...densities are high enough that there just aren't enough magnetic domains in the area where a bit is recorded to support anything like a continuum of magnetization levels.

And bar graphs are typically analog representations of data...


First, bar graphs have nothing to do with the subject at hand, and little to do with "analog" or "digital" in any real sense.

Second, you're confusing the information (the values of the bits) with the medium they are written on. When you get down to small enough magnetic domains, as current disc drives do, their values are quite probabilistic, hence the use of FEC, etc, in order to recover data. When you get down to low levels, small distances, etc, there is a necessary probabilistic issue, which is neither analog nor digital, but rather due to physics.

BUT, the information is the information, be it heads and tails stored in photographs in an album, or bits recovered from a disc.

cjameshuff
2010-Jun-03, 01:30 AM
First, bar graphs have nothing to do with the subject at hand, and little to do with "analog" or "digital" in any real sense.

sabianq mentioned them, claiming that they were digital representations. They are most often analog representations, though they can be either.



Second, you're confusing the information (the values of the bits) with the medium they are written on. When you get down to small enough magnetic domains, as current disc drives do, their values are quite probabilistic, hence the use of FEC, etc, in order to recover data.

No, I'm not. A static RAM cell is a bistable circuit; if it is not in one of two states, it is just broken. Antifuse PROM cells are either blown or not blown. At the levels hard drives are approaching, only one of two states is possible; it is approaching an inherently digital medium. Forward error correction is for correcting errors. FEC is simply a way of dealing with the poor reliability of that state being the intended one. There are numerous other storage media that are fundamentally incapable of storing continuous values.

You can't store an analog value in the state of an antifuse, the magnetized direction of a toroidal core or magnetic dot. Modern hard drives even do something like a magnetic analog to the SRAM flip-flop cell in how they structure the magnetic films on the disk platters, making the magnetization stable in one of two states, because the magnetization patterns would otherwise "leak" and dissipate or move around.

sabianq
2010-Jun-03, 03:52 AM
this attachment is a 2d barcode and holds 1.9Kb of data.
how is this attachment, this 2d picture of a bunch of blocks, not a way to store digital data?

is this not digital data?

my argument is data is data regardless of how it is stored,
so depending on whether it is carved into a stone or imprinted onto a magnetic surface or floating as a flipped electron spin in a quantum pocket...

data is still data, and the medium it is written on will directly dictate its longevity..

is this incorrect?

sabianq
2010-Jun-03, 03:55 AM
i meant "barcode" and not "bar graph" BTW

sabianq
2010-Jun-03, 03:58 AM
this attachment is a 2d barcode and holds 1.9Kb of data.
how is this attachment, this 2d picture of a bunch of blocks not a way to store digital data?

is this not digital data?

my argument is data is data regardless of how it is stored,
so depending on whether it is carved into a stone or imprinted onto a magnetic surface or floating as a flipped electron spin in a quantum pocket...

data is still data, and the medium it is written on will directly dictate its longevity..

is this incorrect?

reading it is a different matter altogether..
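For what it's worth, turning bytes into exactly that kind of printable block pattern is easy to sketch in Python with the Pillow imaging library. The 10-pixel block size, 64-bit row width and output filename below are arbitrary choices, and a real 2D barcode format such as QR also layers error correction on top:

[code]
from PIL import Image   # Pillow

def bytes_to_blocks(data: bytes, width_bits: int = 64, box: int = 10) -> Image.Image:
    """Render data as rows of black/white squares, one bit per square."""
    bits = [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]
    rows = (len(bits) + width_bits - 1) // width_bits
    img = Image.new("1", (width_bits * box, rows * box), 1)     # start all white
    for idx, bit in enumerate(bits):
        if bit:
            x, y = (idx % width_bits) * box, (idx // width_bits) * box
            img.paste(0, (x, y, x + box, y + box))              # black square for a 1
    return img

bytes_to_blocks(b"data is data regardless of how it is stored").save("blocks.png")
[/code]

Reading it back is, as noted above, the harder half: you have to know the block size, the bit order, and what the bytes mean.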

sabianq
2010-Jun-03, 04:03 AM
well, in the audio world, i could argue that 1-bit or Direct Stream Digital sound recordings are indeed a virtual digital analogue of an analogue recording...

<snicker>
http://en.wikipedia.org/wiki/Direct_Stream_Digital

i love DSD..

jj_0001
2010-Jun-03, 10:05 PM
At the levels hard drives are approaching, only one of two states is possible; it is approaching an inherently digital medium. Forward error correction is for correcting errors. FEC is simply a way of dealing with the poor reliability of that state being the intended one.

Um, any storage has the ability to be in an ambiguous state at any given instant. The smaller the storage medium, the more likely it is to be in an ambiguous state. Really. You can't avoid things like the size of molecules and the charge on the electron.

FEC is a way of recovering the DATA from a NECESSARILY NOISY signal. That's all it is. It's part of the process, and the 'errors' you refer to are intrinsic, necessary, required, obligatory, etc. That is, unless you've created smaller atoms and electrons with smaller charge...

Simply making an authoritative statement makes it hard for me to understand where you're making a mistake, by the way.

jj_0001
2010-Jun-03, 10:06 PM
well, in the audio world, i could argue that 1-bit or Direct Stream Digital sound recordings are indeed a virtual digital analogue of an analogue recording...

<snicker>
http://en.wikipedia.org/wiki/Direct_Stream_Digital

i love DSD..

DSD is noise shaped PCM.

No more, no less.

There's nothing special about DSD or SACD at all.

For some discussion on this issue please see www.aes.org/sections/pnw/ppt.htm and look for the convertor tutorial. I agree it's a touch terse without the 3 hour lecture that goes along with it.
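To make "noise-shaped PCM" a bit more concrete, here is a minimal first-order delta-sigma modulator in Python: it turns an oversampled, high-resolution sample stream into a 1-bit stream whose local average tracks the input, pushing the quantization noise up in frequency. Real DSD uses much higher-order modulators running at about 2.8 MHz; the sine-wave input and window size here are just stand-ins:

[code]
import math

def delta_sigma_1bit(samples):
    """First-order delta-sigma modulation: input in [-1, 1] -> stream of +/-1."""
    integrator = 0.0
    out = []
    for x in samples:
        integrator += x - (out[-1] if out else 0.0)   # accumulate error vs. last output bit
        out.append(1.0 if integrator >= 0.0 else -1.0)
    return out

# A slow, heavily oversampled sine wave: the 1-bit stream's running average follows it.
sine = [0.5 * math.sin(2 * math.pi * n / 256) for n in range(1024)]
bits = delta_sigma_1bit(sine)
window = 32
print([round(sum(bits[i:i + window]) / window, 2) for i in range(0, 256, window)])
[/code]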

jj_0001
2010-Jun-03, 10:07 PM
reading it is a different matter altogether..

Yep, and therein lies the problem with some forms of storage. Anyone got an 8" floppy drive? (btw, that's rhetorical, I long-since recycled my 8" media)

Van Rijn
2010-Jun-03, 11:00 PM
Yep, and therein lies the problem with some forms of storage. Anyone got an 8" floppy drive? (btw, that's rhetorical, I long-since recycled my 8" media)

A friend keeps an 8" floppy on the wall, and gets an occasional "What's that?" question. Or, less often, "I haven't seen one of those in decades!" (which is what I said).

I still have an Apple II 5 1/4" floppy drive around somewhere, and some Apple II floppies. I haven't touched it in years, but the last time I did, some of the floppies were still readable. Apparently the low data density helped.

(I had copied some off to images for an emulator on the PC.)

Ara Pacis
2010-Jun-06, 05:39 PM
this attachment is a 2d barcode and holds 1.9Kb of data.
how is this attachment, this 2d picture of a bunch of blocks, not a way to store digital data?

is this not digital data?

my argument is data is data regardless of how it is stored,
so depending on whether it is carved into a stone or imprinted onto a magnetic surface or floating as a flipped electron spin in a quantum pocket...

data is still data, and the medium it is written on will directly dictate its longevity..

is this incorrect?

Makes sense to me. The most basic definition of digital is discrete units versus continuously variable. Of course, most things that we think of as continuously variable can be broken down into piles of discrete units. So, a bar graph can be digital. It would seem to me that reality is actually digital, since we do get down to discrete states with particle physics and quantum physics (probabilistic issues notwithstanding).

jj_0001
2010-Jun-08, 06:19 PM
It would seem to me that reality is actually digital, since we do get down to discrete states with particle physics and quantum physics (probabilistic issues notwithstanding).

Except that it's those probabilistic issues that make everything probabilistic at the lowest levels, and as such, it's an odd kind of system, in that there are discrete events, but the chance of those events is probabilistic in a very easily visible way.

However, the Shannon Theorem and work around it shows us how to preserve information to more or less a given certainty in that situation.

And the other point, of course, is that once you recover the binary data and rewrite it, you've fixed any potential problems for the time being.

So, confusing the data with the storage mechanism is really a mistake.

Ara Pacis
2010-Jun-10, 04:06 AM
I was thinking about the Planck length, and the electron being in one orbit and then another but never in between. I wonder if there is a discrete level of resolution for waveforms.

jj_0001
2010-Jun-10, 06:09 PM
I was thinking about the Planck length, and the electron being in one orbit and then another but never in between. I wonder if there is a discrete level of resolution for waveforms.

Well, you mean "percentage chance of being in one orbital vs. another". Don't forget that the basic uncertainty means that you can't even begin to say "this is certainly in 'x' orbital". You can only say "most likely". This is, after all, part of how things decay to lowest energy orbitals.

Not sure what you mean by "discrete level of resolution for waveforms"; as you get to low levels, all you can specify is the mean level, and the sigma of that rises with decreasing time intervals.