This article responds to a growing need among data managers and explorationists for information on storing digital seismic field and stack data on optical disk. Two distinct areas are involved in this storage: the long-term archive of the data and its distribution.
A distinction must be made between archival and distribution of data. Large format WORM (write-once, read-many) optical disks have in essence become the storage warehouse for seismic data for many oil companies. WORM optical disks have the capacity to store hundreds of field lines and thousands of stack files on them.
This compaction of data requires a different data distribution scheme to be followed. It is not feasible to send the original media that stores the seismic line to another site for reproduction. No longer would one field line or stack file be sent out, but in reality, hundreds of lines and thousands of stacks would be on the move on that piece of media. We would all agree that this would not be wise in the least.
If one assumes the original data stored on the optical disk should not leave its secure storage site, then the data must be distributed on some intermediate media.
At present Vector outputs data to 9-track tape, M/O (magneto-optical) disk, 8MM tape, 3480 tape, DC2120 tape, floppy disk, 4MM DAT tape, and CD-R (CD Recordable). The media are listed in order of the volume of data distributed on each. To date, reusable media have been our clients' choice for distributing data.
Some of the data distributed is of the non-volatile nature (e.g. stacks to load on a work station) while some is of a semi-volatile nature (e.g. copies of field data). In all cases, Vector's clients have chosen to store their volatile data (the original) on WORM Optical Disk. The majority of Vector's clients have taken the next step and discarded the original field and stack tapes.
It is obvious that the whole method of distribution will change over the next few years. Data will travel to its intended destination over telephone lines or fiber optic networks. A transfer media will no longer be important. Data will be interchanged electronically between storage facilities, seismic processing companies, and oil companies, arriving on a destination hard disk and then moving to the customer's archive media of choice or to the appropriate seismic processing or interpretation system.
High compaction ratios for the archive site will be the order of the day; the storage footprint must be reduced. This means that the choice of media will come down to what can store the most at the lowest effective cost. Oil companies will face a simple choice: should they utilize large scale optical disk systems or tape systems?
This article does not investigate the relative merits of optical disk versus large format tape systems; it assumes the decision to go "Optical" has been made.
Once this decision is made, it is important that oil companies interested in utilizing optical disks consider several factors before they commit to a particular system or media:
- The type of media best suited for long term critical data storage.
- The cost of the media (in real terms).
- The long-term integrity of the media.
- The security of the data.
- The compaction of storage space and number of items stored.
- The ease of retrieval and seismic processing afterwards.
1. Type of Media
WORM (write-once, read-many) Optical Disk
This is a large format (6.4 gigabyte) WORM optical disk in wide usage in Calgary. The disks used by Vector are manufactured by SONY Corporation and have been in service at Vector for five years. During this time there has been one documented read error. To date over 10,000 field tapes and in excess of 30,000 stack files have been output and recreated bit for bit by this system. The disks can be written up to their capacity of 3.2 gigabytes per side in any number of sessions (the number of times the disk may be accessed). There are two software systems in use in Calgary at present, developed by Lacey Digital Technologies Ltd. and Vector Archives Ltd. Specific questions regarding their formats and systems can be directed to these companies at the addresses listed below:
Terry R. Lacey
Lacey Digital Technologies Ltd.
Box 45, Site 15, RR9
Calgary, Alberta T21 5G5
Vector Archives Ltd.
130, 840 7th Avenue S.W.
Calgary, Alberta T2P 3G2
M/O (magneto-optical) Disks
This is a smaller format (1.3 gigabyte) optical disk in wide usage in Calgary as a data input and output device on work stations. It is a completely reusable media in that it can be overwritten for a million cycles. It is generally not used in the oil industry as an archive device but mainly as a data transfer or system backup device. Generally speaking, these disks hold SEG-Y stack files and are loaded using standard computer industry driver programs. In usage they have proven to be very reliable, but not to the degree of WORM disks. This media has been in use for five years, progressing from a slow 600 megabyte format to the present high-speed 1.3 gigabyte capacity.
CD-R (Compact Disk Recordable) Disks
This is a smaller yet optical disk (540 to 640 megabytes) which simulates a WORM disk. It is not to be confused with the CD-ROM (Compact Disk Read Only Memory) disks in wide usage in the software distribution and music industries. CD-ROM disks are pressed disks manufactured in a plant for wide scale distribution. It is a mechanical stamping process, not a recording. CD-R was first introduced to enable software companies to master their own CD-ROM disks so they could be tested before the expensive process of making the dies to stamp the CD-ROMs was undertaken. Some software companies then began to use them for low volume distribution of software, and now companies have started to use them for long-term storage of data. CD-R disks are not yet in wide circulation as a long-term archive medium and many factors may contribute to their success or failure. CD-R disks have been available for approximately two years.
2. Cost Per Megabyte
|WORM MEDIA||M/O Media||CD-R MEDIA|
|6.7 cents||10.9 cents||5.2 cents|
The costs were calculated based on certain criteria that are not necessarily apparent. The WORM disk and M/O disk are easily written to many times, while the CD-R is basically a one-pass operation where additional sessions are more difficult to obtain. In fact, as more sessions are written, space is wasted on overhead and the disk's total theoretical maximum capacity is diminished. The smaller the media, the more waste is incurred, as it becomes impossible to find the "right" piece of data to fill the last available space on the disk. Fill ratios of 95%, 85%, and 70% for WORM, M/O, and CD-R respectively have been used for the above comparison. If the cost of a caddy is included in the price of the CD-R, its per megabyte cost increases to 7.3 cents (more than WORM).
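The per-megabyte comparison comes down to dividing the price of a piece of media by the capacity that can realistically be filled. A minimal sketch of that arithmetic in Python follows. The function name and the $400 price are illustrative assumptions, since the media prices behind the table are not stated here; the 6.4 gigabyte capacity and 95% fill ratio are the WORM figures quoted in this article.

```python
def cost_per_usable_mb_cents(media_price_dollars, capacity_mb, fill_ratio):
    """Return the media cost in cents per megabyte actually filled with data."""
    if not 0 < fill_ratio <= 1:
        raise ValueError("fill_ratio must be in the range (0, 1]")
    usable_mb = capacity_mb * fill_ratio
    return 100.0 * media_price_dollars / usable_mb


# Hypothetical price (an assumption, not a quoted figure) applied to the
# WORM capacity and fill ratio given in the text.
worm_cents = cost_per_usable_mb_cents(400.00, 6400, 0.95)
print(f"{worm_cents:.1f} cents per usable megabyte")
```

The same function can be applied to M/O and CD-R by substituting their capacities and the 85% and 70% fill ratios, and the effect of a caddy can be examined by adding its price to the media price.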
3. Long Term Integrity Of The Data
This is a problematic topic to discuss at best. Media manufacturers quote life expectancy based on accelerated test cycles under very tightly controlled conditions. It is senseless to say a particular piece of media has a shelf life of some period of time provided that it is never handled, is stored at one temperature, one humidity and in the absence of any external influence.
Optical media should not be subjected to rapid temperature and humidity changes and definitely not to condensation and chemical contamination. There are specific guidelines for storage conditions for all media and these specifications differ from one manufacturer to another.
On the surface CD-R's may have the most potential for early failure due to handling and storage problems. The handling problems are due to the fact they are not stored in a caddy and need to be physically touched to be used. Even if placed in a caddy they would have to be removed to be read in many CD Disk Drives. WORM Disks and M/O Disks are stored in protective enclosures with interlocks that need to be defeated in order to touch the disks physically. In the end the most telling factor of reliability may be in the user's hands (literally).
It is because of this "handling problem" that CD-R manufacturers have gone to great lengths to test their media for scratch resistance. Manufacturers of CD-R media have documentation to show their resistance to scratching and the possible catastrophic failures caused by same.
CD-R disks in general seem to have a major problem when exposed to sunlight (they fail after some period of time). This time to fail in sunlight varies from CD-R manufacturer to manufacturer as based on their test results.
Yet another consideration is the speed at which CD-Rs are recorded: 1X, 2X or 4X (recording speed is based on 1X taking 63 minutes for the equivalent of a 63 minute audio CD, 2X being twice as fast, and 4X four times as fast). Does faster recording make the recording less reliable, with a correspondingly shorter life span? Can a 1X CD-ROM player read a 4X recorded disk reliably?
There is also data to suggest that certain types of CD-R media react differently to different recorders.
4. Security Considerations
Security of data involves many aspects some of which are:
- Can the data be stolen easily?
- Can the data be retrieved and output to the original media (e.g.: 9-track tape) on a bit for bit basis?
1. As we compact data to a greater and greater extent and if we follow our conventional methods of the past, we may well expose valuable assets to the threat of theft. Couriers could be traveling the streets of Calgary with literally hundreds of seismic lines or thousands of stack files in their backpacks.
If the data is stored in such a manner and on media that is not readily available to anyone the security risks will be lessened though not eliminated.
The worst possible case we could have is where we ship data around the city with all the information needed to identify it stored on the disk and having tens of thousands of people able to read the disks. It is a security nightmare in the making.
Data archived to the disk should not have any identifying information stored on it. The contents of the disk should be stored in a database on a secure computer system, backed up and not on the storage disk itself.
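One way to picture this separation is a small catalog kept on the secure system that maps each seismic line to an anonymous disk identifier, so the disk itself carries nothing that names its contents. The following is only a sketch using Python's built-in sqlite3 module; the table layout, column names, and sample values are hypothetical and do not describe any vendor's actual system.

```python
import sqlite3


def build_catalog(conn):
    """Create the catalog table that lives off the archive disk."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS catalog (
               line_name   TEXT NOT NULL,
               disk_id     TEXT NOT NULL,
               side        TEXT NOT NULL,
               file_number INTEGER NOT NULL
           )"""
    )


def register_line(conn, line_name, disk_id, side, file_number):
    """Record where a line was written; the disk only ever sees disk_id."""
    conn.execute(
        "INSERT INTO catalog VALUES (?, ?, ?, ?)",
        (line_name, disk_id, side, file_number),
    )
    conn.commit()


def locate_line(conn, line_name):
    """Return every (disk_id, side, file_number) holding the named line."""
    rows = conn.execute(
        "SELECT disk_id, side, file_number FROM catalog WHERE line_name = ?",
        (line_name,),
    )
    return rows.fetchall()


# Hypothetical example entry in an in-memory database.
conn = sqlite3.connect(":memory:")
build_catalog(conn)
register_line(conn, "LINE-EXAMPLE-01", "D0042", "A", 17)
print(locate_line(conn, "LINE-EXAMPLE-01"))
```

In practice the database would live on a backed-up, access-controlled system rather than in memory.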
In any event, once data has been compacted, that piece of media should no longer be sent out into the world. If data must be sent out, the individual line or lines should be delivered on the appropriate transfer media.
Based on the preceding remarks, the relative security risks are:
|WORM MEDIA||M/O Media||CD-R MEDIA|
The assumptions made are based on the fact that it is highly unlikely many people have access to WORM drives and the technology to read them, more have access to M/O drives, and the general public can read and access CD-R disks. If data is stored on CD-R it will become available to anyone with a $1000.00 computer system.
Proprietary software systems that may cause concern for some oil companies in other areas have an added security benefit in that they are the most secure systems.
2. The matter of retrieval of data and its integrity and the ease of processing the data afterward should be investigated thoroughly. It is suggested that companies interested in data conversions to optical disk ensure whatever format or system is used can reproduce the original media on a bit for bit basis.
5. Compaction of Storage Space
Any current archive scheme should attain the highest possible compaction of data. The relative compaction of the optical disk media is as follows.
|WORM MEDIA||M/O Media||CD-R MEDIA|
|100||18.2||6.3|
This means that if 100 tapes would fit on a WORM, then 18.2 tapes would fit on an M/O and 6.3 on a CD-R. On a 60,000 tape archive project with a typical 200 to 1 tape to WORM disk compression ratio, the number of disks required to hold the data would be as follows.
|WORM MEDIA||M/O Media||CD-R MEDIA|
|300 disks||1648 disks||4762 disks|
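The disk counts in the table can be reproduced from the figures already given. The sketch below, with function and variable names chosen purely for illustration, first checks the M/O relative compaction from the capacities and fill ratios quoted earlier, then scales the 300 WORM disks by the relative compaction figures of 18.2% and 6.3%. Rounding to the nearest whole disk is an assumption made to match the table; a real project would round up.

```python
def disks_required(tape_count, tapes_per_worm, relative_compaction):
    """Scale the WORM disk count by each media's compaction relative to WORM."""
    worm_disks = tape_count / tapes_per_worm
    return {
        media: round(worm_disks / ratio)
        for media, ratio in relative_compaction.items()
    }


# M/O compaction relative to WORM, in percent, from capacity x fill ratio.
mo_relative_percent = round(100 * (1300 * 0.85) / (6400 * 0.95), 1)

# Relative compaction figures as stated in the article (WORM = 1.0).
relative_compaction = {"WORM": 1.0, "M/O": 0.182, "CD-R": 0.063}
counts = disks_required(60_000, 200, relative_compaction)

print(mo_relative_percent)
print(counts)
```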
6. Data Integrity and Ease of Recovery
An initial assumption must be made that any data stored on optical disk must be able to be returned to its original state. This assumption has to be made as there is no SEG or CSEG standard that allows for the conversion of tape based seismic data to disk based data at the present time. In the absence of such a standard any system that cannot reproduce the input bit for bit should not be considered.
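A bit for bit claim of this kind can be checked mechanically. As a minimal sketch, assume the original tape and the tape recreated from optical disk have each been dumped to an ordinary disk file (an assumption; a real tape also has record boundaries and file marks that a flat file does not capture). Comparing cryptographic digests of the two files then shows whether every bit survived the round trip. The function names are illustrative.

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_bit_for_bit_copy(original_path, restored_path):
    """True when both files have identical contents, down to the last bit."""
    return file_digest(original_path) == file_digest(restored_path)
```

Reading in chunks keeps memory use constant, which matters when a single tape image runs to many megabytes.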
Altering data so that it does not match observer notes should be avoided at all costs. Seismic processing is based on the careful analysis of the observer notes as to what is contained in the headers and body of the data. If this relationship is lost or masked, processing will become more difficult, time consuming and costly.
The method of transcribing data from the original media to the optical disk should be uniform for all types of data (SEG-Y, SEG-B, SEG-A, SEG-C, SEG-X, IBM trace sequential, SEG-D (both trace sequential and multiplexed), etc.). This is vitally important in the case of a transcription mishap whereby the data could be archived using the wrong format or program. In addition, if the data is stored on the optical disk in a homogeneous state (all the data with the same structure), conversion from tape based seismic processing systems to disk based seismic systems would be simplified.
Any system based on random access media such as optical disk must be able to retrieve data in a true random fashion easily and effectively to the trace level.
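To illustrate what trace-level random access means, consider the simplest case: a plain SEG-Y disk file with fixed-length traces and no extended textual headers. This is an assumption made for the example and is not necessarily how either archive system described in this article lays out its data. SEG-Y places a 3200-byte textual header and a 400-byte binary header at the start of the file, and each trace carries a 240-byte trace header, so the position of any trace can be computed directly and read without scanning the traces before it.

```python
TEXT_HEADER_BYTES = 3200    # SEG-Y textual file header
BINARY_HEADER_BYTES = 400   # SEG-Y binary file header
TRACE_HEADER_BYTES = 240    # header preceding each trace's samples


def trace_offset(trace_index, samples_per_trace, bytes_per_sample=4):
    """Byte offset of a zero-based trace in a fixed-length SEG-Y file."""
    if trace_index < 0:
        raise ValueError("trace_index must be zero or greater")
    trace_bytes = TRACE_HEADER_BYTES + samples_per_trace * bytes_per_sample
    return TEXT_HEADER_BYTES + BINARY_HEADER_BYTES + trace_index * trace_bytes


def read_trace(handle, trace_index, samples_per_trace, bytes_per_sample=4):
    """Seek straight to one trace and return its header and samples as bytes."""
    trace_bytes = TRACE_HEADER_BYTES + samples_per_trace * bytes_per_sample
    handle.seek(trace_offset(trace_index, samples_per_trace, bytes_per_sample))
    return handle.read(trace_bytes)
```

Older field formats with variable record lengths would need a stored index of trace positions instead of this arithmetic.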
Seismic lines should be stored in a concatenated state (one file for many tapes) on the optical disk with a minimum number of files per line. This achieves two objectives:
- Ease of cataloging the seismic data
- Ease of seismic processing whereby the data is in a single file.
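Concatenation need not conflict with the requirement to reproduce each original tape bit for bit, provided the boundaries between tapes are recorded. The sketch below shows the idea on in-memory byte strings: tape images are joined into a single file and an index of (offset, length) pairs allows any one tape to be cut back out unchanged. The function names are hypothetical, and a production system would stream data to disk rather than hold it in memory.

```python
def concatenate_tapes(tape_images):
    """Join tape images into one byte string and index where each one starts."""
    index = []
    combined = bytearray()
    for image in tape_images:
        index.append((len(combined), len(image)))
        combined.extend(image)
    return bytes(combined), index


def extract_tape(combined, index, tape_number):
    """Recover one original tape image, unchanged, from the concatenated file."""
    offset, length = index[tape_number]
    return combined[offset:offset + length]


# Three stand-in "tapes" for one seismic line.
tapes = [b"AAAA", b"BB", b"CCCCCC"]
combined, index = concatenate_tapes(tapes)
print(index)
```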
So-called proprietary systems should be avoided unless caution is taken to ensure that the data can be recovered by someone other than the supplier of the proprietary system. Oil companies should receive cost increase limits and have specialized software put into escrow accounts so that if the vendor ceases operation or contract problems occur they have their own means available to effect the continued recovery of their data.
The author has written this article owing to the rapidly emerging interest in the use of optical disk technology as it relates to the seismic industry and a misconception by many people that CD-ROM and CD-R are the same product.
Optical disks that may be written to exist in different forms: WORM, Magneto-Optical, and CD-R. We believe these media are useful for different situations. The following is a summary of the issues to consider with respect to data acquisition, data distribution, project backup, and data archival.
Data Acquisition:
- Rugged media, ability to withstand field conditions, shipping and handling
- Readable by any processing or archiving company.
Data Distribution:
- Easy to read by recipients (format and hardware compatibility).
- Reliable, durable media, able to survive shipping and handling.
Project Backup for Workstations:
- Long life and reliability.
- High capacity.
- Random access.
- Multi-session capable (no limit to number of updates).
Data Archival:
- Long life.
- Cost effective.
- Rapid retrieval.
- Random access of data.
- High compaction ratio.
A Data Manager needs to equip himself with sufficient information to decide which media is best for which data situation. In this article I have attempted to supply the Data Manager with enough information so that he may know what questions to ask of potential suppliers of data archive and distribution services.
Any decision to transcribe conventional tapes to optical disk should not be undertaken lightly. What on the surface may appear to be a viable solution to data storage problems may actually be a path to greater problems in the future.
Investigate, investigate, and then investigate!!!
The writers offer the following suggestions for suitable technology (optical or otherwise) for the different data situations based on current technology. These suggestions are based on the preceding article and aforementioned issues and the knowledge that Vector Archives Ltd. has gained in becoming the largest optical disk based storage and distribution company in Calgary.
We cannot emphasize enough, however, that these recommendations are ours alone and that you, the owner of the data, have the responsibility to ensure its integrity.
Data Acquisition
- 9 track tape
- 3480 tape
- Magneto-Optical disk drive
Data Distribution
- 9 track tape.
- Electronic data transfer to M/O Rewritable disk.
- CD-R (with mention made of potential durability problems, the expense of media due to limited multi-session recording capabilities, and its being non-rewritable. To its advantage, low cost CD players are available.)
Project Backup (as on workstations)
- Magneto-Optical disk drives
- 8MM tape
Data Archival
- WORM Optical Disk
About the Author(s)
Garth Paterson has a Diploma in Electronic Technology from Red River Community College in Winnipeg, Manitoba. Garth has been involved in the seismic data processing business for almost 20 years. He worked for Raytheon Corporation and then its subsidiary, Seismograph Service Corporation, for several years, rising from Computer Maintenance Engineer to Canadian Manager over that period of time. He was the President and founder of Vector Technology Ltd., a seismic processing company which was sold in 1993. Vector Archives Ltd. was co-founded by Mr. Paterson and Mr. Joseph Wong in 1987. The company has been instrumental in putting optical disk archiving into the mainstream of exploration activities over the past six years. There are currently over 30 oil companies who have chosen Vector's system to archive their data.