[Stgt-devel] Thoughts on vtl filemarks

ronnie sahlberg ronniesahlberg
Sat Jul 19 08:19:20 CEST 2008


I do think that filemarks should be handled differently from how
the individual blocks are,
since I don't think we would want to actually read the file sequentially,
gigabyte after gigabyte, to find where the filemark is.
We would want to be able to just do an fseek() and get there
immediately, but then we must know where all the filemarks are.

One way this can be done is by using a fixed-size array of filemarks that
specify offsets into the file:
uint64_t filemarks[1024];
with filemarks[x] == 0 meaning no more filemarks.
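
A minimal sketch of such a table, under my own assumptions: the names
fm_table, fm_add, fm_next and fm_truncate are made up for illustration,
entries are kept in increasing order, and offset 0 is never a valid
filemark position (for instance because the file starts with a header).
fm_truncate reflects the usual tape rule that a write sets a new end of
data, so marks at or beyond the write position are dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FILEMARKS 1024

/* Fixed-size table of filemark offsets in increasing order.
 * An entry of 0 terminates the list. */
struct fm_table {
    uint64_t filemarks[MAX_FILEMARKS];
};

/* Number of valid entries in the table. */
size_t fm_count(const struct fm_table *t)
{
    size_t n = 0;

    assert(t != NULL);
    while (n < MAX_FILEMARKS && t->filemarks[n] != 0)
        n++;
    return n;
}

/* Drop every filemark at or beyond 'offset' (a write there moves EOD). */
void fm_truncate(struct fm_table *t, uint64_t offset)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_FILEMARKS; i++)
        if (t->filemarks[i] >= offset)
            t->filemarks[i] = 0;
}

/* Record a filemark at 'offset'. Returns 0, or -1 if invalid or full. */
int fm_add(struct fm_table *t, uint64_t offset)
{
    size_t n;

    assert(t != NULL);
    if (offset == 0)
        return -1;
    fm_truncate(t, offset);   /* anything at or beyond is overwritten */
    n = fm_count(t);
    if (n == MAX_FILEMARKS)
        return -1;
    t->filemarks[n] = offset;
    return 0;
}

/* Offset of the first filemark strictly after 'pos', or 0 if none. */
uint64_t fm_next(const struct fm_table *t, uint64_t pos)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_FILEMARKS && t->filemarks[i] != 0; i++)
        if (t->filemarks[i] > pos)
            return t->filemarks[i];
    return 0;
}
```

Spacing forward over a filemark would then be fm_next() on the current
position followed by a single fseek() (or fseeko() for large files) to
the returned offset. The linear scan is fine for 1024 entries; a binary
search would also work since the table is sorted.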

This array could then either be stored in the header of the file or as
an extended attribute attached to the file.
This way this metadata is associated and tied to the data by belonging
to the same file.


There are probably SOME types of metadata for which it would be
preferable to share the storage code with MMC.
Things like "barcode" and "media serial number" etc.
For MMC these cannot be stored as a header in the file, since this
would mean that the file is no longer a
"normal iso image"; for MMC they would need to be stored in either
an extended attribute or in a separate file.


If there is not too much metadata I think it would be attractive to
use extended attributes, since this guarantees that
the data and metadata are never separated or out of sync.
Using multiple files always carries the risk that you rename one of the
files but forget to rename the other, or something similar.
They always end up getting out of sync.
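
A rough sketch of the extended-attribute variant, assuming Linux and its
fsetxattr()/fgetxattr() calls from <sys/xattr.h>. The attribute name
"user.vtl.filemarks" and the helper names are invented for this example,
and the table is stored in host byte order. I have kept the table at 64
slots here because the maximum xattr value size differs between
filesystems, so a full 1024-entry table may not fit everywhere.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>   /* Linux-specific */
#include <unistd.h>

#define FM_XATTR_NAME "user.vtl.filemarks"   /* hypothetical name */
#define FM_SLOTS 64                          /* assumption: small table */
#define FM_BYTES (FM_SLOTS * sizeof(uint64_t))

/* Attach the filemark table to the open backing file. 0 on success. */
int fm_store(int fd, const uint64_t marks[FM_SLOTS])
{
    assert(marks != NULL);
    return fsetxattr(fd, FM_XATTR_NAME, marks, FM_BYTES, 0);
}

/* Read the table back. On any failure the table is zeroed
 * (meaning "no filemarks") and -1 is returned. */
int fm_load(int fd, uint64_t marks[FM_SLOTS])
{
    ssize_t n;

    assert(marks != NULL);
    n = fgetxattr(fd, FM_XATTR_NAME, marks, FM_BYTES);
    if (n != (ssize_t)FM_BYTES) {
        memset(marks, 0, FM_BYTES);
        return -1;
    }
    return 0;
}

/* Round trip on a temporary file. Returns 0 on success, 1 if the
 * filesystem refused the attribute, -1 on mismatch or setup failure. */
int fm_demo_roundtrip(void)
{
    char path[] = "/tmp/vtl_xattr_XXXXXX";
    uint64_t in[FM_SLOTS] = { 4096, 8192 };
    uint64_t out[FM_SLOTS];
    int rc;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (fm_store(fd, in) != 0)
        rc = 1;
    else if (fm_load(fd, out) != 0 || memcmp(in, out, sizeof in) != 0)
        rc = -1;
    else
        rc = 0;
    close(fd);
    unlink(path);
    return rc;
}
```

One thing to keep in mind is that not every filesystem or mount option
supports user.* attributes, which is why the demo treats a refused
fsetxattr() as a separate outcome rather than an error.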


On Sat, Jul 19, 2008 at 8:00 AM, Mark Harvey <markh794 at gmail.com> wrote:
> On Sat, Jul 19, 2008 at 3:52 AM, Albert Pauw <albert.pauw at gmail.com> wrote:
>> Just a few thoughts on the vtl stuff.
>>
>> Filemarks. As filemarks are usually marked by a gap in the recording signal
>> on tape it imposes
>> a problem when using a file as a virtual tape. Since there needs to be some
>> metadata to be retained
>> when writing a virtual tape (on a file) I guess the best way is to use a
>> header containing various
>> information. Partly it will contain the same stuff as the usual chip built
>> into e.g. an LTO tape.
>> One could think of max size, barcode, tape type, compression on/off, etc.
>> Another addition to the header could be a table
>> of filemarks (sorted in increasing order). This table could be done in a
>> fixed style, ie so many
>> filemarks but no more, or a variable table. This last possibility is a bit
>> more flexible, but gives
>> the extra disadvantage of a possibly increasing table positioned before the
>> actual data in the file
>> (which needs to be moved when it increases). Since the size of indexpointer
>> in a file is of size off_t (long)
>> and 64 bits (if I am correct) every filemark will add 8 bytes. So the
>> question will be how many
>> filemarks do you want on the tape?
>
> The other thing that needs to be tracked is the original block size
> written (for tape drives in variable block mode).
>
> The sense code needs to return the number of under/over run bytes in a
> subsequent read.
>
> e.g. NetBackup tape format:
> BOT == Beginning of Tape
> FM == Filemark
> EOD == End of data
>
> [BOT][1k media header][FM][x byte block of data][x byte block of
> data][FM][FM][EOD]
>
> When NetBackup loads a tape, it first requests a 'read 64k' data. If
> it does not receive 1k with appropriate sense code, it assumes the
> media is NOT a NetBackup written tape and rejects the media.
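
To make the "read 64k, get 1k" case concrete, here is a sketch of how the
sense data for such a short read could be built. It assumes fixed-format
sense data with the ILI bit set and the information field holding the
requested length minus the actual block length, which is my understanding
of how SSC describes variable-block reads. The function name and the
macros are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SENSE_LEN           18
#define SENSE_VALID         0x80   /* information field is valid */
#define SENSE_FIXED_CURRENT 0x70   /* fixed format, current error */
#define SENSE_ILI           0x20   /* incorrect length indicator */

/* Fill 'sense' for a variable-block READ of 'requested' bytes against a
 * recorded block of 'block_size' bytes. Returns the number of bytes to
 * transfer. When the lengths match, the sense buffer is left zeroed. */
uint32_t build_ili_sense(uint8_t sense[SENSE_LEN],
                         uint32_t requested, uint32_t block_size)
{
    uint32_t residual;

    assert(sense != NULL);
    memset(sense, 0, SENSE_LEN);
    if (requested == block_size)
        return block_size;

    /* Unsigned wrap-around gives the two's complement form when the
     * block is larger than the request (overrun). */
    residual = requested - block_size;

    sense[0] = SENSE_VALID | SENSE_FIXED_CURRENT;
    sense[2] = SENSE_ILI;              /* sense key NO SENSE (0) */
    sense[3] = (uint8_t)(residual >> 24);   /* information, big-endian */
    sense[4] = (uint8_t)(residual >> 16);
    sense[5] = (uint8_t)(residual >> 8);
    sense[6] = (uint8_t)residual;
    sense[7] = SENSE_LEN - 8;          /* additional sense length */

    return requested < block_size ? requested : block_size;
}
```

For the NetBackup header above, a 65536-byte request against the 1024-byte
block would transfer 1024 bytes and report a residual of 64512. This is
only possible if the original block size is stored somewhere alongside
the data.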
>
>
> BackupExec and Legato NetWorker both use fixed block writes so this is
> not an issue.
>
>
>>
>> Another point is when writing over a filemark (ie erasing the previous
>> filemark), how do we
>> keep track of this in an easy way?
>
> As already pointed out, the last write indicates the EOD and any data
> beyond this point is not readable (for real tape drives anyway).
> So all data (and pointers) beyond the current EOD should be invalid.
>
> Having an extra metadata file means trying to keep track of which
> pointers are valid/invalid.
>
>>
>> Of course set up correctly, skipping over filemarks would then be a doddle.
>> Using a special character(s)
>> to mark a filemark would be easier, but a) it needs to be very special and
>> never appear in normal data,
>> b) skipping filemarks would be a full search through the virtual tape,
>> generating lots of IP traffic on its way.
>>
>> To create such a header for an empty tape a small separate program is needed
>> to "format" the new tape, ie creating a file with the header info.
>
> This is the way I have previously handled this. i.e. The 'backing
> store' is sort of like a double-linked-list of headers.
>
> struct blk_header {
> int blk_type;  /* type of header - BOT, EOD, Filemark etc... */
> int blk_size;  /* Size of data block */
> long blk_number; /* Keep track of block numbers for position */
> struct blk_header *blk_prev; /* Pointer to previous block header */
> struct blk_header *blk_next; /* Pointer to next block header (NULL for EOD) */
> };
>
>
> So the backing store datafile looks like :
>
> [blk_header (BOT)][MAM data][blk_header (block 0)][data][blk_header
> (block 1)][blk_header (filemark)][blk_header (EOD)]
>
> i.e.
>  I've squeezed in the MAM data before the first block of 'real' data.
>  There is no need to read the data, just the blk_headers when positioning
>  Identifying the 'blk type' as filemark/setmark/<insert type here> is simple.
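
A sketch of positioning by headers only, under assumptions of my own: once
the headers live in a file, the prev/next links would have to be file
offsets rather than memory pointers, so this uses a hypothetical
struct disk_blk_header with fixed-width fields, written in host byte
order. The type codes and function names are also invented here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

enum { BLK_BOT = 1, BLK_DATA, BLK_FILEMARK, BLK_EOD };  /* made-up codes */

struct disk_blk_header {
    uint32_t blk_type;    /* BLK_BOT, BLK_DATA, ... */
    uint32_t blk_size;    /* bytes of payload following this header */
    uint64_t blk_number;  /* block number, for position reporting */
    uint64_t prev_off;    /* file offset of the previous header */
    uint64_t next_off;    /* file offset of the next header, 0 at EOD */
};

/* Write a header at 'off'. Returns the offset just past this header and
 * its payload (where the next header would go), or 0 on I/O error.
 * The payload itself is not written here. */
uint64_t put_header(FILE *f, uint64_t off, uint64_t prev,
                    uint32_t type, uint32_t size, uint64_t number)
{
    struct disk_blk_header h = { type, size, number, prev, 0 };
    uint64_t next = off + sizeof h + size;

    assert(f != NULL);
    if (type != BLK_EOD)
        h.next_off = next;
    if (fseek(f, (long)off, SEEK_SET) != 0 ||
        fwrite(&h, sizeof h, 1, f) != 1)
        return 0;
    return next;
}

/* Starting at the header at 'off', follow next_off links until a
 * filemark header is found. Only headers are read, never payload.
 * Returns the filemark header's offset, or 0 if EOD is hit first. */
uint64_t find_next_filemark(FILE *f, uint64_t off)
{
    struct disk_blk_header h;

    assert(f != NULL);
    for (;;) {
        if (fseek(f, (long)off, SEEK_SET) != 0 ||
            fread(&h, sizeof h, 1, f) != 1)
            return 0;
        if (h.blk_type == BLK_FILEMARK)
            return off;
        if (h.blk_type == BLK_EOD || h.next_off == 0)
            return 0;
        off = h.next_off;
    }
}
```

This still costs one seek and one small read per block, so it is cheaper
than reading the data but not as direct as a filemark table. A real
implementation would want fseeko() and a defined byte order.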
>
>
> The disadvantage of this is that it is only suitable for SSC devices, i.e.
> it would be nice to mount an ISO image with the MMC device. Adding an
> extra metadata file along with the ISO image would be better for
> CD/DVDs.
>
>>
>> Any thoughts?
>>
>>
>> Albert
>>
> _______________________________________________
> Stgt-devel mailing list
> Stgt-devel at lists.berlios.de
> https://lists.berlios.de/mailman/listinfo/stgt-devel
>


