How To Build An Archive

Paul Joiner edited this page Nov 11, 2022 · 3 revisions

💡 NOTE: Starting with version 0.4.0 it is possible to create archives using existing data.

Building an archive takes several steps: generating the data files, generating the metadata that describes them, and compressing the result.

The builder classes use a fluent pattern to keep the code readable and easy to write. An archive is built in the following steps.

  1. Build Field Meta Data
  2. Build File Meta Data
  3. Build Row Data and Write to File
  4. Build Archive

The following example creates a simple archive containing one data file with the data shown below. Full examples may be found in ArchiveWriterTests.cs in the unit tests.

Taxon.txt

taxonID,vernacularName,language
"123","sperm whale","english"
"123","cachalot","french"
"124","gray whale","english"

Build Field Meta Data

Use the FieldMetaDataBuilder class to build metadata for a single field. The index is the zero-based position of the field's column in the file.

var fieldMetaData = FieldMetaDataBuilder.Field()
    .Index(0)
    .Term(Terms.taxonID)
    .Build();

Use the FieldsMetaDataBuilder class to build a collection of field metadata that represents an entire file. Each AddField call receives a FieldMetaDataBuilder object, which the lambda uses to set the term, as follows. The AutomaticallyIndex method assigns indexes to fields in the order they are added.

var fieldMetaDataBuilder = FieldsMetaDataBuilder.Fields()
    .AutomaticallyIndex()
    .AddField(_ => _.Term(Terms.taxonID))
    .AddField(_ => _.Term(Terms.vernacularName))
    .AddField(_ => _.Term(Terms.language));

Build File Meta Data

Next, define the metadata for the file itself, such as the field separator, encoding and file name. Add the field metadata created above using the AddFields method.

var fileMetaData = CoreFileMetaDataBuilder.File("taxon.txt")
    .FieldsEnclosedBy("\"")
    .FieldsTerminatedBy(",")
    .LinesTerminatedBy("\\n")
    .IgnoreHeaderLines(1)
    .Encoding(Encoding.UTF8)
    .Index(0)
    .RowType(RowTypes.Taxon)
    .AddFields(fieldMetaDataBuilder);

Build Row Data and Write to File

Next, define what data is written to the file and how. First, create a FileBuilder object and pass it the file metadata builder created in the previous step. Then supply a delegate that writes the data. In the following example, whales is a collection of whale objects holding the data to be written to the file. The delegate iterates over the collection and builds one row per whale using the supplied RowBuilder object, adding the fields in the same order as the field metadata.

var coreFileBuilder = FileBuilder.MetaData(fileMetaData)
    .BuildRows(rowBuilder => BuildCoreRows(rowBuilder));

IEnumerable<string> BuildCoreRows(RowBuilder rowBuilder)
{
    foreach (var whale in whales)
    {
        yield return rowBuilder.AddField(whale.TaxonId)
                    .AddField(whale.VernacularName)
                    .AddField(whale.Language)
                    .Build();
    }
}
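The row-building code above relies on a whales collection that this page does not define. A minimal sketch of what it could look like is shown here. The Whale record and the WhaleData class are hypothetical names introduced for illustration and are not part of the library; only the property names TaxonId, VernacularName and Language come from the example above.

```csharp
using System.Collections.Generic;

// Hypothetical data model (not part of the library): one vernacular name for a taxon.
public record Whale(string TaxonId, string VernacularName, string Language);

public static class WhaleData
{
    // Sample rows matching the three data lines of the example file.
    public static readonly IReadOnlyList<Whale> Whales = new List<Whale>
    {
        new("123", "sperm whale", "english"),
        new("123", "cachalot", "french"),
        new("124", "gray whale", "english"),
    };
}
```

With this in place, `var whales = WhaleData.Whales;` would supply the collection that BuildCoreRows iterates over. Records require C# 9 or later; a plain class with three string properties works equally well.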

Alternatively, if the data file already exists, use the UseExistingFile method of the FileBuilder object to indicate which file to add to the archive. Make sure the metadata definition matches the file contents.

var fileBuilder = FileBuilder.MetaData(fileMetaData)
    .UseExistingFile("./resources/whales/whales.txt");

Build Archive

Finally, assemble the above objects using the ArchiveWriter class and provide a file name for the archive, as follows. Additional data files may be added using the AddExtensionFile method, and other files such as eml.xml or license.txt may be added using AddExtraFile.

ArchiveWriter.CoreFile(fileBuilder, fileMetaData)
    .Build("whales.zip");
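For orientation, the file metadata defined in the earlier steps corresponds to a Darwin Core Archive meta.xml descriptor roughly like the one below. This is a hand-written sketch based on the Darwin Core text guidelines, not captured output from the library. Attribute order and formatting in the generated file may differ, and the id element assumes that the Index(0) call on the file metadata designates column 0 as the core identifier.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of the descriptor for the taxon.txt example; actual output may differ. -->
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core encoding="UTF-8"
        fieldsTerminatedBy=","
        linesTerminatedBy="\n"
        fieldsEnclosedBy="&quot;"
        ignoreHeaderLines="1"
        rowType="http://rs.tdwg.org/dwc/terms/Taxon">
    <files>
      <location>taxon.txt</location>
    </files>
    <id index="0"/>
    <field index="0" term="http://rs.tdwg.org/dwc/terms/taxonID"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/vernacularName"/>
    <field index="2" term="http://purl.org/dc/terms/language"/>
  </core>
</archive>
```

Comparing the generated meta.xml against a descriptor like this is a quick way to check that the field order and separators match the data file.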

Setting a Builder Context

By default, the data files and the meta.xml file are created in a temporary directory, which is deleted after the files have been added to the final zip file. To keep the uncompressed files or to use a custom working directory, create a BuilderContext instance and pass it to both the FileBuilder and the ArchiveWriter.

var context = new BuilderContext(".", false);

var coreFileBuilder = FileBuilder.MetaData(fileMetaData)
    .Context(context)
    .BuildRows(rowBuilder => BuildCoreRows(rowBuilder));

ArchiveWriter.CoreFile(coreFileBuilder, fileMetaData)
    .Context(context)
    .Build("whales.zip");