Hi,
I have an xml importer utility to insert "people" records and FK-related child records into my database. We've defined an xml structure (for this import xml file) that contains:
FirstName
LastName
PeopleBadgeId
...
Photos
    FilenameOfPhoto1
    FilenameOfPhoto2
...
...where Photos is a collection of images, each up to about 1 MB in size, that are stored in the database in a binary column (e.g. varbinary(max)).
In the XML, all the text data (FirstName, LastName, etc.) is held inline as text, and the photos are referenced by .jpg filename (all photo files are assumed to be in the same directory).
The importer works fine with a home-brew data layer, and we're trying to convert it to llblgen.
With the home-brew layer, we initiate a transaction at the very beginning, then INSERT each person record one by one (using the handed-in transaction object). In addition to INSERTing the person record, all child pictures are pulled from .jpg files on disk (per the filenames in the XML for that employee) and INSERTed into the pictures table.
Once all employees are processed, the transaction COMMIT is done to complete the process. If a failure occurs, a ROLLBACK is requested.
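To make the current flow concrete, it looks roughly like the sketch below (simplified to plain ADO.NET for illustration; the real code goes through our home-brew layer, and the table, column, and type names here are just placeholders):

```csharp
// Rough shape of the home-brew importer, simplified to plain ADO.NET for illustration.
// Table, column, and type names are placeholders; the real code has more error handling.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.IO;

// Placeholder DTO for what we parse out of the XML.
class ImportedPerson
{
    public string FirstName, LastName, BadgeId;
    public List<string> PhotoFileNames = new List<string>();
}

static class HomeBrewImporter
{
    public static void Import(IEnumerable<ImportedPerson> peopleFromXml,
                              string photoDirectory, string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var tx = conn.BeginTransaction())
            {
                try
                {
                    foreach (var p in peopleFromXml)
                    {
                        // INSERT the person row on the shared transaction.
                        int personId;
                        using (var cmd = new SqlCommand(
                            "INSERT INTO People (FirstName, LastName, BadgeId) " +
                            "OUTPUT INSERTED.Id VALUES (@f, @l, @b)", conn, tx))
                        {
                            cmd.Parameters.AddWithValue("@f", p.FirstName);
                            cmd.Parameters.AddWithValue("@l", p.LastName);
                            cmd.Parameters.AddWithValue("@b", p.BadgeId);
                            personId = (int)cmd.ExecuteScalar();
                        }

                        // Pull each photo off disk and INSERT it as a child row.
                        foreach (var fileName in p.PhotoFileNames)
                        {
                            byte[] image = File.ReadAllBytes(Path.Combine(photoDirectory, fileName));
                            using (var cmd = new SqlCommand(
                                "INSERT INTO Photos (PersonId, Image) VALUES (@id, @img)", conn, tx))
                            {
                                cmd.Parameters.AddWithValue("@id", personId);
                                cmd.Parameters.AddWithValue("@img", image);
                                cmd.ExecuteNonQuery();
                            }
                        }
                    }
                    tx.Commit();   // all-or-nothing for the whole file
                }
                catch
                {
                    tx.Rollback();
                    throw;
                }
            }
        }
    }
}
```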
So the efficiency question:
I can have 100,000 or 200,000 people in an xml import, so I'm concerned about machine memory.
With the home-brew layer, the XML text (not the photos) is all pulled into memory, then each person (and their pictures) is inserted within the transaction one at a time. So aside from the XML's own memory requirements, the data layer itself doesn't take much memory.
If I create an entity for each person, attach their pictures to it, and then add all of those entities to a single unit-of-work, I believe that would consume a huge amount of memory.
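The alternative I'm picturing, instead of one big unit-of-work, is something like the sketch below: keep a single DataAccessAdapter transaction open and save each person's small entity graph (person plus photos) as soon as it's built, letting it go out of scope before moving on to the next person. The entity and property names (PersonEntity, PhotoEntity, Photos) are just placeholders for whatever the generated code actually provides, and I'm assuming the adapter template set here.

```csharp
// Sketch only: per-person saves inside one adapter transaction (adapter template set).
// PersonEntity / PhotoEntity / Photos are placeholders for the generated entity model;
// ImportedPerson is the same placeholder DTO as in the sketch above.
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using SD.LLBLGen.Pro.ORMSupportClasses;
// plus the generated EntityClasses / DatabaseSpecific namespaces

static class LlblGenImporter
{
    public static void Import(IEnumerable<ImportedPerson> peopleFromXml, string photoDirectory)
    {
        using (var adapter = new DataAccessAdapter())
        {
            adapter.StartTransaction(IsolationLevel.ReadCommitted, "PeopleImport");
            try
            {
                // Ideally peopleFromXml is streamed (e.g. via XmlReader) rather than fully loaded.
                foreach (var p in peopleFromXml)
                {
                    var person = new PersonEntity
                    {
                        FirstName = p.FirstName,
                        LastName = p.LastName,
                        BadgeId = p.BadgeId
                    };

                    foreach (var fileName in p.PhotoFileNames)
                    {
                        var photo = new PhotoEntity
                        {
                            Image = File.ReadAllBytes(Path.Combine(photoDirectory, fileName))
                        };
                        person.Photos.Add(photo);   // FK synced via the generated relation
                    }

                    // Recursive save persists the person and its photos; no refetch, to keep it light.
                    adapter.SaveEntity(person, false, true);

                    // The person and its photos go out of scope here, so memory stays
                    // roughly one-person-sized instead of one-file-sized.
                }
                adapter.Commit();
            }
            catch
            {
                adapter.Rollback();
                throw;
            }
        }
    }
}
```

Combined with streaming the XML rather than loading it all up front, I'd hope memory stays roughly one-person-sized, but I don't know how the adapter behaves over hundreds of thousands of saves in a single transaction.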
Thoughts on the best way to make this reasonably efficient (efficient enough that I can run it on a machine with a reasonable amount of memory)?
Thanks!