Fetching POCO classes instead of entity classes

Posts   
 
    
ddp74
User
Posts: 9
Joined: 04-Jan-2024
# Posted on: 19-May-2025 23:10:58   

V5.11. .net 4.7.2.

We load a lot of data and I've been experimenting with QuerySpec and POCO classes and the difference memory consumption is night and day.

For example, loading 12 million full entities via an EntityCollection (using adapter) for a particular call (I know - it's a rediculous amount of data) is causing our worker processes to jump to >10GB of memory. The table is just 5 IDs and a string column.

// Build the bucket
IRelationPredicateBucket bucket = new RelationPredicateBucket();
bucket.Relations.Add(CustomListOptionMappingValueEntity.Relations.CustomListEntityUsingListIdListVersion);

// Handle the sub query
HandleSubQuery(bucket);

// Get the collection
EntityCollection<CustomListOptionMappingValueEntity> collection = [];
_adapter.FetchEntityCollection(collection, bucket);

If I use QuerySpec, the memory barely grows more than a few hundred MB.

IEnumerable<CustomListOptionMappingValueRecord> collection = ((DataAccessAdapter)_adapter).FetchQuery<CustomListOptionMappingValueRecord>(
       "SELECT ListId, ListVersion, CostSetId, MappingId, OptionId, MappedValue " +
       "FROM RP_CustomListOptionMappingValue"
);

However, this code is simplified for the test and it actually goes through a factory that previously would generate a bucket with a sub query and then be re-used for whatever predicates and expressions we provided to it.

Is there a way to use an IRelationPredicate bucket with FetchQuery? My POCO class CustomListOptionMappingValueRecord has identical property names to the CustomListOptionMappingValueEntity, just without all the LLBLGen stuff.

I also tried a projection, which works, but takes about 18 seconds vs 4 seconds for the FetchQuery:

List<CustomListOptionMappingValueRecord> collection = new List<CustomListOptionMappingValueRecord>();

ResultsetFields fields = new ResultsetFields(6);
fields[0] = CustomListOptionMappingValueFields.CostSetId;
fields[1] = CustomListOptionMappingValueFields.ListId;
fields[2] = CustomListOptionMappingValueFields.ListVersion;
fields[3] = CustomListOptionMappingValueFields.MappedValue;
fields[4] = CustomListOptionMappingValueFields.MappingId;
fields[5] = CustomListOptionMappingValueFields.OptionId;

List<IDataValueProjector> valueProjectors = new List<IDataValueProjector>(
[
    new DataValueProjector(CustomListOptionMappingValueFields.CostSetId.Name, 0),
    new DataValueProjector(CustomListOptionMappingValueFields.ListId.Name, 1),
    new DataValueProjector(CustomListOptionMappingValueFields.ListVersion.Name, 2),
    new DataValueProjector(CustomListOptionMappingValueFields.MappedValue.Name, 3),
    new DataValueProjector(CustomListOptionMappingValueFields.MappingId.Name, 4),
    new DataValueProjector(CustomListOptionMappingValueFields.OptionId.Name, 5),
]);

DataProjectorToCustomClass<CustomListOptionMappingValueRecord> projector = new DataProjectorToCustomClass<CustomListOptionMappingValueRecord>(collection);
_adapter.FetchProjection(valueProjectors, projector, fields, bucket, 0, false);

Naively in my mind, if I have a bucket and an open adapter, there must be a way to generate the SQL that LLBLGen would generate and then pass that SQL to FetchQuery without having to hand write the SQL myself?

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39877
Joined: 17-Aug-2003
# Posted on: 20-May-2025 09:05:22   

15M is a lot smile With all the extra machinery required to keep track of changes for all these entities, memory is indeed going to be hammered. We're not doing that bad tho... https://github.com/FransBouma/RawDataAccessBencher/blob/master/Results/20231218_net8.txt#L104 but yeah it's not ideal if you just want to consume the data.

Besides the plain sql api, you can also use the poco projections using just the class to project into. The core difference with the projection you defined is that it generates a projection lambda that's much faster than the interpreted projectors you defined.

You likely want to use the ProjectionLambdaCreator, see: https://www.llblgen.com/Documentation/5.12/LLBLGen%20Pro%20RTF/Using%20the%20generated%20code/QuerySpec/gencode_queryspec_projections.htm#projectionlambdacreator

If your target type has more fields, you can also specify the list of fields (it's documented right above that on the same page).

QuerySpec allows you to build predicates in memory using extension methods, so you don't have to write a single SQL statement. The queryspec query you'll be defining has an extension method .Where(predicate) which accepts a predicate (so any predicate expression) which you can build with either the oldskool low level API or queryspec's extension methods.

So your query becomes something like this:

var qf = new QueryFactory();
var q = qf.CustomListOptionMappingValueEntity
            .From(QueryTarget.InnerJoin(qf.CustomListEntityUsingListIdListVersion)
                            .On(... on clause predicate))
            .Where(... where predicates)
            .Select(ProjectionLambdaCreator.Create<CustomListOptionMappingValueRecord, CustomListOptionMappingValueFields>());

Where you have to fill in the on clause predicate for the relationship and the where predicate. You can specify a relation object in the .From() call as well btw. Of course you can specify other sql constructs as well without the need to fall back to plain sql simple_smile

Our Linq provider can handle all that as well btw. Both queryspec and Linq effectively generate code for the projection at runtime (once, the projection lambda code is cached) which is very fast and has a low memory footprint

Frans Bouma | Lead developer LLBLGen Pro
ddp74
User
Posts: 9
Joined: 04-Jan-2024
# Posted on: 20-May-2025 09:51:23   

Thanks for the reply! I was going down the route of writing one IGeneralDataProjector per POCO class when I saw your message to see if that was faster.

I've given this a go and I'm just getting the error: "No projection elements specified"

var qf = new QueryFactory();
var q = qf.Create()
        .Select(ProjectionLambdaCreator.Create<CustomListOptionMappingValueRecord, CustomListOptionMappingValueFields>())
        .From(bucket.Relations)
        .Where(bucket.PredicateExpression);

This doesn't seem to be valid syntax from your example:

var q = qf.CustomListOptionMappingValueEntity

This seems to work, but more than twice as slow as using FetchQuery with raw SQL - 12 seconds for the below vs 4/5s with FetchQuery(string sql):

var qf = new QueryFactory();
var q = qf.Create()
        .Select(() => new CustomListOptionMappingValueRecord()
        {
            CostSetId = CustomListOptionMappingValueFields.CostSetId.ToValue<System.Int32>(),
            OptionId = CustomListOptionMappingValueFields.OptionId.ToValue<System.Int32>(),
            ListId = CustomListOptionMappingValueFields.ListId.ToValue<System.Int32>(),
            ListVersion = CustomListOptionMappingValueFields.ListVersion.ToValue<System.Int32>(),
            MappingId = CustomListOptionMappingValueFields.MappingId.ToValue<System.Int32>(),
            MappedValue = CustomListOptionMappingValueFields.MappedValue.ToValue<System.String>(),
        })
        .From(bucket?.Relations)
        .Where(bucket?.PredicateExpression);

IEnumerable<CustomListOptionMappingValueRecord> collection3 = ((DataAccessAdapter)_adapter).FetchQuery(q);

Is there any way to just get the raw SQL out of the DynamicQuery q?

ddp74
User
Posts: 9
Joined: 04-Jan-2024
# Posted on: 20-May-2025 11:02:58   

Just an update on the IGenericDataProjector approach. Still takes around 12 seconds, so nothing seems to beat the raw performance of FetchQuery<CustomListOptionMappingValueRecord>(string sql), which consistently takes around 3-4 seconds and returns the same data.

public class CustomListOptionMappingValueProjector : IGeneralDataProjector
{
    private IList<CustomListOptionMappingValueRecord> _destination;

    public CustomListOptionMappingValueProjector(IList<CustomListOptionMappingValueRecord> destination)
    {
        _destination = destination ?? throw new ArgumentNullException(nameof(destination));
    }

    public bool RequiresNewObjectArrayPerRow => false;

    public Stopwatch Stopwatch = new Stopwatch();

    public void AddProjectionResultToContainer(List<IDataValueProjector> valueProjectors, object[] rawProjectionResult)
    {
        Stopwatch.Start();

        CustomListOptionMappingValueRecord row = new CustomListOptionMappingValueRecord();
        for (int i = 0; i < valueProjectors.Count; i++)
        {
            IDataValueProjector column = valueProjectors[i];

            object result = rawProjectionResult[i];

            switch (column.ProjectedResultName)
            {
                case "ListId":
                    row.ListId = (int)rawProjectionResult[i];
                    break;
                case "ListVersion":
                    row.ListVersion = (int)rawProjectionResult[i];
                    break;
                case "CostSetId":
                    row.CostSetId = (int)rawProjectionResult[i];
                    break;
                case "MappingId":
                    row.MappingId = (int)rawProjectionResult[i];
                    break;
                case "OptionId":
                    row.OptionId = (int)rawProjectionResult[i];
                    break;
                case "MappedValue":
                    row.MappedValue = result == DBNull.Value ? null : (string)rawProjectionResult[i];
                    break;
            }
        }
        _destination.Add(row);

        Stopwatch.Stop(); // 2-3 seconds to build 12 million records, 10s of unexplained time
    }

    public void Initialize(List<IDataValueProjector> valueProjectors, IDataReader dataSource)
    {
    }
}

I've also tried DataTable t = _adapter.FetchAsDataTable(q); as that takes a DynamicQuery, but that on it's own without doing the conversion of the DataRows to my records takes 19 seconds consistently.

ddp74
User
Posts: 9
Joined: 04-Jan-2024
# Posted on: 20-May-2025 11:25:05   

Right! I've worked out the time difference. It's because I was using a record with fields and not using gets/sets. So FetchQuery(string sql) was only faster as it was never populating any of the records. It was only creating empty instances.

And the reason for the "No projection elements specified" when using ProjectionLambdaCreator.Create<CustomListOptionMappingValueRecord, CustomListOptionMappingValueFields>()was the same thing! Changed to properties and now it's working simple_smile.

Thanks for the help!

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39877
Joined: 17-Aug-2003
# Posted on: 20-May-2025 13:32:57   

Glad you got it working! according to our benchmarks (which fetch around 32000 rows) the plain sql route is a tiny bit faster but shouldn't be that noticeable... simple_smile With 15M rows, the limiting factor is the network anyway simple_smile

Frans Bouma | Lead developer LLBLGen Pro
ddp74
User
Posts: 9
Joined: 04-Jan-2024
# Posted on: 23-May-2025 15:41:31   

I have noticed a slight problem with using:

ProjectionLambdaCreator.Create<TRecord, TFields>()

I think because it caches the expression tree, the SelectListAlias and can't be changed. I noticed some queries where we're aliasing the FROM table kept complaining about being unbound and the SQL showed an alias that wasn't even set anywhere in that fetch.

I changed the code to not use ProjectionLambdaCreator.Create and instead just do the projections manually:

public static Expression<Func<CustomListOptionMappingValueRecord>> Projection(string selectAlias)
{
    return () => new CustomListOptionMappingValueRecord()
    {
        CostSetId = CustomListOptionMappingValueFields.CostSetId.SetObjectAlias(selectAlias).ToValue<int>(),
        ListId = CustomListOptionMappingValueFields.ListId.SetObjectAlias(selectAlias).ToValue<int>(),
        ListVersion = CustomListOptionMappingValueFields.ListVersion.SetObjectAlias(selectAlias).ToValue<int>(),
        MappedValue = CustomListOptionMappingValueFields.MappedValue.SetObjectAlias(selectAlias).ToValue<string>(),
        MappingId = CustomListOptionMappingValueFields.MappingId.SetObjectAlias(selectAlias).ToValue<int>(),
        OptionId = CustomListOptionMappingValueFields.OptionId.SetObjectAlias(selectAlias).ToValue<int>()
    };
}

Which I pass to my generic method:

public IEnumerable<TRecord> FetchQuery<TRecord>(IRelationPredicateBucket bucket,
    Func<string, Expression<Func<TRecord>>> projection)
{
    DynamicQuery<TRecord> query = new QueryFactory().Create()
        .Select(projection(bucket?.SelectListAlias))
        .Distinct()
        .From(bucket?.Relations)
        .Where(bucket?.PredicateExpression);

    return this.FetchQuery(query);
}

This does mean I now need to generate all the projections, which is a bit annoying. So is there a way to keep using ProjectionLambdaCreator.Create?