LINQ Contains using nested full text search subquery (bug?)

Posts   
 
    
Posts: 3
Joined: 24-Oct-2007
# Posted on: 23-Jun-2008 18:40:53   

Hello,

I am a long-time user of the LLBLGen Pro product and am starting to play around with getting a simple application up using the new LINQ-to-LLBLGenPro functionality. I seem to have run into a problem in using the Contains operators in order to perform an "IN" query using subqueries. I am using the latest runtime libraries available on the site (2.6.8.619).

Here is the problem:

I have two tables Project and Task defined as such:

Project table [Id] int identity [Description] text

Task table [Id] int identity [ProjectId] int (FK to project table) [Description] text [Comments] text

The Project.Description, Task.Description, and Task.Comments fields are all indexed using SQL Server 2000 full text indexing. My application presents the user with a single textbox to enter a search term and then displays a list of projects where either the project or any of its tasks contains the search term.

I followed the instructions in the LLBLGen manual to create a FullTextSearch function mapping so I could use that in my queries.


Imports SD.LLBLGen.Pro.ORMSupportClasses

''' Class which is used to specify the call to the Contains construct
Public Class MyCustomFunctions
    Public Shared Function FullTextSearch(ByVal fieldToSearch As String, ByVal toFind As String) As Boolean
        ' empty body, as it's just here to make the query compile. The call is converted to a SQL function.
        Return True
    End Function
End Class

''' Class which defines the custom mapping between ServiceCenterFunctions.FullTextSearch and CONTAINS
Public Class MyCustomFunctionMappings
    Inherits FunctionMappingStore

    Public Sub New()
        ' FullTextSearch(2) on 1 field
        Me.Add(New FunctionMapping(GetType(MyCustomFunctions), "FullTextSearch", 2, "CONTAINS({0}, {1})"))
    End Sub
End Class

Due to a performance problem in SQL Server 2000 in performing multiple CONTAINS() predicates OR'd together in the same query, I am using a slightly less straightforward method to perform this task.

Basically, if I run this query against QA, it returns almost instantly:


SELECT Id FROM Project WHERE CONTAINS(Description, 'someSearchTerm');

However, if I run this query, it takes about 30 seconds:


SELECT Project.Id FROM Project
LEFT JOIN Task ON Project.Id = Task.ProjectId
WHERE
CONTAINS(Project.Description, 'someSearchTerm')
OR
CONTAINS(Task.Description, 'someSearchTerm')
OR
CONTAINS(Task.Comments, 'someSearchTerm')
;

After googling to research this, it seems this is a well-known limitation in SQL Server 2000 and due to this I will need to approach the problem slightly differently.

My idea was to use the new LINQ functionality to create 3 queries that each return a list of ProjectIds that match on the 3 different columns I am searching and then use a LINQ query that does a Contains call to the other queries in order to only return matching projects. The general SQL I am trying to get to is something like this:


SELECT * FROM Project
WHERE Id IN (
  SELECT Id FROM Project WHERE CONTAINS(Project.Description, 'someSearchTerm')
) OR Id IN (
  SELECT ProjectId FROM Task WHERE CONTAINS(Task.Description, 'someSearchTerm')
) OR Id IN (
  SELECT ProjectId FROM Task WHERE CONTAINS(Task.Comments, 'someSearchTerm')
)

Translating this to the actual LINQ code I am using:


        Public Shared Function GetProjects() As IQueryable(Of ProjectEntity)
            Dim metaData As New LinqMetaData()
            metaData.CustomFunctionMappings = New MyCustomFunctionMappings()
            Dim q = metaData.Project
            Return q
        End Function

        Public Shared Function GetTasks() As IQueryable(Of TaskEntity)
            Dim metaData As New LinqMetaData()
            metaData.CustomFunctionMappings = New MyCustomFunctionMappings()
            Dim q = metaData.Task
            Return q
        End Function

        Public Function Search(ByVal searchTerm As String) As IQueryable(Of ProjectEntity)

            Dim q1 = From p In GetProjects() _
                     Where MyCustomFunctions.FullTextSearch(p.Description, searchTerm) _
                     Select p.Id
                
            Dim q2 = From t In GetTasks() _
                     Where MyCustomFunctions.FullTextSearch(t.Description, searchTerm) _
                     Select t.ProjectId
        
            Dim q3 = From t In GetTasks() _
                     Where MyCustomFunctions.FullTextSearch(t.Comments, searchTerm) _
                     Select t.ProjectId
        
            Dim q4 = From p In GetProjects() _
                     Where q1.Contains(p.Id) Or q2.Contains(p.Id) Or q3.Contains(p.Id) _
                     Select p

            Return q4

        End Function

When I run this code in the debugger, I can see q1, q2, and q3 all can execute successfully when I expand them out and they contain the appropriate list of project ids. However, when I attempt to execute/expand the q4 query, it fails with a SqlClient.SqlException with a message of "Cannot use empty object or column names. Use a single space if necessary. Line 1: Incorrect syntax near '.'."

Below is the QueryExecuted from the above which shows some alias weirdness:


SELECT [LPLA_1].[Id] AS [Id], [LPLA_1].[Description] AS [Description] FROM [MyDatabase].[dbo].[Project] [LPLA_1]  WHERE ( ( ( ( [].[LPFA_4] = @LPFA_41))))

I've been able to work around this for now by doing some in-memory processing of the first 3 queries and turning them into arrays, unioning them together, and passing them to the final query in the Contains clause but it is less than ideal from a performance perspective. Ideally, all the processing would happen on the server.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39872
Joined: 17-Aug-2003
# Posted on: 23-Jun-2008 18:55:56   

Will look into it.

Frans Bouma | Lead developer LLBLGen Pro
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39872
Joined: 17-Aug-2003
# Posted on: 24-Jun-2008 11:18:53   

With the latest build (not yet released) I can't reproduce it:


[Test]
public void CustomFullTextSearchWithMultipleQueriesAndContains()
{
    LinqMetaData metaData = new LinqMetaData(null, new NorthwindFunctionMappings());
    var q1 = from e in metaData.Employee
             where NorthwindFunctions.FullTextSearch(e.Notes, "BA")
             select e.EmployeeId;
    var q2 = from e in metaData.Employee
             where NorthwindFunctions.FullTextSearch(e.Notes, "English")
             select e.EmployeeId;
    var q3 = from e in metaData.Employee
             where NorthwindFunctions.FullTextSearch(e.Notes, "BS")
             select e.EmployeeId;

    var q4 = from e in metaData.Employee
             where q1.Contains(e.EmployeeId) || q2.Contains(e.EmployeeId) || q3.Contains(e.EmployeeId)
             select e;

    int count = 0;
    foreach(var v in q4)
    {
        count++;
        Assert.IsTrue(v.Notes.Contains("BA") || v.Notes.Contains("English") || v.Notes.Contains("BS"));
    }
    Assert.AreEqual(6, count);
}

SQL query produced:


SELECT  [LPLA_1].[EmployeeID] AS [EmployeeId], [LPLA_1].[LastName], [LPLA_1].[FirstName], 
        [LPLA_1].[Title], [LPLA_1].[TitleOfCourtesy], [LPLA_1].[BirthDate], [LPLA_1].[HireDate], 
        [LPLA_1].[Address], [LPLA_1].[City], [LPLA_1].[Region], [LPLA_1].[PostalCode], [LPLA_1].[Country], 
        [LPLA_1].[HomePhone] AS [Phone], [LPLA_1].[Extension], [LPLA_1].[Photo], [LPLA_1].[Notes], 
        [LPLA_1].[ReportsTo], [LPLA_1].[PhotoPath], [LPLA_1].[RegionID] AS [RegionId] 
FROM    [Northwind].[dbo].[Employees] [LPLA_1]  
WHERE (((
(
    EXISTS 
    (
        SELECT [LPA_L1].[EmployeeId] 
        FROM 
        (
            SELECT  [LPLA_2].[EmployeeID] AS [EmployeeId] 
            FROM    [Northwind].[dbo].[Employees] [LPLA_2]  
            WHERE (((CONTAINS([LPLA_2].[Notes], @LOce09b7de2))))
        ) [LPA_L1] 
        WHERE ([LPA_L1].[EmployeeId] = [LPLA_1].[EmployeeID])
    ) 
    OR EXISTS 
    (
        SELECT [LPA_L2].[EmployeeId] 
        FROM 
        (
            SELECT  [LPLA_4].[EmployeeID] AS [EmployeeId] 
            FROM    [Northwind].[dbo].[Employees] [LPLA_4]  
            WHERE (((CONTAINS([LPLA_4].[Notes], @LOd0a055074))))
        ) [LPA_L2] 
        WHERE ([LPA_L2].[EmployeeId] = [LPLA_1].[EmployeeID])
    )
) OR  EXISTS 
(
    SELECT [LPA_L3].[EmployeeId] 
    FROM 
    (
        SELECT  [LPLA_6].[EmployeeID] AS [EmployeeId] 
        FROM    [Northwind].[dbo].[Employees] [LPLA_6]  
        WHERE (((CONTAINS([LPLA_6].[Notes], @LOce17b7de6))))
    ) [LPA_L3] 
    WHERE ( [LPA_L3].[EmployeeId] = [LPLA_1].[EmployeeID])
))))

I'll pack and attach the latest build of linq support classes and ormsupportclasses. There was an important bugfix in how derived tables were handled in subqueries inside predicates. I think you ran into this bug as well. The new build is released later today.

(edit) I've attached the build of the ormsupportclasses + linqsupportclasses I've used with this test.

Frans Bouma | Lead developer LLBLGen Pro