There are many approaches how we can join the data in the memory with the data in database. But you should know the advantages and disadvantages for each one.

Test Environment

Hardware components

  • Test Machine: Intel i7-3517U 1.9 GHz max. frequency 2.4 GHz
  • Memory: 16 GB
  • HDD: Samsung SSD mzmpc256hbgj-000

Software components

Preconditions

  • Count of entities in the Employees table: 11 060 573
  • Count of entities in the memory: 50 000 / 1 000 000

Postconditions

  • Excepted count of rows to be returned: 50 000 / 1 000 000

I created this context based on CompanyDb database, which was created especially for this sample.image

image

Use Case

Each test takes the collection of type LookupEmployee and is searching for the same entities based on the Id in the database. It the other words i’m testing Join based on the Id.

private static void Main(string[] args)
{
	const int countOfEntries = 50000;
	var employeesLookupTable = CreateEmployeesLookupTable(countOfEntries);

	var stopwatch = Stopwatch.StartNew();
	const int chunkSize = 10;
	IEnumerable foundEmployees = new List();

	Console.WriteLine("Test ready, press any key to continue...");
	Console.ReadKey();
	Console.WriteLine("Test started at {0}", DateTime.Now);

	try
	{
		//Test method call
	}
	catch (Exception e)
	{
		Console.WriteLine("Exception was thrown: {0}", e);
		BeepWhileAnyKey();
		PressAnyKeyToContinue();
	}
	stopwatch.Stop();

	Console.WriteLine("Elapsed time: {0}", stopwatch.Elapsed);
	Console.WriteLine("Found employees: {0}", foundEmployees.Count());
	Console.WriteLine("Test finished at {0}", DateTime.Now);

	BeepWhileAnyKey();
	PressAnyKeyToContinue();
}

In each Test Case are monitored following performance indicators:

  • Execution time
  • % Processor Time (\Process(JoiningInMemoryDataWithDBTable)\% Processor Time)
  • Working Set – Private (in bytes) (\Process(JoiningInMemoryDataWithDBTable)\Working Set – Private)
  • Transactions/seconds (\MSSQL$SQLSERVER2012:Databases(CompanyDb)\Transactions/sec)
  • Total Server Memory (in kilobytes) (\MSSQL$SQLSERVER2012:Memory Manager\Total Server Memory (KB))

Test Cases

TEST CASE 1 – Using LINQ Join Operator

The first test is using the LINQ Join operator.

private static IEnumerable FoundEmployeesWithJoin(IEnumerable employeesLookupTable)
{
	using (var companyDbContext = new CompanyDbContext())
	{
		return employeesLookupTable.Join(companyDbContext.Employees, st => st.Id, e => e.Id, (st, e) => e).ToList();
	}
}

Results TEST CASE 1

  • Execution time(HH:mm:ss:fffffff): 01:11:15.8002779
  • Avg. % Processor Time: 32.438
  • Avg. Working Set – Private (bytes): 13 007 237 288.548
  • Avg. Total Server Memory (kilobytes): 321 706.498
  • Avg. Transactions / seconds: 0.049

imageThe LINQ Join operator is in this scenario disastrous. The test application consumed in short time whole memory and the CPU was also pretty busy. The problem of the Join operator is that he can’t produce useful query when creating join with the memory collection. The result is that the Join operator is loading the whole table data in the memory (in this case about 11 000 000 entities) and performs the join operation there. Keep this in mind when are you using Join with memory collection on one and database on other side.

TEST CASE 2 – Using PredicateBuilder

The next test is based on the Joe’s and Ben’s Albahari’s famous PredicateBuilder included in the LINQKit.

private static IEnumerable FoundEmployeesWithSimplePredicateBuilder(IEnumerable employeesLookupTable)
{
	int i = 0;
	var predicate = PredicateBuilder.False();
	foreach (var employeeFromLookupTable in employeesLookupTable)
	{
		predicate = predicate.Or(searchedEmployee => searchedEmployee.Id == employeeFromLookupTable.Id);
		Console.WriteLine(i++);
	}

	using (var companyDbContext = new CompanyDbContext())
	{
		return companyDbContext.Employees.AsExpandable().Where(predicate).ToList();
	}
}

Results TEST CASE 2

The result of this test: The application crashed after the 49 999 iterations (due to adding the Or operator to the Predicate object)

image

Why? Well the PredicateBuilder must recursively walking the expression tree and because the stack size is limited, the StackOverflowException will be thrown.

TEST CASE 3 – Using PredicateBuilder with fragmentation

Using the Join operator or the PredicateBuilder as shown in the Test Case 1 or Test Case 2 is not optional. What can we do? We can fragment the employeesLookupTable into smaller pieces.

private static IEnumerable FoundEmployeesWithFragments(IEnumerable employeesLookupTable, int chunkSize)
{
	var employeesLookupTableFragmented = employeesLookupTable.Select((employeeToSearch, index) =>
															new { EmployeeToSearch = employeeToSearch, Index = index })
											.GroupBy(employeesToSearch => employeesToSearch.Index / chunkSize);

	var foundEmployees = new List();

	foreach (var employeeLookupTableFragmentEntry in employeesLookupTableFragmented)
	{
		IEnumerable employeesFromLookupTable = employeeLookupTableFragmentEntry.Select(e => e.EmployeeToSearch);
		var predicate = PredicateBuilder.False();

		foreach (var employeeFromLookupTable in employeesFromLookupTable)
		{
			predicate = predicate.Or(searchedEmployee => searchedEmployee.Id == employeeFromLookupTable.Id);
		}

		using (var companyDbContext = new CompanyDbContext())
		{
			foundEmployees.AddRange(companyDbContext.Employees.AsExpandable().Where(predicate).ToList());
		}

	}

	return foundEmployees;
}

Results TEST CASE 3 (Each iteration with 100 fragments / 50 000 entities in the memory)

  • Execution time(HH:mm:ss:fffffff): 01:29:14.4135545
  • Avg. % Processor Time: 0.059
  • Avg. Working Set – Private (bytes): 63 724 622.226
  • Avg. Total Server Memory (kilobytes): 886 166.552
  • Avg. Transactions / seconds: 0.050

image

Better? Yes but… The the memory and the CPU is now fine, but the execution time is still horrible. How long would it take on the SQL Server? Couple of milliseconds. What the heck is the problem?

We are looking with the SQL Server Profiler how the query performs.

image

The execution of the query below took about 24 seconds.

exec sp_executesql N'SELECT
    [Extent1].[Id] AS [Id],
    [Extent1].[FirstName] AS [FirstName],
    [Extent1].[Surname] AS [Surname],
    [Extent1].[InsertDateTime] AS [InsertDateTime],
    [Extent1].[FK_Position] AS [FK_Position]
    FROM [dbo].[Employees] AS [Extent1]
    WHERE (([Extent1].[Id] = @p__linq__0) AND (@p__linq__0 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__1) AND (@p__linq__1 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__2) AND (@p__linq__2 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__3) AND (@p__linq__3 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__4) AND (@p__linq__4 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__5) AND (@p__linq__5 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__6) AND (@p__linq__6 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__7) AND (@p__linq__7 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__8) AND (@p__linq__8 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__9) AND (@p__linq__9 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__10) AND (@p__linq__10 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__11) AND (@p__linq__11 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__12) AND (@p__linq__12 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__13) AND (@p__linq__13 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__14) AND (@p__linq__14 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__15) AND (@p__linq__15 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__16) AND (@p__linq__16 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__17) AND (@p__linq__17 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__18) AND (@p__linq__18 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__19) AND (@p__linq__19 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__20) AND (@p__linq__20 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__21) AND (@p__linq__21 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__22) AND (@p__linq__22 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__23) AND (@p__linq__23 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__24) AND (@p__linq__24 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__25) AND (@p__linq__25 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__26) AND (@p__linq__26 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__27) AND (@p__linq__27 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__28) AND (@p__linq__28 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__29) AND (@p__linq__29 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__30) AND (@p__linq__30 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__31) AND (@p__linq__31 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__32) AND (@p__linq__32 IS NOT
NULL)) OR (([Extent1].[Id] = @p__linq__33) AND (@p__linq__33 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__34) AND (@p__linq__34 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__35) AND (@p__linq__35 IS NOT NULL)) OR
(([Extent1].[Id] = @p__linq__36) AND (@p__linq__36 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__37) AND (@p__linq__37 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__38) AND (@p__linq__38 IS NOT NULL)) OR (([Extent1].[Id] =
@p__linq__39) AND (@p__linq__39 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__40) AND (@p__linq__40 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__41) AND (@p__linq__41 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__42) AND
(@p__linq__42 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__43) AND (@p__linq__43 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__44) AND (@p__linq__44 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__45) AND (@p__linq__45 IS NOT
NULL)) OR (([Extent1].[Id] = @p__linq__46) AND (@p__linq__46 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__47) AND (@p__linq__47 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__48) AND (@p__linq__48 IS NOT NULL)) OR
(([Extent1].[Id] = @p__linq__49) AND (@p__linq__49 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__50) AND (@p__linq__50 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__51) AND (@p__linq__51 IS NOT NULL)) OR (([Extent1].[Id] =
@p__linq__52) AND (@p__linq__52 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__53) AND (@p__linq__53 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__54) AND (@p__linq__54 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__55) AND
(@p__linq__55 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__56) AND (@p__linq__56 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__57) AND (@p__linq__57 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__58) AND (@p__linq__58 IS NOT
NULL)) OR (([Extent1].[Id] = @p__linq__59) AND (@p__linq__59 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__60) AND (@p__linq__60 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__61) AND (@p__linq__61 IS NOT NULL)) OR
(([Extent1].[Id] = @p__linq__62) AND (@p__linq__62 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__63) AND (@p__linq__63 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__64) AND (@p__linq__64 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__65) AND (@p__linq__65 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__66) AND (@p__linq__66 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__67) AND (@p__linq__67 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__68) AND (@p__linq__68 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__69) AND (@p__linq__69 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__70) AND (@p__linq__70 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__71) AND (@p__linq__71 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__72) AND (@p__linq__72 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__73) AND (@p__linq__73 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__74) AND (@p__linq__74 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__75) AND (@p__linq__75 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__76) AND (@p__linq__76 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__77) AND (@p__linq__77 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__78) AND (@p__linq__78 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__79) AND (@p__linq__79 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__80) AND (@p__linq__80 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__81) AND (@p__linq__81 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__82) AND (@p__linq__82 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__83) AND (@p__linq__83 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__84) AND (@p__linq__84 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__85) AND (@p__linq__85 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__86) AND (@p__linq__86 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__87) AND (@p__linq__87 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__88) AND (@p__linq__88 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__89) AND (@p__linq__89 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__90) AND (@p__linq__90 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__91) AND (@p__linq__91 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__92) AND (@p__linq__92 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__93) AND (@p__linq__93 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__94) AND
(@p__linq__94 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__95) AND (@p__linq__95 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__96) AND (@p__linq__96 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__97) AND (@p__linq__97 IS NOT
NULL)) OR (([Extent1].[Id] = @p__linq__98) AND (@p__linq__98 IS NOT NULL)) OR (([Extent1].[Id] = @p__linq__99) AND (@p__linq__99 IS NOT NULL))',N'@p__linq__0 int,@p__linq__1 int,@p__linq__2 int,@p__linq__3 int,@p__linq__4
int,@p__linq__5 int,@p__linq__6 int,@p__linq__7 int,@p__linq__8 int,@p__linq__9 int,@p__linq__10 int,@p__linq__11 int,@p__linq__12 int,@p__linq__13 int,@p__linq__14 int,@p__linq__15 int,@p__linq__16 int,@p__linq__17
int,@p__linq__18 int,@p__linq__19 int,@p__linq__20 int,@p__linq__21 int,@p__linq__22 int,@p__linq__23 int,@p__linq__24 int,@p__linq__25 int,@p__linq__26 int,@p__linq__27 int,@p__linq__28 int,@p__linq__29 int,@p__linq__30
int,@p__linq__31 int,@p__linq__32 int,@p__linq__33 int,@p__linq__34 int,@p__linq__35 int,@p__linq__36 int,@p__linq__37 int,@p__linq__38 int,@p__linq__39 int,@p__linq__40 int,@p__linq__41 int,@p__linq__42 int,@p__linq__43
int,@p__linq__44 int,@p__linq__45 int,@p__linq__46 int,@p__linq__47 int,@p__linq__48 int,@p__linq__49 int,@p__linq__50 int,@p__linq__51 int,@p__linq__52 int,@p__linq__53 int,@p__linq__54 int,@p__linq__55 int,@p__linq__56
int,@p__linq__57 int,@p__linq__58 int,@p__linq__59 int,@p__linq__60 int,@p__linq__61 int,@p__linq__62 int,@p__linq__63 int,@p__linq__64 int,@p__linq__65 int,@p__linq__66 int,@p__linq__67 int,@p__linq__68 int,@p__linq__69
int,@p__linq__70 int,@p__linq__71 int,@p__linq__72 int,@p__linq__73 int,@p__linq__74 int,@p__linq__75 int,@p__linq__76 int,@p__linq__77 int,@p__linq__78 int,@p__linq__79 int,@p__linq__80 int,@p__linq__81 int,@p__linq__82
int,@p__linq__83 int,@p__linq__84 int,@p__linq__85 int,@p__linq__86 int,@p__linq__87 int,@p__linq__88 int,@p__linq__89 int,@p__linq__90 int,@p__linq__91 int,@p__linq__92 int,@p__linq__93 int,@p__linq__94 int,@p__linq__95 int,@p__linq__96 int,@p__linq__97 int,@p__linq__98 int,@p__linq__99 int',@p__linq__0=2,@p__linq__1=1,@p__linq__2=12501,@p__linq__3=12502,@p__linq__4=5,@p__linq__5=12503,@p__linq__6=6,@p__linq__7=12504,@p__linq__8=7,@p__linq__9=8,@p__linq__10=9,@p__linq__11=10,@p__linq__12=11,@p__linq__13=12,@p__linq__14=13,@p__linq__15=14,@p__linq__16=3,@p__linq__17=15,@p__linq__18=12505,@p__linq__19=4,@p__linq__20=16,@p__linq__21=12506,@p__linq__22=19,@p__linq__23=17,@p__linq__24=12507,@p__linq__25=20,@p__linq__26=18,@p__linq__27=12508,@p__linq__28=21,@p__linq__29=23,@p__linq__30=12509,@p__linq__31=22,@p__linq__32=39,@p__linq__33=24,@p__linq__34=12510,@p__linq__35=40,@p__linq__36=25,@p__linq__37=12511,@p__linq__38=41,@p__linq__39=26,@p__linq__40=12512,@p__linq__41=42,@p__linq__42=27,@p__linq__43=12513,@p__linq__44=43,@p__linq__45=28,@p__linq__46=12514,@p__linq__47=44,@p__linq__48=29,@p__linq__49=12515,@p__linq__50=45,@p__linq__51=30,@p__linq__52=12516,@p__linq__53=46,@p__linq__54=31,@p__linq__55=12517,@p__linq__56=47,@p__linq__57=32,@p__linq__58=12518,@p__linq__59=48,@p__linq__60=33,@p__linq__61=12519,@p__linq__62=49,@p__linq__63=34,@p__linq__64=50,@p__linq__65=12520,@p__linq__66=35,@p__linq__67=51,@p__linq__68=12521,@p__linq__69=36,@p__linq__70=52,@p__linq__71=12522,@p__linq__72=37,@p__linq__73=53,@p__linq__74=12523,@p__linq__75=38,@p__linq__76=54,@p__linq__77=12524,@p__linq__78=63,@p__linq__79=12525,@p__linq__80=64,@p__linq__81=12526,@p__linq__82=55,@p__linq__83=65,@p__linq__84=12527,@p__linq__85=56,@p__linq__86=66,@p__linq__87=12528,@p__linq__88=57,@p__linq__89=67,@p__linq__90=12529,@p__linq__91=58,@p__linq__92=12530,@p__linq__93=59,@p__linq__94=12531,@p__linq__95=60,@p__linq__96=12532,@p__linq__97=61,@p__linq__98=12533,@p__linq__99=62

The Parameter Sniffing comes to play. When I ran the same query in the Microsoft SQL Server Management Studio with the OPTION(RECOMPILE) (more about here) it took only 22 milliseconds until finished.

image

The problem is caused by choice inappropriate execution plan. You can convert the query above with the excellent sp_executesql formatter into normal Select query. Running the query without Recompile option will generated following execution plan.

image

Running the query with the Recompile option will generated following execution plan.

image

Now is clear why the same query performs better with the recompile option. Index Seek is in this scenario the best choice.

The problem in our case is that the query is generated by PredicateBuilder and we can’t use recompile option here. What we can try is to use small-sized fragments.

Results TEST CASE 3 (Each iteration with 10 fragments / 50 000 entities in the memory)

  • Execution time(HH:mm:ss:fffffff): 00:00:14.7610985
  • Avg. % Processor Time: 58.159
  • Avg. Working Set – Private (bytes): 74 907 452.952
  • Avg. Total Server Memory (kilobytes): 285 762.286
  • Avg. Transactions / seconds: 0.037

image

The time is really fantastic and improved about 39 000 percent! The duration of the first query is about 12 milliseconds and any other takes no time. The Parameter Sniffing comes to play again.

image

The execution plan looks better now, the query is using clustered index seek.

image

Results TEST CASE 3 (Each iteration with 10 fragments / 1 000 000 entities in the memory)

  • Execution time(HH:mm:ss:fffffff): 00:02:40.7322217
  • Avg. % Processor Time: 82.00
  • Avg. Working Set – Private (bytes): 1 300 948.423
  • Avg. Total Server Memory (kilobytes): 312 860.000
  • Avg. Transactions / seconds: 0.051

image

TEST CASE 4 – Using PredicateBuilder with fragmentation parallelized

Iterating in the foreach loop in small fragments was the good idea. Maybe it can be optimized with the help of parallelization. To make possible to execute queries parallel we must create each time the new instance of CompanyDbContext, because DbContext isn’t thread safe.

private static IEnumerable FoundEmployeesWithFragmentsParallel(IEnumerable employeesLookupTable, int chunkSize)
{
	LookupEmployee[] lookupEmployees = employeesLookupTable.ToArray();
	var foundEmployees = new BlockingCollection();
	var rangePartitioner = Partitioner.Create(0, employeesLookupTable.Count(), chunkSize);

	Parallel.ForEach(rangePartitioner, range =>
	{
		var predicate = PredicateBuilder.False();

		for (int i = range.Item1; i < range.Item2; i++) 		{ 			int searchedId = lookupEmployees[i].Id; 			predicate = predicate.Or(searchedEmployee => searchedEmployee.Id == searchedId);
		}

		using (var companyDbContext = new CompanyDbContext())
		{
			companyDbContext.Employees.AsExpandable().Where(predicate).AsParallel().ForEach(foundEmployees.Add);
		}

	});

	return foundEmployees;
}

Results TEST CASE 4 (Each iteration with 10 fragments / 50 000 entities in the memory)

  • Execution time(HH:mm:ss:fffffff): 00:00:15.8002779
  • Avg. % Processor Time: 138.028
  • Avg. Working Set – Private (bytes): 57 753 600.000
  • Avg. Total Server Memory (kilobytes): 287 206.000
  • Avg. Transactions / seconds: 0.067

image

Parallelism seems to be overhead but it isn’t.  The only problem is the small amount of entities in the memory. But when we repeat the same tests with the 1 000 000 entities in the memory the execution time is about the half.

Results TEST CASE 4 (Each iteration with 10 fragments / 1 000 000 entities in the memory)

  • Execution time(HH:mm:ss:fffffff): 00:01:34.260
  • Avg. % Processor Time: 221.00
  • Avg. Working Set – Private (bytes): 1 359 600.239
  • Avg. Total Server Memory (kilobytes): 328 573.000
  • Avg. Transactions / seconds: 0.045

image

TEST CASE 5 – Using Stored Procedure with the help of CodeFirstStoredProcs

At this time Entity Framework in the version 6.0.2 don’t supports calls to the Stored Procedures with the Table-Valued Parameter in the Code First manner. Because of this I used for this test DbContext extension methods from the CodeFirstStoredProcs NuGet package in the version 2.6.0.

The Stored Procedure is using Table-Valued Parameter as input parameter type.

CREATE TYPE [dbo].[LookupEmployees] AS TABLE(
	[Id] [int] NOT NULL PRIMARY KEY,
	[InsertDateTime] [datetime] NOT NULL)

CREATE PROCEDURE [dbo].[GetExistingEmployees]
	@employeesLookup LookupEmployees READONLY
AS
BEGIN
	SET NOCOUNT ON;
	SELECT dbo.Employees.Id,
		dbo.Employees.Firstname,
		dbo.Employees.Surname,
		dbo.Employees.InsertDateTime
	FROM dbo.Employees
	JOIN @employeesLookup el
	ON dbo.Employees.Id = el.Id
END

The next step is to define types in the code which are representing the Stored Procedure and the Table-Valued Parameter.

The class LookupEmployeesTableValuedParameter represents the type LookupEmployees I defined in the SQL Server below.

[StoredProcAttributes.Schema("dbo")]
[StoredProcAttributes.TableName("LookupEmployees")]
internal class LookupEmployeesTableValuedParameter
{
	[StoredProcAttributes.Name("Id")]
	[StoredProcAttributes.ParameterType(SqlDbType.Int)]
	public int Id { get; set; }

	[StoredProcAttributes.Name("InsertDateTime")]
	[StoredProcAttributes.ParameterType(SqlDbType.DateTime)]
	public DateTime InsertDateTime { get; set; }
}

The class GetExistingEmployeesStoredProcedure represent the Stored Procedure GetExistingEmployees.

[StoredProcAttributes.Schema("dbo")]
[StoredProcAttributes.Name("GetExistingEmployees")]
internal class GetExistingEmployeesStoredProcedure
{
	[StoredProcAttributes.ParameterType(SqlDbType.Structured)]
	[StoredProcAttributes.Name("employeesLookup")]
	public IEnumerable LookupEmployees { get; set; }
}

The test method FoundEmploeyeesWithStoredProcedure.

private static IEnumerable FoundEmployeesWithStoredProcedure(IEnumerable employeesLookupTable)
{
	if (employeesLookupTable == null)
	{
		throw new ArgumentNullException("employeesLookupTable");
	}
	var lookupEmployeesTableValuedParameters = new BlockingCollection(employeesLookupTable.Count());
	Parallel.ForEach(employeesLookupTable, employeesLookupTableEntry => lookupEmployeesTableValuedParameters.Add(new LookupEmployeesTableValuedParameter
	{
		Id = employeesLookupTableEntry.Id,
		InsertDateTime = employeesLookupTableEntry.InsertDateTime
	}));

	var getExistingEmployeesStoredProcedure = new GetExistingEmployeesStoredProcedure
	{
		LookupEmployees = new List(lookupEmployeesTableValuedParameters)
	};

	var storedProcedure = new StoredProc().ReturnsTypes(typeof (Employee));

	using (var companyDbContext = new CompanyDbContext())
	{
		return companyDbContext.CallStoredProc(storedProcedure, getExistingEmployeesStoredProcedure).ToList();
	}
}

Results TEST CASE 5 (50 000 entities in the memory)

  • Execution time(HH:mm:ss:fffffff): 00:00:09.872
  • Avg. % Processor Time: 17.00
  • Avg. Working Set – Private (bytes): 55 203 157.333
  • Avg. Total Server Memory (kilobytes): 327 079.000
  • Avg. Transactions / seconds: 0.080

image

Results TEST CASE  5 (1 000 000 entities in the memory)

  • Execution time(HH:mm:ss:fffffff): 00:01:23.761
  • Avg. % Processor Time: 59.656
  • Avg. Working Set – Private (bytes): 973 795 479,704
  • Avg. Total Server Memory (kilobytes): 382 089.481
  • Avg. Transactions / seconds: 0.062

image

The test with Stored Procedure performs very good when the count of entities in the memory is lower, but by 1 000 000 entities is the execution time almost the same as with the PredicateBuilder parallelized version.

Summary

image

So what is the best solution to joining the data in the memory with data in the database? PredicateBuilder is the good solution but utilize CPU more as the Stored Procedure.

But what I would like really to show you is that performance matters. We need sometimes only change our thinking and we can reach excellent performance in no time.

Downloads

2 comments

  1. That is a very good tip especially to those new to the blogosphere.
    Simple but very precise info… Thanks for sharing this
    one. A must read article!

    Like

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.