Language-Integrated Query, or LINQ for short, brings a query execution pipeline directly into C# and the managed environments of .NET Framework and .NET Core. LINQ provides several ways to execute queries and handle complex data manipulation tasks. Chunking is one feature of LINQ that simplifies the way you manage collections.
In this article, we’ll examine chunking in LINQ with code examples in C# to illustrate the concepts. To work with the code examples provided in this article, you should have Visual Studio 2022 installed on your system. If you don’t already have a copy, you can download Visual Studio 2022 from Microsoft.
Create a console application project in Visual Studio 2022
First off, let’s create a .NET 9 console application project in Visual Studio 2022. Assuming you have Visual Studio 2022 installed, follow the steps outlined below to create a new .NET 9 console application project.
- Launch the Visual Studio IDE.
- Click on “Create new project.”
- In the “Create new project” window, select “Console App (.NET Core)” from the list of templates displayed.
- Click Next.
- In the “Configure your new project” window, specify the name and location for the new project.
- Click Next.
- In the “Additional information” window shown next, choose “.NET 9.0 (Standard Term Support)” as the framework version you would like to use.
- Click Create.
We’ll use this .NET 9 console application project to work with chunking in LINQ in the subsequent sections of this article.
The Chunk extension method in LINQ
Chunking is a feature of LINQ that splits a collection into chunks of a fixed size. This is useful in scenarios such as paging and batch processing, or whenever you are handling a large data set that would take a long time to load and consume a lot of memory. Instead of loading a large data set into memory all at once, you can take advantage of chunking to split the data into chunks and then load or process those chunks as needed.
To implement chunking in C#, you can take advantage of the Chunk() extension method in LINQ, which was introduced in .NET 6. This method belongs to the Enumerable class in the System.Linq namespace and returns an enumeration of arrays, each containing a consecutive slice of the source sequence. Chunk uses deferred execution, so the chunks are produced only as you enumerate the result.
The Chunk extension method is defined in the System.Linq namespace as follows:
public static System.Collections.Generic.IEnumerable<TSource[]> Chunk<TSource>(this System.Collections.Generic.IEnumerable<TSource> source, int size);
The Chunk extension method accepts two parameters: the data source or collection to be chunked and the size of each chunk. Here, TSource is the type of the elements in the source sequence and size is the maximum number of elements in each chunk.
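To make the signature more concrete, here is a minimal sketch of how a Chunk-like method could behave. ChunkSketch is a hypothetical helper written only for illustration; it is not the library's actual implementation, and it assumes a console project that uses top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not the real LINQ source): yields arrays of up to `size` elements.
// Because this is an iterator, the argument check runs when enumeration starts.
static IEnumerable<T[]> ChunkSketch<T>(IEnumerable<T> source, int size)
{
    if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));

    var buffer = new List<T>();
    foreach (var item in source)
    {
        buffer.Add(item);
        if (buffer.Count == size)
        {
            yield return buffer.ToArray(); // a full chunk
            buffer.Clear();
        }
    }

    // Whatever is left over becomes a final, shorter chunk.
    if (buffer.Count > 0) yield return buffer.ToArray();
}

foreach (var chunk in ChunkSketch(new[] { 1, 2, 3, 4, 5 }, 2))
{
    Console.WriteLine(string.Join(", ", chunk));
}
// Prints:
// 1, 2
// 3, 4
// 5
```

The sketch shows why the return type is a sequence of arrays: each chunk is materialized as its own TSource[] while the outer sequence is produced lazily.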
Using Chunk to split an array of integers in C#
Let us understand this with a code example. Consider the following code, which uses the Chunk extension method to divide an array of integers into chunks of equal sizes.
int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
var chunks = numbers.Chunk(5);
int counter = 0;
foreach (var chunk in chunks)
{
    Console.WriteLine($"Chunk #{++counter}");
    Console.WriteLine(string.Join(", ", chunk));
}
In the preceding code example, we create an array of 15 integers, then use the Chunk method to split the array into chunks of equal sizes, i.e., five elements in this example. Finally, we display the integers contained in each chunk at the console.
When you execute the console application, the three chunks of five integers will be displayed in the console window as shown in Figure 1.
Figure 1. (Image: IDG)
Using Chunk to split a list of strings in C#
You can also use the Chunk method to split a list of strings as shown in the code snippet given below.
List<string> countries = new List<string>
    { "USA", "UK", "India", "France", "Australia", "Brazil" };
var chunks = countries.Chunk(3);
int counter = 0;
foreach (var chunk in chunks)
{
    Console.WriteLine($"Chunk #{++counter}");
    Console.WriteLine(string.Join(", ", chunk));
}
When you run the preceding code example, two chunks of string data each containing three elements will be created and the text stored in each chunk will be displayed at the console window as shown in Figure 2.
Figure 2. (Image: IDG)
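Paging was mentioned earlier as a use case for chunking. The following sketch shows one way to fetch a single page from an in-memory collection with Chunk. GetPage is a hypothetical helper introduced here for illustration, and the extra country in the list is sample data added so that the last page is shorter than the others.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the zero-based page, or an empty array if that page does not exist.
static T[] GetPage<T>(IEnumerable<T> source, int pageIndex, int pageSize) =>
    source.Chunk(pageSize).ElementAtOrDefault(pageIndex) ?? Array.Empty<T>();

string[] countries = { "USA", "UK", "India", "France", "Australia", "Brazil", "Japan" };

Console.WriteLine(string.Join(", ", GetPage(countries, 0, 3))); // USA, UK, India
Console.WriteLine(string.Join(", ", GetPage(countries, 2, 3))); // Japan
Console.WriteLine(GetPage(countries, 5, 3).Length);             // 0
```

This approach suits collections that are already in memory. When the data lives in a database, paging with Skip and Take through the query provider is usually the better fit, because this Chunk overload operates on IEnumerable and would pull the rows into the application first.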
Using Chunk to process large files in C#
Chunking is beneficial whenever you’re handling large data sets or massive volumes of data. It also comes in handy when processing large files, because you can work through the file in batches instead of holding all of its content in memory. You can take advantage of the Chunk extension method to split the contents of a file and then process it one chunk at a time. The following code snippet shows how you can split a large text file into chunks of 100 lines each and then display the file content at the console.
int size = 100; // maximum number of lines per chunk
// File.ReadLines streams the file lazily, one line at a time.
var data = File.ReadLines(@"D:\largetextfile.txt");
foreach (var chunk in data.Chunk(size))
{
    Console.WriteLine($"Number of lines in this chunk: {chunk.Length}");
    Console.WriteLine("Displaying the chunk content:");
    DisplayText(chunk);
}
void DisplayText(IEnumerable<string> fileContent)
{
    foreach (var text in fileContent)
    {
        Console.WriteLine(text);
    }
}
Here the File.ReadLines method reads the file lazily, one line at a time, without loading the entire file into memory at once. By using the Chunk method, you can process the contents of the file one chunk (up to 100 lines) at a time. In this example, the text contained in each chunk will be displayed in the console window as shown in Figure 3.
Figure 3. (Image: IDG)
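The "one chunk at a time" behavior can be observed without a real file. In the sketch below, ReadRecords is a hypothetical stand-in for a large file or data feed, and the produced counter is added purely to show how many items the source has handed out. Taking only the first chunk should pull no more than one chunk's worth of records from the source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int produced = 0;

// Hypothetical stand-in for a large, lazily read data source.
IEnumerable<string> ReadRecords(int total)
{
    for (int i = 1; i <= total; i++)
    {
        produced++; // count every record the source actually yields
        yield return $"record {i}";
    }
}

// Ask for the first chunk only; the remaining records are never generated.
string[] firstBatch = ReadRecords(1_000_000).Chunk(100).First();

Console.WriteLine($"Batch size: {firstBatch.Length}");
Console.WriteLine($"Records read so far: {produced}");
```

When this runs, the second number should match the batch size rather than the one million records the source could produce, which is the property that keeps memory use bounded when the source is a streamed file.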
Working with Chunk
The Chunk extension method in LINQ enables you to split a data source into bite-sized chunks for processing, thereby improving resource utilization and application performance. However, you should keep the following points in mind:
- If the collection being chunked is empty, the Chunk extension method returns an empty sequence (that is, no chunks at all) and no exception is thrown.
- If the number of elements in the collection is not exactly divisible by the size of the chunk, the last chunk will be shorter and contain only the remaining elements of the collection.
- Additionally, if the size of the chunk is greater than or equal to the size of a non-empty collection being chunked, the Chunk method will return only one chunk containing all of the elements.
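The points above can be checked with a short sketch. The arrays are sample data chosen for illustration. The last case goes slightly beyond the list: Chunk also rejects a chunk size smaller than 1 with an ArgumentOutOfRangeException.

```csharp
using System;
using System.Linq;

int[] empty = Array.Empty<int>();
int[] seven = { 1, 2, 3, 4, 5, 6, 7 };

// Empty source: no chunks at all, and no exception.
Console.WriteLine(empty.Chunk(3).Count());                                  // 0

// 7 elements in chunks of 3: the last chunk holds the single leftover element.
Console.WriteLine(string.Join(", ", seven.Chunk(3).Select(c => c.Length))); // 3, 3, 1

// Chunk size larger than the source: one chunk with every element.
Console.WriteLine(seven.Chunk(10).Count());                                 // 1

// A chunk size below 1 is rejected.
bool rejected = false;
try
{
    _ = seven.Chunk(0).ToList();
}
catch (ArgumentOutOfRangeException)
{
    rejected = true;
}
Console.WriteLine(rejected);                                                // True
```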
If you plan on working with large data sets, or even the occasional large file, Chunk would be a very handy addition to your toolkit.