Working with generics
Visual Studio 2005 shipped with .NET 2.0, which introduced generics. Generics give developers the ability to design classes and methods that defer the specification of one or more types until the class or method is declared or instantiated by client code.
Generics offer features previously unavailable in .NET. One benefit of generics, and potentially the most common, is the implementation of collections that provide a consistent interface over different data types without needing to write specific code for each data type.
Constraints can be used to restrict the types that are supported by a generic method or class, or to guarantee that those types implement specific interfaces.
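As a minimal sketch of how constraints read in practice, the following hypothetical helper class (the names ConstraintExamples, Max, and CreateMany are ours, for illustration only) uses an interface constraint so that CompareTo can be called, and a constructor constraint so that instances can be created:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class ConstraintExamples
{
    // The interface constraint guarantees that CompareTo is available on T.
    public static T Max<T>(T left, T right) where T : IComparable<T>
    {
        return left.CompareTo(right) >= 0 ? left : right;
    }

    // The new() constraint guarantees that T has a parameterless constructor.
    public static List<T> CreateMany<T>(int count) where T : new()
    {
        var items = new List<T>(count);
        for (int i = 0; i < count; i++)
        {
            items.Add(new T());
        }
        return items;
    }
}
```

Without the where clauses, neither the call to CompareTo nor the new T() expression would compile.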
Limits of generics
Constraints within generics in C# are currently limited to a parameterless constructor, interfaces, base classes, and whether the type is a struct or a class (value or reference type). This really means that code within a generic method or type can only construct instances of the type parameter or make use of the methods and properties that the constraints guarantee.
Due to these restrictions, code within generic types or methods cannot apply operators (such as + or <) to values whose type is a type parameter.
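To make the operator limitation concrete, here is a sketch of one possible workaround, under the assumption that the caller is willing to supply the operation as a delegate. The GenericSum class and its Sum method are hypothetical names, not part of the framework:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class GenericSum
{
    // Writing "total + value" directly on T would not compile here, because
    // none of the constraints described above can promise that T supports
    // the + operator. In this sketch the caller passes the addition in.
    public static T Sum<T>(IEnumerable<T> values, T zero, Func<T, T, T> add)
    {
        T total = zero;
        foreach (T value in values)
        {
            total = add(total, value);
        }
        return total;
    }
}
```

A call such as GenericSum.Sum(new[] { 1, 2, 3 }, 0, (l, r) => l + r) returns 6; the addition itself is performed in the calling code, where the concrete type is known.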
Writing sequence and iterator members
Visual Studio 2005 and C# 2.0 introduced the yield keyword. The yield keyword is used within an iterator member as a means to effectively implement an IEnumerable interface without needing to implement the entire IEnumerable interface.
Iterator members are members that return IEnumerable or IEnumerable<T> (or the corresponding IEnumerator types), return individual elements of the sequence via yield return, and can deterministically terminate the sequence via yield break. These members can be anything that can return a value, such as methods, properties, or operators. An iterator whose body reaches its end without executing yield break has an implied yield break, just as a void method has an implied return.
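A short sketch of both statements in one iterator may help. The IteratorExamples class and TakeUntilNegative method are hypothetical names chosen for this illustration:

```csharp
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class IteratorExamples
{
    // Yields values from the source until the first negative value is seen.
    public static IEnumerable<int> TakeUntilNegative(IEnumerable<int> source)
    {
        foreach (int value in source)
        {
            if (value < 0)
            {
                yield break; // terminates the sequence early
            }
            yield return value; // hands one element to the caller
        }
        // Reaching the end of the body is an implied yield break.
    }
}
```

Given the input 1, 2, -1, 3, this iterator produces 1 and 2 and then ends.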
Iterators operate on a sequence but process and return each element as it is requested. This means that iterators implement what is known as deferred execution. Deferred execution means that calling the iterator member does not run its body; the body runs piece by piece, only as elements are requested.
An iterator’s body is entered more than once, and each entry follows a different execution path, resuming where the previous one left off. Let’s look at an example:
public static IEnumerable<DateTime> Iterator()
{
    Thread.Sleep(1000);
    yield return DateTime.Now;
    Thread.Sleep(1000);
    yield return DateTime.Now;
    Thread.Sleep(1000);
    yield return DateTime.Now;
}
The Iterator method returns an IEnumerable<DateTime> that produces three DateTime values. Each of those three values is actually created at a different time. The Iterator method is compiled in such a way that a state machine is created under the covers to keep track of how far execution has progressed, and is implemented as a special IEnumerable<DateTime> object. The code in the method is actually invoked by each call to the resulting enumerator’s IEnumerator.MoveNext method.
Conceptually, the resulting IEnumerable can be thought of as a collection of delegates, one of which is executed upon each invocation of the MoveNext method, where the state, in the simplest case, is simply which of the delegates to invoke next. The real implementation is more complicated than that (the compiler generates a class whose MoveNext method switches on a state field), especially when there are local variables whose values change between invocations and are used across invocations. But the compiler takes care of all that.
Effectively, iterators are broken up into individual bits of code between yield return statements that are executed independently, each potentially using shared local data.
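The following sketch makes that behavior observable by driving an enumerator by hand instead of using foreach. The DeferredExecutionDemo class, its Log list, and its method names are hypothetical and exist only for this illustration:

```csharp
using System.Collections.Generic;

// Hypothetical demo class, for illustration only.
public static class DeferredExecutionDemo
{
    // Records the order in which the pieces of code actually run.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Numbers()
    {
        Log.Add("before first");
        yield return 1;
        Log.Add("before second");
        yield return 2;
    }

    public static void Run()
    {
        Log.Clear();
        IEnumerable<int> sequence = Numbers(); // no iterator code has run yet
        Log.Add("iterator created");

        using (IEnumerator<int> enumerator = sequence.GetEnumerator())
        {
            enumerator.MoveNext(); // runs the body up to the first yield return
            Log.Add("got " + enumerator.Current);
            enumerator.MoveNext(); // resumes and runs up to the second yield return
            Log.Add("got " + enumerator.Current);
        }
    }
}
```

After Run completes, the log reads: "iterator created", "before first", "got 1", "before second", "got 2". The entry "iterator created" comes first even though Numbers was called before it was logged, which is deferred execution at work.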
What are iterators good for other than a really cool interview question? Well, first of all, due to the deferred execution, we can technically create sequences that don’t need to be stored in memory all at one time.
This is often useful when we want to project one sequence into another. Couple that with a source sequence that is also implemented with deferred execution, and we end up creating and processing IEnumerables (also known as collections) whose content is never all in memory at the same time. We can process large (or even infinite) collections without a huge strain on memory.
For example, if we wanted to model the set of positive integer values (an infinite set) we could write an iterator method shown as follows:
static IEnumerable<BigInteger> AllThePositiveIntegers()
{
    // Start at 1: zero is not a positive integer.
    var number = new BigInteger(1);
    while (true) yield return number++;
}
We can then chain this iterator with another iterator, say something that gets all of the positive squares:
static IEnumerable<BigInteger> AllThePositiveIntegerSquares(
    IEnumerable<BigInteger> sourceIntegers)
{
    foreach (var value in sourceIntegers)
        yield return value * value;
}
Which we could use as follows:
foreach (var value in
    AllThePositiveIntegerSquares(AllThePositiveIntegers()))
    Console.WriteLine(value);
We’ve now effectively modeled two infinite collections of integers without storing either of them in memory.
Of course, our AllThePositiveIntegerSquares method could just as easily be used with finite sequences of values, for example:
foreach (var value in
    AllThePositiveIntegerSquares(
        Enumerable.Range(0, int.MaxValue)
            .Select(v => new BigInteger(v))))
    Console.WriteLine(value);
In this example we go through the non-negative Int32 values from 0 up to int.MaxValue - 1 and square each one without ever holding a complete collection of the set of values in memory.
As we see, this is a useful method for composing multiple steps that operate on, and result in, sequences of values.
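Because each step only produces elements on request, an infinite source can also be cut down to a finite result by a later step. The sketch below shows this with LINQ’s Take method; the SequenceComposition class and its method names are hypothetical stand-ins for the iterators discussed above:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Numerics;

// Hypothetical helper class, for illustration only.
public static class SequenceComposition
{
    // An infinite sequence: 1, 2, 3, ...
    public static IEnumerable<BigInteger> PositiveIntegers()
    {
        var number = BigInteger.One;
        while (true) yield return number++;
    }

    // Projects each source element to its square, one element at a time.
    public static IEnumerable<BigInteger> Squares(IEnumerable<BigInteger> source)
    {
        foreach (var value in source)
            yield return value * value;
    }

    public static List<BigInteger> FirstSquares(int count)
    {
        // Take stops requesting elements after 'count' of them,
        // so the infinite source is never run to exhaustion.
        return Squares(PositiveIntegers()).Take(count).ToList();
    }
}
```

Calling SequenceComposition.FirstSquares(5) yields 1, 4, 9, 16, and 25, and only five integers are ever generated by the source.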
We could have easily done this without IEnumerable<T>, or created an IEnumerator class whose MoveNext method performed calculations instead of navigating an array. However, this would be tedious and is likely to be error-prone. In the case of not using IEnumerable<T>, we’d be unable to operate on the data as a collection with things such as foreach.
Context: When modeling a sequence of values that is either known only at runtime or whose elements can each be reliably calculated at runtime.
Practice: Consider using an iterator.
Working with lambdas
Visual Studio 2008 introduced C# 3.0. In this version of C#, lambda expressions were introduced. Lambda expressions are another form of anonymous function. Lambdas were added to the language syntax primarily as an easier anonymous function syntax for LINQ queries. Although you can’t really think of LINQ without lambda expressions, lambda expressions are a powerful aspect of the C# language in their own right. They are concise expressions whose input parameters can be implicitly typed, with the types inferred from the context of their use, rather than explicitly defined as with anonymous methods.
Along with C# 3.0 in Visual Studio 2008, the .NET Framework 3.5 was introduced, which included many new types to support LINQ, such as the Func<TResult> family of delegates and additional overloads of Action<T>. These delegates are used primarily as definitions for different types of anonymous functions (including lambda expressions). The following is an example of passing a lambda expression to a method that takes a Func<T1, T2, TResult> delegate and the two arguments to pass along to the delegate:
ExecuteFunc((f, s) => f + s, 1, 2);
The same statement with anonymous methods:
ExecuteFunc(delegate(int f, int s) { return f + s; }, 1, 2);
It’s clear that the lambda syntax has a tendency to be much more concise, replacing the delegate and braces with the “goes to” operator (=>). Prior to anonymous functions, member methods would need to be created to pass as delegates to methods. For example:
ExecuteFunc(SomeMethod, 1, 2);
This, presumably, would use a method named SomeMethod that looked similar to:
private static int SomeMethod(int first, int second)
{
return first + second;
}
Lambda expressions are more powerful in their type inference abilities, as we’ve seen from our examples so far. Parameters within anonymous methods must be explicitly typed, whereas explicit types are optional for parameters in lambda expressions.
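The text uses ExecuteFunc without showing its declaration, so the following is only a sketch of what such a method might look like, assuming it is generic over both argument types and the result type. The FuncExamples class and the Run method are hypothetical names added for this illustration:

```csharp
using System;

// Hypothetical class, for illustration only.
public static class FuncExamples
{
    // Assumed shape of ExecuteFunc: invoke the delegate with the two arguments.
    public static TResult ExecuteFunc<T1, T2, TResult>(
        Func<T1, T2, TResult> func, T1 first, T2 second)
    {
        return func(first, second);
    }

    private static int SomeMethod(int first, int second)
    {
        return first + second;
    }

    public static int[] Run()
    {
        // T1 and T2 are inferred from the arguments 1 and 2; the parameter
        // types of the lambda and its result type follow from that.
        int viaLambda = ExecuteFunc((f, s) => f + s, 1, 2);

        // The anonymous method has to spell out its parameter types.
        int viaAnonymousMethod =
            ExecuteFunc(delegate(int f, int s) { return f + s; }, 1, 2);

        // Explicit type arguments avoid relying on method group inference.
        int viaMethod = ExecuteFunc<int, int, int>(SomeMethod, 1, 2);

        return new[] { viaLambda, viaAnonymousMethod, viaMethod };
    }
}
```

All three calls produce 3; they differ only in how much the programmer has to write.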
LINQ statements don’t use lambda expressions exactly in their syntax. The lambda expressions are somewhat implicit. For example, if we wanted to create a new collection of integers from another collection of integers, with each value incremented by one, we could use the following LINQ statement:
var x = from i in arr select i + 1;
The i + 1 expression isn’t really a lambda expression, but it gets processed as if it were first converted to method syntax using a lambda expression:
var x = arr.Select(i => i + 1);
The same with an anonymous method would be:
var x = arr.Select(delegate(int i) { return i + 1; });
What we see in the LINQ statement is much closer to a lambda expression. Using lambda expressions for all anonymous functions means that you have more consistent looking code.
Context: When using anonymous functions.
Practice: Prefer lambda expressions over anonymous methods.
Parameters to lambda expressions can be enclosed in parentheses. For example:
var x = arr.Select((i) => i + 1);
The parentheses are mandatory when there is more than one parameter (and also when there are no parameters or when the parameter types are stated explicitly):
var total = arr.Aggregate(0, (l, r) => l + r);
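For reference, the different parameter list forms can be lined up side by side. This is a small sketch; the LambdaParameterForms class and its field names are hypothetical:

```csharp
using System;

// Hypothetical class, for illustration only.
public static class LambdaParameterForms
{
    // No parameters: the empty parentheses are required.
    public static readonly Func<int> None = () => 42;

    // One implicitly typed parameter: the parentheses are optional.
    public static readonly Func<int, int> Single = i => i + 1;

    // One explicitly typed parameter: the parentheses are required.
    public static readonly Func<int, int> Typed = (int i) => i + 1;

    // Two parameters: the parentheses are required.
    public static readonly Func<int, int, int> Pair = (l, r) => l + r;
}
```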
Context: When writing lambdas with a single parameter.
Practice: Prefer no parentheses around the parameter declaration.
Sometimes when using lambda expressions, the expression is being used as a delegate that takes an argument. The corresponding parameter in the lambda expression may not be used within the right-hand expression (or statements). In these cases, to reduce the clutter in the statement, it’s common to use the underscore character (_) for the name of the parameter. For example:
task.ContinueWith(_ => ProcessSecondHalfOfData());
The task.ContinueWith method takes an Action<Task> delegate. This means the previous lambda expression is actually given a task instance (the antecedent Task). In our example, we don’t use that task and just perform some completely independent operation. In this case, we use the underscore (_) not only to signify that we know we don’t use that parameter, but also to reduce the clutter and potential name collisions a little bit.
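A self-contained sketch of this pattern follows. ProcessSecondHalfOfData is not shown in the text, so the continuation here simply updates a local variable; the ContinuationExample class and Run method are hypothetical names:

```csharp
using System.Threading.Tasks;

// Hypothetical class, for illustration only.
public static class ContinuationExample
{
    public static int Run()
    {
        int result = 0;

        // The first task does some initial work.
        Task first = Task.Factory.StartNew(() => { result = 1; });

        // The antecedent task is passed to the lambda but deliberately
        // ignored, hence the underscore as the parameter name.
        Task second = first.ContinueWith(_ => { result += 10; });

        // Block until the continuation has finished.
        second.Wait();
        return result;
    }
}
```

Run returns 11: the continuation only starts after the first task has completed, even though it never inspects that task.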
Context: When writing lambda expressions that take a single parameter but the parameter is not used.
Practice: Use underscore (_) for the name of the parameter.
There are two types of lambda expressions. So far, we’ve seen expression lambdas. Expression lambdas are a single expression on the right-hand side that evaluates to a value or void. There is another type of lambda expression called statement lambdas. These lambdas have one or more statements and are enclosed in braces. For example:
task.ContinueWith(_ => {
    var value = 10;
    value += ProcessSecondHalfOfData();
    ProcessSomeRandomValue(value);
});
As we can see, statement lambdas can declare variables, as well as have multiple statements.
Working with extension methods
Along with lambda expressions, C# 3.0 brought us extension methods. These static methods (contained in a static class, with a first parameter that is modified with the this modifier) were created for LINQ so IEnumerable types could be queried without needing to add copious amounts of methods to the IEnumerable interface.
An extension method has the basic form of:
public static class EnumerableExtensions
{
    public static IEnumerable<int> IntegerSquares(
        this IEnumerable<int> source)
    {
        return source.Select(value => value * value);
    }
}
As stated earlier, extension methods must be within a static class, be a static method, and the first parameter must be modified with the this modifier.
Extension methods extend the available instance methods of a type. In our previous example, we’ve effectively added an instance member to IEnumerable<int> named IntegerSquares so we get a sequence of integer values that have been squared.
For example, if we created an array of integer values, we will have added an IntegerSquares method to that array that returns a sequence of the values squared:
var values = new int[] {1, 2, 3};
foreach (var v in values.IntegerSquares())
{
    Console.WriteLine(v);
}
Having the ability to create new instance methods that operate on any public members of a specific type is a very powerful feature of the language.
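It can help to remember that an extension method call is compiled as an ordinary static method call, which is also why only the public members of the extended type are reachable. The sketch below calls the same method both ways; the ExtensionCallForms class and BothFormsAgree method are hypothetical names added for this illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    public static IEnumerable<int> IntegerSquares(
        this IEnumerable<int> source)
    {
        return source.Select(value => value * value);
    }
}

// Hypothetical class, for illustration only.
public static class ExtensionCallForms
{
    public static bool BothFormsAgree()
    {
        var values = new int[] { 1, 2, 3 };

        // Extension method syntax: reads like an instance method call.
        IEnumerable<int> asExtension = values.IntegerSquares();

        // Plain static syntax: the same method, called explicitly.
        IEnumerable<int> asStatic = EnumerableExtensions.IntegerSquares(values);

        return asExtension.SequenceEqual(asStatic);
    }
}
```

Both forms produce the same sequence, so BothFormsAgree returns true.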
This, unfortunately, does not come without some caveats.
Extension methods suffer inherently from a scoping problem. The only scoping that can occur with these methods is through the namespaces that have been referenced in any given C# source file. For example, we could have two static classes that each have an extension method named Cubes. If those static classes are in the same namespace, we’d never be able to use those extension methods as extension methods because the compiler would never be able to resolve which one to use. For example:
public static class IntegerEnumerableExtensions
{
    public static IEnumerable<int> Squares(
        this IEnumerable<int> source)
    {
        return source.Select(value => value * value);
    }
    public static IEnumerable<int> Cubes(
        this IEnumerable<int> source)
    {
        return source.Select(value => value * value * value);
    }
}
public static class EnumerableExtensions
{
    public static IEnumerable<int> Cubes(
        this IEnumerable<int> source)
    {
        return source.Select(value => value * value * value);
    }
}
If we tried to use Cubes as an extension method, we’d get a compile error, for example:
var values = new int[] {1, 2, 3};
foreach (var v in values.Cubes())
{
    Console.WriteLine(v);
}
This would result in error CS0121: The call is ambiguous between the following methods or properties.
To resolve the problem, we’d need to move one (or both) of the classes to another namespace, for example:
namespace Integers
{
    public static class IntegerEnumerableExtensions
    {
        public static IEnumerable<int> Squares(
            this IEnumerable<int> source)
        {
            return source.Select(value => value * value);
        }
        public static IEnumerable<int> Cubes(
            this IEnumerable<int> source)
        {
            return source.Select(value => value * value * value);
        }
    }
}
namespace Numerical
{
    public static class EnumerableExtensions
    {
        public static IEnumerable<int> Cubes(
            this IEnumerable<int> source)
        {
            return source.Select(value => value * value * value);
        }
    }
}
Then, we can scope to a particular namespace to choose which Cubes to use:
namespace ConsoleApplication
{
    using System;
    using Numerical;
    internal class Program
    {
        private static void Main(string[] args)
        {
            var values = new int[] {1, 2, 3};
            foreach (var v in values.Cubes())
            {
                Console.WriteLine(v);
            }
        }
    }
}
Context: When considering extension methods, due to potential scoping problems.
Practice: Use extension methods sparingly.
Context: When designing extension methods.
Practice: Keep all extension methods that operate on a specific type in their own class.
Context: When designing classes to contain methods to extend a specific type, TypeName.
Practice: Consider naming the static class TypeNameExtensions.
Context: When designing classes to contain methods to extend a specific type, in order to scope the extension methods.
Practice: Consider placing the class in its own namespace.
Generally, there isn’t much need to use extension methods on types that you own. You can simply add an instance method to contain the logic that you want to have.
Where extension methods really shine is for effectively creating instance methods on interfaces.
Typically, when code is necessary for shared implementations of interfaces, an abstract base class is created so each implementation of the interface can derive from it to implement these shared methods. This is a bit cumbersome in that it uses the one-and-only inheritance slot in C#, so an interface implementation would not be able to derive from or extend any other class. Additionally, there’s no guarantee that a given interface implementation will derive from the abstract base, so it runs the risk of not being usable in the way it was designed. Extension methods get around this problem by being entirely independent from the implementation of an interface while still being able to extend it.
One of the most notable examples of this might be the System.Linq.Enumerable class introduced in .NET 3.5. The static Enumerable class almost entirely consists of extension methods that extend IEnumerable<T>. It is easy to develop the same sort of thing for our own interfaces. For example, say we have an ICoordinate interface to model a three-dimensional position in relation to the Earth’s surface:
public interface ICoordinate
{
    /// <summary>North/south degrees from equator.</summary>
    double Latitude { get; set; }
    /// <summary>East/west degrees from meridian.</summary>
    double Longitude { get; set; }
    /// <summary>Distance from sea level in meters.</summary>
    double Altitude { get; set; }
}
We could create a static class to contain extension methods that provide shared functionality to any implementation of ICoordinate.
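As a sketch of what extension methods for such an interface might look like, the following defines two simple helpers. The CoordinateExtensions class, its methods, and the Coordinate implementation are all hypothetical and chosen only to illustrate the idea; the interface is repeated so that the example stands on its own:

```csharp
using System;

public interface ICoordinate
{
    double Latitude { get; set; }
    double Longitude { get; set; }
    double Altitude { get; set; }
}

// Hypothetical extension class, named after the type it extends.
public static class CoordinateExtensions
{
    // Shared logic available to every ICoordinate implementation.
    public static bool IsNorthOfEquator(this ICoordinate coordinate)
    {
        if (coordinate == null) throw new ArgumentNullException("coordinate");
        return coordinate.Latitude > 0;
    }

    // Altitude of this coordinate relative to another, in meters.
    public static double AltitudeAbove(
        this ICoordinate coordinate, ICoordinate other)
    {
        if (coordinate == null) throw new ArgumentNullException("coordinate");
        if (other == null) throw new ArgumentNullException("other");
        return coordinate.Altitude - other.Altitude;
    }
}

// Hypothetical implementation; it needs no base class to gain the helpers.
public class Coordinate : ICoordinate
{
    public double Latitude { get; set; }
    public double Longitude { get; set; }
    public double Altitude { get; set; }
}
```

Any implementation of the interface picks up these methods, so a call such as new Coordinate { Latitude = 45 }.IsNorthOfEquator() works while leaving the implementing class free to derive from whatever base class it likes.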
Context: When designing interfaces that require shared code.
Practice: Consider providing extension methods instead of abstract base implementations.