Closely related data can be handled more efficiently when grouped together into a collection. Instead of writing separate code to handle each individual object, you can use the same code to process all the elements of a collection.
To manage a collection, use the Array class and the System.Collections classes to add, remove, and modify either individual elements of the collection or a range of elements. An entire collection can even be copied to another collection.
Some Collections classes have sorting capabilities, and most are indexed. Memory management is handled automatically, and the capacity of a collection is expanded as required. Synchronization provides thread safety when accessing members of the collection. Some Collections classes can generate wrappers that make the collection read-only or fixed-size. Any Collections class can generate its own enumerator that makes it easy to iterate through the elements.
In the .NET Framework version 2.0, generic collection classes provide new functionality and make it easy to create strongly typed collections. See the System.Collections.Generic and System.Collections.ObjectModel namespaces.
The LINQ to Objects feature allows you to use LINQ queries to access in-memory objects as long as the object type implements IEnumerable or IEnumerable<T>. LINQ queries provide a common pattern for accessing data, are typically more concise and readable than standard foreach loops, and provide filtering, ordering, and grouping capabilities. In some cases, LINQ queries can also improve performance.
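As a rough illustration of the LINQ to Objects pattern described above, the following C# sketch filters, orders, and projects an in-memory list. The class name LinqSketch, the method EvenSquaresDescending, and the sample data are hypothetical and chosen only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqSketch
{
    // Returns the squares of the even numbers, largest source value first.
    public static List<int> EvenSquaresDescending(IEnumerable<int> source)
    {
        IEnumerable<int> query =
            from n in source
            where n % 2 == 0        // filtering
            orderby n descending    // ordering
            select n * n;           // projection
        return query.ToList();
    }

    public static void Main()
    {
        List<int> numbers = new List<int> { 5, 2, 8, 3, 4 };
        Console.WriteLine(string.Join(", ", EvenSquaresDescending(numbers)));
        // prints "64, 16, 4"
    }
}
```

The same result could be produced with a foreach loop, a temporary list, and an explicit sort; the query form keeps the three steps in one expression.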
Defining Collections
A collection is a set of similarly typed objects that are grouped together.
Objects of any type can be grouped into a single collection of the type Object to take advantage of constructs that are inherent in the language. For example, the C# foreach statement (for each in Visual Basic) expects all objects in the collection to be of a single type.
However, in a collection of type Object, additional processing is done on the elements individually, such as boxing and unboxing or conversions, which affects the performance of the collection. Boxing and unboxing typically occur when storing or retrieving a value type in a collection of type Object.
Generic collections, such as List<T>, avoid these performance hits if the type of the element is the type that the collection is intended for. In addition, strongly typed collections automatically perform type validation of each element added to the collection.
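The difference between a collection of type Object and a generic collection can be sketched as follows. This example assumes the non-generic ArrayList class as the Object-based collection; the class and method names (BoxingSketch, SumNonGeneric, SumGeneric) are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingSketch
{
    // Object-based collection: every Add boxes the int, every read unboxes it.
    public static int SumNonGeneric(int count)
    {
        ArrayList items = new ArrayList();
        for (int i = 1; i <= count; i++)
        {
            items.Add(i);           // int is boxed into an Object
        }
        int total = 0;
        foreach (object item in items)
        {
            total += (int)item;     // explicit unboxing cast
        }
        return total;
    }

    // Generic collection: no boxing, and the compiler checks the element type.
    public static int SumGeneric(int count)
    {
        List<int> items = new List<int>();
        for (int i = 1; i <= count; i++)
        {
            items.Add(i);
        }
        int total = 0;
        foreach (int item in items)
        {
            total += item;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumNonGeneric(10)); // prints 55
        Console.WriteLine(SumGeneric(10));    // prints 55
    }
}
```

Adding a string to the ArrayList would compile and fail only at the cast; adding a string to the List<int> would not compile at all.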
All collections that directly or indirectly implement the ICollection interface or the ICollection<T> generic interface share several features in addition to methods that add, remove, or search elements:
An Enumerator
An enumerator is an object that iterates through its associated collection. It can be thought of as a movable pointer to any element in the collection. An enumerator can be associated with only one collection, but a collection can have multiple enumerators. The C# foreach statement (for each in Visual Basic) uses the enumerator and hides the complexity of manipulating the enumerator.
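To show what foreach hides, the following sketch drives an enumerator by hand using MoveNext and Current. It is an approximation of what the compiler generates, not the exact expansion; EnumeratorSketch and Collect are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorSketch
{
    // Walks any sequence with an explicit enumerator and copies what it sees.
    public static List<string> Collect(IEnumerable<string> items)
    {
        List<string> seen = new List<string>();
        using (IEnumerator<string> cursor = items.GetEnumerator())
        {
            while (cursor.MoveNext())       // advance to the next element
            {
                seen.Add(cursor.Current);   // read the element at the cursor
            }
        }
        return seen;
    }

    public static void Main()
    {
        List<string> colors = new List<string> { "red", "green", "blue" };
        Console.WriteLine(string.Join(", ", Collect(colors)));
        // prints "red, green, blue"
    }
}
```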
Synchronization Members
Synchronization provides thread safety when accessing elements of the collection. The collections are not thread safe by default. Only a few classes in the System.Collections namespaces provide a Synchronized method that creates a thread-safe wrapper over the collection. However, all classes in all System.Collections namespaces provide a SyncRoot property that can be used by derived classes to create their own thread-safe wrapper. An IsSynchronized property is also provided to determine whether the collection is thread safe. Synchronization is not available in the ICollection<T> generic interface.
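A minimal sketch of these members, assuming the non-generic ArrayList class (one of the classes that offers a static Synchronized method). SyncSketch, WrapperFlags, and CountUnderLock are hypothetical names.

```csharp
using System;
using System.Collections;

public static class SyncSketch
{
    // Reports IsSynchronized for a plain list and for its thread-safe wrapper.
    public static bool[] WrapperFlags()
    {
        ArrayList plain = new ArrayList();
        ArrayList wrapped = ArrayList.Synchronized(plain);
        return new bool[] { plain.IsSynchronized, wrapped.IsSynchronized };
    }

    // Enumerates while holding the collection's SyncRoot, so that cooperating
    // writers that lock on the same object cannot modify it mid-enumeration.
    public static int CountUnderLock(ICollection items)
    {
        int n = 0;
        lock (items.SyncRoot)
        {
            foreach (object item in items)
            {
                n++;
            }
        }
        return n;
    }

    public static void Main()
    {
        bool[] flags = WrapperFlags();
        Console.WriteLine(flags[0]); // prints False
        Console.WriteLine(flags[1]); // prints True
        Console.WriteLine(CountUnderLock(new ArrayList { 1, 2, 3 })); // prints 3
    }
}
```

Locking on SyncRoot protects the enumeration only if every other thread that touches the collection takes the same lock.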
The CopyTo method
All collections can be copied to an array using the CopyTo method; however, the order of the elements in the new array is based on the sequence in which the enumerator returns them. The resulting array is always one-dimensional with a lower bound of zero.
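For example, copying a queue to an array might look like the following sketch. The elements land in the array in the order the enumerator returns them, which for a queue is oldest first. CopyToSketch and QueueToArray are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class CopyToSketch
{
    public static string[] QueueToArray()
    {
        Queue<string> queue = new Queue<string>();
        queue.Enqueue("first");
        queue.Enqueue("second");
        queue.Enqueue("third");

        // The target must already be large enough; CopyTo does not allocate it.
        string[] target = new string[queue.Count];
        queue.CopyTo(target, 0);
        return target;
    }

    public static void Main()
    {
        string[] copy = QueueToArray();
        Console.WriteLine(string.Join(", ", copy)); // prints "first, second, third"
        Console.WriteLine(copy.Rank);               // prints 1
        Console.WriteLine(copy.GetLowerBound(0));   // prints 0
    }
}
```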
Note that the ICollection<T> generic interface has additional members, which the non-generic interface does not include.
The following features are implemented in some classes in the System.Collections namespaces:
Capacity and Count
The capacity of a collection is the number of elements it can contain. The count of a collection is the number of elements it actually contains.
All collections in the System.Collections namespaces automatically expand in capacity when the current capacity is reached. The memory is reallocated, and the elements are copied from the old collection to the new one. This reduces the code required to use the collection; however, the performance of the collection might still be negatively affected. The best way to avoid poor performance caused by multiple reallocations is to set the initial capacity to be the estimated size of the collection.
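The distinction between capacity and count, and the effect of setting an initial capacity, can be sketched with List<T>, which exposes both properties. CapacitySketch and PreSized are hypothetical names; the exact growth steps of an unsized list are an implementation detail and are deliberately not shown.

```csharp
using System;
using System.Collections.Generic;

public static class CapacitySketch
{
    // Creates a list with room for 'expected' elements, adds 'actual' elements,
    // and returns { Capacity, Count }.
    public static int[] PreSized(int expected, int actual)
    {
        List<int> items = new List<int>(expected);  // one allocation up front
        for (int i = 0; i < actual; i++)
        {
            items.Add(i);   // no reallocation while Count <= Capacity
        }
        return new int[] { items.Capacity, items.Count };
    }

    public static void Main()
    {
        int[] result = PreSized(100, 3);
        Console.WriteLine(result[0]); // prints 100
        Console.WriteLine(result[1]); // prints 3
    }
}
```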
Lower Bound
The lower bound of a collection is the index of its first element. All indexed collections in the System.Collections namespaces have a lower bound of zero. Arrays have a lower bound of zero by default, but a different lower bound can be defined when creating an instance of the Array class using CreateInstance.
System.Collections classes can generally be categorized into the following types:
Commonly used collections
These are the common variations of data collections, such as hash tables, queues, stacks, dictionaries, and lists. Commonly used collections have generic versions and non-generic versions.
Bit collections
These are collections whose elements are bit flags. They behave slightly differently from other collections.
Be sure to choose a collection class carefully. Because each collection has its own functionality, each also has its own limitations.
Commonly Used Collection Types
Collection types are the common variations of data collections, such as hash tables, queues, stacks, dictionaries, and lists.
Collections are based on the ICollection interface, the IList interface, the IDictionary interface, or their generic counterparts. The IList interface and the IDictionary interface are both derived from the ICollection interface; therefore, all collections are based on the ICollection interface either directly or indirectly.
Every element contains a value in collections based on the IList interface (such as Array or List<T>) or based directly on the ICollection interface (such as LinkedList<T>).
Every element in a collection based on the IDictionary interface or its generic counterpart (such as the Dictionary<TKey, TValue> generic class) contains both a key and a value.
The KeyedCollection<TKey, TItem> class is unique because it is a list of values with keys embedded within the values and, therefore, it behaves like a list and like a dictionary.
Collections can vary, depending on how the elements are stored, how they are sorted, how searches are performed, and how comparisons are made. The elements of a Dictionary<TKey, TValue> are accessible only by the key of the element, but the elements of a KeyedCollection<TKey, TItem> are accessible either by the key or by the index of the element. The indexes in all collections are zero-based, except Array, which allows arrays that are not zero-based.
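The dual access of a keyed collection can be sketched as follows. KeyedCollection<TKey, TItem> is abstract, so the example derives a small class from it; the Part and PartCollection types and the sample data are hypothetical. A string key is used on purpose so that the key indexer and the integer index indexer cannot be confused.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical element type whose key (Code) is embedded in the value.
public class Part
{
    public string Code;
    public string Name;
    public Part(string code, string name) { Code = code; Name = name; }
}

// Tells the base class how to extract the key from each item.
public class PartCollection : KeyedCollection<string, Part>
{
    protected override string GetKeyForItem(Part item)
    {
        return item.Code;
    }
}

public static class KeyedSketch
{
    public static PartCollection Build()
    {
        PartCollection parts = new PartCollection();
        parts.Add(new Part("A-1", "bolt"));
        parts.Add(new Part("B-2", "nut"));
        return parts;
    }

    public static void Main()
    {
        PartCollection parts = Build();
        Console.WriteLine(parts["B-2"].Name); // by key, prints "nut"
        Console.WriteLine(parts[0].Name);     // by index, prints "bolt"
    }
}
```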
Array Collection Type
The Array class is not part of the System.Collections namespaces. However, it is still a collection because it is based on the IList interface.
The rank of an Array object is the number of dimensions in the Array. An Array can have a rank of one or more.
The lower bound of an Array is the index of its first element. An Array can have any lower bound. It has a lower bound of zero by default, but a different lower bound can be defined when creating an instance of the Array class using CreateInstance.
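A short sketch of rank and of a non-zero lower bound, using the CreateInstance overload that takes an array of lengths and an array of lower bounds. ArrayBoundsSketch, MakeYearIndexed, and the sample values are hypothetical.

```csharp
using System;

public static class ArrayBoundsSketch
{
    // Builds a one-dimensional array of 3 ints whose first index is 2000.
    public static Array MakeYearIndexed()
    {
        Array sales = Array.CreateInstance(
            typeof(int),
            new int[] { 3 },        // length of each dimension
            new int[] { 2000 });    // lower bound of each dimension
        sales.SetValue(125, 2000);
        sales.SetValue(140, 2002);
        return sales;
    }

    public static void Main()
    {
        Array sales = MakeYearIndexed();
        Console.WriteLine(sales.GetLowerBound(0)); // prints 2000
        Console.WriteLine(sales.GetUpperBound(0)); // prints 2002
        Console.WriteLine(sales.GetValue(2002));   // prints 140

        int[,] grid = new int[2, 3];               // two dimensions
        Console.WriteLine(grid.Rank);              // prints 2
    }
}
```

An array created this way is typed as Array rather than int[], so elements are read and written through GetValue and SetValue.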
Unlike the classes in the System.Collections namespaces, Array has a fixed capacity. To increase the capacity, you must create a new Array object with the required capacity and copy the elements from the old Array object to the new one. The old Array is reclaimed by the garbage collector once it is no longer referenced.
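The create-and-copy step might be written as in the following sketch. ArrayGrowSketch and Grow are hypothetical names, and the method assumes newLength is at least the length of the old array.

```csharp
using System;

public static class ArrayGrowSketch
{
    // Returns a larger array containing the old elements followed by defaults.
    // Assumes newLength >= old.Length.
    public static int[] Grow(int[] old, int newLength)
    {
        int[] bigger = new int[newLength];
        Array.Copy(old, bigger, old.Length);    // copy existing elements
        return bigger;
    }

    public static void Main()
    {
        int[] values = { 1, 2, 3 };
        values = Grow(values, 5);   // the old array is now unreferenced
        Console.WriteLine(string.Join(", ", values)); // prints "1, 2, 3, 0, 0"
    }
}
```

The static Array.Resize method performs the same allocate-and-copy sequence for one-dimensional arrays.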
List Collection Types
The generic List<T> class provides features that are offered in most System.Collections classes but are not in the Array class. For example:
- The capacity of an Array is fixed, whereas the capacity of a List<T> is automatically expanded as required. If the value of the Capacity property changes, the memory reallocation and copying of elements occur automatically.
- The List<T> class provides methods that add, insert, or remove a range of elements. In Array, you can get or set the value of only one element at a time.
- The List<T> class provides the AsReadOnly method, which returns a read-only wrapper around the list. Fixed-size wrappers are available from the non-generic ArrayList class, not from List<T>.
On the other hand, Array offers some flexibility that List<T> does not. For example:
- You can set the lower bound of an Array, but the lower bound of a List<T> is always zero.
- An Array can have multiple dimensions, while a List<T> always has exactly one dimension.
Most situations that call for an array can use a List<T> instead; it is easier to use and, in general, has performance similar to an array of the same type.
Array is in the System namespace; List<T> is in the System.Collections.Generic namespace.
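The range methods and the read-only wrapper mentioned in this section can be sketched as follows. ListSketch, RangeOps, and WrapperIsReadOnly are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class ListSketch
{
    // Demonstrates adding, inserting, and removing ranges of elements.
    public static List<int> RangeOps()
    {
        List<int> list = new List<int> { 1, 2, 3 };
        list.AddRange(new[] { 7, 8, 9 });        // 1 2 3 7 8 9
        list.InsertRange(3, new[] { 4, 5, 6 });  // 1 2 3 4 5 6 7 8 9
        list.RemoveRange(0, 2);                  // 3 4 5 6 7 8 9
        return list;
    }

    // AsReadOnly returns a wrapper that reflects the list but rejects changes.
    public static bool WrapperIsReadOnly()
    {
        List<int> list = new List<int> { 1, 2, 3 };
        ReadOnlyCollection<int> wrapper = list.AsReadOnly();
        return ((ICollection<int>)wrapper).IsReadOnly;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", RangeOps())); // prints "3 4 5 6 7 8 9"
        Console.WriteLine(WrapperIsReadOnly());          // prints True
    }
}
```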
Dictionary Collection Types
You can use the Dictionary<TKey, TValue> generic class, which implements both the IDictionary interface and the IDictionary<TKey, TValue> generic interface. Therefore, each element in this collection is a key-and-value pair.
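A minimal sketch of key-and-value access follows. DictionarySketch, LookupOrDefault, and the sample stock data are hypothetical. The iteration order of a Dictionary<TKey, TValue> is not specified, so the example does not rely on it.

```csharp
using System;
using System.Collections.Generic;

public static class DictionarySketch
{
    // Returns the value stored under 'key', or 'fallback' if the key is absent.
    public static int LookupOrDefault(Dictionary<string, int> stock, string key, int fallback)
    {
        int value;
        return stock.TryGetValue(key, out value) ? value : fallback;
    }

    public static void Main()
    {
        Dictionary<string, int> stock = new Dictionary<string, int>();
        stock.Add("apple", 3);      // Add throws if the key already exists
        stock["pear"] = 5;          // the indexer adds or overwrites

        Console.WriteLine(LookupOrDefault(stock, "pear", 0)); // prints 5
        Console.WriteLine(LookupOrDefault(stock, "kiwi", 0)); // prints 0

        // Each element is a KeyValuePair<TKey, TValue>.
        foreach (KeyValuePair<string, int> pair in stock)
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```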
Queue Collection Types
The Queue<T> generic class is a first-in-first-out (FIFO) collection class that implements the ICollection interface and the IEnumerable<T> generic interface.
The Queue<T> and Stack<T> generic classes are useful when you need temporary storage for information, that is, when you might want to discard an element after retrieving its value. Use Queue<T> if you need to access the information in the same order that it is stored in the collection. Use Stack<T> if you need to access the information in reverse order.
Three main operations can be performed on a Queue<T> and its elements:
- The Enqueue method adds an element to the end of the queue.
- The Dequeue method removes and returns the oldest element from the start of the queue.
- The Peek method returns the oldest element from the start of the queue but does not remove it from the queue.
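The three operations can be sketched as follows; QueueSketch and Demo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class QueueSketch
{
    // Returns { peeked, dequeued, next element, remaining count }.
    public static string[] Demo()
    {
        Queue<string> queue = new Queue<string>();
        queue.Enqueue("a");                 // added at the end
        queue.Enqueue("b");
        queue.Enqueue("c");

        string peeked = queue.Peek();       // "a", still in the queue
        string removed = queue.Dequeue();   // "a", now removed
        return new string[] { peeked, removed, queue.Peek(), queue.Count.ToString() };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Demo())); // prints "a, a, b, 2"
    }
}
```

Both Peek and Dequeue throw an InvalidOperationException when the queue is empty, so checking Count first is a common precaution.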
Stack Collection Types
The Stack<T> generic class is a last-in-first-out (LIFO) collection class that implements the ICollection interface. The Stack<T> generic class also implements the IEnumerable<T> generic interface.
Use a queue if you need to access the information in the same order that it is stored in the collection. Use a stack if you need to access the information in reverse order.
A common use for a stack is preserving variable states during calls to other procedures.
Three main operations can be performed on a stack and its elements:
- The Push method inserts an element at the top of the stack.
- The Pop method removes and returns the element at the top of the stack.
- The Peek method returns an element at the top of the stack but does not remove it from the stack.
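The three operations can be sketched as follows; StackSketch and Demo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class StackSketch
{
    // Returns { peeked, popped, next element, remaining count }.
    public static string[] Demo()
    {
        Stack<string> stack = new Stack<string>();
        stack.Push("a");
        stack.Push("b");
        stack.Push("c");                // "c" is now on top

        string peeked = stack.Peek();   // "c", still on the stack
        string popped = stack.Pop();    // "c", now removed
        return new string[] { peeked, popped, stack.Peek(), stack.Count.ToString() };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Demo())); // prints "c, c, b, 2"
    }
}
```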
Bit Collection Type
Bit collections are collections whose elements are bit flags. Because each element is a bit instead of an object, these collections behave slightly differently from other collections.
The BitArray class is a collection class in which the capacity is always the same as the count. Elements are added to a BitArray by increasing the Length property; elements are deleted by decreasing the Length property. The BitArray class provides methods that are not found in other collections, including those that allow multiple elements to be modified at once using a filter, such as And, Or, Xor, Not, and SetAll.
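The following sketch shows a bulk operation and resizing through Length. BitArraySketch, AndBits, and GrowTo are hypothetical names. Note that And modifies the BitArray it is called on, so the example works on a copy of the left operand.

```csharp
using System;
using System.Collections;

public static class BitArraySketch
{
    // Returns the element-wise AND of two equally long bit sequences.
    public static bool[] AndBits(bool[] left, bool[] right)
    {
        BitArray result = new BitArray(left);   // copies the values of 'left'
        result.And(new BitArray(right));        // modifies 'result' in place
        bool[] copy = new bool[result.Length];
        result.CopyTo(copy, 0);
        return copy;
    }

    // Grows a BitArray by assigning Length; Count follows it.
    public static int GrowTo(int initialLength, int newLength)
    {
        BitArray bits = new BitArray(initialLength);
        bits.Length = newLength;    // new elements are false
        return bits.Count;
    }

    public static void Main()
    {
        bool[] anded = AndBits(
            new bool[] { true, true, false, false },
            new bool[] { true, false, true, false });
        Console.WriteLine(string.Join(", ", anded)); // prints "True, False, False, False"
        Console.WriteLine(GrowTo(4, 6));             // prints 6
    }
}
```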