Dec 29

Self registering factories in C#

c-sharp

One code requirement I frequently run into is the need to create specific instances given a descriptive string.

That is, supposing I have a base class DataType, and the concrete child classes IntegerDataType and BooleanDataType:

    public abstract class DataType
    {
        public abstract string GetDefaultValue();
    }
    // ...
    public class IntegerDataType : DataType
    {
        public override string GetDefaultValue()
        {
            return "0";
        }
    }
    // ...
    public class BooleanDataType : DataType
    {
        public override string GetDefaultValue()
        {
            return "false";
        }
    }

I want to convert a type name string, such as “bool” or “int” to an instance of a class that inherits from DataType. Typically the desire is to get some information specific to the child class by calling an instance function of that class - e.g. BooleanDataType’s GetDefaultValue. In this example, I want to get a string representing the default value of that data type.

The most direct approach is to determine the concrete type inline, wherever it is needed:

    public static void PrintDefaultForType(DataType dataType)
    {
        if (dataType != null)
        {
            Console.WriteLine(dataType.GetDefaultValue());
        }
        else
        {
            Console.WriteLine("unknown");
        }
    }

    public static void Main (string[] args)
    {
        string dataTypeName = "bool";

        if (dataTypeName == "bool")
            PrintDefaultForType(new BooleanDataType());
        else if (dataTypeName == "int")
            PrintDefaultForType(new IntegerDataType());
    }

An improvement

Of course, the obvious problem with this approach is that it becomes un-DRY as soon as we have multiple locations that might need to convert an input string to a concrete DataType implementation. We can easily move the logic into a function, eliminating the un-DRY-ness:

    public static void PrintDefaultForType2(string typeName)
    {
        DataType dataType = null;
        switch (typeName)
        {
        case "bool": dataType = new BooleanDataType(); break;
        case "int": dataType = new IntegerDataType(); break;
        }

        PrintDefaultForType(dataType);
    }

    public static void Main (string[] args)
    {
        PrintDefaultForType2("bool");
        PrintDefaultForType2("int");
    }

Now we are left with a different problem - adding a new DataType requires modification in multiple locations: first to add the new class, then to add the code to create it based on the typeName string. Certainly not the end of the world (in this simple example, at least), but wouldn’t it be nice if we could just add the new DataType class and be done with it? Fortunately, .NET’s reflection makes this a pretty simple task, so let’s start by modifying our base DataType to be aware of its children:

    using System;
    using System.Collections.Generic;
    using System.Reflection;
    // ...
    public abstract class DataType
    {
        // ...
        protected abstract string GetTypeName();

        private static Dictionary<string, Type> sTypeMap = CreateTypeMap();
        private static Dictionary<string, Type> CreateTypeMap()
        {
            Dictionary<string, Type> typeMap = new Dictionary<string, Type>();

            Assembly currAssembly = Assembly.GetExecutingAssembly();

            Type baseType = typeof(DataType);

            foreach (Type type in currAssembly.GetTypes())
            {
                if (!type.IsClass || type.IsAbstract ||
                    !type.IsSubclassOf(baseType))
                {
                    continue;
                }

                DataType derivedObject =
                    System.Activator.CreateInstance(type) as DataType;
                if (derivedObject != null)
                {
                    typeMap.Add(
                        derivedObject.GetTypeName(),
                        derivedObject.GetType());
                }
            }

            return typeMap;
        }
    }

So what’s going on here exactly? Each child class implements GetTypeName, which returns the string (“bool”, “int”, etc.) that identifies it. We use that function to build a dictionary of string type names mapped to the concrete types that they represent. This will be useful later, when we want to actually create a concrete type from that string, but for now let’s just focus on how we are building this mapping.

To find our types, we are going to look through all the types in the current assembly, looking for types that are:

A class type, since our concrete child types must be classes.
Not abstract, since we’ll need to instantiate the child type.
A sub-class of our base type.

Here’s the pertinent code from CreateTypeMap:

    foreach (Type type in currAssembly.GetTypes())
    {
        if (!type.IsClass || type.IsAbstract ||
            !type.IsSubclassOf(baseType))
        {
            continue;
        }
        // ...

Once we have verified that a type meets these criteria, we try to instantiate it - since we need to get the type name that it uses, and we must call an instance method to do so.

Assuming all goes well in the previous step, we can now add an entry to our mapping, e.g.: [ “bool”, typeof(BooleanDataType) ].

Now that we have our mapping of type names to concrete Types, we are ready to make our factory function. We’ll put this in our abstract DataType base class:

    public abstract class DataType
    {
        // ...

        public static DataType Create(string typeName)
        {
            Type derivedType = null;
            if (sTypeMap.TryGetValue(typeName, out derivedType))
            {
                return System.Activator.CreateInstance(derivedType) as DataType;
            }
            return null;
        }
    }

This bit is pretty straightforward - if we’ve got an entry in our mapping, we’ll instantiate the type and return it.

Putting it all together

We have all the pieces; now we can put them together to form our factory.

DataType.cs

using System;
using System.Collections.Generic;
using System.Reflection;

namespace SelfRegisteringFactory
{
    public abstract class DataType
    {
        public static DataType Create(string typeName)
        {
            Type derivedType = null;
            if (sTypeMap.TryGetValue(typeName, out derivedType))
            {
                return System.Activator.CreateInstance(derivedType)
                    as DataType;
            }
            return null;
        }


        public abstract string GetDefaultValue();


        protected abstract string GetTypeName();

        private static Dictionary<string, Type> sTypeMap = CreateTypeMap();

        private static Dictionary<string, Type> CreateTypeMap()
        {
            Dictionary<string, Type> typeMap =
                new Dictionary<string, Type>();

            Assembly currAssembly = Assembly.GetExecutingAssembly();

            Type baseType = typeof(DataType);

            foreach (Type type in currAssembly.GetTypes())
            {
                if (!type.IsClass || type.IsAbstract ||
                    !type.IsSubclassOf(baseType))
                {
                    continue;
                }

                DataType derivedObject =
                    System.Activator.CreateInstance(type) as DataType;
                if (derivedObject != null)
                {
                    typeMap.Add(
                        derivedObject.GetTypeName(),
                        derivedObject.GetType());
                }
            }

            return typeMap;
        }
    }
}

BooleanDataType.cs

using System;

namespace SelfRegisteringFactory
{
    public class BooleanDataType : DataType
    {
        public BooleanDataType()
        {
        }

        public override string GetDefaultValue()
        {
            return "false";
        }

        protected override string GetTypeName()
        {
            return "bool";
        }
    }
}

IntegerDataType.cs

using System;

namespace SelfRegisteringFactory
{
    public class IntegerDataType : DataType
    {
        public IntegerDataType ()
        {
        }

        public override string GetDefaultValue ()
        {
            return "0";
        }

        protected override string GetTypeName ()
        {
            return "int";
        }
    }
}

Main.cs

using System;

namespace SelfRegisteringFactory
{
    class MainClass
    {
        public static void Main (string[] args)
        {
            PrintDefaultForType("bool");
            PrintDefaultForType("int");
        }

        public static void PrintDefaultForType(string typeName)
        {
            DataType dataType = DataType.Create(typeName);

            if (dataType != null)
            {
                Console.WriteLine(dataType.GetDefaultValue());
            }
            else
            {
                Console.WriteLine("unknown");
            }
        }
    }
}

… and that’s it.

This is a simple but powerful way to make self-registering factories in C#. With this method, adding a new DataType requires nothing more than adding the class itself. No need to hunt down switch/case statements, external references, or anything else.
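To make that concrete, here is a minimal sketch of adding a type to the reflection-based factory. FloatDataType and its "float" name are hypothetical, invented purely for illustration; the DataType base is a condensed copy of the one above so that the sample compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

namespace SelfRegisteringFactory
{
    // Condensed copy of the factory base class from this post.
    public abstract class DataType
    {
        private static readonly Dictionary<string, Type> sTypeMap = CreateTypeMap();

        public abstract string GetDefaultValue();
        protected abstract string GetTypeName();

        public static DataType Create(string typeName)
        {
            Type derivedType;
            return sTypeMap.TryGetValue(typeName, out derivedType)
                ? (DataType)Activator.CreateInstance(derivedType)
                : null;
        }

        private static Dictionary<string, Type> CreateTypeMap()
        {
            var typeMap = new Dictionary<string, Type>();
            foreach (Type type in Assembly.GetExecutingAssembly().GetTypes())
            {
                if (!type.IsClass || type.IsAbstract ||
                    !type.IsSubclassOf(typeof(DataType)))
                {
                    continue;
                }

                // Instantiate once to ask the child for its type name.
                var instance = (DataType)Activator.CreateInstance(type);
                typeMap.Add(instance.GetTypeName(), type);
            }
            return typeMap;
        }
    }

    // Hypothetical new type. Adding this class is the only change needed.
    public class FloatDataType : DataType
    {
        public override string GetDefaultValue() { return "0.0"; }
        protected override string GetTypeName() { return "float"; }
    }

    public static class Demo
    {
        public static void Main()
        {
            DataType dataType = DataType.Create("float");
            // Prints "0.0": the factory found FloatDataType by reflection.
            Console.WriteLine(dataType != null ? dataType.GetDefaultValue() : "unknown");
        }
    }
}
```

Nothing in Create or CreateTypeMap mentions FloatDataType; the assembly scan picks it up the first time the factory is used.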

This example was a fairly basic implementation of this idea, but can certainly be built upon to form much more advanced self-registering factories.

There are a few shortcomings to this approach, but most can be overcome without too much difficulty, depending on the specific situation:

The child classes must have a default constructor, and be safe to instantiate from the CreateTypeMap function. This can be addressed in a couple of different ways. The one I have favored in the past is to give the child classes a generic type argument that can be read via reflection in the CreateTypeMap function, so no instance is needed. Note that the dictionary key then becomes the CLR name of that type argument (“Boolean” for bool, “Int32” for int) rather than the C# keyword. Here’s what the code would look like:

DataType.cs

using System;
using System.Collections.Generic;
using System.Reflection;

namespace SelfRegisteringFactory
{
    public abstract class DataTypeBase<T> : DataType
    {
    }

    public abstract class DataType
    {
        public abstract string GetDefaultValue();

        public static DataType Create(string typeName, int importantArg)
        {
            Type derivedType = null;
            if (sTypeMap.TryGetValue(typeName, out derivedType))
            {
                return System.Activator.CreateInstance(derivedType, importantArg)
                    as DataType;
            }
            return null;
        }

        private static Dictionary<string, Type> sTypeMap = CreateTypeMap();

        private static Dictionary<string, Type> CreateTypeMap()
        {
            Dictionary<string, Type> typeMap =
                new Dictionary<string, Type>();

            Assembly currAssembly = Assembly.GetExecutingAssembly();

            Type baseType = typeof(DataType);

            foreach (Type type in currAssembly.GetTypes())
            {
                if (!type.IsClass || type.IsAbstract ||
                    !type.IsSubclassOf(baseType))
                {
                    continue;
                }

                string typeName =
                    type.BaseType.GetGenericArguments()[0].Name;

                typeMap.Add(typeName, type);
            }

            return typeMap;
        }
    }
}

BooleanDataType.cs

using System;

namespace SelfRegisteringFactory
{
    public class BooleanDataType : DataTypeBase<bool>
    {
        public BooleanDataType(int importantArg)
        {
            // Does something with importantArg
        }

        public override string GetDefaultValue()
        {
            return "false";
        }
    }
}
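A usage sketch for this variant may help, since the lookup key changes. The code below is a condensed, self-contained copy of the classes above. The check that the parent is a DataTypeBase<T> is my own addition, so that a class deriving directly from DataType is skipped instead of causing an index error, and the argument value 42 is arbitrary.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

namespace SelfRegisteringFactory
{
    public abstract class DataType
    {
        private static readonly Dictionary<string, Type> sTypeMap = CreateTypeMap();

        public abstract string GetDefaultValue();

        public static DataType Create(string typeName, int importantArg)
        {
            Type derivedType;
            return sTypeMap.TryGetValue(typeName, out derivedType)
                ? (DataType)Activator.CreateInstance(derivedType, importantArg)
                : null;
        }

        private static Dictionary<string, Type> CreateTypeMap()
        {
            var typeMap = new Dictionary<string, Type>();
            foreach (Type type in Assembly.GetExecutingAssembly().GetTypes())
            {
                if (!type.IsClass || type.IsAbstract ||
                    !type.IsSubclassOf(typeof(DataType)))
                {
                    continue;
                }

                // Assumption: only direct children of DataTypeBase<T> register.
                Type parent = type.BaseType;
                if (!parent.IsGenericType ||
                    parent.GetGenericTypeDefinition() != typeof(DataTypeBase<>))
                {
                    continue;
                }

                // Name is the CLR type name, e.g. "Boolean" for bool.
                typeMap.Add(parent.GetGenericArguments()[0].Name, type);
            }
            return typeMap;
        }
    }

    public abstract class DataTypeBase<T> : DataType { }

    public class BooleanDataType : DataTypeBase<bool>
    {
        public BooleanDataType(int importantArg) { }
        public override string GetDefaultValue() { return "false"; }
    }

    public static class Demo
    {
        public static void Main()
        {
            // The key is the CLR name, so "Boolean" resolves and "bool" does not.
            Console.WriteLine(DataType.Create("Boolean", 42) != null); // True
            Console.WriteLine(DataType.Create("bool", 42) == null);    // True
        }
    }
}
```

If the C# keywords are the strings you need to accept, one option is to keep a small alias table (for example "bool" to "Boolean") in front of the lookup.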

Conflicts can arise when two different child classes attempt to register the same type name string. This can be handled in a number of ways at runtime, but I’ve yet to find a way to handle it at compile time (which would be ideal). In practice, I haven’t actually found this to be much of an issue, since the child class names usually parallel the identification strings.
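As one possible runtime approach, here is a small sketch. As written above, a duplicate name makes Dictionary.Add throw an ArgumentException, and because that happens inside a static field initializer it surfaces as a TypeInitializationException that does not say which classes collided. The helper below, whose names (TypeMapHelper, AddUnique) are hypothetical, reports both types instead. It would be called from CreateTypeMap in place of typeMap.Add.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the factory shown above.
public static class TypeMapHelper
{
    // Adds the mapping, or throws an error naming both conflicting types.
    public static void AddUnique(
        Dictionary<string, Type> typeMap, string typeName, Type type)
    {
        Type existing;
        if (typeMap.TryGetValue(typeName, out existing))
        {
            throw new InvalidOperationException(
                "Type name '" + typeName + "' is registered by both " +
                existing.FullName + " and " + type.FullName + ".");
        }
        typeMap.Add(typeName, type);
    }
}

public static class ConflictDemo
{
    // Two stand-in classes that both claim the name "bool".
    private class FirstBool { }
    private class SecondBool { }

    public static void Main()
    {
        var typeMap = new Dictionary<string, Type>();
        TypeMapHelper.AddUnique(typeMap, "bool", typeof(FirstBool));

        try
        {
            TypeMapHelper.AddUnique(typeMap, "bool", typeof(SecondBool));
        }
        catch (InvalidOperationException ex)
        {
            // The message identifies both classes that claimed "bool".
            Console.WriteLine(ex.Message);
        }
    }
}
```

Failing loudly is only one choice; logging the conflict and keeping the first registration would work just as well if a missing type is preferable to a startup failure.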