Code, Configuration and Data

The three main ways to control the behavior of an application are code, configuration and data:

Code

Simply hard-coding a value can be a very quick way to get something done. This is a valuable approach, especially while prototyping or iterating quickly to get feedback. It is often “the simplest thing that could possibly work”, at least for now, and is acceptable if treated with appropriate care. However, it is important to recognize the compromise and to put in place a way of changing the implementation easily later. Often this means hiding the hard-coded values behind an interface, so that the implementation can be swapped out. For example, you could choose to hard-code something like an email address; it’s clearly not ideal, but it will work. If you wire up the hard-coded value through either an interface or a constructor parameter, it is obvious where to make a single change later.
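As a rough illustration of the email address example, here is a minimal Python sketch. The names SupportContact, HardCodedSupportContact and WelcomeMailer, and the address itself, are all hypothetical; the point is only that the hard-coded value lives in one small class behind an interface.

```python
from dataclasses import dataclass
from typing import Protocol


class SupportContact(Protocol):
    """The interface the rest of the application depends on."""

    def email_address(self) -> str: ...


class HardCodedSupportContact:
    """Temporary implementation: the value is hard-coded on purpose."""

    def email_address(self) -> str:
        return "support@example.com"  # hypothetical address


@dataclass
class WelcomeMailer:
    # The contact is passed in, so the mailer never sees the literal value.
    contact: SupportContact

    def footer(self) -> str:
        return f"Questions? Write to {self.contact.email_address()}"


# The single place that would change when the value moves to config or data.
mailer = WelcomeMailer(contact=HardCodedSupportContact())
```

When the time comes to move the address into configuration or data, only a new SupportContact implementation and the last line need to change.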

However, changing a hard-coded value means recompiling the application and going through a full build and deployment cycle. If you are practicing continuous deployment this might not be a big problem, but there is always an overhead. It will probably also mean downtime while the deployment takes place, which needs to be factored in; depending on the business context, this might not be acceptable. Code is first and foremost an implementation concern, so hard-coding should be an exceptional, conscious decision rather than a habit, and always made with a view to replacing it when the time is right. Some bad examples I’ve seen include:

  • Hard-coding if/else statements against organization ids to implement the per-organization behavior of a feature. That behavior is really data (!), and could have been captured as an entity and stored as data
  • Hard-coding things like the logging level, which makes it impossible to change the verbosity without recompiling and reduces the team’s ability to investigate problems. This could have been a configuration setting, so that the value could be changed temporarily
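The alternatives to both bad examples can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the OrganizationSettings fields, the organization ids, and the APP_LOG_LEVEL setting name. The dictionary stands in for a table in a real data store.

```python
import logging
from dataclasses import dataclass


@dataclass(frozen=True)
class OrganizationSettings:
    """Per-organization behavior captured as an entity (hypothetical fields)."""
    invoice_prefix: str
    approval_required: bool


# Stand-in for rows loaded from a data store; the ids are made up.
ORGANIZATION_SETTINGS = {
    17: OrganizationSettings(invoice_prefix="ACME-", approval_required=True),
    42: OrganizationSettings(invoice_prefix="GLX-", approval_required=False),
}
DEFAULT_SETTINGS = OrganizationSettings(invoice_prefix="INV-", approval_required=False)


def settings_for(organization_id: int) -> OrganizationSettings:
    """Look the behavior up instead of branching on organization ids."""
    return ORGANIZATION_SETTINGS.get(organization_id, DEFAULT_SETTINGS)


LOG_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def log_level_from(settings: dict) -> int:
    """Read the verbosity from a setting (hypothetical key), defaulting to INFO."""
    name = settings.get("APP_LOG_LEVEL", "INFO").upper()
    return LOG_LEVELS.get(name, logging.INFO)
```

In a real application the settings mapping passed to log_level_from might be os.environ or a parsed configuration file; that choice is left open here.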

Configuration

Values in configuration are also relatively easy to use, as most frameworks have this capability readily to hand. If anything, many teams tend to over-use configuration, especially in .NET projects via the AppSettings mechanism. I’ve seen projects with dozens or even hundreds of AppSettings! If you want to store a lot of configuration, look into better ways to organize it. The .NET framework offers two mechanisms that are particularly powerful here: custom config sections, where you define an object model that makes the configuration more obvious than arbitrary string keys with values, and putting configuration into separate files. I often like to separate out as much as possible into sections and files, because it opens up other possibilities. For example, I often work on microservices-style architectures, where there can be a fair amount of configuration describing the addresses of other services, and it is vital that all the services have the same address book. If that configuration is extracted to a separate file, I can populate it with a particular environment’s addresses once on the build server and copy the transformed configuration to all services, which removes the complexity of keeping multiple copies up to date separately.
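The same two ideas, an object model instead of loose string keys and a separate shared file, can be sketched outside .NET as well. The following Python example is an analogy rather than the .NET custom config section API; the ServiceAddressBook class, its fields, and the URLs are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceAddressBook:
    """Typed view of the shared address book (hypothetical services)."""
    orders_url: str
    billing_url: str

    @classmethod
    def from_json(cls, text: str) -> "ServiceAddressBook":
        raw = json.loads(text)
        # A missing key fails loudly at startup rather than at first use.
        return cls(orders_url=raw["orders_url"], billing_url=raw["billing_url"])


# Contents of a hypothetical per-environment file, generated once on the
# build server and copied unchanged to every service.
STAGING_FILE_CONTENTS = """
{
  "orders_url": "https://orders.staging.example.com",
  "billing_url": "https://billing.staging.example.com"
}
"""

address_book = ServiceAddressBook.from_json(STAGING_FILE_CONTENTS)
```

Because every service parses the same file into the same type, there is only one copy of the addresses to keep correct per environment.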

Another common problem with configuration is ownership of the configuration keys. I’ve seen lots of in-house libraries that “push” configuration keys, by making their own configuration calls. In my opinion, the host application owns its own configuration. Individual components should not request configuration directly, but should require configuration to be injected.
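A minimal Python sketch of that ownership rule follows. The RetryOptions and HttpRetryClient names and the configuration keys are hypothetical; what matters is that the component receives a typed options object and only the host knows which keys exist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryOptions:
    """What the component needs, expressed without any configuration keys."""
    max_attempts: int
    delay_seconds: float


class HttpRetryClient:
    """Library component: it never reads configuration itself."""

    def __init__(self, options: RetryOptions) -> None:
        self._options = options

    def attempts_allowed(self) -> int:
        return self._options.max_attempts


def build_client(app_config: dict) -> HttpRetryClient:
    """Host application: owns the keys and maps them to the component's options."""
    options = RetryOptions(
        max_attempts=int(app_config["http.retry.max_attempts"]),
        delay_seconds=float(app_config["http.retry.delay_seconds"]),
    )
    return HttpRetryClient(options)
```

With this shape the host can rename keys, or move the values into data, without touching the library.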

Configuration changes are easier to make than code changes; they normally require a redeploy, but no recompilation. Configuration can be thought of as an operational concern, and should mostly relate to the hosting environment. Things like connection strings or addresses are good candidates for configuration; they don’t change often and vary only by environment. Things like feature flags, or parameters that influence behavior, should probably go somewhere else, depending on what would cause them to change. Configuration changes often require input from a member of the technical teams, and often require downtime as part of deployment. It is technically possible to do things like manually edit configuration files on deployed instances, but it is not a good idea in practice, because every instance behind a load balancer needs to be updated, and the deployment drifts away from the repeatable, scripted, known path.

Data

Treating parameters that influence behavior as data is arguably the slowest option to implement; it requires external infrastructure to hold the data and code to load it. However, it opens up the ability to make changes on the fly to a running application. Things like feature flags should ideally be treated as data, so that changing their values doesn’t need a redeployment or downtime. This also allows these parameters to be owned by the business. Changing a value is very cheap, and a natural evolution can be to add a user interface that allows super-users to make changes.
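As a small sketch of feature flags held as data, the Python example below uses an in-memory SQLite database as a stand-in for whatever external store the application would really use. The table name, the FeatureFlags class, and the flag name are assumptions made for illustration.

```python
import sqlite3


class FeatureFlags:
    """Reads flag values from a table on every check, so changes apply at once."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS feature_flags ("
            "name TEXT PRIMARY KEY, enabled INTEGER NOT NULL)"
        )

    def set(self, name: str, enabled: bool) -> None:
        # In practice this would be driven by an admin screen or a script.
        self._connection.execute(
            "INSERT OR REPLACE INTO feature_flags (name, enabled) VALUES (?, ?)",
            (name, int(enabled)),
        )
        self._connection.commit()

    def is_enabled(self, name: str) -> bool:
        row = self._connection.execute(
            "SELECT enabled FROM feature_flags WHERE name = ?", (name,)
        ).fetchone()
        # Unknown flags are treated as off.
        return bool(row[0]) if row else False


flags = FeatureFlags(sqlite3.connect(":memory:"))
flags.set("new_checkout", True)  # hypothetical flag name
```

Querying on every check keeps the sketch simple; a real system would probably cache values for a short time, trading a little freshness for fewer reads.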

Data that will change frequently should be treated in this way from as early in the lifecycle as possible – it will quite simply be a waste of the team’s time to offer any other implementation as they will have to make frequent changes that they wouldn’t otherwise need to. The tendency on the whole is to over-use configuration and under-use data. Some examples I’ve seen that highlight this include:

  • Putting a list of times when a service will be available and when a maintenance page will be shown into a config file, knowing that the schedule is set one week at a time (giving the team a constant stream of work to update the “configuration”)
  • Putting blacklisted email addresses in config, and updating the list after any new address is found
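Both examples become simple lookups once the values are treated as data. The Python sketch below assumes the windows and the blocked addresses have already been loaded from some data store; the MaintenanceWindow type, the function names, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MaintenanceWindow:
    """One row of the weekly schedule, as it might be stored in a table."""
    start: datetime
    end: datetime


def in_maintenance(now: datetime, windows: list) -> bool:
    """True if the maintenance page should be shown at the given time."""
    return any(window.start <= now < window.end for window in windows)


def is_blocked(email: str, blocked_addresses: set) -> bool:
    """Compare addresses case-insensitively, ignoring surrounding whitespace."""
    return email.strip().lower() in blocked_addresses


# Sample data standing in for rows read from the data store.
windows = [MaintenanceWindow(datetime(2024, 1, 6, 22, 0), datetime(2024, 1, 7, 2, 0))]
blocked_addresses = {"spammer@example.com"}
```

Adding next week's schedule or a newly found address is then an insert into the data store, not a change that needs the development team.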