When Feature Flags Do And Don’t Make Sense

DaGrokLife · August 8, 2020 at 2:39 pm

>For example, mandating that every single code change should be behind a feature flag, “just in case we made a mistake”.

Oh I bet that’s totally not an unmanageable mess with weird side effects!

Architektual · August 8, 2020 at 3:39 pm

You should be removing the flag and its associated logic once the feature is released.

shinazueli · August 8, 2020 at 3:57 pm

In true Reddit fashion, I haven’t read the article and I’m going to express my opinion anyway.

The correct time to use a feature flag is:

– When you need different behavior for different environments. Like if you have different servers for each customer, and you aren’t going to turn on features that they haven’t paid for yet. There are other ways of doing this, notably just having a separate “feature service” that is keyed by customer ID, which you control.

– When you’re trying to ascertain whether a change causes subtle issues, and you can’t sufficiently test it in dev environments because “reasons”. Usually shit like “our dev environment is about three orders of magnitude smaller than our production environment so scale issues never show up there”. If you put it behind a feature flag then you can effectively redeploy without making large (risky) code changes. That being said, this matters *far* more in old, antiquated server designs than it does now, because in most setups I can easily roll back to the last deployed version, and making an environment change is just as expensive as making a code change is.

– When you want to ship code that is intended to be more of a beta, for review of a small chunk of your user population. You can use a feature flag to turn on/off beta features that aren’t yet ready for prime time. This is the best use of feature flags that I’ve seen.
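
A minimal Python sketch of the first and third cases, under stated assumptions: `ENTITLEMENTS`, `BETA_ROLLOUT`, `is_entitled`, and `in_beta` are hypothetical names, the in-memory dicts stand in for a “feature service” keyed by customer ID, and hashing the user ID is just one assumed way to pick a stable small chunk of users for a beta.

```python
import hashlib

# Hypothetical stand-in for a "feature service" keyed by customer ID:
# which features each customer has paid for.
ENTITLEMENTS = {
    "customer-a": {"reports", "export"},
    "customer-b": {"reports"},
}

# Hypothetical beta rollout percentages per feature (0 to 100).
BETA_ROLLOUT = {"new-dashboard": 10}


def is_entitled(customer_id: str, feature: str) -> bool:
    """True if this customer has paid for the feature."""
    return feature in ENTITLEMENTS.get(customer_id, set())


def in_beta(user_id: str, feature: str) -> bool:
    """True if this user falls inside the feature's beta percentage.

    Hashing feature + user ID gives each user a stable bucket from 0 to 99,
    so the same user always gets the same answer for the same feature.
    """
    percent = BETA_ROLLOUT.get(feature, 0)
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Keeping entitlements and beta rollout as separate lookups means “has the customer paid for it” never gets tangled up with “is it ready for prime time”.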

fishling · August 8, 2020 at 4:45 pm

~~The article misses another use, which is simply~~ to separate code deployment from feature activation. If there is some specific launch milestone that you are targeting, it is better to have the code deployed and tested in production in advance and then just enabled on launch day.

I had the bad experience of working with a tech group that was pushing for feature flags to be used instead of any branching and merging. They thought this was required by trunk-based development and was “what Google does therefore is awesome”, ignoring the fact that Google has a huge amount of custom infrastructure and teams dedicated to tooling to make what they do all work.

Nope, everyone on all teams had to check into master every day for it to be true continuous integration according to Fowler, and use feature flags to isolate code, ignoring the fact that code isolated by all these feature flags isn’t actually integrated together unless you are testing all those flags, which they weren’t.

Edit: strikethrough part irrelevant to my point to avoid pedantic replies

richizy · August 8, 2020 at 5:12 pm

_Note that I’ll be talking about feature flags in the sense of (gradually, plus with canarying) launching code to 100% to prod and not A/B testing._

A good article, but I think it’s too superficial to really gain any insight. I don’t see enough details from the KCG anecdote to prove any meaningful conclusion.

Of course there are problems with feature flagging, but what’s the alternative? Don’t use feature flags when it’s too troubling? The main benefit of these flags is that if something goes wrong, it’s much easier to rollback the flag than it is to rollback the feature code itself.

> While feature flags are great in some cases, we should also keep in mind their costs. Software engineering is primarily an exercise in managing complexity. And each feature flag immediately doubles the universe of corner cases that your programmers have to understand, and your code is required to handle. “But what would happen if Foo is enabled, Bar is disabled, and we do independent A/B tests on Baz and Kaz on the same day?” In my experience, this combinatorial explosion in complexity can and will lead to bugs. Not to mention slowing down the speed at which your team can make any changes.

This combinatorial explosion can be dealt with in two ways.

1. Modularize your flags in a way that they do not interact or interfere with each other. This may be difficult to introduce into legacy code but with careful design can be done.
2. Test in an “All Launches On” and “All Launches Off” configuration. Ideally do this for both unit and integration tests. This cuts your combinatorial explosion into just two systems under test you have to worry about. Then when you’re done launching, you can remove the flag entirely.

I should warn that this technique is not a cure-all: it may not guarantee test coverage over all combinations of flags, but it’s a start.
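
A rough Python sketch of the second technique, assuming a toy system under test. The flag names, `checkout_total`, `all_on_all_off`, and `every_combination` are all hypothetical; the point is only to show two configurations next to the full 2^n set.

```python
from itertools import product

FLAGS = ("foo", "bar", "baz")  # hypothetical launch flags


def checkout_total(price, flags):
    """Toy system under test whose behavior depends on the flags."""
    total = price
    if flags["foo"]:
        total *= 0.9  # hypothetical discount launch
    if flags["bar"]:
        total += 5.0  # hypothetical shipping-fee launch
    return round(total, 2)


def all_on_all_off(names):
    """The two configurations to test: every launch off, every launch on."""
    return [{n: False for n in names}, {n: True for n in names}]


def every_combination(names):
    """The full set of 2**len(names) configurations, for comparison."""
    return [
        dict(zip(names, values))
        for values in product([False, True], repeat=len(names))
    ]


if __name__ == "__main__":
    for config in all_on_all_off(FLAGS):
        print(config, checkout_total(100, config))
```

With three flags the full set is already eight configurations against two here, and the gap doubles with each added flag. The mixed states that only `every_combination` visits are exactly the coverage this technique gives up.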

djk29a_ · August 8, 2020 at 5:31 pm

Seems like a reasonably balanced overview / intro on anti-patterns for feature flags, but I hesitate to call the Knight Capital bungle a shining example of feature flag misuse. If anything, it showed the problem with attempts to do an RCA on a single complicated incident in real world systems.

What bothers me with most anti-pattern articles is a lack of solutions and a lack of empathy or evidence for _why_ people reach for the pattern incorrectly. Engineers stuck in a feature factory hell with spaghetti code aren’t going to code their way out of a death spiral when the business has every incentive to keep things going the same way, for example.

ooohhimark · August 8, 2020 at 7:33 pm

I am wondering if there is a better way around feature flags when developers have to create different functionality across many countries with different laws. If you are from the Netherlands you get flag A; if you are in the UK you get flag B.

jeffbell · August 8, 2020 at 7:33 pm

This week: “This change should be totally safe. It is turned off behind a feature flag.”

Next week: “This change should be totally safe. All it does is flip a feature flag.”

PiLLe1974 · August 8, 2020 at 8:25 pm

Interesting topic, still have to learn about this.

We’re just discussing this on a team where many newcomers come from a background of continuous integration and short-lived changelists (not even branches) merged into a master branch.

Now more modules and custom branches exist on that team and we switch (our mindsets and version control) to a git environment with staging, release and dev branches, in general a development environment that implies often more long-lived off-master branches than we’re used to.

We see that some veteran team members use feature flags sparingly, mostly for very big changes in early development that span dependent modules of the system. These changes gradually land in lock-step fashion in the master branches of multiple module repositories and need to be tested thoroughly, including on deployed versions (on multiple platforms), so that they are also tested in “real world” environments and not just on dev branches.

…I’ll see what I think about feature flags when I spot more of them (hoping most are limited to one or two dependent modules).

codemonkey14 · August 8, 2020 at 10:05 pm

We use feature flags for every new line of code. We also have a policy of cleaning up feature flags once they’re live and proven stable.

“Everything should be caught in QA” is laughable. “Poor man’s binary rollback” lol what about an instantaneous off switch for new code is “the poor man’s rollback”? Sure as shit beats trying to manage different copies of the application scattered throughout different servers/pods/etc.

Nothing is a silver bullet, but a quick on/off switch for new code is amazingly easy to implement and a great complement to other risk aversion measures like alerting and automated rollbacks. Hell, a sophisticated feature flag system can even automatically roll back a recently activated feature flag if too many errors start showing up on logs (LinkedIn built a fairly complex system around this that apparently works well for them).
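
A toy Python sketch of the “automatically roll back the flag” idea. This is not LinkedIn’s system or any real library: `AutoOffFlag`, `run_with_flag`, and the error threshold are hypothetical, and a real setup would watch logs or metrics rather than count exceptions in-process.

```python
class AutoOffFlag:
    """A flag that switches itself off after too many errors (hypothetical)."""

    def __init__(self, name, max_errors=3):
        self.name = name
        self.max_errors = max_errors
        self.errors = 0
        self.enabled = True

    def record_error(self):
        """Count one failure and disable the flag once the threshold is hit."""
        self.errors += 1
        if self.errors >= self.max_errors:
            self.enabled = False


def run_with_flag(flag, new_path, old_path, *args):
    """Run the new code while the flag is on; fall back to the old code
    if the flag is off or the new code raises."""
    if flag.enabled:
        try:
            return new_path(*args)
        except Exception:
            flag.record_error()
    return old_path(*args)
```

Falling back to the old path on every failure means callers still get an answer while the flag is counting its way to the off position.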

icesurfer10 · August 8, 2020 at 11:02 pm

One problem with feature flags is that in theory, you have a very large number of different combinations. Some may be reliant upon others, some may not.

Either way, you need to devise a sensible test strategy. You have extra scenarios to be tested.

Otherwise it’s very possible that turning a feature off becomes almost as risky as turning one on.

binaryfireball · August 8, 2020 at 11:57 pm

I prefer strategy patterns that can be switched out via config vs the sea of ifdefs that I sometimes have to deal with…. You can make the argument that this implementation is a form of feature flagging but really I’m just venting about the 1k lines of code in a file that are not enabled and are not checked by the compiler. It’s messy to read and makes me sad.
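
A small sketch of the config-switched strategy approach, written in Python rather than a language with ifdefs. The shipping example and the names `STRATEGIES`, `load_strategy`, and `shipping_strategy` are made up for illustration; the point is that every implementation is ordinary live code and the config only picks one.

```python
def flat_rate(weight_kg):
    """Strategy 1: the same price for every parcel."""
    return 5.0


def by_weight(weight_kg):
    """Strategy 2: base price plus a per-kilogram charge."""
    return 2.0 + 1.5 * weight_kg


# Every strategy is registered up front, so none of them is dead code
# hidden behind a preprocessor branch.
STRATEGIES = {
    "flat": flat_rate,
    "by_weight": by_weight,
}


def load_strategy(config):
    """Pick the strategy named in the config, defaulting to 'flat'."""
    name = config.get("shipping_strategy", "flat")
    try:
        return STRATEGIES[name]
    except KeyError:
        raise ValueError(f"unknown shipping strategy: {name!r}") from None
```

For example, `load_strategy({"shipping_strategy": "by_weight"})(4)` prices a 4 kg parcel with the per-kilogram strategy, while an empty config falls back to the flat rate.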

vxd · August 9, 2020 at 12:18 am

Hijacking this thread: Do y’all use any tools, services, or strategies to make managing feature flags easier?

AttackOfTheThumbs · August 10, 2020 at 3:02 pm

It’s weird to me how one month ago, the same article got no traction when posted here. https://old.reddit.com/r/programming/comments/hpe9m6/when_feature_flags_do_and_dont_make_sense/

I use feature flags 100% for beta testing, or as we call them, experimental. Users don’t know of them and are only told when they face the specific issue. Then we tell them they can test this or that. Eventually they may become an actual configuration choice, if they get good feedback. 25% of them die.
