(Not) Solving Hard Problems
I don’t like solving hard problems.
They can be fun to solve, and they appeal to my geeky side. But trying to solve the really hard problems can be risky for your business. This runs counter to the de facto stance of higher education and academia, where hard problems are solved, in many cases, simply because they are hard problems. Simple, elegant solutions to practical problems are dismissed as lacking academic rigor.
Here are a couple of examples of why solving hard problems can be costly.
The first involves a program developed to back up customer records from host A to host B. After an initial full backup, only the deltas (changes) are sent. However, the protocol used to send the changes does not guarantee that they arrive in the order in which they were made. This means that a record that was edited and then deleted on host A could be deleted on host B first, with the stale edit then applied to a record that no longer exists. This problem isn’t so bad as long as host B can robustly handle attempts to access invalid records. A more pathological case is one where a record on host A is deleted and then the exact same record is re-added. On host B, this could manifest as the addition of a duplicate record, which would be ignored, followed by the deletion of that record. The result is that a record present on host A would be missing from host B after the backup ran. Assume that the cost of changing the protocol is very high.
Before attempting the complicated step of redesigning the communication protocol or the change logic to ensure proper ordering of events, one should consider how often this particular case will occur in common usage. If the case is common, then a more robust solution may be required. If the case occurs, or is expected to occur, very infrequently, then a more practical approach would be to clearly document the scenario and describe its impact. In the case in question, if a second backup is run before any further edits are made to the record on host A, then the data on host A and host B will be synchronized. If the cost of running two sequential backups is small, that might be the better solution.
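To make the failure and the cheap workaround concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the record key cust-1, the tuple format for changes, and especially compute_deltas, which assumes that each backup run works out its deltas by comparing host A with host B's current contents. The real system may well do this differently.

```python
def apply_change(store, change):
    """Apply one (op, key, value) change to host B, tolerating invalid operations."""
    op, key, value = change
    if op == "add":
        store.setdefault(key, value)  # adding a record that already exists is ignored
    elif op == "edit":
        if key in store:              # a stale edit to a missing record is ignored
            store[key] = value
    elif op == "delete":
        store.pop(key, None)          # deleting a missing record is ignored


def compute_deltas(source, target):
    """Hypothetical delta step: diff host A (source) against host B (target)."""
    deltas = []
    for key, value in source.items():
        if key not in target:
            deltas.append(("add", key, value))
        elif target[key] != value:
            deltas.append(("edit", key, value))
    for key in target:
        if key not in source:
            deltas.append(("delete", key, None))
    return deltas


def run_backup(changes, target):
    """Apply the changes in whatever order the transport delivered them."""
    for change in changes:
        apply_change(target, change)


# State after the initial full backup: both hosts hold the same record.
host_a = {"cust-1": "Alice"}
host_b = {"cust-1": "Alice"}

# On host A the record is deleted and then re-added, leaving host A unchanged.
logged = [("delete", "cust-1", None), ("add", "cust-1", "Alice")]

# Assumed worst case: the transport delivers the two changes in reverse order.
delivered = list(reversed(logged))
run_backup(delivered, host_b)
missing_after_first = "cust-1" not in host_b  # the record is now gone from host B

# A second backup, run before any further edits on host A, repairs the gap.
second_pass = compute_deltas(host_a, host_b)
run_backup(second_pass, host_b)
in_sync = host_a == host_b
```

Under these assumptions the first run leaves host B without the record, and the second run sends a single add that brings the two hosts back in line, without touching the protocol at all.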
The second example concerns two communications systems whose users need to be able to reach the users of the other system. The systems serve a similar function but are currently incompatible. An engineer notes that a proxy method could be used to translate the protocol used by the first system into equivalent primitives in the protocol used by the second system. While this solution might have the lowest cost at first blush, it is likely to be the most complicated. Proxying two protocols could multiply the number of error conditions in the combined system, because each failure mode of one system can coincide with each failure mode of the other. This will make design, development and debugging more time-consuming and costly. The total cost of ownership will increase. Even more insidiously, this solution appears very simple even though it is not. Business types, executives and even some engineers will dramatically underestimate the timeframe and cost of deployment.
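A rough way to see how quickly the error surface grows is to enumerate the outcome pairs a proxy would have to handle. The outcome names below are hypothetical, not taken from either real system, and the sketch only counts combinations. It says nothing about how likely each one is.

```python
from itertools import product

# Hypothetical outcomes of a single operation on each system.
system_one_outcomes = ["ok", "timeout", "auth_failed", "user_not_found"]
system_two_outcomes = ["ok", "timeout", "auth_failed", "user_not_found", "rate_limited"]

# Error conditions each system has on its own ("ok" is not an error).
standalone_errors = (len(system_one_outcomes) - 1) + (len(system_two_outcomes) - 1)

# A proxy sits between the two, so it can observe any pairing of outcomes.
combined = list(product(system_one_outcomes, system_two_outcomes))

# Every pairing except ("ok", "ok") needs some defined behaviour in the proxy.
proxy_error_cases = [pair for pair in combined if pair != ("ok", "ok")]

print(standalone_errors)       # 7
print(len(proxy_error_cases))  # 19
```

With only a handful of failure modes on each side, the proxy already has more than twice as many cases to design, test and debug as the two systems have on their own.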
An alternative approach would be to give the clients of one of the communications systems direct access to the other. While this solution might appear less attractive, because end users will have to use a separate interface for each system, the ultimate recurring cost may be lower. In order to evaluate this approach, one would have to consider the cost of deploying the solution and training the user base.
Both of the scenarios above were based on real-world business decisions I was involved in over the last year, and I am happy to say that reason won out both times. The key point to drive home is that complicated technical solutions often have hidden costs, so do your homework before jumping to a conclusion.