Saturday, December 22, 2018

Smart versus Complex Code

Dumb Code vs Smart Code

When is smart code too smart? When is dumb code too dumb? It is often said that the smartest people express the most complex ideas in the simplest ways. But in the field of software, code sometimes has necessary complexity. I must admit I have been the target of accusations of writing complex code. But how complex was/is it really...?

The Definition of Complexity

This blog is somewhat introspective, so I will ask myself: what do people mean by "complex"? Secondly, is complexity really up for debate? And lastly, who decides, or rather how should we decide, when something is complex and when that complexity is necessary? Clearly, one person's complexity is another's banana (easily ingested); so...am I coding bananas or rats-nests?

Complexity may be defined as one of the following:
  1. The average human (or coder) finds it hard to understand.
  2. The code looks like Frankenstein in a beat-up jalopy driving through New York City at rush-hour, while the police force, public transportation and garbage collectors are out on strike. Aka "the rats-nest".
  3. The programmer has used every trick, tool, idiom, and pattern supported in the language. Sometimes called complexity for complexity's sake. Also, "job security".
  4. Terseness of expression: the code requires an exorbitant amount of time to verify as correct.

Analyzing Complexity

I suppose the first question to ask oneself when accusations of complexity come flying is: "Precisely how do you perceive my code to be complex?". I suppose this dialog could be interactive with said accuser. And, of course, the question could be asked introspectively. (Much like I am doing right now...which feels a lot like talking to myself. "Wait, who said that?...the inner sage...no you're not...yes I am...ummm, nevermind.") In any event, it would be good to understand the way in which one's code is complex.

Is the Complexity up for Debate?!

After analysis, the question should be asked: "How reliable and competent is the accuser in making these accusations?". This probably should be left to one's inner monologue (perhaps that goes without saying...). In any event, their competence, experience, and reliability may directly affect how we should respond to the accusation. Let's take a closer look, definition by definition:
  • An average coworker accuses, unable to clarify how or why they think the code is complex.
    • If the average is pretty average, e.g. they have trouble tying their shoes, the accusation can probably be safely and sagely ignored. Move on with your life (and... stop blogging and refactoring at a quarter to midnight).
    • If you work at Sandia National Laboratories or MIT Lincoln Laboratory, perhaps some consideration should be given to the complexity of the problem being addressed. Probably warrants investigation.
  • Accuser calls out "Rats-Nest": 
    • Pretty much all non-comatose humans can spot a rats-nest - it's part of our genetic make-up, since it keeps our toes from being nibbled upon at inopportune times. This accusation requires some serious consideration. Rats-nests are usually fraught with bugs, just as code fraught with bugs is usually a rats-nest. Time to go-a refactoring.
    • Except in the very unusual case that the accuser is your sworn enemy and is just out to undermine your credibility. Then, just maybe, accusation can be ignored.
  • Accuser seems to have noticed that you have used the Visitor pattern in conjunction with anonymous inner classes delegating through a Chain-of-Responsibility, Composite acyclic N-ary tree to draw your screen's output:
    • Hmmm... if the accuser was smart enough to decompose your drawing routine with that amount of detail, they might know what they are talking about. Unless that complexity is providing some serious optimization or unless the alternatives are even worse, it's time to go-a refactoring.
    • OTOH, if that accuser just found out how to tie their shoes that morning and they were still able to figure out the above: congratulations, you are probably a genius and your code is most likely GOLD. Go have a beer...
  • Accuser just read Design Patterns and sees that your class uses Singleton, Proxy, Null-Object, Immutable and Strategy all at the same time! 
    • Newbie alert! While their enthusiasm is infectious and noteworthy, their lack of experience shows: those patterns and idioms are as common to seasoned developers as inheritance and composition are fundamental to object-oriented languages. Move on...
  • Accuser accuses you of using the ternary expression (GASP) and thus your code is too terse! 
    • The ternary operator exists for a reason. Move along to that kick-butt recursive descent parser that you are working on...
  • Accuser accuses you of placing an entire sorting algorithm packed in one line of code. 
    • That's just ridiculous. Have fun at the next Obfuscated Code Conference. For everyone's sake, go-a refactoring.
  • Newbie accuser does not see a single if/else or switch statement in any of your code... yet it handles all sorts of special cases! 
    • Ignore the newbie, and congratulations on understanding and applying the State pattern. Reward yourself with some much needed sleep...
  • Accuser sees that you have refactored 10K lines of code down to 1K lines.
    • Tough call. The quality of the original code must be considered. Terseness to this degree may make the overall design opaque and resistant to change. May want to refactor.
    • Additionally, programmer may want to double check that hidden features such as exception safety and error recovery are as robust as the original. Perhaps time to go-a refactoring. 
    • Or... Great job. The new code may be just what the doctor ordered; that is, if it is just large enough to do exactly what is needed and no less. Go have a cigar...
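On the ternary accusation above: the operator is so fundamental that even Bash supports a C-style ternary inside its arithmetic context. A quick illustrative sketch (the variable names are mine, purely for demonstration):

```shell
#!/bin/bash
# C-style ternary inside a Bash arithmetic context: pick the larger
# of two values without an if/else block.
a=3
b=7
max=$(( a > b ? a : b ))
echo "max=$max"    # prints "max=7"
```

If a reader finds that too terse, the equivalent if/else is four lines; the ternary says the same thing in one.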
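And on the if/else-free special-case handling above: the State pattern idea can be sketched even in Bash. This is a hypothetical turnstile example of my own (not from any real code): each (state, event) pair maps to a function, and dispatch happens by name, with no if/else or case in sight.

```shell
#!/bin/bash
# State pattern sketch: a turnstile with states "locked"/"unlocked".
# Each (state, event) pair is its own handler function; events are
# dispatched by constructing the function name at runtime.
state=locked
locked_coin()   { state=unlocked; }   # a coin unlocks the turnstile
locked_push()   { :; }                # pushing while locked: no effect
unlocked_push() { state=locked; }     # passing through locks it again
unlocked_coin() { :; }                # an extra coin: no effect
fire() { "${state}_$1"; }             # dispatch on state + event name

fire coin
echo "$state"    # prints "unlocked"
fire push
echo "$state"    # prints "locked"
```

Adding a new state means adding functions, not threading another branch through a conditional thicket.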

Living with the Guilt

To wrap up this discussion (perhaps only with myself): complexity clearly comes in a variety of flavors. Probably the two most egregious examples of complexity are #2 "the true rats-nest" and #3 "unnecessary complexity".
  • Top honors go to the rats-nest; the reason is fairly obvious. Other forms of complexity may only make the lives of the code's maintainers difficult. Rats-nests, on the other hand, hurt everyone: coders, users, babies, small furry pets; everyone loses. Say 10000 Hail Marys (I will pray for you)...
  • Unnecessary Complexity. Most of the time unnecessary complexity is incidental. You might just have not thought of the easiest solution yet. Other times, you may just be a "fancy boy". The guilty parties know who they are... 
So when should you feel a modicum of guilt? That answer is easy.
  • One, when it really isn't complex at all; it is merely misunderstood by a junior programmer. Guilt the newbie into buying a couple of books...
  • Two, when the algorithm mandates it. For instance, I have recently learned that "egrep" uses finite automata to match without any backtracking at all, in time linear in the input. I don't really have any idea what finite automata are, probably something in one of those Comp-Sci classes that I never took - but I feel smarter just saying it... finite automata, yeah that's right. In any event, egrep can be orders of magnitude faster than grep in certain search scenarios because of... finite automata... and zero backtracking, so clearly, in this case, the complexity of... finite automata... is practical. Enjoy a SnackWells Devil's Food Cake....
  • Last, terseness of expression is sometimes celebrated and sometimes scorned. Terseness and quality sometimes go hand-in-hand, but so can terseness and obfuscation. So it is probably right to feel a little guilty anyway...

Wednesday, May 3, 2017

Love-Hate relationship with CMake

I have a Love-Hate relationship with CMake. That is, I love to hate CMake... because it may literally be the worst language ever invented, and certainly the worst I've ever used. It is opaque, does tons of magic things that are incomprehensible, and does them in a super inefficient way! I think I literally despise the tool. After using it fairly regularly for over a year I have neither acclimated to the tool nor warmed up to it in any way. Every time I want to do the simplest thing with CMake, it can take me and my fellow developers hours to solve it effectively (and by "effectively" I mean hack something in and hope that it works). I think it has also cost my company literally years of man-hours to migrate over to it as our primary build system. Worse still, due to the complexity and pain of migrating to it, I think those who made the decision to do so would never admit that it is terrible and will never want to move to something sane like... I don't know, a bash script would be better. I feel like they, and everyone else, are just following the trendy herd.


The thing with CMake is that I NEVER feel like I am doing anything other than hackery that is going to blow up in my face at any minute. To put this in perspective, C++ Template Meta-Programming is more comprehensible.

As evidence of how terrible it is, I just visited the FAQ page to find a nugget on how to accomplish an uninstall target in a Makefile. I've pasted that entire bit below (all credit and copyright for it goes to Kitware). What takes an incomprehensible bit of CMake (yeah, good luck writing that from scratch...), they casually mention, could also be done with one super simple and easy-to-understand xargs statement in the shell. It's no wonder I prefer nothing but Make and Bash.


Can I do "make uninstall" with CMake?

By default, CMake does not provide the "make uninstall" target, so you cannot do this. We do not want "make uninstall" to remove useful files from the system.
If you want an "uninstall" target in your project, then nobody prevents you from providing one. You need to delete the files listed in install_manifest.txt file. Here is how to do it. First create file cmake_uninstall.cmake.in in the top-level directory of the project:
if(NOT EXISTS "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt")
  message(FATAL_ERROR "Cannot find install manifest: @CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt")
endif(NOT EXISTS "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt")

file(READ "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt" files)
string(REGEX REPLACE "\n" ";" files "${files}")
foreach(file ${files})
  message(STATUS "Uninstalling $ENV{DESTDIR}${file}")
  if(IS_SYMLINK "$ENV{DESTDIR}${file}" OR EXISTS "$ENV{DESTDIR}${file}")
    exec_program(
      "@CMAKE_COMMAND@" ARGS "-E remove \"$ENV{DESTDIR}${file}\""
      OUTPUT_VARIABLE rm_out
      RETURN_VALUE rm_retval
      )
    if(NOT "${rm_retval}" STREQUAL 0)
      message(FATAL_ERROR "Problem when removing $ENV{DESTDIR}${file}")
    endif(NOT "${rm_retval}" STREQUAL 0)
  else(IS_SYMLINK "$ENV{DESTDIR}${file}" OR EXISTS "$ENV{DESTDIR}${file}")
    message(STATUS "File $ENV{DESTDIR}${file} does not exist.")
  endif(IS_SYMLINK "$ENV{DESTDIR}${file}" OR EXISTS "$ENV{DESTDIR}${file}")
endforeach(file)
Then in the top-level CMakeLists.txt add the following logic:
# uninstall target
configure_file(
    "${CMAKE_CURRENT_SOURCE_DIR}/cmake_uninstall.cmake.in"
    "${CMAKE_CURRENT_BINARY_DIR}/cmake_uninstall.cmake"
    IMMEDIATE @ONLY)

add_custom_target(uninstall
    COMMAND ${CMAKE_COMMAND} -P ${CMAKE_CURRENT_BINARY_DIR}/cmake_uninstall.cmake)
Now you will have an "uninstall" target at the top-level directory of your build tree.
Instead of creating an "uninstall" target, Unix users could enter this command in the shell:
 xargs rm < install_manifest.txt
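One caveat on that one-liner worth knowing: plain xargs word-splits on whitespace, so an installed path containing a space gets mangled. GNU xargs can split strictly on newlines instead. A small demonstration (assumes GNU xargs; the demo paths are hypothetical):

```shell
#!/bin/bash
# Build a demo manifest listing two "installed" files, one with a
# space in its name, then uninstall everything it lists.
mkdir -p /tmp/uninstall-demo
touch "/tmp/uninstall-demo/plain.txt" "/tmp/uninstall-demo/with space.txt"
printf '%s\n' "/tmp/uninstall-demo/plain.txt" \
              "/tmp/uninstall-demo/with space.txt" > install_manifest.txt

# -d '\n' (GNU xargs) splits on newlines only, so the spaced path
# survives intact; plain "xargs rm" would try to remove "with" and "space.txt".
xargs -d '\n' rm -f < install_manifest.txt
```

Still one line of actual work, and still vastly more comprehensible than the CMake version above.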

Tuesday, June 21, 2016

Bash PhoneHome Demonstration

Executing your Executables

No, I am not about to go on some right-wing rant to excise evil programs. Rather, I am going to discuss a nice way to use Bash to launch your executables. Those of us who write *nix programs for a living often use a Bash (or other shell) script to manage the launch of a program, but not always in the most flexible manner. We use a script because the actual program launch is complicated and involves executing multiple programs. Or we use one because launching our program requires a tedious number of arguments. And finally, and most importantly, we often use a script to isolate users from the complication of executing something that isn't trivially simple (and even moderately simple ones, such as a java-based application).

So one common issue that launch scripts have is that they typically need to know the path to associated program files, whether that is other executables and scripts, configuration files, resource files, etc. Issues arise if users install the application and call the launch script in unexpected ways. Examples include: putting a symbolic link to the launch script, calling it through a relative or fully-qualified path, adding the launch script's parent folder to PATH, adding a symbolic link to the launch script's parent folder and putting it in the PATH, creating a directory and putting a symbolic link to the launch script and then putting a symbolic link to the directory holding the symbolic link to the launch script and then putting THAT symbolically linked directory in the PATH (Yikes!), etc. etc.

However, with a little Bash-Foo we can protect our scripts and ensure they know how to find their true installation home, i.e. we want them to be able to "phone home". Once home is known (which in my case is set in a QUALIFIED_PATH variable), we can simply move relative to QUALIFIED_PATH to load or reference other files. Easy peezy, mac and cheesy (as my kids like to say)!

Boom: There It Is

The short version of this (which, however, doesn't work on BSD variants of UNIX such as OSX):

#!/bin/bash
######################################################
################# BLOCK TO PHONE HOME ################
# This can be used to find installation directory
# no matter how this script is referenced/linked or
# put in the PATH or whatever
BASENAME=$(basename "$0")
if [ "$0" == "${BASENAME}" ] ; then
    # No slash in $0: the script was found via PATH lookup
    QUALIFIED_PATH=$(dirname "$(command -v "$0")")
elif [ "$0" == "${0#/}" ] ; then
    # Relative invocation such as ./launch.sh or subdir/launch.sh
    QUALIFIED_PATH="$PWD"/"${0%/*}"
else
    # Absolute invocation
    QUALIFIED_PATH="${0%/*}"
fi
# NOTE: the following doesn't work on OSX and other BSD (non-GNU) derivatives.
# Download the tarball with the portableReadlink function for a universal alternative.
PATH_TO_EXEC=$(readlink -f "${QUALIFIED_PATH}/${BASENAME}")
QUALIFIED_PATH=$(dirname "${PATH_TO_EXEC}")
############### END BLOCK TO PHONE HOME ##############
######################################################

echo "My Path Home is: ${QUALIFIED_PATH}"


Additionally, a version that includes a "portableReadlink" function to compensate for lacking "readlink -f" is downloadable from this tarball package. The actual useful block is at the top of boomThereItIs/phoneHome.sh, while the tarball also includes a test program called testPhoneHome.sh in a parent directory that demonstrates its utility in a number of ways. That tarball is located on my Google Drive here: Download phoneHomeDemo.tar.bz2
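For the curious, a portable stand-in for "readlink -f" can be sketched along the following lines. This is my sketch of the idea only, not necessarily the tarball's exact implementation: follow the symlink chain by hand and canonicalize the directory with "pwd -P".

```shell
#!/bin/bash
# Sketch of a portable "readlink -f": walk the symlink chain manually,
# then print the physical directory plus the final file name. The body
# runs in a subshell (note the parentheses) so the cd calls do not
# disturb the caller's working directory.
portableReadlink() (
    target=$1
    cd "$(dirname "$target")" || return 1
    target=$(basename "$target")
    while [ -L "$target" ] ; do          # follow each link in the chain
        target=$(readlink "$target")
        cd "$(dirname "$target")" || return 1
        target=$(basename "$target")
    done
    dir=$(pwd -P)                        # physical path, links resolved
    printf '%s\n' "${dir%/}/$target"
)
```

Relative symlinks are handled too, because each readlink result is resolved from the directory the link lives in.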

Enjoy!

Saturday, June 11, 2016

Bash Foo: Finding Unique (and Duplicate) Files

Where is that darned file I changed?

I sometimes have the problem that I changed a version of some file somewhere in some directory, but I don't know which version I changed. As a programmer, I often copy or embed versions of the same (or a very similar) utility or file in different projects.

The problem comes later on, when I forget in which project I made a patch or fix to that file. I imagine this happens to those of us who program for a living a bit more often than to the average computer user. This exact thing happened to me this morning (again), concerning a fix in a rather difficult custom ThreadPool class I've developed that has made its way into at least 4 or 5 different projects (some of which have multiple branches). The issue was that I made that fix (an optimization really) months ago but couldn't bring it over to my other projects at the time. Today I couldn't even remember which project I made the fix for, much less which branch of that project!

Bash to the Rescue: findUnique.sh

So, I put together a little Bash Foo this morning that searches out unique files and produces a little report. It takes a little while to run if searching a large directory tree (especially if you are still working on a hard disk instead of a solid-state drive), but it certainly is faster than trying to do it manually.

Running it is very simple. Here is an example:

Archimedes:~ $ findUnique.sh ThreadPool.hpp

The following files are equivalent:
./ProjectA/version1/include/util/concurrent/ThreadPool.hpp
./ProjectA/version2/include/util/concurrent/ThreadPool.hpp
./ProjectA/version3/include/util/concurrent/ThreadPool.hpp
./ProjectB/include/util/concurrent/ThreadPool.hpp

Totally Unique:
./ProjectC/include/util/concurrent/ThreadPool.hpp

Back to work

Now that I've spent an hour building this awesome tool and quickly identified the suspect file, I can get back to work merging that ThreadPool change everywhere. Maybe you too can use this tool; if so: Enjoy!

Here's the code:


#!/bin/bash
############################################################################
# findUnique.sh - Bash Foo to search out unique and duplicate files.
#
# Run with: findUnique.sh <filename>
#
# Author: Ryan Fogarty
# Last Edited: 2016.06.11
# Copyright: Ryan Fogarty (FogRising) 2016
############################################################################

# Thanks StackOverflow for this little convenient tidbit...
containsElement () {
  local e
  for e in "${@:2}"; do [[ "$e" == "$1" ]] && return 0; done
  return 1
}

if [ $# -ne 1 ] ; then
   echo "Run with: $0 <filename>"
   exit 1
fi

bold=$(tput bold)
normal=$(tput sgr0)

uniqueFile=$1

SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
uniqueFiles=($(find . -name "${uniqueFile}"))
IFS=$SAVEIFS

#echo "num uniqueFiles to test : ${#uniqueFiles[@]}"

declare -a reportedFiles

# Number of files - 1 : used to determine if a file is totally unique
numOthers=${#uniqueFiles[@]}
let numOthers-=1


for uf in "${uniqueFiles[@]}" ; do

   # If we've already reported as equivalent to
   # something else, skip this one
   if containsElement "$uf" "${reportedFiles[@]}" ; then
      continue
   fi

   uniqueFlag=0
   declare -a equivalentFiles
   equivalentFiles=()
   for ouf in "${uniqueFiles[@]}" ; do
      diff --brief "${uf}" "${ouf}" &> /dev/null
      retVal=$?
      if [ $retVal -gt 1 ] ; then
        retVal=1
      fi
      let uniqueFlag+=retVal
      if [ $retVal -eq 0 ] ; then
         equivalentFiles+=("${ouf}")
      fi
   done
   if [ $uniqueFlag -eq $numOthers ] ; then
      echo "${bold}Totally Unique:${normal}"
      echo "${uf}"
      reportedFiles+=("${uf}")
   elif [ $uniqueFlag -eq 0 ] ; then
      echo "${bold}All the following files are exactly alike:${normal}"
      for eqf in "${uniqueFiles[@]}" ; do
         echo "${eqf}"
      done
      exit 0
   else
      echo "${bold}The following files are equivalent:${normal}"
      for eqf in "${equivalentFiles[@]}" ; do
         echo "${eqf}"
         reportedFiles+=("${eqf}")
      done
   fi

done
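As an aside, when exact byte-for-byte grouping is all you need, checksums can approximate the same report in one pipeline. A quick sketch using md5sum from GNU coreutils (the demo tree below is hypothetical, built just to show the grouping):

```shell
#!/bin/bash
# Build a tiny demo tree: two identical copies and one unique version.
mkdir -p /tmp/fu-demo/ProjectA /tmp/fu-demo/ProjectB /tmp/fu-demo/ProjectC
echo 'version 1' > /tmp/fu-demo/ProjectA/ThreadPool.hpp
echo 'version 1' > /tmp/fu-demo/ProjectB/ThreadPool.hpp
echo 'version 2' > /tmp/fu-demo/ProjectC/ThreadPool.hpp

# Identical files produce identical hashes, so sorting on the hash
# column clusters duplicates together at a glance.
find /tmp/fu-demo -name 'ThreadPool.hpp' -exec md5sum {} + | sort
```

The script above gives a friendlier report (and works with plain diff), but this one-liner is handy for a quick look.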

Thursday, August 15, 2013

Demeter and the Inventor's Paradox

So I've been writing code for a long time - over 15 years professionally. Occasionally I come across a concept that I've read about in the past but that just didn't hit home. Today that concept was the Law of Demeter and the related principle, the Inventor's Paradox. In this post I am going to share my indirect experience with the Inventor's Paradox. I say indirect because, although this concept was possibly glossed over by me in the past, I have never applied its principles in a conscious manner.

The Inventor's Paradox, summarized simply, is that oftentimes it is easier to solve a more general problem than a more specific one. Stated another way, solving the general problem is likely to lead to a much more elegant solution. I started applying this principle about 12 years ago after a quick introduction to the (GoF) Design Patterns. Design Patterns opened my mind to what Object-Oriented Programming (OOP) really is and how it can be made to work to my advantage. In particular, I began to think about programs at different levels of abstraction. I continued on my journey via the internet and the published tutelage of Bob Martin, Grady Booch, and Hunt and Thomas (among many others). Very quickly the concepts that I read and studied began to pay dividends. Namely, my code
  • Grew smaller, MUCH smaller than some of my peers,
  • Began to suffer fewer bugs,
  • Became easier to maintain,
  • Was easier to extend.
One particular area where I've seen the application of these principles reap huge benefits is Data Modeling. I happen to work in a field that is unique in the way software is created and works. In my field it is not uncommon to communicate with dozens of other software systems that are each part of a very complicated, coordinated whole. The components of the system have evolved over long periods of time, and each component is unique in its language, structure, build environment, hardware requirements, etc. Integrating these systems is a huge chore and often amounts to building unique adapters or bridges between any two components. Sometimes, however, we have the opportunity to design a model for integrating multiple components.

On occasion these opportunities have yielded really intelligent solutions that continue to evolve over time thanks to their elegance and simplicity. Many times, however, these opportunities are squandered, and the interfaces/bridges end up having a very short lifespan before being reinvented and replaced.

The best data models I've used are ones that hide away the pedantic data specifics particular to an originating system. Poor models, which unfortunately are very prevalent, are peppered with details oft-times unnecessary to downstream components. The worst models have data members that mix concepts or ontologies; in other words, a field may convey special information only when its value is within some range or when another field is at some value.

However, the overarching issue with poor models is the lack of a general specification. A proper specification should attempt to formalize the needs of the components in the most general sense. That specification needs to satisfy the immediate needs of its current clients but also to anticipate the needs of any future clients. Additionally, the specification needs to include areas for data providers to squirrel away details that are not part of the formal interface but are critical to system development and maintenance. I usually refer to these areas as Metadata extensions. They are metadata in the sense that they provide information about the data itself. As an example, metadata fields might include details about the algorithm that produced the data for downstream clients.

Reading the Inventor's Paradox reminded me of these situations where a more general approach to data sharing was so much more powerful and easier at the same time!

References:
Adaptive Programming and Demeter

Tuesday, February 1, 2011

Bashmarks update 1.07_2011.02.01

Bashmarks 1.07 is being released.

There are a few bug fixes and a few changes to its interface. Namely, the commands g and p have been renamed to gb and pb; I feel that better follows the convention of Linux two-letter commands. Additionally, a new list-bookmarks (lb) command has been added.

Fixed bugs include:
  • Fixed issue with "cd //" ending up in infinite loop
  • Made /tmp/*_BashMarks.tmp file (used by autocompletion)
  • Now use SHELL_SESSION_ID (UUID) to guarantee no collisions
  • Called __autocompleteBms at end of file to ensure autocompletion
Download here: Bashmarks 1.07

Monday, November 22, 2010

Bashmarks update 1.05_2010.11.05

Download the latest fixed version of Bashmarks, version 1.05, for directory bookmarks.

Fixes:
  • cd, pushd, popd - now return proper error values (bash expression chaining will now work correctly, such as: pushd somedir && popd)
  • Added version number. Currently 1.05_2010.11.22