This post records a minor success on my computer vision project training a LeNet-5 style convolutional neural net to recognize a stylus. The structure of a convolutional neural network is designed to automatically generate features from the image that are significant to the detection of categories.
The object of the game for my project is to be able to examine any given patch in an image and detect the presence of the stylus and, if the stylus is detected, to determine its dimensions and location. While categorization was not too difficult, the regression portion that determines the location was, and still is, problematic.
The neural net was created using PyNeurGen with modified classes for Convolutional and Subsampling layers. A StochasticNode class created for drop-out nodes simulates unreliable sensors. Basically, drop-outs are a method of regularization. At some point I will add those classes to the PyNeurGen project when the code settles down.
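To illustrate the drop-out idea, here is a minimal sketch in plain Python. It is not the StochasticNode class mentioned above; the function name `apply_dropout` and its parameters are assumptions made for illustration. Each activation is zeroed with probability `drop_prob` during training, like a sensor that fails at random, and the survivors are scaled up so the expected signal is unchanged. That rescaling is the common "inverted drop-out" variant, which may differ from what my own code does.

```python
import random


def apply_dropout(activations, drop_prob=0.5, training=True, rng=None):
    """Randomly zero activations to mimic unreliable sensors (hypothetical helper)."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        # At prediction time every node is treated as reliable.
        return list(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    # Keep each value with probability keep_prob and rescale the survivors
    # so the expected sum matches the undropped activations.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    print(apply_dropout([0.2, 0.9, 0.5, 0.7], drop_prob=0.5, rng=rng))
```

Because a different subset of nodes goes silent on each pass, no single node can be relied upon, which is what gives the regularizing effect.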
My initial approach used an output that signified category and location simultaneously, reasoning that the same inputs are going into both calculations. If the stylus was not detected, all the location coordinates would be 0. If it was detected, the coordinates would be used to draw the outline.
However, all too often, the tentative lines that appeared were erratic. The network was deciding, for each and every point, both (1) whether a stylus was present and (2) whether to output the specific location of the point or a zero. It makes more sense to detect the presence of the stylus first and then, having that belief, calculate the location.
Possibly by running additional epochs, the problems would sufficiently dissipate, but intuitively it seemed like a poor approach.
Eventually, it occurred to me that a better structure would add another output layer. In the new format, only the two categorization nodes would sit in the former output layer. Then, using that a priori determination of category, the category nodes would inform the location nodes without ambiguity.
In a sense, the network should be viewed as two networks, category and regression, with sparse links between them.
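The two-stage arrangement can be sketched in plain Python as follows. This is an illustration under assumptions, not the actual network code: `two_stage_output`, the weight layout, and the hard gate on the category decision are all hypothetical. The category nodes are computed first from the shared features. The location nodes then receive those features plus the category belief, and coordinates are reported only when the stylus is believed to be present.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """Weighted sum per output node; weights[j] is the row for node j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def two_stage_output(features, cat_w, cat_b, loc_w, loc_b):
    """Return (category activations, location or None). Hypothetical sketch."""
    # Stage 1: two category nodes, ordered [stylus present, stylus absent].
    category = [sigmoid(v) for v in dense(features, cat_w, cat_b)]
    if category[0] <= category[1]:
        # No stylus believed present, so no coordinates are reported.
        return category, None
    # Stage 2: location nodes see the features plus the category belief.
    location = dense(list(features) + category, loc_w, loc_b)
    return category, location


if __name__ == "__main__":
    # Toy weights chosen by hand purely to exercise both branches.
    cat_w, cat_b = [[2.0, 0.0], [-2.0, 0.0]], [0.0, 0.0]
    loc_w = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
    loc_b = [0.5, 0.25]
    print(two_stage_output([1.0, 0.0], cat_w, cat_b, loc_w, loc_b))
    print(two_stage_output([-1.0, 0.0], cat_w, cat_b, loc_w, loc_b))
```

The point of the separation is that the location nodes no longer have to encode "no stylus" as a row of zeros; that decision is made once, upstream.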
An updated version of PyNeurGen, which is Python Neural Genetic Algorithms, has been uploaded. This software package implements a pure Python version of neural networks and a variant of genetic algorithms called grammatical evolution.
It has been a while since the last update. A major portion of the work entailed writing unit tests for the various modules. While some testing was done originally, in the intervening period I have developed a much greater appreciation for test functions, and so I went back and increased the coverage to somewhere around 90%. Some refactoring was necessary to break up functions into smaller, more easily testable chunks.
In addition, some functions are improved. I reworked the stopping criteria for grammatical evolution. Normally, the determination of when to stop the evolutionary process revolves around the figure of merit, the fitness value. The fitness value, generated by the process for each genotype, must become sufficiently great or small, depending upon the objective. When the hurdle is reached, it is time to stop. The other typical reason to stop the process is when a maximum number of generations has been reached.
PyNeurGen now enables custom fitness functions that, among other things, can drive an evolutionary process that tries to build a distribution of solutions. Suppose that the fitness value is only a loose proxy for what you really want to do. In such cases, it may be that all of the fitness values in an upper range signify genotypes that are suitable for that purpose. In situations like that, it makes more sense to think of building a population where, for example, the upper quartile must be above a hurdle fitness value. With the custom functions, that can be done now. In fact, any distribution of fitness values can be used as the stopping criterion.
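As a sketch of that idea, the functions below stop the process once the upper quartile of the population clears a hurdle, or once a generation cap is hit. The names `upper_quartile_reached` and `should_stop` are hypothetical and do not reflect the actual PyNeurGen interface; the example also assumes that larger fitness values are better.

```python
def upper_quartile_reached(fitness_values, hurdle):
    """True when every genotype in the top quarter beats the hurdle."""
    if not fitness_values:
        return False
    ranked = sorted(fitness_values, reverse=True)
    top_count = max(1, len(ranked) // 4)
    # The weakest member of the top quarter decides the outcome.
    return ranked[top_count - 1] > hurdle


def should_stop(fitness_values, hurdle, generation, max_generations):
    """Stop on the distribution test or when the generation cap is reached."""
    if generation >= max_generations:
        return True
    return upper_quartile_reached(fitness_values, hurdle)


if __name__ == "__main__":
    population = [1, 2, 3, 4, 5, 6, 7, 8]
    print(should_stop(population, hurdle=6.5, generation=10, max_generations=50))
```

For a minimizing objective the sort order and the comparison would be reversed. Any other test on the list of fitness values, such as a median or a spread measure, could be substituted in the same place.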
Recently, I had the opportunity to visit our coffee roaster, Diantha's Coffee, a wholesale coffee roaster here in the San Francisco Bay Area, where I could see first-hand the coffee being roasted. They have an old-fashioned roaster with manual controls and considerable charm.
It turned out to be an interesting opportunity to apply machine-learning techniques to a novel process.
The Sound of Coffee Roasting
Roasting is in many ways a methodical process that takes skill and experience to get the right combination of mixtures of coffee beans, roasting temperatures, timing, and rapid cool downs.
One part of the process involves determining whether the beans have been sufficiently roasted. Depending upon the type of bean and the roast desired, there are different cycles of roasting and cool down.