This post records a minor success on my computer vision project: training a LeNet-5-style convolutional neural net to recognize a stylus. The structure of a convolutional neural network is designed to automatically generate features from the image that are significant for distinguishing categories.
The goal of my project is to examine any given patch in an image and detect the presence of the stylus and, if the stylus is detected, to determine its dimensions and location. While categorization was not too difficult, the regression portion that determines the location has been, and remains, problematic.
The neural net was created using PyNeurGen with modified classes for Convolutional and Subsampling layers. A StochasticNode class created for drop-out nodes simulates unreliable sensors. Basically, drop-outs are a method of regularization. At some point I will add those classes to the PyNeurGen project when the code settles down.
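To make the "unreliable sensor" idea concrete, here is a minimal sketch of a drop-out node in plain Python. It is not the PyNeurGen class: the name `StochasticNodeSketch`, the `activate` method, and the `drop_prob` parameter are all hypothetical, and the rescaling of surviving values ("inverted" drop-out) is an assumption of this sketch rather than something my code necessarily does.

```python
import random


class StochasticNodeSketch:
    """Hypothetical stand-in for a drop-out node; not the PyNeurGen class."""

    def __init__(self, drop_prob=0.5, rng=None):
        if not 0.0 <= drop_prob < 1.0:
            raise ValueError("drop_prob must be in [0, 1)")
        self.drop_prob = drop_prob
        # A seeded random.Random can be passed in for repeatable runs.
        self.rng = rng or random.Random()

    def activate(self, value, training=True):
        # Outside of training the node passes its input through unchanged.
        if not training:
            return value
        # During training the node fails with probability drop_prob,
        # like a sensor that intermittently returns nothing.
        if self.rng.random() < self.drop_prob:
            return 0.0
        # Survivors are scaled up so the expected output equals the input
        # (an assumption of this sketch, often called inverted drop-out).
        return value / (1.0 - self.drop_prob)
```

Because downstream nodes cannot count on any single input being present, they are pushed toward spreading their reliance across many inputs, which is where the regularizing effect comes from.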
My initial approach used an output that signified category and location simultaneously, reasoning that the same inputs are going into both calculations. If the stylus was not detected, all the location coordinates would be 0. If the stylus was detected, the coordinates would be used to draw its outline.
However, all too often, the tentative lines that appeared were erratic. For each and every point, the network was deciding both (1) whether a stylus was present and (2) whether to output the point's specific location or a zero. It makes more sense to detect the stylus's presence first and then, holding that belief, calculate the location.
Running additional epochs might have made the problems dissipate sufficiently, but intuitively it seemed like a poor approach.
Eventually, it occurred to me that a better structure would add an additional output layer. In the new format, only the two categorization nodes sit in the former output layer. With the category determined beforehand, the category nodes then inform the location nodes without ambiguity.
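As a rough illustration of that ordering, the forward pass below computes the two category nodes first and then lets the stylus probability scale the location outputs. This is only a sketch under stated assumptions: the function names, the weight layout, the softmax on the category nodes, and the multiplicative gate are all hypothetical choices made for the example, not the actual PyNeurGen-based network.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    # One output per weight row: dot(row, inputs) + bias.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(features, cat_w, cat_b, loc_w, loc_b):
    # Stage 1: the two category nodes, [p_no_stylus, p_stylus].
    category = softmax(dense(features, cat_w, cat_b))
    p_stylus = category[1]
    # Stage 2: the location nodes see the same features, but their
    # outputs are gated by the category decision (an assumed way of
    # letting the category nodes inform the location nodes).
    raw_location = dense(features, loc_w, loc_b)
    location = [p_stylus * v for v in raw_location]
    return category, location
```

With this arrangement the location nodes no longer have to decide on their own whether to emit a coordinate or a zero: when the category stage is confident there is no stylus, the gate drives every coordinate toward 0.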
In a sense, the network should be viewed as two networks, category and regression, with sparse links between them.
Recently, I had the opportunity to visit our coffee roaster, Diantha's Coffee, a wholesale roaster here in the San Francisco Bay Area, where I could see first-hand the coffee being roasted. They have an old-fashioned roaster with manual controls and considerable charm.
It turned out to be an interesting opportunity to apply machine-learning techniques to a novel process.
The Sound of Coffee Roasting
Roasting is in many ways a methodical process that takes skill and experience to get the right combination of bean blends, roasting temperatures, timing, and rapid cool-downs.
One part of the process involves determining whether the beans have been sufficiently roasted. Depending upon the type of bean and the roast desired, there are different cycles of roasting and cool-down.