This is a fully functional Theano-based program for constructing, training, and evaluating convolutional neural networks. The available layer types are convolutional, fully connected, and softmax (a special case of fully connected), so a network does not have to include any convolutional layers. The networks learn using stochastic gradient descent, with optional dropout, momentum, and L2 regularization.
NOTE: This is not designed to work with Python versions earlier than 3.x. Also note that I have chosen to use Greek characters for the learning rate (η), momentum (μ), and L2 regularization (λ). This is just a personal preference.
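As a rough illustration of how these three hyperparameters interact, here is a minimal pure-Python sketch of one SGD step with classical momentum and L2 weight decay. It is not taken from convnetwork.py: the function name `sgd_step`, the list-of-floats representation, and the exact form of the update are assumptions made for illustration, and the library's Theano update may differ. It also shows that Python 3 accepts η, μ, and λ as identifiers.

```python
def sgd_step(w, v, grad, η=0.02, μ=0.4, λ=0.0):
    """One SGD step with classical momentum and L2 weight decay.

    w, v, and grad are equal-length lists of floats (weights, velocities,
    and loss gradients). This is an illustrative sketch, not the update
    used inside convnetwork.py.
    """
    # The L2 penalty adds λ * w to the gradient of each weight.
    new_v = [μ * vi - η * (gi + λ * wi) for wi, vi, gi in zip(w, v, grad)]
    # Each weight moves by its updated velocity.
    new_w = [wi + vi for wi, vi in zip(w, new_v)]
    return new_w, new_v


# Two consecutive steps with the same gradient: the second step is larger
# because the velocity carries over from the first.
w, v = [1.0], [0.0]
w, v = sgd_step(w, v, [0.5], η=0.1, μ=0.9)
w, v = sgd_step(w, v, [0.5], η=0.1, μ=0.9)
```

With μ set to 0 this reduces to plain SGD, and with λ set to 0 no weight decay is applied.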
I will be adding choices for error/loss functions soon, but it hasn't been a priority. The current function (in both FCLayer and SoftmaxLayer) is log likelihood. It is easy enough to change the loss function for specific needs if you are familiar with Theano.
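For readers unfamiliar with this cost, a log-likelihood loss on a softmax output is commonly computed as the mean negative log of the probability assigned to the correct class. The following is a plain-Python sketch of that idea, not the Theano expression used in FCLayer or SoftmaxLayer. The function names `softmax` and `neg_log_likelihood` are hypothetical.

```python
import math


def softmax(z):
    """Convert a list of raw scores into probabilities that sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(zi - m) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]


def neg_log_likelihood(probs, labels):
    """Mean negative log-probability of the correct class.

    probs  -- list of per-example probability lists
    labels -- list of integer class indices, one per example
    """
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


# Two examples with three classes each (made-up scores).
scores = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
probs = [softmax(s) for s in scores]
loss = neg_log_likelihood(probs, labels=[0, 1])
```

The loss approaches 0 as the network assigns probability close to 1 to the correct class and grows without bound as that probability approaches 0.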
There may be some disarray in the code at the moment, mostly because I have made changes on the fly for specific projects. However, it is still very fast and efficient when used with a decent Nvidia GPU.
I will post a notebook demonstrating the usage with MNIST soon, but in the meantime, here is a basic usage example:
    # If convnetwork.py is not on the current Python path:
    # import sys
    # sys.path.append('/path/to/directory/containing/file')
    import convnetwork
    from convnetwork import ConvNet as CN
    from convnetwork import ConvLayer as cvl
    from convnetwork import FCLayer as fcl
    from convnetwork import SoftmaxLayer as sfl

    # Initialize the network; the architecture is defined by layers=[...]
    net = CN(layers=[cvl(inpt_shape=(None, 1, 28, 28),
                         filter_shape=(32, 1, 3, 3),
                         pool_shape=(2, 2)),
                     cvl(inpt_shape=(None, 32, 13, 13),
                         filter_shape=(64, 32, 4, 4),
                         pool_shape=(2, 2)),
                     # The second conv layer has 64 filters and yields 5x5
                     # maps, so n_in is 64*5*5 (the original read 32*5*5).
                     fcl(n_in=64*5*5, n_neurons=32, p_drop=0.2),
                     sfl(n_in=32, n_neurons=10)])

    # Train the network using stochastic gradient descent.
    # Xtr, ytr, Xval, and yval are your own training and validation data.
    train = (Xtr, ytr)
    val = (Xval, yval)
    test = None
    mb_size = 50
    epochs = 30
    learn_rate = 0.02
    momentum = 0.4
    l2_reg = 0.0
    net.sgd(train, val, test, mb_size, epochs, learn_rate, momentum, l2_reg)
    # prints status
    # prints best scores and best epoch after training completes

    # Make a prediction from single or multiple observations: x
    # return class prediction
    net.predict(x)
    # return predicted probabilities
    net.p_predict(x)

    # Reset the network to re-train
    net.reset()
    # then retrain
    net.sgd(...)
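The `inpt_shape` and `n_in` values in the example have to agree with what each preceding layer produces. A small helper can check the arithmetic. It is sketched here under the assumption that ConvLayer uses a "valid" convolution (no padding, stride 1) followed by non-overlapping pooling. `conv_pool_output` is a hypothetical name and not part of convnetwork.py.

```python
def conv_pool_output(size, filter_size, pool_size):
    """Side length of a square feature map after convolution and pooling.

    Assumes a 'valid' convolution with stride 1 and non-overlapping
    pooling that discards any leftover border.
    """
    conv_size = size - filter_size + 1
    return conv_size // pool_size


first = conv_pool_output(28, 3, 2)      # 28x28 input, 3x3 filters, 2x2 pool
second = conv_pool_output(first, 4, 2)  # 4x4 filters, 2x2 pool
n_filters = 64                          # filters in the second conv layer
flat_inputs = n_filters * second * second
print(first, second, flat_inputs)       # prints: 13 5 1600
```

Under these assumptions, the second convolutional layer's flattened output has 64 * 5 * 5 = 1600 values, which is the size the first fully connected layer's `n_in` should match.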
Get the code here