Neural Network Design via Evolutionary Programming: Discovering Relevant Inputs and Optimal Configuration

2015-02-27T00:00:00Z (GMT) by Jonathan P. Lessner
Neural networks, capable of accurate prediction over complex domains, learn subtle correlations between disparate data streams in a non-linear way. This is both their greatest strength and their greatest drawback. Selecting relevant data streams from among massive, diverse data sets relies on domain expertise and neural network engineering experience. Selecting and curating input streams for networks is difficult and time-consuming. In domains like weather, traffic prediction, and medicine, an automated method is needed to discover and curate only the most relevant data from large data sets of varying value. An automated, complete combinatorial search for optimal input configurations would be computationally intractable. This thesis aims to leverage Evolutionary Programming to search the solution space in an acceptable amount of time. A scoring algorithm was designed to reward accuracy and efficiency. Each generation consisted of a population of fifty Back Propagation Neural Networks (BPNNs), whose input nodes each represented a single data stream. The configuration of each member was recorded in a Genome. The first generation's members were configured by a full randomization process. Each network was trained and tested using the standard BPNN method. The networks were scored according to the fitness algorithm, and the top five performers were retained. The following generation was created by methods that crossbred and mutated the genomes of the victorious members. Each experiment was terminated after one thousand generations. Since pressure was put on the members to perform according to the metrics of accuracy (low MSE) and efficiency (a low number of inputs), it was hypothesized that the best networks in the final generation would contain only the best input node configurations, representing the most relevant data in the available data set. This method's effectiveness was confirmed in two ways.
First, the accuracy and efficiency of the networks improved steadily over the generations, in response to selective pressure. Second, the hypothesized discovery of relevant data was supported by results that mirrored known efficacious medical treatments for the sampled patient. The method achieved these goals in an acceptable amount of time, typically between twelve and twenty-four hours on a desktop PC.
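
The generational loop described above can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation. The population size (fifty), survivor count (five), and generation limit (one thousand) come from the abstract. Everything else is an assumption made for the sketch: the boolean genome encoding, the additive form of the fitness score, the `INPUT_PENALTY` weight, uniform crossover, the mutation rate, and carrying the survivors over unchanged. The `evaluate_mse` callback stands in for training and testing a back-propagation network, and `toy_mse` is a hypothetical surrogate used only so the example runs quickly.

```python
import random

POPULATION_SIZE = 50   # from the abstract
NUM_SURVIVORS = 5      # from the abstract
INPUT_PENALTY = 0.01   # assumption: cost charged per active input


def random_genome(num_streams, rng):
    """Assumed encoding: one boolean per data stream (input node on/off)."""
    return [rng.random() < 0.5 for _ in range(num_streams)]


def fitness(genome, evaluate_mse):
    """Assumed score, lower is better: MSE plus a penalty per active input."""
    return evaluate_mse(genome) + INPUT_PENALTY * sum(genome)


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(genome, rng, rate=0.05):
    """Flip each gene independently with probability `rate` (assumed value)."""
    return [(not gene) if rng.random() < rate else gene for gene in genome]


def evolve(num_streams, evaluate_mse, generations=1000, seed=0):
    """Return the best genome found after the given number of generations."""
    rng = random.Random(seed)
    population = [random_genome(num_streams, rng) for _ in range(POPULATION_SIZE)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda g: fitness(g, evaluate_mse))
        survivors = ranked[:NUM_SURVIVORS]
        children = []
        while len(children) < POPULATION_SIZE - NUM_SURVIVORS:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        # Assumption: the survivors are carried into the next generation unchanged.
        population = survivors + children
    return min(population, key=lambda g: fitness(g, evaluate_mse))


def toy_mse(genome, relevant=(0, 3)):
    """Hypothetical surrogate for network training: error falls only when
    the streams listed in `relevant` are switched on."""
    missing = sum(1 for i in relevant if not genome[i])
    return 0.1 + 0.5 * missing


if __name__ == "__main__":
    best = evolve(num_streams=8, evaluate_mse=toy_mse, generations=200, seed=1)
    print([i for i, active in enumerate(best) if active])
```

On this toy problem the search is expected to settle on genomes that keep the two relevant streams and drop the rest, which is the behaviour the abstract hypothesizes for real data. With an actual network in place of `toy_mse`, each fitness evaluation involves a full train-and-test cycle, which is consistent with the run times of many hours reported above.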