Well, yes, I am aware that there is a large body of literature on trying to predict earthquakes using various statistical methods and that at least one paper suggests that neural networks are a promising approach, and another suggests that foreshocks are potentially useful data as precursors to earthquakes. I wanted to see for myself whether a modern machine learning algorithm might do a decent job catching the highly non-linear relationship, if it exists, between foreshocks and subsequent earthquakes, and I couldn't actually find any papers that try exactly the particular method I envisioned.
I figure that this is a worthwhile project to attempt even if it doesn't succeed because the potential payoff in the very unlikely chance that it works would be enormous (in terms of lives saved). And I mean, I still have my high-end computers that I originally bought to do my Master's thesis research with, and I'm already familiar with programming neural nets from using them for said thesis, so I figure I may as well give this experiment a shot. If nothing else it's practice programming with the machine learning tools that I'm supposedly an expert with. And I mean, I can say I tried to predict earthquakes with neural networks. How cool is that?
Anyway, I can report that the best performance I've seen so far with the algorithm is a hilariously bad 99.5% error rate. And it's unreliable to boot, because sometimes with the exact same experimental setup I will get 100% error instead, so it's possible that the 99.5% was a result of random noise. XD Though also keep in mind that the network I'm using is far from optimized, other than being run on the GPU. I still have to experiment more with the number of hidden nodes and the learning rate to see what effect different values will produce. Unfortunately, this is one of the difficulties of using algorithms like neural networks: a lot of the hyper-parameters need to be fine-tuned before you can honestly say whether the algorithm is performing or not.
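For what it's worth, a sweep over those two hyper-parameters could look roughly like the sketch below. This is not the actual code from the experiment. The function `train_and_score` is a hypothetical stand-in for "build the network with this many hidden nodes and this learning rate, train it with this random seed, and return the test error rate". Each setting is repeated over several seeds because identical setups can give different error rates from run to run, so a single run per setting can't be trusted.

```python
import itertools
import statistics


def grid_search(train_and_score, hidden_sizes, learning_rates, seeds):
    """Try every (hidden nodes, learning rate) pair over several seeds.

    train_and_score(hidden, lr, seed) is a hypothetical callable that
    trains one network and returns its error rate as a float.
    Returns a list of (hidden, lr, mean_error, stdev_error), best first.
    """
    results = []
    for hidden, lr in itertools.product(hidden_sizes, learning_rates):
        # Repeat the same configuration with different seeds so that
        # run-to-run noise shows up as a spread instead of a fluke.
        errors = [train_and_score(hidden, lr, seed) for seed in seeds]
        spread = statistics.stdev(errors) if len(errors) > 1 else 0.0
        results.append((hidden, lr, statistics.mean(errors), spread))
    # Lowest mean error first.
    results.sort(key=lambda r: r[2])
    return results
```

If the spread for a setting is about as large as the gap between settings, the difference between them is probably just noise.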
I also realized that I'm doing something outside of normal practice with regard to autoencoder neural networks, because normally you train the autoencoder to just reconstruct the input, but I've hacked this particular autoencoder to actually try to associate two sets of images of identical dimensions... There is a surprising lack of literature on using autoencoders in this way, so I'm not entirely confident this could even work in theory.
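To make the difference concrete, here is a minimal sketch of how the training pairs differ between the two setups. It assumes the two sets of images are consecutive 24-hour frames, with one day's frame as the input and the next day's frame as the target. The name `make_pairs` and the `daily_frames` list are made up for illustration and are not from the actual code.

```python
def make_pairs(daily_frames, autoencoder=False):
    """Build (input, target) training pairs from a list of daily frames.

    daily_frames is assumed to be ordered by day, one image per 24 hours.
    """
    if autoencoder:
        # Normal autoencoder: the target is the input itself.
        return [(frame, frame) for frame in daily_frames]
    # Associative variant: the target is the following day's frame,
    # so the last frame has no target and is used only as a target.
    return list(zip(daily_frames[:-1], daily_frames[1:]))
```

The network architecture can stay the same in both cases, since input and target have identical dimensions. Only the target handed to the loss function changes.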
I'm actually tempted to modify my dataset to be more like a traditional supervised learning problem with labels to classify so I can apply other machine learning algorithms, like a Support Vector Machine, Convolutional Neural Network, and Deep Belief Network, the three I used in my thesis successfully. But I wanted to try the autoencoder first, because the problem as I envisioned it seems uniquely suited to it, in the sense that the autoencoder can output an image that maps all the predicted earthquakes of a given 24 hours in their approximate location and magnitude, and this output could be parsed to identify all the large-magnitude earthquakes predicted in that 24-hour window. I figure that if it works with a high enough degree of accuracy, it would be relatively easy to set up a server with the fully trained network that takes the previous 24 hours of USGS earthquake data, and then predicts in real time the next 24 hours of earthquakes, giving out their predicted location and magnitude to a website front end.
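A rough sketch of the image idea, under assumptions that are mine for illustration and not necessarily what the real code does: the world is cut into a coarse latitude/longitude grid, each cell stores the largest magnitude seen in that cell during the 24 hours, and "large" means magnitude 6.0 or above. The names `rasterize` and `large_events`, the grid size, and the threshold are all hypothetical.

```python
def rasterize(events, rows=18, cols=36):
    """Turn (lat, lon, magnitude) events into a rows x cols grid.

    Each cell holds the largest magnitude that fell inside it
    (0.0 if there was none). The 18 x 36 grid is an arbitrary choice.
    """
    grid = [[0.0] * cols for _ in range(rows)]
    for lat, lon, mag in events:
        # Row 0 is the northernmost band, column 0 starts at -180 longitude.
        r = min(int((90.0 - lat) / 180.0 * rows), rows - 1)
        c = min(int((lon + 180.0) / 360.0 * cols), cols - 1)
        grid[r][c] = max(grid[r][c], mag)
    return grid


def large_events(grid, threshold=6.0):
    """Parse a grid back into (lat, lon, magnitude) for big cells only.

    The returned location is the center of the cell, so it is only
    as precise as the grid resolution.
    """
    rows, cols = len(grid), len(grid[0])
    found = []
    for r, row in enumerate(grid):
        for c, mag in enumerate(row):
            if mag >= threshold:
                lat = 90.0 - (r + 0.5) * 180.0 / rows
                lon = -180.0 + (c + 0.5) * 360.0 / cols
                found.append((lat, lon, mag))
    return found
```

The same `large_events` step would be applied to the network's output image, which is why the predicted locations can only ever be approximate.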
Anyways, I don't actually expect this to work. The probability that some silly guy with just a Master's degree in Computer Science can come in and upset the whole field of seismology with a working machine learning earthquake predictor is very, very low. But the probability is nonzero, and the sheer number of lives that could be saved if I even just open the door to research that leads eventually to a working earthquake predictor is so high that I consider it worth the very small effort on my part to at least try it and see what happens. It is my personal moonshot project that I happen to be in a position to attempt right now. At most it costs me a few days of coding and waiting for the neural net to train and test. And like I said, I'm learning and practicing my skills while doing so.
And also, it's fun to code together something cool like this. My Master's thesis was all about solving the occluded object recognition problem, which at most nudges the field forward a tad. A paper about optimizing some obscure hyper-parameter in an obscure algorithm is also kinda boring (unless you actually understand how cool Convolutional Neural Networks are). This though, this is something grand, an application of a technology that I don't think enough people realize is exceptionally powerful. There is a reason why all the latest speech recognition software from Google, Microsoft, IBM, etc. uses Deep Neural Networks now. The work of the last fifty years of A.I. and machine learning research is finally starting to really pay off.
And so, I'm just trying to take a very powerful technology and apply it to a very hard problem. Admittedly I'm approaching the problem rather naively compared to the experts in the field of seismology, but occasionally even the naive approach can work. And even if it doesn't, at least we'll know that it doesn't. Even negative results are an important part of doing science.