Earthquake Prediction Project

Darklight · by **Darklight** on 2013-08-31T22:42:00

So, I've been thinking about what could be done that would do the most good for the least effort. Visions of creating Friendly A.I. aside, one area that I think could have a tremendous impact is earthquake prediction. Judging from Wikipedia's List of Natural Disasters by Death Toll, earthquakes, and the tsunamis caused by earthquakes, are arguably the most common natural disasters with very high death tolls.

I therefore wonder if a simple way to do a lot of good with comparatively minimal effort would be to devise a computer algorithm to predict earthquakes using foreshock data. As someone who's been doing some research with machine learning algorithms, I've been thinking, why not just take the last 40 years of Earthquake data and feed them into our best machine learning algorithms, and see if there's some kind of relationship between foreshocks and large magnitude earthquakes?

Then, if we can get it to work experimentally (by showing that it predicts previous earthquakes), set up a server with a website that automatically feeds the most recent earthquake data into the machine learning model and makes predictions over, say, a 24-hour time frame.

What do people think? Worthwhile effort, or crazy pipe dream?

peterhurford · by **peterhurford** on 2013-09-02T07:03:00

It's worth trying, but I think it's just Not That Easy. There's a good discussion of this in Nate Silver's The Signal and the Noise.

Darklight · by **Darklight** on 2014-08-18T01:27:00

Just wanted to say that I finally got around to attempting to code this thing.

My first experimental setup has consisted of taking all the earthquake data from 1973 to 2000 from the USGS website in the form of CSV files, parsing these files, and creating a dataset of images that map each earthquake to a pixel whose brightness is proportional to its magnitude, one image for every 24-hour period. Then, using the Theano library in Python, I implemented a modified autoencoder neural network that takes a given image representing the earthquakes in a given 24 hours, and another image representing the next 24 hours, and learns to associate the two images together. The network is then tested by giving it images it wasn't trained on, to see if it can generate an image that can then be parsed into predictions for the next 24 hours, which are then compared with the correct image.
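A minimal sketch of the image encoding described above, using the 360 x 180 image size and the magnitude / 10 * 255 brightness scaling that are spelled out later in the thread. The function name `rasterize_day`, the one-degree grid with north at the top, and the rule of keeping the strongest quake per cell are assumptions for illustration, and the sample quakes are made up.

```python
import numpy as np

def rasterize_day(quakes, width=360, height=180):
    """Build one day's image from (latitude, longitude, magnitude) tuples."""
    image = np.zeros((height, width), dtype=np.uint8)
    for lat, lon, mag in quakes:
        # Map latitude [-90, 90] to a row and longitude [-180, 180] to a column,
        # clamping the far edges into the last cell (assumed one-degree grid).
        row = min(int(90.0 - lat), height - 1)
        col = min(int(lon + 180.0), width - 1)
        # Brightness is magnitude / 10 scaled to the 0-255 pixel range.
        pixel = int(round(mag / 10.0 * 255.0))
        # If several quakes share a cell, keep the strongest (an assumption).
        image[row, col] = max(int(image[row, col]), pixel)
    return image

# Two made-up quakes for one 24-hour period.
day = rasterize_day([(38.2, -122.3, 6.0), (35.7, 139.7, 4.5)])
print(day.shape, day[51, 57], day[54, 319])
```

Flattening each image gives a vector of 64800 values, and consecutive days form the (input, target) pairs the network is trained on.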

So far early experiments with default hyper-parameters have produced a 100% error rate... XD

Which is to say Peter was probably right about it being Not That Easy.

peterhurford · by **peterhurford** on 2014-08-18T23:12:00

Note that there's a good wealth of study on this topic already. The consensus seems to be that predicting earthquakes is incredibly hard.

Darklight · by **Darklight** on 2014-08-19T01:44:00

Well, yes, I am aware that there is a large body of literature on trying to predict earthquakes using various statistical methods and that at least one paper suggests that neural networks are a promising approach, and another suggests that foreshocks are potentially useful data as precursors to earthquakes. I wanted to see for myself whether a modern machine learning algorithm might do a decent job catching the highly non-linear relationship, if it exists, between foreshocks and subsequent earthquakes, and I couldn't actually find any papers that try exactly the particular method I envisioned.

I figure that this is a worthwhile project to attempt even if it doesn't succeed because the potential payoff in the very unlikely chance that it works would be enormous (in terms of lives saved). And I mean, I still have my high-end computers that I originally bought to do my Master's thesis research with, and I'm already familiar with programming neural nets from using them for said thesis, so I figure I may as well give this experiment a shot. If nothing else it's practice programming with the machine learning tools that I'm supposedly an expert with. And I mean, I can say I tried to predict earthquakes with neural networks. How cool is that?

Anyway, I can report that the best performance I've seen so far with the algorithm is a hilariously bad 99.5% error rate. And it's unreliable to boot, because sometimes with the exact same experimental setup I will get 100% error instead, so it's possible that 99.5% was a result of random noise. XD Though also keep in mind this network I'm using is far from optimized, other than being run on the GPU. I still have to experiment more with the number of hidden nodes and the learning rate to see what effect different values will produce. Unfortunately, one of the difficulties of using algorithms like neural networks is that a lot of the hyper-parameters need to be fine-tuned before you can honestly say whether the algorithm is performing or not.

I also realized that I'm doing something outside of normal practice with regards to autoencoder neural networks, because normally you train the autoencoder to just reconstruct the input, but I've hacked this particular autoencoder to actually try to associate two sets of images of identical dimensions... there is a surprising lack of literature on using autoencoders in this way, so I'm not entirely confident this could even work in theory.

I'm actually tempted to modify my dataset to be more like a traditional supervised learning problem with labels to classify so I can apply other machine learning algorithms, like a Support Vector Machine, Convolutional Neural Network, and Deep Belief Network, the three I used in my thesis successfully. But I wanted to try the autoencoder first, because the problem as I envisioned it seems uniquely suited to it, in the sense that the autoencoder can output an image that maps all the predicted earthquakes of a given 24 hours in their approximate location and magnitude, and this output could be parsed to identify all the large magnitude earthquakes predicted in that 24 hour window. I figure that if it works with a high enough degree of accuracy, it would be relatively easy to set up a server with the fully trained network that takes the previous 24 hours of USGS earthquake data, and then predicts in real time the next 24 hours of earthquakes, giving out their predicted location and magnitude to a website front end.

Anyways, I don't actually expect this to work. The probability that some silly guy with just a Master's degree in Computer Science can come in and upset the whole field of seismology with a working machine learning earthquake predictor is very, very low. But the probability is nonzero, and the sheer number of lives that could be saved if I even just open the door to research that eventually leads to a working earthquake predictor is so high that I consider it worth the very small effort on my part to at least try it and see what happens. It is my personal moonshot project that I happen to be in a position to attempt right now. At most it costs me a few days of coding and waiting for the neural net to train and test. And like I said, I'm learning and practicing my skills while doing so.

And also, it's fun to code together something cool like this. My Master's thesis was all about solving the occluded object recognition problem, which at most nudges the field forward a tad bit. A paper about optimizing some obscure hyper-parameter in an obscure algorithm is also kinda boring (unless you actually understand how cool Convolutional Neural Networks are). This though, this is something grand, an application of a technology that I don't think enough people realize is exceptionally powerful. There is a reason why all the latest speech recognition software by Google, Microsoft, IBM, etc. uses Deep Neural Networks now. The work of the last fifty years of A.I. and machine learning research is finally starting to really pay off.

And so, I'm just trying to take a very powerful technology and apply it to a very hard problem. Admittedly I'm approaching the problem rather naively compared to the experts in the field of seismology, but occasionally even the naive approach can work. And even if it doesn't, at least we'll know that it doesn't. Even negative results are an important part of doing science.

Darklight · by **Darklight** on 2014-08-19T15:09:00

Another update...

So, by reducing the number of hidden nodes in the network to exactly one (!!!), I've managed to get the network to achieve a 92% error rate aka 8% accuracy rate on the validation set (stuff it wasn't trained on) after only three epochs (runs through the whole training set) of training. Oddly, increasing the number of hidden nodes or the number of epochs of training actually reduces performance with the current configuration.

Keep in mind that the way I have it set up, the error rate is calculated by looking at all the earthquakes of magnitude 3.0 or higher that are either predicted or which actually occurred, and then comparing the difference of the magnitude of the prediction with the actual occurrence. If the difference in magnitude is less than or equal to 1.0, then the prediction is correct, otherwise it is marked as an error. This means that even if the network correctly predicts an earthquake at a location, if the magnitude is off by more than 1.0, it's still wrong.
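The scoring rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `error_rate`, the per-cell comparison of flattened magnitude arrays, and returning 0.0 when nothing is scored are not from the original code, and the sample values are made up.

```python
import numpy as np

def error_rate(predicted, actual, cutoff=3.0, tolerance=1.0):
    """Fraction of scored cells whose magnitude is off by more than `tolerance`."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Score only cells where a quake >= cutoff was predicted or actually occurred.
    scored = (predicted >= cutoff) | (actual >= cutoff)
    if not scored.any():
        return 0.0  # nothing to score; returning 0 here is an assumption
    wrong = np.abs(predicted - actual)[scored] > tolerance
    return float(wrong.mean())

predicted = np.array([4.0, 0.0, 5.5, 0.0])
actual = np.array([4.6, 3.2, 0.0, 0.0])
print(error_rate(predicted, actual))  # 2 of the 3 scored cells are wrong
```

Note that cells where nothing was predicted and nothing happened never enter the score, so the rate says nothing about how often the network correctly stays silent.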

So just the fact that it manages to get 8% of the predictions right, is an interesting development. It is made more mysterious by the fact that this result only happens with exactly one hidden node, which is kinda bizarre.

Though in a way it could make sense, if it means that there's only one particular feature that is relevant in the data, likely the combined magnitude of nearby earthquakes. Or something like that. I'm not sure. Neural networks work in mysterious ways... XD

Brian Tomasik · by **Brian Tomasik** on 2014-08-20T05:46:00

(also made this reply on FB)

That's interesting about your earthquake data. The only explanation I have is the obvious one that it could be overfitting with more nodes or epochs, but if you have lots of data that shouldn't be true. Another obvious (and maybe not improbable) explanation is that there's some error in the setup. Typically when I get funky results it's because I did something wrong.

You could try a toy problem to verify the setup -- e.g., take one day's data, x, add noise to it, eps, and try to use x to predict x + eps. That should have high accuracy. If so, then the setup seems ok.
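As a rough sketch of that toy check: the snippet below builds sparse stand-in "days" x, adds small noise eps, and verifies that a model can learn to map x to x + eps on held-out days. Ordinary least squares (`np.linalg.lstsq`) is used purely as a stand-in for the network, and the sizes, sparsity, and noise level are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_cells = 200, 30   # tiny stand-in for the real 64800-cell images

# Sparse inputs: roughly 20% of cells hold a value, the rest are zero.
x = rng.random((n_days, n_cells)) * (rng.random((n_days, n_cells)) < 0.2)
eps = rng.normal(0.0, 0.01, size=x.shape)
y = x + eps                 # the target is just the input plus small noise

# Fit on the first 150 days, then check the remaining 50.
weights, *_ = np.linalg.lstsq(x[:150], y[:150], rcond=None)
held_out_mse = float(np.mean((x[150:] @ weights - y[150:]) ** 2))
print(held_out_mse)
```

If the same pipeline that scores the earthquake network reports a high error on a problem this easy, the bug is more likely in the data handling or the scoring than in the model.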

You might also sanity check with linear regression to make sure that doesn't do better. Also trying plain vanilla NNs seems reasonable to determine whether 8% accuracy is good or bad for your variant.

peterhurford · by **peterhurford** on 2014-08-21T02:57:00

Hm, I do data science, but I've never worked with neural nets, so I can't offer any help. But glad you're informed about this. And it certainly is worth doing, if solely from a "learn more about data science by applying it" perspective.

Brian Tomasik · by **Brian Tomasik** on 2014-08-21T06:09:00

It's worth doing for your own edification up to diminishing returns. You may get to do similar projects for pay at a job, so keep that in mind when assessing the learning value.

Darklight · by **Darklight** on 2014-08-21T17:17:00

I will probably do the toy problem at some point as you suggested.

But first, the sanity check with linear regression! I was actually considering this before as well, so I figure it's worth a try. Though it appears that the way I've structured the problem, the closest equivalent to linear regression is actually to perform multivariate regression on a General Linear Model. I'm working on that right now, though the sheer amount of memory required to create something like 64800 x 64800 matrices of single precision floats is really taxing the computer, even though it has 32 GB RAM. Right now there's something like 118 GB split between the RAM and the virtual memory. XD
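For a sense of scale, here is the arithmetic behind those matrix sizes, assuming the 64800 comes from the 360 x 180 image grid mentioned elsewhere in the thread and that each single-precision float takes 4 bytes.

```python
n_cells = 360 * 180                      # 64800 grid cells per image
bytes_per_float32 = 4
matrix_bytes = n_cells * n_cells * bytes_per_float32
matrix_gib = matrix_bytes / 2**30

print(n_cells, matrix_bytes, round(matrix_gib, 2))
```

Each such matrix comes to roughly 15.6 GiB, so holding just two of them at once already approaches 32 GB of RAM.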

A plain vanilla NN isn't actually that different from the autoencoder implementation I'm using, other than the fact that the autoencoder is designed to deal with a large output vector, while something like a Multi-Layer Perceptron is designed with just a few classes as output nodes. Actually, I'm finding that what I've actually implemented is kind of a weird hybrid of an autoencoder and an MLP. Though if you mean I should try just implementing a Perceptron without a hidden layer and doing Logistic Regression on the data, I suppose I could try that too.

Yeah, at the very least this is good practice with Python libraries like Numpy and Theano, figuring out how to solve this problem within the constraints of limited memory resources. Though I realize that a big reason why I'm having memory issues is actually because of the way I designed the dataset. It has lots of zeroes where no earthquake happened on a given day. In theory there's probably a much more efficient way to encode this problem, but I wanted to see whether or not storing the location information structurally might make the task easier.

Brian Tomasik · by **Brian Tomasik** on 2014-08-21T19:51:00

Darklight wrote: Though if you mean I should try just implementing a Perceptron without a hidden layer and doing Logistic Regression on the data, I suppose I could try that too.

I was thinking just to do one output at a time, but I didn't realize you had ~64800 of them.

You could try a subset of those outputs.

In theory, a NN that shares hidden nodes to produce the full set of outputs should be better than NNs that predict each output individually because of sharing info across tasks, but trying the individual NNs would at least be a sanity check.

Darklight · by **Darklight** on 2014-08-22T11:03:00

It took a while, but I finally completed the closest thing to linear regression on the data. The trouble was partly due to all the memory usage, and partly because, given the official formula (normal equation) B = (X^T X)^{-1} X^T Y, the matrix X^T X (dot product of the transpose of X and X) turned out to be a singular matrix. As singular matrices have no inverse, I had to substitute the Moore-Penrose pseudoinverse into the equation, replacing (X^T X)^{-1} X^T with it.

Though first, I tried just assuming that X^T X was its own inverse, and got 8.700949% error on the training set, and 100.00% error on the validation and test sets.

Substituting the Moore-Penrose pseudoinverse got me down to 0.016236% error on the training set, 99.488243% error on the validation set, and 99.496515% error on the test set.
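A small sketch of why the normal equation breaks down on this kind of data and how the pseudoinverse sidesteps it. The data here is random stand-in data, not the earthquake set: one column is forced to all zeros to mimic a grid cell where no earthquake ever occurs, which is one plausible (assumed) reason for the singular matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny stand-in: 6 days, 4 cells. Cell 3 never has a quake, so its column
# is all zeros and X^T X cannot be inverted.
X = rng.random((6, 4))
X[:, 3] = 0.0
Y = rng.random((6, 4))

gram = X.T @ X
print(np.linalg.matrix_rank(gram))   # rank 3 for a 4 x 4 matrix: singular

# pinv(X) stands in for (X^T X)^{-1} X^T and gives the least-squares solution
# with the smallest norm.
B = np.linalg.pinv(X) @ Y
print(B.shape)
```

With far more free weights than training days, a fit like this can reproduce the training targets almost exactly while generalizing very poorly, which is consistent with the near-zero training error and roughly 99.5% validation and test error reported above.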

Brian Tomasik · by **Brian Tomasik** on 2014-08-23T00:44:00

Very interesting! So I guess your neural network is doing quite well compared with linear regression.

Darklight · by **Darklight** on 2014-08-23T02:49:00

Latest best performance results:

Training Error: 89.66%
Validation Error: 81.92%
Test Error: 77.88%

I made a few changes to the network to see if it would help. First, I untied the weights, as previously the weights between the hidden layer and the output layer had been tied to the weights between the input and the hidden layer, ostensibly because in its original configuration, the autoencoder was just reconstructing the input, and so it made sense to tie the weights then. Since my task is actually something more akin to a regression problem, I untied the weights (duh!). Second, I changed the error/cost function. It was previously using Cross-Entropy, which makes sense when your input data is binary, and/or you're predicting probabilities of classes. But for a regression problem where you're trying to predict real values, I figured that Sum of the Squared Error makes more logical sense. Third, I tried a learning rate of 0.005, after previously trying 0.1, 0.01, and 0.001. Apparently this network is rather sensitive to learning rate.

So now I have a bit of a moral dilemma to consider. If this network can predict earthquakes something like 20% of the time, should I implement the real-time predictor and start forecasting earthquakes? The danger here is that 20% is low enough that it'll be wrong most of the time, but right every fifth time, and so that means that it has the potential to occasionally save lives, at the cost of many potentially expensive and annoying false alarms. With such a high rate of false positives, people might just ignore the warnings anyway.

Though, maybe if I can increase the accuracy further, this will be less of a dilemma. There's still a bunch of things I have yet to try, such as modifying the network to be convolutional, or pretraining the network as a restricted Boltzmann machine, or even just stacking additional hidden layers. Though the fact that the network seems to work best with just a single hidden node remains one of the most bewildering things about this experiment. And I still need to try tinkering with the various hyper-parameters more. The main issue with that is that in the current form, it takes about 10 minutes just to load the dataset, and an additional 40 minutes to run 10 epochs of training on the network. Oddly I've found that the times when the thing actually gets the best performance is around epoch 2 or 3, and then subsequent epochs it gets worse. This would suggest that the learning rate is still too high and/or the weights are saturating. So maybe I should be implementing some kind of weight decay or momentum, or even just some kind of regularization.

Hmm... I'm starting to think I could turn this experiment into another paper. I wasn't expecting the thing to actually work. XD

Brian Tomasik · by **Brian Tomasik** on 2014-08-23T07:30:00

Congrats.

Darklight wrote: Training Error: 89.66%
Validation Error: 81.92%
Test Error: 77.88%

Weird. The numbers should be the opposite. You're doing worst on the training set.

Darklight wrote: So now I have a bit of a moral dilemma to consider. If this network can predict earthquakes something like 20% of the time, should I implement the real-time predictor and start forecasting earthquakes? The danger here is that 20% is low enough that it'll be wrong most of the time, but right every fifth time, and so that means that it has the potential to occasionally save lives, at the cost of many potentially expensive and annoying false alarms. With such a high rate of false positives, people might just ignore the warnings anyway.

Presumably other experts would get involved and advise you on that if it got to that point.

Darklight wrote: Though the fact that the network seems to work best with just a single hidden node remains one of the most bewildering things about this experiment.

Does that mean each output is just a function of a linear sum of the inputs? So it's as though you're doing a bunch of independent logistic regressions / generalized linear models?

peterhurford · by **peterhurford** on 2014-08-23T14:44:00

The training error / validation error / test error is indeed weird, but maybe they're printed backwards?

How likely is something like this to arrive due to spurious noise? Or some methodological error?

jason · by **jason** on 2014-08-23T15:21:00

I've met geotechnical engineers who work on this very prediction problem in the context of undersea quakes causing landslides that cause tsunamis. I'm sure it would bear fruit to get in touch with some university researchers working on these problems if you want to take your research to the next level.

Darklight · by **Darklight** on 2014-08-23T15:56:00

Brian Tomasik wrote: Does that mean each output is just a function of a linear sum of the inputs? So it's as though you're doing a bunch of independent logistic regressions / generalized linear models?

Each hidden node in this kind of network is roughly equivalent to a Perceptron on the input. The added output layer then basically functions as a bunch of Perceptrons on the hidden layer. Even with just one hidden node, this means that the input travels through a non-linear function (sigmoidal in this case) before it reaches the output nodes, and so it's not quite the same thing as what you describe.

If I got rid of the hidden node and just directly connected the input and output layers, that I believe would be the equivalent of what you mentioned. It would also create a network with more connection weights than I can currently compute on my GPU's meager 3 GB of video memory... which would force me to use the CPU, which is about an order of magnitude slower.

So what's actually happening, -I think-, is that basically the single hidden node is detecting some feature in the input, probably magnitude at various locations, and then outputting a real value to all the output nodes, the weights of which determine what value each one returns in response. The hidden node basically compresses the entire input into a single quantity that is somehow highly correlated with earthquakes at a given location. I suspect from the image of the filter that is produced, which looks very much like a simple map of the most likely spots for an earthquake to occur, that all the hidden node is doing is learning where earthquakes happen frequently, and outputting some kind of value that depends on the particular combination of earthquakes. Each output node essentially computes the estimated magnitude of a quake at that location given the value from the hidden node.
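The bottleneck described above can be illustrated with a small NumPy sketch (not the original Theano code). It assumes sigmoid activations on both the hidden and output layers and untied weights; all names and sizes are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_in, b_hidden, w_out, b_out):
    """One sigmoid hidden node feeding every output node (untied weights)."""
    hidden = sigmoid(x @ w_in + b_hidden)    # whole input squeezed into one number
    return sigmoid(hidden * w_out + b_out)   # one value per output cell

rng = np.random.default_rng(0)
n_cells = 8                                  # stand-in for 64800
w_in = rng.normal(size=n_cells)
w_out = rng.normal(size=n_cells)
b_out = rng.normal(size=n_cells)

x = rng.random(n_cells)
prediction = forward(x, w_in, 0.0, w_out, b_out)

# Any change to x that leaves the weighted sum x @ w_in unchanged
# cannot change the prediction at all.
v = rng.normal(size=n_cells)
v -= (v @ w_in) / (w_in @ w_in) * w_in       # make v orthogonal to w_in
same = np.allclose(prediction, forward(x + v, w_in, 0.0, w_out, b_out))
print(same)
```

Because every output depends on the input only through that single number, the whole prediction image can only move along a one-dimensional family of patterns, whatever the input looks like.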

It might even just be guessing that whenever an earthquake happens in the previous 24 hours, that another one of a similar magnitude will happen again in the next 24 hours. Though, I'm not sure, because I don't know what the actual likelihood of that event is. I should try generating an image of the predictions so I can visibly compare them.

peterhurford wrote: The training error / validation error / test error is indeed weird, but maybe they're printed backwards?

It's not printed backwards, those are the actual results.... it's definitely weird, but it's possibly due to the fact that the training set is basically all the quakes from 1973 to 1994, while the validation set is 1995 to 1997, and the test set is 1998 to 2000. Maybe if I randomized things, or did cross-validation, I'd get more "expected" results. Then again, weird is kinda par for the course with this network.

peterhurford wrote: How likely is something like this to arrive due to spurious noise? Or some methodological error?

I doubt it's just noise, given that most of the time with different hyper-parameters the network gets around 99.99% error. That seems to be the "noise" or random chance level. It -could- always be some methodological error I haven't noticed, but I obviously haven't found it if that's the case.

Edit:

jason wrote: I've met geotechnical engineers who work on this very prediction problem in the context of undersea quakes causing landslides that cause tsunamis. I'm sure it would bear fruit to get in touch with some university researchers working on these problems if you want to take your research to the next level.

I've considered it, though I'm not sure how seriously they'd take me, given that my background is computer science, and I'm not currently attached to any university or institution. I figure I'd also need to do a more proper literature review and such so I know who to contact. I've actually also been thinking I should get in touch with my old supervisor and maybe some other university researchers in machine learning to help me figure things out. Given how I wasn't really expecting this thing to work, I didn't want to waste people's time if it turned out that the problem was intractable.

In yet another example of the weirdness of my network, the current best result (by validation error) is:

Training error: 92.67%
Validation error: 73.66%
Test error: 69.39%

Oddly with that run, the training error fell to a low of 80.62% but the validation error increased to 75.63% for that epoch.

Brian Tomasik · by **Brian Tomasik** on 2014-08-24T00:17:00

Darklight wrote: Each hidden node in this kind of network is roughly equivalent to a Perceptron on the input. The added output layer then basically functions as a bunch of Perceptrons on the hidden layer. Even with just one hidden node, this means that the input travels through a non-linear function (sigmoidal in this case) before it reaches the output nodes, and so it's not quite the same thing as what you describe.

Yeah, what I had in mind is that each output is something like y = f( w * f( sum_i xi * wi ) ) , where f is sigmoid? And then if we define g = f( w * f() ), this says y = g( linear sum of inputs ), i.e., g is the link function in the generalized linear model? Of course, that's a more complicated g than a sigmoid.

Darklight wrote: Each output node essentially computes the estimated magnitude of a quake at that location given the value from the hidden node.

Makes sense.

peterhurford · by **peterhurford** on 2014-08-24T14:41:00

Did you get this one?

Darklight · by **Darklight** on 2014-08-25T03:40:00

Brian Tomasik wrote: Yeah, what I had in mind is that each output is something like y = f( w * f( sum_i xi * wi ) ) , where f is sigmoid? And then if we define g = f( w * f() ), this says y = g( linear sum of inputs ), i.e., g is the link function in the generalized linear model? Of course, that's a more complicated g than a sigmoid.

That sounds about right. To be honest, my mathematical acumen isn't terrific, so I got confused about what a Generalized Linear Model actually was.

peterhurford wrote: Did you get this one?

Alas, I haven't set things up to actually predict future earthquakes quite yet. Right now the experiments I've been doing all involve the earthquake data from between 1973 and 2000. I still need to download and parse the data for 2001-2014. Though I suppose I could have just input yesterday's data into the network to see if it predicts today's.

I also still need to write up a script to convert the prediction matrix into an image that overlays the predictions onto a map of the world so I can actually visualize the predictions, and also a script to parse the prediction matrix into a list of earthquake locations and magnitudes. There's lots of little things I need to do before this thing will be ready for prime time.

peterhurford · by **peterhurford** on 2014-08-25T13:26:00

I hope you're successful, so that I can tell all my friends, "Yeah, that's the guy that figured out how to predict earthquakes more accurately than anyone before. Yeah, he figured it out on the forum that I moderate. Yeah, I did my best to try and talk him out of it."

Darklight · by **Darklight** on 2014-08-25T17:50:00

peterhurford wrote: I hope you're successful, so that I can tell all my friends, "Yeah, that's the guy that figured out how to predict earthquakes more accurately than anyone before. Yeah, he figured it out on the forum that I moderate. Yeah, I did my best to try and talk him out of it."

XD Thanks.

Though I have to report some bad news, which is that I did some analysis on what the network was actually predicting, and it looks like it's mostly predicting smaller earthquakes around magnitude 3.0-6.0, and isn't making much of any predictions in the 6.0-10.0 range.

peterhurford · by **peterhurford** on 2014-08-26T01:02:00

"Earthquake Early Warning System Gave 10 Second Alert Before Napa Quake Felt"

Brian Tomasik · by **Brian Tomasik** on 2014-08-26T03:04:00

Darklight wrote: so I got confused about what a Generalized Linear Model actually was.

I forgot the exact definition too, but that's what Wikipedia is for.

Darklight · by **Darklight** on 2014-08-26T18:52:00

"Earthquake Early Warning System Gave 10 Second Alert Before Napa Quake Felt"

Nice!

I'm in the process of putting together some code that will finally allow me to make proper future predictions. Apparently the network actually works astonishingly well on magnitudes around 4.0, but doesn't seem to predict anything beyond magnitude 6.0. I still need to check to see if it manages to predict the location of the earthquakes and just gets the magnitude wrong, because then at least, it might still be somewhat useful.

Brian Tomasik wrote: I forgot the exact definition too, but that's what Wikipedia is for.

Even -after- reading the Wikipedia article, I'm still confused. XD

Darklight · by **Darklight** on 2014-08-28T01:18:00

So, I finally got around to downloading and creating a dataset of all the recorded earthquakes from 1900 to 2014. Hopefully with this larger dataset, I can get more accurate predictions. Though most of the data from before 1973 is rather sparse, so it may not be -that- useful. On the other hand, the data from 2001 to 2014 should be of some use...

Also, and this is VERY PRELIMINARY, but I've put together a website that shows the predictions for the next 24 hours and have a script on my computer that automatically updates that website once a day:

http://www.earthquakepredictor.net

DanielLC · by **DanielLC** on 2014-08-28T18:06:00

How exactly do you measure the false positive rate? There's a huge number of earthquakes that don't happen.

peterhurford · by **peterhurford** on 2014-08-28T20:31:00

It would be cool if the website also steadily collected real earthquake data, and tracked how the prediction measured to the actual.

Darklight · by **Darklight** on 2014-08-28T20:49:00

DanielLC wrote: How exactly do you measure the false positive rate? There's a huge number of earthquakes that don't happen.

What I'm calling the "False Positive Rate" is simply the number of times a prediction above a certain cutoff magnitude is made where no earthquake above that magnitude actually occurs, divided by the total number of predictions made above the cutoff magnitude.
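The ratio defined above might be computed along these lines. This is a sketch under assumptions: the function name is made up, predictions are matched to actual quakes cell by cell, and 0.0 is returned when no prediction exceeds the cutoff. Statistics texts usually call this particular ratio (false alarms divided by all alarms) the false discovery rate; a textbook false positive rate would divide by the number of cells where no such quake occurred instead.

```python
import numpy as np

def false_positive_rate(predicted, actual, cutoff):
    """Share of above-cutoff predictions with no above-cutoff quake in that cell."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    alarms = predicted > cutoff
    if not alarms.any():
        return 0.0  # no predictions above the cutoff; returning 0 is an assumption
    false_alarms = alarms & ~(actual > cutoff)
    return float(false_alarms.sum() / alarms.sum())

# Made-up magnitudes for four grid cells.
predicted = np.array([5.2, 4.8, 0.0, 6.1])
actual = np.array([5.0, 0.0, 5.5, 3.0])
print(false_positive_rate(predicted, actual, cutoff=4.5))  # 2 of 3 alarms are false
```

Dividing by the number of alarms rather than by the number of quiet cells keeps the huge count of earthquakes that never happen out of the denominator.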

I actually made a mistake with my initial false positive rate calculations (alas, it was too good to be true), and have revised the website accordingly.

peterhurford wrote: It would be cool if the website also steadily collected real earthquake data, and tracked how the prediction measured to the actual.

Well, right now the way it updates is that I have a Task Scheduler on my computer scheduled to run a Python script every 24 hours that automatically downloads the latest earthquake data and runs it through the network and some more scripts to get the map and list that you see on the site. I could possibly implement something to also create a "Past Predictions" map that shows the previous predictions and the actual earthquakes and where they overlap. But that requires a bunch of additional work. Making it update in real-time would probably require periodic Cron Jobs on the server. I'm not sure that the shared hosting server that I'm using now has the necessary Python libraries to make this implementation feasible.

And anyway, right now I'm more focused on trying to work out the issues with the network itself, and get it to the point where the predictions are accurate enough that the website will actually be worth looking at.

Darklight · by **Darklight** on 2014-08-29T14:40:00

Some bad news... upon closer inspection, it looks like my neural network earthquake predictor is broken... or rather, it cleverly just predicts the average magnitude all the time and due to the way my error calculations worked with a cutoff of magnitude 3.0 and a range of 1.0, it found a sweet spot around 2.9 to predict nearly all the time, everywhere. Unfortunately this means that while it gets a bunch of predictions right by chance, it's not really predicting anything useful. Well, back to the drawing board. -_-;;;
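To make the loophole concrete, here is a sketch that scores a constant 2.9 prediction with the error rule described earlier in the thread (cutoff 3.0, tolerance 1.0). The function is a reconstruction for illustration, not the original code, and the day of quakes is made up.

```python
import numpy as np

def error_rate(predicted, actual, cutoff=3.0, tolerance=1.0):
    # Score only cells where a quake >= cutoff was predicted or occurred.
    scored = (predicted >= cutoff) | (actual >= cutoff)
    wrong = np.abs(predicted - actual)[scored] > tolerance
    return float(wrong.mean())

# Made-up day: 995 empty cells plus five quakes.
actual = np.zeros(1000)
actual[:5] = [3.1, 3.4, 3.8, 4.5, 6.2]

# Predict 2.9 everywhere, with no information about the input at all.
constant = np.full_like(actual, 2.9)
print(error_rate(constant, actual))  # 0.4
```

Because 2.9 sits just under the cutoff, the 995 empty cells are never scored as false alarms, while every real quake between 3.0 and 3.9 counts as a hit. Since small quakes vastly outnumber large ones, a constant guess can look like a working predictor under this metric.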

Sorry to get people's hopes up. I guess this explains why it seemed to work best with just one hidden node.

Given that I put two weeks of my life into this and bought the domain name and everything, I figure I'll keep trying with different hyper-parameters, like more hidden nodes, but if things aren't promising, I may eventually consider it all a sunk cost, and shelve this project. Well, it was fun to think I could save the world so easily while it lasted, but I guess reality isn't like that.

-sigh-

DanielLC · by **DanielLC** on 2014-08-29T20:56:00

I had a feeling it was something like that. Any idea why the linear model didn't do that?

Darklight · by **Darklight** on 2014-08-30T03:59:00

DanielLC wrote: I had a feeling it was something like that. Any idea why the linear model didn't do that?

I'm not sure, but it seems like the linear model just overfitted or memorized the training data. The way the neural network worked, the single hidden node acted as a bottleneck that had to find some way to compress the input, and apparently it found the best way to do that was to learn what appears to be the average magnitude of all the earthquakes. Or something like that.

peterhurford · by **peterhurford** on 2014-08-30T14:29:00

Did your training set include things that were not earthquakes?

Darklight · by **Darklight** on 2014-09-01T03:31:00

peterhurford wrote: Did your training set include things that were not earthquakes?

If by that you mean other kinds of events, then no, the only thing I trained it on was the latitude, longitude, and magnitude of earthquakes. However, the way the training set was set up was by creating these 360 x 180 images where earthquakes were pixels of brightness equivalent to their magnitude (magnitude divided by ten to create a range of 0-1, and then multiplied by 255 to have a pixel value range of 0-255). This means that everywhere there was no earthquake the pixel was black (zero). So in that sense it did have non-events incorporated into the training set in this way.

DanielLC · by **DanielLC** on 2014-09-01T04:39:00

But no earthquake would be negative infinity on the Richter scale. Zero is still a tenth as strong as one.
