Posts by Ramil (146)

raML - Back at it!

Progress (a couple of days later): Finally got to implementing mini-batches to enable faster training. I was observing very strange behavior, and only later realized I had forgotten to change the Sigmoid to an Identity activation in the output layer, so no wonder the model didn't exactly give good predictions for housing prices :P Anyway, after fixing some things, here is a mini-batch training run on the MNIST dataset ("but Ramil, it doesn't make sense to use MSE here", yes, yes, but I don't have Softmax implemented correctly yet). This is the fit going once through all the data with batches of size 32. The good thing is that the loss curve looks noisy, as it should. Reminder to future me: the next major goal to speed things up is to write convolutions.
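For reference, a mini-batch epoch boils down to shuffling indices and slicing. Here is a minimal standalone numpy sketch of the idea (this is not raML's actual API; the data is random stand-in data, and the training step is left as a placeholder):

```python
import numpy as np

def iterate_minibatches(X, y, batch_size=32, shuffle=True):
    """Yield (X_batch, y_batch) pairs covering the dataset once."""
    indices = np.arange(len(X))
    if shuffle:
        np.random.shuffle(indices)
    for start in range(0, len(X), batch_size):
        batch = indices[start:start + batch_size]
        yield X[batch], y[batch]

# Random stand-in data with MNIST-like shapes (much smaller for the demo).
X_train = np.random.rand(1024, 784)
y_train = np.random.randint(0, 10, size=1024)

for X_batch, y_batch in iterate_minibatches(X_train, y_train, batch_size=32):
    # A forward/backward pass would run here. Each batch gives a noisy
    # gradient estimate, which is exactly why the loss curve looks jagged.
    pass
```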

Life Update

As the saying goes, life happens all at once (is it even a saying, actually?). Anyway, my brain CPU is loaded thinking about many different things that will impact me and my family. So, unfortunately, raML is put on a short hold. The good news, however, is that it now has a logo :) because it got to the point where it's not worth being included in the Development section of the website.

Philosophy/Psychology. Saying no isn't easy. Often we are either overly excited or too scared to make rational decisions, but it's important to try to strip away as much human bias as possible. I can't remember the last time my initial reaction matched what I ended up doing after some sober thinking. On the other hand, there are times when the most rational thing to do is to act spontaneously and grab the opportunity by what seems to be its tail. And yes, you will be grabbing a lot of things that aren't tails at all, but you have to train yourself to recognize when a rushed decision is necessary. Sounds like a contradiction? Basically, the rule is: default to making sober decisions, but train your spider sense to know when that's not possible.

raML - Next Actual Milestone

So yesterday I decided to try writing a raML model for the famous MNIST dataset. I added one-hot encoding, generalized some cost functions, and ran it as is. Since I don't have categorical_crossentropy implemented, I just used MSE for now. And since I don't have batches, the model took quite some time to train (a couple of minutes), which I wasn't happy about. So today's goals are:
1. Time Profiling
2. Softmax
3. Categorical crossentropy
4. Batches

Progress: Did the time profiling; everything actually seems very reasonable besides data download (on the slow side) and training (because of the lack of batches). Added softmax (but still testing). Vectorized derivatives!! Added caching for the datasets! Wow!
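To make the one-hot + softmax + MSE combination concrete, here is a self-contained numpy sketch of the pieces involved (my illustration, not raML's actual code; the max-shift in softmax is the standard numerical-stability trick):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Convert integer labels of shape (N,) to one-hot rows (N, num_classes)."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def softmax(z):
    """Row-wise softmax, shifted by the row max so exp() can't overflow."""
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# With one-hot targets, MSE works as a (suboptimal) stand-in for
# categorical cross-entropy until the latter is implemented.
labels = np.array([3, 0, 4])
targets = one_hot(labels, num_classes=10)
probs = softmax(np.random.randn(3, 10))
mse = ((probs - targets) ** 2).mean()
```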

raML - Goals Update

Big:
1. Optimizers
2. Kernel Weight Regularization
3. Validation
4. Lambda Layer (harder to do efficiently than I thought; might need to use sympy)

Medium:
1. PyPi

Update: Restructured raML as a package - can (almost) pip install it now :) Oh man, that looks nice!

Update 2: Done with basic Validation. Added (smart) dataset splitting, validation, and validation plots (the generic splitting idea is sketched below).
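What "smart" splitting means here is raML-specific, but the generic train/validation split it builds on can be sketched in a few lines of numpy (again an illustration, not the library's code):

```python
import numpy as np

def train_val_split(X, y, val_fraction=0.2, seed=0):
    """Shuffle, then hold out the last val_fraction of samples for validation."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    cut = int(len(X) * (1 - val_fraction))
    train_idx, val_idx = indices[:cut], indices[cut:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

# Shapes loosely match the wine-quality-style setup from the next post.
X = np.random.rand(100, 11)
y = np.random.rand(100, 1)
X_train, y_train, X_val, y_val = train_val_split(X, y, val_fraction=0.2)
```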

3 Images for Why Weight and Data Normalization are Important

Setup: a deep neural net with 4 Dense layers of sizes (100, 20, 10, 1), trained on 4898 samples with 11 features for 1000 epochs, with ReLU activations.

1. Bad weight initialization (sampled uniformly from the [0, 1] interval) and no data scaling (preprocessing layer). The plot cuts off because gradient explosion leads to NaN values.
2. Normalized weights but still no data normalization. The gradient explodes a bit later.
3. Healthy training plot. The input data was normalized and the weights were sampled from the appropriate normal distributions.

Tested also with deeper stacks: the gradient begins to explode at around 13 layers. I think an initialization that adjusts the variance based on layer depth could also help, but honestly, it's good enough for now.
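As a side-by-side illustration of why the first image blows up, here is a numpy sketch contrasting uniform [0, 1] initialization with He-style initialization (normal with variance 2/fan_in, the standard choice for ReLU). This is my own demo under the post's layer sizes, not raML's code; it compares forward activation scale rather than gradients, but the mechanism is the same:

```python
import numpy as np

def he_init(fan_in, fan_out, rng=np.random.default_rng(0)):
    """He initialization: std sqrt(2/fan_in) keeps ReLU activation
    magnitudes roughly constant from layer to layer."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def bad_init(fan_in, fan_out, rng=np.random.default_rng(0)):
    """Uniform [0, 1): all-positive weights with mean 0.5 make
    pre-activations grow with depth, which is what exploded above."""
    return rng.random((fan_in, fan_out))

# Push normalized inputs through the post's Dense stack and compare scales.
x = np.random.default_rng(1).normal(size=(32, 11))
sizes = [11, 100, 20, 10, 1]
for init in (he_init, bad_init):
    h = x
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        h = np.maximum(h @ init(fan_in, fan_out), 0.0)  # Dense + ReLU
    print(init.__name__, float(np.abs(h).mean()))
```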
