raML - Near Goals

Big:
1. Model compilation
2. Validation
3. Optimizers

Small:
1. Lambda Layer
2. Data normalization as a layer (maybe?)

Progress so far: implemented model compilation. Now, creating a deep neural network is as easy as it is in Keras:

    model = Sequential([
        Dense(size=3, input_shape=X.shape),
        Dense(size=1, activation=Sigmoid)
    ])
    model.compile(cost=MSE(), metrics=[RMSE()])

Looks just like Keras, you say? Well, good, because Keras does model creation the right way! I've also added ReLU, but I'm still testing to make sure it's working right. This actually made me realize I should organize optimizers!

Update: After investigating, I found that the ReLU problem is most likely exploding gradients. Didn't expect those to show up this early!

Update 2: Oh, this is so cool! After running into exploding gradients in a relatively small network, I knew the culprit probably wasn't the learning rate (although making it smaller did help), but rather the weight initialization. That's actually worth a separate blog post, but in short: I used to sample weights from a uniform [0, 1] distribution, and it's much better to sample from a normal distribution centered at 0. (Note: that doesn't fully solve it; for best performance, you also need to account for the variance, which should depend on the layer's size.)
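Here's a rough sketch of that initialization change in plain NumPy; the function names and the exact fan-in scaling are my illustration, not necessarily what raML ends up doing:

    import numpy as np

    def init_uniform(n_in, n_out):
        # Old approach: uniform [0, 1) weights -- all positive and unscaled,
        # so activation magnitudes (and hence gradients) grow layer after layer.
        return np.random.rand(n_in, n_out)

    def init_scaled_normal(n_in, n_out):
        # New approach: zero-centered normal scaled by fan-in (Xavier-style),
        # which keeps the activation variance roughly constant across layers.
        return np.random.randn(n_in, n_out) * np.sqrt(1.0 / n_in)

    # Quick check: push random inputs through five linear layers
    # and watch how the activation scale evolves under each init.
    x = np.random.randn(32, 64)
    for init in (init_uniform, init_scaled_normal):
        h = x
        for _ in range(5):
            h = h @ init(h.shape[1], 64)
        print(init.__name__, round(float(h.std()), 3))

Stacking even a few layers makes the difference obvious: the activation scale explodes with the uniform init and stays roughly constant with the zero-centered, scaled one.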


Morning Goal

Let's see if I can stick to this new rubric. (Ramil from the future: "No")

raML:
1. Lambda Layer
2. Implement more datasets
3. Add more cost functions (RMSE)
4. Come up with a better DNN model creation procedure

Update. Progress so far: added more datasets, added metrics, and improved the tqdm output (the progress bar in the terminal that tracks training progress). Here we have the MSE loss and the RMSE metric tracked while training a model on the Swedish Auto Insurance dataset. So beautiful.
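For flavor, here's a minimal sketch of how an RMSE metric might be tracked alongside the loss in a tqdm progress bar on a toy linear model; the class and names are my illustration, not raML's actual API:

    import numpy as np
    from tqdm import trange

    class RMSE:
        # Root mean squared error, reported as a metric each epoch.
        name = "rmse"
        def __call__(self, y_true, y_pred):
            return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

    # Toy one-feature linear model trained with gradient descent on MSE.
    rng = np.random.default_rng(0)
    X = rng.normal(size=100)
    y = 3.0 * X + rng.normal(scale=0.1, size=100)
    w, b, lr, metric = 0.0, 0.0, 0.1, RMSE()

    bar = trange(200)
    for _ in bar:
        y_pred = w * X + b
        err = y_pred - y
        w -= lr * 2 * np.mean(err * X)   # d(MSE)/dw
        b -= lr * 2 * np.mean(err)       # d(MSE)/db
        # Show both the loss and the metric in the progress bar.
        bar.set_postfix(mse=float(np.mean(err ** 2)), rmse=metric(y, y_pred))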


Human Interactions are Hard!

It's never clear what the right thing to say is in the moment, especially when your words can and will be used against you. Stress, man, stress!

On a different note, the raML project (yep, what an awesome name) is going great! Here is a sigmoid trained with MSE. Yeah, yeah, I shouldn't use MSE for logistic regression, but that's not the point!

Update: Alright, fine! To all the (non-existent) haters, I've added the CrossEntropy loss. Kids, the demo below is why you should use an appropriate loss function: MSE after 100k epochs is only about as good as CrossEntropy after 10k! Whoa, that's cool! (Note to skeptics: yes, I've also compared both at 10k epochs, and MSE is much worse. Note to skeptics^2: yes, all initial conditions were the same, stop doubting!)
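For intuition on why cross-entropy wins here: with a sigmoid output, the MSE gradient picks up an extra sigma'(z) factor that vanishes when the unit is saturated but wrong, while the cross-entropy gradient is simply (prediction - target). A quick standalone check (plain NumPy, not raML code):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def mse_grad_wrt_z(z, y):
        # d/dz of 0.5 * (sigmoid(z) - y)^2: the p * (1 - p) factor
        # shrinks the gradient whenever the sigmoid saturates.
        p = sigmoid(z)
        return (p - y) * p * (1.0 - p)

    def cross_entropy_grad_wrt_z(z, y):
        # d/dz of -(y*log(p) + (1-y)*log(1-p)) with p = sigmoid(z):
        # the sigmoid-derivative factor cancels, leaving a clean error signal.
        return sigmoid(z) - y

    # A confidently wrong prediction (z = -8, true label 1):
    print(mse_grad_wrt_z(-8.0, 1.0))            # ~ -0.0003, learning stalls
    print(cross_entropy_grad_wrt_z(-8.0, 1.0))  # ~ -0.9997, strong correction

That tiny MSE gradient on badly wrong predictions is exactly why it needs roughly ten times as many epochs to catch up.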


Yay

The ML framework is coming along! I'm thinking deeply and carefully about all of the math and CS involved, and it's looking really good so far. I wrote the Dense layer, plus the logic for the forward pass and backpropagation. Next up: activation functions and better cost functions! Oh, and of course, since I have the generalized Dense layer coded, I'm gonna write the Sequential model wrapper. NO, I'M NOT STEALING FROM KERAS. Here's a screenshot of me training a simple linear regression.
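For a sense of what that involves, here's a minimal Dense-layer sketch with a forward pass and backpropagation; the method names and caching details are my own illustration, not necessarily how raML lays it out:

    import numpy as np

    class Dense:
        def __init__(self, n_in, n_out):
            # Small zero-centered random init.
            self.W = np.random.randn(n_in, n_out) * 0.01
            self.b = np.zeros(n_out)

        def forward(self, x):
            self.x = x                      # cache the input for the backward pass
            return x @ self.W + self.b

        def backward(self, grad_out, lr):
            # Gradients of the loss w.r.t. the parameters and the layer input.
            grad_W = self.x.T @ grad_out
            grad_b = grad_out.sum(axis=0)
            grad_in = grad_out @ self.W.T
            # Plain gradient-descent update.
            self.W -= lr * grad_W
            self.b -= lr * grad_b
            return grad_in                  # passed on to the previous layer

    # Tiny usage example: one step of linear regression with an MSE loss.
    layer = Dense(1, 1)
    X = np.array([[0.0], [1.0], [2.0]])
    y = np.array([[1.0], [3.0], [5.0]])
    pred = layer.forward(X)
    grad = 2 * (pred - y) / len(X)          # d(MSE)/d(pred)
    layer.backward(grad, lr=0.1)

A Sequential wrapper then just calls forward() over the layers in order and backward() in reverse, threading the returned gradient through.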
