Into Data Science
About Me
- Rahul Kumar
- Hi! I am Rahul. After years of bouncing around different sectors, I now specialize in Python, Machine Learning, Deep Learning, NLP, and Statistics. As a technology lover, I strongly believe: 'Nothing can stress or stop you from achieving your dreams if you cherish hope more than your fears.'
Monday, July 12, 2021
Problems with Sigmoid and Tanh activation functions
Saturday, July 10, 2021
Backprop: What you need to know
1. Gradients are important:
- If it's differentiable, we can probably learn on it.
2. Gradients can vanish:
- Each additional layer can successively reduce signal vs noise.
- ReLUs are useful here.
3. Gradients can explode:
- Learning rates are important here.
- Batch normalisation can help.
4. ReLU layers can die:
- Keep calm and lower your learning rates.
Source: Machine Learning Crash Course by Google.
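The vanishing-gradient point above can be made concrete with a toy NumPy sketch (my own illustration, not from the course): in a deep chain of one-unit layers, each sigmoid layer multiplies the gradient by a local derivative of at most 0.25, so the product shrinks rapidly with depth, while a ReLU's derivative is either 1 (signal passes through) or exactly 0 (the "dead ReLU" case).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_magnitude(depth, activation="sigmoid", rng=None):
    """Product of per-layer derivative factors in a 1-unit-per-layer chain."""
    if rng is None:
        rng = np.random.default_rng(0)   # fixed seed for reproducibility
    grad = 1.0
    for _ in range(depth):
        z = rng.normal()                 # pre-activation at this layer
        w = rng.normal()                 # weight at this layer
        if activation == "sigmoid":
            local = sigmoid(z) * (1 - sigmoid(z))   # at most 0.25
        else:  # ReLU: derivative is 1 for z > 0, else 0 (a "dead" path)
            local = 1.0 if z > 0 else 0.0
        grad *= w * local
    return abs(grad)

print("depth  2, sigmoid:", grad_magnitude(2, "sigmoid"))
print("depth 20, sigmoid:", grad_magnitude(20, "sigmoid"))
```

With 20 sigmoid layers the gradient magnitude is typically many orders of magnitude smaller than with 2, which is exactly why ReLUs (and careful learning rates) matter in deep stacks.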
Thursday, June 24, 2021
Few rules of thumb for Hyperparameter Tuning
Most machine learning problems require a lot of hyperparameter tuning. Unfortunately, we can't provide concrete tuning rules for every model. Lowering the learning rate can help one model converge efficiently but make another model converge much too slowly. You must experiment to find the best set of hyperparameters for your dataset. That said, here are a few rules of thumb:
- Training loss should steadily decrease, steeply at first, and then more slowly until the slope of the curve reaches or approaches zero.
- If the training loss does not converge, train for more epochs.
- If the training loss decreases too slowly, increase the learning rate. Note that setting the learning rate too high may also prevent training loss from converging.
- If the training loss varies wildly (that is, the training loss jumps around), decrease the learning rate.
- Lowering the learning rate while increasing the number of epochs or the batch size is often a good combination.
- Setting the batch size to a very small value can also cause instability. First, try larger batch sizes; then decrease the batch size until you see degradation.
- For real-world datasets consisting of a very large number of examples, the entire dataset might not fit into memory. In such cases, you'll need to reduce the batch size to enable a batch to fit into memory.
Remember: the ideal combination of hyperparameters is data-dependent, so you must always experiment and verify.
Source: Machine Learning Crash Course by Google
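The learning-rate rules of thumb above are easy to see on a toy problem. This is my own minimal sketch (not from the course): plain gradient descent on f(w) = (w - 3)^2, where a tiny learning rate converges too slowly, a moderate one converges cleanly, and a too-large one makes the loss jump around and diverge.

```python
def descend(lr, steps=100, w0=0.0):
    """Minimize f(w) = (w - 3)^2 with plain gradient descent."""
    w = w0
    for _ in range(steps):
        grad = 2 * (w - 3)   # df/dw
        w -= lr * grad
    return w

for lr in (0.001, 0.1, 1.1):
    w = descend(lr)
    print(f"lr={lr}: final w = {w:.3f}, loss = {(w - 3) ** 2:.3e}")
```

With lr=0.001 the loss is still far from zero after 100 steps (too slow); with lr=0.1 it converges to w = 3; with lr=1.1 each step overshoots the minimum and the loss explodes.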
Sunday, April 11, 2021
Advantages & Disadvantages of Sequential and Functional APIs
Both the Sequential API and the Functional API are declarative.
It means you start by declaring which layers you want to use and how they should be connected, and only then can you start feeding the model data for training or inference.
So the advantages of this approach are:
- The model can easily be saved, cloned, and shared.
- Its structure can be displayed and analyzed.
- The framework can infer shapes and check types, so errors can be caught easily (i.e. before any data ever goes through the model).
- It's also fairly easy to debug since the whole model is a static graph of layers.
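To illustrate the "errors caught before any data flows" advantage, here is a toy sketch in plain Python (not Keras itself; `Dense` and `Model` here are hypothetical stand-ins): because the whole model is declared up front as a static graph, shape mismatches can be detected at build time, before training ever starts.

```python
class Dense:
    """Toy layer that only declares its input/output sizes."""
    def __init__(self, in_dim, out_dim):
        self.in_dim, self.out_dim = in_dim, out_dim

class Model:
    """Declarative model: validates the layer graph at build time."""
    def __init__(self, layers):
        # Because the structure is static, a shape mismatch is caught
        # here, before any data ever goes through the model.
        for a, b in zip(layers, layers[1:]):
            if a.out_dim != b.in_dim:
                raise ValueError(f"shape mismatch: {a.out_dim} -> {b.in_dim}")
        self.layers = layers

Model([Dense(4, 16), Dense(16, 1)])        # builds fine
try:
    Model([Dense(4, 16), Dense(8, 1)])     # caught immediately
except ValueError as e:
    print("caught at build time:", e)
```

Keras does essentially this kind of shape inference and type checking when you declare a Sequential or Functional model, which is also what makes such models easy to save, display, and debug.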
Friday, April 2, 2021
Why do we need activation functions?
We use a lot of activation functions in our daily projects, but do we really know why we need an activation function in the first place?
Reason 1:
Well, if you chain several linear transformations, all you get is a linear transformation.
For example, if f(x) = 2x + 3 and g(x) = 5x - 1, then chaining these two linear functions gives you another linear function: f(g(x)) = 2(5x-1)+3 = 10x + 1. So if you don't have some nonlinearity between layers, then even a deep stack of layers is equivalent to a single layer, and you can't solve very complex problems with that.
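The collapse above is easy to verify numerically. A quick sketch checking that f(g(x)) really equals the single linear function 10x + 1 at a few points:

```python
# f(x) = 2x + 3 and g(x) = 5x - 1 chained together...
f = lambda x: 2 * x + 3
g = lambda x: 5 * x - 1
# ...collapse to one linear function: f(g(x)) = 2(5x - 1) + 3 = 10x + 1
h = lambda x: 10 * x + 1

for x in (-2.0, 0.0, 7.5):
    assert f(g(x)) == h(x)
print("two stacked linear layers == one linear layer (10x + 1)")
```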
Reason 2:
If you want to guarantee that the output will always be positive, then you can use the ReLU activation function in the output layer. Alternatively, you can use the "softplus" activation function, which is a smooth variant of ReLU: softplus(z) = log(1+exp(z)). It is close to 0 when z is negative and close to z when z is positive.
Finally, if you want to guarantee that the predictions will fall within a given range of values, then you can use the logistic function or the hyperbolic tangent, and then scale the labels to the appropriate range: 0 to 1 for the logistic function and -1 to 1 for the hyperbolic tangent.
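The output ranges described above can be checked directly. A small sketch of the four output activations mentioned (using only the standard library; note `math.exp` in this simple softplus would overflow for very large z, so it is kept to modest inputs):

```python
import math

def relu(z):     return max(0.0, z)
def softplus(z): return math.log1p(math.exp(z))   # log(1 + e^z), smooth ReLU
def sigmoid(z):  return 1.0 / (1.0 + math.exp(-z))

for z in (-5.0, 0.0, 5.0):
    print(f"z={z:+.1f}  relu={relu(z):.4f}  softplus={softplus(z):.4f}  "
          f"sigmoid={sigmoid(z):.4f}  tanh={math.tanh(z):.4f}")
```

ReLU and softplus are always non-negative (softplus is close to 0 for negative z and close to z for positive z), while sigmoid stays in (0, 1) and tanh in (-1, 1), matching the label-scaling advice above.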
Source: Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, Chapter-10
Thursday, March 11, 2021
Atomic Habits - Bullet key summary
Source: https://images.app.goo.gl/x4FKRsTHtgAHeEx27
- Designing your environment for success:
- Make the cues for your desired habits visible in your work/home environment.
- The TWO-minute rule to start a habit:
- Start any desired habit by dividing it into small segments and completing the segments that take only two minutes.
- Master the entry point (or first step) of the habit.
- Join a community where your desired habit is the normal behavior.
- Use variable rewards:
- Most bad habits have immediate rewards and long-term consequences. Most good habits are the exact opposite.
- Create a reward system.
Tuesday, January 5, 2021
Begin the Journey with Me to Become a Data Scientist
Hi Reader,
This is my first blog post in the field of Data Science and Machine Learning. Before going any further, let me introduce myself. I'm Rahul Kumar, born and raised in Delhi, India. I did my B.Tech in Mechanical Engineering at Delhi Technological University and completed my M.Tech in Renewable Energy at the Indian Institute of Technology Roorkee.
Currently, I have started working at a company whose focus is forecasting solar generation, wind generation, and electricity load using machine learning algorithms.
Before joining this organization, I learned basic Python, introductory machine learning, and statistics. I am still learning, but now the learning curve grows exponentially with time.
Let's picture our journey as a sigmoid curve: I am at the bottom, with knowledge of Python, machine learning, and basic statistics.
We have to reach the top in a very short period of time. For me, that's one year of complete dedication to this.
So let's start this journey with me. I'll keep posting blogs on everything I learn.
I am attaching a Google Sheet where all the resource links are listed. I request you to add more to it.
- Rahul Kumar
Links:
1. https://docs.google.com/spreadsheets/d/1glbDGgU46JZtlqNaX6AiQRx7saj8mWoLoGbXwhtjsIA/edit?usp=sharing