The main problem with neural networks has always been the difficulty of interpreting the trained model. Unlike gradient boosting or random forests, neural networks rely on compositions of mathematical functions that cannot be easily translated into intuitive human understanding.
"Do you mean GB & RF are interpretable because they can generate important variable list?"
No, the generated variable importance means nothing, as it's built through a heuristic that doesn't guarantee its output is aligned with what the actual GB or RF model does. What I mean is that the algorithm underlying GB and RF, the decision tree, actually has a non-mathematical meaning. For example, a tree can say: if x1 > 700 and x2 == False, then y = 4.73. That has an exactly interpretable meaning. While GB and RF ensemble many trees together in some mathematical fashion, fundamentally each individual tree is easy to interpret, and the process of building it is intuitive as well. NNs, on the other hand, are just layers of more or less linear combinations, where the outputs of one layer are combined linearly to feed into the next layer. If you only had one layer, the model would be more or less a linear regression and easy to interpret, but with multiple layers there is no intuitive way to read meaning off the lower layers and their weights.
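To make the contrast concrete, here is a minimal sketch in plain Python. The tree rule and the network weights are made-up illustrative numbers, not output from any real trained model. The point is that the tree's prediction is a chain of readable conditions, while the network's prediction is buried in weighted sums.

```python
# A hand-written decision tree: every prediction is a chain of
# human-readable conditions (thresholds and values are illustrative).
def tree_predict(x1, x2):
    if x1 > 700 and x2 is False:
        return 4.73   # "if x1 > 700 and x2 == False then y = 4.73"
    elif x1 > 700:
        return 2.10
    else:
        return 0.55

# A tiny two-layer neural network on the same inputs: the prediction
# flows through weighted sums and a nonlinearity, so there is no
# single rule to read off. (Weights are invented for illustration.)
W1 = [[0.8, -1.2], [0.3, 0.9]]  # one row of weights per hidden unit
b1 = [0.1, -0.4]                # hidden-layer biases
W2 = [1.5, -0.7]                # output-layer weights
b2 = 0.2                        # output bias

def relu(v):
    return v if v > 0 else 0.0

def nn_predict(x1, x2):
    inputs = [x1, float(x2)]
    hidden = [relu(row[0] * inputs[0] + row[1] * inputs[1] + b)
              for row, b in zip(W1, b1)]
    return W2[0] * hidden[0] + W2[1] * hidden[1] + b2
```

Both functions map (x1, x2) to a number, but only `tree_predict` can be explained to a non-specialist line by line; asking "why did `nn_predict` return that value?" has no answer shorter than the arithmetic itself.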