1. Given the activation function of the output layer, what is the recommended loss function in each case below?

    Activation       Loss function
    sigmoid          A.: binary cross-entropy
    softmax          A.: categorical cross-entropy
    none (linear)    A.: MSE (mean squared error)
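
The three pairings above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names, the `eps` clamp, and the sample inputs are assumptions chosen for this example, and a real framework would provide its own, more numerically careful versions. Each pairing corresponds to maximum likelihood under a matching output distribution (Bernoulli for sigmoid, categorical for softmax, Gaussian for a linear output).

```python
import math


def sigmoid(z):
    # Numerically stable logistic function for a single logit.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softmax(logits):
    # Subtract the max logit before exponentiating to avoid overflow.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_cross_entropy(p, y, eps=1e-12):
    # Pairs with a sigmoid output; y is 0 or 1.
    # eps (an assumption of this sketch) keeps log() away from 0.
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def categorical_cross_entropy(probs, onehot, eps=1e-12):
    # Pairs with a softmax output; onehot marks the true class.
    return -sum(t * math.log(max(p, eps)) for p, t in zip(probs, onehot))


def mean_squared_error(y_hat, y):
    # Pairs with a linear (no activation) output for regression.
    return sum((a - b) ** 2 for a, b in zip(y_hat, y)) / len(y)


# Hypothetical raw network outputs, one per case in the table.
bce = binary_cross_entropy(sigmoid(0.0), 1)
cce = categorical_cross_entropy(softmax([0.0, 0.0, 0.0]), [0, 1, 0])
mse = mean_squared_error([1.0, 2.0], [1.0, 4.0])
```

With an uninformative logit of 0, the sigmoid case gives a loss of ln 2, and three equal logits give ln 3 in the softmax case, which is the loss of a uniform guess over the classes.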