Add a MAX_GRAD constant (1e6) and clip gradients in the BCE and
CrossEntropy backward passes to prevent gradient explosion when
predictions are extremely close to 0 or 1.
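
For illustration only, the clipping can be sketched as below for the BCE
case. This is a minimal sketch, not the code in this commit: the function
name `bce_grad`, the use of f64, the per-element gradient form
(p - y) / (p * (1 - p)), and the choice to map a 0/0 result to 0.0 are all
assumptions. Only the MAX_GRAD name and its value of 1e6 come from this
change.

```rust
/// Clip bound; the name and value (1e6) are taken from the commit message.
const MAX_GRAD: f64 = 1e6;

/// Hypothetical helper: gradient of binary cross-entropy with respect to
/// the prediction `p` for target `y`, clipped to [-MAX_GRAD, MAX_GRAD].
fn bce_grad(p: f64, y: f64) -> f64 {
    // Unclipped gradient: (p - y) / (p * (1 - p)).
    // The denominator goes to 0 as p approaches 0 or 1, so this can be
    // very large, infinite, or NaN (0/0 when p equals y exactly at 0 or 1).
    let raw = (p - y) / (p * (1.0 - p));

    // Assumption: treat the 0/0 case as a zero gradient, because the
    // prediction already matches the target there.
    if raw.is_nan() {
        return 0.0;
    }

    // f64::clamp maps +inf to MAX_GRAD and -inf to -MAX_GRAD.
    raw.clamp(-MAX_GRAD, MAX_GRAD)
}
```

With this sketch, bce_grad(0.0, 1.0) yields -MAX_GRAD instead of negative
infinity, while a well-behaved input such as bce_grad(0.5, 1.0) is
unchanged at -2.0.
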
Also add examples/loss_demo.rs for manual testing and demonstration
of loss function behavior.