Curve fitting is, for the most part, what most machine learning boils down to, and that is not a bad thing. But how do we go beyond the correlations of the black box? I see the rediscovery of symbolic AI and the introduction of causality into purely probabilistic ML as analogous to what happened in software decades ago, when we evolved from assembler and procedural languages and started to model software and data as richer abstractions with relationships. It is not the same thing, but it is a similar evolution in engineering and computer science.
Causal relationships exist in the world, and they can influence how we collect our data and engineer the features that drive ML model training. This includes everything from how we analyze covariance in the data to how we manage and monitor data distributions. Collecting data and engineering features is not enough. Causal relationships can sometimes be gleaned from the data we observe, but oftentimes we must develop experiments and interventions, using A/B test strategies and multi-armed bandit processes, to uncover the causality and better train our models.
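To make the gap between observed correlation and intervention concrete, here is a minimal sketch in Python. Everything in it is a made-up illustration, not data from any real system: a hidden factor `z` drives both who receives a "treatment" and the outcome, while the treatment itself has zero true effect. The names `simulate` and `diff_in_means` are hypothetical helpers invented for this example.

```python
import random


def simulate(n, randomized, seed=0):
    """Generate (treated, outcome) pairs from a toy world (illustrative assumption).

    A hidden confounder z raises the outcome by 2.0. The treatment has
    NO causal effect on the outcome in this toy world.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5  # hidden confounder, never seen by the analyst
        if randomized:
            # A/B test: a coin flip decides treatment, independent of z
            treated = rng.random() < 0.5
        else:
            # Observational data: z makes treatment much more likely
            treated = rng.random() < (0.8 if z else 0.2)
        outcome = (2.0 if z else 0.0) + rng.gauss(0, 1)
        rows.append((treated, outcome))
    return rows


def diff_in_means(rows):
    """Naive effect estimate: mean outcome of treated minus mean of untreated."""
    treated = [y for t, y in rows if t]
    control = [y for t, y in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)


observational = diff_in_means(simulate(20000, randomized=False))
experimental = diff_in_means(simulate(20000, randomized=True))

print(f"observational estimate: {observational:.2f}")
print(f"randomized estimate:    {experimental:.2f}")
```

In this toy setup the observational estimate comes out clearly positive even though the treatment does nothing, while the randomized estimate sits near zero. That is the kind of difference a model trained only on observed data cannot see by itself, and it is why randomization earns its cost.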
Interventions and experiments can help us answer some "what if" questions. Then there are counterfactuals, which are beyond the reach of most experiments. Yet understanding causal relationships has the potential to offer us insights and help businesses make better sense of the world and their opportunities. We need better tools and engineering processes to incorporate these skills into our ML frameworks and ML processes.
This is starting to happen in AI and ML today across the disciplines that apply ML. This is a good article on the topic that I recommend to all ML engineers and data scientists.