According to research out of MIT, machine learning models seem to be converging on a consistent statistical representation of reality. In other words, given a set of observations, there exists a quantification of reality that reasonably represents those observations, which would mean, for all intents and purposes, that we have a mathematical representation of our understanding of the world.
This led me to ask: how can this be? Isn't the world subjective? The fact that some of us open a banana from the stem and others open it from the bottom (and whether that even is the 'bottom') is culturally subjective.
At which point I realized: it's not a representation of reality, but rather a representation of reality as we perceive it on average, a reality created by the set of data that informs the representation.
If we were to split a dataset by language, and still had a sufficient quantity of data to learn separate representations that each converge on their own statistical optima, I'd guess you would end up with similar representations that differ (of course) based on the underlying distribution of the data. But in what ways would those underlying distributions differ? I'd argue that with sufficient data, the only differences would be differences in the subjective interpretation of our reality; after all, the underlying reality stays the same.
NOTE: We would likely have differences in data capture technology as well; cameras might differ, audio capture technology might differ. Since we're speaking theoretically, I think we can handwave the sourcing of balanced datasets.
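For concreteness, here's a minimal sketch of that split-and-train setup. Everything in it is hypothetical: the corpus records, the probe set, and especially train_model, which is just a placeholder that returns a dummy embedding function so the sketch runs end to end.

```python
import numpy as np
from collections import defaultdict

# Hypothetical records; a real experiment would need enormous,
# otherwise-balanced corpora per language.
corpus = [
    {"language": "en", "text": "the stem end of the banana"},
    {"language": "sw", "text": "ncha ya ndizi"},
    {"language": "zh", "text": "香蕉的底部"},
    {"language": "en", "text": "peel it from the bottom"},
]

# Step 1: partition the data so each model only ever sees one language.
splits = defaultdict(list)
for record in corpus:
    splits[record["language"]].append(record)

# Step 2: train one model per split. train_model is a stand-in for a
# full training pipeline; here it returns a dummy embedding function
# purely so the sketch is runnable.
def train_model(data):
    rng = np.random.default_rng(len(data))
    projection = rng.normal(size=(64, 128))  # stand-in learned weights
    return lambda texts: rng.normal(size=(len(texts), 64)) @ projection

models = {lang: train_model(data) for lang, data in splits.items()}

# Step 3: embed a shared probe set with every model, then compare the
# resulting representations (see the alignment metric sketch below).
probe = ["banana", "stem", "bottom", "peel"]
probe_embeddings = {lang: model(probe) for lang, model in models.items()}
```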
So if we end up with multiple representations, each with small differences, those differences are representative of the subjective reality of those who speak English, or Swahili, or Mandarin. If you can quantify subjective realities, could you theoretically quantify the subjective reality of any group? What about an individual? While impractical given the data needs, this makes it possible to pursue answers to all sorts of interesting philosophical and social science questions.
At this point, we can use the same kernel alignment metrics from the original research to measure where our representations are similar and where they differ, and dig into the differences in the subjective reality, or culture, of the speakers of these languages.
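As a concrete stand-in for such a metric, here's a minimal sketch of linear CKA (Centered Kernel Alignment). To be clear, the paper itself uses kernel-alignment metrics of its own (its main one, if I recall, is a mutual nearest-neighbor measure); I'm showing CKA only because it's a simple, widely used way to compare two sets of embeddings. The embedding matrices below are random placeholders for real model outputs.

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear Centered Kernel Alignment between two representation
    matrices of shape (n_samples, n_features). Returns a value in
    [0, 1]; 1 means identical up to rotation and isotropic scaling."""
    # Center each feature so the comparison ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by
    # each representation's self-covariance.
    numerator = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    denominator = (np.linalg.norm(X.T @ X, ord="fro")
                   * np.linalg.norm(Y.T @ Y, ord="fro"))
    return float(numerator / denominator)

# Hypothetical usage: embed the same probe items with two models
# (say, one trained on English data, one on Mandarin data).
rng = np.random.default_rng(0)
emb_english = rng.normal(size=(500, 256))   # placeholder embeddings
emb_mandarin = rng.normal(size=(500, 128))  # dims need not match
print(linear_cka(emb_english, emb_mandarin))
```

A score near 1 would mean the two models induce nearly the same similarity structure over the probe items; the interesting part, for this line of thinking, is which item pairs drive the score down.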
Why does this matter?
The potential to quantify any distribution that we can reasonably gather data about is pretty interesting. Unfortunately, I'm not clever enough to cook up the larger implications of this.
But if we can measure cultural differences, someone cleverer than me probably can.
The Platonic Representation Hypothesis is really just saying that given a set of observations from a distribution, we can mathematically quantify that distribution.
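Stated a little more formally (my paraphrase, not the paper's exact formalism): if we represent a model $f$ by the kernel of pairwise similarities it induces over inputs, the hypothesis is that independently trained models $f$ and $g$ converge toward the same kernel:

$$
K_f(x_i, x_j) = \langle f(x_i), f(x_j) \rangle \approx \langle g(x_i), g(x_j) \rangle = K_g(x_i, x_j)
$$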