The biggest issue IMHO is latency. It's been the major hindrance for real-time music interaction online. The more distant the collaborator, the worse it tends to be.
Acceptable latency for video conferencing
Target values for acceptable video conference performance are 150 ms latency, 40 ms jitter and 1% or less packet loss. Latency includes a fixed component related to the network transmission path length, so physical distance makes this parameter somewhat difficult to control.
https://searchnetworking.techtarget.com/tip/Video-conferencing-bandwidth-requirements-for-the-WAN
For comparison, some people get annoyed when their DAW has ~20 ms or more of latency, and most would find 150 ms too much to play through.
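To put rough numbers on the "fixed component" from that quote, here's a back-of-the-envelope sketch. It assumes signals travel through fiber at about two thirds of the speed of light in a vacuum, and the distances are made-up examples, not measurements. It only covers propagation delay. Routing, buffering, and audio interface latency all stack on top of it.

```python
# Rough sketch: the distance-bound floor on network latency.
# Assumption: signal speed in optical fiber is ~2/3 of c (a common rule of thumb).
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_FACTOR = 2 / 3


def one_way_delay_ms(distance_km: float) -> float:
    """Minimum one-way propagation delay in milliseconds over a fiber path."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR) * 1000


def round_trip_delay_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay in milliseconds."""
    return 2 * one_way_delay_ms(distance_km)


# Hypothetical path lengths, for illustration only.
for label, km in [("same city", 50), ("cross-country", 4000), ("intercontinental", 10000)]:
    print(f"{label:>16}: {one_way_delay_ms(km):6.1f} ms one way, "
          f"{round_trip_delay_ms(km):6.1f} ms round trip")
```

So on a 4,000 km path the floor is already about 20 ms each way before anything else gets added, which is right around where people start complaining about their DAW.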
I'd love to have something that would let two people, or better yet whole groups, interact and jam together in real time, and I'd definitely advocate for adding that functionality to HC. But AFAIK, the technology that would allow that in a seamless, non-buggy, effectively latency-free way just doesn't exist.
Please feel free to correct me if you know of a system that actually does work.