The upside to upsampling when mixing down shows up if you use mastering plugins.
You have to think in digital terms and about how a wave is sampled. The way I remember it is this.
If you plot a sine wave on an X/Y chart, in digital it would consist of dots (separate measurements of voltage taken at regular points in time).
In analog you'd see a solid line, with voltage as the height on the vertical axis and time moving left to right.
The waveform is sampled digitally. You know that. The sine wave is measured both
vertically and horizontally. The height (dynamics) resolution is your bit depth: 16, 24 or 32 bits. The width of the waveform
is your sample rate: 44.1, 48, 96 kHz, etc. These separate measurements, tens of thousands per second, are converted back to an analog
waveform by connecting and smoothing the dots. It's like a CRT screen, where the beam moves so fast the eyes don't see the blinking.
It's the same with the ears: they don't hear the separate dots that create the sound.
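To make the "dots" idea concrete, here is a minimal Python sketch of how a sine wave turns into sample values. The function name `sample_sine` and the 1 kHz test tone are my own illustrative choices, not anything from a particular DAW or converter. The sample rate sets how many dots you get across the width, and the bit depth sets how many height steps each dot can land on.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, bit_depth, n_samples):
    """Return integer sample values ("dots") for a full-scale sine wave.

    Hypothetical helper for illustration only.
    """
    # Largest positive step available, e.g. 32767 for 16-bit audio.
    max_code = 2 ** (bit_depth - 1) - 1
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                       # width: time of this dot
        v = math.sin(2 * math.pi * freq_hz * t)      # ideal analog value, -1.0 to 1.0
        samples.append(round(v * max_code))          # height: snap to nearest step
    return samples

# One full cycle of a 1 kHz tone at two sample rates, both 16-bit.
dots_48k = sample_sine(1000, 48000, 16, 48)   # 48 dots per cycle
dots_96k = sample_sine(1000, 96000, 16, 96)   # 96 dots per cycle

print(len(dots_48k), len(dots_96k))
print(dots_48k[:5])
```

Doubling the sample rate doubles the number of dots describing the same cycle. Raising the bit depth would leave the count alone and make the height steps finer.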
If you take a digital photo and enlarge it enough, you see the picture is made up of dots. When you edit and crop a photo, the edges that look sharp
in normal view turn out to be a series of half-tone boxes shading from one color to another.
By enlarging the photo you can crop the edges much more accurately by removing single pixels, or even half pixels. When you reduce it back down, the
edit is sharp and accurate.
The same happens when you enlarge an audio file horizontally. The cropping done by mastering effects will be more accurate and leave the uncropped areas
undisturbed, so when you reduce it back down, less data is destroyed, more of the original quality is preserved, and less noise/distortion from missing data is introduced.
It won't add any quality, but less quality is sacrificed using the plugins.
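Here is a rough Python sketch of one way this shows up in measurements. Everything in it is an assumption for illustration: the "plugin" is a plain hard clipper, the test tone is 5 kHz, and the oversampled signal is generated directly at 4x the rate (which is what ideal upsampling of a pure tone would give). A clipper creates harmonics at 15 kHz, 25 kHz, 35 kHz and so on. At 44.1 kHz the 25 kHz harmonic has nowhere to go and folds back to 19.1 kHz as aliasing distortion. At 4x the rate it sits where it belongs, above the audible band, where the filter used on the way back down can remove it.

```python
import cmath
import math

BASE_RATE = 44100   # Hz, the project sample rate
FACTOR = 4          # oversampling factor (assumed for this sketch)
N = 441             # 10 ms of audio at the base rate
TONE_HZ = 5000      # test tone
THRESHOLD = 0.5     # clip level of the stand-in "plugin"

def clipped_tone(rate, n_samples):
    """Sample a full-scale sine at `rate` and hard-clip it."""
    out = []
    for n in range(n_samples):
        v = math.sin(2 * math.pi * TONE_HZ * n / rate)
        out.append(max(-THRESHOLD, min(THRESHOLD, v)))
    return out

def amplitude_at(signal, rate, freq_hz):
    """Amplitude of a single frequency component (one-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * k / rate)
              for k, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

# Where the 25 kHz harmonic folds back to at the base rate: 19.1 kHz.
ALIAS_HZ = BASE_RATE - 5 * TONE_HZ

direct = clipped_tone(BASE_RATE, N)
oversampled = clipped_tone(BASE_RATE * FACTOR, N * FACTOR)

alias_direct = amplitude_at(direct, BASE_RATE, ALIAS_HZ)
alias_oversampled = amplitude_at(oversampled, BASE_RATE * FACTOR, ALIAS_HZ)

print("19.1 kHz artifact, processed at 44.1 kHz:", alias_direct)
print("19.1 kHz artifact, processed at 4x rate: ", alias_oversampled)
```

With these settings the direct version should show a clearly measurable component at 19.1 kHz, while the 4x version shows essentially nothing there. The sketch only looks at the in-band spectrum of the oversampled signal and assumes an ideal filter when reducing back down; a real sample rate converter will not be quite that clean.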
The question comes down to this: will upsampling make that big a difference in what you hear?
Theoretically, more data is preserved, and you could analyze it and prove it mathematically.
Whether the primitive eardrums can hear such small amounts of data loss at such high frequencies is the question.
My thing is that data loss is cumulative. Every time you process a signal, even just upsampling it and putting the waveform
under a microscope for surgery by a plugin, there are losses. If you want to retain quality you have to try it and hear whether the loss is worth it or not.
If you can't hear a difference, it's a waste of time. If you can, you have to decide if it's worth the difference. Do a file both ways with the same
plugin settings, A/B them in random order, and see if you can hear a difference on a good playback system.
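If you want to keep yourself honest during the A/B, a small script can pick the random order and tell you how likely your score is by pure guessing. This is only a sketch of one way to do it: the helper names are made up, and the scoring uses a standard binomial calculation that I'm adding, not something specific to any audio tool.

```python
import math
import random

def make_trials(n_trials, seed=None):
    """Pick which file ('A' or 'B') gets played on each trial, at random."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]

def chance_probability(correct, trials):
    """Probability of getting at least `correct` right out of `trials` by guessing."""
    ways = sum(math.comb(trials, k) for k in range(correct, trials + 1))
    return ways / 2 ** trials

# Have a friend play the files in this order while you write down your guesses.
order = make_trials(16, seed=1)
print(order)

# Example: 12 right out of 16.
print(chance_probability(12, 16))
```

The lower that probability, the less likely it is that you were just guessing. If your score is close to half right, you probably can't hear the difference on that system.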
You may find recording at the higher sample rate is more beneficial. You just have to decide if the music recorded is worth the extra effort.
If the music was played by star musicians and it's going to be a hit single, I'd record it at the maximum sample rate so I'd capture all that can be captured.
I could always crop what isn't needed. If it's my own music, I'd rarely need anything above 24/48 (24-bit, 48 kHz), which is how I record.
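Part of the "extra effort" is simply disk space and CPU load, and the arithmetic is easy to sketch. The function below is a hypothetical helper for uncompressed stereo PCM; the 192 kHz figure is just my example of a "maximum" rate, since interfaces differ.

```python
def bytes_per_minute(sample_rate_hz, bit_depth, channels=2):
    """Raw size of one minute of uncompressed PCM audio, in bytes."""
    return sample_rate_hz * (bit_depth // 8) * channels * 60

size_24_48 = bytes_per_minute(48000, 24)     # 17,280,000 bytes per minute
size_24_192 = bytes_per_minute(192000, 24)   # 69,120,000 bytes per minute

print(size_24_48, size_24_192, size_24_192 / size_24_48)
```

So every track recorded at 24/192 takes four times the space of the same track at 24/48, before any plugins get involved.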