First you take the raw footage and break it up into scenes. Then you take each scene and reduce the information with 512-color fractal patterning. This helps track where the palette range stays close as the objects in the image shift, and the fractal patterning also reduces the picture information, especially high-frequency and more complex image data, keeping the image in a low range of data.
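The post does not say how the 512 colors are chosen. One simple possibility, assumed here purely for illustration, is a uniform palette with 3 bits per channel (8 x 8 x 8 = 512). The function names are hypothetical, and the fractal patterning step is not covered by this sketch.

```python
def quantize_512(r, g, b):
    """Map an 8-bit RGB pixel to one of 512 palette indices.

    Assumption: a uniform palette, keeping the top 3 bits of each channel.
    """
    return ((r >> 5) << 6) | ((g >> 5) << 3) | (b >> 5)


def palette_color(index):
    """Return a representative 8-bit RGB color for a palette index.

    Each channel is restored to the centre of its 32-value bucket.
    """
    r = (index >> 6) & 0x7
    g = (index >> 3) & 0x7
    b = index & 0x7
    return tuple((c << 5) | 0x10 for c in (r, g, b))


if __name__ == "__main__":
    idx = quantize_512(200, 120, 30)
    print(idx, palette_color(idx))
```

Each pixel then costs 9 bits instead of 24 before any further reduction, which is only the first and smallest of the savings the scheme relies on.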
Next you treat each area of similar palette as an object, the way a person would pick out objects, and graph its whole palette shift across the scene. Each scene palette object contains key detail and palette shifting at mixed resolution, with detail frequency lowered relative to the highest-resolution detail, which goes up to 360p.
You also leave small gaps in the 360p image to be filled by grain, which lowers the data required relative to the visible definition.
Along with the graining, some of the detail is stored as low, mixed-resolution greyscale data that represents its color relationship through pixel patterning.
Then, when rendering, you do detail, color and flow upscaling and recreation, plus AI defractalization and AI degraining.
Now you use various three-number tensors, each packed into a single number, to log the data bitwise as efficiently as you can and squeeze it down to 28 kbit/s.
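Packing three numbers into a single one can be sketched with plain shifts and masks. The field meanings and widths below (a 9-bit palette index, a 10-bit x position and a 9-bit y position) are assumptions made up for the example, not something the scheme above specifies.

```python
# Hypothetical field widths in bits: palette index, x position, y position.
FIELD_BITS = (9, 10, 9)


def pack3(a, b, c, bits=FIELD_BITS):
    """Pack three non-negative integers into one integer, a in the high bits."""
    for value, width in zip((a, b, c), bits):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
    return (a << (bits[1] + bits[2])) | (b << bits[2]) | c


def unpack3(packed, bits=FIELD_BITS):
    """Recover the three integers from a value produced by pack3."""
    c = packed & ((1 << bits[2]) - 1)
    b = (packed >> bits[2]) & ((1 << bits[1]) - 1)
    a = packed >> (bits[1] + bits[2])
    return a, b, c


if __name__ == "__main__":
    word = pack3(300, 639, 359)
    print(word, unpack3(word))
```

With these widths every triple fits in 28 bits, so no bits are wasted on padding the way they would be with three separate 16-bit or 32-bit values.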
For the audio I would use a similar mixed-resolution palette shift object coding with fractal reduction and graininess, and use the AI to upscale it, so I would only need 10 kbit/s for reasonable 24-bit 96 kHz audio.
If done right, with today's technology you should get slightly lower quality than 30-bit 360p at 256 kbit/s.
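To get a feel for how tight these numbers are, here is a rough budget calculation. It assumes that the 28 kbit/s is the total and includes the 10 kbit/s of audio, that the video runs at 24 frames per second, that the audio is mono, and that 360p means 640 x 360. None of these is stated above, so treat the results as a sketch.

```python
TOTAL_KBPS = 28   # from the post; assumed to be the total for video plus audio
AUDIO_KBPS = 10   # from the post
FPS = 24          # assumption: the post gives no frame rate


def video_bits_per_frame(total_kbps=TOTAL_KBPS, audio_kbps=AUDIO_KBPS, fps=FPS):
    """Bits left for each video frame after the audio share is taken out."""
    return (total_kbps - audio_kbps) * 1000 / fps


def raw_audio_kbps(bit_depth=24, sample_rate_hz=96_000, channels=1):
    """Uncompressed PCM audio bitrate in kbit/s (mono assumed)."""
    return bit_depth * sample_rate_hz * channels / 1000


def raw_video_kbps(width=640, height=360, bits_per_pixel=30, fps=FPS):
    """Uncompressed video bitrate in kbit/s (640 x 360 assumed for 360p)."""
    return width * height * bits_per_pixel * fps / 1000


if __name__ == "__main__":
    print("video bits per frame:", video_bits_per_frame())
    print("audio reduction needed:", raw_audio_kbps() / AUDIO_KBPS)
    print("video reduction needed:", raw_video_kbps() / (TOTAL_KBPS - AUDIO_KBPS))
```

Under those assumptions the video gets 750 bits per frame, and the audio has to shrink by a factor of about 230 from raw PCM, which is why so much of the work is pushed onto the AI upscaling at the rendering end.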
I feel this would be a good way to spread 28 kbit/s around well enough to allow reasonable upscaling in the future, and it could allow for more low-bandwidth internet radio services.