Just a side note,
Guys, remember that uploading textures to the GPU is a slow, expensive operation that should ideally be kept to a minimum and done during initialization (I think it was Tom Krcha or Mike Jones from Adobe who told me that). Doing it on every frame may be choking your pipeline. That's why Starling works best when it just manipulates textures that are already on the GPU.
In this case, perhaps sacrificing resolution for frame rate is best...
I could be wrong on the following but:
2048 x 2048 x 32 bits = 134,217,728 bits = 16,777,216 bytes = 16,384 KB = 16 MB for a texture of that size...
For a mobile device with limited system and GPU memory... ouch?
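For anyone who wants to double-check the numbers, here's a quick sketch in Python (not ActionScript, it's only for the arithmetic). The `texture_bytes` helper is made up for illustration, and it assumes an uncompressed texture with no mipmaps; what the driver actually allocates may differ.

```python
def texture_bytes(width, height, bits_per_pixel=32):
    """Uncompressed size of a single texture level, in bytes (8 bits per byte)."""
    return width * height * bits_per_pixel // 8

size = texture_bytes(2048, 2048)
print(size, "bytes =", size // 1024, "KB =", size // (1024 * 1024), "MB")
# prints: 16777216 bytes = 16384 KB = 16 MB
```

By the same math, dropping to 1024 x 1024 cuts it to 4 MB, so halving the resolution quarters the amount of data you'd be pushing per upload.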
Hmm... can you try a test by playing the video the old way on the flash.display DisplayList stage that floats above Stage3D and let us know framerates?
You're not supposed to mix and match DisplayList and Stage3D because of the performance hit, but in this case, maybe you'll get better performance than re-uploading the video to a texture on every frame? Then once you're done with the video, just removeChild the sprite that holds the video overlay from the DisplayList.