Nate,
I wanted to post this here since it relates to the suggestion Daniel mentioned above about possibly using QuadBatches to render a SkeletonSprite's content (even if they aren't directly in the display list).
However, I was motivated by the requests in this other topic to remove the need for every render() call to also calculate and create the rendered Quads/QuadBatches:
http://forum.starling-framework.org/topic/spine-performance-issue
I played with this a bit over the holiday break, and I put together a variation of this idea on a forked version of spine-runtimes:
https://github.com/jamikado/spine-runtimes
Specifically the changes only occur in one class, SkeletonSprite, and in this one commit:
https://github.com/jamikado/spine-runtimes/commit/e88637e62e05ed29213eb61450da92f50f76517d
Not sure if you would be interested in checking it out as a suggestion, but I'll outline the potential benefits here in case others are interested. Maybe there are issues with this modification that need further study, since I'm relatively new to Spine's runtime system, but it seems to work fine in my tests.
First, there is one unrelated optimization I made in SkeletonSprite: I removed the instance variable vertices:Vector.<Number> and made it a reusable static helper. You don't need it duplicated on every SkeletonSprite instance, since it is only used to grab data locally within a single method call, like the other temp static helpers.
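To illustrate the static-helper idea outside of AS3, here is a minimal TypeScript sketch. It is not code from the commit: SpriteSketch, scratch, and maxX are hypothetical names, and the corner math is a placeholder for the real vertex computation. It only shows one scratch buffer shared by all instances, which is safe because the buffer is filled and read within a single call.

```typescript
class SpriteSketch {
  // One scratch buffer for the whole class instead of one per instance.
  // Safe only because every call fills it and reads it before returning.
  private static readonly scratch: number[] = [0, 0, 0, 0, 0, 0, 0, 0];

  readonly x: number;
  readonly size: number;

  constructor(x: number, size: number) {
    this.x = x;
    this.size = size;
  }

  // Writes four corner points (x, y pairs) into the shared buffer,
  // then returns the largest x value found there.
  maxX(): number {
    const v = SpriteSketch.scratch;
    v[0] = this.x;             v[1] = 0;
    v[2] = this.x + this.size; v[3] = 0;
    v[4] = this.x + this.size; v[5] = this.size;
    v[6] = this.x;             v[7] = this.size;
    let max = -Infinity;
    for (let i = 0; i < v.length; i += 2) {
      max = Math.max(max, v[i]!);
    }
    return max;
  }
}

const first = new SpriteSketch(10, 5).maxX();
const second = new SpriteSketch(100, 1).maxX();
```

Both instances get correct results even though they write into the same array, because nothing is read from the buffer after the method returns.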
Basically, the changed code moves the actual creation of quads generated by SkeletonSprite out from the render() call and into a cached set of QuadBatches (more than one, since as you know there is a potential for a state change from one slot rendering to the next) that will only be updated when advanceTime is called.
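The split between updating and rendering can be sketched roughly like this. This is TypeScript rather than AS3 and is not the code from the commit: SketchBatch and CachedSkeletonSketch are hypothetical stand-ins, a single batch is used where the real code keeps several, and the "vertex data" is a placeholder number. The point is only that the cache is rebuilt in advanceTime and merely submitted in render.

```typescript
// Hypothetical stand-in for a batch of quads; not Starling's QuadBatch.
class SketchBatch {
  quads: number[] = [];
}

class CachedSkeletonSketch {
  private readonly batches: SketchBatch[] = [new SketchBatch()];
  private readonly slotCount: number;
  private time = 0;
  rebuilds = 0; // how many times the quads were recomputed
  draws = 0;    // how many cached batches were submitted for drawing

  constructor(slotCount: number) {
    this.slotCount = slotCount;
    this.rebuild(); // build the cache once so render() always has content
  }

  // Plays the role of advanceTime: the only place the cache is refreshed.
  advanceTime(delta: number): void {
    this.time += delta;
    this.rebuild();
  }

  // Plays the role of render(): submits cached batches, computes nothing.
  render(): void {
    this.draws += this.batches.length;
  }

  private rebuild(): void {
    const batch = this.batches[0]!;
    batch.quads.length = 0; // reuse the array instead of allocating a new one
    for (let slot = 0; slot < this.slotCount; slot++) {
      batch.quads.push(slot + this.time); // placeholder for real vertex data
    }
    this.rebuilds++;
  }
}

const skeleton = new CachedSkeletonSketch(3);
skeleton.advanceTime(0.016);
skeleton.render();
skeleton.render();
skeleton.render();
```

Here three render calls follow one advanceTime call, so the quads are computed twice in total (once at construction, once on update) instead of once per render.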
While it is true that in most cases, SkeletonAnimations will be updating *and* rendering on every enterFrame, this change allows for an improvement in the following cases:
1) Starling.stop(false), when you want to pause the entire Starling display list but still need rendering to occur, such as when temporarily pausing a game level.
2) Whenever a SkeletonAnimation is temporarily (or permanently) removed from any active juggler. Rendering still happens using the cached QuadBatches, but as long as advanceTime is not being called you avoid recomputing and recreating the quads on every render() call. Instead, you just send the cached QuadBatches to render.
All the code that creates and updates this internal QuadBatch array closely follows the way RenderSupport itself builds up and reuses QuadBatch instances, so the technique is well tested and stable.
There should be no increase in draw calls compared to the older method (I added a batchable property on SkeletonSprite just for kicks and giggles, and it defaults to true). The QuadBatch instances are reused from one update to the next and the array is trimmed in only the rarest of cases, so there should also be no additional impact on temporary object creation.
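For anyone unfamiliar with the reuse-and-trim pattern, here is a rough TypeScript sketch of the general idea. It is an illustration under assumptions, not the RenderSupport or SkeletonSprite source: BatchPoolSketch, PooledBatch, stateKey, and the trim thresholds (16 batches, twice the used count) are all made up for the example.

```typescript
// Hypothetical pooled batch; stateKey stands in for texture/blend state.
class PooledBatch {
  quadCount = 0;
  stateKey = "";

  reset(): void {
    this.quadCount = 0;
    this.stateKey = "";
  }
}

class BatchPoolSketch {
  readonly batches: PooledBatch[] = [];
  used = 0;        // batches in use for the current rebuild
  allocations = 0; // how many batch objects were ever created

  // Start a rebuild: keep the objects, just rewind the cursor.
  begin(): void {
    this.used = 0;
  }

  // Append a quad, moving to the next batch only on a state change.
  addQuad(stateKey: string): void {
    let current = this.used > 0 ? this.batches[this.used - 1] : undefined;
    if (current === undefined || current.stateKey !== stateKey) {
      current = this.next();
      current.stateKey = stateKey;
    }
    current.quadCount++;
  }

  // Drop spare batches only when far more exist than are in use.
  // The thresholds are assumptions chosen for illustration.
  trim(): void {
    if (this.batches.length >= 16 && this.batches.length > 2 * this.used) {
      this.batches.length = this.used;
    }
  }

  private next(): PooledBatch {
    if (this.used === this.batches.length) {
      this.batches.push(new PooledBatch());
      this.allocations++;
    }
    const batch = this.batches[this.used]!;
    batch.reset();
    this.used++;
    return batch;
  }
}

const pool = new BatchPoolSketch();
for (let frame = 0; frame < 3; frame++) {
  pool.begin();
  pool.addQuad("atlasA");
  pool.addQuad("atlasA");
  pool.addQuad("atlasB"); // state change, so a second batch is used
  pool.trim();
}
```

Across the three simulated updates only two batch objects are ever created, which is the behavior described above: a new batch per state change, no new objects on later updates.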
This doesn't really improve rendering times compared to the older technique while animations are running, but it should allow for performance improvements with paused SkeletonAnimations (either all or some at a time) if that is a common practice in a given app scenario.
I could see this being useful when there are lots of SkeletonAnimations in a scene but only a few become active when interacted with, while the rest just reuse their QuadBatches because advanceTime is not being triggered.
I don't know if there will ever be an API that can report whether calling update(delta) actually changed any state, so that one could skip updating the QuadBatches in that scenario too. For now, if advanceTime gets called, the QuadBatches get rebuilt...