Nathan's Weblog
   


This is Nathan Lutchansky's weblog, Copyright (C) 2003-2005 Nathan Lutchansky.


    Tue, 08 Feb 2005
    Some ideas on Spook

    I've just released the first new version of Spook in almost six months. Embarrassing that I've let things go for so long, but at least most of the features that have been accumulating in private trees since then have been added. The network-side code has been overhauled, which not only made things more efficient but also paved the way for multiple media streams in the same RTSP session. Audio. w00t.

    Since I started adding the audio support, the limitations and flaws in the current stream implementation have been bothering me. It kind of made sense at first to declare "the output from this grabber is named 'foo', and the input to this encoder is named 'foo' and the output will be named 'bar'," etc., but it's not only confusing, it also prevents me from implementing a number of more useful features. It would be nice, for example, to have Spook automatically open all the video and audio capture devices it could get its hands on, compress them to some reasonable default format, and put up a template webpage with a list of everything it found so that the user could test out Spook without configuring anything at all. That's not possible if all capture and transformation actions have to be declared explicitly.

    There are two major changes I'd like to make to the way Spook creates media streams. The first is to change the media compression and format specification from an imperative form to a declarative form. There's no good reason (other than Unix tradition, I suppose) to force the user to explicitly list the various filters and encoders and the order they should be used, when all that's really necessary is for the user to specify the desired end result. "I want a 320x240 384 kbps MPEG4 stream" is enough information to set up the entire module pipeline automatically. The main problem that arises is getting all the nuances of each type of stream correct. For example, providing two different streams, one of 30 fps and one of 10 fps, can be done by dropping frames after compression with JPEG, but with MPEG4 the frames must be dropped before they are encoded.
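A rough sketch of what the declarative form could look like, in Python purely for illustration. Nothing here is Spook's actual code or configuration syntax: `StreamSpec`, `plan_pipeline`, the stage names and the `INTERFRAME_CODECS` set are all made up for the example. The only rule it encodes is the one above: frames for JPEG can be dropped after compression, frames for MPEG4 have to be dropped before encoding.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption for this sketch: codecs whose frames depend on earlier frames,
# so frames cannot simply be discarded after encoding.
INTERFRAME_CODECS = {"mpeg4"}


@dataclass(frozen=True)
class StreamSpec:
    """The desired end result, e.g. 'a 320x240 384 kbps MPEG4 stream'."""
    codec: str
    width: int
    height: int
    fps: int
    bitrate_kbps: Optional[int] = None


def plan_pipeline(spec, source_fps=30):
    """Return an ordered list of (stage, parameters) for the requested stream."""
    stages = [("scale", {"width": spec.width, "height": spec.height})]
    drop = ("framedrop", {"from_fps": source_fps, "to_fps": spec.fps})
    encode = ("encode", {"codec": spec.codec, "bitrate_kbps": spec.bitrate_kbps})
    if spec.fps >= source_fps:
        # No frame rate reduction needed at all.
        stages.append(encode)
    elif spec.codec in INTERFRAME_CODECS:
        # Frames must be dropped before they are encoded.
        stages += [drop, encode]
    else:
        # Independent frames can be dropped after compression.
        stages += [encode, drop]
    return stages


if __name__ == "__main__":
    for stage in plan_pipeline(StreamSpec("mpeg4", 320, 240, 10, 384)):
        print(stage)
```

The user only ever writes the `StreamSpec`; the ordering of the stages is the planner's problem, which is where the per-codec nuances would have to live.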

    The other change is to convert the stream namespace into a hierarchy instead of using free-form strings. This sounds minor, but it allows modules to create new streams on the fly without causing confusion. Through the magic of sysfs, the V4L module can automatically discover all your USB webcams, configure streams for them, and create meaningful, static names for them. The two cams plugged into the hub on the lower USB port on the front of your system will always be named something like "Device::Video::USB::2-1:1.0" and "Device::Video::USB::2-2:1.0" no matter when they were connected, because their names are based on their position on the bus. The same devices can be accessible by their device path, like "Device::Video::/dev/video1" or whatever, if you prefer that.
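To make the naming scheme concrete, here is a small Python sketch. It assumes the discovery step has already produced a mapping from device node to USB bus position (how that is read out of sysfs is left out), and the helper names `stream_name` and `usb_video_names` are hypothetical.

```python
SEPARATOR = "::"


def stream_name(*parts):
    """Join path components into a hierarchical stream name."""
    return SEPARATOR.join(parts)


def usb_video_names(devices):
    """Map hierarchical stream names to device nodes.

    `devices` maps a device node path to its USB bus position, e.g.
    {"/dev/video1": "2-1:1.0"}. Each device gets two names: a static one
    based on where it sits on the bus, and one based on its device path.
    """
    names = {}
    for node, bus_position in sorted(devices.items()):
        names[stream_name("Device", "Video", "USB", bus_position)] = node
        names[stream_name("Device", "Video", node)] = node
    return names


if __name__ == "__main__":
    found = {"/dev/video1": "2-1:1.0", "/dev/video2": "2-2:1.0"}
    for name, node in usb_video_names(found).items():
        print(name, "->", node)
```

The bus-position name stays the same across reboots and replugging as long as the camera stays in the same port, while the device-path name follows whatever node the kernel happened to assign.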

    The real advantage of putting the streams into a tree is being able to perform transformations over an entire subtree rather than on one stream at a time. This would allow one part of the tree to contain a "mirror" of the same streams as another part, but having passed through one or more modules. For example, the "Low Quality::" tree could contain the above streams as "Low Quality::Video::USB::2-1:1.0", etc., compressed as 100 kbps 10 fps video, rather than the native 30 fps uncompressed YUV format received from the hardware. These mappings from an existing subtree to a new one would be specified in the configuration file, of course, so the user could control which subtrees were available and the exact parameters used.
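A subtree mapping like this could be sketched as follows. This is Python for illustration only: streams are stood in for by plain dicts of parameters, and `mirror_subtree` and `low_quality` are invented names, not anything in Spook.

```python
def mirror_subtree(streams, source_root, target_root, transform):
    """Return transformed copies of every stream below `source_root`,
    renamed so that they live below `target_root` instead."""
    prefix = source_root + "::"
    mirrored = {}
    for name, stream in streams.items():
        if name.startswith(prefix):
            new_name = target_root + "::" + name[len(prefix):]
            mirrored[new_name] = transform(stream)
    return mirrored


def low_quality(stream):
    """Stand-in for a module chain that compresses to 100 kbps, 10 fps."""
    return {**stream, "format": "compressed", "fps": 10, "bitrate_kbps": 100}


if __name__ == "__main__":
    streams = {
        "Device::Video::USB::2-1:1.0": {"format": "yuv", "fps": 30},
        "Device::Video::USB::2-2:1.0": {"format": "yuv", "fps": 30},
    }
    mirror = mirror_subtree(streams, "Device::Video", "Low Quality::Video", low_quality)
    for name, params in mirror.items():
        print(name, params)
```

One mapping covers every stream under the source subtree, including cameras that show up later, which is the point of having a tree in the first place.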

    At the end of the whole chain, there still needs to be some sort of connection to make the stream available to clients across the network. This could be done through explicit pairings ("rtsp://*/webcam" will serve video from "Low Quality::Video::USB::2-1:1.0") or by exporting entire subtrees ("rtsp://*/cams/" will serve anything below "Low Quality::Video", such as "rtsp://*/cams/USB::2-1:1.0") as the user sees fit. As I mentioned, template webpages could optionally enumerate the streams in exported subtrees to simplify the initial configuration.
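Both styles of export boil down to turning a request path into a stream name. A minimal Python sketch, assuming explicit pairings are checked first and subtree exports second; the `resolve` function and the two tables are hypothetical, and the host part of the URL is ignored.

```python
def resolve(path, pairings, exports):
    """Translate an RTSP request path into a stream name, or None.

    `pairings` maps an exact path to a single stream name.
    `exports` maps a path prefix (ending in "/") to a subtree root.
    """
    if path in pairings:
        return pairings[path]
    for mount, subtree in exports.items():
        if path.startswith(mount):
            remainder = path[len(mount):]
            if remainder:
                return subtree + "::" + remainder
    return None


if __name__ == "__main__":
    pairings = {"/webcam": "Low Quality::Video::USB::2-1:1.0"}
    exports = {"/cams/": "Low Quality::Video"}
    print(resolve("/webcam", pairings, exports))
    print(resolve("/cams/USB::2-1:1.0", pairings, exports))
```

A real server would still have to check that the resolved name refers to a stream that actually exists before answering the client.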

    By now, you're probably wondering why I'm putting so much effort into automated support for large numbers of streams. It's not like most people have so many webcams that they can't configure them manually, right? Well, the ultimate goal is to turn Spook into more of a generic network media mixer, capable of importing video from any video source, local or remote, performing some set of transforms on it, and exporting it over the network using a variety of formats and protocols. Re-encode video from the DCS-900 watching your koi pond into MPEG4 and relay it through another, better-connected server via RTSP? No problem. Webcast press conferences from the room's AV system while simultaneously recording them for later retrieval over streaming RTSP or AVI download over HTTP? Makes sense to me. "Tune in" to the physics class lecture you're skipping by dialing a number from your mobile phone? Why not?

    Putting as many media sources as possible into a common namespace doesn't get us much closer to convergence, but it's a necessary step.

    [/tech/dev] Posted at: 00:44
