August 29, 2002
Video Quality Models
So far I've skimmed over the following Video Quality Models (VQMs). Each model has its own metrics and measurement techniques that apply to specific types of video. I'm going to enumerate these models, explain a little about how each one works, and note which types of video it applies to.
The Institute for Telecommunication Sciences has its own VQM developed by the System Performance Standards Group. Information about this model is available on their Video Quality Research web page. It's documented in NTIA Report 02-392. This model compares a reconstructed video stream against the original by looking at luminance, gain, spatial and temporal distortions, non-linear scaling and clipping, blocking, and blurring. It is specifically applicable to television (e.g. MPEG-2) and video-conferencing streams (e.g. MPEG-4, H.263), as well as to general-purpose video streams.
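One piece of this kind of model that's easy to illustrate is the gain and level-offset correction done before comparing features. A minimal sketch, assuming a plain least-squares fit of processed against original luminance samples (the report's actual calibration procedure is more involved):

```python
# Sketch: estimate gain and level offset between an original and a processed
# frame's luminance samples, assuming processed ~= gain * original + offset.
# This is an illustrative least-squares fit, not NTIA's exact algorithm.

def estimate_gain_offset(original, processed):
    """Fit processed = gain * original + offset by least squares."""
    n = len(original)
    mean_o = sum(original) / n
    mean_p = sum(processed) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(original, processed))
    var = sum((o - mean_o) ** 2 for o in original)
    gain = cov / var
    offset = mean_p - gain * mean_o
    return gain, offset

# Example: processed frame is the original scaled by 0.9 and shifted up by 5.
orig = [10.0, 50.0, 100.0, 150.0, 200.0]
proc = [0.9 * x + 5.0 for x in orig]
g, b = estimate_gain_offset(orig, proc)
```

Once gain and offset are factored out, the remaining differences can be attributed to actual distortions rather than simple level shifts.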
The Sarnoff JNDmetrix VQM works differently, and at first glance appears to be less complex. It may, however, be computationally more expensive because it performs a Gaussian blur and computes non-linear "busyness" values. Basically, this model looks at local distortions between the reconstructed and original images or video frames, then weights those distortions by the busyness of the local area. The idea is that people don't notice errors in areas that are visually busy. Since this model applies to single images regardless of encoding, it is applicable to any video codec.
The third VQM, Perceptual Image Distortion, is described in a paper by Patrick C. Teo and David J. Heeger. This model passes an image through a set of linear filters that identify spatial-frequency and orientation characteristics. Normalized local-energy calculations are then used to identify distortions in the transformed image. Since this calculation also operates only on images, it should be applicable to any video codec.
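The normalized local-energy step, as I understand it from the paper, is a divisive normalization: each channel's energy is divided by the summed energies of the other channels plus a saturation constant. A minimal sketch, with the filter bank omitted (the responses are given directly, and the constant is an assumed placeholder):

```python
# Sketch of divisive (contrast) normalization over filter-channel responses.
# The filter bank itself is omitted; `responses` are assumed channel outputs
# at one image location. sigma2 is a placeholder saturation constant.

def normalize_energies(responses, sigma2=1.0):
    """Divide each channel's energy by the total energy plus a constant."""
    energies = [r * r for r in responses]
    denom = sigma2 + sum(energies)
    return [e / denom for e in energies]

def distortion(orig_responses, recon_responses):
    """Squared distance between normalized energy vectors."""
    a = normalize_energies(orig_responses)
    b = normalize_energies(recon_responses)
    return sum((x - y) ** 2 for x, y in zip(a, b))

d_same = distortion([1.0, 2.0], [1.0, 2.0])
d_diff = distortion([1.0, 2.0], [2.0, 1.0])
```

The normalization makes the distortion measure sensitive to how energy is redistributed across channels rather than to raw magnitude alone.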
August 28, 2002
Apple, although often accused of bad developer support, is still loads better than Microsoft, at least in my opinion. I downloaded a few QuickTime developer documents from the Apple Developer Connection to read up on Video Digitizers. All of the Inside Macintosh books are available for free download in PDF format, and a lot of it (if not all of it) is available for online browsing in HTML. I happen to have a copy of them on CD-ROM from several years ago, but I don't use it.
Their free suite of development tools is also excellent. Project Builder is the best IDE I've ever used, both functionally and in its user interface, and Interface Builder is also really cool. None of Symantec C++ (which I guess doesn't exist anymore), Metrowerks CodeWarrior, or Forte for Java (now Sun ONE Studio) is as good.
So I've read up on using Video Digitizers in Inside Macintosh: QuickTime Components. The API seems straightforward enough, but as I learned when adding audio support to vat, I'm going to run into problems I didn't know about. Or maybe that phase is over, since I've already tried to get vic to capture video a bunch of times using Sequence Grabbers.
Perceptual Metric Proposals
Just finished up my meeting with Ketan. Showed him what I had so far in terms of the visual and network metrics papers and web sites. We talked about what's coming up and where to go from here.
On September 16th, we'll be going to the NCNI meeting to present a few different proposals for developing a tool that predicts video quality based on network measurements. So between now and then, my job is to find perceptual image quality models that I think work well and that also apply to H.263 and MPEG-2. Given those models, I need to figure out what sort of network measurements will let us predict the final video quality.
I've already got the Sarnoff JND model, which I do think works well. Notably, this model applies to DCT images in general, rather than to particular video streams, so it should work for both H.263 and MPEG-2. We should be able to figure out the probabilities of different levels of image reconstruction quality given some network characteristics and traffic statistics. Applying the JND model to the reconstructed images will then give us probabilities for different video qualities under those network conditions.
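The overall pipeline can be sketched as a simulation: draw network events, map them to a reconstruction-quality score per frame, and accumulate a quality distribution for those conditions. The loss-to-quality mapping below is a made-up placeholder, not a real JND computation:

```python
# Sketch of the proposed pipeline: network conditions -> reconstruction
# quality -> quality distribution. The 0.95 / 0.2 scores are placeholder
# values standing in for a real per-frame JND evaluation.

import random

def simulate_quality(loss_rate, frames=1000, seed=42):
    """Average quality score over simulated frames at a given loss rate."""
    rng = random.Random(seed)
    scores = []
    for _ in range(frames):
        lost = rng.random() < loss_rate
        # Placeholder: a frame affected by loss reconstructs much worse.
        scores.append(0.2 if lost else 0.95)
    return sum(scores) / len(scores)

q_good = simulate_quality(0.01)   # 1% packet loss
q_bad = simulate_quality(0.20)    # 20% packet loss
```

The real version would replace the placeholder mapping with decoded-frame reconstruction and a JND score, but the shape of the computation stays the same.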
So, other than looking for some additional perceptual image quality models, I also need to look up H.263 implementations to see which are the most commonly used features. Different features will affect the traffic characteristics and image reconstruction. There might be some papers which deal specifically with H.263.
One thing that will come later is figuring out how to generate traffic with video characteristics so we can gather link statistics and measurements. Ketan suggested a Java applet that generates the traffic, but we both have doubts about how well that would work: a Java applet is not going to be very reliable for precise time measurements, though it would be easier to implement than his other suggestion. That suggestion is to point a web browser at our own server, modified to send back data mimicking video traffic, so that we can make measurements at the TCP/IP layer. I think this approach is both more accurate and more elegant from a user's point of view.
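The server-side pacing for that second approach is simple to sketch. A minimal version, with the socket write replaced by a callback so the timing logic stands alone (frame size and rate are assumed values, not measurements):

```python
# Sketch of server-side pacing: emit frame-sized chunks on a fixed schedule
# so the stream mimics video traffic. A real server would write to a socket;
# here `send` is a callback. 6000 bytes @ 25 fps is an assumed example rate.

import time

def stream_frames(send, frame_bytes=6000, fps=25, duration_s=0.2):
    """Send frame-sized payloads at a fixed frame rate; return frames sent."""
    interval = 1.0 / fps
    n_frames = int(duration_s * fps)
    start = time.monotonic()
    for i in range(n_frames):
        send(b"\x00" * frame_bytes)            # one "video frame" of payload
        next_deadline = start + (i + 1) * interval
        delay = next_deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)                  # pace to the frame schedule
    return n_frames

sent = []
frames = stream_frames(sent.append)
```

Pacing against absolute deadlines rather than sleeping a fixed interval each iteration keeps the long-run rate accurate even when individual sends are slow.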
August 27, 2002
In preparation for my meeting with Ketan tomorrow afternoon, I've gone through the 25 or so papers I found so far and prioritized them. I placed them into four categories: video metrics, network metrics, video + network metrics, and miscellaneous. Under each category I separated papers into high, medium, and low relevance. Luckily, most of the papers are of high relevance. However, only one paper fit into the video + network metrics category, and it deals with diff-serv networks instead of general networks. The two papers I filed under miscellaneous are Steve Smoot's dissertation and a paper titled VoIP Performance Monitoring.
One of the papers pointed me at a traffic analysis tool called NetScope that was developed by AT&T Labs-Research. I found a paper describing this tool at Papers on Internet Traffic Analisys (yes, analysis was spelled wrong). There's some good information at both of these sites which I'll probably want to follow up on later. But I don't know if AT&T Labs-Research will actually let outsiders access their tools or research data. Their web site seems more overview than details; perhaps contacting some of the people there directly would be fruitful.
However, I still haven't really found anything that specifically correlates traffic characteristics with video quality. I suppose we'll have to generate models and run simulations and tests to determine this ourselves, given a specific video system. From scratch.
August 25, 2002
So I used CiteSeer and the references or bibliography sections of those papers I found earlier about perceptual video quality metrics, and I now have about 25 papers in total. Some of them deal with techniques to measure video quality and some with measuring network traffic properties. I also tried to find papers specifically looking at the impact of network traffic properties on video quality, and while I think I found a few, these were harder to come across.
Steve Smoot's dissertation is about maximizing video quality given bit-rate constraints, but just about everything he references deals with video signal processing rather than the relationship between network traffic properties and video quality. Only a few of the references deal with how to measure perceptual video quality, so I don't think much of what he references will help. A couple looked a little encouraging, but I wasn't able to actually find the papers themselves, and one or two are books, not papers. CiteSeer didn't have his dissertation in its database, so I couldn't find papers that cite it. Maybe I'm missing something here, since Ketan specifically pointed me at Smoot's dissertation.
Some links which I think will be helpful to me later on are:
- MPEG-2 Overview by Dr. Gorry Fairhurst.
- A Beginner's Guide for MPEG-2 Standard by Victor Lo.
- MPEG FAQs
I need to read up on MPEG so I know more about the codec.
August 23, 2002
Got FireWire Cam
I got an ADS Pyro 1394 webcam from Herman Towles, a Senior Research Associate here at UNC Chapel Hill. This will let me finish up video capture for vic in Mash. I'm running it on my iBook with the 30-minute trial drivers from IOXperts.
So I connected it up and started poking around on Apple's developer site looking for those things I dumped into my QuickTime Video Capture Archive Note; particularly the Video Digitizer implementation. So I have a starting point there, but I need to read through the Video Digitizer documentation more.
August 22, 2002
Perceptual Video Quality Metrics Papers
At Ketan's suggestion, I've started collecting papers and information about perceptual video quality metrics. These papers will give us a starting point for figuring out how to associate network statistics like delay and jitter with perceived video quality.
He pointed me at Steve Smoot's dissertation. I also found the Video Quality Research site of the System Performance Standards Group, part of the Institute for Telecommunication Sciences. They have a bunch of papers there dealing with the objective measurement of audio and video.
I also found a couple of papers about measuring perceived visual quality based on the Sarnoff Just-Noticeable Difference (JND) vision model. This model looks interesting because it does a better job of calculating perceived differences than a simple difference map or signal-to-noise calculations.
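For contrast, the "simple" baseline those papers improve on is easy to write down: a plain PSNR computation, which treats every pixel error identically regardless of perceptual context:

```python
# Plain PSNR: the simple signal-to-noise baseline that perceptual models like
# the Sarnoff JND improve on. Every pixel error is weighted equally, with no
# notion of visual masking or busyness.

import math

def psnr(original, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two pixel sequences."""
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    mse /= len(original)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

identical = psnr([100.0, 50.0], [100.0, 50.0])
off_by_one = psnr([100.0, 50.0], [101.0, 51.0])
```

Because PSNR is blind to *where* an error occurs, two images with the same PSNR can look very different, which is exactly the gap a JND-style model tries to close.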
Now I'm going to use CiteSeer to find more papers related to those I mentioned above. I've got about ten right now that seem relevant at first glance, and CiteSeer will probably give me several more.