September 28, 2002
TONIC & Vidcap Machines
Ketan noted that it's very difficult to do range lookups or searches, since Chord is a distributed hash table and that's not what hash tables are good for. But as with databases, you can usually address this problem by adding a supplemental lookup table which lets you traverse the keys in some sorted order. Someone at Duke who was at TONIC today is actually working on this.
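To make that fix concrete: consistent hashing scatters keys around the identifier ring, destroying key order, so a range query has to go through a separately maintained sorted index. A minimal single-node sketch (the names here are mine, not from the Chord paper):

```python
import bisect
import hashlib

def chord_id(key, m=16):
    """Map a key onto a 2^m-slot Chord identifier ring (SHA-1, truncated)."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** m)

class RangeIndex:
    """Supplemental lookup table kept alongside the DHT: plain keys in sorted
    order, so range queries (which the hash ring alone cannot answer) work."""
    def __init__(self):
        self.keys = []                      # kept in sorted order

    def insert(self, key):
        bisect.insort(self.keys, key)       # maintain sorted order on insert

    def range(self, lo, hi):
        """Return every indexed key k with lo <= k <= hi."""
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_right(self.keys, hi)
        return self.keys[i:j]
```

In a real deployment the index itself would have to be distributed, which is presumably the problem the Duke work addresses.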
Amyn argued that having everything distributed everywhere doesn't make much sense, and that it is more productive to specifically choose nodes to provide your service based on the capabilities of that node. I don't completely disagree (ref. William Gibson's Idoru), but I do think there are real applications which would directly benefit from Chord. Namely, it would be great to have my home running off a DHT like Chord in conjunction with Zeroconf. That way, plugging in my new WD1200JB would add 120GB to every single system on my network, instead of just Binibik. I bought this drive because I needed more home directory space. But I am also running out of local disk space on my workstations. Adobe Photoshop could really benefit from something like this. The question of what happens when a disk goes down can already be addressed with RAID principles. A harder question is what happens when you take a machine somewhere else where it no longer has access to the "local network disk" (I will coin this term to refer to a local disk actually built on a network; i.e. not really local but also not remote). I think you would have to be able to specify specific blocks of data as guaranteed local, and then this would work.
In other news, the Vidcap machines were taken out of the DiRT lab to get partitioned for Windows 2000 and FreeBSD. I will be installing FreeBSD onto a second partition after those machines are returned to the lab.
September 25, 2002
I had my weekly meeting with Ketan this afternoon. Not much to talk about since I haven't done much this past week with everything else that's been going on. But an interesting idea came up while we were talking about measuring quality with different frame rates. Originally I was wondering whether or not there would be a problem dumping 30fps to disk. But Ketan started wondering how things would look if we dumped frames at varying frame rates, and how that would affect the quality.
Anyway, I have to start setting up the test environment for video capture in the DiRT lab. We've got two machines, Vidcap3 and Vidcap4, attached to cameras in Sitterson 155. I'm going to install Windows and FreeBSD on these as dual-boot systems. We need the FreeBSD side for dummynet, and the Windows side both for the VQM Software and because David Stotts wants to run some experiments which currently look like they need Windows.

September 23, 2002
Presenting Chord at TONIC
Ketan's asked me to present Chord this Friday when TONIC will be held at UNC. The Chord project was developed by Ion Stoica, et al. and is described in Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications. (Note that this version of the paper is different from the one listed in CiteSeer and on the TONIC home page--I like this version better.)
Since I need to make a few PowerPoint slides to describe Chord, I just downloaded OpenOffice. Unfortunately, OpenOffice is pretty large and the new file server hard disk (WD1200JB) I ordered has not arrived yet, so I had to install OpenOffice on Orlando instead of Binibik. Since the WD1200JB will actually end up being larger than my current 80GB backup drive, sometime in the future I will have to purchase a new backup drive around 320GB in size. If you're interested in hard disk information, visit StorageReview.com. They have a WD1200JB review, which puts this drive on their leaderboard among 7200RPM IDE drives.
September 22, 2002
Continue as Planned
Ketan and I talked a little bit yesterday while going to the TONIC meeting at Duke. Since it's quite possible that the VQM Software will become available in the near future, and we still need to set up our research infrastructure, we're going to go ahead as planned with the hope that the software will be made available by the time we are ready to use it.
The first thing I am going to do is to install Open Mash vic on Effie and transmit test video through dummynet to make sure I fully understand how to use dummynet. I've already read through the simple documentation and it's basically delay and drop targets in ipfw. If I use dummynet correctly, I should see horrible video on a vic receiver.
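From the documentation, the whole exercise should come down to a handful of ipfw commands on the FreeBSD router (the numbers here are only illustrative, and this needs root):

```shell
# Load dummynet if it is not compiled into the kernel
kldload dummynet

# Send all IP traffic through pipe 1
ipfw add pipe 1 ip from any to any

# Impair the pipe: 1 Mbit/s bottleneck, 100 ms delay, 5% packet loss
ipfw pipe 1 config bw 1Mbit/s delay 100ms plr 0.05
```

With something like the last line in place, video relayed through the gateway should look suitably horrible on the vic receiver.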
September 19, 2002
VQM Software Part III
I got a response from Stephen Wolf just now and unfortunately he can't say with any degree of certainty what sort of delay we're looking at regarding the no-cost evaluation agreement for the VQM Software. It's up to the lawyers, however long they take. He will get back to me when the software is available, but we may have to pursue alternative means of video quality evaluation for now. It would be unfortunate, though, to spend a while working on an alternative only to have the VQM Software become available shortly after.
September 18, 2002
VQM Software Part II
At Ketan's suggestion, I fired off another email to Stephen Wolf asking when he thinks the evaluation agreement for the VQM Software will be ready. If the lawyers are going to take too long, then we will probably have to look at an alternative evaluation solution.
In the meantime, I will try to get familiar with dummynet since we will probably use it to simulate end-user links during the video quality evaluation process on the Oracle.
VQM Software (pending)
Yesterday I emailed Stephen Wolf at the Institute for Telecommunication Sciences about getting access to their VQM Software for my video quality prediction research. I received a response this morning, and unfortunately their lawyers are still working on the no-cost evaluation agreement which we need to sign before we can download the software. Stephen says they should be done soon, so hopefully this won't present too much of a delay. I need to get the cameras in 150 Sitterson ready for capture first anyway, but we do need the VQM Software in order to conduct the actual experiments.
September 17, 2002
Non-DMA Video Digitizing
Turns out the reason VDSetPlayThruDestination() was returning -2208 (noDMA) is that you can't do DMA when using an external device like a FireWire camera. Obviously. VDSetPlayThruDestination() should be used for video capture cards on the PCI bus, but a different sequence of video digitizer API calls is needed for interfacing with USB and FireWire cameras.
I found a post in Apple's quicktime-api list archives by Steve Sisak from back on June 16, 2000 which explains this and also provides the correct video digitizer API call sequence. These are API calls I hadn't looked at before, so I'm reading through them as I try to add them to the Mac OS X video capture code in Open Mash. I'm thinking carefully about the best way to do this because the frames need to be grabbed asynchronously: vic shouldn't eat up CPU cycles, and the user should be able to keep working while a frame is being grabbed, which means I can't just sleep until the frame arrives. I think it will work if VDCompressOneFrameAsync() is called once when starting the capture device and then called again after each successful frame grab.
September 16, 2002
NCNI at MCNC
One thing which was suggested was to pipe reconstructed video through a video card's TV-out port, then re-digitize it for analysis. The reasoning is that this would let you run the oracle on any codec or application without needing source code to dump the frames to disk. On one hand, you'd essentially be converting the reconstructed video to NTSC before analysis, which would introduce additional artifacts. On the other hand, a number of participants will be displaying received video on television or projector systems anyway.
First step now is to try and get some YUV data to run through the free VQM Software and get some idea of how long it takes to actually run a simulation. Ketan and I were thinking it would be possible to grab YUV frames out of Open Mash vic quite easily.
At the moment, I'm installing FreeBSD on Effie.
September 15, 2002
Well, I've put all the basic video digitizer commands into the Mac OS X video capture class for Open Mash. The initial setup appears to be working, but I'm getting a -2208 error message when I try to set the video digitizer's capture buffer via VDSetPlayThruDestination. This indicates that the location of the destination PixMap is wrong for what I'm trying to do. I'll have to dig around and figure out why. I recall seeing somewhere what situations would return -2208 (noDMA) but I can't find where that is anymore. So many Inside Macintosh PDFs open.
September 13, 2002
Finished Imaging With QuickDraw
Just finished reading through the parts of Imaging With QuickDraw that I think are relevant to the data structures and subroutines I will need to use with the Video Digitizers to access the actual frame buffer of a captured frame. It's still a bit unclear to me how to actually use the Video Digitizer API in a correct sequence of commands that will let me set up the frame buffers, for example. Hopefully that will be fairly straightforward with minimal API calls.
September 11, 2002
Preparing for NCNI
I met with Ketan just a little while ago and we talked about the NCNI meeting that's coming up this Monday at 9am. We need to present our findings and propose a solution for predicting video quality based on network measurements. I need to write up a one or two page hand-out for the meeting that does this.
Ketan agrees with me that the best video quality model (VQM) to go with is the one detailed in NTIA Report 02-392. This model is the result of years of research by the Institute for Telecommunication Sciences and has several advantages over the other two models I came across:
- The model is fully specified in the 120 page report. All of the mathematics and formulas are described in detail.
- The document provides specific quality evaluation formulas for broadcast and video conferencing video streams.
- VQM Software is available free for research purposes.
The other half involves implementing what Ketan has named the <music type="spooky">Oracle</music>. This is the deployment architecture he thinks would make a killer impact on the use of our research. Basically, it would let end-users run a very lightweight application to make link measurements between themselves. They can then send this information to the <hands movement="wavy">Oracle</hands> along with information about the type of video they will be transmitting, and receive a prediction that says whether the video quality will be good or bad. The <sound source="thunder">Oracle</sound> machine would be a central server in our control which would use the provided link data to determine video quality, based on one or more specific test streams. One way to do this would be to actually transmit the test video to itself over a link conforming to the provided network data, then run the reconstructed video through the VQM Software.
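As a rough sketch of that flow (every name, field, and threshold here is hypothetical; the real scoring step would replay a test stream through a shaped link and run the VQM Software on the result):

```python
def measure_link():
    """End-user side: a lightweight probe of the path (illustrative values)."""
    return {"loss": 0.02, "delay_ms": 80, "jitter_ms": 12}

def oracle_predict(link, video_type, score_fn):
    """Oracle side: score `video_type` under the reported link conditions and
    reduce the result to a simple good/bad verdict for the end-user."""
    score = score_fn(link, video_type)          # stand-in for the actual VQM run
    return "good" if score >= 3.5 else "bad"    # threshold on a 1-5 scale

# Example: a stub scorer that only penalizes packet loss.
verdict = oracle_predict(measure_link(), "h263+", lambda l, v: 4.5 - 20 * l["loss"])
```

The point of the split is that only the server needs the heavyweight VQM machinery; the end-user runs nothing but the probe.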
Click to Meet Architecture
Yesterday John Lachance, a Systems Engineer for First Virtual Communications, called me back to talk about how their Click to Meet product works. I had contacted FVC about which H.263+ annexes are used in Click to Meet, and was not sure if the response I received from Steve Barth also applied to their web clients. John was able to answer this for me.
FVC uses a client/server architecture for their Click to Meet product family. How it works is that you install their server software on a single machine which is your network's H.323 point of access. All of your video conferencing data will go through this server. End-users point Internet Explorer (5.5+) at the conference and download an ActiveX Control which will talk to the Click to Meet server using a version of the Cu-SeeMe protocol.
So basically the answer is that the Click to Meet web clients will end up using the F and P.5 annexes used by the server.
September 10, 2002
The past few days I've been trying to read through all the documentation related to QuickTime and Video Digitizers. I can get through the Video Digitizer API right now, but there are too many parts of it which I don't understand since it's a low-level API and there are no examples for using Video Digitizers to capture images. So right now I'm concentrating on QuickDraw, which is the legacy drawing API used by QuickTime.
September 6, 2002
I've read through the three visual quality model (VQM) papers I mentioned before, in more detail this time, to try and figure out exactly what needs to be measured and computed, and also how useful each model will be.
The Perceptual Image Distortion model doesn't appear to be particularly useful. From what I gather, they developed a model which closely matches empirical data gathered by other researchers, but did not actually conduct any human perceptual studies on images to determine the real-world usefulness of this model. Nor is there really any rating scale which could be applied to describe an image's perceived quality, unless you just want to throw around their calculated numbers. This model doesn't seem very promising, and they do state in their conclusion that future work needs to be done to correlate their model with some perceived quality.
The JNDmetrix people, however, conducted real-person studies with trained image analysts and other people to compare how well JND values correlate to perceived quality, and also how badly root-mean-squared error fares in the same tests. In short, the JNDmetrix VQM works extremely well at providing easy to understand ratings of an image's quality. They've won an Emmy for their work and the JNDmetrix is in use by the broadcast industry. Unfortunately, their description of how the JND values are computed comes without hard numbers or formulas. So I'll either need to contact them to see if they are willing to reveal their algorithms, or implement my own interpretation of what is discussed in their paper. Or purchase their JNDmetrix products. I'm not very optimistic about them giving me the algorithms, since they sell the JNDmetrix through their products. At least there was no mention of a patent on this technology.
The third VQM is the one documented in NTIA Report 02-392. This paper is extremely detailed and provides all of the information you need to implement and use their model, including suggested parameter values, calculation descriptions, and equations specific to the type of video sequence involved. Their model also applies specifically to video sequences, and not individual images. All the equations appear to be linear, which is a good thing. It is specifically stated that this information is available for anyone to use anywhere for whatever. Like with the JNDmetrix, this model also has a specific number which can be used to compare the quality of two video sequences. Right now, this VQM looks like the best choice.
None of these models specifically tie any sort of network characteristics to the quality calculation, or deal with what type of artifacts were detected in the reconstructed frames. So, I'm guessing we'll have to go with actual calculations on sample video streams in a simulated network.
DCT Quality for NCNI (Tyler Johnson)
I actually started thinking about this right off the bat, because I really don't like the idea of having to run actual video or even fake data with video characteristics every single time an end-user wants to test their connection. After all, what's the difference between that and just firing up a video conferencing session anyway? And it also requires both end-points to cooperate and maybe install something. I happen to be one of the few computer geeks who places a really high value on end-user convenience and usability. (I also use Mac OS X as my workstation of choice, which I guess reinforces that.)
What I am currently considering as the best option is to run tests on a simulated network we have control over (I understand there are tools for simulating network characteristics on FreeBSD routers/gateways) for some different video stream types (e.g. H.263+ with common annexes, MPEG-2, H.263, specific tools) and using that to generate a model which represents "quality" based on the network characteristics. Either that, or use probability and the known effects of loss, delay, or jitter on specific bits of data to generate a probabilistic model. The latter is more complex though.
Then, given those models, the end-user could simply run some simple ping-type tests with different packet sizes to figure out the current characteristics of their connection and then match that up with the pre-computed model associated with their codec/application.
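A sketch of what that matching could look like, with an entirely made-up pre-computed model for one codec (loss only; a real model would also take delay and jitter as inputs):

```python
import bisect

# Hypothetical pre-computed model for one codec/application: measured packet
# loss rate -> predicted quality on a 1-5 scale (values invented).
LOSS_TO_QUALITY = [(0.00, 4.5), (0.01, 4.0), (0.05, 3.0), (0.10, 2.0)]

def predict_quality(loss):
    """Linearly interpolate the measured loss rate over the pre-computed model."""
    xs = [x for x, _ in LOSS_TO_QUALITY]
    ys = [y for _, y in LOSS_TO_QUALITY]
    if loss <= xs[0]:
        return ys[0]
    if loss >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, loss)           # bracketing interval
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (loss - x0) / (x1 - x0)
```

The end-user's ping-type tests would supply the loss number; everything expensive happens ahead of time when the table is built.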
Ketan suggested an interesting approach to this where an end-user could point their web browser to a special server that would be able to run these special ping-type tests. This would not require an end-user to even know what pinging means. Of course, this only observes the connection between the end-user and the special server, not between the two end-users who will be communicating. But it's better than nothing. It would also be possible to conduct this test several times each day over a few weeks to generate aggregate characteristics for a specific end-user and let them know what they can expect, average-case (maybe also best- and worst-case).
On Friday, September 6, 2002, at 07:14 AM, Tyler Johnson wrote:
Wesley I did a cursory review of the information you sent out regarding evaluating quality for compressed video. I suspect we are going to face an architectural decision and I would like to get you thinking about it now.
I see two possible approaches to predicting whether a particular network link will perform well for compressed video. The first is to relate network performance to perceptual quality. So we would say, for example, 5% packet loss causes sufficient tiling at target data rates/codecs that most people would regard as unusable. This would be an extremely convenient thing to do. Perhaps there would be a plug-in architecture that lets you insert your own rating scales. The down side is of course that every codec implementation is different and deals with packet loss, jitter, buffering in ways that could render those correlations meaningless.
The other approach would be to actually send sample video and analyze it on the other end. On the surface, this would seem to ensure dead accurate results. The downside is that the tool would be bigger with more components. I also wonder if we generate sample video, do we have the same problem as approach #1 in terms of different codecs behaving differently than the sample.
I would like for you to think about these issues and what the tradeoffs might be to going down either path. Perhaps there is some other approach?
Click to Meet Annexes
Steven Barth from First Virtual Communications was good enough to get back to me with which annexes their conference server supports:
- Annex F: Advanced Prediction Mode (reduces artifacts)
- Annex P.5: Factor-of-4 resampling (allows resizing)
Searching for H.263
I tried to find some information on the H.263+ options commonly used in video conferencing software. The most popular application looks like Microsoft NetMeeting, no doubt in large part because it's bundled with every version of Windows. Cu-SeeMe doesn't look like it officially exists anymore, but I did find Click to Meet by First Virtual Communications. It looks like FVC used to be in charge of Cu-SeeMe but now sells Click to Meet. The free iVisit application also looks pretty popular, based on their discussion forum. It was developed by Tim Dorcey, the creator of Cu-SeeMe. Unfortunately, I could not find out anything about what H.263+ options these applications support. I've emailed FVC and the iVisit people asking if they would be able to provide me with this information. I'm not even sure if NetMeeting implements H.263+, since everything only refers to H.263.
But there's also Polycom, which I know about from my work at the Berkeley Multimedia Research Center. They are good enough to make their technology white papers available, and list the H.263+ annexes and options implemented in their iPower video codec:
- Annex F: Advanced Prediction
- Annex J: Deblocking
- Annex N: Reference Picture Selection
- Annex P: Reference Picture resampling by a factor of 4
- Appendix I: Error Tracking
I also found two more papers. The first provides a bunch of statistical models for variable-bit-rate video. This might be useful later on when we need to develop our own video stream models for traffic analysis. The second compares PSNR video quality of H.263 with different packet-loss protection schemes. Interesting that they found H.263 PSNR decreases linearly as packet loss increases.
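For reference, the PSNR they're measuring is the standard calculation over 8-bit samples; a minimal per-frame sketch:

```python
import math

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length 8-bit frames,
    given as flat sequences of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")     # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)
```

If PSNR really does fall off linearly with loss, a pre-computed model for H.263 would be especially simple to build.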