No Reference Metrics (NORM)
The No Reference (NR) Metrics group (NORM) is an open collaborative for developing NR metrics for monitoring visual service quality. NORM encourages knowledge sharing on all aspects of NR metric research and development. Current projects include:
- Open source NR metric
- SI and TI clarification and improvement
- Video quality metadata standard
NORM coordinates its work on the VQEG reflector (firstname.lastname@example.org).
Conference calls are announced on the NORM reflector and listed on the main VQEG webpage.
#1) NR Metric Development—What Is Our Design Goal?
NR metrics use the decoded video at the point of measurement to estimate:
- Mean opinion score (MOS), to assess the overall quality
- Root cause analysis (RCA), to identify the impact of specific visual impairments
We seek a broad scope: camera capture, encoding, decoding, transcoding, scaling, transmission, aesthetics and artistic intent, image enhancement, monitor, and display device.
Users must be able to modify the scope to ignore specific impairments. For example, broadcasters want MOS to ignore artistic intent and aesthetics. The omitted factors should not influence MOS.
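The configurable scope described above can be sketched as a toy aggregator: RCA produces per-impairment penalties, and the user supplies weights that define which impairments count toward MOS. The function, penalty values, and weight names below are all illustrative, not part of any NORM specification.

```python
def predict_mos(impairments: dict, weights: dict) -> float:
    """Toy MOS aggregator: start from perfect quality (5.0) and subtract
    weighted RCA penalties. Setting a weight to zero removes that
    impairment from the MOS scope, as a broadcaster might for artistic
    intent and aesthetics."""
    mos = 5.0 - sum(weights.get(k, 0.0) * v for k, v in impairments.items())
    return max(1.0, min(5.0, mos))  # clamp to the 5-point MOS scale

# Hypothetical RCA output and two scope configurations:
rca = {"blockiness": 1.2, "blur": 0.5, "artistic_intent": 2.0}
scope_all = {"blockiness": 1.0, "blur": 1.0, "artistic_intent": 1.0}
scope_broadcast = {"blockiness": 1.0, "blur": 1.0, "artistic_intent": 0.0}
```

With `scope_broadcast`, the deliberate stylization no longer depresses the predicted MOS, while the compression artifacts still do.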
The performance goals are robust accuracy with a minimum of operational restriction. Use cases include Video on Demand (VoD), live broadcast services, social media, first responder video, medical, and AI vision systems (autonomous vehicles). Access to the bitstream is beneficial, but not required.
NORM is an open collaborative group that intends to make all of its work public, royalty free.
- No reference resources here.
- This Google Sheet coordinates our work to identify metric components.
- This presentation from 2021 identifies datasets for training NR metrics.
#2) Improved Complexity Metric
This effort began by clarifying SI and TI. The ITU-T Rec. P.910 definitions of SI and TI contained ambiguities; agreement was reached on how to eliminate them, and our proposal was submitted to ITU-T Study Group 12 to revise P.910.
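The baseline P.910 definitions are simple: SI is the maximum over time of the spatial standard deviation of the Sobel-filtered luma, and TI is the maximum over time of the spatial standard deviation of the frame difference. A minimal NumPy sketch follows; the border handling (valid region only) is one possible choice among those the original P.910 text left ambiguous.

```python
import numpy as np

def sobel_magnitude(frame: np.ndarray) -> np.ndarray:
    """Sobel gradient magnitude of a luma frame, valid region only.
    Border treatment was one of the ambiguities in the original P.910 text;
    this sketch simply excludes the one-pixel border."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    f = frame.astype(np.float64)
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    for i in range(1, f.shape[0] - 1):
        for j in range(1, f.shape[1] - 1):
            win = f[i - 1:i + 2, j - 1:j + 2]
            gx[i, j] = np.sum(win * kx)
            gy[i, j] = np.sum(win * ky)
    return np.sqrt(gx ** 2 + gy ** 2)[1:-1, 1:-1]

def si_ti(frames):
    """SI = max over time of std-dev of Sobel-filtered luma.
       TI = max over time of std-dev of successive frame differences."""
    si = max(float(np.std(sobel_magnitude(f))) for f in frames)
    ti = max(float(np.std(b.astype(np.float64) - a.astype(np.float64)))
             for a, b in zip(frames, frames[1:]))
    return si, ti
```

A flat, static clip yields SI = TI = 0; any spatial edge raises SI, and any temporal change raises TI.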
Work continues on developing an improved metric that assesses the coding complexity of videos. Design specifications include:
- Very lightweight algorithm (low computational complexity)
- Estimates a curve that relates coding complexity to bit-rate
- Includes motion estimation
- Question #1: calculate the convex hull, to choose an optimal compression resolution and bitrate
- "VMAF target of X requires Y MBit/s at resolution Z"
- Question #2: extrapolate the impact of reducing resolution or bitrate
- "We are at bitrate X and we want to reduce the bitrate to X/2. How much will the quality drop?"
- This is question #1 applied to a video that already has compression artifacts
- Question #3: a managed network only has access to compressed videos, and wants to understand the impact of increasing or decreasing bandwidth allocations
- "How much would the quality change (up or down) if we increase or decrease the bandwidth?"
- This question assumes that the head end could adjust its behavior to take advantage of increased bitrates, and that video complexity is somewhat stable over time
- Vision of training libraries to enable research
- Videos without scene cuts
- FR metric ratings (e.g., VMAF)
- Training features (e.g., from Motion search repository)
- Variety of resolutions and bitrates of interest to modern paid video services (VMAF 60 to 95)
- Access to the source content is optional, if the other data is available
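Question #2 above amounts to rate-quality extrapolation. One common simplification (an assumption here, not a NORM decision) is to model quality as logarithmic in bitrate, q = a + b·ln(r), fit the model to a few measured (bitrate, quality) points, and read off the predicted change when the bitrate is scaled:

```python
import math

def fit_log_rq(points):
    """Least-squares fit of quality = a + b*ln(bitrate) to (bitrate, quality)
    pairs, e.g. (kbit/s, VMAF) measurements from earlier encodes."""
    xs = [math.log(r) for r, _ in points]
    ys = [q for _, q in points]
    n = len(points)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def predicted_quality_change(points, factor):
    """Predicted quality delta when bitrate is scaled by `factor`
    (e.g., 0.5 for halving). Under the log model the delta is
    b * ln(factor), independent of the starting bitrate."""
    _, b = fit_log_rq(points)
    return b * math.log(factor)
```

With measurements at three bitrates, `predicted_quality_change(points, 0.5)` answers "how much will the quality drop if we halve the bitrate?"; a negative result is a drop. A real metric would also condition on the content's coding complexity, which is precisely what this project aims to estimate.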
#3) Video Quality Metadata Standard
Full reference video quality metrics are readily available in most modern transcoding pipelines. Including full-reference metric scores as metadata in compressed bitstreams would take very little space and provide a more accurate and "green" way of estimating source video quality.
To realize this vision, we must establish a standard format to save such metadata at both elementary video bitstream level and system layer. Both hardware (device) makers and service providers have a lot to gain by offering such metadata in their compressed bitstreams.
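To make the idea concrete, the sketch below packs per-frame FR scores into a small compressed payload that a muxer could carry in a user-data or SEI-style message. The field names and layout are purely illustrative; defining the actual format is exactly the standardization work this project proposes.

```python
import json
import zlib

def pack_quality_metadata(metric: str, version: str, frame_scores):
    """Serialize per-frame FR metric scores (e.g., VMAF computed at
    transcode time) into a compact binary blob. Field names are a
    hypothetical sketch, not a ratified metadata standard."""
    payload = {
        "metric": metric,      # e.g., "VMAF"
        "version": version,    # model/tool version used at transcode time
        "scores": [round(float(s), 2) for s in frame_scores],
    }
    return zlib.compress(json.dumps(payload).encode("utf-8"))

def unpack_quality_metadata(blob: bytes) -> dict:
    """Recover the metadata downstream, instead of re-running an NR metric."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Even a long clip's scores compress to a few kilobytes, which is negligible next to the video payload; a downstream monitor reads the stored scores rather than estimating quality from the decoded pixels.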
NR metrics would still be needed in situations like the following:
- In the camera front-end, to estimate quality of raw input
- Legacy content (video quality metadata unavailable)
- Some video broadcasting applications (e.g., transmission over
- Non-transcoding image/video applications (e.g., editing, image enhancement)