Deriving video content type from HEVC bitstream semantics

James Nightingale, Qi Wang, Christos Grecos, Sergio R. Goma

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution book

8 Citations (Scopus)

Abstract

As network service providers seek to improve customer satisfaction and retention levels, they are increasingly moving from traditional quality of service (QoS) driven delivery models to customer-centred quality of experience (QoE) delivery models. QoS models consider only metrics derived from the network; QoE models, however, also consider metrics derived from within the video sequence itself. Various spatial and temporal characteristics of a video sequence have been proposed, both individually and in combination, to derive methods of classifying video content either on a continuous scale or as a set of discrete classes. QoE models can be divided into three broad categories: full-reference, reduced-reference and no-reference models. Because they require the original video to be available at the client for comparison, full-reference metrics are of limited practical value in adaptive real-time video applications. Reduced-reference metrics often require metadata to be transmitted with the bitstream, while no-reference metrics typically operate in the decompressed domain at the client side and require significant processing to extract spatial and temporal features. This paper proposes a heuristic, no-reference approach to video content classification which is specific to HEVC-encoded bitstreams. The HEVC encoder already makes use of spatial characteristics to determine the partitioning of coding units and temporal characteristics to determine the splitting of prediction units. We derive a function which approximates the spatio-temporal characteristics of the video sequence by using the weighted averages of the depth at which the coding unit quadtree is split and of the prediction mode decision made by the encoder to estimate spatial and temporal characteristics respectively.
Since the video content type of a sequence is determined by using high-level information parsed from the video stream, spatio-temporal characteristics are identified without the need for full decoding and can be used in a timely manner to aid decision making in QoE-oriented adaptive real-time streaming.
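
The weighted-average idea described in the abstract can be illustrated with a minimal Python sketch. It is not the function derived in the paper, whose exact form and weights are not given here. The CodingUnit record, the area weighting, the MODE_SCORE values, and the assumption of a 64x64 coding tree unit with a maximum quadtree depth of 3 are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class CodingUnit:
    """Hypothetical record of one coding unit parsed from the bitstream."""
    depth: int      # quadtree depth of the CU (0 = whole CTU, 3 = smallest CU)
    pred_mode: str  # "skip", "inter" or "intra"


# Assumed values, not taken from the paper: a 64x64 CTU with maximum depth 3,
# and an illustrative score per prediction mode (higher = more temporal change).
CTU_SIZE = 64
MAX_DEPTH = 3
MODE_SCORE = {"skip": 0.0, "inter": 0.5, "intra": 1.0}


def cu_area(depth: int, ctu_size: int = CTU_SIZE) -> int:
    """Area in pixels of a square CU at the given quadtree depth."""
    side = ctu_size >> depth
    return side * side


def content_indices(cus: Iterable[CodingUnit]) -> Tuple[float, float]:
    """Return (spatial_index, temporal_index), each in the range [0, 1].

    spatial_index  = area-weighted mean CU depth, normalised by MAX_DEPTH
    temporal_index = area-weighted mean of the prediction mode scores
    """
    total_area = 0
    depth_sum = 0.0
    mode_sum = 0.0
    for cu in cus:
        if not 0 <= cu.depth <= MAX_DEPTH:
            raise ValueError(f"depth out of range: {cu.depth}")
        area = cu_area(cu.depth)
        total_area += area
        depth_sum += area * cu.depth
        mode_sum += area * MODE_SCORE[cu.pred_mode]
    if total_area == 0:
        raise ValueError("no coding units supplied")
    return depth_sum / (total_area * MAX_DEPTH), mode_sum / total_area


if __name__ == "__main__":
    # One unsplit, skipped CTU plus one CTU split into four inter-coded CUs.
    sample = [CodingUnit(0, "skip")] + [CodingUnit(1, "inter")] * 4
    spatial, temporal = content_indices(sample)
    print(f"spatial={spatial:.3f} temporal={temporal:.3f}")
```

Both indices depend only on syntax elements that a parser can read without reconstructing pixels, which is the property the abstract relies on. Mapping the two values to discrete content classes would need thresholds that are not stated here.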

Original language: English
Title of host publication: Real-Time Image and Video Processing 2014
Number of pages: 13
Volume: 9139
DOIs: https://doi.org/10.1117/12.2051757
Publication status: Published - 1 Jan 2014
Event: Real-Time Image and Video Processing 2014 - Brussels, Belgium
Duration: 16 Apr 2014 to 17 Apr 2014

Conference

Conference: Real-Time Image and Video Processing 2014
Country: Belgium
City: Brussels
Period: 16/04/14 to 17/04/14

Fingerprint

Semantics
Metric
Encoder
Quality of Service
Unit
Coding
Mode Decision
Quadtree
Customer Satisfaction
Model
Prediction
Weighted Average
Reference Model
Metadata
Streaming
Decoding
Partitioning
Customers

Keywords

  • content type classification
  • HEVC
  • QoE
  • video streaming

Cite this

Nightingale, J., Wang, Q., Grecos, C., & Goma, S. R. (2014). Deriving video content type from HEVC bitstream semantics. In Real-Time Image and Video Processing 2014 (Vol. 9139). [913902] https://doi.org/10.1117/12.2051757
@inproceedings{f31c14612daf49ccbeb87adb115b250b,
title = "Deriving video content type from HEVC bitstream semantics",
keywords = "content type classification, HEVC, QoE, video streaming",
author = "James Nightingale and Qi Wang and Christos Grecos and Goma, {Sergio R.}",
year = "2014",
month = "1",
day = "1",
doi = "10.1117/12.2051757",
language = "English",
isbn = "9781628410877",
volume = "9139",
booktitle = "Real-Time Image and Video Processing 2014",

}

TY - GEN

T1 - Deriving video content type from HEVC bitstream semantics

AU - Nightingale, James

AU - Wang, Qi

AU - Grecos, Christos

AU - Goma, Sergio R.

PY - 2014/1/1

Y1 - 2014/1/1

KW - content type classification

KW - HEVC

KW - QoE

KW - video streaming

UR - http://www.scopus.com/inward/record.url?scp=84902504204&partnerID=8YFLogxK

U2 - 10.1117/12.2051757

DO - 10.1117/12.2051757

M3 - Conference contribution book

AN - SCOPUS:84902504204

SN - 9781628410877

VL - 9139

BT - Real-Time Image and Video Processing 2014

ER -
