Support for multiple clock rates in an RTP session
(Unaffiliated)
petithug@acm.org
The usage of multiple clock rates in an RTP session is currently underspecified.
This document lists several ways to fix this problem and is meant as a basis for discussion.
The clock rate is a parameter of the payload format.
It is often defined as being the same as the sampling rate, but this is not always the case (see e.g. the G722 and MPA audio codecs in "RTP Profile for Audio and Video Conferences with Minimal Control").
An RTP sender can switch between different payloads during the lifetime of an RTP session, and because clock rates are defined by payload types, it is possible that the clock rate also varies during an RTP session.
Changing the clock rate during an RTP session is not a problem for the RTP receiver, as it always knows the clock rate associated with a specific RTP packet.
The RTP receiver also has no problem calculating a clock rate independent interarrival jitter.
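As an illustration of a clock rate independent interarrival jitter, the sketch below keeps the jitter in seconds instead of timestamp units. It reuses the 1/16 smoothing gain of the RTP interarrival jitter estimator. The names `signed_ts_delta` and `JitterEstimator` are hypothetical, and the choice to convert each timestamp delta with the clock rate of the packet that has just arrived is an assumption of this sketch, not something this document specifies.

```python
def signed_ts_delta(newer, older):
    """Difference of two 32-bit RTP timestamps, allowing for wrap-around."""
    delta = (newer - older) & 0xFFFFFFFF
    return delta - 0x100000000 if delta >= 0x80000000 else delta


class JitterEstimator:
    """Interarrival jitter kept in seconds rather than in timestamp units."""

    def __init__(self):
        self.jitter = 0.0
        self._last = None  # (arrival time in seconds, RTP timestamp)

    def on_packet(self, arrival, rtp_timestamp, clock_rate):
        """Update the estimate with one RTP packet and return it in seconds."""
        if self._last is not None:
            last_arrival, last_timestamp = self._last
            # Assumption: the timestamp delta is expressed in the clock rate
            # of the packet that has just arrived.
            media_delta = signed_ts_delta(rtp_timestamp, last_timestamp) / clock_rate
            d = (arrival - last_arrival) - media_delta
            self.jitter += (abs(d) - self.jitter) / 16.0
        self._last = (arrival, rtp_timestamp)
        return self.jitter
```

Because the result is in seconds, it does not depend on which clock rate was in use when a given packet was sent.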
The problem is with reports carried in RTCP packets that contain fields using units based on the clock rate.
Because the RTCP packets do not contain a field for the payload type, it is difficult for a sender to choose or for a receiver to guess which clock rate to use for these fields.
For example, lip synchronization can be incorrect if the RTP timestamp in the RTCP SR packet uses a different clock rate than expected by the receiver.
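The lip synchronization problem can be made concrete with a small calculation. The sketch below maps an RTP timestamp to wallclock time using the NTP/RTP timestamp pair of the last SR packet. The function name `wallclock_of` and the numeric values are hypothetical and chosen only for illustration.

```python
def wallclock_of(rtp_timestamp, sr_ntp_seconds, sr_rtp_timestamp, clock_rate):
    """Map an RTP timestamp to wallclock seconds using the last SR packet."""
    delta = (rtp_timestamp - sr_rtp_timestamp) & 0xFFFFFFFF
    if delta >= 0x80000000:  # the packet is older than the SR
        delta -= 0x100000000
    return sr_ntp_seconds + delta / clock_rate


# The sender stamped the SR using an 8000 Hz clock rate.
correct = wallclock_of(48000, 100.0, 40000, 8000)
# The receiver wrongly assumes 16000 Hz for the same SR.
wrong = wallclock_of(48000, 100.0, 40000, 16000)
sync_error = correct - wrong  # the media is placed half a second too early
```

In this example a receiver that assumes the wrong clock rate plays the media 500 ms away from where it belongs relative to the other stream.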
The following table contains a non-exhaustive list of fields in RTCP packets that use a clock rate:
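+---------------------+------------------+-----------+
| Field name          | RTCP packet type | Reference |
+---------------------+------------------+-----------+
| RTP timestamp       | SR               |           |
| Interarrival jitter | RR               |           |
| min_jitter          | XR Summary Block |           |
| max_jitter          | XR Summary Block |           |
| mean_jitter         | XR Summary Block |           |
| dev_jitter          | XR Summary Block |           |
| Interarrival jitter | IJ               |           |
| RTP timestamp       | SMPTETC          |           |
| Jitter              | RSI Jitter Block |           |
| Median jitter       | RSI Stats Block  |           |
+---------------------+------------------+-----------+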
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Clock rate: The multiplier used to convert from a wallclock value in seconds to an equivalent RTP timestamp value (without the fixed random offset).
Note that the RTP specification uses various terms like "clock frequency", "media clock rate", "timestamp unit", "timestamp frequency" and "RTP timestamp clock rate" as synonyms for clock rate.
RTP Sender: A logical network element that sends RTP packets and sends and receives RTCP packets.
RTP Receiver: A logical network element that receives RTP packets and sends and receives RTCP packets.
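The definition of the clock rate as a multiplier can be illustrated as follows. This is a minimal sketch: the function name `wallclock_to_rtp_timestamp` is hypothetical, and the fixed random offset is deliberately left out, as in the definition.

```python
def wallclock_to_rtp_timestamp(seconds, clock_rate):
    """Convert a wallclock value in seconds to an RTP timestamp value.

    The fixed random offset is not added. The result is reduced
    modulo 2**32 because the RTP timestamp is a 32-bit field.
    """
    return int(round(seconds * clock_rate)) & 0xFFFFFFFF


# The same 20 ms of media yields different timestamp values
# depending on the clock rate of the payload format.
at_8000 = wallclock_to_rtp_timestamp(0.02, 8000)
at_90000 = wallclock_to_rtp_timestamp(0.02, 90000)
```

The two results differ by the ratio of the clock rates, which is why a field expressed in timestamp units cannot be interpreted without knowing the clock rate.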
An RTP sender can choose to implement a change in clock rate in various ways.
An RTP sender can switch between payload types set with different clock rates on the same SSRC.
The RTP sender uses the current clock rate as the unit for the fields in the RTCP packets sent.
It is probably the simplest behavior to implement and various implementations already follow this behavior.
This behavior seems to contradict section 5.2 of the RTP specification.
It is difficult for an RTCP receiver to guess the clock rate used in the RTCP packets.
As in the previous section, the RTP sender switches between different clock rates on the same SSRC, but it always uses the same clock rate as the unit for the fields in the RTCP packets sent.
There are different possible ways to choose this fixed clock rate:
The first clock rate used on the RTP session.
The highest clock rate that can be used on the RTP session.
A unified clock rate, as defined in uRTR.
It is simple to implement.
There are obvious compatibility issues with implementations using a different behavior.
The fixed clock rate must be an integer multiple of each of the possible clock rates.
uRTR was rejected at IETF 61.
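The constraint that the fixed clock rate be an integer multiple of the possible clock rates can be illustrated by computing their least common multiple. This is only a sketch: the function name `common_clock_rate` is hypothetical, and taking the least common multiple is one possible way to satisfy the constraint, not a choice made by this document.

```python
from functools import reduce
from math import gcd


def common_clock_rate(clock_rates):
    """Smallest clock rate that is an integer multiple of every given rate."""
    rates = list(clock_rates)
    if not rates:
        raise ValueError("at least one clock rate is required")
    return reduce(lambda a, b: a * b // gcd(a, b), rates)


fixed_rate = common_clock_rate([8000, 16000, 48000])
# A value expressed at 8000 Hz is scaled by an integer factor
# to express it at the fixed clock rate.
scale_for_8000 = fixed_rate // 8000
```

When the possible clock rates share few factors the result grows quickly. For example, 8000 Hz and 44100 Hz give 3528000 Hz, which makes a 32-bit timestamp wrap much sooner.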
Instead of using various clock rates in the same SSRC, an RTP sender can use a different SSRC for each clock rate.
It is compliant with section 5.2 of the RTP specification.
As there can be only one possible clock rate on a specific SSRC, there is no ambiguity in the clock rate used in the RTCP packets.
Changing the SSRC can be a problem for some implementations designed to work only with unicast IP addresses, where having multiple SSRCs is considered a corner case.
Lip synchronization can be a problem in the interval between the beginning of the new stream and the first RTCP SR packet.
This is no different from what happens at the beginning of the RTP session, but it can be more annoying for the end-user.
The RTP extension defined in "Rapid Synchronisation of RTP Flows" can be used to accelerate the synchronization.
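The bookkeeping needed to use a different SSRC for each clock rate can be sketched as below. The class name `SsrcPerClockRate` and its methods are hypothetical, and SSRC collision handling with other participants is out of scope for this sketch.

```python
import random


class SsrcPerClockRate:
    """Allocate one random 32-bit SSRC for each clock rate used by a sender."""

    def __init__(self, rng=None):
        self._rng = rng if rng is not None else random.Random()
        self._ssrc_by_rate = {}

    def ssrc_for(self, clock_rate):
        """Return the SSRC for this clock rate, allocating it on first use."""
        if clock_rate not in self._ssrc_by_rate:
            used = set(self._ssrc_by_rate.values())
            ssrc = self._rng.getrandbits(32)
            while ssrc in used:  # keep the sender's own SSRCs distinct
                ssrc = self._rng.getrandbits(32)
            self._ssrc_by_rate[clock_rate] = ssrc
        return self._ssrc_by_rate[clock_rate]

    def clock_rate_of(self, ssrc):
        """Return the single clock rate associated with one of our SSRCs."""
        for clock_rate, known in self._ssrc_by_rate.items():
            if known == ssrc:
                return clock_rate
        raise KeyError(ssrc)
```

Since each SSRC maps back to exactly one clock rate, the unit of the fields in an RTCP packet follows directly from the SSRC it reports on.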
An RTP receiver can use the clock rate associated with the current payload received in the RTP packets. There is a race condition between the RTP and the RTCP packets that can create transient problems.
Also, this method does not work for an RTCP monitor (i.e. an RTCP receiver that does not receive the RTP packets).
This method will not work either if a fixed clock rate is used.
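The method of following the current payload, and its race condition, can be sketched as below. The class name `CurrentPayloadClock`, its methods, and the payload type numbers 96 and 97 with their clock rates are hypothetical values chosen for illustration.

```python
class CurrentPayloadClock:
    """Interpret RTCP fields with the clock rate of the last RTP packet seen."""

    def __init__(self, clock_rate_by_payload_type):
        self._rates = dict(clock_rate_by_payload_type)
        self.current_clock_rate = None

    def on_rtp_packet(self, payload_type):
        """Remember the clock rate of the payload type just received."""
        self.current_clock_rate = self._rates[payload_type]

    def units_to_seconds(self, value):
        """Convert an RTCP field given in timestamp units to seconds."""
        if self.current_clock_rate is None:
            # This is the situation of an RTCP monitor: no RTP packet seen.
            raise RuntimeError("no RTP packet received yet")
        return value / self.current_clock_rate


# Hypothetical dynamic payload types negotiated for the session.
clock = CurrentPayloadClock({96: 8000, 97: 16000})
clock.on_rtp_packet(96)
before_switch = clock.units_to_seconds(160)
# An RTP packet with the new payload type arrives before an RTCP
# packet that was still built with the old clock rate.
clock.on_rtp_packet(97)
after_switch = clock.units_to_seconds(160)
```

The same field value is read as 20 ms before the switch and as 10 ms after it, which is the transient problem caused by the race between RTP and RTCP packets.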
Instead of using the current RTP clock rate, an RTP receiver can use the information in two consecutive SR packets to calculate the clock rate used.
If Ni and Ri are the NTP timestamp and the RTP timestamp of SR packet i, and Nj and Rj are the NTP timestamp and the RTP timestamp of the previous SR packet j, then the clock rate can be guessed as the possible clock rate closest to (Ri - Rj) / (Ni - Nj).
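The calculation from two consecutive SR packets can be sketched as below. The function name `estimate_clock_rate` and the list of candidate clock rates are hypothetical. The sketch assumes that NTP timestamps are the 64-bit fixed-point values carried in SR packets and that RTP timestamps wrap modulo 2**32.

```python
def estimate_clock_rate(ni, ri, nj, rj, candidates):
    """Guess the clock rate from SR packet i and the previous SR packet j.

    ni, nj: 64-bit NTP timestamps (32 bits of seconds, 32 bits of fraction).
    ri, rj: 32-bit RTP timestamps from the same SR packets.
    candidates: the clock rates that are possible on this RTP session.
    """
    # Subtract as integers first to avoid losing precision in floats.
    elapsed = (ni - nj) / 2**32
    if elapsed <= 0:
        raise ValueError("SR packet i must be more recent than SR packet j")
    ticks = (ri - rj) & 0xFFFFFFFF  # allow for RTP timestamp wrap-around
    measured = ticks / elapsed
    return min(candidates, key=lambda rate: abs(rate - measured))


# Two SR packets sent 5 seconds apart, with a wrapped RTP timestamp
# and a small amount of clock skew.
nj, rj = 1000 << 32, 4294960000
ni, ri = 1005 << 32, (4294960000 + 40010) & 0xFFFFFFFF
guessed = estimate_clock_rate(ni, ri, nj, rj, [8000, 16000, 90000])
```

Picking the closest candidate rather than using the measured ratio directly absorbs skew between the media clock and the wallclock. The guess is only meaningful if the clock rate did not change between the two SR packets.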
This document was written with the xml2rfc tool described in "Writing I-Ds and RFCs using XML".
Key words for use in RFCs to Indicate Requirement Levels
RTP: A Transport Protocol for Real-Time Applications
Writing I-Ds and RFCs using XML
RTP Profile for Audio and Video Conferences with Minimal Control
RTP Control Protocol Extended Reports (RTCP XR)
Transmission Time Offsets in RTP Streams
Associating Time-Codes with RTP Streams
Rapid Synchronisation of RTP Flows
RTCP Extensions for Single-Source Multicast Sessions with Unicast Feedback
RTP Timestamp Frequency for Variable Rate Audio Codecs
This section must be removed before publication as an RFC.
Is it possible to guess the clock rate used in consecutive jitter values?