Type name: video Subtype name: H265 Required parameters: none OPTIONAL parameters: profile-space, tier-flag, profile-id, profile-compatibility- indicator, interop-constraints, and level-id: These parameters indicate the profile, tier, default level, and some constraints of the bitstream carried by the RTP stream and all RTP streams the RTP stream depends on, or a specific set of the profile, tier, default level, and some constraints the receiver supports. The profile and some constraints are indicated collectively by profile-space, profile-id, profile-compatibility-indicator, and interop-constraints. The profile specifies the subset of coding tools that may have been used to generate the bitstream or that the receiver supports. Informative note: There are 32 values of profile-id, and there are 32 flags in profile-compatibility-indicator, each flag corresponding to one value of profile-id. According to HEVC version 1 in [HEVC], when more than one of the 32 flags is set for a bitstream, the bitstream would comply with all the profiles corresponding to the set flags. However, in a draft of HEVC version 2 in [HEVCv2], Subclause A.3.5, 19 Format Range Extensions profiles have been specified, all using the same value of profile-id (4), differentiated by some of the 48 bits in interop-constraints; this (rather unexpected way of profile signaling) means that one of the 32 flags may correspond to multiple profiles. To be able to support whatever HEVC extension profile that might be specified and indicated using profile-space, profile-id, profile-compatibility-indicator, and interop-constraints in the future, it would be safe to require symmetric use of these parameters in SDP offer/answer unless recv-sub-layer- id is included in the SDP answer for choosing one of the sub-layers offered. The tier is indicated by tier-flag. The default level is indicated by level-id. The tier and the default level specify the limits on values of syntax elements or arithmetic combinations of values of syntax elements that are followed when generating the bitstream or that the receiver supports. A set of profile-space, tier-flag, profile-id, profile- compatibility-indicator, interop-constraints, and level-id parameters ptlA is said to be consistent with another set of these parameters ptlB if any decoder that conforms to the profile, tier, level, and constraints indicated by ptlB can decode any bitstream that conforms to the profile, tier, level, and constraints indicated by ptlA. In SDP offer/answer, when the SDP answer does not include the recv-sub-layer-id parameter that is less than the sprop-sub- layer-id parameter in the SDP offer, the following applies: o The profile-space, tier-flag, profile-id, profile- compatibility-indicator, and interop-constraints parameters MUST be used symmetrically, i.e., the value of each of these parameters in the offer MUST be the same as that in the answer, either explicitly signaled or implicitly inferred. o The level-id parameter is changeable as long as the highest level indicated by the answer is either equal to or lower than that in the offer. Note that the highest level is indicated by level-id and max-recv-level-id together. In SDP offer/answer, when the SDP answer does include the recv- sub-layer-id parameter that is less than the sprop-sub-layer-id parameter in the SDP offer, the set of profile-space, tier- flag, profile-id, profile-compatibility-indicator, interop- constraints, and level-id parameters included in the answer MUST be consistent with that for the chosen sub-layer representation as indicated in the SDP offer, with the exception that the level-id parameter in the SDP answer is changeable as long as the highest level indicated by the answer is either lower than or equal to that in the offer. More specifications of these parameters, including how they relate to the values of the profile, tier, and level syntax elements specified in [HEVC] are provided below. profile-space, profile-id: The value of profile-space MUST be in the range of 0 to 3, inclusive. The value of profile-id MUST be in the range of 0 to 31, inclusive. When profile-space is not present, a value of 0 MUST be inferred. When profile-id is not present, a value of 1 (i.e., the Main profile) MUST be inferred. When used to indicate properties of a bitstream, profile-space and profile-id are derived from the profile, tier, and level syntax elements in SPS or VPS NAL units as follows, where general_profile_space, general_profile_idc, sub_layer_profile_space[j], and sub_layer_profile_idc[j] are specified in [HEVC]: If the RTP stream is the highest RTP stream, the following applies: o profile-space = general_profile_space o profile-id = general_profile_idc Otherwise (the RTP stream is a dependee RTP stream), the following applies, with j being the value of the sprop-sub- layer-id parameter: o profile-space = sub_layer_profile_space[j] o profile-id = sub_layer_profile_idc[j] tier-flag, level-id: The value of tier-flag MUST be in the range of 0 to 1, inclusive. The value of level-id MUST be in the range of 0 to 255, inclusive. If the tier-flag and level-id parameters are used to indicate properties of a bitstream, they indicate the tier and the highest level the bitstream complies with. If the tier-flag and level-id parameters are used for capability exchange, the following applies. If max-recv-level- id is not present, the default level defined by level-id indicates the highest level the codec wishes to support. Otherwise, max-recv-level-id indicates the highest level the codec supports for receiving. For either receiving or sending, all levels that are lower than the highest level supported MUST also be supported. If no tier-flag is present, a value of 0 MUST be inferred; if no level-id is present, a value of 93 (i.e., level 3.1) MUST be inferred. When used to indicate properties of a bitstream, the tier-flag and level-id parameters are derived from the profile, tier, and level syntax elements in SPS or VPS NAL units as follows, where general_tier_flag, general_level_idc, sub_layer_tier_flag[j], and sub_layer_level_idc[j] are specified in [HEVC]: If the RTP stream is the highest RTP stream, the following applies: o tier-flag = general_tier_flag o level-id = general_level_idc Otherwise (the RTP stream is a dependee RTP stream), the following applies, with j being the value of the sprop-sub- layer-id parameter: o tier-flag = sub_layer_tier_flag[j] o level-id = sub_layer_level_idc[j] interop-constraints: A base16 [RFC4648] (hexadecimal) representation of six bytes of data, consisting of progressive_source_flag, interlaced_source_flag, non_packed_constraint_flag, frame_only_constraint_flag, and reserved_zero_44bits. If the interop-constraints parameter is not present, the following MUST be inferred: o progressive_source_flag = 1 o interlaced_source_flag = 0 o non_packed_constraint_flag = 1 o frame_only_constraint_flag = 1 o reserved_zero_44bits = 0 When the interop-constraints parameter is used to indicate properties of a bitstream, the following applies, where general_progressive_source_flag, general_interlaced_source_flag, general_non_packed_constraint_flag, general_non_packed_constraint_flag, general_frame_only_constraint_flag, general_reserved_zero_44bits, sub_layer_progressive_source_flag[j], sub_layer_interlaced_source_flag[j], sub_layer_non_packed_constraint_flag[j], sub_layer_frame_only_constraint_flag[j], and sub_layer_reserved_zero_44bits[j] are specified in [HEVC]: If the RTP stream is the highest RTP stream, the following applies: o progressive_source_flag = general_progressive_source_flag o interlaced_source_flag = general_interlaced_source_flag o non_packed_constraint_flag = general_non_packed_constraint_flag o frame_only_constraint_flag = general_frame_only_constraint_flag o reserved_zero_44bits = general_reserved_zero_44bits Otherwise (the RTP stream is a dependee RTP stream), the following applies, with j being the value of the sprop-sub- layer-id parameter: o progressive_source_flag = sub_layer_progressive_source_flag[j] o interlaced_source_flag = sub_layer_interlaced_source_flag[j] o non_packed_constraint_flag = sub_layer_non_packed_constraint_flag[j] o frame_only_constraint_flag = sub_layer_frame_only_constraint_flag[j] o reserved_zero_44bits = sub_layer_reserved_zero_44bits[j] Using interop-constraints for capability exchange results in a requirement on any bitstream to be compliant with the interop-constraints. profile-compatibility-indicator: A base16 [RFC4648] representation of four bytes of data. When profile-compatibility-indicator is used to indicate properties of a bitstream, the following applies, where general_profile_compatibility_flag[j] and sub_layer_profile_compatibility_flag[i][j] are specified in [HEVC]: The profile-compatibility-indicator in this case indicates additional profiles to the profile defined by profile-space, profile-id, and interop-constraints the bitstream conforms to. A decoder that conforms to any of all the profiles the bitstream conforms to would be capable of decoding the bitstream. These additional profiles are defined by profile-space, each set bit of profile-compatibility- indicator, and interop-constraints. If the RTP stream is the highest RTP stream, the following applies for each value of j in the range of 0 to 31, inclusive: o bit j of profile-compatibility-indicator = general_profile_compatibility_flag[j] Otherwise (the RTP stream is a dependee RTP stream), the following applies for i equal to sprop-sub-layer-id and for each value of j in the range of 0 to 31, inclusive: o bit j of profile-compatibility-indicator = sub_layer_profile_compatibility_flag[i][j] Using profile-compatibility-indicator for capability exchange results in a requirement on any bitstream to be compliant with the profile-compatibility-indicator. This is intended to handle cases where any future HEVC profile is defined as an intersection of two or more profiles. If this parameter is not present, this parameter defaults to the following: bit j, with j equal to profile-id, of profile- compatibility-indicator is inferred to be equal to 1, and all other bits are inferred to be equal to 0. sprop-sub-layer-id: This parameter MAY be used to indicate the highest allowed value of TID in the bitstream. When not present, the value of sprop-sub-layer-id is inferred to be equal to 6. The value of sprop-sub-layer-id MUST be in the range of 0 to 6, inclusive. recv-sub-layer-id: This parameter MAY be used to signal a receiver's choice of the offered or declared sub-layer representations in the sprop-vps. The value of recv-sub-layer-id indicates the TID of the highest sub-layer of the bitstream that a receiver supports. When not present, the value of recv-sub-layer-id is inferred to be equal to the value of the sprop-sub-layer-id parameter in the SDP offer. The value of recv-sub-layer-id MUST be in the range of 0 to 6, inclusive. max-recv-level-id: This parameter MAY be used to indicate the highest level a receiver supports. The highest level the receiver supports is equal to the value of max-recv-level-id divided by 30. The value of max-recv-level-id MUST be in the range of 0 to 255, inclusive. When max-recv-level-id is not present, the value is inferred to be equal to level-id. max-recv-level-id MUST NOT be present when the highest level the receiver supports is not higher than the default level. tx-mode: This parameter indicates whether the transmission mode is SRST, MRST, or MRMT. The value of tx-mode MUST be equal to "SRST", "MRST" or "MRMT". When not present, the value of tx-mode is inferred to be equal to "SRST". If the value is equal to "MRST", MRST MUST be in use. Otherwise, if the value is equal to "MRMT", MRMT MUST be in use. Otherwise (the value is equal to "SRST"), SRST MUST be in use. The value of tx-mode MUST be equal to "MRST" for all RTP streams in an MRST. The value of tx-mode MUST be equal to "MRMT" for all RTP streams in an MRMT. sprop-vps: This parameter MAY be used to convey any video parameter set NAL unit of the bitstream for out-of-band transmission of video parameter sets. The parameter MAY also be used for capability exchange and to indicate sub-stream characteristics (i.e., properties of sub-layer representations as defined in [HEVC]). The value of the parameter is a comma-separated (',') list of base64 [RFC4648] representations of the video parameter set NAL units as specified in Section 7.3.2.1 of [HEVC]. The sprop-vps parameter MAY contain one or more than one video parameter set NAL unit. However, all other video parameter sets contained in the sprop-vps parameter MUST be consistent with the first video parameter set in the sprop-vps parameter. A video parameter set vpsB is said to be consistent with another video parameter set vpsA if any decoder that conforms to the profile, tier, level, and constraints indicated by the 12 bytes of data starting from the syntax element general_profile_space to the syntax element general_level_idc, inclusive, in the first profile_tier_level( ) syntax structure in vpsA can decode any bitstream that conforms to the profile, tier, level, and constraints indicated by the 12 bytes of data starting from the syntax element general_profile_space to the syntax element general_level_idc, inclusive, in the first profile_tier_level( ) syntax structure in vpsB. sprop-sps: This parameter MAY be used to convey sequence parameter set NAL units of the bitstream for out-of-band transmission of sequence parameter sets. The value of the parameter is a comma- separated (',') list of base64 [RFC4648] representations of the sequence parameter set NAL units as specified in Section 7.3.2.2 of [HEVC]. sprop-pps: This parameter MAY be used to convey picture parameter set NAL units of the bitstream for out-of-band transmission of picture parameter sets. The value of the parameter is a comma- separated (',') list of base64 [RFC4648] representations of the picture parameter set NAL units as specified in Section 7.3.2.3 of [HEVC]. sprop-sei: This parameter MAY be used to convey one or more SEI messages that describe bitstream characteristics. When present, a decoder can rely on the bitstream characteristics that are described in the SEI messages for the entire duration of the session, independently from the persistence scopes of the SEI messages as specified in [HEVC]. The value of the parameter is a comma-separated (',') list of base64 [RFC4648] representations of SEI NAL units as specified in Section 7.3.2.4 of [HEVC]. Informative note: Intentionally, no list of applicable or inapplicable SEI messages is specified here. Conveying certain SEI messages in sprop-sei may be sensible in some application scenarios and meaningless in others. However, a few examples are described below: 1) In an environment where the bitstream was created from film-based source material, and no splicing is going to occur during the lifetime of the session, the film grain characteristics SEI message or the tone mapping information SEI message are likely meaningful, and sending them in sprop-sei rather than in the bitstream at each entry point may help with saving bits and allows one to configure the renderer only once, avoiding unwanted artifacts. 2) The structure of pictures information SEI message in sprop-sei can be used to inform a decoder of information on the NAL unit types, picture-order count values, and prediction dependencies of a sequence of pictures. Having such knowledge can be helpful for error recovery. 3) Examples for SEI messages that would be meaningless to be conveyed in sprop-sei include the decoded picture hash SEI message (it is close to impossible that all decoded pictures have the same hashtag), the display orientation SEI message when the device is a handheld device (as the display orientation may change when the handheld device is turned around), or the filler payload SEI message (as there is no point in just having more bits in SDP). max-lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, max-tc: These parameters MAY be used to signal the capabilities of a receiver implementation. These parameters MUST NOT be used for any other purpose. The highest level (specified by max-recv- level-id) MUST be the highest that the receiver is fully capable of supporting. max-lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, and max-tc MAY be used to indicate capabilities of the receiver that extend the required capabilities of the highest level, as specified below. When more than one parameter from the set (max-lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, max-tc) is present, the receiver MUST support all signaled capabilities simultaneously. For example, if both max-lsr and max-br are present, the highest level with the extension of both the picture rate and bitrate is supported. That is, the receiver is able to decode bitstreams in which the luma sample rate is up to max-lsr (inclusive), the bitrate is up to max-br (inclusive), the coded picture buffer size is derived as specified in the semantics of the max-br parameter below, and the other properties comply with the highest level specified by max-recv-level-id. Informative note: When the OPTIONAL media type parameters are used to signal the properties of a bitstream, and max- lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, and max-tc are not present, the values of profile-space, tier-flag, profile-id, profile-compatibility-indicator, interop- constraints, and level-id must always be such that the bitstream complies fully with the specified profile, tier, and level. max-lsr: The value of max-lsr is an integer indicating the maximum processing rate in units of luma samples per second. The max- lsr parameter signals that the receiver is capable of decoding video at a higher rate than is required by the highest level. When max-lsr is signaled, the receiver MUST be able to decode bitstreams that conform to the highest level, with the exception that the MaxLumaSR value in Table A-2 of [HEVC] for the highest level is replaced with the value of max-lsr. Senders MAY use this knowledge to send pictures of a given size at a higher picture rate than is indicated in the highest level. When not present, the value of max-lsr is inferred to be equal to the value of MaxLumaSR given in Table A-2 of [HEVC] for the highest level. The value of max-lsr MUST be in the range of MaxLumaSR to 16 * MaxLumaSR, inclusive, where MaxLumaSR is given in Table A-2 of [HEVC] for the highest level. max-lps: The value of max-lps is an integer indicating the maximum picture size in units of luma samples. The max-lps parameter signals that the receiver is capable of decoding larger picture sizes than are required by the highest level. When max-lps is signaled, the receiver MUST be able to decode bitstreams that conform to the highest level, with the exception that the MaxLumaPS value in Table A-1 of [HEVC] for the highest level is replaced with the value of max-lps. Senders MAY use this knowledge to send larger pictures at a proportionally lower picture rate than is indicated in the highest level. When not present, the value of max-lps is inferred to be equal to the value of MaxLumaPS given in Table A-1 of [HEVC] for the highest level. The value of max-lps MUST be in the range of MaxLumaPS to 16 * MaxLumaPS, inclusive, where MaxLumaPS is given in Table A-1 of [HEVC] for the highest level. max-cpb: The value of max-cpb is an integer indicating the maximum coded picture buffer size in units of CpbBrVclFactor bits for the VCL HRD parameters and in units of CpbBrNalFactor bits for the NAL HRD parameters, where CpbBrVclFactor and CpbBrNalFactor are defined in Section A.4 of [HEVC]. The max-cpb parameter signals that the receiver has more memory than the minimum amount of coded picture buffer memory required by the highest level. When max-cpb is signaled, the receiver MUST be able to decode bitstreams that conform to the highest level, with the exception that the MaxCPB value in Table A-1 of [HEVC] for the highest level is replaced with the value of max-cpb. Senders MAY use this knowledge to construct coded bitstreams with greater variation of bitrate than can be achieved with the MaxCPB value in Table A-1 of [HEVC]. When not present, the value of max-cpb is inferred to be equal to the value of MaxCPB given in Table A-1 of [HEVC] for the highest level. The value of max-cpb MUST be in the range of MaxCPB to 16 * MaxCPB, inclusive, where MaxLumaCPB is given in Table A-1 of [HEVC] for the highest level. Informative note: The coded picture buffer is used in the hypothetical reference decoder (Annex C of [HEVC]). The use of the hypothetical reference decoder is recommended in HEVC encoders to verify that the produced bitstream conforms to the standard and to control the output bitrate. Thus, the coded picture buffer is conceptually independent of any other potential buffers in the receiver, including de- packetization and de-jitter buffers. The coded picture buffer need not be implemented in decoders as specified in Annex C of [HEVC], but rather standard-compliant decoders can have any buffering arrangements provided that they can decode standard-compliant bitstreams. Thus, in practice, the input buffer for a video decoder can be integrated with de-packetization and de-jitter buffers of the receiver. max-dpb: The value of max-dpb is an integer indicating the maximum decoded picture buffer size in units decoded pictures at the MaxLumaPS for the highest level, i.e., the number of decoded pictures at the maximum picture size defined by the highest level. The value of max-dpb MUST be in the range of 1 to 16, respectively. The max-dpb parameter signals that the receiver has more memory than the minimum amount of decoded picture buffer memory required by default, which is MaxDpbPicBuf as defined in [HEVC] (equal to 6). When max-dpb is signaled, the receiver MUST be able to decode bitstreams that conform to the highest level, with the exception that the MaxDpbPicBuff value defined in [HEVC] as 6 is replaced with the value of max-dpb. Consequently, a receiver that signals max-dpb MUST be capable of storing the following number of decoded pictures (MaxDpbSize) in its decoded picture buffer: if( PicSizeInSamplesY <= ( MaxLumaPS >> 2 ) ) MaxDpbSize = Min( 4 * max-dpb, 16 ) else if ( PicSizeInSamplesY <= ( MaxLumaPS >> 1 ) ) MaxDpbSize = Min( 2 * max-dpb, 16 ) else if ( PicSizeInSamplesY <= ( ( 3 * MaxLumaPS ) >> 2 ) ) MaxDpbSize = Min( (4 * max-dpb) / 3, 16 ) else MaxDpbSize = max-dpb Wherein MaxLumaPS given in Table A-1 of [HEVC] for the highest level and PicSizeInSamplesY is the current size of each decoded picture in units of luma samples as defined in [HEVC]. The value of max-dpb MUST be greater than or equal to the value of MaxDpbPicBuf (i.e., 6) as defined in [HEVC]. Senders MAY use this knowledge to construct coded bitstreams with improved compression. When not present, the value of max-dpb is inferred to be equal to the value of MaxDpbPicBuf (i.e., 6) as defined in [HEVC]. Informative note: This parameter was added primarily to complement a similar codepoint in the ITU-T Recommendation H.245, so as to facilitate signaling gateway designs. The decoded picture buffer stores reconstructed samples. There is no relationship between the size of the decoded picture buffer and the buffers used in RTP, especially de- packetization and de-jitter buffers. max-br: The value of max-br is an integer indicating the maximum video bitrate in units of CpbBrVclFactor bits per second for the VCL HRD parameters and in units of CpbBrNalFactor bits per second for the NAL HRD parameters, where CpbBrVclFactor and CpbBrNalFactor are defined in Section A.4 of [HEVC]. The max-br parameter signals that the video decoder of the receiver is capable of decoding video at a higher bitrate than is required by the highest level. When max-br is signaled, the video codec of the receiver MUST be able to decode bitstreams that conform to the highest level, with the following exceptions in the limits specified by the highest level: o The value of max-br replaces the MaxBR value in Table A-2 of [HEVC] for the highest level. o When the max-cpb parameter is not present, the result of the following formula replaces the value of MaxCPB in Table A-1 of [HEVC]: (MaxCPB of the highest level) * max-br / (MaxBR of the highest level) For example, if a receiver signals capability for Main profile Level 2 with max-br equal to 2000, this indicates a maximum video bitrate of 2000 kbits/sec for VCL HRD parameters, a maximum video bitrate of 2200 kbits/sec for NAL HRD parameters, and a CPB size of 2000000 bits (2000000 / 1500000 * 1500000). Senders MAY use this knowledge to send higher bitrate video as allowed in the level definition of Annex A of [HEVC] to achieve improved video quality. When not present, the value of max-br is inferred to be equal to the value of MaxBR given in Table A-2 of [HEVC] for the highest level. The value of max-br MUST be in the range of MaxBR to 16 * MaxBR, inclusive, where MaxBR is given in Table A-2 of [HEVC] for the highest level. Informative note: This parameter was added primarily to complement a similar codepoint in the ITU-T Recommendation H.245, so as to facilitate signaling gateway designs. The assumption that the network is capable of handling such bitrates at any given time cannot be made from the value of this parameter. In particular, no conclusion can be drawn that the signaled bitrate is possible under congestion control constraints. max-tr: The value of max-tr is an integer indication the maximum number of tile rows. The max-tr parameter signals that the receiver is capable of decoding video with a larger number of tile rows than the value allowed by the highest level. When max-tr is signaled, the receiver MUST be able to decode bitstreams that conform to the highest level, with the exception that the MaxTileRows value in Table A-1 of [HEVC] for the highest level is replaced with the value of max-tr. Senders MAY use this knowledge to send pictures utilizing a larger number of tile rows than the value allowed by the highest level. When not present, the value of max-tr is inferred to be equal to the value of MaxTileRows given in Table A-1 of [HEVC] for the highest level. The value of max-tr MUST be in the range of MaxTileRows to 16 * MaxTileRows, inclusive, where MaxTileRows is given in Table A-1 of [HEVC] for the highest level. max-tc: The value of max-tc is an integer indication the maximum number of tile columns. The max-tc parameter signals that the receiver is capable of decoding video with a larger number of tile columns than the value allowed by the highest level. When max-tc is signaled, the receiver MUST be able to decode bitstreams that conform to the highest level, with the exception that the MaxTileCols value in Table A-1 of [HEVC] for the highest level is replaced with the value of max-tc. Senders MAY use this knowledge to send pictures utilizing a larger number of tile columns than the value allowed by the highest level. When not present, the value of max-tc is inferred to be equal to the value of MaxTileCols given in Table A-1 of [HEVC] for the highest level. The value of max-tc MUST be in the range of MaxTileCols to 16 * MaxTileCols, inclusive, where MaxTileCols is given in Table A-1 of [HEVC] for the highest level. max-fps: The value of max-fps is an integer indicating the maximum picture rate in units of pictures per 100 seconds that can be effectively processed by the receiver. The max-fps parameter MAY be used to signal that the receiver has a constraint in that it is not capable of processing video effectively at the full picture rate that is implied by the highest level and, when present, one or more of the parameters max-lsr, max-lps, and max-br. The value of max-fps is not necessarily the picture rate at which the maximum picture size can be sent, it constitutes a constraint on maximum picture rate for all resolutions. Informative note: The max-fps parameter is semantically different from max-lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, and max-tc in that max-fps is used to signal a constraint, lowering the maximum picture rate from what is implied by other parameters. The encoder MUST use a picture rate equal to or less than this value. In cases where the max-fps parameter is absent, the encoder is free to choose any picture rate according to the highest level and any signaled optional parameters. The value of max-fps MUST be smaller than or equal to the full picture rate that is implied by the highest level and, when present, one or more of the parameters max-lsr, max-lps, and max-br. sprop-max-don-diff: If tx-mode is equal to "SRST" and there is no NAL unit naluA that is followed in transmission order by any NAL unit preceding naluA in decoding order (i.e., the transmission order of the NAL units is the same as the decoding order), the value of this parameter MUST be equal to 0. Otherwise, if tx-mode is equal to "MRST" or "MRMT", the decoding order of the NAL units of all the RTP streams is the same as the NAL unit transmission order and the NAL unit output order, the value of this parameter MUST be equal to either 0 or 1. Otherwise, if tx-mode is equal to "MRST" or "MRMT" and the decoding order of the NAL units of all the RTP streams is the same as the NAL unit transmission order but not the same as the NAL unit output order, the value of this parameter MUST be equal to 1. Otherwise, this parameter specifies the maximum absolute difference between the decoding order number (i.e., AbsDon) values of any two NAL units naluA and naluB, where naluA follows naluB in decoding order and precedes naluB in transmission order. The value of sprop-max-don-diff MUST be an integer in the range of 0 to 32767, inclusive. When not present, the value of sprop-max-don-diff is inferred to be equal to 0. sprop-depack-buf-nalus: This parameter specifies the maximum number of NAL units that precede a NAL unit in transmission order and follow the NAL unit in decoding order. The value of sprop-depack-buf-nalus MUST be an integer in the range of 0 to 32767, inclusive. When not present, the value of sprop-depack-buf-nalus is inferred to be equal to 0. When sprop-max-don-diff is present and greater than 0, this parameter MUST be present and the value MUST be greater than 0. sprop-depack-buf-bytes: This parameter signals the required size of the de- packetization buffer in units of bytes. The value of the parameter MUST be greater than or equal to the maximum buffer occupancy (in units of bytes) of the de-packetization buffer as specified in Section 6. The value of sprop-depack-buf-bytes MUST be an integer in the range of 0 to 4294967295, inclusive. When sprop-max-don-diff is present and greater than 0, this parameter MUST be present and the value MUST be greater than 0. When not present, the value of sprop-depack-buf-bytes is inferred to be equal to 0. Informative note: The value of sprop-depack-buf-bytes indicates the required size of the de-packetization buffer only. When network jitter can occur, an appropriately sized jitter buffer has to be available as well. depack-buf-cap: This parameter signals the capabilities of a receiver implementation and indicates the amount of de-packetization buffer space in units of bytes that the receiver has available for reconstructing the NAL unit decoding order from NAL units carried in one or more RTP streams. A receiver is able to handle any RTP stream, and all RTP streams the RTP stream depends on, when present, for which the value of the sprop- depack-buf-bytes parameter is smaller than or equal to this parameter. When not present, the value of depack-buf-cap is inferred to be equal to 4294967295. The value of depack-buf-cap MUST be an integer in the range of 1 to 4294967295, inclusive. Informative note: depack-buf-cap indicates the maximum possible size of the de-packetization buffer of the receiver only, without allowing for network jitter. sprop-segmentation-id: This parameter MAY be used to signal the segmentation tools present in the bitstream and that can be used for parallelization. The value of sprop-segmentation-id MUST be an integer in the range of 0 to 3, inclusive. When not present, the value of sprop-segmentation-id is inferred to be equal to 0. When sprop-segmentation-id is equal to 0, no information about the segmentation tools is provided. When sprop-segmentation-id is equal to 1, it indicates that slices are present in the bitstream. When sprop-segmentation-id is equal to 2, it indicates that tiles are present in the bitstream. When sprop- segmentation-id is equal to 3, it indicates that WPP is used in the bitstream. sprop-spatial-segmentation-idc: A base16 [RFC4648] representation of the syntax element min_spatial_segmentation_idc as specified in [HEVC]. This parameter MAY be used to describe parallelization capabilities of the bitstream. dec-parallel-cap: This parameter MAY be used to indicate the decoder's additional decoding capabilities given the presence of tools enabling parallel decoding, such as slices, tiles, and WPP, in the bitstream. The decoding capability of the decoder may vary with the setting of the parallel decoding tools present in the bitstream, e.g., the size of the tiles that are present in a bitstream. Therefore, multiple capability points may be provided, each indicating the minimum required decoding capability that is associated with a parallelism requirement, which is a requirement on the bitstream that enables parallel decoding. Each capability point is defined as a combination of 1) a parallelism requirement, 2) a profile (determined by profile- space and profile-id), 3) a highest level, and 4) a maximum processing rate, a maximum picture size, and a maximum video bitrate that may be equal to or greater than that determined by the highest level. The parameter's syntax in ABNF [RFC5234] is as follows: dec-parallel-cap = "dec-parallel-cap={" cap-point *("," cap-point) "}" cap-point = ("w" / "t") ":" spatial-seg-idc 1*(";" cap-parameter) spatial-seg-idc = 1*4DIGIT ; (1-4095) cap-parameter = tier-flag / level-id / max-lsr / max-lps / max-br tier-flag = "tier-flag" EQ ("0" / "1") level-id = "level-id" EQ 1*3DIGIT ; (0-255) max-lsr = "max-lsr" EQ 1*20DIGIT ; (0- 18,446,744,073,709,551,615) max-lps = "max-lps" EQ 1*10DIGIT ; (0-4,294,967,295) max-br = "max-br" EQ 1*20DIGIT ; (0- 18,446,744,073,709,551,615) EQ = "=" The set of capability points expressed by the dec-parallel-cap parameter is enclosed in a pair of curly braces ("{}"). Each set of two consecutive capability points is separated by a comma (','). Within each capability point, each set of two consecutive parameters, and, when present, their values, is separated by a semicolon (';'). The profile of all capability points is determined by profile- space and profile-id, which are outside the dec-parallel-cap parameter. Each capability point starts with an indication of the parallelism requirement, which consists of a parallel tool type, which may be equal to 'w' or 't', and a decimal value of the spatial-seg-idc parameter. When the type is 'w', the capability point is valid only for H.265 bitstreams with WPP in use, i.e., entropy_coding_sync_enabled_flag equal to 1. When the type is 't', the capability point is valid only for H.265 bitstreams with WPP not in use (i.e., entropy_coding_sync_enabled_flag equal to 0). The capability- point is valid only for H.265 bitstreams with min_spatial_segmentation_idc equal to or greater than spatial- seg-idc. After the parallelism requirement indication, each capability point continues with one or more pairs of parameter and value in any order for any of the following parameters: o tier-flag o level-id o max-lsr o max-lps o max-br At most, one occurrence of each of the above five parameters is allowed within each capability point. The values of dec-parallel-cap.tier-flag and dec-parallel- cap.level-id for a capability point indicate the highest level of the capability point. The values of dec-parallel-cap.max- lsr, dec-parallel-cap.max-lps, and dec-parallel-cap.max-br for a capability point indicate the maximum processing rate in units of luma samples per second, the maximum picture size in units of luma samples, and the maximum video bitrate (in units of CpbBrVclFactor bits per second for the VCL HRD parameters and in units of CpbBrNalFactor bits per second for the NAL HRD parameters where CpbBrVclFactor and CpbBrNalFactor are defined in Section A.4 of [HEVC]). When not present, the value of dec-parallel-cap.tier-flag is inferred to be equal to the value of tier-flag outside the dec- parallel-cap parameter. When not present, the value of dec- parallel-cap.level-id is inferred to be equal to the value of max-recv-level-id outside the dec-parallel-cap parameter. When not present, the value of dec-parallel-cap.max-lsr, dec- parallel-cap.max-lps, or dec-parallel-cap.max-br is inferred to be equal to the value of max-lsr, max-lps, or max-br, respectively, outside the dec-parallel-cap parameter. The general decoding capability, expressed by the set of parameters outside of dec-parallel-cap, is defined as the capability point that is determined by the following combination of parameters: 1) the parallelism requirement corresponding to the value of sprop-segmentation-id equal to 0 for a bitstream, 2) the profile determined by profile-space, profile-id, profile-compatibility-indicator, and interop- constraints, 3) the tier and the highest level determined by tier-flag and max-recv-level-id, and 4) the maximum processing rate, the maximum picture size, and the maximum video bitrate determined by the highest level. The general decoding capability MUST NOT be included as one of the set of capability points in the dec-parallel-cap parameter. For example, the following parameters express the general decoding capability of 720p30 (Level 3.1) plus an additional decoding capability of 1080p30 (Level 4) given that the spatially largest tile or slice used in the bitstream is equal to or less than 1/3 of the picture size: a=fmtp:98 level-id=93;dec-parallel-cap={t:8;level- id=120} For another example, the following parameters express an additional decoding capability of 1080p30, using dec-parallel- cap.max-lsr and dec-parallel-cap.max-lps, given that WPP is used in the bitstream: a=fmtp:98 level-id=93;dec-parallel-cap={w:8; max-lsr=62668800;max-lps=2088960} Informative note: When min_spatial_segmentation_idc is present in a bitstream and WPP is not used, [HEVC] specifies that there is no slice or no tile in the bitstream containing more than 4 * PicSizeInSamplesY / ( min_spatial_segmentation_idc + 4 ) luma samples. include-dph: This parameter is used to indicate the capability and preference to utilize or include Decoded Picture Hash (DPH) SEI messages (see Section D.3.19 of [HEVC]) in the bitstream. DPH SEI messages can be used to detect picture corruption so the receiver can request picture repair, see Section 8. The value is a comma-separated list of hash types that is supported or requested to be used, each hash type provided as an unsigned integer value (0-255), with the hash types listed from most preferred to the least preferred. Example: "include-dph=0,2", which indicates the capability for MD5 (most preferred) and Checksum (less preferred). If the parameter is not included or the value contains no hash types, then no capability to utilize DPH SEI messages is assumed. Note that DPH SEI messages MAY still be included in the bitstream even when there is no declaration of capability to use them, as in general SEI messages do not affect the normative decoding process and decoders are allowed to ignore SEI messages. Encoding considerations: This type is only defined for transfer via RTP (RFC 3550). Security considerations: See Section 9 of RFC 7798. Published specification: Please refer to RFC 7798 and its Section 12. Additional information: None File extensions: none Macintosh file type code: none Object identifier or OID: none Person & email address to contact for further information: Ye-Kui Wang (yekui.wang&gmail.com) Intended usage: COMMON Author: See Authors' Addresses section of RFC 7798. Change controller: IETF Audio/Video Transport Payloads working group delegated from the IESG.