Internet-Draft | SCHC Payload compression | June 2025 |
Corneo, et al. | Expires 27 December 2025 | [Page] |
This document describes several techniques to enable utilization of the same engine to compress and decompress headers from the existing SCHC framework [RFC8724], but used to compress payload of specific protocols. The first approach is to introduce new type of static rules that enable encoding application data. This extensions provides compact and generic variation on how data is organized. The second approach provides dynamic compression and decompression. Here, the system identifies parts of the payload that can be compressed, and enables a SCHC decompressor to restore the original packet.¶
This note is to be removed before publishing as an RFC.¶
Status information for this document may be found at https://datatracker.ietf.org/doc/draft-corneo-schc-compress-payload/.¶
Discussion of this document takes place on the schc Working Group mailing list (mailto:schc@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/schc/. Subscribe at https://www.ietf.org/mailman/listinfo/schc/.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 27 December 2025.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The main purpose of this document is to describe methods that enable the reduction of the size of the payloads by utilizing the same compression and decompression machinery provided by the SCHC framework [RFC8724]. Utilizing additional steps or introducing new rules, the available SCHC machinery can be reused in IoT devices to also provide compression of the payload.¶
For the static approach, the introduction of new type of rules that are suitable for the protocols that utilize certain type of time series serialization where certain values can be compressed based on the names, metadata related to each record, value type and base values. By utilizing the compression residue, the variable parts of the fields can be managed.¶
The dynamic approach is to produce a context that is customized with rules for the type of values that are expected to be delivered by the devices that utilize key-value based data, e.g., JSON [RFC8259], to generate a dictionary that can be later used to encode the data. This is done by inspecting the values that the devices produces by a fix amount of time, and then mapping some of the values to much smaller values that represent the data sent by the device. This reduce the transmission time and hence also the energy required for the transmission.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [BCP14] (RFC2119) (RFC8174) when, and only when, they appear in all capitals, as shown here.¶
This specification requires readers to be familiar with the terms and concepts that are discussed in [RFC7252].¶
This section introduces a templated method for enhancing the compression of time series sensor data, such as SenML [RFC8428], by utilizing a template-based approach within the context rules. By employing a template-based strategy, it becomes possible to reconstruct the data structure (e.g., SenML or JSON payload) without the necessity of transmitting the structure with each payload. The examples use JSON representation, but CBOR [RFC8949] representation can be used in practice to achieve smaller template (and hence rule) size.¶
When a system is transmitting in the payload the same number of records, the following optimizations can be utilized. As shown on Figure 1 entity with SenML base name (bn
) "dev1.example.com/"
sends two values periodically – one temperature value ("temp"
with value 20
), and one humidity value ("hum"
with value 40
). It encodes the values in one SenML Pack consisting of two SenML uncompressed Records:¶
[ { "bn": "dev1.example.com/", "n": "temp", "v": 20 }, { "n": "hum", "v": 40 } ]
A template with compression/decompression rules can be used to reduce the size of the above payload. For instance, variable values are replaced with $n
placeholders, where n
represents the position of the field value in the compression residue. Applying the template to the above example will then look as follows:¶
[ { "bn": "dev1.example.com/", "n": "temp", "v": "$1" }, { "n": "hum", "v": "$2" } ]¶
In this case, upon applying the rule, the compression residue includes exactly two values. The result of the decompression is exactly one SenML Pack with two SenML records.¶
In some cases, the payload may include values referring to different resource paths, and the number of such measurements may not be known in advance. Furthermore, the measurements may be performed at different times, but sent together in one payload. For instance, the example below shows a payload with multiple temperature and humidity measurements.¶
[ { "n": "dev1.example.com/temp", "v": 20, "bt": 1696231409 }, { "n": "dev1.example.com/hum", "v": 40 }, { "n": "dev1.example.com/temp", "v": 22, "bt": 1696231410 }, { "n": "dev1.example.com/hum", "v": 44 } ]¶
The template for the compression-decompression rule would look as follows:¶
[ $repeat( { "n": "dev1.example.com/temp", "v": "$1", "bt": "$3" }, { "n": "dev1.example.com/hum", "v": "$2" } ) ]
The $repeat()
command means that the two records for "temp"
and "hum"
readings are repeated and the first record contains a time value that applies to both records. The compression residue would contain a list of triplets with the measurement values for both, followed by the time stamp. The two records are repeated in the decompressed version as many times as there are triplets in the compression residue, i.e., if compression residue contains three values, the decompressed format will include one SenML Pack with two SenML Records. If the compression residue contains six values, the decompressed format will include one SenML Pack with four SenML Records, and so on.¶
The decompressor is aware of the $repeat()
syntax, thus can generate valid JSON by, for example, adding commas between records or valid CBOR/EXI with proper record delimiters, instead of simply repeating the template string given as the parameter. Note that the template mechanism from Section 2.1 can be used in the case of Section 2.2 on Figure 2 as well, but with the $repeat()
command, the size of the rule becomes smaller and allows for multiple residue values to be sent, which is especially useful if the number of measurements can vary or is not a priori.¶
To achieve further compression, the use of absolute timestamps (since epoch) or absolute measurement value may be undesirable and relative timestamps / values might be preferable. To achieve this, the epoch (or base) value needs to be established. The following sections provide different mechanisms through which the epoch / base values can be defined in the template.¶
In this case, the base value is defined in the template with a "relValue"
keyword in the placeholder. Using the example from Figure 2, the template can be updated to the following:¶
[ $repeat( { "n": "dev1.example.com/temp", "v": $1(relValue:20), "bt": $3(relValue:1696231409) }, { "n": "dev1.example.com/hum", "v": $2(relValue:40) } ) ]¶
Based on the above template, the compression residue values would be relative to the values specified in the template.¶
In some cases, a shared context can be established without specifying exact relative values. In the case of timestamps, a relTime:x
attribute is proposed, where x
can be "m" (minute), "h" (hour), "d" (day), "M" (month) and "Y" (year). Hence, without specifying an absolute value, the compressor can send the time relative to the "top of" the minute, hour, day, month, or year. In this way, the neither the rules need to be updated nor a lookup needs to be performed.¶
In other cases, the shared context can be established through environment conditions. For example, in the case of temperature and humidity, it can be defined that the base value in summer for temperature is 20 and humidity is 60 and in winter the base value is 0 and humidity 10. Such context is established outside the SCHC template but can be expressed in the template using the notation relContext:x
where x
can mean anything. For instance, x
could be "season" using the example above and, when the switch between summer and winter occurs, it is treated as context specific and not expressed in the template.¶
SCHC [RFC8724] describes mechanisms for handling both fixed and variable length residue fields. Fixed length fields are specified in the FL
field of the rule, whereas the varying length-fields are specified such that the compression residue includes the field length and the value.¶
The use of varying length fields for the values and the field length needs to be usually included within the residue. However, it may be undesirable to include the field length for static cases. Hence, a fl:x
template convention is proposed, e.g., $1(fl:16)
which indicates that the length of the $1
is always 16 bits. Hence the invention supports compression even in the case of fixed length field. The special case of all fields using the same length can be handled by indicating that length in the FL
field.¶
Rules in SCHC are described using the Compression/Decompression Context using the format (and terminology) based on the diagram in Figure 6 of Section 7.1 of [RFC8724]).¶
This section describes the values for the C/D context when using the template-based approach.¶
A new Field ID called "StructuredAppData" is introduced to indicate that the compression/decompression is being performed on the data / payload.¶
If all compression residue values are of same length (e.g., 4 bits), this field contains the bit length. Otherwise, this field contains value 0 indicating that variable-length values are used in the compression residue.¶
set to 1 (default)¶
As per the current spec (i.e., it can be Up, Dw, or Bi).¶
The Target Value is where the template itself is defined.¶
A new MO called "template" is introduced to indicate that the compressor/decompressor should follow template based matching operations.¶
A new action called "template" is introduced. The compressor sends all the values (as described in examples above) and the decompressor reconstructs the original payload but replacing the $x
placeholders (and expanding the other operations such as $repeat
, etc.).¶
The dynamic payload compression requires a new SCHC context that is updated to the compressor after inspecting packets' payloads. This can be done in three phases where the first phase collects statistics about the payload, the second phase identifies the payload fields that may be compressed, and the third phase generates a new SCHC context with rules targeting such fields. This dynamic compression is intended to work with key-value based payload (e.g., JSON).¶
The SenML Record below is the reference payload for the whole section.¶
[ { "bn": "2001:db8:1234:5678::1/", "n": "temperature", "u": "Cel", "v": 25.2 }, { "n": "humidity", "u": "%RH", "v": 30 } ]
In the first phase, the decompressor inspects the payload carried through the SCHC packet. All the keys and values of the key-value pairs are stored in a map-like, or list-like, data structure. For example, if the payload contains the key-value pair {"n": "temperature"}
the associated data structure may be a 'map' or a 'list'.¶
Below is an example of a map-like data structure that records each key-value pair from the payload along with the count of how many times each key or value appears, e.g., {"n": 1, "temperature": 1}
.¶
Alternatively, a list-like data structure can be used to store only the keys and values from the payload's key-value pairs, such as ["n", "temperature"]
.¶
Additional methods for storing payload information may be used, e.g., a timestamp of the received payload. The information recorder in this phase mainly depends on the selection of the fields that are going to be compressed.¶
After payload analysis on Section 3.1, the decompressor entity manipulates the data structure to create an unique key for each value.¶
The key creation mechanism involves encoding different pieces of information in a structured format. This is achieved by separating components with a special character, such as a dot. For instance, consider the format info1.info2
.¶
The segment before the dot represents a standard content type, such as application/json+senml
. The segment after the dot includes the name of the payload field, which is the key in a key-value pair, followed by a unique identifier number, like n1
, to ensure uniqueness. Consequently, a key might be formatted as application/senml+json.n1
.¶
This structured key informs the decompressor entity that: (i) the payload is compressed using JSON-encoded SenML, (ii) each field should be decompressed into a "key": "value" format, (iii) fields sharing the same numeric identifier, such as n1
, should be enclosed within curly brackets, and (iv) the entire structure should be wrapped in square brackets (as per [RFC8428]).¶
The criteria for compressing key-value pairs can differ based on specific use cases. This section outlines some triggers that determine whether a key-value pair should be compressed, such as:¶
A counter keeps track of how many instances of a specific key-value pair have been received in adjacent messages. The payload compression may be configured to trigger compression for all the key-value pairs that have been sent at least 5 times in adjacent messages.¶
The internal state of the decompressor entity keeps track of the length of the key-value pairs and, based on a predefined threshold, e.g., 100 bytes, decides on whether the pair is to be compressed or not.¶
A condition accounting for multiple variables can be defined; for example, a timer after which all the key-value pairs have appeared in the payload more than 5 times are to be selected for compression.¶
Following the field selection process in Section 3.2, the decompressor SHALL generate a SCHC context with rules targeting the selected key-value pairs. For each pair, the decompressor SHALL select SCHC rule fields that align with its policies, such as minimizing payload length and using the CDA "elide" where applicable.¶
While payload size reduction is a common goal, other objectives, such as incorporating redundancy to reduce transmission errors, MAY also be considered.¶
The server produces SCHC payload compression rules like the ones showed in Figure 4. In SenML, JSON, and key-value data structures at large, only two fields can be directly mapped to a SCHC rule, namely, FID (the key) and TV (the target value). However, a SCHC rule is composed by 7 to 9 fields. Such fields are populated according to specific needs of the client. As an example, the configuration may be set to operate in a way to minimize the transmitted payload, rather than minimize the overhead of storing the SCHC context for each client. Next, it is explained an example of how the SCHC payload rules generation could be implemented. Rules can be mapped as following:¶
The Field Identifier (FID) can be directly derived from the keys shown in Figure 4, such as application/json+senml.v.1
. In this example, the key's structure carries semantics. The segment before the delimiter character specifies the payload's content type, guiding the decompressor on how to handle the compressed key-value pair. The segment after the delimiter includes the key's name in the key-value pair, followed by a number that groups related fields. For instance, FIDs with the number 1, like n.1
, u.1
, and v.1
, are part of the same SenML record and will be enclosed in curly brackets.¶
The FL is computed by the system evaluating the size of the value to compress.¶
In the case of SenML, the FP is not useful, meaning that the placement of the field would mostly follow the same pattern. However, if for any specific reason the position of the field is of relevance, this field can be used.¶
The DI expresses whether the field for compression/decompression appears at upstream or downstream. In the case of IoT scenarios, the direction is almost always upstream since it is critical to reduce transmission time to extend as much as possible devices' battery life.¶
The TV is the value on which to perform the Matching Operation (MO). Essentially, it is the value, or part of the value, to be compressed.¶
The MO is the operation that is applied to every payload value and, when there is a match between such value and the TV, a de/compression action (CDA) will be applied. In a nutshell, if there is a match, the payload value will be de/compressed.¶
This is the action to be taken to the payload value in case of matching with the TV. Common actions are not-sent (the value will be omitted in the sent payload), value-sent (the value will be sent), L(ess)S(significant)B(it) (only the least significant part of the payload value will be sent).¶
Example:¶
{ "ruleID": 12, "ruleLength": 4, "compression": [ { "FID": "application/senml+json.bn.1", "FL": 22, "FP": 1, "DI": "Up", "TV": "2001:db8:1234:5678::1/", "MO": "equal", "CDA": "not-sent" }, { "FID": "application/senml+json.n.1", "FL": 11, "FP": 2, "DI": "Up", "TV": "temperature", "MO": "MSB", "MOa": "10", "CDA": "LSB" }, { "FID": "application/senml+json.n.2", "FL": 8, "FP": 3, "DI": "Up", "TV": "humidity", "MO": "MSB", "MOa": "7", "CDA": "LSB" }, { "FID": "application/senml+json.u.1", "FL": 3, "FP": 4, "DI": "Up", "TV": "Cel", "MO": "equal", "CDA": "not-sent" }, { "FID": "application/senml+json.u.2", "FL": 33, "FP": 5, "DI": "Up", "TV": "%RA", "MO": "equal", "CDA": "not-sent" }, { "FID": "application/senml+json.v.1", "FL": 4, "FP": 6, "DI": "Up", "TV": "", "MO": "ignore", "CDA": "value-sent" }, { "FID": "application/senml+json.v.2", "FL": 4, "FP": 7, "DI": "Up", "TV": "", "MO": "ignore", "CDA": "value-sent" } ] }
In the above rule, the fields application/senml+json.u.1/2
are never sent because are known a-priori and will probably never change (a temperature sensor will always return Celsius or Fahrenheit). Fields application/senml+json.n.1/2
are shortened only to their last character due to MSB and LSB, e.g., e
and y
. Finally, application/senml+json.v.1/2
, are ignored by the compressor and will be sent without compression, given that they may change. After the dynamic payload compression, the transmitted payload may look as follows:¶
0x0C657941C9999A0000001E¶
In the above example, 0C
is the hexadecimal representation of 12, which indicates that whatever follows is compressed according to rule 12. That is, the SenML base name, 2001:db8:1234:5678::1/
, is omitted. Then, the last letter of the names is compressed, namely, 65
and 79
, representing ASCII for e
(from temperature
) and y
(from humidity
). Moreover, the measurement units Cel
and %RH
are omitted (not-sent CDA applies). Finally, the remaining values are encoded as 4 bytes.¶
This section describes how to manipulate the FID field so to provide hints to the decompressor on how to reconstruct the original payload.¶
As an example, the content-type of the payload, the name of the field associated with the value, and a group identifier may be encoded in the FID. It is possible to separate these pieces of information using a delimiter, e.g., a single character such as a dot. For example, the left part of the first dot is a standard content type, e.g., application/json+senml
, while the right part of the first dot, the name of the payload field followed by a number used to group fields together, e.g., n.1
. As a result, the name of such a key would be application/senml+json.n.1
.¶
This way, the decompressor knows that the compressed payload is of SenML type encoded in JSON, thus knows that every field must be decompressed as "key": "value", every field with the same numeric identifier, e.g., n.1
, u.1
and v.1
, enclosed by curly brackets, and finally everything enclosed into square brackets, as per SenML encoding. More generally, the hints for the decompressor are encoded as: <content-type>.<key>.<group-number>
¶
The SCHC decompressor starts by reading the first byte, 0C
, which indicates the SCHC RuleID, 12. It retrieves and applies the rules sequentially. The first rule, application/senml+json.bn.1
, has a CDA of not-sent, meaning the bn value was omitted during compression and must be added back. The decompressor reconstructs this as "bn": "2001:db8:1234:5678::1/"
.¶
Next, for the rule application/senml+json.n.1
, the decompressor identifies an MSB CDA, indicating a 1-byte value was added during compression. It retrieves 0x65
(representing "E") and appends it to the target value, forming the JSON pair: "n": "temperature"
. This process is repeated for the remaining existing rules.¶
The decompressor must know the payload's content-type to accurately reconstruct the packet. While this example uses SenML JSON, other formats like YAML or XML can also be supported for greater flexibility.¶
There may be several ways to build the compression map exchanged between the server and the client. This section provides one such instance based on a threshold. That is, when a value is repeated more than a predefined number of times (3 in this particular example), it is considered redundant and will be compressed. More elaborated techniques may be used, e.g., size of a value.¶
The server inspects the payload sent from the devices. For every message, in key-value formats, e.g., JSON, the server counts the occurrences of each value. Considering the payload from the example above, after receiving the payload for the first time, the server produces such map:¶
{ "2001:db8:1234:5678::1/temperature": 1, "2001:db8:1234:5678::1/humidity": 1, "Cel": 1, 25.2: 1, "%RH": 1, 30: 1 }¶
The above map counts how many times a particular value has been sent as payload. For example, after 3 iterations the map may look as follows:¶
{ "2001:db8:1234:5678::1/temperature": 3, "2001:db8:1234:5678::1/humidity": 3, "Cel": 3, 25.2: 1, 25.1: 1, 24.8: 1, "%RH": 3, 30: 1, 31: 1, 35: 1 }¶
In the above map, the values of the temperature (25.2, 25.1, 24.8) and humidity (30, 31, 35) have changed over time, and their counter is set to 1. However, the sensor names and measurement units have not changed and their counter value increased to 3. At this point, the server detects that some values in the payload are redundant (sent more than 3 times) and could be compressed by assigning to them a single-digit character. As a result, the server builds and share with the client the following compression map:¶
{ 0: "2001:db8:1234:5678::1/temperature", 1: "2001:db8:1234:5678::1/humidity", 2: "Cel", 3: "%RH", }¶
When the server identifies another value to be redundant, the sender will send another compression map including only the new element (associated to a new unique number). Assuming that the temperature has been constant (25 Celsius) for the last three updates, the new map will look as follows:¶
{ 4: 25 }¶
If an element is no longer used, it can be removed from the compression map. For instance, if a sensor's name changes, the old name can be invalidated, allowing the client to free memory by deleting it. To remove the value 2001:db8:1234:5678::1/temperature
with ID 0, assign a negative value to the ID:¶
{ 0: -1 }¶
This document has the following actions for IANA.¶
Note to RFC Editor: Please replace all occurrences of "[RFC-XXXX]" with the RFC number of this specification and delete this paragraph.¶
The authors would like to thank Rajat Kandoi and Ari Keränen for their support in the inception and review of this draft. Their ideas and thinking are also reflected in the text of the draft.¶