We have identified a new buffer overflow vulnerability in Samsung’s baseband implementation (mainly used in Exynos chipsets). The vulnerability can be exploited to achieve arbitrary code execution in the baseband runtime.
The vulnerability we are disclosing in this advisory affected a wide range of Samsung devices, including phones on the newest Exynos chipsets. The November 2023 issue of the Samsung Semiconductor Security Bulletin contains this vulnerability as CVE-2023-41111.
In GPRS, an LLC layer PDU can be up to 1560 bytes long, but the maximum size of an RLC data block is 22/32/38/52 bytes for the GPRS coding schemes CS-1/2/3/4, respectively, so an LLC PDU generally has to be segmented across several RLC data blocks.
The segmentation and re-assembly procedures for RLC data blocks are described in 3GPP 44.060, sections 9.1.11 and 9.1.12.
After a fixed-size (1 byte long) MAC header, each RLC data block starts with 2 bytes: the first includes the Temporary Flow Identity (TFI) and the Final Block Indicator bit (FBI), whereas the second (“BSN_E”) includes the 7-bit BSN (block sequence number) and the Extension bit (E).
A value of 1 for the E bit means the rest of the PDU is all actual RLC data. Otherwise (E==0), these 2 header bytes are followed by a variable number of LI_M_E octets, each of which consists of the 6-bit Length Indicator, the More bit, and the Extension bit fields.
After the optional LI_M_E octets, the rest of the RLC data block is the actual RLC data.
The FBI, E, and LI_M_E fields all play a role in the re-assembly process.
The LI indicates the length of a fragment, the M (more) bit says whether there is yet another LLC PDU whose fragment will be present in the current block, and the E (extension) bit signals whether this LI_M_E field is the last one (0 means it is not the last one, 1 means it is the last one).
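The field layout described above can be sketched as a pair of small decoders. This is an illustrative sketch only: the type and function names are hypothetical, and the exact position of the 5-bit TFI within the first byte is an assumption based on the downlink RLC data block layout in the specification. The LI, M, E, BSN and FBI positions match the masks and shifts visible in the decompiled code later in this advisory.

```c
#include <stdint.h>

/* Decoded view of the two mandatory RLC header bytes (hypothetical type). */
typedef struct {
    uint8_t tfi; /* Temporary Flow Identity; bit position is an assumption */
    uint8_t fbi; /* Final Block Indicator: least significant bit of byte 1 */
    uint8_t bsn; /* 7-bit block sequence number */
    uint8_t e;   /* Extension bit of BSN_E: 0 means LI_M_E octets follow */
} rlc_hdr_t;

/* Decoded view of one LI_M_E octet (hypothetical type). */
typedef struct {
    uint8_t li; /* 6-bit Length Indicator */
    uint8_t m;  /* More bit */
    uint8_t e;  /* Extension bit: 1 means this is the last LI_M_E octet */
} li_m_e_t;

static rlc_hdr_t rlc_decode_hdr(uint8_t byte1, uint8_t bsn_e)
{
    rlc_hdr_t h;
    h.fbi = byte1 & 1;
    h.tfi = (byte1 >> 1) & 0x1F;
    h.bsn = bsn_e >> 1;
    h.e   = bsn_e & 1;
    return h;
}

static li_m_e_t rlc_decode_li_m_e(uint8_t octet)
{
    li_m_e_t f;
    f.li = octet >> 2;
    f.m  = (octet >> 1) & 1;
    f.e  = octet & 1;
    return f;
}
```

For example, the octet 0x01 decodes to LI==0, M==0, E==1, which is the special value that becomes relevant further down.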
What we can see is that there is no such thing as an “LLC PDU identifier”: within a given RLC Temporary Block Flow (“session”), only one LLC PDU’s fragments can be collected and re-assembled at a time. In other words, any fragment that comes next is considered part of the ongoing LLC PDU.
According to the specification, only the last fragment of an LLC PDU may have a Length Indicator value (i.e. an LI_M_E field corresponding to it). This makes sense: as long as a fragment is not the last fragment of an LLC PDU, it must fill the remainder of the current RLC data block. Consequently, it doesn’t need a byte wasted on an LI_M_E octet: the preceding “E” and/or “M” values are already able to signal its presence.
However, there is an exception to this, as 44.060, section 10.4.14 explains:
A singular case occurs when the end of the Upper Layer PDU would fit within the RLC data block but the addition of the Length Indicator octet (to indicate the Upper Layer PDU boundary) causes the Upper Layer PDU to extend into the next RLC data block. In this case, this additional LI field shall take the value 0 whatever is the length of the last but one Upper Layer PDU segment.
That optimization may sound like an infinitesimal gain, but this quirk of the algorithm is crucial to this vulnerability chain.
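To make the singular case concrete, here is a minimal sender-side sketch. It assumes the simplest situation, where the tail of the LLC PDU is the first thing placed into a block and `block_capacity` is the number of data bytes the block could carry without any LI_M_E octet. The function names are hypothetical and this is not taken from any real implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when the PDU tail would fill the block exactly, so there is no
 * room left for the LI_M_E octet that would mark the PDU boundary. */
static bool needs_li_zero(size_t pdu_bytes_left, size_t block_capacity)
{
    return pdu_bytes_left == block_capacity;
}

/* Number of bytes of the current LLC PDU carried in this block. */
static size_t segment_len(size_t pdu_bytes_left, size_t block_capacity)
{
    if (pdu_bytes_left < block_capacity)
        return pdu_bytes_left;      /* fits together with an LI_M_E octet */
    if (needs_li_zero(pdu_bytes_left, block_capacity))
        return block_capacity - 1;  /* LI==0 octet sent, last byte spills over */
    return block_capacity;          /* middle fragment, no LI_M_E octet */
}
```

With a capacity of 20 data bytes, a 20-byte tail is therefore sent as an LI==0 octet plus 19 bytes, and the single remaining byte opens the next block.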
The summarized algorithm of handling an RLC data block is as follows:
- read the LI_M_E field, recursively reading a next one until E == 1
- if LI in the LI_M_E field is non-0, we have the last fragment of the current LLC PDU: concatenate it with any already collected ones and send the PDU to the upper layer, then, based on the combination of the values in LI_M_E, continue onto a next fragment within the RLC data block:
  - M==0 && E==0 is not valid, should be ignored
  - M==0 && E==1 means no more LLC PDUs and no more Extensions to parse after this one, finished with this RLC data block
  - M==1 && E==1 means the rest of the data is a new LLC PDU’s start, but there are no more Extensions to parse; the new LLC PDU will finish in a later RLC data block, so just save this first fragment of the new LLC PDU and then finish with this RLC data block
  - M==1 && E==0 means move on to the next LI_M_E, which is the first of the next LLC PDU, and process it based on the same logic
- if LI is 0, store this fragment (calculating its size based on the number of data bytes left in the RLC data block after the optional number of LI_M_Es and any data bytes that have been matched by preceding LI_M_E fields), on the assumption that the first data byte within the next arriving RLC data block will complete the current LLC PDU, so process any LI_M_E header byte(s) of that next RLC data block by accounting for the first non-LI_M_E byte being the last byte of the current LLC PDU

The first key observation is that this logic, when followed correctly, guarantees that any fragment’s data size is at least block_size-3 (3 being the size of the mandatory headers), with only the final fragment being an exception. In other words, given that the smallest block size, with Coding Scheme 1, is 23, we get the following equation for the maximum possible valid fragment count in RLC:
max_llc_size / (min_block_size - 3) + 1 = 1560 / 20 + 1 = 79
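The bound above is plain integer arithmetic. The sketch below spells it out, with constant and function names chosen only for illustration.

```c
enum {
    MAX_LLC_PDU_SIZE   = 1560, /* maximum LLC PDU length in bytes */
    MIN_BLOCK_SIZE_CS1 = 23,   /* smallest block size (Coding Scheme 1) */
    MANDATORY_HDR_SIZE = 3     /* 1 MAC header byte + 2 RLC header bytes */
};

/* Every non-final fragment carries at least (block size - header size)
 * bytes; one extra slot accounts for the final, possibly shorter, fragment. */
static int max_valid_fragments(int max_llc_size, int min_block_size, int hdr_size)
{
    return max_llc_size / (min_block_size - hdr_size) + 1;
}
```

Calling `max_valid_fragments(MAX_LLC_PDU_SIZE, MIN_BLOCK_SIZE_CS1, MANDATORY_HDR_SIZE)` evaluates to 1560 / 20 + 1 = 79.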
The second key observation is that an implementation must take care that:
- only one LI_M_E field with LI==0 is allowed per LLC PDU
- only one LI_M_E header with LI==0 is present in any given RLC data block (since it must be followed by all data bytes, ergo there is no room for additional fragments)
- an LI_M_E field with LI==0 shall have M==0 and E==1 values (to be precise, the specification states that for the LI==0 case M==0 shall be sent, but the receiver shall simply ignore its value)

In Samsung’s case, however, these requirements were not enforced!
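As a rough illustration of what enforcing the per-block rules could look like, the sketch below walks the LI_M_E octets of one RLC data block and treats an LI==0 octet as necessarily terminal. The function name and return convention are hypothetical, and this is not Samsung's code or fix. The per-LLC-PDU rule would need state kept across blocks and is not covered here.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the number of LI_M_E octets in the block, or -1 if the header
 * chain is malformed. max_octets bounds how many octets may be read. */
static int count_li_m_e_strict(const uint8_t *octets, size_t max_octets)
{
    for (size_t i = 0; i < max_octets; i++) {
        uint8_t li = octets[i] >> 2;
        uint8_t m  = (octets[i] >> 1) & 1;
        uint8_t e  = octets[i] & 1;

        if (li == 0) {
            /* LI==0 must be the last header: the value of M is ignored,
             * but E==0 (another header follows) is rejected, which also
             * rules out stacking several LI==0 octets. */
            return e ? (int)(i + 1) : -1;
        }
        if (!m && !e)
            return -1;          /* M==0 && E==0 is not a valid combination */
        if (e)
            return (int)(i + 1); /* last header of this block */
    }
    return -1;                   /* no terminating E==1 within bounds */
}
```

A chain such as 0x02 0x02 0x01 (two LI==0 octets with E==0 followed by a terminating one) is rejected at the very first octet under this check.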
Instead, the implementation parsed the header fields in two rounds:
- first, loop over the LI_M_E headers and, since it can be necessary, calculate and store the value of the “remaining data bytes”
- second, extract the fragments, relying on the value stored in the first round

The problem was a mismatch in how the two rounds handled LI_M_E headers with the value LI==0.
In the first round, the logic that looped over the header bytes did not enforce the “only once” rule on the special case, instead allowing it to occur any number of times. This can be seen below from the decompiled pseudocode snippet of the RLC_handle_DATA_IND function:
…
rlc_data_block_ptr = blk_p;
block_offset_new = 3;
rlcmac_size = uVar8;
if (-1 < (int)mcs_or_cs_encoded) {
  rlcmac_size = (int)RLCMAC_SIZE_BY_CS[mcs_or_cs_encoded];
  rlcmac_size_ = rlcmac_size;
}
RLC_CONTEXT[sim_].rlc_lens[(int)bsn_00] = rlcmac_size;
if (e_param == 0) {
  remaining_data_size = rlcmac_size - 3;
  max_data_size = rlcmac_size + -4;
  block_offset_from_start = 4;
  data_ptr_ = blk_p->data;
  LI = *data_ptr_ >> 2;
  rlcmac_size_ = remaining_data_size;
  if (((*data_ptr_ & 1) == 0) || ((int)(uint)LI <= max_data_size)) {
    is_error = false;
  }
  else {
    /* Invalid RLC block %d %x */
    pdStack_3c = &dbt_msg_434d0544;
    uStack_38 = uVar7;
    pal_dbgLog(&pdStack_3c,(uint)LI,data_ptr_,&SUB_fecdba98,iVar3,puVar9);
    is_error = true;
  }
  if (remaining_data_size != 3) {
    LI_M_E_byte = *data_ptr_;
    /* block_offset_new is 4 so addressing with it has to start at +4 */
    block_offset_real = 0;
    do {
      LI_ = (uint)LI;
      /* EXT == 1 -> break, no more headers */
      if ((LI_M_E_byte & 1) != 0) {
        if (!is_error) {
          block_offset_real = block_offset_real + 4;
          goto RLC_HDR_CALC_DONE;
        }
        break;
      }
      /* LI points beyond data block */
      if (max_data_size < (int)LI_) {
        data_ptr_ = rlc_data_block_ptr->data + block_offset_real;
INVALID_BLK:
        /* Invalid RLC block %d %x */
        pdStack_3c = &dbt_msg_434d057c;
        uStack_38 = uVar7;
        pal_dbgLog(&pdStack_3c,LI_,data_ptr_,&SUB_fecdba98,iVar3,puVar9);
        break;
      }
      /* this means M == 0 and E == 0, since E == 0 was already checked */
      if (-1 < (int)((uint)LI_M_E_byte << 30)) {
        data_ptr_ = rlc_data_block_ptr->data + block_offset_real + 1;
        goto INVALID_BLK;
      }
      /* +5 as in 4+1 because it has to start at +4.*/
      block_offset_from_start = block_offset_real + 5;
      next_block_offset_real = block_offset_real + 1;
      LI_M_E_byte = rlc_data_block_ptr->data[block_offset_real + 1];
      /* BUG: if LI == 0, the max_data_size is only decremented by 1 and there is no detection that this can't be stacked! */
      max_data_size = (max_data_size + -1) - LI_;
      LI = LI_M_E_byte >> 2;
      block_offset_real = next_block_offset_real;
    } while (rlcmac_size - 6 != (undefined *)next_block_offset_real);
  }
  (...)
RLC_HDR_CALC_DONE:
  /* Block offset %d */
  dStack_3c.ptr = &dbt_msg_434d05f8;
  dStack_3c.val = uVar7;
  pal_dbgLog(&dStack_3c,block_offset_real,&SUB_fecdba98);
  /* this is the same structure as ctx->rlc_offset[bsn] */
  g_rlc_cxt[sim_].rlc_offset[bsn] = (char)block_offset_real;
  (...)
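To make the effect of the first round easier to follow, here is a heavily simplified, self-contained model of the loop above. It is a sketch written for this explanation, not the firmware's code: the function name is hypothetical, the paths elided with (...) in the snippet are modelled as a plain -1 return, and `data` is assumed to point at the first LI_M_E octet in a buffer of at least `rlcmac_size - 3` bytes.

```c
#include <stdint.h>

/* Returns the data offset the first round would store for the block,
 * or -1 for the error / elided paths of the decompiled snippet. */
static int model_first_round_offset(const uint8_t *data, int rlcmac_size)
{
    int max_data_size = rlcmac_size - 4;
    int offset = 0;
    uint8_t octet;

    if (rlcmac_size < 7)
        return -1;                       /* guard added for the model only */
    octet = data[0];
    if ((octet & 1) && (octet >> 2) > max_data_size)
        return -1;                       /* "Invalid RLC block" on first octet */

    do {
        int li = octet >> 2;
        if (octet & 1)
            return offset + 4;           /* E==1: 3 mandatory bytes + headers */
        if (li > max_data_size)
            return -1;                   /* LI points beyond the data block */
        if ((octet & 2) == 0)
            return -1;                   /* M==0 && E==0 */
        /* As in the snippet: an LI==0 header only costs 1 byte here and
         * nothing prevents the next header from being LI==0 again. */
        max_data_size -= 1 + li;
        offset++;
        octet = data[offset];
    } while (offset != rlcmac_size - 6);

    return -1;                           /* path elided in the snippet */
}
```

Feeding this model a CS-1 sized block (23) whose header area is sixteen 0x02 octets (LI==0, M==1, E==0) followed by 0x01 yields an offset of 20, which leaves only 3 bytes accounted as fragment data.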
However, in the second round, the logic (correctly) assumed that an LI==0 value immediately means that there can be no more extension headers to parse and no more fragments to extract: instead, the code pivoted to treating all the rest of the RLC data block as the next fragment of the already ongoing LLC PDU and then either returned if the total still fit under 1560, or directly triggered the concatenation otherwise.
This behavior can be seen in the code snippets below, from the function RLC_DecodeDLDataGPRS, which is called by RLC_DecodeDLData, which in turn is called by the above function after the data offset calculation:
if (rlc_frags_desc->state == 0) {
  /* S%d:%d */
  pdStack_30 = &dbt_msg_434d0eec;
  uStack_2c = uVar8 | 0x345;
  pal_dbgLog(&pdStack_30,(uint)ctx->rlc_offset[bsn],bsn,&SUB_fecdba98,puVar19);
  rlc_frags_desc->state = 1;
  rlc_frags_desc->bsn = (byte)bsn;
  rlc_frags_desc->LI_h_offset = ctx->rlc_offset[bsn];
}
if (rlc_type == 0x1) {
  RLC_DecodeDLDataEGPRS(sim,bsn,ctx,rlc_frags_desc);
}
else {
  RLC_DecodeDLDataGPRS(sim,bsn,ctx,rlc_frags_desc);
}
void RLC_DecodeDLDataGPRS(uint sim,int bsn,big_ctx *ctx,rlc_fragms_desc *rlc_frags_desc)
{
  byte new_state;
  big_ctx *ctx_by_sim;
  uint new_pdu_len_;
  uint is_state_zero;
  uint LI;
  int data_offset;
  bool bVar1;
  uint state;
  rlcmac_struct *frame_ptr;
  int rlc_len;
  rlcmac_struct **frame_ptr_ptr;
  byte *data_ptr;
  dbt_cmt_t dStack_30;
  undefined *puStack_28;
  byte LIME;
  byte rlc1;
  short sim_;
  frame_ptr = ctx->rlc_ptrs[bsn];
  frame_ptr_ptr = ctx->rlc_ptrs + bsn;
  rlc_len = ctx->rlc_lens[bsn];
  rlc1 = frame_ptr->rlc1;
  sim_ = (short)sim;
  sim_ = (short)sim;
  /* Byte 2 of RLC header has E as its LSB: if it is 0, it means that there ARE LI_M_E extension(s) to handle */
  if ((frame_ptr->rlc2 & 1) == 0) {
    ctx_by_sim = g_L2_cxt + sim_;
    LIME = frame_ptr->data[0];
    new_pdu_len_ = (uint)ctx->rlc_offset[bsn];
    frag_state = rlc_frags_desc->state == 1;
    /* if LI != 0 */
    if (LIME >> 2 != 0) {
      data_ptr = frame_ptr->data;
      /* loop to handle until there are no more LI_M_E extensions to handle */
      do {
        data_ptr = data_ptr + 1;
        LI = (uint)(LIME >> 2);
        /* we know that LI != 0 must be the final fragment of an LLC PDU, so we concatenate it, then move on to potential other LI_M_E headers */
        if (frag_state) {
          /* !!! Notice how the LI value here is not verified yet, this is why rlc_DLPduConcatenate must take care to check total size, it could be over 1560 with it, even without games with LI_M_E header field stacking in fragments */
          rlc_frags_desc->pdu_len = rlc_frags_desc->pdu_len + LI;
          new_pdu_len_ = rlc_DLPduConcatenate(sim,LI,bsn,rlc_frags_desc);
          rlc_frags_desc->state = 2;
          LI = 0;
        }
        /* after concatenation on a non-0 LI fragment, check if there is nothing left; if M==0 and E==0, that is the case, so exit */
        if ((LIME & 2) == 0) {
          /* M=0 */
          RLC_freePdusByBsn(sim,bsn,ctx_by_sim);
          rlc_frags_desc->state = 0;
          goto RETURN;
        }
        /* otherwise, check if the FBI is 1, in this case process the rest of it as one LLC PDU even if no extensions */
        /* adjust the data offset based on the just concatenated LLC PDU's last fragment's LI size */
        data_offset = LI + new_pdu_len_;
        rlc_frags_desc->bsn = (byte)bsn;
        rlc_frags_desc->LI_h_offset = (byte)data_offset;
        /* rlc byte1 LSB is the FBI -> if FBI is true and there are no extensions -> we concatenate */
        if ((rlc1 & 1) != 0 && (LIME & 1) != 0) {
          rlc_frags_desc->pdu_len = rlc_len - data_offset;
          rlc_DLPduConcatenate(sim,rlc_len - data_offset,bsn,rlc_frags_desc);
          rlc_frags_desc->state = 2;
          goto LAB_4243ad66;
        }
        /* no more Extensions: add fragment based on calc'd data offset and exit! */
        if ((LIME & 1) != 0) {
          rlc_frags_desc->state = 1;
          rlc_frags_desc->pdu_len = rlc_len - data_offset;
          RLC_addPDUFragm(sim,bsn,ctx,rlc_frags_desc);
          ctx->rlc_type[bsn] = 0;
          *frame_ptr_ptr = (rlcmac_struct *)0x0;
          goto RETURN;
        }
        frag_state = 1;
        rlc_frags_desc->state = 1;
        LIME = *data_ptr;
      } while (LIME >> 2 != 0);
    }
    /* simplest case: no extensions. Either concatenate the whole thing, or just add as a fragment the whole thing and return */
    if (frag_state) {
      new_total_len = rlc_frags_desc->pdu_len + (rlc_len - new_pdu_len_);
      rlc_frags_desc->pdu_len = new_total_len;
      /* if the LI is not under 1560 anymore, then always trigger concatenation */
      if (1560 < LI) {
        rlc_DLPduConcatenate(sim,rlc_len - new_pdu_len_,bsn,rlc_frags_desc);
        rlc_frags_desc->state = 2;
        (...)
      }
      /* else: simply add the fragment and exit! */
      RLC_addPDUFragm(sim,bsn,ctx,rlc_frags_desc);
      ctx->rlc_type[bsn] = 0;
      *frame_ptr_ptr = (rlcmac_struct *)0x0;
    }
  }
  /* this is the case where `BSN_E` has the Extension bit set to 1, meaning that there are no LI_M_E headers at all */
  else {
    /* E=1 */
    pdStack_30 = &dbt_msg_434d1814;
    uStack_2c = sim * 0x40000 + 0x40000 | 0x3c2;
    pal_dbgLog(&pdStack_30,&SUB_fecdba98);
    if (rlc_frags_desc->state != 2) {
      /* ctx->rlc_offset[bsn] is the data offset we calculated in the first loop */
      rlc_fragm_len = rlc_len - (uint)ctx->rlc_offset[bsn];
      new_pdu_len_ = rlc_frags_desc->pdu_len + rlc_fragm_len;
      rlc_frags_desc->pdu_len = new_pdu_len_;
      if (g_rlc_cxt[sim_id] == 2) {
        /* store if the max length is not reached, otherwise concatenate */
        if (new_pdu_len_ < 1560) {
ADD_FRAGM_AND_RETURN:
          RLC_addPDUFragm(sim,bsn,ctx,rlc_frags_desc);
          ctx->rlc_type[bsn] = 0;
          *frame_ptr_ptr = (rlcmac_struct *)0x0;
          goto RETURN;
        }
        rlc_DLPduConcatenate(sim,rlc_len,bsn,rlc_frags_desc);
        RLC_freePdusByBsn(sim,bsn,SOMETHING_BIG_CONTEXT + sim_);
        new_state = 0;
      }
      else {
        /* if max len is reached OR the LSB in first RLC header, i.e. the FBI (Final Block Indicator) is 1, then concatenate, otherwise just store */
        if ((new_pdu_len_ < 1560) && ((rlc1 & 1) == 0)) goto ADD_FRAGM_AND_RETURN;
        rlc_DLPduConcatenate(sim,rlc_len,bsn,rlc_frags_desc);
        RLC_freePdusByBsn(sim,bsn,SOMETHING_BIG_CONTEXT + sim_);
        new_state = 2;
      }
      rlc_frags_desc->state = new_state;
    }
  }
RETURN:
  if (l_stack_cookie == &SUB_d1e4c0de) {
    return;
  }
SSP_ABORT:
  /* WARNING: Subroutine does not return */
  stack_smash_abort();
}
The issue with this mismatch, of course, was that the second round used the calculated data size stored away in the first round, where the stacking of LI_M_E fields with LI==0 enabled the calculated “remaining data block length” to be as small as 3.
Therefore, the fragment count was not capped at 79: it was possible to “spray” RLC data blocks with LI_M_E == 0 0 1 bytes and therefore create fragment addition events that would increase the total LLC PDU size by as little as 3 bytes at a time.
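A back-of-the-envelope sketch shows how far this pushes the fragment count. The function below is purely illustrative (hypothetical name): it counts how many fragments can be stored while a "new total length < limit" check, like the one in the snippet above, still passes.

```c
/* Counts fragment additions that pass a "new total < size_limit" check
 * when every fragment contributes bytes_per_fragment bytes. */
static int count_fragment_additions(int size_limit, int bytes_per_fragment)
{
    int total = 0;
    int count = 0;

    while (total + bytes_per_fragment < size_limit) {
        total += bytes_per_fragment;
        count++;
    }
    return count;
}
```

With the compliant minimum of 20 bytes per fragment, the 1560-byte limit stops storage after 77 additions, which stays within 79 slots. With 3 bytes per fragment, the same limit only stops it after 519 additions.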
By itself, this would have only meant specification non-compliant behavior. However, the RLC_addPDUFragm function omitted an explicit check of the used slot count of the array it stored fragments into, instead relying on the assumption that the check for the maximum accumulated size (1560) would implicitly enforce a maximum fragment count of 79.
As we can see below, the increment happens on every call without a check, therefore leading to a straightforward array overflow. Since this array was stored in global memory, the two vulnerabilities together resulted in a BSS buffer overflow.
void RLC_addPDUFragm(uint sim,int bsn,big_ctx *ctx,rlc_fragms_desc *fragm_desc)
{
  /* EGPRS RECV */
  if (ctx->rlc_type[bsn] == 5) {
    (...)
  }
  else {
    /* GPRS RECV */
    if (ctx->rlc_type[bsn] == 1) {
      fragm_desc->fragms[index] = ctx->rlc_ptrs[bsn];
      ctx->rlc_ptrs[bsn] = (rlcmac_struct *)0x0;
      /* Logging */
      (...)
    }
  }
  /* set:
     - block offset
     - data_block_size
     - is_heap_allocated flag
     into the fragm descriptor, from the bsn descriptor */
  /* comes from LI looping in RLC_handle_DATA_IND directly, can be >= 3 */
  fragm_desc->block_offs[index] = ctx->rlc_offset[bsn];
  fragm_desc->block_sizes[index] = ctx->rlc_lens[bsn];
  fragm_desc->is_alloced_fragm[index] = ctx->rlc_allocated[bsn];
  /* n_blks number of fragments increase - no check! OVERFLOW ! */
  fragm_desc->n_blks = fragm_desc->n_blks + 1;
}
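For contrast, the sketch below shows the kind of explicit slot check that the function above lacks. It uses a reduced, hypothetical structure with only two of the per-fragment arrays, and it is not Samsung's actual fix, whose details are not covered in this advisory.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRAG_SLOTS 79

/* Reduced stand-in for the fragment descriptor (hypothetical type). */
typedef struct {
    uint8_t block_offs[FRAG_SLOTS];
    int32_t block_sizes[FRAG_SLOTS];
    int32_t n_blks;
} frag_table_t;

/* Stores one fragment's metadata; refuses to write once all slots are
 * used instead of trusting the accumulated-size check to bound n_blks. */
static bool frag_table_add(frag_table_t *t, uint8_t offs, int32_t size)
{
    if (t->n_blks < 0 || t->n_blks >= FRAG_SLOTS)
        return false;
    t->block_offs[t->n_blks] = offs;
    t->block_sizes[t->n_blks] = size;
    t->n_blks++;
    return true;
}
```

A caller would have to treat a `false` return as a malformed LLC PDU and drop the collected fragments rather than continue.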
For completeness, here is the definition of the overflowed structure. As we can see, multiple arrays within the structure get overflowed if the fragment count increases beyond 79, causing multiple simultaneous intra-struct memory corruptions and, eventually, memory corruption beyond the structure as well.
byte state
byte bsn
byte LI_h_offset
char pad
int pdu_len
char[79] block_offs
char[79] is_alloced_fragm
char pad2
char pad3
int[79] block_sizes
rlcmac_struct *[79] fragms
int[79] egprs_plus_fragms
int n_blks
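The listing above can be transcribed into a C structure to see which fields sit next to each other. This is a sketch under stated assumptions: the type name is made up, `int` fields are taken to be 32 bits wide, and the `rlcmac_struct *` entries are modelled as 32-bit values because pointers on the target are assumed to be 32-bit.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAG_SLOTS 79

typedef struct {
    uint8_t  state;
    uint8_t  bsn;
    uint8_t  LI_h_offset;
    char     pad;
    int32_t  pdu_len;
    char     block_offs[FRAG_SLOTS];
    char     is_alloced_fragm[FRAG_SLOTS];
    char     pad2;
    char     pad3;
    int32_t  block_sizes[FRAG_SLOTS];
    uint32_t fragms[FRAG_SLOTS];            /* rlcmac_struct * on the target */
    int32_t  egprs_plus_fragms[FRAG_SLOTS];
    int32_t  n_blks;
} rlc_fragms_desc_model;

/* Offset of the first out-of-bounds element of block_offs, i.e. the byte
 * written when index reaches FRAG_SLOTS. */
static size_t first_oob_block_offs(void)
{
    return offsetof(rlc_fragms_desc_model, block_offs) + FRAG_SLOTS;
}
```

Under these assumptions, `first_oob_block_offs()` equals the offset of `is_alloced_fragm`, so the first out-of-bounds write to `block_offs` lands in the neighbouring array, and `n_blks` itself sits at the end of the structure, after all the arrays that are indexed with it.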
All Samsung chipsets containing Samsung’s baseband implementation, including all Exynos chipsets.
Samsung OTA images, released after October 2023, contain the fix for the vulnerability.