Implement Control Message handling between V-SMF and H-SMF
during Home Routed Roaming process
Completed the implementation of control messages exchanged
between V-SMF and H-SMF as part of the Home Routed Roaming process.
SMF selection according to 4.3.2.2.3 of TS23.502.
V-SMF makes discovery in the V-NRF according to V-NSSF.
H-SMF makes discovery in the H-NRF according to H-NSSF.
(The AMF goes through the V-NSSF and forwards the message seeking the NRF to the H-NSSF.)
According to TS29.503, we can choose whether or not to allow LBO roaming
on a per-session basis.
To this end, we have made changes to allow us to set this via the WebUI.
NF should accept 204 No Content for Update Subscription requests.
According to figure 5.2.2.5.6.1 of the 3GPP TS 29.510 NRF specification,
the NRF may return 204 or 200 for a successful update operation.
2a. On success, if the NRF accepts the extension of the lifetime
of the subscription, and it accepts the requested value for the "validityTime"
attribute, a response with status code "204 No Content" shall be returned.
2b. On success, if the NRF accepts the extension of the lifetime
of the subscription, but it assigns a validity time different than
the value suggested by the NF Service Consumer, a "200 OK" response code shall
be returned. The response shall contain the new resource representation
of the "subscription" resource, which includes the new validity time,
as determined by the NRF, after which the subscription becomes invalid.
I changed it so that all NFs can accept both 200 and 204 status codes.
I also changed the default behavior of the NRF to respond with
204 No Content.
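As a rough illustration of rules 2a and 2b from the consumer side, the sketch below picks the validity time to use after an Update Subscription response. The function name is hypothetical, and representing the validity time as a number of seconds is a simplifying assumption; it is not the Open5GS code.

```c
#include <assert.h>

/*
 * Hypothetical helper, not an Open5GS API.
 * Returns the validity time (in seconds, an assumed representation) that the
 * NF Service Consumer should use after an Update Subscription response,
 * or -1 if the update must be treated as failed.
 */
long validity_after_update(int http_status, long requested_sec,
                           long returned_sec)
{
    assert(requested_sec > 0);

    switch (http_status) {
    case 204:
        /* 2a: the NRF accepted the requested value; there is no body. */
        return requested_sec;
    case 200:
        /* 2b: the NRF assigned its own value; it must be in the body. */
        return returned_sec > 0 ? returned_sec : -1;
    default:
        return -1;
    }
}
```

Treating a 200 response without a usable validity time as a failure is a design choice of this sketch, based on 2b requiring the new resource representation.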
TS24.008
10.5.6.12 Traffic Flow Template
Table 10.5.162: Traffic flow template information element
Number of packet filters (octet 3)
The number of packet filters contains the binary coding
for the number of packet filters in the packet filter list.
The number of packet filters field is encoded in bits 4
through 1 of octet 3 where bit 4 is the most significant
and bit 1 is the least significant bit.
For the "delete existing TFT" operation and
for the "no TFT operation", the number of packet filters shall be
coded as 0. For all other operations, the number of packet filters
shall be greater than 0 and less than or equal to 15.
The array of TLV messages is limited to 16.
So, Flow(PDI.SDF_Filter) in PDR is limited to 16.
Therefore, we defined the maximum number of flows as 16.
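The octet 3 rule quoted above can be sketched as follows. The function and constant names are hypothetical. The position of the operation code (bits 8 through 6) and its values come from the same TS 24.008 table rather than from the passage quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TFT operation codes relevant to the rule (bits 8..6 of octet 3). */
enum {
    TFT_OP_DELETE_EXISTING_TFT = 2, /* 010 */
    TFT_OP_NO_TFT_OPERATION = 6     /* 110 */
};

/* Bits 8..6 of octet 3. */
static uint8_t tft_opcode(uint8_t octet3)
{
    return (uint8_t)((octet3 >> 5) & 0x07);
}

/* Bits 4..1 of octet 3, bit 4 being the most significant. */
static uint8_t tft_num_of_packet_filters(uint8_t octet3)
{
    return (uint8_t)(octet3 & 0x0f);
}

/* Checks the packet filter count against the operation code. */
bool tft_filter_count_is_valid(uint8_t octet3)
{
    uint8_t op = tft_opcode(octet3);
    uint8_t num = tft_num_of_packet_filters(octet3);

    assert(num <= 15); /* always holds for a 4-bit field */

    if (op == TFT_OP_DELETE_EXISTING_TFT || op == TFT_OP_NO_TFT_OPERATION)
        return num == 0;

    return num > 0;
}
```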
* [UPF/SGW-U] Optimizing data-path (#3306)
In ogs_pfcp_up_handle_pdr, there is a copy operation performed on recvbuf,
which can reduce the sending performance in the data path.
We believe that this copy operation can be eliminated.
Of course, if it is removed, the recvbuf no longer needs to be released
at the location where ogs_pfcp_up_handle_pdr is called. After testing,
it has indeed shown an improvement in performance of approximately 15-18%.
/*
sendbuf = ogs_pkbuf_copy(recvbuf);
if (!sendbuf) {
ogs_error("ogs_pkbuf_copy() failed");
return false;
}*/
sendbuf = recvbuf;
* update it
Update Bearer Request
Modify Bearer Context Request
Modify Bearer Context Accept
Update Bearer Response
In the process above, we incorrectly used the Timer
that the MME uses to wait for the eNB.
We used xact's holding timer, which continues to hold the transaction
for further exception handling even after sending the Update Bearer Response.
This timer should end exactly when the Update Bearer Response is sent
by the MME to the SGW-C. Therefore, we have added a new peer timer
in xact for this purpose.
I created ogs_sbi_xact_find_by_id() with a hash
to replace ogs_sbi_xact_cycle().
Modified to find the xact via xact->id
when making an HTTP request with the SBI client function
and waiting for the HTTP response.
Pool library has the following issues with XXX_cycle,
including mme_enb_cycle()/amf_ue_cycle()
```
INIT POOL(SIZE:5)
Alloc Node1
Alloc Node2
Alloc Node3
Alloc Node4
Alloc Node5
Free Node4
Free Node3
PoolCycle(Node4) is NULL (Freed...OK!)
PoolCycle(Node3) is NULL (Freed...OK!)
Alloc Node6
Alloc Node7
PoolCycle(Node4) is Not NULL (Freed...but NOK!)
PoolCycle(Node3) is Not NULL (Freed...but NOK!)
PoolCycle(Node6) is Not NULL (Allocated...OK!)
PoolCycle(Node7) is Not NULL (Allocated...OK!)
```
If we use ogs_pool_alloc() to allocate a node, call ogs_pool_free() on it,
and later call ogs_pool_cycle() on the same node, the result must always be NULL.
However, once the freed slot has been allocated again,
calling ogs_pool_cycle() on the old node may return a "valid" pointer.
To solve the problem, we added a hash id to the pool memory and
added the ogs_pool_find_by_id() function.
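The stale-pointer problem and the id-based lookup can be illustrated with a minimal pool. This is only a sketch: all names are hypothetical, the lookup is a linear scan instead of a hash, and it is not the ogs_pool implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 5

typedef struct {
    uint32_t id; /* unique per allocation; 0 means the slot is free */
} node_t;

typedef struct {
    node_t slots[POOL_SIZE];
    uint32_t next_id;
} pool_t;

void pool_init(pool_t *pool)
{
    assert(pool);
    for (size_t i = 0; i < POOL_SIZE; i++)
        pool->slots[i].id = 0;
    pool->next_id = 1;
}

/* Takes the first free slot and stamps it with a fresh id. */
node_t *pool_alloc(pool_t *pool)
{
    assert(pool);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool->slots[i].id == 0) {
            pool->slots[i].id = pool->next_id++;
            if (pool->next_id == 0) /* skip the reserved "free" value */
                pool->next_id = 1;
            return &pool->slots[i];
        }
    }
    return NULL;
}

void pool_free(node_t *node)
{
    assert(node && node->id != 0);
    node->id = 0;
}

/* Pointer-based check: it only knows whether the slot is in use,
 * not whether it still holds the allocation the caller remembers. */
node_t *pool_cycle(node_t *node)
{
    return (node && node->id != 0) ? node : NULL;
}

/* Id-based check: a stale id never matches a reallocated slot. */
node_t *pool_find_by_id(pool_t *pool, uint32_t id)
{
    assert(pool);
    if (id == 0)
        return NULL;
    for (size_t i = 0; i < POOL_SIZE; i++)
        if (pool->slots[i].id == id)
            return &pool->slots[i];
    return NULL;
}
```

A caller that remembers the id instead of the pointer gets NULL for a freed node even after its slot has been reused, which is the behavior the sequence above expects.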
Consider the following situation.
```
1. SMF->SGW-C->MME: First Update Bearer Request
2. MME->UE: First Modify EPS bearer context request
3. SMF->SGW-C->MME: Second Update Bearer Request
4. MME->UE: Second Modify EPS bearer context request
5. UE->MME: First Modify EPS bearer context accept
6. MME->SGW-C->SMF: First Update Bearer Response
7. UE->MME: Second Modify EPS bearer context accept
8. MME->SGW-C->SMF: Second Update Bearer Response
```
Until now, only one GTP transaction was managed per bearer.
Therefore, if the SMF/SGW-C sent another Update Bearer Request to the MME
before the UE had answered with a Modify EPS bearer context accept,
the NEW Update Bearer Request overwrote the OLD one that was being managed.
So we modified it to manage them simultaneously.
However, we don't know if this is the right way to implement it.
If the SMF/SGW-C sends 5 Update Bearer Requests to the MME and
the UE sends only 3 Modify EPS bearer context accepts to the MME,
we have no way to associate each accept with its request.
Therefore, it's implemented so that we just process them sequentially and
the remaining 2 simply time out.
Fixed to not change the session information stored in the DB
when transferring context from GERAN to EUTRAN.
Note that the Tracking Area Update Procedure differs
from the Attach Procedure in 5.3.2 in the point
at which HSS and ULR/ULA are performed.
3GPP TS 23.401
Ch 5.3.3 Tracking Area Update procedures
<Attach Procedure>
1. Security-mode complete
2. Update Location Request/Answer
3. Create Session Request/Response
<Tracking Area Update Procedure>
1. Security-mode complete
2. Create Session Request/Response
3. Update Location Request/Answer
When TAU creates a Create Session Request message,
there is no session type information in the Subscriber DB
that is received from HSS in the Update Location.
Therefore, TAU does not reflect the Session Type
but creates PDN Type by reflecting the information
in the Request Type as it is.
If the UE continuously attempts to Attach while changing PDN Type,
it will cause the wrong IP to be assigned.
(e.g. PDU-Type : IPv4v6 -> IPv4 -> IPv4v6)
This is because we use two variables at the same time,
one to read and store the Static IP from the Subscriber DB and
one to store the IP assigned from SMF, called session->paa.
When the UE attaches with PDN-Type set to IPv4v6,
MME saves the allocated IP in session->paa.
However, MME thinks it has been assigned a static IP based on the information
in session->paa, so changing the PDN-Type may result in the wrong IP
being assigned.
To solve this problem, I separated the variable(session->paa) that stores
the allocated IP received from SMF and the variable(session->ue_ip) that stores
the Static IP read from the Subscriber DB.
Therefore, the information read from the Subscriber DB
(session->session_type and session->ue_ip) should not be modified.
The validity time for NF Instances obtained through NF Discovery was
not properly implemented. Since the validity was 3600 seconds (1 hour),
the 5G Core stopped working properly after 3600 seconds (1 hour).
There was an issue where an NF Instance should be deleted
when its validity time expired, but it was not working correctly
due to incorrect use of reference count.
Therefore, I have modified the Validity of NF Instances obtained
through NF Discovery to work properly.
I also changed the default value of validityPeriod to 30 seconds.
Fixed not using Reference Count for adding/deleting NF Instances.
Up until now, NF Instances have been managed by referencing the Reference Count.
Initially, when an NF Instance is added, the Reference Count is incremented and
when it is deleted, the Reference Count is decremented.
If a UE discovers another NF Instance through the NF Discovery function,
the Reference Count is incremented. And if a UE de-registers,
the Reference Count of the discovered NF is decremented.
However, there's a problem with this approach.
When another NF is de-registered,
there is no guarantee that the notification will always arrive.
For example, if a UDM is de-registered, but an SCP is de-registered before it,
the AMF will not be notified that the UDM has been de-registered.
In situations where this is not clear, Reference Count cannot be used.
Therefore, we have modified it to not use the Reference Count method.
Also, whenever a connecting UE needs a discovered NF Instance, we now
always look it up by NF Instance ID to check that it still exists.
To do this, we modified lib/sbi/path.c as shown below.
```diff
@@ -281,13 +281,15 @@ int ogs_sbi_discover_and_send(ogs_sbi_xact_t *xact)
}
/* Target NF-Instance */
- nf_instance = sbi_object->service_type_array[service_type].nf_instance;
+ nf_instance = ogs_sbi_nf_instance_find(
+ sbi_object->service_type_array[service_type].nf_instance_id);
if (!nf_instance) {
nf_instance = ogs_sbi_nf_instance_find_by_discovery_param(
target_nf_type, requester_nf_type, discovery_option);
- if (nf_instance)
- OGS_SBI_SETUP_NF_INSTANCE(
- sbi_object->service_type_array[service_type], nf_instance);
+ if (nf_instance) {
+ OGS_SBI_SETUP_NF_INSTANCE_ID(
+ sbi_object->service_type_array[service_type], nf_instance->id);
+ }
}
```
A friend in the community was trying to connect an SMF made by another
manufacturer with an SBI interface and found a big problem with Open5GS.
All of the code in the part that generates the Resource URI
from HTTP.location is invalid.
For example, suppose we create a Resource URI with SMContext as below.
{apiRoot}/nsmf-pdusession/<apiVersion>/sm-contexts/{smContextRef}
In this case, Open5GS extracted only the {smContextRef} part of the HTTP.location
and appended it to a prefix it had built itself:
{apiRoot}/nsmf-pdusession/<apiVersion>/sm-contexts/.
This implementation may not work properly if the apiRoot changes.
Consider a different port number as shown below.
<HTTP.location>
127.0.0.4:9999/nsmf-pdusession/v1/sm-contexts/1
The SMF may send an apiRoot to the AMF with a changed port number,
in which case the AMF must honor it.
Therefore, instead of extracting only the smContextRef from HTTP.location,
we modified it to use the whole thing to create a Resource URI.
We modified all NFs that use HTTP.location in the same way, not just SMFs.
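The difference between the two approaches can be sketched as below. Both function names are hypothetical, and the locally configured prefix with port 7777 in the usage note is an assumption made for illustration; this is not the Open5GS code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Old approach (sketch): keep only the last path segment of the Location
 * value and append it to a locally built prefix. The apiRoot chosen by
 * the peer is lost.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int resource_uri_from_ref(const char *local_prefix, const char *location,
                          char *out, size_t outlen)
{
    const char *ref;
    int n;

    assert(local_prefix && location && out);

    ref = strrchr(location, '/');
    ref = ref ? ref + 1 : location;

    n = snprintf(out, outlen, "%s/%s", local_prefix, ref);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}

/*
 * New approach (sketch): use the whole Location value as the Resource URI,
 * so a changed apiRoot (for example a different port) is honored.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int resource_uri_from_location(const char *location, char *out, size_t outlen)
{
    int n;

    assert(location && out);

    n = snprintf(out, outlen, "%s", location);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

With a local prefix of 127.0.0.4:7777/nsmf-pdusession/v1/sm-contexts and the Location value from the example above, the first function yields a URI on port 7777, while the second keeps port 9999 as sent by the SMF.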
Add an option to disable printing the timestamp. This is useful to not
have duplicate timestamps, when stderr is piped into a logging system
that adds timestamps on its own. For example with systemd's journald:
$ journalctl -u open5gs-smfd
Apr 10 13:25:18 hostname open5gs-smfd[1582]: 04/10 13:25:18.274: [app] INFO: Configuration: '/etc/open5gs/smf.yaml' (../lib/app/ogs-init.c:130)
Configuration change:
```
<OLD Format>
logger:
file: /var/log/open5gs/smf.log
<NEW Format>
logger:
file:
path: /var/log/open5gs/smf.log
```
Example config, to have no timestamps on stderr:
```
logger:
default:
timestamp: false
file:
path: /var/log/open5gs/smf.log
timestamp: true
```
The way subnet is set up has changed as shown below.
```
<OLD Format>
smf:
session:
- subnet: 10.45.0.1/16
<NEW Format>
smf:
session:
- subnet: 10.45.0.0/16
gateway: 10.45.0.1
```
For more information, please refer to Pull Request #2975.
If e.g. the PCRF or AAA diameter link is not yet ready (e.g. PCRF crashed),
and a client sends a CreateSessionRequest announcing its own F-TEID,
then open5gs-smfd answers with Create Session Response Cause=
"Remote peer not responding", but it is not setting the received F-TEID
in the header of the response, instead it sends with TEI=0.
As a result, the peer cannot match the CreateSessionResponse,
and needs to rely on its own timeout timer to figure out
that specific request failed.
To address this issue, I modified the GTP Response message to check
the Sender F-TEID and send it accordingly, setting the destination TEID
to the value of the Sender F-TEID.
I've made this modification only for the SMF; the MME and SGW-C have not been changed.
If you need it there, you can work from the SMF changes as an example.
The same situation can happen with PFCP. If anyone needs to handle it
in the future, I think it can be done the same way.
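To make the header change concrete, the sketch below builds a GTPv2-C header whose TEID field carries the peer's Sender F-TEID when one was received, and 0 otherwise. The function names are hypothetical and this is not the Open5GS encoder; the header layout (version 2, T flag, 16-bit length, 32-bit TEID, 24-bit sequence number, spare octet) follows the GTPv2-C header format in TS 29.274.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GTP2_CREATE_SESSION_RESPONSE 33
#define GTP2_HEADER_LEN 12

/* TEID to put in the response header: the peer's TEID if it sent one. */
uint32_t gtp2_response_teid(bool sender_fteid_present, uint32_t sender_teid)
{
    return sender_fteid_present ? sender_teid : 0;
}

/*
 * Writes a GTPv2-C header with the T flag set.
 * Returns the number of octets written, or 0 on error.
 */
size_t gtp2_build_header(uint8_t *buf, size_t buflen, uint8_t type,
                         uint32_t teid, uint32_t seq, uint16_t payload_len)
{
    uint16_t length;

    assert(buf);

    if (buflen < GTP2_HEADER_LEN || payload_len > 0xffff - 8)
        return 0;

    /* Length counts everything after the first 4 octets. */
    length = (uint16_t)(8 + payload_len);

    buf[0] = (uint8_t)((2u << 5) | 0x08u); /* version 2, T flag */
    buf[1] = type;
    buf[2] = (uint8_t)(length >> 8);
    buf[3] = (uint8_t)(length & 0xff);
    buf[4] = (uint8_t)(teid >> 24);
    buf[5] = (uint8_t)(teid >> 16);
    buf[6] = (uint8_t)(teid >> 8);
    buf[7] = (uint8_t)teid;
    buf[8] = (uint8_t)(seq >> 16);
    buf[9] = (uint8_t)(seq >> 8);
    buf[10] = (uint8_t)seq;
    buf[11] = 0; /* spare */

    return GTP2_HEADER_LEN;
}
```

With the TEID taken from the Sender F-TEID, the peer can match the error response to its pending Create Session Request instead of waiting for its own timer.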
Cause is set according to particular NF standard.
Additionally:
- OGS_SBI_HTTP_STATUS_MEHTOD_NOT_ALLOWED typo fixed.
- [PCF] Fixed SM Policy establishment error handling
There is a cryptographic vulnerability in the SUCI decryption routines
of Open5GS 5G, specifically in Profile B, which uses P-256 (secp256r1)
for its elliptic curve routines.
If a mobile device user passes a public key within its SUCI
that does not correspond to a valid point on the P-256 elliptic curve,
the Open5GS UDM will not check the point
before running elliptic curve operations with it and returning a response
to the mobile device user.
If the public key is not checked to be a valid point, an attacker can leverage
this behavior to extract the Profile B private key from the UDM,
as has been done in other domains
(https://owasp.org/www-pdf-archive/Practical_Invalid_Curve_Attacks_on_TLS-ECDH_-_Juraj_Somorovsky.pdf).
Note that Profile A is not similarly vulnerable to this, as it is impossible
to construct an invalid point on a curve25519 elliptic curve.
There was some work that went into developing a practical proof of concept
of this kind of attack against free5gc last year; it can be found here:
https://www.gsma.com/security/wp-content/uploads/2023/10/0073-invalid_curve.pdf
And here is the free5gc security advisory:
https://github.com/advisories/GHSA-cqvv-r3g3-26rf
To mitigate this issue in Open5GS, the public key of the UE must be validated
by the UDM prior to use. Adding a validation function such as the following
should work:
I designed this code based on information from https://crypto.stackexchange.com/questions/90151/verify-that-a-point-belongs-to-secp256r1.
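The validation function itself is not reproduced in this text. Purely to illustrate the kind of check involved, the sketch below tests whether a point satisfies y^2 = x^3 + ax + b (mod p) on a toy curve over a small prime. The names and the curve parameters are assumptions for illustration only; this is not P-256 and not the Open5GS code, and a real check needs big-integer arithmetic from a cryptographic library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy short Weierstrass curve y^2 = x^3 + a*x + b over GF(p). */
typedef struct {
    uint64_t p; /* small prime modulus (illustration only) */
    uint64_t a;
    uint64_t b;
} toy_curve_t;

/*
 * Returns true only if (x, y) has both coordinates in [0, p) and
 * satisfies the curve equation. A point that fails this check must be
 * rejected before any scalar multiplication is done with it.
 */
bool toy_point_on_curve(const toy_curve_t *c, uint64_t x, uint64_t y)
{
    uint64_t lhs, rhs;

    /* p is kept below 2^16 so that no product can overflow 64 bits. */
    assert(c && c->p > 3 && c->p < 65536);
    assert(c->a < c->p && c->b < c->p);

    if (x >= c->p || y >= c->p)
        return false;

    lhs = (y * y) % c->p;
    rhs = (((x * x) % c->p) * x + c->a * x + c->b) % c->p;

    return lhs == rhs;
}
```

For example, on the toy curve with p = 17, a = 2, b = 2, the point (5, 1) passes the check while (5, 2) is rejected.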
'node_timeout' and some other functions can remove a smf_sess_t
while that session is still waiting for a PFCP reply
and has an active PFCP xact.
In this case, xact->data points to the deleted session
and xact's timeout function (sess_5gc_timeout for example)
eventually refers to this already freed session.
This fix prevents duplicate deletes from occurring by checking to see
if the session context has already been deleted when the timeout occurs.
Additionally, it moves session deletions out of timer callbacks into
state machine by reselect_upf().
Due to the way 'ogs_timer_mgr_expire' calls timer callbacks,
one must not stop or expire timers from within a timer callback.
And now one must not remove sessions from within a timer callback.
If e.g. the PCRF or AAA diameter link is not yet ready (e.g. PCRF crashed), and
a client sends a CreateSessionRequest announcing its own F-TEID,
then open5gs-smfd answers with Create Session Response Cause=
"Remote peer not responding", but it is not setting the received F-TEID
in the header of the response, instead it sends with TEI=0.
As a result, the peer cannot match the CreateSessionResponse, and needs
to rely on its own timeout timer to figure out that specific request failed.
This also happens in PFCP, so to solve this problem, I added teid/seid_presence
to the interface that sends the error message as shown below.
void ogs_gtp2_send_error_message(ogs_gtp_xact_t *xact,
int teid_presence, uint32_t teid, uint8_t type, uint8_t cause_value);
void ogs_pfcp_send_error_message(
ogs_pfcp_xact_t *xact, int seid_presence, uint64_t seid, uint8_t type,
uint8_t cause_value, uint16_t offending_ie_value);