Troubleshooting
This section covers common problems seen in eduroam deployments and day-to-day operations.
Quick Diagnostic Checklist
Run through this checklist first — it resolves the majority of issues:
| Check | Command / Action |
|---|---|
| Is FreeRADIUS running? | systemctl status freeradius |
| Is the server certificate valid? | openssl x509 -in /etc/ssl/certs/radius.crt -noout -enddate |
| Can we reach the NRO RADIUS server? | radtest testuser@realm <NRO_IP> 1812 <secret> |
| Can we reach LDAP/AD? | ldapsearch -x -H ldap://server -b 'dc=inst,dc=edu' |
| Does local auth work? | radtest localuser password localhost 0 testing123 |
| Are there errors in the RADIUS log? | tail -100 /var/log/freeradius/radius.log \| grep -i error |
| Is the eduroam SSID broadcasting? | Check AP/WLC management interface |
Common User Issues
Repeated password prompts
Typical causes:
- incorrect username format, for example missing @realm
- expired or changed password
- stale device profile
- server certificate not trusted
Recommended actions:
- confirm the username is in the format user@institution.ac.ke
- remove the old eduroam profile and install a fresh profile
- use eduroam CAT where available
- verify date and time on the device
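The username format check can be scripted for helpdesk use. The following is a minimal sketch, not part of any eduroam tooling: the function name is_eduroam_username is hypothetical, and the rule it applies (exactly one @, a non-empty user part, and a realm containing at least one dot) is an assumption about what a well-formed identity looks like.

```shell
# Return 0 if the argument looks like user@realm with a dotted realm.
# The pattern is an illustrative assumption, not an official eduroam rule.
is_eduroam_username() {
    case "$1" in
        *" "*|*@*@*) return 1 ;;   # reject spaces and more than one @
        ?*@?*.?*)    return 0 ;;   # non-empty user, realm with at least one dot
        *)           return 1 ;;   # everything else, including a missing realm
    esac
}
```

For example, is_eduroam_username "user@institution.ac.ke" succeeds, while a bare "user" fails and points to the missing realm as the likely cause of repeated prompts.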
Certificate warning during connection
Users must not ignore certificate warnings. A warning usually means:
- the device was not configured with the correct trust settings
- the institution changed its RADIUS server certificate
- the user connected to a misconfigured or rogue network
If certificate validation fails, reconnect only after installing the correct profile or confirming the expected server name and CA.
Connected but no Internet access
Check:
- DHCP scope availability
- user VLAN assignment
- firewall policy
- DNS reachability
- AP/controller policy for the eduroam SSID
FreeRADIUS Troubleshooting
Run in debug mode
Use debug mode when validating configuration changes. Start the daemon in the foreground with freeradius -X; on platforms that package the daemon as radiusd, radiusd -X is equivalent.
Debug mode shows:
- incoming requests
- realm matching and proxy decisions
- EAP state machine details
- LDAP, SQL, and policy processing
- accept or reject reasons
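Because the binary name differs between packages, a small wrapper can pick whichever one is installed. This is a sketch under assumptions: the function names first_available and radius_debug are hypothetical, and the service name freeradius passed to systemctl may be radiusd on other platforms.

```shell
# Print the path of the first command in the argument list that exists here.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

# Hypothetical wrapper: stop the service so the ports are free,
# then run the daemon in the foreground with full debug output.
radius_debug() {
    bin=$(first_available freeradius radiusd) || {
        echo "no FreeRADIUS binary found" >&2
        return 1
    }
    systemctl stop freeradius   # assumed service name; adjust per platform
    "$bin" -X
}
```

Call radius_debug as root, reproduce the failing login, then stop the debug instance with Ctrl-C and start the service again.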
Validate configuration before restart
Run a configuration check, for example freeradius -CX, after editing virtual servers, modules, clients, realms, or policy files.
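The check-then-restart sequence can be sketched as a small helper that refuses to restart when the check fails. The function name safe_restart is hypothetical, and both commands are passed in as strings so the same helper works whether the daemon is called freeradius or radiusd.

```shell
# Usage: safe_restart "<check command>" "<restart command>"
# Runs the restart command only if the check command exits successfully.
safe_restart() {
    check_cmd=$1
    restart_cmd=$2
    if check_output=$(sh -c "$check_cmd" 2>&1); then
        sh -c "$restart_cmd"
    else
        # Show the check output only when it failed, then leave the service alone.
        printf '%s\n' "$check_output" >&2
        echo "configuration check failed; service left untouched" >&2
        return 1
    fi
}
```

A typical call, assuming a Debian-style install, would be safe_restart "freeradius -CX" "systemctl restart freeradius".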
Test local authentication
Simple PAP test against a local user or backend, using the radtest form from the checklist above:
radtest localuser password localhost 0 testing123
Use eapol_test for realistic 802.1X/EAP testing where available.
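eapol_test reads a wpa_supplicant-style network block. The helper below is a minimal sketch that writes one for PEAP with MSCHAPv2; the function name write_eapol_conf is hypothetical, and every value in the profile (identities, password, CA path) is a placeholder to replace with local test credentials.

```shell
# Write a minimal PEAP/MSCHAPv2 profile for eapol_test to the given path.
# All identities, the password, and the CA path are placeholders.
write_eapol_conf() {
    cat > "$1" <<'EOF'
network={
    ssid="eduroam"
    key_mgmt=WPA-EAP
    eap=PEAP
    phase2="auth=MSCHAPV2"
    identity="localuser@institution.ac.ke"
    anonymous_identity="anonymous@institution.ac.ke"
    password="password"
    ca_cert="/etc/ssl/certs/ca-bundle.crt"
}
EOF
}
```

After write_eapol_conf peap.conf, a local test would look like eapol_test -c peap.conf -a 127.0.0.1 -p 1812 -s testing123, where testing123 is the same test secret used in the radtest example.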
SP-Side Problems
Requests not leaving the SP
Check:
- clients.conf entries for APs/controllers
- source IP and shared secret
- firewall rules permitting UDP 1812 and 1813
- realm proxy configuration
- home server reachability
Requests proxied but no reply received
Check:
- upstream federation IPs and shared secrets
- NAT or firewall state expiry
- duplicate or incorrect home server definitions
- packet filtering between SP and federation servers
Users authenticate but land in the wrong VLAN
Check:
- whether the AP/controller honors standard tunnel attributes
- authorization policy order in sites-enabled/default
- group lookup results in LDAP or SQL
- controller-side VLAN mapping and role policy
Expected RADIUS attributes usually include Tunnel-Type, Tunnel-Medium-Type, and Tunnel-Private-Group-Id.
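For dynamic VLAN assignment the reply normally carries Tunnel-Type = VLAN, Tunnel-Medium-Type = IEEE-802, and Tunnel-Private-Group-Id set to the VLAN ID. The sketch below reports which of the three are absent from a reply; the function name missing_vlan_attrs is hypothetical, and it assumes the reply text (from debug output or a test client) is piped in on standard input.

```shell
# Read reply text on stdin and print each standard VLAN attribute
# that does not appear in it. Prints nothing when all three are present.
missing_vlan_attrs() {
    input=$(cat)
    for attr in Tunnel-Type Tunnel-Medium-Type Tunnel-Private-Group-Id; do
        case "$input" in
            *"$attr"*) ;;              # attribute present
            *) echo "$attr" ;;         # attribute missing
        esac
    done
}
```

If the function prints nothing but users still land in the wrong VLAN, the problem is more likely in the controller-side mapping than in the RADIUS reply.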
IdP-Side Problems
Inner authentication fails
Check:
- PEAP or TTLS inner method configuration
- LDAP bind account and search filter
- Active Directory group policy restrictions
- password expiry or account lockout
Realm is not routed correctly
Check:
- local realm definitions
- nostrip versus strip behavior
- federation registration for the institution realm
- whether the user is typing the correct realm
Certificate-related failures
Check:
- certificate chain completeness
- server name matching
- expiration dates
- EKU and KU fields
- whether the issuing CA is distributed to clients
Failure Scenarios
Scenario 1: All users cannot connect (institution-wide outage)
This typically means the RADIUS server is down or the server certificate has expired.
Step 1 — Check the FreeRADIUS service
systemctl status freeradius
journalctl -u freeradius --since '1 hour ago'
freeradius -X # run in debug mode to see startup errors
Step 2 — Check certificate expiry
openssl x509 -in /etc/ssl/certs/radius.crt -noout -enddate -subject
openssl verify -CAfile /etc/ssl/certs/ca-bundle.crt /etc/ssl/certs/radius.crt
If the certificate is expired: request an emergency replacement from your CA, install it, restart the service with systemctl restart freeradius so the new certificate is loaded, and update CAT profiles immediately.
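To catch expiry before it becomes an outage, the enddate output can be turned into a day count. This is a sketch assuming GNU date (for the -d option); the function names days_between and cert_days_left are hypothetical.

```shell
# Whole days between two timestamps, truncated toward zero.
# Accepts any format GNU date -d understands, including the
# notAfter value printed by openssl x509 -enddate.
days_between() {
    start=$(date -u -d "$1" +%s) || return 1
    end=$(date -u -d "$2" +%s) || return 1
    echo $(( (end - start) / 86400 ))
}

# Days from now until the certificate at $1 expires.
cert_days_left() {
    not_after=$(openssl x509 -in "$1" -noout -enddate | cut -d= -f2)
    [ -n "$not_after" ] || return 1
    days_between "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$not_after"
}
```

For example, cert_days_left /etc/ssl/certs/radius.crt prints the remaining days, which a cron job could compare against a warning threshold.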
Step 3 — Check configuration syntax
Scenario 2: Visitors cannot connect, locals work
The proxy chain is broken. The local RADIUS server is working but cannot forward foreign-realm requests to the NRO.
Step 1 — Test NRO connectivity
radtest testing@nrotest.eduroam.org Radius1 <NRO_IP> 1812 <shared_secret>
# Expected: Access-Accept or Access-Reject (not a timeout)
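The distinction between a reply and a timeout can be checked mechanically. The sketch below classifies test output read from standard input; the function name radtest_verdict is hypothetical, and it relies only on the packet names Access-Accept and Access-Reject appearing in the output.

```shell
# Classify radtest output from stdin as accept, reject, or no-reply.
radtest_verdict() {
    output=$(cat)
    case "$output" in
        *Access-Accept*) echo "accept" ;;
        *Access-Reject*) echo "reject" ;;
        *)               echo "no-reply" ;;   # neither seen: treat as a timeout
    esac
}
```

Pipe the Step 1 command into it, for example radtest testing@nrotest.eduroam.org Radius1 <NRO_IP> 1812 <shared_secret> 2>&1 | radtest_verdict. Either accept or reject shows the proxy path is working; no-reply points to connectivity or shared secret problems.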
Step 2 — Check proxy configuration
grep -A5 'realm ~' /etc/freeradius/3.0/proxy.conf
grep -A8 'home_server eduroam' /etc/freeradius/3.0/proxy.conf
Step 3 — Check shared secret mismatch
A mismatch produces invalid Message-Authenticator errors in the debug log. Verify the shared secret matches the NRO's configuration exactly.
Step 4 — Review proxy log entries
Scenario 3: One specific user cannot connect
A user-specific issue: account problem, expired password, or device misconfiguration.
Step 1 — Search the RADIUS log for this user
grep 'username@domain' /var/log/freeradius/radius.log | tail -20
# Look for: Access-Reject, EAP failure, LDAP error
Step 2 — Test LDAP authentication directly
ldapsearch -x -H ldap://your-ad.institution.edu \
-D 'uid=username,dc=inst,dc=edu' -W \
-b 'dc=inst,dc=edu' '(uid=username)'
Step 3 — Common user-specific causes
- account locked or disabled in Active Directory/LDAP
- password expired — user must change it before eduroam works again
- user not in the required group for eduroam access
- device has an outdated or incorrect CAT profile
- device manually configured with wrong EAP type or wrong CA
Scenario 4: Device or OS-specific failures
| Affected devices | Likely cause | Fix |
|---|---|---|
| iOS / macOS only | Strict SAN validation; CN alone not accepted | Reissue cert with SAN matching server hostname; reinstall CAT profile |
| Android only | Missing intermediate CA in chain | Include full chain in certificate_file in eap.conf |
| Windows only | Group policy blocking 802.1X or NPS conflict | Check local network policy; test with manual 802.1X profile |
| All mobile, desktop fine | EAP fragmentation: large packets dropped | Increase MTU on AP; check fragment_size in eap.conf |
| New device only | Device using manual config, not CAT profile | Redeploy CAT installer |
Scenario 5: Intermittent or slow authentication
Slow authentication (5–30 second delays)
- DNS reverse lookup failure: add hostname_lookups = no in radiusd.conf
- LDAP query slow: check LDAP server load; optimize the search filter
- NRO response slow: increase response_window in proxy.conf; check the network path
- EAP session timeout too short: check eap { timer_expire = 60 }
Intermittent failures
- RADIUS server overloaded: check CPU/memory; increase max_requests in radiusd.conf
- Packet loss to NRO: run mtr --udp --port 1812 <NRO_IP> to identify the path
- LDAP connection pool exhausted: increase pool.max in the LDAP module configuration
- Auth race condition on roam: expected behavior; add retry_delay = 5 in client config
Log Reference: Common Error Messages
| Log message | Meaning and action |
|---|---|
| ERROR: SSL: error:0200100D: fopen: Permission denied | FreeRADIUS cannot read cert/key file. Fix: chmod 640, chown freerad |
| TLS Alert read: fatal: certificate expired | RADIUS server cert has expired. Replace immediately |
| proxy: Marking home server X as zombie | NRO RADIUS not responding. Check connectivity and shared secret |
| mschap: MS-CHAP2-Response is incorrect | Wrong password or LDAP/AD not returning the correct NT hash |
| invalid Message-Authenticator | Shared secret mismatch between RADIUS peers. Verify secrets on both ends |
| No Auth-Type found: rejecting the user | User matched no authentication module. Check users file and module config |
| LDAP: Could not connect to server | FreeRADIUS cannot reach LDAP/AD. Check network, firewall, and LDAP service health |
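The table can be applied to a log file mechanically. The sketch below matches each line against fragments of the messages listed above and appends a short hint; the function names log_hint and triage_log are hypothetical, and the hints are abbreviated from the table rather than taken from FreeRADIUS itself.

```shell
# Map one log line to a short hint; returns 1 if no known message matches.
log_hint() {
    case "$1" in
        *"Permission denied"*)              echo "check cert/key file permissions and ownership" ;;
        *"certificate expired"*)            echo "replace the RADIUS server certificate" ;;
        *"as zombie"*)                      echo "check NRO connectivity and shared secret" ;;
        *"MS-CHAP2-Response is incorrect"*) echo "wrong password or NT hash not returned" ;;
        *"invalid Message-Authenticator"*)  echo "shared secret mismatch" ;;
        *"No Auth-Type found"*)             echo "check users file and module config" ;;
        *"Could not connect to server"*)    echo "check LDAP/AD reachability" ;;
        *)                                  return 1 ;;
    esac
}

# Print every line of the file at $1 that matches, with its hint appended.
triage_log() {
    while IFS= read -r line; do
        hint=$(log_hint "$line") && printf '%s  => %s\n' "$line" "$hint"
    done < "$1"
    return 0
}
```

Running triage_log /var/log/freeradius/radius.log gives a quick first pass before reading the full debug output.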
Escalation Path
| Level | When to escalate | Contact |
|---|---|---|
| L1 | User cannot connect, basic checks pass | Local helpdesk → sysadmin |
| L2 | Proxy chain issue, NRO unreachable | Contact your NRO technical support (KENET: helpdesk@kenet.or.ke) |
| L3 | Foreign institution's users failing at their home server | NRO contacts the foreign institution's NRO |
| L4 | International routing failure | NRO escalates to GÉANT NOC |
Logging and Monitoring
Useful locations vary by distribution, but common paths include:
- /var/log/freeradius/radius.log
- /var/log/freeradius/radacct/
- journalctl -u freeradius
Monitor at least:
- authentication success and failure rate
- proxy latency
- certificate expiry
- AP/controller RADIUS timeout rate
- accounting volume and SQL write failures
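The success and failure rate can be estimated directly from the main log. This sketch assumes authentication logging is enabled (auth = yes in the log section of radiusd.conf), so that each attempt is recorded as a "Login OK" or "Login incorrect" line; the function name auth_success_rate is hypothetical.

```shell
# Print the percentage of successful logins recorded in the log file at $1.
auth_success_rate() {
    ok=$(grep -c 'Login OK' "$1" || true)          # grep exits 1 on zero matches
    bad=$(grep -c 'Login incorrect' "$1" || true)
    total=$((ok + bad))
    if [ "$total" -eq 0 ]; then
        echo "no authentication records found" >&2
        return 1
    fi
    echo $(( ok * 100 / total ))                   # integer percentage, rounded down
}
```

For example, auth_success_rate /var/log/freeradius/radius.log prints a single number that a monitoring system could graph or alert on.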
Operational monitoring schedule
| Check item | Frequency | Tool / method |
|---|---|---|
| RADIUS server certificate expiry | Weekly | openssl x509 -enddate |
| NRO RADIUS reachability | Daily | radtest or monitoring system |
| Authentication success rate | Daily | RADIUS accounting logs |
| LDAP/AD service health | Daily | Health check script |
| FreeRADIUS error log review | Weekly | grep -ri error /var/log/freeradius/ |
| CAT profile currency | Monthly | cat.eduroam.org admin portal |
| Shared secret rotation review | Annually | Internal audit |