
Background#
A few weeks ago I moved the last of my local Obsidian Vaults to my NAS. Almost immediately after moving the vault, while editing a note on my desktop, Obsidian unceremoniously closed the vault and quit to the vault management screen.
Odd, I thought, until it happened again minutes later, and then many more times, relentlessly, over the following hours and days.
This was disappointing, and not just because a computer was misbehaving under mysterious circumstances: I had just migrated both the vault I share with my partner and their personal vault to the NAS, after laboriously extolling the virtues of the NAS I had built[1]. The NAS was supposed to make sharing more reliable and straightforward, to be the perfect system, not to bring my systems into question and just generally harsh the vibes, man.
It was curious that I had only encountered this issue now. I built the NAS several months ago, have been using it for all sorts of things, and had not seen instability before. Although it was Obsidian that triggered this behaviour for the first time, I immediately turned my suspicions to the layer between my Windows-speaking desktop and the NAS: Samba.
Error#
Neatly, Samba retains individual logs for each connected client.
I tailed the log for my desktop and fidgeted around Obsidian waiting for another crash, and sure enough, I observed a segfault in smbd!
[2026/01/20 18:45:30.577215, 0] lib/util/fault.c:192(smb_panic_log)
PANIC (pid 1168495): Signal 11: Segmentation fault in 4.19.5-Ubuntu
[2026/01/20 18:45:30.577462, 0] lib/util/fault.c:303(log_stack_trace)
BACKTRACE: 29 stack frames:
#0 /usr/lib/x86_64-linux-gnu/samba/libgenrand-samba4.so.0(log_stack_trace+0x37) [0x7084668a4517]
#1 /usr/lib/x86_64-linux-gnu/samba/libgenrand-samba4.so.0(smb_panic+0x15) [0x7084668a4d25]
#2 /usr/lib/x86_64-linux-gnu/samba/libgenrand-samba4.so.0(+0x2dca) [0x7084668a4dca]
#3 /lib/x86_64-linux-gnu/libc.so.6(+0x45330) [0x708466645330]
#4 /lib/x86_64-linux-gnu/libc.so.6(+0x18b79d) [0x70846678b79d]
#5 /lib/x86_64-linux-gnu/libsmbconf.so.0(volume_label+0x53) [0x708466bad963]
#6 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so.0(smbd_do_qfsinfo+0xa2) [0x708466c7fc82]
#7 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so.0(smbd_smb2_request_process_getinfo+0x276) [0x708466ce2096]
#8 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so.0(smbd_smb2_request_dispatch+0x10c7) [0x708466cc7d07]
#9 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so.0(smbd_smb2_request_dispatch_immediate+0x59) [0x708466cc9439]
#10 /lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_invoke_immediate_handler+0x170) [0x708466837080]
#11 /lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_immediate+0x22) [0x7084668370e2]
#12 /lib/x86_64-linux-gnu/libtevent.so.0(+0xed92) [0x70846683ad92]
#13 /lib/x86_64-linux-gnu/libtevent.so.0(+0x6004) [0x708466832004]
#14 /lib/x86_64-linux-gnu/libtevent.so.0(_tevent_loop_once+0x9b) [0x708466833bab]
#15 /lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_wait+0x2b) [0x708466833cfb]
#16 /lib/x86_64-linux-gnu/libtevent.so.0(+0x6084) [0x708466832084]
#17 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base-samba4.so.0(smbd_process+0x7eb) [0x708466cb534b]
#18 smbd: client [10.10.8.6](+0xa5d6) [0x5f68623115d6]
#19 /lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_invoke_fd_handler+0x98) [0x708466836e48]
#20 /lib/x86_64-linux-gnu/libtevent.so.0(+0xefda) [0x70846683afda]
#21 /lib/x86_64-linux-gnu/libtevent.so.0(+0x6004) [0x708466832004]
#22 /lib/x86_64-linux-gnu/libtevent.so.0(_tevent_loop_once+0x9b) [0x708466833bab]
#23 /lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_wait+0x2b) [0x708466833cfb]
#24 /lib/x86_64-linux-gnu/libtevent.so.0(+0x6084) [0x708466832084]
#25 smbd: client [10.10.8.6](main+0x1432) [0x5f686230f032]
#26 /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca) [0x70846662a1ca]
#27 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b) [0x70846662a28b]
#28 smbd: client [10.10.8.6](_start+0x25) [0x5f686230fa95]
[2026/01/20 18:45:30.577562, 0] source3/lib/util.c:691(smb_panic_s3)
smb_panic(): calling panic action [/usr/share/samba/panic-action 1168495]
[2026/01/20 18:45:30.579285, 0] source3/lib/util.c:698(smb_panic_s3)
smb_panic(): action returned status 0
It would appear that in this volume_label function, we’re calling strlen on something that is NULL; widely regarded as a crime worthy of a segmentation fault.
But here is where things get interesting: volume_label dereferences pointers in such a way that it clearly expects that neither lp_volume nor lp_servicename can return NULL. So what happened?
Security contexts#
Following the Samba team’s guidance on bug reporting, I dialled up logging to 10 in smb.conf and an hour later I’d captured several gigabytes of indecipherable text, dotted with the occasional segmentation fault.
Here’s one, ironically encountered while trying to edit my Obsidian note about these very errors.
[2026/01/22 23:50:56.079345, 10, pid=1225061, effective(8000, 8000), real(8000, 0), class=smb2] source3/smbd/smb2_getinfo.c:277(smbd_smb2_getinfo_send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
smbd_smb2_getinfo_send: obsidian/miffy/investigations/2026-01_smbd_segfautls.md - fnum 1247396707
[2026/01/22 23:50:56.079392, 0, pid=1225061, effective(8000, 8000), real(8000, 0)] lib/util/fault.c:192(smb_panic_log)
PANIC (pid 1225061): Signal 11: Segmentation fault in 4.19.5-Ubuntu
My attention was drawn to the real and effective user and group IDs, which are not shown at the regular log level.
I can see in the log that smbd switches between root and user security contexts via calls to change_to_root_user and change_to_user_impersonate.
Now, I can imagine that the intermittent nature of the segfault could be explained by the changing security contexts.
Permission errors#
Here’s where things get interesting. While inspecting the log for clues, I discovered I’d also been encountering non-fatal permission denied errors:
[2026/01/22 23:50:48.084740, 0, pid=1225061, effective(8000, 8000), real(8000, 0)] source3/param/loadparm.c:3480(process_usershare_file)
process_usershare_file: stat of /var/lib/samba/usershares/a_b failed. Permission denied
Not only that, but crucially, these permission errors only occur when smbd has used change_to_user_impersonate to execute in the non-root security context:
$ grep -B1 --no-group-separator 'stat of /var/lib/samba/usershares/a_b' _client.log | grep -v 'Permission denied' | awk '{ print $5,$6,$7,$8 }' | sort | uniq -c
147 effective(65534, 65534), real(65534, 0)]
113 effective(8000, 8000), real(8000, 0)]
Facepalm.
Indeed, my Samba user really had insufficient permissions to read any file in the /var/lib/samba/usershares/ directory, because I had not added them to the sambashare group.
I had created my Samba users with adduser and provided access to SMB with pdbedit, but had not realised that ZFS’ sharesmb option would require users to be in the sambashare group too.
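A quick way to check for the missing group membership is to ask the system's user and group databases directly. The following is a minimal sketch in Python using the standard pwd and grp modules; the helper name user_in_group and the account name alice are hypothetical and only there for illustration. It reports what the databases say, not which credentials a running smbd process has already picked up.

```python
import grp
import pwd


def user_in_group(username: str, groupname: str = "sambashare") -> bool:
    """Return True if username is in groupname (primary or supplementary)."""
    try:
        user = pwd.getpwnam(username)
        group = grp.getgrnam(groupname)
    except KeyError:
        # Unknown user or unknown group: treat as "not a member".
        return False
    # Membership comes either from the primary GID or the group's member list.
    return user.pw_gid == group.gr_gid or username in group.gr_mem


if __name__ == "__main__":
    # "alice" is a hypothetical account name, not one from this post.
    for name in ("alice",):
        print(name, user_in_group(name))
```

Any account that prints False here would, going by the permission errors above, be unable to stat the files under /var/lib/samba/usershares/.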
I added our Samba users to the sambashare group two weeks ago and have not had a segfault since.
As an experiment, I reverted this change and immediately triggered a segfault in smbd that caused data loss, necessitating a successful unscheduled test of our ZFS snapshot system[2].
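For anyone wanting to repeat the context count without grep and awk, here is a rough Python equivalent of the pipeline shown earlier in this section. It is a sketch that assumes the level 10 header format seen in my excerpts, where the header line sits directly above its message; the function name count_contexts and the regular expression are my own, not anything Samba provides.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the identity part of a debug header as it appears in the excerpts,
# e.g. "effective(8000, 8000), real(8000, 0)".
CONTEXT_RE = re.compile(r"effective\(\d+, \d+\), real\(\d+, \d+\)")


def count_contexts(lines: Iterable[str], needle: str) -> Counter:
    """Count the security context on the header line preceding each match."""
    counts: Counter = Counter()
    previous = ""
    for line in lines:
        if needle in line:
            # The header is assumed to be the line directly above the message.
            match = CONTEXT_RE.search(previous)
            if match:
                counts[match.group(0)] += 1
        previous = line
    return counts


# Two lines taken from the excerpt in this post.
SAMPLE = [
    "[2026/01/22 23:50:48.084740, 0, pid=1225061, effective(8000, 8000), "
    "real(8000, 0)] source3/param/loadparm.c:3480(process_usershare_file)",
    "process_usershare_file: stat of /var/lib/samba/usershares/a_b failed. "
    "Permission denied",
]

if __name__ == "__main__":
    needle = "stat of /var/lib/samba/usershares/a_b"
    for context, count in count_contexts(SAMPLE, needle).items():
        print(count, context)
```

To run it against a real log, pass an open file object for something like _client.log in place of SAMPLE.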
Segfault: Redux#
So I made the segfault go away, Obsidian has stopped crashing, and my reputation as a system administrator is largely unblemished. Of course, this was not good enough for me. This curious bun had to know why a seemingly innocuous permission error leads to a segmentation fault.
So here’s the fun part: thanks to the detailed logging, I was able to take a good look at how this crash may occur. I had no familiarity with the Samba codebase until investigating this issue, so the obvious caveats aside, here is the chain of events that I believe leads to the segfault:
- Business as usual, and we drop down to a non-root security context
- During some request, we process the path with get_referred_path to split out the path components with parse_dfs_path_strict and determine the service number related to the SMB path with lp_servicenumber
- lp_servicenumber checks the usershare still exists with usershare_exists
- But usershare_exists silently fails if sys_lstat fails when reading the usershare file, which it would if we do not have permission to stat it
- usershare_exists exits false, causing the service to be freed by lp_servicenumber using free_service_byindex
- As logged, the service corresponding to the usershare we are trying to access is freed by free_service called via free_service_byindex
- As a result the ServicePtrs[idx] entry corresponding to the usershare has been freed and is now NULL
- We’re still in get_referred_path, but failed to look up the service with lp_servicenumber, so call find_service to… find the service we just freed
- find_service tries to determine if this is a usershare with load_usershare_service, which in turn attempts to parse the usershare file into a service with process_usershare_file
- But the usershare file cannot be stat’d and find_service fails to find our usershare service, and we exit get_referred_path with nothing but a NULL where our service used to be
- Some time passes… A new request from a client comes in and smbd_do_qfsinfo calls volume_label, presumably using a stale service number as the client has no idea the service has been freed[3]
- As part of qfsinfo, the volume_label function interrogates lp_volume and lp_servicename to have a nice human label for the client’s request
- My now quite stretched understanding is that lp_volume is a FN_LOCAL_SUBSTITUTED_STRING macro, and its definition is automatically generated at compile-time into param_table_gen.c from the corresponding docs
- The FN_LOCAL_SUBSTITUTED_STRING macro checks for a valid service and either returns the service’s field, or falls back to the default value, before handling any substitution for its final value. As our service was freed, the LP_SNUM_OK(i) check for our service will be false, and for lp_volume we’ll fall back to sDefault.volume
- At first glance the default for the volume field in the default service definition is NULL, but init_globals loops over the previously mentioned autogenerated compile-time definitions and sets any NULL string defaults in the default service definition to lpcfg_string_empty ("") with lpcfg_string_set. It took me quite a while to figure out that lp_parm_ptr is how init_globals’ call to lpcfg_string_set writes into the default service definition.
- So lp_volume is an empty string for our freed service after all, and volume_label moves on to try lp_servicename instead
- lp_servicename is similarly an FN_LOCAL_SUBSTITUTED_STRING macro but does not have any autogenerated compile-time entries in parm_table and is not initialised by init_globals like volume either. I believe it therefore retains its compile-time default of NULL
- Falling back to the NULL default, lpcfg_substituted_string uses loadparm_s3_global_substitution_fn to determine the final output value for lp_servicename, which still returns NULL if its input is NULL
- And so, volume_label fatally calls strlen on the NULL lp_servicename result, causing the segmentation fault
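The chain above can be condensed into a toy model. To be clear, this is a Python sketch of my reading of the events and not Samba code: the class ToyLoadparm, the lookup helper and the can_stat flag are all invented for illustration, None stands in for NULL, and the TypeError raised by len(None) stands in for the segfault from strlen(NULL).

```python
from typing import Dict, List, Optional

# Toy default service: volume was initialised to "", while servicename
# was left unset (None plays the role of NULL here).
DEFAULTS: Dict[str, Optional[str]] = {"volume": "", "servicename": None}


class ToyLoadparm:
    """Hypothetical miniature of the service table described in this post."""

    def __init__(self) -> None:
        # Stand-in for ServicePtrs: a single usershare at index 0.
        self.services: List[Optional[Dict[str, str]]] = [
            {"volume": "", "servicename": "a_b"}
        ]

    def lp_servicenumber(self, name: str, can_stat: bool) -> int:
        for idx, svc in enumerate(self.services):
            if svc is not None and svc["servicename"] == name:
                if not can_stat:
                    # A failed stat is treated as "usershare no longer
                    # exists", so the slot is freed.
                    self.services[idx] = None
                    return -1
                return idx
        return -1

    def lookup(self, snum: int, field: str) -> Optional[str]:
        # Models the "valid service, else default value" fallback.
        valid = 0 <= snum < len(self.services)
        svc = self.services[snum] if valid else None
        return svc[field] if svc is not None else DEFAULTS[field]

    def volume_label(self, snum: int) -> str:
        label = self.lookup(snum, "volume")
        if not label:
            label = self.lookup(snum, "servicename")
        # len(None) raises TypeError; in C, strlen(NULL) is the crash.
        return label[:32] if len(label) > 32 else label


if __name__ == "__main__":
    lp = ToyLoadparm()
    snum = lp.lp_servicenumber("a_b", can_stat=True)
    print("label before:", lp.volume_label(snum))
    lp.lp_servicenumber("a_b", can_stat=False)  # permission denied on stat
    try:
        lp.volume_label(snum)  # stale service number
    except TypeError as exc:
        print("toy 'segfault':", exc)
```

The interesting property is that the stale service number still passes through the fallback path without complaint, and only the unset servicename default turns that into a crash.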
tl;dr#
- Obsidian began crashing on Windows clients after migrating vaults to my NAS
- My partner questioned why they were with someone who could not operate Samba
- It turned out that Samba was segfaulting, seemingly due to permission errors
- I made it stop segfaulting by granting permissions previously not granted
- I couldn’t leave things alone and spent several evenings constructing a plausible chain of events to share with the Samba mailing list and my internet animal friends
I’ve presented my evidence to the Samba developers (a little more dryly than I have here), and will update this post if I hear anything back. If you’ve encountered this issue and it was user permissions all along (or not), please let me know and update the Samba ticket!
[1] I’ll write about it some time, I keep telling myself.
[2] I’ll write this up one day too, but a deserved shoutout to httm for making the exploration and restoration of files from ZFS snapshots very easy.
[3] This is the part I am a little fuzzy on, but I’m compelled by the rest of the chain of events to believe this is plausible.