Thanks a lot, I just tried out your code and it does work.

Hi rgz,
I didn't dive too deep into your issue, but instead put something together to achieve what you were trying to do. Perhaps by comparing mine vs yours, you'll work out what was wrong.
I created the following uart.s file:

Code:
.global _start
.align 2
_start:
    mrs     x0, mpidr_el1           // x0 = Multiprocessor Affinity Register value.
    and     x0, x0, #0x3            // x0 = core number.
    cbz     x0, 2f                  // Put all cores except core 0 to sleep.
1:  wfi
    b       1b
2:
# ------------------------------------------------------------------------------
# Initialise the Mini UART interface for logging over serial port.
# Note, this is Broadcom's own UART, not the ARM licensed UART interface.
# ------------------------------------------------------------------------------
    ldr     x1, =0xfe215000         // x1 = aux_base
    ldr     w2, [x1, #0x4]          // w2 = [AUX_ENABLES] (Auxiliary enables)
    orr     w2, w2, #1
    str     w2, [x1, #0x4]          // [AUX_ENABLES] |= 0x00000001 => Enable Mini UART.
    str     wzr, [x1, #0x44]        // [AUX_MU_IER] = 0x00000000 => Disable Mini UART interrupts.
    str     wzr, [x1, #0x60]        // [AUX_MU_CNTL] = 0x00000000 => Disable Mini UART Tx/Rx
    mov     w2, #0x6                // w2 = 6
    str     w2, [x1, #0x48]         // [AUX_MU_IIR] = 0x00000006 => Mini UART clear Tx, Rx FIFOs
    mov     w3, #0x3                // w3 = 3
    str     w3, [x1, #0x4c]         // [AUX_MU_LCR] = 0x00000003 => Mini UART in 8-bit mode.
    str     wzr, [x1, #0x50]        // [AUX_MU_MCR] = 0x00000000 => Set UART1_RTS line high.
    mov     w2, #0x0000021d
    str     w2, [x1, #0x68]         // [AUX_MU_BAUD] = 0x0000021d
                                    // => baudrate = system_clock_freq/(8*([AUX_MU_BAUD]+1))
                                    // (as close to 115200 as possible)
    ldr     x4, =0xfe200000         // x4 = [gpio_base] = 0xfe200000
    ldr     w2, [x4, #0x4]          // w2 = [GPFSEL1]
    and     w2, w2, #0xfffc0fff     // Unset bits 12, 13, 14 (FSEL14 => GPIO Pin 14 is an input).
                                    // Unset bits 15, 16, 17 (FSEL15 => GPIO Pin 15 is an input).
    orr     w2, w2, #0x00002000     // Set bit 13 (FSEL14 => GPIO Pin 14 takes alternative function 5).
    orr     w2, w2, #0x00010000     // Set bit 16 (FSEL15 => GPIO Pin 15 takes alternative function 5).
    str     w2, [x4, #0x4]          // [GPFSEL1] = updated value => Enable UART 1.
    str     wzr, [x4, #0x94]        // [GPPUD] = 0x00000000 => GPIO Pull up/down = OFF
    mov     x5, #0x96               // Wait 150 instruction cycles (as stipulated by datasheet).
3:  subs    x5, x5, #0x1            // x5 -= 1
    b.ne    3b                      // Repeat until x5 == 0.
    mov     w2, #0xc000             // w2 = 2^14 + 2^15
    str     w2, [x4, #0x98]         // [GPPUDCLK0] = 0x0000c000 => Control signal to lines 14, 15.
    mov     x0, #0x96               // Wait 150 instruction cycles (as stipulated by datasheet).
4:  subs    x0, x0, #0x1            // x0 -= 1
    b.ne    4b                      // Repeat until x0 == 0.
    str     wzr, [x4, #0x98]        // [GPPUDCLK0] = 0x00000000 => Remove control signal to lines 14, 15.
    str     w3, [x1, #0x60]         // [AUX_MU_CNTL] = 0x00000003 => Enable Mini UART Tx/Rx
# ------------------------------------------------------------------------------
# Send 'ABC' over Mini UART
# ------------------------------------------------------------------------------
    mov     w0, #0x41               // start char 'A'
    mov     w3, #0x43               // end char 'C'
5:
6:  ldr     w2, [x1, #0x54]         // w2 = [AUX_MU_LSR]
    tbz     x2, #5, 6b              // Repeat last statement until bit 5 is set.
    strb    w0, [x1, #0x40]         // [AUX_MU_IO] = w0
    add     w0, w0, #1
    cmp     w0, w3
    b.le    5b
    b       1b

Then I built it as follows:

Code:
$ aarch64-none-elf-as -o uart.o uart.s
$ aarch64-none-elf-ld --no-warn-rwx-segments -N -Ttext=0x80000 -o uart.elf uart.o
$ aarch64-none-elf-objcopy --set-start=0x80000 uart.elf -O binary uart.img

Note, the kernel image I created is called uart.img. Then I created the following config.txt:

Code:
# Stop start.elf from filling in ATAGS (memory from 0x100) before launching the
# kernel.
disable_commandline_tags=1
# Disable initial turbo since it affects Mini UART timing.
# See https://github.com/raspberrypi/firmware/issues/1945
initial_turbo=0
# Dynamic frequency clocking minimum value needed for Mini UART to function correctly.
core_freq_min=500
# Ensure the bootloader loads a 64-bit kernel (not entirely necessary but a
# good safety check).
arm_64bit=1
# Kernel image to load
kernel=uart.img

I deployed this to my Raspberry Pi 400, and it worked. Hopefully it should for you too!
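For anyone who wants to check the magic numbers in uart.s, here is a small Python sketch that recomputes them on a host machine. It assumes the Mini UART's system clock is the 500 MHz implied by core_freq_min=500 in config.txt (an assumption, not something measured here), and the helper names mini_uart_baud, fsel_mask and set_alt5 are made up for illustration. It applies the baud rate formula from the code comment and rebuilds the GPFSEL1 masks for pins 14 and 15.

```python
# Host-side sanity check of constants used in uart.s (not code for the Pi itself).
# Assumption: the Mini UART system clock is 500 MHz (core_freq_min=500).

SYSTEM_CLOCK_HZ = 500_000_000  # assumed clock, in Hz
AUX_MU_BAUD = 0x21D            # value written to [AUX_MU_BAUD] in uart.s
ALT5 = 0b010                   # FSEL encoding for alternative function 5


def mini_uart_baud(clock_hz: int, baud_reg: int) -> float:
    """baudrate = system_clock_freq / (8 * (AUX_MU_BAUD + 1))"""
    return clock_hz / (8 * (baud_reg + 1))


def fsel_mask(pin: int) -> int:
    """Mask of the 3-bit FSEL field for a pin (10 pins per GPFSEL register)."""
    return 0b111 << ((pin % 10) * 3)


def set_alt5(gpfsel: int, pins) -> int:
    """Return gpfsel with each pin's FSEL field cleared, then set to ALT5."""
    for pin in pins:
        gpfsel &= ~fsel_mask(pin) & 0xFFFFFFFF
        gpfsel |= ALT5 << ((pin % 10) * 3)
    return gpfsel


if __name__ == "__main__":
    baud = mini_uart_baud(SYSTEM_CLOCK_HZ, AUX_MU_BAUD)
    clear_mask = 0xFFFFFFFF & ~(fsel_mask(14) | fsel_mask(15))
    print(f"baud rate      : {baud:.1f}")
    print(f"GPFSEL1 'and'  : {clear_mask:#010x}")
    print(f"GPFSEL1 'orr's : {set_alt5(0, (14, 15)):#010x}")
```

Under that 500 MHz assumption the divisor 0x21d gives roughly 115314 baud, about 0.1% above 115200. The clear mask comes out as 0xfffc0fff, and the two orr constants 0x2000 and 0x10000 combine to 0x12000, matching the values in the assembly.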
One note: in your code, I see you are setting x29 and x30 to peripheral addresses. This may work, but is somewhat unconventional. Normally x29 is used as the frame pointer, and x30 is the link register. When you call a function with a bl instruction, x30 is set to the return address, so that the ret instruction can return to the caller. So as soon as you execute a bl instruction, the x30 register will be overwritten. For this reason, you might prefer to use different registers for storing those peripheral addresses.
I'll try to understand where I went wrong and post here when I figure it out.
Also, thanks for the explanation of the x29 and x30 registers. When I wrote that code I hadn't really dug into AAPCS64 yet, but I've since become familiar with how to set up and use frame records on the stack.
Statistics: Posted by rgz — Fri Mar 28, 2025 12:03 pm