When the loop is set to 1, the minimum delay is 21 µs. In this case it is better to use a NOP, which generates a delay of 1 clock cycle.
At 4 MHz the minimum delay is 5 µs, so a waitus 3 will also generate a 5 µs delay.
Above these values the delay becomes accurate.
That is the cause: waitus 833 does not give 833 µs but more.

A routine for 8 MHz:

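Sub Wait_us_833()
' Busy-wait delay of 833 us for an 8 MHz clock. Overwrites R17 and R18.
' Outer loop (R17) runs 10 times, inner loop (R18) 221 times at 3 cycles per pass.
' The final one-pass loop and the Nop trim the total to 6664 cycles,
' which is 833 us at 8 MHz, not counting the call/return overhead.
$asm
ldi R17, $0A
Wgloop1:
ldi R18, $DD
Wgloop2:
dec R18
brne Wgloop2
dec R17
brne Wgloop1
ldi R17, $01
Wgloop3:
dec R17
brne Wgloop3
nop
$end Asm
End Sub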
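
To check that the routine really comes out at 833 µs, the cycles can be counted by hand. Here is a small Python sketch of that arithmetic. It assumes the usual AVR timings (ldi, dec and nop take 1 cycle, brne takes 2 cycles when it jumps and 1 when it falls through) and ignores the call/return overhead of the Sub. The function names are only made up for this illustration.

```python
F_CPU_MHZ = 8  # clock frequency the routine was written for


def dec_brne_loop(count, body=0):
    """Cycles for `count` passes of: body, dec, brne.

    Each pass costs body + 3 cycles, except the last one, where brne
    falls through and takes 1 cycle instead of 2.
    """
    return count * (body + 3) - 1


def wait_us_833_cycles():
    inner = dec_brne_loop(0xDD)                   # Wgloop2: 221 passes
    outer = dec_brne_loop(0x0A, body=1 + inner)   # Wgloop1: ldi R18 + inner loop, 10 passes
    tail = 1 + dec_brne_loop(0x01)                # ldi R17,$01 + Wgloop3, 1 pass
    return 1 + outer + tail + 1                   # leading ldi R17 ... trailing nop


cycles = wait_us_833_cycles()
print(f"{cycles} cycles = {cycles / F_CPU_MHZ} us")  # 6664 cycles = 833.0 us
```

For a different clock or delay, the two loop constants ($0A and $DD) and the trim at the end have to be recalculated the same way.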