How to write two bytes to a chunk of RAM repeatedly in Z80 asm

mel*_*bok 6 assembly z80 texas-instruments

I'm trying to write two bytes (color values) to the VRAM of my TI-84 Plus CE-T calculator, which uses the Zilog eZ80 CPU. The VRAM starts at 0xD40000 and is 0x25800 bytes long. The calculator has a built-in syscall called MemSet, which fills a chunk of memory with one byte, but I want to fill the memory with two alternating values instead. I tried using the following code:

#include "includes\ti84pce.inc"

    .assume ADL=1
    .org userMem-2
    .db tExtTok,tAsm84CeCmp

    call  _homeup
    call  _ClrScrnFull
    ld    hl,13893632     ; = D40000, vram start
    ld    bc,153600       ; = 025800, count/vram length
j1:
    ld    (hl),31         ; set first byte
    inc   hl
    dec   bc
    jr    z,j2            ; jump to end if count==0
    ld    (hl),0          ; set second byte
    inc   hl
    dec   bc
    jr    z,j2            ; jump to end if count==0
    jp    j1              ; loop
j2:
    call  _GetKey
    call  _ClrScrnFull
    ret

I want it to output 31 00 31 00 31 00... into memory starting at 0xD40000, but instead it seems to change only the first byte and jump to the end after doing so. Any ideas on how to fix this?

har*_*old 6

This doesn't work:

dec   bc
jr    z,j2

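Only the 8-bit versions of `dec` and `inc` modify the flags, so `dec bc` leaves the zero flag unchanged. This can be fixed by properly testing whether `bc` is zero.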
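One possible way to get a real zero test, as an untested sketch assuming ADL mode: `cpi` increments `hl`, decrements the full 24-bit `bc`, and leaves the P/V flag set while `bc` is still non-zero, so it can stand in for the `inc hl` / `dec bc` pair. Note that the classic Z80 idiom `ld a,b` / `or c` looks only at the low 16 bits of `bc`, so with a count of 0x025800 it would report zero early. The label `j1` is reused from the question; because the count is even, a single test after the second byte is enough.

```asm
; Untested sketch, assuming ADL mode (24-bit hl and bc).
; cpi: compares a with (hl), then hl = hl + 1, bc = bc - 1,
;      and P/V is set ("pe") while bc is still non-zero.
    ld    hl,$D40000      ; vram start
    ld    bc,$025800      ; byte count (must be even for this loop)
j1:
    ld    (hl),31         ; first byte of the pair
    cpi                   ; inc hl, dec bc
    ld    (hl),0          ; second byte of the pair
    cpi                   ; inc hl, dec bc, P/V = (bc != 0)
    jp    pe,j1           ; loop while bc != 0 (jr has no pe/po forms)
```

The compare result of `cpi` (the Z and S flags) is simply ignored here; only P/V is used.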

Here is another technique that does not use a manual loop:

ld    hl,$D40000
ld    (hl),31
inc   hl
ld    (hl),0
dec   hl
ld    de,$D40002
ld    bc,$25800 - 2
ldir


DrD*_*nar 6

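First, if you're going to move SP around, you need to save and restore it. Second, you need to disable interrupts, or else you'll have a race-condition bug: if an interrupt fires near the end of the copy, the stack will spill into whatever is below it, which happens to be the VAT.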

; Index registers are actually fast on the eZ80
    ld   ix, 0
    add  ix, sp
    di
; Do some hack using SP here
    ld   sp, ix
    ei

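@Ped7g The eZ80 caches any instruction with an -IR/-DR suffix; unlike the Z80, it does not re-read the opcode from memory on every iteration. Instructions such as LDIR can therefore execute each iteration in just 2 bus cycles (one read and one write). So the SP hack is not only unnecessary, it is actually slower. The SP hack is still best left to more experienced programmers.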

The eZ80 is very well pipelined and its performance is limited by its lack of any cache and 1-byte-wide bus. The only instruction that runs slower than the bus is MLT, a 2-bus-cycle instruction that needs 5 clock cycles. For every other instruction, just count the number of bytes in the opcode, and the number of read and write cycles, and you've got its execution time. It's a huge pity that in the TI-84+CE series, TI decided to pair the fast eZ80 with an SRAM that somehow needs four clock cycles for each read and write (at 48 MHz)! Yes, TI, a world leader in semiconductor design, managed to design a slow SRAM. Getting on-die SRAM to perform poorly is an engineering feat.

@harold has the right answer, though I prefer optimizing for size instead of speed outside of inner loops.

#include "includes\ti84pce.inc"

    .assume ADL=1
    .org userMem-2
    .db tExtTok,tAsm84CeCmp

    call  _homeup
    call  _ClrScrnFull
; Initialize registers
    ld    hl, vRam
    ld    bc, lcdWidth * lcdHeight * 2 - 2
    push  hl
    pop   de
; Write initial 2-byte value
    ld    (hl), 31
    inc   hl
    ld    (hl), 0
    inc   hl
    ex    de, hl
; Copy everything all at once.  Interrupts may trigger while this instruction is processing.
    ldir
    call  _GetKey
    call  _ClrScrnFull
    ret

On EFnet, #ez80-dev is a good place to ask questions. cemetech.net is also a good place.


Ped*_*d7g 5

A variation on tum_'s answer, with a loop zero-test mechanism that is faster than `dec bc`.

    LD   SP,$D65800    ; <end of VRAM>: 0xD40000+0x25800
    LD   BC,$004B      ; 0x4B many times (in C) the 256x inner loop (B=0)
        ; that results into 0x4B00 repeats of loop, which when 8 bytes per loop
        ; are set makes the total 0x25800 bytes (VRAM size)
        ; (if you would unroll it for more than 8 bytes, it will be a bit more
        ; tricky to calculate the initial BC to get correct amount of looping)
        ; (not that much tricky, just a tiny bit)
    LD   HL,31         ; H <- 0, L <- 31
.L1
    PUSH HL            ; (SP – 2) <- L, (SP – 1) <- H, SP <- SP - 2
    PUSH HL            ; set 8 bytes in each iteration
    PUSH HL
    PUSH HL
    DJNZ .L1           ; loop by B value (in this example it starts as 0 => 256x loop)
    DEC  C             ; loop by C ("outer" counter)
    JR   NZ,.L1        ; btw JP is faster than JR on original Z80, but not on eZ80
.END

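(BTW, I have never done any eZ80 programming, and I didn't verify this in a debugger, so this is somewhat hypothetical... Actually, thinking about it, isn't `push` on the eZ80 32 bits? Then the initial `hl` should be `ld hl,$001F001F` to set four bytes with a single `push`, and the inner body of the loop should contain only two `push hl`.)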

(But I have done Z80 programming, which is why I bother commenting on this topic at all, even though I have never seen any eZ80 code before.)

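EDIT: it turns out that `push` on the eZ80 is 24 bits, i.e. the code above will produce an incorrect result. It can of course be fixed easily (the problem is an implementation detail, not the principle), for example: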

    LD   SP,$D65800    ; <end of VRAM>: 0xD40000+0x25800
    LD   BC,$0014      ; 0x14 many times (in C) the 256x inner loop (B=0)
        ; that results into 0x1400 repeats of loop, which with 30 bytes per
        ; loop set makes the total 0x25800 bytes (VRAM size)
    LD   HL,$1F001F    ; will set bytes 31,  0, 31
    LD   DE,$001F00    ; will set bytes  0, 31,  0
.L1
    PUSH DE
    PUSH HL
        ; here SP = SP-6, and 6 bytes 31, 0, 31, 0, 31, 0 were set
    PUSH DE
    PUSH HL
    PUSH DE
    PUSH HL
    PUSH DE
    PUSH HL
    PUSH DE
    PUSH HL            ; unrolled 5 times to set 30 bytes in total
    DJNZ .L1           ; loop by B value (in this example it starts as 0 => 256x loop)
    DEC  C             ; loop by C ("outer" counter)
    JR   NZ,.L1

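  • @melbok Did you add the `sp` saving? That is, reserve 24 bits of memory somewhere in the data section (I'm not sure what your assembler syntax is, maybe `OldSp: ds 3`?), do `ld (OldSp),sp` in the code section before the `ld sp,...`, and restore it with `ld sp,(OldSp)` after the fill. (Do you understand what `sp` is, and how its "usual" purpose is being "abused" in this memory filler?) (Probably not, since you added `call _GetKey` directly after it :) ... By the time of the `call` you need to have `sp` restored already, otherwise you will overwrite the memory below VRAM (if anything writable is there).
  • @tum_ If he is just starting out with assembly, no amount of disclaimers will help. There are so many new things and details that I can imagine it is quite overwhelming... Also, newcomers to assembly who have done some programming in a high-level language before usually don't realize the level of precision that assembly requires, and just skip a word or two, or even a whole disclaimer, because "it can't be that important, right?"... :D ... So it is a bit nasty of both of us not to provide complete working code, including the `sp` saving... and then the OP didn't even specify the assembler (syntax) :)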
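For completeness, the `sp` save and restore discussed in these comments, combined with the interrupt disabling from DrDnar's answer, might look roughly like the following. This is an untested sketch, not a verified program: `OldSp` is the hypothetical label from the comment above, and the directive that reserves its 3 bytes depends on the assembler. The fill loop itself is the 24-bit version from the edited answer.

```asm
; Untested sketch, assuming ADL mode. OldSp is a hypothetical label.
    DI                     ; an interrupt would push onto the "stack" in VRAM
    LD   (OldSp),SP        ; save the real 24-bit stack pointer
    LD   SP,$D65800        ; <end of VRAM>: 0xD40000+0x25800
    LD   BC,$0014          ; C = 0x14 outer loops, B = 0 => 256x inner loop
    LD   HL,$1F001F        ; will set bytes 31,  0, 31
    LD   DE,$001F00        ; will set bytes  0, 31,  0
.L1
    PUSH DE
    PUSH HL
    PUSH DE
    PUSH HL
    PUSH DE
    PUSH HL
    PUSH DE
    PUSH HL
    PUSH DE
    PUSH HL                ; 10 pushes * 3 bytes = 30 bytes per iteration
    DJNZ .L1
    DEC  C
    JR   NZ,.L1
    LD   SP,(OldSp)        ; restore SP before any CALL or RET
    EI
    CALL _GetKey
    CALL _ClrScrnFull
    RET

OldSp:
    DS   3                 ; 3 bytes of storage (directive name varies by assembler)
```

The important part is the ordering: `sp` is restored before the first `call`, so no return address is ever pushed while `sp` still points into or below VRAM.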