The string instructions facilitate operations on sequences of bytes or words. None of them takes an explicit operand; instead, they all work implicitly on the source and/or destination strings. The current element (byte or word) of the source string is at DS:SI, and the current element of the destination string is at ES:DI. Each instruction works on one element and then automatically adjusts SI and/or DI: if the Direction flag is clear, the index is incremented; otherwise it is decremented. When working with overlapping strings it is sometimes necessary to work from back to front, but usually you should leave the Direction flag clear and work on strings from front to back.
To work on an entire string at a time, each string instruction can be accompanied by a repeat prefix, either REP or one of REPE and REPNE (or their synonyms REPZ and REPNZ). These cause the instruction to be repeated the number of times given in the count register, CX, which is decremented after each repetition; for REPE and REPNE, the Zero flag is also tested at the end of each operation, and the loop stops early if the condition (Equal or Not Equal) fails.
The MOVSB and MOVSW instructions have the following forms:
    MOVSB
    REP MOVSB
    MOVSW
    REP MOVSW
The first form copies a single byte from the source string, at address DS:SI, to the destination string, at address ES:DI, then increments (or decrements, if the Direction flag is set) both SI and DI. The second form repeats this operation, decrementing CX after each copy, until CX reaches zero; if CX is already zero, nothing is copied. With the Direction flag clear, the effect is equivalent to the following pseudo-C code:
    while (CX != 0) {
        *(ES*16 + DI) = *(DS*16 + SI);
        SI++;
        DI++;
        CX--;
    }
(recall that ES*16 + DI is the physical address corresponding to the segment and offset ES:DI). The remaining two forms move a word at a time, instead of a single byte; correspondingly, SI and DI are incremented or decremented by 2 each time through the loop.
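To illustrate why the Direction flag matters when the source and destination strings overlap, here is a minimal C sketch that models REP MOVSB on a flat byte array. The StrRegs structure and the functions rep_movsb and shift_up_by_one are hypothetical names invented for this illustration, and segments are left out on the assumption that both strings live in one buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the registers that REP MOVSB uses. Offsets index a
   single flat buffer, so DS and ES are omitted from this sketch. */
typedef struct {
    uint16_t si, di, cx;
    int df;                 /* Direction flag: 0 = increment, 1 = decrement */
} StrRegs;

/* Model of REP MOVSB: copy CX bytes, stepping SI and DI according to DF. */
static void rep_movsb(uint8_t *mem, StrRegs *r)
{
    while (r->cx != 0) {
        mem[r->di] = mem[r->si];
        if (r->df) { r->si--; r->di--; }
        else       { r->si++; r->di++; }
        r->cx--;
    }
}

/* Move the len bytes at offset src up by one position (an overlapping copy),
   working front to back (df = 0) or back to front (df = 1). */
static void shift_up_by_one(uint8_t *mem, uint16_t src, uint16_t len, int df)
{
    StrRegs r;
    assert(len > 0);
    r.cx = len;
    r.df = df;
    if (df) {               /* like STD: start at the last element */
        r.si = (uint16_t)(src + len - 1);
        r.di = (uint16_t)(src + len);
    } else {                /* like CLD: start at the first element */
        r.si = src;
        r.di = (uint16_t)(src + 1);
    }
    rep_movsb(mem, &r);
}
```

In this model, shifting the four bytes "abcd" up by one with the Direction flag set gives "aabcd", as intended. With the flag clear, each byte is overwritten before it is read, so the first byte is smeared across the region and the result is "aaaaa".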
The STOSB and STOSW instructions are similar to MOVSB and MOVSW, except the source byte or word comes from AL or AX instead of the memory address in DS:SI. For example, the following is a very fast way to initialize the block of memory from ES:1000h to ES:4FFFh with zeroes:
    MOV DI, 1000h   ;Starting address
    MOV CX, 2000h   ;Number of words
    MOV AX, 0       ;Word to store at each location
    CLD             ;Make sure direction is increasing
    REP STOSW       ;Perform the initialization
Correspondingly, the LODSB and LODSW instructions are variations on the move instructions where the destination is the accumulator (instead of the memory address in ES:DI). These are rarely useful with a repeat prefix, since each repetition simply overwrites the accumulator; instead, they are used as part of larger loops to perform more complex string processing. For example, here is a program fragment that converts the NUL-terminated string starting at the address in DX to all lower case (there is a faster way to do the conversion of each character, using the XLATB instruction, but that is not the point here):
            MOV SI, DX          ;Initialize source
            MOV DI, DX          ; and destination indices
            MOV AX, DS          ;Copy DS (source segment)
            MOV ES, AX          ; into ES (destination segment)
            CLD
    NextCh  LODSB               ;Load next character into AL
            CMP AL, 'A'
            JB NotUC            ;Jump if below 'A'
            CMP AL, 'Z'
            JA NotUC            ; or above 'Z'
            ADD AL, 'a' - 'A'   ;Convert UC to lc
    NotUC   STOSB               ;Store modified character back
            CMP AL, 0
            JNE NextCh          ;Do next character if not at end of string
None of the preceding string operations have any effect on the status flags. By contrast, the remaining two string operations are executed solely for their effect on the status flags, just like the CMP operation on numbers. The CMPSB and CMPSW operations compare the current bytes or words of the source and destination strings by subtracting the destination from the source and recording the properties of the result in FLAGS. The SCASB and SCASW operations are the variants of this that use the accumulator (AL or AX) for the source. Each of these may be preceded by either of the repeat prefixes REPE or REPNE, which cause the operation to be repeated up to CX times, as long as the condition holds true after each iteration. Here is the corresponding pseudo-C for REPE CMPSB:
    while (CX != 0) {
        SetFlags(*(DS*16 + SI) - *(ES*16 + DI));
        SI++;
        DI++;
        CX--;
        if (!ZeroFlag) break;
    }
A common use of the REPNE SCASB instruction is to find the length of a NUL-terminated string. Here is an example:
    MOV DI, DX      ;Starting address in DX (assume ES = DS)
    MOV AL, 0       ;Byte to search for (NUL)
    MOV CX, -1      ;Start count at FFFFh
    CLD             ;Increment DI after each character
    REPNE SCASB     ;Scan string for NUL, decrementing CX for each char
    MOV AX, -2      ;CX will be -2 for length 0, -3 for length 1, ...
    SUB AX, CX      ;Length in AX
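The arithmetic on CX in this fragment can be checked with a small C model. This is only a sketch: the function name scasb_strlen is hypothetical, the es pointer stands in for the base of the destination segment, and the code assumes a NUL byte really does occur at or after offset di.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: MOV AL,0 / MOV CX,-1 / CLD / REPNE SCASB / MOV AX,-2 / SUB AX,CX.
   es stands in for the start of the destination segment (an assumption of
   this sketch); the caller must ensure a NUL lies at or after offset di. */
static uint16_t scasb_strlen(const uint8_t *es, uint16_t di)
{
    uint8_t  al = 0;          /* byte to search for */
    uint16_t cx = 0xFFFF;     /* -1 as a 16-bit value */
    int      zf = 0;          /* Zero flag */

    while (cx != 0) {
        zf = (al == es[di]);  /* SCASB sets the flags from AL - ES:[DI] */
        di++;                 /* Direction flag clear: increment DI */
        cx--;
        if (zf) break;        /* REPNE stops once the Zero flag is set */
    }
    assert(zf);               /* otherwise no NUL was found in 65535 bytes */

    /* AX = -2 - CX, computed modulo 2^16 like the 16-bit SUB. */
    return (uint16_t)(0xFFFEu - cx);
}
```

For a buffer holding "xxhello" followed by NUL, the model returns 5 when started at offset 2 and 0 when started directly on the terminating NUL, matching the comment in the assembly that CX ends at -2 for a string of length 0.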