MID, MIDB functions

The MID and MIDB functions in Excel are used to extract a specific portion of a text string, starting from a given position. The key difference between these two functions is how they handle single-byte and double-byte characters (DBCS, such as Chinese, Japanese, or Korean characters).

1. MID Function

The MID function returns a substring from a text string, starting at a specified position and extracting a defined number of characters.

Syntax:

=MID(text, start_num, num_chars)

Parameters:

  • text (required): The text string from which you want to extract characters.
  • start_num (required): The position of the first character you want to extract. The first character in the string has a start_num of 1.
  • num_chars (required): The number of characters to extract, starting from start_num.

Key Points:

  • The function works with characters, so it handles both single-byte and double-byte characters.
  • It counts from the left side of the text string.

Example Usage of MID

1. Extracting a Substring

If cell A1 contains the text “Hello World” and you want to extract the 6th through 10th characters (i.e., “World”), the formula:

=MID(A1, 6, 5)

This will return:

World

The extraction starts at position 6 and extracts 5 characters.

2. Extracting a Single Character

If you want to extract only the 1st character from the string “Hello”, the formula:

=MID("Hello", 1, 1)

This will return:

H

3. Using MID on Longer Text

If cell A2 contains “Excel Functions”, and you want to extract from the 7th character to the 10th character (i.e., “Func”), the formula:

=MID(A2, 7, 4)

This will return:

Func

2. MIDB Function

The MIDB function is similar to the MID function, but it counts bytes rather than characters. It is designed for use with double-byte character sets (DBCS), such as languages like Chinese, Japanese, or Korean, where each character may take more than one byte.

Syntax:

=MIDB(text, start_num, num_bytes)

Parameters:

  • text (required): The text string from which you want to extract bytes.
  • start_num (required): The position of the first byte to extract.
  • num_bytes (required): The number of bytes to extract.

Key Points:

  • MIDB counts bytes, so it treats double-byte characters (commonly used in DBCS languages) as two bytes.
  • The function behaves similarly to MID but operates with byte-level accuracy, which may result in extracting a different number of characters when working with DBCS.

Example Usage of MIDB

1. Extracting Bytes from a DBCS String

If cell A1 contains the text “こんにちは World” (Japanese, where each character in “こんにちは” is a double-byte character), and you want to extract the first 6 bytes, the formula:

=MIDB(A1, 1, 6)

This will return:

こん

Explanation:

  • “こんにちは” (Japanese) uses double-byte characters, so MIDB counts in bytes, where each Japanese character occupies 2 bytes.
  • Extracting 6 bytes will give the first 3 characters: “こん”.

2. Extracting Bytes from a String with Single-Byte Characters

If cell A2 contains the text “Hello”, and you want to extract the first 4 bytes, the formula:

=MIDB(A2, 1, 4)

This will return:

Hell

In this case, since all characters are single-byte, the number of bytes equals the number of characters, so “Hell” is returned.

3. Extracting More Bytes Than the String Contains

If cell A3 contains “Hi” (2 single-byte characters), and you request 6 bytes:

=MIDB(A3, 1, 6)

This will return:

Hi

Since the string only contains 2 characters (2 bytes), the function will return the entire string.


Key Differences Between MID and MIDB

  • Character vs Byte Counting:
    • MID operates with characters and is typically used with single-byte characters (e.g., English text).
    • MIDB operates with bytes and is designed for double-byte character sets (DBCS), which are common in East Asian languages (e.g., Chinese, Japanese, Korean).
  • Use Case:
    • Use MID for standard text or when you want to work with characters, especially in single-byte character sets.
    • Use MIDB when dealing with double-byte characters in languages that use a larger byte size for each character (like Chinese, Japanese, or Korean).

Notes:

  • Byte Handling: In MIDB, if the text contains mixed single-byte and double-byte characters, the byte count will include the total number of bytes, not characters. For instance, a single-byte character will count as 1 byte, while a double-byte character counts as 2 bytes.
  • Character Handling: The MID function is suitable for text strings where characters are stored in a standard encoding, such as English. If you’re working with non-Latin scripts or languages using DBCS, you may need to use MIDB to handle the text accurately.

Related Functions:

  • LEFT and LEFTB: Extract a substring from the left of a text string (character-based or byte-based).
  • RIGHT and RIGHTB: Extract a substring from the right of a text string (character-based or byte-based).
  • LEN and LENB: Count the number of characters (or bytes) in a text string.

The MID and MIDB functions allow for flexible substring extraction, with MID focusing on characters and MIDB focusing on bytes, making them crucial for handling text in different languages and character encodings.

Leave a Reply 0

Your email address will not be published. Required fields are marked *