CLEAN function

The CLEAN function in Excel is used to remove all non-printable characters from a text string. These characters, often created by copying and pasting data from different sources or applications, can cause issues with your data analysis and formatting. The CLEAN function helps ensure that only printable characters remain in the text.

Syntax

=CLEAN(text)

Parameters

  • text (required): The text string or cell reference containing the text you want to clean. This could be a string with unwanted non-printable characters.

Key Points

  • Non-printable characters: These are characters that are not visible when displayed in a cell. They might include things like line breaks, carriage returns, tabs, or other special characters that do not have a visual representation in Excel.
  • Range of characters removed: The CLEAN function removes characters with ASCII codes 1 through 31, as well as the DEL character (ASCII code 127). These are typically control characters like line breaks, carriage returns, or other formatting characters.
  • Formatting issues: Often, non-printable characters may cause issues like unexpected behavior when copying data to other applications or when processing text for analysis. Using the CLEAN function helps in resolving these issues.

Example Usage

1. Removing Non-Printable Characters

Suppose cell A1 contains the text "Hello" & CHAR(10) & "World", where CHAR(10) represents a line break.

=CLEAN(A1)

This will return:

HelloWorld

It removes the line break (ASCII code 10) from the text.

2. Cleaning Text with Multiple Non-Printable Characters

If a cell contains various non-printable characters that might not be immediately visible, such as carriage returns or tabs, using CLEAN will remove them all.

=CLEAN("Hello" & CHAR(9) & "World")

In this example, CHAR(9) is the tab character, and CLEAN will remove it, resulting in:

HelloWorld

3. Using CLEAN with a Cell Reference

If cell B1 contains some text with unwanted non-printable characters, you can apply CLEAN to remove them.

=CLEAN(B1)

This will return the text from B1, but with all non-printable characters removed.


Notes

  • Visible vs. Non-Printable Characters: The characters that are removed by the CLEAN function are not visible in the Excel sheet, but they can interfere with copying, pasting, or processing the data.
  • Compatibility: The CLEAN function is useful when working with data that has been imported from external sources, especially from systems that might insert non-printable characters (e.g., when importing data from web scraping or other systems).
  • No Impact on Printable Characters: The CLEAN function only affects non-printable characters, so any regular letters, numbers, punctuation marks, and spaces remain intact.

Related Functions

  • TRIM: The TRIM function removes extra spaces from a text string, except for single spaces between words. It can be used in combination with CLEAN to clean text that has both non-printable characters and extra spaces.

    Example:

    =TRIM(CLEAN(A1))
    
  • SUBSTITUTE: If you need to replace specific non-printable characters (or any characters) with something else, you can use the SUBSTITUTE function. For instance, if you want to replace all line breaks with a space, you can use:
    =SUBSTITUTE(A1, CHAR(10), " ")
    
  • TEXT: While not directly related, the TEXT function can be used to format numbers or other values as text, ensuring consistent output when working with clean data.

The CLEAN function is especially useful for cleaning up text data by removing non-printable characters that can cause issues in formatting or further processing. It is commonly used in data cleaning and preparation steps.

Leave a Reply 0

Your email address will not be published. Required fields are marked *