How to Remove HTML Tags from a String Using JavaScript?

You can utilize the replace() function alongside a regular expression to eliminate HTML tags from a string using JavaScript.

In the context of web development, you may find yourself needing to extract unformatted text from HTML strings by eliminating all associated tags. There are various techniques available for stripping these tags, contingent on the complexity and specifications of the input. This blog will delve into these techniques.

Techniques to Remove HTML Tags from a String Using JavaScript

To remove HTML tags from a string using JavaScript, you can use the replace() function, the DOMParser API, or the innerText or textContent property of an element created with document.createElement. Let’s examine these techniques in detail below.

Technique 1: Using replace() Function

This method quickly removes HTML tags from a string. It uses a regular expression to match everything between an opening angle bracket (<) and the next closing angle bracket (>) and replaces each match with an empty string. It is useful for straightforward scenarios but may not be effective with malformed HTML or with input where a > appears inside an attribute value. Because a regular expression is not a full HTML parser, it also does not decode HTML entities, so &amp; stays as &amp; instead of becoming &.

Example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Remove HTML Tags</title>
    <script>
        function removeHTMLTags(str) {
            return str.replace(/<[^>]*>/g, '');
        }

        function processText() {
            const input = document.getElementById("htmlInput").value;
            document.getElementById("output").innerText = removeHTMLTags(input);
        }
    </script>
</head>
<body>

    <h2>Remove HTML Tags Example</h2>
    <textarea id="htmlInput" rows="4" cols="50"><p>Intelli<strong>paat</strong>!</p></textarea>
    <br>
    <button onclick="processText()">Remove Tags</button>
    <h3>Output:</h3>
    <p id="output"></p>
</body>
</html>

Output:

[Image: output of the replace() function example]

Explanation: The code eliminates the tags and presents the unformatted content in the output section upon clicking the “Remove Tags” button. The HTML tags are substituted with an empty string using a regular expression.
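
The entity limitation mentioned for Technique 1 can be seen in isolation with a minimal sketch that runs without a browser. Note that decodeBasicEntities is a hypothetical helper, not a standard API, and it only covers five common entities, so treat it as an illustration rather than a complete decoder.

```javascript
// Same regex as in the example above: drop anything that looks like a tag.
function removeHTMLTags(str) {
    return str.replace(/<[^>]*>/g, '');
}

// Hypothetical helper (an assumption, not a built-in): decodes a handful of
// common entities in a single pass, so "&amp;lt;" is not decoded twice.
function decodeBasicEntities(str) {
    const entities = { amp: '&', lt: '<', gt: '>', quot: '"', '#39': "'" };
    return str.replace(/&(amp|lt|gt|quot|#39);/g, (match, name) => entities[name]);
}

const stripped = removeHTMLTags('<p>Tom &amp; <b>Jerry</b></p>');
console.log(stripped);                      // Tom &amp; Jerry
console.log(decodeBasicEntities(stripped)); // Tom & Jerry
```

The regex removes the tags but leaves &amp; untouched. Decoding has to be a separate step, or it can be left to a real parser as in the techniques that follow.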

Technique 2: Utilizing the DOMParser API

This approach transforms the string into a temporary HTML document. It allows for the accurate parsing of all tags and their subsequent removal while preserving the text content. Additionally, it effectively manages malformed HTML.

Example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Remove HTML Tags</title>
    <script>
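        function removeHTMLTagsUsingDOMParser(str) {
            // Parse the string into a temporary HTML document
            const doc = new DOMParser().parseFromString(str, "text/html");
            // Return only the text content, or an empty string if there is none
            return doc.body.textContent || "";
        }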

        function processText() {
            const input = document.getElementById("htmlInput").value;
            document.getElementById("output").innerText = removeHTMLTagsUsingDOMParser(input);
        }
    </script>
</head>
<body>
    <h2>Remove HTML Tags Example</h2>
    <textarea id="htmlInput" rows="4" cols="50"><p>Intelli<strong>paat</strong>!</p></textarea>
    <br>
    <button onclick="processText()">Remove Tags</button>
    <h3>Output:</h3>
    <p id="output"></p>
</body>
</html>

Output:

[Image: output of the DOMParser API example]

Explanation: This code uses DOMParser to parse the HTML string into a document, reads its plain text, and displays the result in the output section when the Remove Tags button is clicked.
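
For reuse outside the demo page, the same idea could be wrapped in a helper that checks whether DOMParser exists before calling it. This is a sketch under assumptions: the name stripTags and the regex fallback for non-browser environments (such as Node.js, which has no global DOMParser) are illustrative choices, not part of the original example.

```javascript
// Hypothetical helper: prefer a real HTML parser when the environment has one.
function stripTags(str) {
    if (typeof DOMParser !== 'undefined') {
        // Browser path: parse into a temporary document and read its text.
        const doc = new DOMParser().parseFromString(str, 'text/html');
        return doc.body.textContent || '';
    }
    // Assumed fallback for environments without DOMParser: regex-based removal.
    return str.replace(/<[^>]*>/g, '');
}

console.log(stripTags('<p>Intelli<strong>paat</strong>!</p>')); // Intellipaat!
```

The two paths are not fully equivalent: the DOMParser path decodes entities such as &amp; while the regex fallback leaves them as written.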

Technique 3: Applying innerText or textContent with document.createElement

This technique enables the removal of HTML tags by utilizing the innerText or textContent properties of a temporary div element. You set the inner HTML of a temporary element and then retrieve the text content. This approach is compatible across various browsers.

Example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Remove HTML Tags</title>
    <script>
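        function removeHTMLTagsUsingElement(str) {
            // Create a temporary element that is never added to the page
            const tempDiv = document.createElement("div");
            tempDiv.innerHTML = str;
            // Read back only the text, falling back to innerText or an empty string
            return tempDiv.textContent || tempDiv.innerText || "";
        }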

        function processText() {
            const input = document.getElementById("htmlInput").value;
            document.getElementById("output").innerText = removeHTMLTagsUsingElement(input);
        }
    </script>
</head>
<body>
    <h2>Remove HTML Tags Example</h2>
    <textarea id="htmlInput" rows="4" cols="50"><p>Intelli<strong>paat</strong>!</p></textarea>
    <br>
    <button onclick="processText()">Remove Tags</button>
    <h3>Output:</h3>
    <p id="output"></p>
</body>
</html>

Output:

[Image: output of the innerText or textContent with document.createElement example]

Explanation: This code creates a temporary div element, assigns the HTML string to it, and then reads back its plain text content, which is displayed in the output section when the Remove Tags button is clicked.

Summary

The techniques described above are common ways of removing HTML tags from a string using JavaScript. You can use the replace() function, the DOMParser API, or the innerText or textContent property of an element created with document.createElement. Stripping HTML tags lets you display unformatted content and can reduce the risk of cross-site scripting (XSS), although it is not a complete replacement for sanitizing or escaping untrusted input.

The article How to Remove HTML Tags from a String using JavaScript? first appeared on Intellipaat Blog.

