Priya R
Updated date Nov 20, 2023
In this blog, we will learn multiple methods to count words in a string using Python, with clear explanations, code examples, and outputs.

Introduction:

Counting words in a string is a fundamental operation in text processing. Whether you are analyzing text data, building a word cloud, or simply checking the length of a text, Python provides various methods to help you achieve this task. In this blog, we will explore different approaches to count words in a string, providing clear explanations and code examples.

Method 1: Using the split() Method

The simplest way to count words in a string is the split() method. Called with no arguments, it splits the string on runs of whitespace (spaces, tabs, and newlines) and returns a list of substrings. The word count is the length of that list.

def count_words_method1(input_string):
    words = input_string.split()
    return len(words)

# Example
text = "This is a sample string with several words."
word_count = count_words_method1(text)
print("Method 1 Output:", word_count)

Output:

Method 1 Output: 8

In this example, split() divides the input string at each run of whitespace, producing eight substrings. Punctuation stays attached to its word, so "words." counts as one word.
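The behaviour of split() on irregular whitespace can be checked with a short sketch. The function name count_words_split and the sample strings are made up for illustration.

```python
def count_words_split(text):
    # With no arguments, split() breaks on runs of any whitespace
    # and ignores leading and trailing whitespace.
    return len(text.split())

messy = "  This   is\ta sample\nstring  "
print(count_words_split(messy))   # 5
print(count_words_split(""))      # 0
print(count_words_split("   "))   # 0
```

Extra spaces, tabs, and newlines do not inflate the count, and an empty or whitespace-only string yields zero.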

Method 2: Using Regular Expressions

Regular expressions provide a powerful and flexible way to work with text. You can use the re module in Python to count words by matching word patterns in the string.

import re

def count_words_method2(input_string):
    words = re.findall(r'\b\w+\b', input_string)
    return len(words)

# Example
text = "This is a sample string with several words."
word_count = count_words_method2(text)
print("Method 2 Output:", word_count)

Output:

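Method 2 Output: 8

The regular expression r'\b\w+\b' matches each run of word characters (letters, digits, and underscores) between word boundaries. It finds eight words here, the same result as Method 1, and the trailing period is not part of any match. Note that an apostrophe or hyphen ends a match, so "don't" is counted as two words.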
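To see where the pattern breaks words apart, here is a small sketch. The second pattern, JOINED_WORD, is an illustrative assumption rather than part of the method above: it allows a single apostrophe or hyphen between runs of word characters.

```python
import re

# The pattern from Method 2: \w+ stops at apostrophes and hyphens.
SIMPLE_WORD = re.compile(r"\b\w+\b")
# Illustrative alternative (an assumption, not from the article):
# keep words joined by a single apostrophe or hyphen together.
JOINED_WORD = re.compile(r"\w+(?:['-]\w+)*")

text = "Don't over-think it, user_42!"
simple = SIMPLE_WORD.findall(text)
joined = JOINED_WORD.findall(text)
print(simple)  # ['Don', 't', 'over', 'think', 'it', 'user_42']
print(joined)  # ["Don't", 'over-think', 'it', 'user_42']
```

Which behaviour is right depends on whether you want contractions and hyphenated terms to count as one word or two.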

Method 3: Using List Comprehension

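A list comprehension can filter the substrings before counting them. Here it keeps only non-empty strings. Because split() with no arguments never returns empty strings, the filter changes nothing for this input and the result matches Method 1. It becomes useful when you split on an explicit separator such as split(' ').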

def count_words_method3(input_string):
    words = [word for word in input_string.split() if word]
    return len(words)

# Example
text = "This is a sample string with several words."
word_count = count_words_method3(text)
print("Method 3 Output:", word_count)

Output:

Method 3 Output: 8

Here, we use a list comprehension to build a list of non-empty words. The length of this list is the word count.
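The non-empty filter starts to matter once the string is split on an explicit separator. The sketch below uses hypothetical helper names, count_with_filter and count_with_generator, to show this.

```python
def count_with_filter(text):
    # split(" ") returns "" for each extra space, so the filter is needed.
    return len([word for word in text.split(" ") if word])

def count_with_generator(text):
    # Counts the substrings without building a second, filtered list.
    return sum(1 for _ in text.split())

text = "two  spaces   here"
print(text.split(" "))             # ['two', '', 'spaces', '', '', 'here']
print(count_with_filter(text))     # 3
print(count_with_generator(text))  # 3
```

Both helpers agree on this input. The generator form is a common variation when only the count is needed.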

Method 4: Using Natural Language Toolkit (NLTK)

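The Natural Language Toolkit (NLTK) is a powerful library for natural language processing tasks, including word counting. It is a third-party package (install it with pip install nltk), and word_tokenize needs the Punkt tokenizer data, which you download once with nltk.download('punkt'). Recent NLTK releases may ask for 'punkt_tab' instead.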

import nltk

def count_words_method4(input_string):
    words = nltk.word_tokenize(input_string)
    return len(words)

# Example
text = "This is a sample string with several words."
word_count = count_words_method4(text)
print("Method 4 Output:", word_count)

Output:

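Method 4 Output: 9

In this method, we use NLTK's word_tokenize function to split the input string into tokens. The tokenizer returns punctuation as separate tokens, so the final period is counted and the result is 9 rather than 8. Filter out punctuation tokens if you want words only. word_tokenize also splits contractions into parts (for example, "don't" becomes "do" and "n't"), which suits linguistic analysis better than a plain word count.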
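If only words should be counted, punctuation tokens can be filtered out after tokenizing. The sketch below does not import NLTK. The token list is written by hand to mirror what word_tokenize is expected to return for the sample sentence, and count_word_tokens is a hypothetical helper.

```python
def count_word_tokens(tokens):
    # Keep only tokens that contain at least one letter or digit,
    # which drops pure punctuation such as "." or ",".
    return sum(1 for tok in tokens if any(ch.isalnum() for ch in tok))

# Hand-written stand-in for nltk.word_tokenize(text) on the sample sentence.
tokens = ["This", "is", "a", "sample", "string", "with", "several", "words", "."]
print(len(tokens))                # 9
print(count_word_tokens(tokens))  # 8
```

In real use you would pass the list returned by nltk.word_tokenize to the helper.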

Method 5: Using split() and isalpha()

This method is similar to Method 1 but adds a step that discards any substring containing a non-alphabetic character.

def count_words_method5(input_string):
    words = input_string.split()
    words = [word for word in words if word.isalpha()]
    return len(words)

# Example
text = "This is a sample string with several words."
word_count = count_words_method5(text)
print("Method 5 Output:", word_count)

Output:

Method 5 Output: 7

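After splitting the string, the list comprehension keeps only substrings made up entirely of alphabetic characters. isalpha() rejects the whole substring rather than removing the offending character, so "words." is dropped because of its trailing period, and the count is 7 instead of 8. Numbers and words with attached punctuation are excluded in the same way.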
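To make the effect of the isalpha() filter visible, the following sketch separates the substrings it keeps from those it drops. The helper partition_by_isalpha and the sample sentence are made up for illustration.

```python
def partition_by_isalpha(text):
    # Split on whitespace, then sort each substring into kept or dropped
    # depending on whether it is purely alphabetic.
    kept, dropped = [], []
    for token in text.split():
        if token.isalpha():
            kept.append(token)
        else:
            dropped.append(token)
    return kept, dropped

kept, dropped = partition_by_isalpha("Chapter 3 isn't long, really.")
print(kept)     # ['Chapter']
print(dropped)  # ['3', "isn't", 'long,', 'really.']
```

Only one of the five substrings survives, so this filter is best suited to text that has already had its punctuation removed.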

Method 6: Using split() and String Operations

In this method, we combine the split() method with basic string operations to count words in a string.

def count_words_method6(input_string):
    words = input_string.split()
    word_count = 0
    for word in words:
        if word.strip().isalnum():
            word_count += 1
    return word_count

# Example
text = "This is a sample string with several words."
word_count = count_words_method6(text)
print("Method 6 Output:", word_count)

Output:

Method 6 Output: 7

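This approach loops over the substrings and counts those that consist only of alphanumeric characters. The strip() call removes surrounding whitespace, which split() has already discarded, so it has no effect here. As in Method 5, "words." fails the check because of its period, giving 7. Unlike Method 5, substrings made of digits, such as "42", are counted.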
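One possible variation, shown here as a sketch and not as the method above, passes string.punctuation to strip() so that leading and trailing punctuation is removed before the isalnum() check. The function name count_words_strip_punct is hypothetical.

```python
import string

def count_words_strip_punct(text):
    count = 0
    for token in text.split():
        # Remove punctuation from both ends; inner characters are untouched,
        # so "don't" still fails isalnum() because of its apostrophe.
        core = token.strip(string.punctuation)
        if core.isalnum():
            count += 1
    return count

print(count_words_strip_punct("This is a sample string with several words."))  # 8
print(count_words_strip_punct("Room 101 has 3 chairs."))                       # 5
print(count_words_strip_punct("wait - what?"))                                 # 2
```

A lone dash strips down to an empty string, which isalnum() rejects, so stray punctuation is not counted as a word.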

Conclusion:

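In this blog, we have explored six different methods to count words in a string using Python. The methods differ in how they treat punctuation, digits, and contractions, so they can return different counts for the same text. The choice of method depends on your specific requirements, such as handling special characters, language considerations, or the need for additional text processing.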
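As a rough way to compare the approaches before choosing one, the sketch below runs four of them on the same string. The function names and the sample text are made up for illustration, and NLTK is left out so the example has no external dependencies.

```python
import re

def by_split(text):
    return len(text.split())

def by_regex(text):
    return len(re.findall(r"\b\w+\b", text))

def by_isalpha(text):
    return sum(1 for word in text.split() if word.isalpha())

def by_isalnum(text):
    return sum(1 for word in text.split() if word.isalnum())

text = "Don't panic: 3 well-known tips!"
counts = {f.__name__: f(text) for f in (by_split, by_regex, by_isalpha, by_isalnum)}
print(counts)
# {'by_split': 5, 'by_regex': 7, 'by_isalpha': 0, 'by_isalnum': 1}
```

Every substring in this text contains a digit or punctuation mark, which is why the isalpha() filter counts nothing and the isalnum() filter counts only "3".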
