How to title case in Pyspark

Keeping text in a consistent format is important, because the data coming out of Pyspark is eventually used to present insights. If text fields are not formatted properly, they require additional cleaning in later stages. A common case is a text field stored in lower case or mixed case that should appear in proper case. Pyspark provides the initcap function in pyspark.sql.functions for this: it capitalizes the first letter of each word in a string column and lower-cases the remaining letters. In this article we will learn how to create a title case (proper case) column in Pyspark with the help of an example.

Emma has customer data for her company, including a full name field that is all lower case. She wants to create a proper case version of that field. For example, “nicol farherty” should become “Nicol Farherty”.


Pyspark Title Case Example

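  • Step 1: Import all the necessary modules and create the Spark context.
import pandas as pd
import findspark
findspark.init()  # locate the local Spark installation before importing pyspark
import pyspark
from pyspark import SparkContext
from pyspark.sql import SQLContext
import pyspark.sql.functions as func

# Run Spark locally; "App Name" is only a label for the application
sc = SparkContext("local", "App Name")
# SQLContext is deprecated since Spark 3.0 in favor of SparkSession
sql = SQLContext(sc)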
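  • Step 2: Use the initcap function from pyspark.sql.functions to convert text to proper case (title case). Pass the column to convert to initcap, and use withColumn to add the result to the Dataframe as a new column. Customer_Data is assumed to be a Dataframe that has already been loaded and contains a ‘Full Name’ column. Here is the syntax to create ‘Full Name Updated’ from the ‘Full Name’ column.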
Customer_Data = Customer_Data.withColumn("Full Name Updated",func.initcap(func.col("Full Name")))
  • Step 3: Display the final Dataframe and check that the new column is in title case.
Customer_Data.show()

Thus, Emma is able to create the proper case column in her Dataframe as per her requirement in Pyspark.
