Python: how to remove line breaks and extra whitespace from a string?
Hi everyone. Could you please suggest how to solve this with as little reinventing of the wheel as possible? I need to clean newline characters out of a string (replacing them with spaces) and remove extra spaces and empty lines.
Right now it is done like this:
' '.join(filter(None, map(unicode.strip, input_string.splitlines())))
Is there perhaps a more standard way?
My attempts to use the textwrap library only bloat the code… Maybe I just don't know how to use it properly?
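One idiom often suggested for this kind of cleanup is `str.split()` with no arguments followed by `' '.join(...)`. The sketch below assumes that every run of whitespace, including repeated spaces inside a line, should collapse to a single space, which is slightly more aggressive than the snippet in the question. The function name `normalize_whitespace` and the sample string are hypothetical.

```python
def normalize_whitespace(text):
    """Collapse every run of whitespace (newlines included) into one space."""
    # split() with no arguments splits on any whitespace run and drops
    # empty pieces, so blank lines and leading/trailing whitespace vanish.
    return " ".join(text.split())


sample = "  first line  \n\n   second    line\r\n third "
print(normalize_whitespace(sample))  # first line second line third
```

If spaces inside a line must be preserved exactly, the per-line `strip()` approach from the question is the closer fit.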
Python: Remove Newline Character from String
In this tutorial, you’ll learn how to use Python to remove newline characters from a string.
Working with strings in Python can be tricky and often involves a lot of data pre-processing. Because the strings we find online often come with many issues, learning how to clean them can save you a lot of time. One common issue is extra newline characters, which can cause problems in your work.
The Quick Answer: Use Python string.replace()
What are Python Newline Characters
Python uses a special escape sequence to tell the computer to start a new line. It is called the newline character, and it is written as \n.
When you have a string that includes this character, the text following the newline character will be printed on a new line.
Let’s see how this looks in practice:
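The original example is not available here, so the following is a minimal sketch with a made-up sample string (`text` is a hypothetical name). It shows that `\n` is a single character and that `print()` renders it as a line break.

```python
# A sample string containing one newline character (hypothetical content).
text = "First line\nSecond line"

print(text)
# First line
# Second line

# repr() shows the escape sequence instead of rendering the line break.
print(repr(text))  # 'First line\nSecond line'

# "\n" is written with two keystrokes but is a single character.
print(len("\n"))  # 1
```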
Now that you know how newline characters work in Python, let’s learn how you can remove them!
Use Python to Remove All Newline Characters from a String
Python’s strings come built in with a number of useful methods. One of these is the .replace() method, which does exactly what it describes: it allows you to replace parts of a string.
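As a minimal sketch of this approach (the sample string and variable names are assumptions, not the original example), calling `.replace()` with `"\n"` and an empty string removes every newline:

```python
# Sample text with newlines in the middle and at the end (hypothetical).
text = "Line one\nLine two\nLine three\n"

# Replace every newline character with an empty string.
no_newlines = text.replace("\n", "")

print(repr(no_newlines))  # 'Line oneLine twoLine three'
```

Note that the lines are glued together with nothing between them. Passing `" "` as the second argument instead would keep the words separated.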
Let’s see what we’ve done here:
- We called the .replace() method on our string
- The first positional argument indicates the substring we want to replace. Here, we specified the newline \n character.
- The second argument indicates what to replace that character with. In this case, we replaced it with nothing, thereby removing the character.
In this section, you learned how to use string.replace() to remove newline characters from a Python string. In the next section, you’ll learn how to remove trailing newlines.
Tip! If you want to learn more about how to use the .replace() method, check out my in-depth guide here.
Use Python to Remove Trailing Newline Characters from a String
There may be times in your text pre-processing when you don’t want to remove all newline characters, but only the trailing ones. In these cases, the .replace() method isn’t ideal. Thankfully, Python comes with a different string method that allows us to strip characters from the end of a string: the .rstrip() method.
Let’s dive into how this method works in practice:
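The following is a small sketch with a hypothetical sample string. It calls `.rstrip()` with no arguments, so all trailing whitespace is removed while newlines inside the string are left alone.

```python
# Trailing space and two trailing newlines (hypothetical sample).
text = "Some text\nwith an inner newline \n\n"

stripped = text.rstrip()

# The inner newline survives; only the trailing whitespace is gone.
print(repr(stripped))  # 'Some text\nwith an inner newline'
```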
With no arguments, the Python .rstrip() method removes all whitespace characters from the end of the string, including newlines. Because of this, we didn’t need to specify a newline character.
If you only wanted to remove newline characters, you could simply specify this, letting Python know to keep any other whitespace characters in the string. This would look like the line below:
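A sketch of that call, using a made-up sample string, might look like this. Passing `"\n"` restricts `.rstrip()` to newline characters, so the trailing space is kept.

```python
# The text ends with a space followed by two newlines (hypothetical sample).
text = "Keep the trailing space \n\n"

only_newlines_removed = text.rstrip("\n")

print(repr(only_newlines_removed))  # 'Keep the trailing space '
```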
In the next section, you’ll learn how to use regex to remove newline characters from a string in Python.
Tip! If you want to learn more about the .rstrip() (as well as the .lstrip() ) method in Python, check out my in-depth tutorial here.
Use Python Regex to Remove Newline Characters from a String
Python’s built-in regular expression library, re, is a very powerful tool for working with strings and manipulating them in creative ways. One of the things we can use regular expressions (regex) for is removing newline characters from a Python string.
Let’s see how we can do this:
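The original snippet is missing here, so this is a minimal sketch under the assumption that it used `re.sub()` with the pattern `\n`; the sample string and variable names are hypothetical.

```python
import re

# Sample text with several newline characters (hypothetical).
text = "Line one\nLine two\nLine three\n"

# re.sub(pattern, replacement, string): replace every match of the
# pattern with the replacement text.
cleaned = re.sub(r"\n", "", text)

print(repr(cleaned))  # 'Line oneLine twoLine three'
```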
Let’s see what we’ve done here:
- We imported re to allow us to use the regex library
- We use the re.sub() function, to which we passed three arguments: (1) the pattern we want to replace, (2) the string we want to replace it with, and (3) the string on which the replacement is to be done
It may seem overkill to use re for this, and it often is, but if you’re importing re anyway, you may as well use this approach, as it lets you do much more complex removals!
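As one illustration of a more complex removal (an assumption about what such a case could look like, not an example from the tutorial), the pattern `\s+` collapses any run of whitespace, including `\r\n` pairs and tabs, into a single space:

```python
import re

# Mixed line endings, repeated spaces and a tab (hypothetical sample).
text = "Line one\r\n\r\nLine   two\n\tLine three\n"

# Replace each run of whitespace with one space, then trim both ends.
collapsed = re.sub(r"\s+", " ", text).strip()

print(collapsed)  # Line one Line two Line three
```

Doing the same with `.replace()` alone would need several chained calls, which is where the regex version starts to pay off.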
Python Remove Newline From String
There are times when we need to remove newlines from strings while processing large amounts of data. In this tutorial, you will learn different approaches to stripping newline characters from a string in Python.
In Python, the newline character is represented by “\n”. By default, Python’s print() function adds a newline character at the end of its output.
There are three methods for removing newline characters from a string:
- strip() method
- replace() method
- re.sub() method
Using strip() method to remove the newline character from a string
The strip() method removes both leading and trailing newlines from the string. It also removes any other whitespace on both sides of the string.
If the newline is only at the end of the string, you can use the rstrip() method to remove trailing newline characters instead.
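A short sketch of both calls, with a hypothetical sample string, shows the difference: `strip()` cleans both ends, while `rstrip()` leaves the leading newline and spaces in place.

```python
# Whitespace and newlines on both sides (hypothetical sample).
text = "\n  Hello, world!  \n"

print(repr(text.strip()))   # 'Hello, world!'
print(repr(text.rstrip()))  # '\n  Hello, world!'
```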
Using replace() method to remove newlines from a string
The replace() method is built into Python strings. It returns a copy of the string in which every occurrence of the specified substring is replaced with another substring.
Here, we use replace() to substitute an empty string for each newline character, which removes the newlines from the given string.
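The tutorial's own snippet is not available here; the sketch below uses a made-up string and shows both removing the newlines and, as a variation, replacing them with spaces.

```python
# Multi-line sample text (hypothetical).
text = "This is\na multi-line\nstring.\n"

removed = text.replace("\n", "")   # drop the newlines entirely
spaced = text.replace("\n", " ")   # or turn each newline into a space

print(repr(removed))  # 'This isa multi-linestring.'
print(repr(spaced))   # 'This is a multi-line string. '
```

Strings are immutable, so `replace()` returns a new string and leaves `text` unchanged.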
Similarly, if we need to remove newline characters from every string in a list, we can iterate over the list with a for loop and call replace() on each item.
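A minimal sketch of the loop, assuming a hypothetical list named `lines`:

```python
# A list of strings, some containing newlines (hypothetical sample).
lines = ["first\n", "second\nhalf\n", "third"]

cleaned = []
for line in lines:
    # replace() returns a new string, so collect the results in a new list.
    cleaned.append(line.replace("\n", ""))

print(cleaned)  # ['first', 'secondhalf', 'third']
```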
We can also use Python’s map() function to apply the replacement to each string in the list. This is more compact than an explicit for loop.
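The same cleanup with `map()` could be sketched as follows (the list `lines` is hypothetical). A list comprehension is shown alongside it as an equivalent alternative that many Python programmers find easier to read.

```python
# A list of strings, some containing newlines (hypothetical sample).
lines = ["first\n", "second\nhalf\n", "third"]

# map() is lazy in Python 3, so wrap it in list() to get the results.
cleaned_map = list(map(lambda line: line.replace("\n", ""), lines))

# Equivalent list comprehension.
cleaned_comp = [line.replace("\n", "") for line in lines]

print(cleaned_map)   # ['first', 'secondhalf', 'third']
print(cleaned_comp)  # ['first', 'secondhalf', 'third']
```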
Using regex to remove newline character from string
Another approach is to use the regular expression functions in Python to replace the newline characters with an empty string. The regex approach can be used to remove all the occurrences of the newlines in a given string.
The re.sub() function is similar to the replace() method in Python, except that it matches a pattern rather than a fixed substring. Here, re.sub() replaces each newline character with an empty string.
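As a sketch (the sample string is made up), a character class lets one `re.sub()` call handle both `\n` and `\r`, which a single `replace("\n", "")` call would not:

```python
import re

# Text with Windows (\r\n), Unix (\n) and old Mac (\r) line endings
# (hypothetical sample).
text = "Windows line\r\nUnix line\nOld Mac line\rend"

# [\r\n]+ matches any run of carriage returns and line feeds.
cleaned = re.sub(r"[\r\n]+", "", text)

print(repr(cleaned))  # 'Windows lineUnix lineOld Mac lineend'
```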
How can I remove a trailing newline?
How can I remove the last character of a string if it is a newline?
Try the rstrip() method (see the docs for Python 2 and Python 3).
Python’s rstrip() method strips all kinds of trailing whitespace by default, not just one newline as Perl does with chomp .
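For illustration, here is a small sketch with a made-up string. With no arguments, `rstrip()` removes the trailing space, tab and every trailing `\r` and `\n`, not just one newline.

```python
# Trailing space, tab, and mixed line endings (hypothetical sample).
text = "test string \t\n\r\n"

print(repr(text.rstrip()))  # 'test string'
```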
To strip only newlines:
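The answer's original snippet is missing here; a sketch of the idea, with a hypothetical sample string, is to pass `"\n"` to `rstrip()` so that only trailing newlines are removed:

```python
# Ends with "\r", a space, then two newlines (hypothetical sample).
text = "test string \n \r\n\n\r \n\n"

# Only the final run of "\n" characters is stripped; the "\r " before
# it stops the stripping.
print(repr(text.rstrip("\n")))  # 'test string \n \r\n\n\r '
```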
In addition to rstrip() , there are also the methods strip() and lstrip() . Here is an example with the three of them:
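The original example was lost, so this is a sketch with a hypothetical string `s` that has whitespace on both sides:

```python
# Whitespace and line endings on both sides (hypothetical sample).
s = "   \n  abc   def \n\r  "

print(repr(s.strip()))   # 'abc   def'
print(repr(s.lstrip()))  # 'abc   def \n\r  '
print(repr(s.rstrip()))  # '   \n  abc   def'
```

None of the three methods touches the spaces between `abc` and `def`; they only work on the ends of the string.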
And I would say the "Pythonic" way to get lines without trailing newline characters is splitlines().
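A quick sketch with a made-up multi-line string shows that `splitlines()` returns the lines without their line endings, whichever kind they are:

```python
# Mixed "\n" and "\r\n" line endings (hypothetical sample).
text = "line 1\nline 2\r\nline 3\n"

print(text.splitlines())  # ['line 1', 'line 2', 'line 3']
```

Unlike `text.split("\n")`, this does not leave an empty string at the end of the list when the text ends with a newline.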
The canonical way to strip end-of-line (EOL) characters is to use the string rstrip() method removing any trailing \r or \n. Here are examples for Mac, Windows, and Unix EOL characters.
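The answer's own examples are missing here; the following sketch uses hypothetical strings for the three line-ending conventions:

```python
# One sample per EOL convention (hypothetical strings).
mac = "Mac EOL\r"
windows = "Windows EOL\r\n"
unix = "Unix EOL\n"

for s in (mac, windows, unix):
    print(repr(s.rstrip("\r\n")))
# 'Mac EOL'
# 'Windows EOL'
# 'Unix EOL'
```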
Using '\r\n' as the parameter to rstrip means that it will strip out any trailing combination of '\r' or '\n'. That’s why it works in all three cases above.
This nuance matters in rare cases. For example, I once had to process a text file which contained an HL7 message. The HL7 standard requires a trailing '\r' as its EOL character. The Windows machine on which I was using this message had appended its own '\r\n' EOL character. Therefore, the end of each line looked like '\r\r\n'. Using rstrip('\r\n') would have taken off the entire '\r\r\n', which is not what I wanted. In that case, I simply sliced off the last two characters instead.
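To make the difference concrete, here is a sketch with a hypothetical line (the segment content is invented and is not a real HL7 message). The `endswith()` check is an added safeguard, not part of the original answer.

```python
# A line ending in the required "\r" plus an appended Windows "\r\n"
# (hypothetical content).
line = "MSH|example segment\r\r\n"

# rstrip removes every trailing "\r" and "\n", including the one to keep.
print(repr(line.rstrip("\r\n")))  # 'MSH|example segment'

# Slicing off exactly two characters keeps the required trailing "\r".
trimmed = line[:-2] if line.endswith("\r\n") else line
print(repr(trimmed))  # 'MSH|example segment\r'
```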
Note that unlike Perl’s chomp function, this will strip all specified characters at the end of the string, not just one:
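The snippet that followed this sentence is missing; a sketch of the behavior with a hypothetical string is below. The second part shows one possible way to remove a single trailing newline, closer to what `chomp` does, and is an addition for comparison rather than part of the original answer.

```python
# Three trailing newlines (hypothetical sample).
s = "foo\n\n\n"

# rstrip removes all of them.
print(repr(s.rstrip("\n")))  # 'foo'

# Removing just one trailing newline requires an explicit check.
one_removed = s[:-1] if s.endswith("\n") else s
print(repr(one_removed))  # 'foo\n\n'
```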